Grade Inflation UGA vs GaTech

I have a quick chart here showing grade inflation by schools. It would seem that my Alma Mater experienced less grade inflation than UGA. UGA had a significant bump during the 1990s. Were professors just more liberal with their grades during that time? It could also be that Georgia experienced an increase in higher quality applicants.

Other GA Schools are also available, play with the tableau viz below:

GPA Inflation

Discrete and Continous coloring in Tableau.

This has long bothered me in certain circumstances.  If you have a null value in a continuous pill then Tableau will color that null as if it were a zero.  Under most circumstances this is an okay solution, and I have to credit the programmers; it is better than the alternative.  When in doubt, show something and show it predictably.

So here is the problem I have faced.  Lets say zero is good, 10 is bad, and NULL is indifferent. Tableau will display null as the same color as zero.  Now lets say zero is good, 10 is bad, and NULL is BAD.  Now lets say 10 is good, -10 is bad, and NULL is something else.  When you place data into Tableau it will color nulls as zero.

However, Tableau is awesome and I was able to find an easy solution:


The secret is not really hard.  Basically all you need is two calculated fields, and then place  a  “transparent” image in your custom icons.

Formula 1
Float:  FLOAT([Type1])

This will take a number and convert it into a float, if there is a NULL or text it will make them all NULL.

Formula 2&3
NullFill_1: case [float]  when null then ‘NULL’
else ‘FILL’

Now just duplicate NullFill_1 as NullFill_2.

Formula 4:

This “1″ is solely just a place holder, call it anything and set it to 1.   Now take “1″ and place it on the column shelf twice.  Place Nullfill_1 on one shelf and NullFill_2 on another shelf and make 1 transparent, and give another a real shape.  If you need a transparent icon, just use this one:

NA blank trans

Now your Tableau workbook should look like this:


Now just dual axis those, adjust some of the legends and you will get this awesomeness!


Simple Hex Binning using R

R has of course numerous packages available.  One of the packages is hexbin.  Hexbinning gives the user a way to visualize high-density scatterplots.  There is a way to build it in Tableau without R, but it involves many more calculated fields.  A very simple way is to use hexbin() in R and using a Tableau custom shape.

Now this method relies on using the transparency field to duplicate the ‘look’ of a hexbin, by overlaying the hexagons on top of each other.  You can see that if you highlight one of the hexagons it will say ’5 items selected.’  This method does not give the actual function of grouping things into a hexagonal bin.  Here is the problem with this simple solution.  Tableau likes to receive the exact same number of rows as it sends to R.  Thus it is not possible (yet…) to send 50 rows and receive back 5 rows with a count.  This is essential for a true hexbin implemenation (I am hot on the trail of an idea around this).

Here are the formulas

Hexbin X:
SCRIPT_REAL(‘library(hexbin);hbin<-hexbin(.arg1,.arg2,xbins = .arg3,xbnds = c(-1,1),ybnds = c(-1,1));xys <- hcell2xy(hbin);xys$x’ ,avg(randx),avg(randy),[Bins])

Hexbin Y:
SCRIPT_REAL(‘library(hexbin);hbin<-hexbin(.arg1,.arg2,xbins = .arg3,xbnds = c(-1,1),ybnds = c(-1,1));xys <- hcell2xy(hbin);xys$y’,avg(randx),avg(randy),[Bins])

 (Notice the distinction here xys$x vs xys$y)

Here is what the Script is doing.  Library(hexbin) loads the library.  You may need to install hexbin on your R instance first and here is what the hexbin formula likes to see:  hexbin()

x: The x values
y: The y values
xbins # of bins
xbnds The +/- bound for x
ybnds The +/- bound for y

The next command hcell2xy simply prints the hexagon’s coordinates for each row so that Tableau can then receive it back and then display it.

Possible errors:

  • R error:  “xbnds[1] < xbnds[2] (or ybnds[1] < ybnds[2] )”

This specific error means that the ‘xbnds’ would like to see the lowest bound to the highest bound.

  • Hexagons do not match up and there is a star looking negative space between the ‘bins’. You just need to swap the axis.

But luckily this is a 1 button fix :)  :


Oh one more thing, here are two good hexbin files that you can use for custom shapes.  Add these to the \My Tableau Repository\Shapes\My Custom Shapes\ directory.



Here is the working Tableau Packaged workbook.  Note that due to the R integration, I cannot upload this to Tableau Public.

Using an R function and Tableau 8.1 to map custom areas

The R Project for Statistical Computing has a smörgåsbord of functions available to use.  This is obvious to anyone who has spent 30 minutes with the program.  Tableau is also equally awesome and really excels for displaying maps.  Again this should be manifest for anyone who’s spent 3 minutes with the product.
(Finished Product)

Steps to Mapping a District using Tableau and R.

Tableau 8.1 now includes R integration.  This is my first major application of R and Tableau.  I had played around with it before, using seasonal decomposition to remove the seasonal swings, but that was mainly for a personal view of the data and not anything to display publically.   This post will outline how to do something very cool by combining Tableau’s easy mapping features AND R’s powerful packages.   (want to skip over all these boring instructions? here is the finished workbook)

Basic setup needed before you attempt anything in Tableau (hold your horses we’ll get there soon):

Step 1:  Install R and R Studio.
Step 2:  Within R Studio, install the Rserve package.

Tools -> Install Packages

Type Rser and choose Rserve (and type the “ve” if you’re the OCD – intellisense hater type).
Choose Rserve.

Step 3:  Start up Rserve from the Rstudio Console:

> library(Rserve);
> Rserve()

Step 4: Connect Tableau to your local Rserve and test the connection

Use localhost and then test the connection:

BAM! You’re ready to rock with the data rock stars!   \m/ (>.<) \m/
But now onto the hard stuff in Tableau (see that didn’t take much time):

Fields you need:

  • Points you want to map.
  • A Group or Grouping Hierarchy (if need be just create a calculated field Group = “group”)
  • Latitude
  • Longitude

Tableau calculated Fields:

Size: = SIZE()

R-Derived Calculated Fields (In Tableau):

SCRIPT_REAL(“X<- matrix(,nrow=.arg1, ncol=2 );X[,1]<-.arg2;X[,2]<-.arg3;”+
“Y <- (ifelse(X[,1] %in% X[chull(X)],1,0));”+
“Z <- matrix(0,ncol = 1,nrow =.arg1);Z[c(chull(X)),1] <- seq_along(chull(X));Z”,[Size],attr([Lat]),attr([Long]) )

2nd Tableau calculated Field:

case [Script_PolyOrder] when 0.0 then ‘aInterior’
else ‘Exterior’

The first field “Script_PolyOrder” simply determines the polygon order of the points.  The second script then determines whether the point is interior or exterior.  I forced ‘Interior’ into ‘aInterior’ to make it come before Exterior.  “Why didn’t your just rename ‘Exterior’ ‘Outside’?”  Okay I realize that now as I write this blog, but I was going for pure function when I wrote this Tableau file.  Below are two additional optional fields, which are useful for experimenting and determining how Tableau sees things.  I will go more into this later.

SCRIPT_REAL(“X<- matrix(,nrow=.arg1, ncol=2 );X[,1]<-.arg2;X[,2]<-.arg3;”+
“Y <- (ifelse(X[,1] %in% X[chull(X)],1,0));”+
“Z <- X*Y;X[,1]“,[Size],attr([Lat]),attr([Long]) )

SCRIPT_REAL(“X<- matrix(,nrow=.arg1, ncol=2 );X[,1]<-.arg2;X[,2]<-.arg3;”+
“Y <- (ifelse(X[,1] %in% X[chull(X)],1,0));”+
“Z <- X*Y;X[,2]“,[TotalItems],attr([Lat]),attr([Long]) )

Building the Map:

Drag the pills on the map like this:blog_pic5

And dual axis the 2nd Lat pill.  The dual axis part can be done later if you’d likeIt is somewhat nice for clarity to build the different parts separately and THEN see them some together.

Make the first Lat a Polygon.  Drag “Group” onto the detail shelf.  Uncheck Aggregate Measures in the Analysis Menu.  Drag Script_PolyOrder onto path and WHAM you have a polygon!

Now this is the important part for visualization.  Make the 2nd Lat pill the Circle (or shape if you want to) and make sure the first Lat pill is the polygon.  This is important because Tableau places the 2nd pill over the top of the first one.

Shelf Examples:

 BAD shelf / Bad Vis example:


Good Shelf / Good Viz Example:



Now you have an outline for the “Group” and the interior points for that group do not matter to the polygon.  Thus preventing any sort of jagged border that looks ugly.

Last Step:

Drag the calculate field Script_In_Ext onto the page shelf.  Check “Show History”.  Move the page shelf to the last page “Ext”.  Adjust the “Fade” to fade out the previous.   Now you are finished and it should look amazing.  Keep reading if you want to know the details of how it was achieved:

Alternate strategy:

There is an another option I toyed with:

Basically if you already know your polygon order, and do not wish to have a dynamic polygon drawing, you can ignore using the page shelf and R entirely.  Create a uniqueID which separates interior points from themselves and groups exterior points into the same group.


case [Script_In_Ext] when ‘In’ then “Str”+str(Attr([LowerHierarchy]))
else “D” + str(attr([NextLevelUpHierarchy]))

Add this to the shelf in place of your lowest hierarchy.  In essence this calculated field dumps interior points into their own unique 1-point polygon and groups all the exterior points into 1 polygon.  This was the first way I solved it, but looking back this is slightly more complicated AND requires me to know the polygon order.  Adding in the polygon order in Excel is another post but it is possible and maybe preferable in some circumstances.

One benefit of this is by integrating R somewhere earlier in the pipe (Say with Rexcel or KNIME) you don’t need to bog down the visualization server with calculations (and recalculations).  This is a good solution for many cases.

Tableau and R

Tableau wants to see Matrices and vectors from R.  You can only pass 1 column of a matrix back to Tableau.

Lets go over Script_Lat.  In this calculated field, you will pass the latitude and longitude to R, and get R to pass back the Latitude.  This is purely illustrative and not needed to actually function

SCRIPT_REAL(“X<- matrix(,nrow=.arg1, ncol=2 );X[,1]<-.arg2;X[,2]<-.arg3;”+
“Y <- (ifelse(X[,1] %in% X[chull(X)],1,0));”+
“Z <- X*Y;X[,1]“,[Size],attr([Lat]),attr([Long]) )

First function:

X<- matrix(,nrow=.arg1, ncol=2 );

This creates a matrix X with a variable number of rows and 2 columns.  This number of rows is based on the Size calculated field, which is the first argument after the R code in the Tableau

2nd & 3rd Functions:


Load Column 1 of matrix X with the 2nd argument, which is latitude, and load longitude into the 2nd column.

4th Function the confusing part:

Y <- (ifelse(X[,1] %in% X[chull(X)],1,0));

Basically save a matrix Y of the results of an if-then statement of points of X that are in the Convex Hull results of X.  I know it’s hard to describe:

5th function:

Z <- X*Y;

Save a Matrix Z which is matrix X and vector Y note that this is an “Element-Wise Multiplication” and not true matrix multiplication.  This zeros outs any Interior points leaving only exterior points.  The Tableau calculated field then uses this.  This is a screen grab of Z within RStudio using random data showing zeros for the interior points.

6th function:


Now return Latitude back.  X[,1] X[,2] or Z[,1] .  This is the last step Tableau expects to see.  Finished Calc’d field in Tableau:

Possible Errors:

You may get this error (possibly many times :D ) :blog_pic10

Press details.  It could be that you need to start Rserve.  Or you could need re-do the Size field (configuring the size field was difficult, play with Aggregate Measures / use Tableau’s Total fields or generally adjust so you’re sending R the correct # of elements to R). Or you did some sort of error within R.  If the problem is R syntax, I suggest following the trail within RStudio to determine the error.  Use this to generate random points and start debugging:
X <- matrix(stats::rnorm(100), ncol = 2)


Personal Traffic patterns

For the past two years I have been using OpenPaths to analyze my personal movement data.  I looked at the data through Tableau and came up with some cool views. I decided to slice the data in new ways by using longitude and latitude binning. This easy Excel map hack gave me ideas and I then created a variable latitude/longitude binning method in Tableau.  One problem of this approach is that the bin sizes are pretty static.  The variable zoom method uses a formula of Round([x]*[Zoom],0)/[Zoom]  instead of just rounding to a magnitude of 10.  So for instance, placing the formula =ROUND(A1*4,0)/4 into excel will round the values to the nearest quarter.  A formula of =ROUND(A1*100,0)/100 will round it to two digits.

I used that binning method to create a series of maps tracking my personal location.  In this first map I have a general record of two paths I took to work.  Each bin is colored by the number of records.  Red is high, green is low.


The next map shows those two routes.  I took the average velocity for that bin and colored it by that.  Lower velocity is in blue while a higher velocity is in red.  What is interesting is that the route to job 1 was much faster, while route 2 had several sections of slowness (bottlenecks)


This third map is interesting, showing the effect of a traffic light on velocity.  I only analyzed this location on the afternoon commute (when I traveled north), so you can see a blue coloring leading to the intersection and immediately after, followed by a bright red (fast) bin.  As velocity is calculated using the distance from the previous measurement, and a congested intersection takes a little bit of time to clear out, it makes sense that this would lag the intersection some.


Full listing of Obamacare Health Care Plans. Price visualization.

So here is a Tableau visualization of all the Obamacare health plans for states that elected to use the federal exchange (not sure about the states where the Feds and state Gov’t split some responsibility). Here is a breakout map. The values differ by county so I am showing the averages. Full visualization of all the health plans can be found here.

Clicking on any circle, or highlighting a section will bring up those plans below. You can then sort and filter as needed.

Tenure and the important of direction and movement in a graphic.

Here is a tenure change report that I did awhile ago in a data viz class. I wanted to post it because I think the page shelf in Tableau is under-utilized. All the pages are shown at once by using the ‘show history’ option in Tableau. This example here shows how tenure in some positions have changed, but others have not. I have two graphs showing the same thing, but in the chart at the bottom I have turned everything to the right. Here are some tips I’ve stumbled upon:

  • A dotted line seems to show ‘something is changing’ better than a solid. It shows impermanence.
  • The past tends to fade. Fade (adding transparency or whiteness) seems to indicate the past better than a change in hue.
  • Keep the change 1 dimensional. After all our dimension of time is one dimensional (and so far one directional).  Sometimes you can get away with 2 though as Hans Rosling has done in
  • Up does not mean the same as Down, Left, or Right.

In this chart, I have deliberately chosen Up rather than right.

2nd chart: Notice how everything looks ‘faster’

Although the 1st graph has a wider professor category area, the Number of Positions axis on both are relatively close in size. (Please note that you need to measure to 160k, I forced it to 180k on the 2nd graph to get the label in). However the small changes in perspective change this graph completely.

Top paid Georgia University Employees

Recently I read an article from the AJC showing the top paid GA State employees. Their article can be found here.  (note that at the time of this publishing their website does not link to the article).  I have made a dashboard showing who gets paid the most by university.  Travel and Salary expenses are included. First of all,this excludes the coaches, who get paid a large amount.  Secondly if the person started the position half-way through the year then their pay is only 1/2 what it should have been (i.e. Savannah State’s President). Third thing to know is that there was no Part-Time or Full-Time designation in the data I received. As such you will have to decide on an arbitrary ‘part-time cut off’ using the slider at the top. I have decided on a $10,000 level. You can put in $15k if you wish. It’s really not certain what part-time is. This could make a huge difference.

The ‘ratio’ column is the ratio between the highest-paid person and the average pay for the university (again that average pay is dependent on the slider).

The graph shows the university’s average pay, median pay (both based on the filter), max pay, and the president’s pay.  The person’s name to the right is the highest paid person for that university,  If you hover over the colored box next to their name you will see their position.

All the data came from It includes Travel and Salary. One special thing to note is that apparently some people may be paid by the university foundations and that pay may not be included in this dashboard.

Full view is here if the ads get in the way