Sunday, January 20, 2013

Business analytics - plotting Geo-tagged data

In this section the challenge is to explore data comparing digital and traditional media options. The digital media is priced differently from the print. The result is not the clearest visualization in the world, but it made for a great learning experience.

To create the geographic plot I used the maps package together with ggplot2, which is based on Leland Wilkinson's book The Grammar of Graphics. I had read the book in 2007, but found it disappointing that the system had been implemented only in a proprietary format. ggplot2, however, brought this powerful idea into the open-source world of R.

After researching a number of options I eventually decided on ggplot2. I preferred it because it reduced the code to a simple and clean form. As you can see I've improved the documentation format to include loading and installation of required packages.
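For readers who want to try the same approach, here is a minimal sketch of a geo-tagged point plot built with ggplot2 on top of outlines from the maps package. It is not the code behind the figure: the `sales` data frame, its column names, the coordinates, and the choice of US state outlines are all assumptions made up for illustration.

```r
# Install any required packages that are missing, then load them.
required <- c("ggplot2", "maps")
to_install <- setdiff(required, rownames(installed.packages()))
if (length(to_install) > 0) install.packages(to_install)
library(ggplot2)
library(maps)

# Hypothetical geo-tagged sales records (invented for this sketch).
sales <- data.frame(
  lon   = c(-74.0, -87.6, -118.2, -95.4),
  lat   = c(40.7, 41.9, 34.1, 29.8),
  media = c("digital", "paper", "digital", "paper")
)

# Outlines from the maps package as a data frame ggplot2 can draw.
# Assumption: US states; swap in another map for a different region.
states <- map_data("state")

p <- ggplot() +
  # Base layer: one polygon per state outline.
  geom_polygon(data = states,
               aes(x = long, y = lat, group = group),
               fill = "grey95", colour = "grey60") +
  # Sales layer: one point per record, coloured by media type.
  geom_point(data = sales,
             aes(x = lon, y = lat, colour = media),
             size = 3, alpha = 0.7) +
  scale_colour_manual(values = c(digital = "red", paper = "black")) +
  # Approximate map projection that keeps the aspect ratio sensible.
  coord_quickmap() +
  labs(x = "Longitude", y = "Latitude", colour = "Media")

print(p)
```

Keeping the map outline and the sales points in separate layers, each with its own data frame, is what keeps the code short: changing the region or the colour scheme touches only one line.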

Geo-tagged sales (digital in red, paper in black)

I ended up with a small annoyance - the center of each plotted point contains a black dot, which makes this look like the results of a military exercise.

ggplot is very intuitive but does require a couple of hours to figure out. To get you started, here are a few articles and books which I found worth mentioning.
  • The ggplot tutorial I used to get started.
  • ggplot documentation I used to fix the details.
  • If R is missing your map it is possible to get more via OpenStreetMap as shown here.
  • An advanced map graphic inspired by the Financial Times 2004 election results from here, though it is not very clear what is really going on there.
  • There is a book covering this library - ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham.
  • Finally the text "R Graphics" by Paul Murrell (2006) covers using the maps package without ggplot.