Visualisation on Top 100 Chemical Companies with Google Refine and Google FusionTable tools

On 29th November, Plant Life team (3 Developers Hackers  and 4 Journalists Hacks) managed to implement a visualisation task on top 100 Chemical Companies in less than 7 hours and won the runner up prize for the first RBI Hacks and Hacker Day event

The result look like those on http://www.google.com/fusiontables/DataSource?dsrcid=332788. If you click on the Visualize menu, try some of the options. Also click on the ‘Show Options’ to allow for filtering.

Basically the original data was in Excel spreadsheet which only has company name and relevant financial figure. So we decided to utilise two new google tools Google Refine and Google Fusion Table to visualise data.

First Part: Data Cleansing and Geocode generation, Google Refine can manipulate data easily as it explained in those three short videos listed on their website. The task we encountered was to obtain each company’s geocode. Fortunately, Google offers another Geocode API which can return geocode based on address being passed to. So with helps from four Journalists Hacks, we manged to mapped the top 50 company addresses with their geocode.

Google refine is application running on the local machine, the interface would look like the one blow when you run google-refine.exe.

The following screen shots explain how we map Geocode from Google API for all top 100 companies.

  • Click the address column where company addresses are stored, then chose edit column/add column by fetching urls that will bring on another screen as below.

  • Build up the address to Google Geocode API, and make sure your expression is valid by adding more condition to bypass Null value if they are any in your column.
  • The API will return value as this: { “status”: “OK”, “results”: [ { “types”: [ “street_address” ], “formatted_address”: “Carl-Bosch-Straße 38, 67063 Ludwigshafen, Germany”, “address_components”: [ { “long_name”: “38”, “short_name”: “38”, “types”: [ “street_number” ] }, { “long_name”: “Carl-Bosch-Straße”, “short_name”: “Carl-Bosch-Straße”, “types”: [ “route” ] }, { “long_name”: “Ludwigshafen”, “short_name”: “Ludwigshafen”, “types”: [ “sublocality”, “political” ] }, { “long_name”: “Ludwigshafen”, “short_name”: “Ludwigshafen”, “types”: [ “locality”, “political” ] }, { “long_name”: “Ludwigshafen am Rhein”, “short_name”: “Ludwigshafen am Rhein”, “types”: [ “administrative_area_level_2”, “political” ] }, { “long_name”: “Rheinland-Pfalz”, “short_name”: “RP”, “types”: [ “administrative_area_level_1”, “political” ] }, { “long_name”: “Germany”, “short_name”: “DE”, “types”: [ “country”, “political” ] }, { “long_name”: “67063”, “short_name”: “67063”, “types”: [ “postal_code” ] } ], “geometry”: { “location”: { “lat”: 49.4946500, “lng”: 8.4360000 }, “location_type”: “ROOFTOP”, “viewport”: { “southwest”: { “lat”: 49.4915024, “lng”: 8.4328524 }, “northeast”: { “lat”: 49.4977976, “lng”: 8.4391476 } } }, “partial_match”: true } ] }
  • We are only interesting in the geocode it returned in red. So we can chose edit column/add column based on this column that will pop up the window below:

  • Again fill in the expression with(value.parseJson().results[0].geometry.location, pair, pair.lat + “, ” + pair.lng) will filter out the geocode obtaining from Google API

Second Part Google Fusion Table: The next step is  to load the well formated data into Google Fusion Table.

  • Go to http://www.google.com/fusiontables/Home (assume you already have a gmail account)
  • Click New table button on the left hand top corner, then Import Data
  • Click Visualize, choose Map option.
  • Click configure styles to configure the map makers
  • Click configure Window to customise the template for pop up information window, we used a template as below
  • The Merge option next to visualize allow you merging different tables by one common column which is very handy without any database work involved.
  • The visualize option also has much more offerings, such as traditional Bar, Line, Pie Charts. However, the one thing particularly interesting is Intensity Map type.  Blow is the Sales volume on Map based on all top 100 companies 09’s sales figure. So you can easily spot the big Germany BASF (Dark Green Circle)comparing to others in terms of sales.

Finally, Google Fusion Table gives the ability to embed those charts/map into your own website by clicking “get embeddable codes”.


Advertisements

About data visualisation

Software developer interested in Web Data Visualisation
This entry was posted in Data Visualisation and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s