Programme for International Student Assessment (PISA)

OECD has recently published a survey on Programme for International Student Assessment (PISA),  and Shanghai has firstly taken part into this survey and top all  reading, maths, and science.

Data can be download here. The map can be viewed on (use Chrome)  http://www.google.com/fusiontables/DataSource?dsrcid=347323

P.S. Google has geocoded Chinese Taipei to a Chinese Taipei school in Malaysia (Chinese Taipei School, 1 Persiaran Sungai Selangor, Bukit Rimau, 40460 Shah Alam, Selangor, Malaysia) ;o)

Posted in Data Visualisation | Leave a comment

Oxbridge Success Rate by local education agency (LEA)

Guardian data blog published an article about data visualisation on analysis of Oxford and Cambridge success rate. It also offered the raw data on Google doc for anyone to do something with this data. Oxbridge elitism: how many black and poor students go to Oxford and Cambridge?

So with Google Fusion table, we can produce a intensity map displaying all LEAs which have more than 400 applicants to both universities and more than 20% success rate in Year 2009. The table is made public for everyone to view. http://www.google.com/fusiontables/DataSource?dsrcid=342879 

Intensity Map only works well in Google Chrome

LEA (>400 Applicants) to Cambridge               LEA (>400 Applicants) to Oxford

Posted in Data Visualisation | Leave a comment

Visualisation on Top 100 Chemical Companies with Google Refine and Google FusionTable tools

On 29th November, Plant Life team (3 Developers Hackers  and 4 Journalists Hacks) managed to implement a visualisation task on top 100 Chemical Companies in less than 7 hours and won the runner up prize for the first RBI Hacks and Hacker Day event

The result look like those on http://www.google.com/fusiontables/DataSource?dsrcid=332788. If you click on the Visualize menu, try some of the options. Also click on the ‘Show Options’ to allow for filtering.

Basically the original data was in Excel spreadsheet which only has company name and relevant financial figure. So we decided to utilise two new google tools Google Refine and Google Fusion Table to visualise data.

First Part: Data Cleansing and Geocode generation, Google Refine can manipulate data easily as it explained in those three short videos listed on their website. The task we encountered was to obtain each company’s geocode. Fortunately, Google offers another Geocode API which can return geocode based on address being passed to. So with helps from four Journalists Hacks, we manged to mapped the top 50 company addresses with their geocode.

Google refine is application running on the local machine, the interface would look like the one blow when you run google-refine.exe.

The following screen shots explain how we map Geocode from Google API for all top 100 companies.

  • Click the address column where company addresses are stored, then chose edit column/add column by fetching urls that will bring on another screen as below.

  • Build up the address to Google Geocode API, and make sure your expression is valid by adding more condition to bypass Null value if they are any in your column.
  • The API will return value as this: { “status”: “OK”, “results”: [ { “types”: [ “street_address” ], “formatted_address”: “Carl-Bosch-Straße 38, 67063 Ludwigshafen, Germany”, “address_components”: [ { “long_name”: “38”, “short_name”: “38”, “types”: [ “street_number” ] }, { “long_name”: “Carl-Bosch-Straße”, “short_name”: “Carl-Bosch-Straße”, “types”: [ “route” ] }, { “long_name”: “Ludwigshafen”, “short_name”: “Ludwigshafen”, “types”: [ “sublocality”, “political” ] }, { “long_name”: “Ludwigshafen”, “short_name”: “Ludwigshafen”, “types”: [ “locality”, “political” ] }, { “long_name”: “Ludwigshafen am Rhein”, “short_name”: “Ludwigshafen am Rhein”, “types”: [ “administrative_area_level_2”, “political” ] }, { “long_name”: “Rheinland-Pfalz”, “short_name”: “RP”, “types”: [ “administrative_area_level_1”, “political” ] }, { “long_name”: “Germany”, “short_name”: “DE”, “types”: [ “country”, “political” ] }, { “long_name”: “67063”, “short_name”: “67063”, “types”: [ “postal_code” ] } ], “geometry”: { “location”: { “lat”: 49.4946500, “lng”: 8.4360000 }, “location_type”: “ROOFTOP”, “viewport”: { “southwest”: { “lat”: 49.4915024, “lng”: 8.4328524 }, “northeast”: { “lat”: 49.4977976, “lng”: 8.4391476 } } }, “partial_match”: true } ] }
  • We are only interesting in the geocode it returned in red. So we can chose edit column/add column based on this column that will pop up the window below:

  • Again fill in the expression with(value.parseJson().results[0].geometry.location, pair, pair.lat + “, ” + pair.lng) will filter out the geocode obtaining from Google API

Second Part Google Fusion Table: The next step is  to load the well formated data into Google Fusion Table.

  • Go to http://www.google.com/fusiontables/Home (assume you already have a gmail account)
  • Click New table button on the left hand top corner, then Import Data
  • Click Visualize, choose Map option.
  • Click configure styles to configure the map makers
  • Click configure Window to customise the template for pop up information window, we used a template as below
  • The Merge option next to visualize allow you merging different tables by one common column which is very handy without any database work involved.
  • The visualize option also has much more offerings, such as traditional Bar, Line, Pie Charts. However, the one thing particularly interesting is Intensity Map type.  Blow is the Sales volume on Map based on all top 100 companies 09’s sales figure. So you can easily spot the big Germany BASF (Dark Green Circle)comparing to others in terms of sales.

Finally, Google Fusion Table gives the ability to embed those charts/map into your own website by clicking “get embeddable codes”.


Posted in Data Visualisation | Tagged , | Leave a comment