Getting data out of PROD and its triplestore

For a while I have been wondering about the best way to write a how-to guide on getting data out of the JISC CETIS project directory (PROD), and in particular out of its linked data triplestore. A few weeks ago Martin Hawksey posted some great examples of work he’s been doing, including maps built from data generated by PROD. I thought they would make a good starting point for a how-to guide.

Don’t be put off by scary terms: I think these things are relatively easy to do, and I’ve left out as much technobabble as possible. The difficulty really lies in knowing where the various resources live and in picking up a few useful tricks. I’ve split the instructions into 3 steps.

  1. Getting data out of PROD and into a Google Spreadsheet
  2. Getting Institution, Long and Lat data out of PROD
  3. Mapping with Google Maps

The steps currently live in a Google Doc while I update them. I’ve also created short screencasts of me following the instructions in case anybody gets lost. Hopefully by the end you will have built up enough confidence to edit the queries for different results, tweak the Google Spreadsheet Mapper to change the look and feel of your map, or explore some of the technologies behind the techniques.
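If you would rather script the first two steps than work in a spreadsheet, here is a minimal sketch of the general idea: build a SPARQL query URL for an endpoint, then flatten the JSON results into CSV rows. The endpoint URL, the prefix and predicates in the query, and the sample response are all placeholders I have made up for illustration, not PROD’s real ones. The only thing assumed to be standard is the SPARQL JSON results layout (`head.vars` and `results.bindings`).

```python
import csv
import io
from urllib.parse import urlencode

# Hypothetical endpoint: swap in the real SPARQL endpoint of the store you query.
ENDPOINT = "http://example.org/sparql"

# Illustrative query only; the prefix and predicates are assumptions,
# not PROD's actual vocabulary.
QUERY = """
PREFIX ex: <http://example.org/schema#>
SELECT ?project ?institution ?lat ?long
WHERE {
  ?project ex:institution ?institution .
  ?institution ex:lat ?lat ;
               ex:long ?long .
}
LIMIT 10
"""

# Made-up sample response in the standard SPARQL JSON results format.
SAMPLE = {
    "head": {"vars": ["project", "institution", "lat", "long"]},
    "results": {
        "bindings": [
            {
                "project": {"type": "literal", "value": "Example Project"},
                "institution": {"type": "literal", "value": "Example University"},
                "lat": {"type": "literal", "value": "53.5"},
                "long": {"type": "literal", "value": "-2.4"},
            }
        ]
    },
}


def build_query_url(endpoint, query):
    """Return a GET URL that carries the query in a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def results_to_csv(results):
    """Flatten a SPARQL JSON results document into CSV text."""
    variables = results["head"]["vars"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(variables)
    for binding in results["results"]["bindings"]:
        # A variable can be unbound in a row, so fall back to an empty cell.
        writer.writerow([binding.get(v, {}).get("value", "") for v in variables])
    return out.getvalue()


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
    print(results_to_csv(SAMPLE))
```

The CSV text can be saved to a file and imported into a spreadsheet, which gets you to roughly the same place as the spreadsheet steps in the guide.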

You’ll want the Step By Step Guide and you can see Sheila’s example in her post here.

Obtain Data from PROD (via Talis store) to populate a Google Doc

http://youtu.be/U5FXuNmpqN4

Getting Prod Project Data w/ Long Lat Data into Google Spreadsheet

Mapping PROD Data with Google Maps
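For anyone curious about what sits behind the mapping step, the sketch below turns rows of institution, latitude and longitude into a small KML file, a format that Google’s mapping tools can import. This is not the Google Spreadsheet Mapper itself, just a hand-rolled illustration of the same idea; the rows are made up and the function name `rows_to_kml` is my own.

```python
from xml.sax.saxutils import escape

# Made-up rows in the shape step 2 produces: (institution, latitude, longitude).
ROWS = [
    ("Example University", "53.5", "-2.4"),
    ("Sample & Test College", "55.9", "-3.2"),
]


def rows_to_kml(rows):
    """Build a minimal KML document with one Placemark per row."""
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<kml xmlns="http://www.opengis.net/kml/2.2">',
        "<Document>",
    ]
    for name, lat, lon in rows:
        # KML coordinates are written longitude first, then latitude.
        coords = f"{float(lon)},{float(lat)}"
        parts.append(
            f"<Placemark><name>{escape(name)}</name>"
            f"<Point><coordinates>{coords}</coordinates></Point></Placemark>"
        )
    parts += ["</Document>", "</kml>"]
    return "\n".join(parts)


if __name__ == "__main__":
    print(rows_to_kml(ROWS))
```

The easy thing to get wrong is the coordinate order: spreadsheets usually hold latitude then longitude, while KML wants longitude first.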

Visualisation session at the CETIS conference. Thoughts and resources.

We are 34 days away from the CETIS conference. On day two I have signed up to a session on Social Network Analysis and Data Visualisation being run by Sheila and Lorna. I’m really looking forward to it, as recently I have been thinking about visualisations: what they mean and how they can be used most effectively. I have found understanding them quite difficult. I am only just getting my head around the area and hope that the session might be a hub for the experienced to share some of their protips. I thought that airing some of my questions and sharing some favourite resources might be a good way to get the tips rolling in and a conversation going before the event. I guess that everybody at the session will have his or her own interests and questions, and I would be interested to know what these are.

Tips please

Some of the questions I have:

  • When are visualisations useful, when are they not and what makes a good visualisation?
  • When is a visualisation more than a bunch of lines connecting things?
  • What models sit behind the visualisation?
  • How do you find and validate good data, particularly data about social networks?
  • What are the most effective ways of building visualisations, and are there any tips on development environments?

Resources

I’d also be grateful for any resources you think might be useful. I’ll start with two I’m working with at the moment.

  • A GitHub repository belonging to Adam Cooper with examples to find emergent trends and “weak signals” in paper abstracts.
  • A handy book that aims to “introduce the principles of statistics and modern statistical analysis for a non-mathematical audience”. It does this well and introduces R at the same time.

Looking forward to the session and a protip from me.