Standards used in JISC programmes and projects over time

Today I took part in an introduction to R workshop held at The University of Manchester. R is a software environment for statistics, and while it does all sorts of interesting things that are beyond my ability, one thing I can grasp and enjoy is exploring the packages available for R. These packages extend R's capabilities and let you do all sorts of cool things in a couple of lines of code.

The target I set myself was to use JISC CETIS Project Directory (PROD) data and find a way of visualising the standards used in JISC funded projects and programmes over time. I found a Google Visualisation package for R and, using this, I was surprised at how easy it was to generate an output; the hardest bits were manipulating the data (and thinking about how to structure it). Although my output from the day is incomplete, I thought I'd write up my experience while it is fresh in my mind.

First I needed a dataset of projects, start dates, standards and programmes. I got the results in CSV format using the sparqlproxy web service that I use in this tutorial, and stole and edited a query from Martin.

SPARQL:

PREFIX rdfs:
PREFIX jisc:
PREFIX doap:
PREFIX prod:
SELECT DISTINCT ?projectID ?Project ?Programme ?Strand ?Standards ?Comments ?StartDate ?EndDate
WHERE {
?projectID a doap:Project .
?projectID prod:programme ?Programme .
?projectID jisc:start-date ?StartDate .
?projectID jisc:end-date ?EndDate .
OPTIONAL { ?projectID prod:strand ?Strand } .
# FILTER regex(?Strand, "^open education", "i") .
?projectID jisc:short-name ?Project .
?techRelation doap:Project ?projectID .
?techRelation prod:technology ?TechnologyID .
FILTER regex(str(?TechnologyID), "^http://prod.cetis.ac.uk/standard/") .
?TechnologyID rdfs:label ?Standards .
OPTIONAL { ?techRelation prod:comment ?Comments } .
}
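Pulling those results straight into R is only a couple of lines. A minimal sketch, assuming the proxy returns CSV for a URL-encoded query (the endpoint address, parameter names and variable names here are placeholders of mine, not the actual sparqlproxy details):

query <- "..."                                      # paste the full SPARQL query above here
proxy <- "http://example.org/sparqlproxy.php"       # placeholder for the sparqlproxy endpoint
url   <- paste0(proxy, "?output=csv&query=", URLencode(query, reserved = TRUE))
prod_raw <- read.csv(url, stringsAsFactors = FALSE) # one row per project/standard pair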

From this I created a pivot table of all the standards and how often they appeared in projects and programmes for each year (using the project start date); a rough sketch of that aggregation in R is below. After importing the pivot into R, it took two lines to grab the googleVis package and plot it as a Google Visualisation Chart.
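For completeness, a sketch of building that pivot inside R instead, assuming the query results are in prod_raw with the column names from the SELECT above (StartDate assumed to begin with a four-digit year; prod_pivot plays the role of the prod_csv data frame used below):

prod_raw$Year <- as.numeric(substr(prod_raw$StartDate, 1, 4))      # year of project start
projects   <- aggregate(projectID ~ Standards + Year, data = prod_raw,
                        FUN = function(x) length(unique(x)))       # distinct projects per standard per year
programmes <- aggregate(Programme ~ Standards + Year, data = prod_raw,
                        FUN = function(x) length(unique(x)))       # distinct programmes per standard per year
prod_pivot <- merge(projects, programmes, by = c("Standards", "Year"))
names(prod_pivot)[3:4] <- c("Projects", "Programmes")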

library("googleVis")
M <- gvisMotionChart(data = prod_csv, idvar = "Standards", timevar = "Year", chartid = "Standards")

This gives you the 'Hans Rosling' style motion chart. I can't get it to embed in my WordPress blog, but you can click the diagram to view the interactive version. The higher up a standard is, the more projects it appears in; the further across it goes, the more programmes it spans.

Google Visualisation Chart
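Viewing the chart locally is one more line, and (if I've remembered the googleVis API correctly) print() with a tag argument spits out just the chart HTML, which might be one route to embedding it; both lines are from memory rather than something I tested today:

plot(M)                  # opens the motion chart in the default browser
print(M, tag = "chart")  # prints the embeddable chart HTML/JavaScript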

Some things it made me think about:

  1. Data from PROD is inconsistent. Standards can be spelt differently, and some programmes/projects might have had more time spent on inputting related standards than others (a tiny clean-up sketch follows this list).

  2. How useful is it? This was extremely easy to do, but is it worth doing? It has value for me because it's made me think about the way JISC CETIS staff use PROD and the sort of data we input, and it was interesting to see the high number of projects across three programmes that involved XCRI in 2008. Would it be of value to anybody else?

  3. Do we need all that data? There are a lot of standards represented in the visualisation. Do we need them all? Could we concentrate on subsets of this data?
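On the spelling point, even a crude normalisation pass before pivoting would catch some of the inconsistency. A minimal sketch, assuming the raw results are in prod_raw as above (the substitution is only an illustrative example, not an actual PROD variant):

prod_raw$Standards <- tolower(prod_raw$Standards)                          # fold case
prod_raw$Standards <- gsub("^\\s+|\\s+$", "", prod_raw$Standards)          # trim stray whitespace
prod_raw$Standards <- gsub("xcri[ -]?cap", "xcri-cap", prod_raw$Standards) # collapse one example spelling variant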

8 thoughts on “Standards used in JISC programmes and projects over time”

  1. Hi David

    This is great – and you raise some very interesting issues about our use and collection of data. I've found that there are lots of standards, and actually even more different types of technologies, in use across programmes. However there are very low instances of lots of them, which makes visualisation and making sense of things quite challenging. So, as we discussed yesterday, maybe what we need to do is a further refinement of the most popular ones, and then have just a simple list of the other instances as an appendix. Lots to think about!

    I'm going to try to do this for a specific programme and see what that looks like.

  2. Hi David,

    You make this look stunningly simple. I'm just discovering the tribulations of R myself and, like you, still find data manipulation a bit of a head-scratcher. For example, I would never have known that as.data.frame(table(dataset$acolumnname)) would look at acolumnname in a dataset and create a frequency table!

    Anyway great work and I look forward to learning more about R from the both of you

    Martin

  3. Hi Martin,

    I have been finding data manipulation hard (and sometimes doing it in a Google spreadsheet first). There are quite a few things that I wouldn't have worked out on my own. I find R's huge community is a real help for things like this.

  4. Pingback: OER Visualisation Project: Exploring automated reporting using linked data and R/Sweave/R2HTML [day 36] – MASHe

  5. Pingback: OER Visualisation Project: Fin [day 40.5] – MASHe
