A Seasonal Sociogram for Learning Analytics Research

SoLAR, the Society for Learning Analytics Research, has recently made available a dataset covering research publications in learning analytics and educational data mining. It has also issued the LAK Data Challenge, which asks the community to use the dataset to answer the question:

What do analytics on learning analytics tell us? How can we make sense of this emerging field’s historical roots, current state, and future trends, based on how its members report and debate their research?

Thanks to too many repeats on the TV schedule I managed to re-learn a bit of novice-level SPARQL and manipulate the RDF/XML provided into a form I can handle with R.

Now, I’ve had a bit of a pop at sociograms – i.e. visualisations of social networks – in the past, but they do have their uses, and one of these is getting a feel for the shape of a dataset that deals with relations. In the LAK challenge dataset, the relationship between authors and papers is such a case. So, as part of thinking about whether I’m up for approaching the challenge from this perspective, it makes sense to visualise the data.

And with it being the Christmas season, the colour scheme chose itself.

Paper Authorship for Proceedings from LAK, EDM and the JETS Special Edition on Learning and Knowledge Analytics

This is technically a “bipartite sociogram” since it shows two kinds of entity and the relationships between them. In this case people are shown as green circles and papers are shown as red polygons. The data has been limited to the conferences on Learning Analytics and Knowledge (LAK) 2011 and 2012 (red triangles) and the Educational Data Mining (EDM) Conference for the same years (red diamonds). The Journal of Educational Technology and Society special edition on learning and knowledge analytics, published in 2012, is also included (red pentagons). Thus, we have a snapshot of the main venues for scholarship vicinal to learning analytics.
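For readers who want to see the structure behind such a diagram, here is a minimal sketch in Python of a bipartite author-paper graph held as plain dictionaries. It is not the code used to produce the image (that was done in R), and the author names, paper identifiers, and the `build_bipartite` helper are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical (author, paper, venue) records; the names are invented for
# illustration and are not taken from the LAK challenge dataset.
AUTHORSHIP = [
    ("author_a", "paper_1", "LAK"),
    ("author_b", "paper_1", "LAK"),
    ("author_a", "paper_2", "EDM"),
    ("author_c", "paper_2", "EDM"),
    ("author_d", "paper_3", "JETS"),
]


def build_bipartite(records):
    """Return (papers_by_author, authors_by_paper, venue_by_paper)."""
    papers_by_author = defaultdict(set)
    authors_by_paper = defaultdict(set)
    venue_by_paper = {}
    for author, paper, venue in records:
        papers_by_author[author].add(paper)
        authors_by_paper[paper].add(author)
        venue_by_paper[paper] = venue
    return dict(papers_by_author), dict(authors_by_paper), venue_by_paper


papers_by_author, authors_by_paper, venue_by_paper = build_bipartite(AUTHORSHIP)

# Every edge joins one author node to one paper node, never author-author
# or paper-paper, which is what makes the graph bipartite.
edges = sorted((a, p) for a, papers in papers_by_author.items() for p in papers)
print(len(papers_by_author), "authors,", len(authors_by_paper), "papers,", len(edges), "edges")
```

The venue lookup plays the role that the polygon shapes play in the diagram: it is an attribute of the paper nodes only.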

So, what does it tell me?

My first observation is that there are a lot of papers that have been written by people who have written no others in the dataset for 2011/12 (from now on, please assume I always mean this subset). I see this as being consistent with this being an emergent field of research. It is also clear that JETS attracted papers from people who were not already active in the field. This is not the entire story, however, as the more connected central region of the diagram shows. Judging this region by eye and comparing it to the rest of the diagram, it looks like there is a tendency for LAK papers (triangles) to be under-represented in the more-connected region compared to EDM (diamonds). This is consistent with EDM conferences having been run since 2008 and their emergence from workshops on the Artificial Intelligence in Education. LAK, on the other hand, began in 2011. Some proper statistics are needed to confirm judgement by eye. It would be interesting to look for signs of evolution following the 2013 season.
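As a first step towards those statistics, one could count, for each venue, the share of papers whose authors all appear on only one paper in the dataset. The sketch below assumes the data has been flattened to (author, paper, venue) records. The records and the `isolated_share_by_venue` function are hypothetical, and the result is a descriptive proportion, not a significance test.

```python
from collections import Counter, defaultdict

# Invented records for illustration only; not drawn from the real dataset.
RECORDS = [
    ("author_a", "paper_1", "EDM"),
    ("author_b", "paper_1", "EDM"),
    ("author_a", "paper_2", "EDM"),
    ("author_c", "paper_3", "LAK"),
    ("author_d", "paper_4", "LAK"),
    ("author_b", "paper_5", "LAK"),
    ("author_e", "paper_6", "JETS"),
]


def isolated_share_by_venue(records):
    """Share of each venue's papers whose authors wrote no other paper."""
    authors_by_paper = defaultdict(set)
    venue_by_paper = {}
    for author, paper, venue in records:
        authors_by_paper[paper].add(author)
        venue_by_paper[paper] = venue

    # Number of distinct papers each author appears on.
    papers_per_author = Counter()
    for authors in authors_by_paper.values():
        papers_per_author.update(authors)

    totals, isolated = Counter(), Counter()
    for paper, authors in authors_by_paper.items():
        venue = venue_by_paper[paper]
        totals[venue] += 1
        if all(papers_per_author[a] == 1 for a in authors):
            isolated[venue] += 1
    return {venue: isolated[venue] / totals[venue] for venue in totals}


shares = isolated_share_by_venue(RECORDS)
for venue in sorted(shares):
    print(venue, round(shares[venue], 2))
```

Comparing these shares between LAK and EDM would put a number on the impression gained by eye, although a proper test would still need to allow for the different sizes of the venues.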

A lot of papers were written by people who wrote no others.

The sign of an established research group is the research group head who co-authors several papers, with each paper having some less prolific co-authors who are working for their PhDs: the chief and Indians pattern. A careful inspection of the central region shows this pattern as well as groups with less evidence of hierarchy.

Chief and Indians.

A less hierarchical group.

LAK came into being and attracted people without a great deal of knowledge of the prior existence of the EDM conference and community, so some polarisation is to be expected. There clearly are people, even those with many publications, who have only published at one venue. Consistent with previous comments about the longer history of EDM, it isn’t surprising that this is most clear for that venue since there are clearly established groups at work. What I think will be some comfort to the researchers in both camps who have made efforts to build bridges is that there are signs of integration (see the Chiefs and Indians snippet). Whether this is a sign of integrating communities or a consequence of individual preference alone is an open question. It is another question to consider with more rigour and something to look out for in the 2013 season.
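One simple way to put numbers on polarisation and integration is to list the authors who appear at both LAK and EDM alongside those who appear at a single venue. The sketch below again assumes flattened (author, paper, venue) records. The data and the `venues_by_author` and `bridging_authors` helpers are invented for illustration and say nothing about the real dataset.

```python
from collections import defaultdict

# Invented records for illustration only; not drawn from the real dataset.
RECORDS = [
    ("author_a", "paper_1", "EDM"),
    ("author_a", "paper_2", "EDM"),
    ("author_b", "paper_2", "EDM"),
    ("author_b", "paper_3", "LAK"),
    ("author_c", "paper_3", "LAK"),
    ("author_d", "paper_4", "JETS"),
]


def venues_by_author(records):
    """Map each author to the set of venues they have published at."""
    venues = defaultdict(set)
    for author, _paper, venue in records:
        venues[author].add(venue)
    return dict(venues)


def bridging_authors(records, venue_x="LAK", venue_y="EDM"):
    """Authors with at least one paper at each of the two venues."""
    return sorted(
        author
        for author, venues in venues_by_author(records).items()
        if venue_x in venues and venue_y in venues
    )


bridges = bridging_authors(RECORDS)
single_venue = sorted(
    author for author, venues in venues_by_author(RECORDS).items() if len(venues) == 1
)
print("bridging:", bridges)
print("single venue:", single_venue)
```

Tracking how the bridging list grows or shrinks once the 2013 data is added would be one way to watch for signs of integration.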

Am I any the wiser? Well… slightly, and it didn’t take long. There are certainly some questions that could be answered with further analysis and there are a few attributes not taken account of here, such as institutional affiliation or country/region. I will certainly have a go at using the techniques I outlined in a previous post if the weather is poor over the Christmas break but I think I will have to wait until the data for 2013 is available before some of the interesting evolutionary shape of EDM and LAK becomes accessible.

Merry Christmas!

Looking Inside the Box of Analytics and Business Intelligence Applications

To take technology and social process at face value is to risk failing to appreciate what they mean, do, and can do. Analytics and business intelligence applications or projects, in common with all technology-supported innovations, are more likely to be successful if both the technology and social spheres are better understood. I don’t mean to say that there is no room for intuition in such cases, rather that it is helpful to decide which aspects are best served by intuition and, if so, by whose. But how to do this?

Just looking can be a poor guide to understanding an existing application and just designing can be a poor approach to creating a new one. Some kind of method, some principles, some prompts or stimulus questions – I will use “framework” as an umbrella term – can all help to avoid a host of errors. Replication of existing approaches that may be obsolete or erroneous, falling into value or cognitive traps, failure to consider a wider range of possibilities, etc. are errors we should try to avoid. There are, of course, many approaches to dealing with this problem other than a framework. Peer review and participative design have a clear role to play when adopting or implementing analytics and business intelligence, but a framework can play a part alongside these social approaches as well as being useful to an individual sense-maker.

The culmination of my thinking about this kind of framework has just been published as the seventh paper in the CETIS Analytics Series, entitled “A Framework of Characteristics for Analytics”. This started out as a personal attempt to make sense of my own intuitive dissatisfaction with the traditions of business intelligence. That was combined with concern that my discussions with colleagues about analytics were sometimes deeply at cross purposes or just unproductive, because our mental models lacked sufficient detail and clarity for us to properly know what we were talking about or to really understand where our differences lay.

The following is quoted from the paper.

A Framework of Characteristics for Analytics considers one way to explore similarities, differences, strengths, weaknesses, opportunities, etc of actual or proposed applications of analytics. It is a framework for asking questions about the high level decisions embedded within a given application of analytics and assessing the match to real world concerns. The Framework of Characteristics is not a technical framework.

This is not an introduction to analytics; rather it is aimed at strategists and innovators in post-compulsory education sector who have appreciated the potential for analytics in their organisation and who are considering commissioning or procuring an analytics service or system that is fit for their own context.

The framework is conceived for two kinds of use:

  1. Exploring the underlying features and generally-implicit assumptions in existing applications of analytics. In this case, the aim might be to better comprehend the state of the art in analytics and the relevance of analytics methods from other industries, or to inspect candidates for procurement with greater rigour.
  2. Considering how to make the transition from a desire to target an issue in a more analytical way to a high level description of a pilot to reach the target. In this case, the framework provides a starting-point template for the production of a design rationale in an analytics project, whether in-house or commissioned. Alternatively it might lead to a conclusion that significant problems might arise in targeting the issue with analytics.

In both of these cases, the framework is an aid to clarify or expose assumptions and so to help its user challenge or confirm them.

I look forward to any comments that might help to improve the framework.