<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>Adam Cooper's Work Blog</title>
	<atom:link href="http://blogs.cetis.ac.uk/adam/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.cetis.ac.uk/adam</link>
	<description>JISC CETIS and other projects</description>
	<pubDate>Wed, 15 May 2013 15:18:55 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Learning Analytics Interoperability</title>
		<link>http://blogs.cetis.ac.uk/adam/2013/05/03/learning-analytics-interoperability/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2013/05/03/learning-analytics-interoperability/#comments</comments>
		<pubDate>Fri, 03 May 2013 15:43:10 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=660</guid>
		<description><![CDATA[The ease with which data can be transferred without loss of meaning from a store to an analytical tool - whether this tool is in the hands of a data scientist, a learning science researcher, a teacher, or a learner - and the ability of these users to select and apply a range of tools [...]]]></description>
			<content:encoded><![CDATA[<p>The ease with which data can be transferred without loss of meaning from a store to an analytical tool - whether this tool is in the hands of a data scientist, a learning science researcher, a teacher, or a learner - and the ability of these users to select and apply a range of tools to data in formal and informal learning platforms are important factors in making learning analytics and educational data mining efficient and effective processes.</p>
<p>I have recently written a report that describes, in summary form, the findings of a survey into: a) the current state of awareness of, and research or development into, this problem of seamless data exchange between multiple software systems, and b) standards and pre-standardisation work that are candidates for use or experimentation. The coverage is, intentionally, fairly superficial but there are abundant references.</p>
<p>The paper is available in three formats: <a href="http://blogs.cetis.ac.uk/adam/files/2013/05/learning-analytics-interoperability-v1p1odt.zip">Open Office</a>, <a href="http://blogs.cetis.ac.uk/adam/files/2013/05/learning-analytics-interoperability-v1p1.pdf">PDF</a>, <a href="http://blogs.cetis.ac.uk/adam/files/2013/05/learning-analytics-interoperability-v1p1.docx">MS Word</a>. If printing, note that the layout is "letter" rather than A4.</p>
<p>Comments are very welcome since I intend to release an improved version in due course.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2013/05/03/learning-analytics-interoperability/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Open Standards Board and the Cabinet Office Standards Hub</title>
		<link>http://blogs.cetis.ac.uk/adam/2013/04/22/open-standards-board-and-the-cabinet-office-standards-hub/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2013/04/22/open-standards-board-and-the-cabinet-office-standards-hub/#comments</comments>
		<pubDate>Mon, 22 Apr 2013 17:00:00 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=647</guid>
		<description><![CDATA[Early last week the government announced the Open Standards Board had finally been convened via a press release from Francis Maude, the Minister for the Cabinet Office, and via a blog post from Liam Maxwell, the government's Chief Technology Officer. This is a welcome development but what chuffed me most was that my application to [...]]]></description>
			<content:encoded><![CDATA[<p>Early last week the government announced the Open Standards Board had finally been convened via a <a href="https://www.gov.uk/government/news/open-standards-board-members-revealed">press release from Francis Maude</a>, the Minister for the Cabinet Office, and via a <a href="http://digital.cabinetoffice.gov.uk/2013/04/15/the-open-standards-board/">blog post from Liam Maxwell</a>, the government's Chief Technology Officer. This is a welcome development but what chuffed me most was that my application to be a Board member had been successful.</p>
<p>I say "finally" because it has taken quite a while for the process to move from a shadow board and a consultation on policy (<a href="http://blogs.cetis.ac.uk/adam/2012/05/02/uk-government-open-standards-consultation-cetis-response/">Cetis provided feedback</a>), through an extension of the consultation to allay fears of bias in a stakeholder event, analysis of the comments, publication of the <a href="http://www.cabinetoffice.gov.uk/news/government-bodies-must-comply-open-standards-principles">final policy</a>, and deciding on the role of the Open Standards Board. The time taken has been a little frustrating but I take comfort from my conclusion that these delays are signs of a serious approach, that this is not an empty gesture.</p>
<p>Before going on, I should publicly recognise the contribution of others that enabled me to make a successful application. Firstly: Jisc has provided the funding for Cetis and a series of supporters (*) for the idea of open standards in Jisc has kept the flame alive. Many years ago they had the vision and stuck with it in spite of wider scepticism, progress that has often been slow, a number of flawed standards (mistakes do happen), and the difficulty in assessing return on investment for activities that are essentially systemic in their effect. Secondly: my colleagues in Cetis from whom I have harvested wisdom and ideas and with whom I have shared revealing (and sometimes exhausting) debate. Looking back at what we did in the early 2000s, I think we were quite naive but so was everyone else. I believe we now have much more sophisticated ideas about the process of standards-development and adoption, and of the kinds of interventions that work. I hope that is why I was successful in my application.</p>
<p>The Open Standards Board is concerned with open standards for government IT and is closely linked with actions dealing with Open Source Software and Open Data. All three of these are close to our hearts in Cetis and we hope both to contribute to their development (in government and the wider public sector) and to help bring about a bit more spill-over into the education system.</p>
<p>The public face of Cabinet Office open standards activity is the <a href="http://standards.data.gov.uk/" target="_blank">Standards Hub</a>, which gives anyone the chance to nominate candidates to be addressed using open standards and to comment on the nominations of others. I believe this is the starting point for the business of the Board. The suggestions are a bit of a mixed bag and the process is in need of more suggestions so - to quote the Hub website - if you know of an open standard that could be "applied consistently across the UK government to make our services better for users and to keep our costs down", you know what to do!</p>
<p>The Open Standards Board has an <a href="http://standards.data.gov.uk/osb/members">interesting mix of members</a> and I'm full of enthusiasm for what promises to be an exciting first meeting in early May.</p>
<p>----</p>
<p>* - there are too many to mention but the people Cetis had most contact with include Tish Roberts, Sarah Porter and Rachel Bruce.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2013/04/22/open-standards-board-and-the-cabinet-office-standards-hub/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Analytics is Not New!</title>
		<link>http://blogs.cetis.ac.uk/adam/2013/01/21/analytics-is-not-new/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2013/01/21/analytics-is-not-new/#comments</comments>
		<pubDate>Mon, 21 Jan 2013 18:05:52 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=640</guid>
		<description><![CDATA[As we collectively climb up the hype cycle towards the peak of inflated expectations for analytics, and I think this can be argued for many industries and applications of analytics, a bit of historical perspective makes a good antidote both to exaggerated claims and to the pessimists who would say it is "all just [...]]]></description>
			<content:encoded><![CDATA[<p>As we collectively climb up the hype cycle towards the peak of inflated expectations for analytics, and I think this can be argued for many industries and applications of analytics, a bit of historical perspective makes a good antidote both to exaggerated claims and to the pessimists who would say it is "all just hype".</p>
<p>That was my starting point for a  paper I wrote towards the end of 2012 and which is now published as "<a href="http://publications.cetis.ac.uk/2012/529" target="_blank">A Brief History of Analytics</a>". As I did the desk research, three aspects recurred:</p>
<ol>
<li>much that appears recent can be traced back for decades;</li>
<li>the techniques being employed by different communities of specialists are rather complementary;</li>
<li>there is much that is not under the narrow spotlight of marketing hype and hyperbole.</li>
</ol>
<p>The historical perspective gives us inspiration in the form of <a href="http://en.wikipedia.org/wiki/Florence_Nightingale" target="_blank">Florence Nightingale</a>'s pioneering work on using statistics and visualisation to address problems of health and sanitation and to make the case for change. It also reminds us that Operational Researchers (Operations Researchers) have been dealing with complex optimisation problems including taking account of human factors for decades.</p>
<p>I found that writing the paper helped me to clarify my thinking about  what is feasible and plausible and what the likely kinds of success  stories for analytics will be in the medium term. Most important, I think, is that our collective heritage of techniques for data analysis, visualisation and use to inform practical action shows that the future of analytics is a great deal richer than the next incarnation of Business Intelligence software or the application of predictive methods to <a href="http://en.wikipedia.org/wiki/Big_data">Big Data</a>. These have their place but there is more; analytics has many themes that combine to make it an interesting story that unfolds before us.</p>
<p>The paper "<a href="http://publications.cetis.ac.uk/2012/529" target="_blank">A Brief History of Analytics</a>" is the ninth in the <a href="http://publications.cetis.ac.uk/c/analytics" target="_blank">CETIS Analytics Series</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2013/01/21/analytics-is-not-new/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A Seasonal Sociogram for Learning Analytics Research</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/12/20/a-seasonal-sociogram-for-learning-analytics-research/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/12/20/a-seasonal-sociogram-for-learning-analytics-research/#comments</comments>
		<pubDate>Thu, 20 Dec 2012 19:06:15 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=629</guid>
		<description><![CDATA[SoLAR, the Society for Learning Analytics Research has recently made available a dataset covering research publications in learning analytics and educational data mining and issued the LAK Data Challenge, challenging the community to use the dataset to answer the question:
What do analytics on learning analytics tell us? How  can we make sense of this [...]]]></description>
			<content:encoded><![CDATA[<p>SoLAR, the <a href="http://www.solaresearch.org/" target="_blank">Society for Learning Analytics Research</a>, has recently made available a dataset covering research publications in learning analytics and educational data mining and issued the <a href="http://www.solaresearch.org/events/lak/lak-data-challenge/" target="_blank">LAK Data Challenge</a>, challenging the community to use the dataset to answer the question:</p>
<blockquote><p><strong><em>What do analytics on learning analytics tell us? </em></strong>How  can we make sense of this emerging field’s historical roots, current  state, and future trends, based on how its members report and debate  their research?</p></blockquote>
<p>Thanks to too many repeats on the TV schedule I managed to re-learn a bit of novice-level SPARQL and manipulate the RDF/XML provided into a form I can handle with R.</p>
<p>Now, I've had a bit of a pop at sociograms - i.e. visualisations of social networks - in the past but they do have their uses and one of these is getting a feel for the shape of a dataset that deals with relations. In the case of the LAK challenge dataset, the relationship between authors and papers is such a case. So as part of thinking about whether I'm up for approaching the challenge from this perspective it makes sense to visualise the data.</p>
<p>And with it being the Christmas season, the colour scheme chose itself.</p>
<div id="attachment_631" class="wp-caption aligncenter" style="width: 310px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/12/ldk-edm-and-jets-2011-and-2012.png"><img class="size-medium wp-image-631" title="Bipartite Sociogram for Paper Authorship" src="http://blogs.cetis.ac.uk/adam/files/2012/12/ldk-edm-and-jets-2011-and-2012-300x281.png" alt="Bipartite Sociogram for Paper Authorship for Proceedings from LAK, EDM and the JETS Special Edition on Learning Analytics" width="300" height="281" /></a><p class="wp-caption-text">Paper Authorship for Proceedings from LAK, EDM and the JETS Special Edition on Learning and Knowledge Analytics (click on image for full-size version)</p></div>
<p>This is technically a "bipartite sociogram" since it shows two kinds of entity and relationships between types. In this case people are shown as green circles and papers shown as red polygons. The data has been limited to the conferences on Learning Analytics and Knowledge (LAK) 2011 and 2012 (red triangles) and the Educational Data Mining (EDM) Conference for the same years (red diamonds). The Journal of Educational Technology and Society special edition on learning and knowledge analytics was also published in 2012 (red pentagons). Thus, we have a snapshot of the main venues for scholarship vicinal to learning analytics.</p>
<p>So, what does it tell me?</p>
<p>My first observation is that there are a lot of papers that have been written by people who have written no others in the dataset for 2011/12 (from now on, please assume I always mean this subset). I see this as being consistent with this being an emergent field of research. It is also clear that JETS attracted papers from people who were not already active in the field. This is not the entire story, however, as the more connected central region of the diagram shows. Judging this region by eye and comparing it to the rest of the diagram, it looks like there is a tendency for LAK papers (triangles) to be under-represented in the more-connected region compared to EDM (diamonds). This is consistent with EDM conferences having been run since 2008 and their emergence from workshops on Artificial Intelligence in Education. LAK, on the other hand, began in 2011. Some proper statistics are needed to confirm judgement by eye. It would be interesting to look for signs of evolution following the 2013 season.</p>
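<p>For readers who prefer code to pictures, here is a minimal sketch of the bipartite structure and of the "wrote no others" observation. It is in Python rather than the R used for the diagram. The author and paper names are invented for illustration and are not taken from the LAK dataset, and the helper names <code>build_bipartite</code> and <code>isolated_papers</code> are hypothetical.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical authorship records as (author, paper) pairs. These names are
# invented for illustration and do not come from the LAK challenge dataset.
AUTHORSHIP = [
    ("author_a", "paper_1"),
    ("author_b", "paper_1"),
    ("author_a", "paper_2"),
    ("author_c", "paper_2"),
    ("author_d", "paper_3"),
    ("author_e", "paper_3"),
    ("author_f", "paper_4"),
]


def build_bipartite(pairs):
    """Return two adjacency maps: author to papers, and paper to authors.

    Ties only ever join an author to a paper, never author to author or
    paper to paper, which is what makes the network bipartite.
    """
    papers_by_author = defaultdict(set)
    authors_by_paper = defaultdict(set)
    for author, paper in pairs:
        papers_by_author[author].add(paper)
        authors_by_paper[paper].add(author)
    return dict(papers_by_author), dict(authors_by_paper)


def isolated_papers(pairs):
    """Papers whose authors all wrote no other paper in the data."""
    papers_by_author, authors_by_paper = build_bipartite(pairs)
    return sorted(
        paper
        for paper, authors in authors_by_paper.items()
        if all(len(papers_by_author[a]) == 1 for a in authors)
    )


if __name__ == "__main__":
    print(isolated_papers(AUTHORSHIP))
```
</pre>
<p>In this toy data, author_a links paper_1 and paper_2 into a small connected cluster, while paper_3 and paper_4 sit on their own. Counting such papers is one simple way to put a number on what the eye picks out at the edge of the diagram.</p>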
<p style="text-align: center;">
<div id="attachment_633" class="wp-caption aligncenter" style="width: 220px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/12/isolated.png"><img class="size-medium wp-image-633 " title="isolated" src="http://blogs.cetis.ac.uk/adam/files/2012/12/isolated-300x207.png" alt="A lot of papers were written by people who wrote no others." width="210" height="145" /></a><p class="wp-caption-text">A lot of papers were written by people who wrote no others.</p></div>
<p>The sign of an established research group is the research group head who co-authors several papers with each paper having some less prolific co-authors who are working for their PhDs. This is the chief and Indians pattern. A careful inspection of the central region shows this pattern as well as groups with less evidence of hierarchy.</p>
<table style="height: 208px;" border="0" width="413">
<tbody>
<tr>
<td>
<p><div id="attachment_634" class="wp-caption aligncenter" style="width: 192px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/12/jefe.png"><img class="size-full wp-image-634" title="jefe" src="http://blogs.cetis.ac.uk/adam/files/2012/12/jefe.png" alt="Chief and Indians." width="182" height="148" /></a><p class="wp-caption-text">Chief and Indians.</p></div></td>
<td>
<p><div id="attachment_636" class="wp-caption aligncenter" style="width: 114px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/12/group.png"><img class="size-full wp-image-636" title="group" src="http://blogs.cetis.ac.uk/adam/files/2012/12/group.png" alt="A less hierarchical group." width="104" height="96" /></a><p class="wp-caption-text">A less hierarchical group.</p></div></td>
</tr>
</tbody>
</table>
<p>LAK came into being and attracted people without a great deal of knowledge of the prior existence of the EDM conference and community so some polarisation is to be expected. There clearly are people, even those with many publications, who have only published at one venue. Consistent with previous comments about the longer history of EDM it isn't surprising that this is most clear for that venue since there are clearly established groups at work. What I think will be some comfort to the researchers in both camps who have made efforts to build bridges is that there are signs of integration (see the Chiefs and Indians snippet). Whether this is a sign of integrating communities or a consequence of individual preference alone is an open question. It is another question to consider with more rigour and something to look out for in the 2013 season.</p>
<p>Am I any the wiser? Well... slightly, and it didn't take long. There are certainly some questions that could be answered with further analysis and there are a few attributes not taken account of here, such as institutional affiliation or country/region. I will certainly have a go at using the techniques I <a href="http://blogs.cetis.ac.uk/adam/2012/11/23/modelling-social-networks/" target="_blank">outlined in a previous post</a> if the weather is poor over the Christmas break but I think I will have to wait until the data for 2013 is available before some of the interesting evolutionary shape of EDM and LAK becomes accessible.</p>
<p>Merry Christmas!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/12/20/a-seasonal-sociogram-for-learning-analytics-research/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Looking Inside the Box of Analytics and Business Intelligence Applications</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/12/13/looking-inside-the-box-of-analytics-and-business-intelligence-applications/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/12/13/looking-inside-the-box-of-analytics-and-business-intelligence-applications/#comments</comments>
		<pubDate>Thu, 13 Dec 2012 11:11:57 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=623</guid>
		<description><![CDATA[To take technology and social process at face value is to risk failing to appreciate what they mean, do, and can do. Analytics and business intelligence applications or projects, in common with all technology-supported innovations, are more likely to be successful if both technology and social spheres are better understood. I don't mean to [...]]]></description>
			<content:encoded><![CDATA[<p>To take technology and social process at face value is to risk failing to appreciate what they mean, do, and can do. Analytics and business intelligence applications or projects, in common with all technology-supported innovations, are more likely to be successful if both technology and social spheres are better understood. I don't mean to say that there is no room for intuition in such cases, rather that it is helpful to decide which aspects are best served by intuition or not and by whose intuition, if so. But how to do this?</p>
<p>Just looking can be a poor guide to understanding an existing application and just designing can be a poor approach to creating a new one. Some kind of method, some principles, some prompts or stimulus questions - I will use "framework" as an umbrella term - can all help to avoid a host of errors. Replication of existing approaches that may be obsolete or erroneous, falling into value or cognitive traps, failure to consider a wider range of possibilities, etc are errors we should try to avoid. There are, of course, many approaches to dealing with this problem other than a framework. Peer review and participative design have a clear role to play when adopting or implementing analytics and business intelligence but a framework can play a part alongside these social approaches as well as being useful to an individual sense-maker.</p>
<p>The culmination of my thinking about this kind of framework has just been published as the seventh paper in the CETIS Analytics Series, entitled "<a href="http://publications.cetis.ac.uk/2012/524" target="_blank">A Framework of Characteristics for Analytics</a>". This started out as a personal attempt to make sense of my own intuitive dissatisfaction with the traditions of business intelligence combined with concern that my discussions with colleagues about analytics were sometimes deeply at cross purposes or just unproductive because our mental models lacked sufficient detail and clarity to properly know what we were talking about or to really understand where our differences lay.</p>
<p><em>The following is quoted from the paper.</em></p>
<p><a href="http://publications.cetis.ac.uk/2012/524" target="_blank">A Framework of Characteristics for Analytics</a> considers one  way to explore similarities, differences, strengths, weaknesses,  opportunities, etc of actual or proposed applications of analytics. It  is a framework for asking questions about the high level decisions  embedded within a given application of analytics and assessing the match  to real world concerns. The Framework of Characteristics is not a  technical framework.</p>
<p>This is not an introduction to analytics; rather it is aimed at  strategists and innovators in post-compulsory education sector who have  appreciated the potential for analytics in their organisation and who  are considering commissioning or procuring an analytics service or  system that is fit for their own context.</p>
<p>The framework is conceived for two kinds of use:</p>
<ol>
<li>Exploring the underlying features and generally-implicit assumptions  in existing applications of analytics. In this case, the aim might be  to better comprehend the state of the art in analytics and the relevance  of analytics methods from other industries, or to inspect candidates  for procurement with greater rigour.</li>
<li>Considering how to make the transition from a desire to target an  issue in a more analytical way to a high level description of a pilot to  reach the target. In this case, the framework provides a starting-point  template for the production of a design rationale in an analytics  project, whether in-house or commissioned. Alternatively it might lead  to a conclusion that significant problems might arise in targeting the  issue with analytics.</li>
</ol>
<p>In both of these cases, the framework is an aid to clarify or expose  assumptions and so to help its user challenge or confirm them.</p>
<p>I look forward to any comments that might help to improve the framework.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/12/13/looking-inside-the-box-of-analytics-and-business-intelligence-applications/feed/</wfw:commentRss>
		</item>
		<item>
		<title>What does &#8220;Analytics&#8221; Mean? (or is it just another vacuous buzz word?)</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/11/30/what-does-analytics-mean-or-is-it-just-another-vacuuous-buzz-word/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/11/30/what-does-analytics-mean-or-is-it-just-another-vacuuous-buzz-word/#comments</comments>
		<pubDate>Fri, 30 Nov 2012 12:25:56 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=616</guid>
		<description><![CDATA["Analytics" certainly is a buzz word in the business world and almost impossible to avoid at any venue where the relationship between technology and post-compulsory education is discussed, from bums-on-seats to MOOCs. We do bandy words like analytics or cloud computing around rather freely and it is so often the case with technology-related hype words [...]]]></description>
			<content:encoded><![CDATA[<p>"Analytics" certainly is a buzz word in the business world and almost impossible to avoid at any venue where the relationship between technology and post-compulsory education is discussed, from bums-on-seats to MOOCs. We do bandy words like analytics or cloud computing around rather freely and it is so often the case with technology-related hype words that  they are used by sellers of snake oil or old rope to confuse the ignorant and by the careless to refer vaguely to something that seems to be important.</p>
<p>Cloud computing is a good example. While it is an occasionally useful umbrella term for a range of technologies, techniques and IT service business models, it masks differences that matter in practice. Any useful thinking about cloud must work on a clearer understanding of the kinds of cloud computing service delivery level and the match to the problem to be solved. To understand the very real benefits of cloud computing, you need to understand the distinct offerings; any discussion that just refers to cloud computing is likely to be vacuous. These distinctions are discussed in a <a href="http://wiki.cetis.ac.uk/images/3/38/Cloud_Computing.pdf" target="_blank">CETIS briefing paper on cloud computing</a>.</p>
<p>But is analytics like cloud computing? Is the word itself useful? Can a useful and clear meaning, or even a definition, of analytics be determined?</p>
<p>I believe the answer is "yes" and the latest paper in our Analytics Series, which is entitled "<a href="http://publications.cetis.ac.uk/2012/521" target="_blank">What is Analytics? Definition and Essential Characteristics</a>" explores the background and discusses previous work on defining analytics before proposing a definition. It then extends this to a consideration of what it means to be analytical as opposed to being just quantitative. I realise that the snake oil and old rope salesmen will not be interested in this distinction; it is essentially a stance against uncritical use of "analytics".</p>
<p>There is another way in which I believe the umbrella terms of cloud computing and analytics differ. Whereas cloud computing  becomes meaningful by breaking it down and using terms such as "software as a service", I am not convinced that a similar approach is applicable to analytics. The explanation for this may be that cloud computing  is bound to hardware and software, around which different business models become viable, whereas analytics is foremost about decisions, activity and process.</p>
<p>Terms for kinds of analytics, such as "learning analytics", may be useful to identify the kind of analytics that a particular community is doing but to define such terms is probably counter-productive (although working definitions may be very useful to allow the term to be used in written or oral communications). One of the problems with definitions is the boundaries they draw. Where would learning analytics and business analytics have their boundary in an educational establishment? We could probably agree that some cases of analytics were on one side or the other but not all cases. Furthermore, analytics is a developing field that certainly has not covered all that is possible and is very immature in many industries and public sector bodies. This is likely to mean revision of definitions is necessary, which rather defeats the object.</p>
<p>Even the use of nouns, necessary though it may be in some circumstances, can be problematical. If we both say "learning analytics", are we talking about the same thing? Probably not, because we are not really talking about a thing but about processes and practices. There is a danger that newcomers to something described as "learning analytics" will construct quite a narrow view of "learning analytics is ...." and later declaim that learning analytics doesn't work or that learning analytics is no good because it cannot solve problem X or Y. Such blinkered sweeping statements are a warning sign that opportunities will be missed.</p>
<p>Rather than say what business analytics, learning analytics, research analytics, etc <span style="text-decoration: underline;">is</span>, I think we should focus on the applications, the questions and the people who care about these things. In other words, we should think about what analytics can and cannot help us with, what it is for, etc. This is reflected in most of the titles in the CETIS Analytics Series, for example our recently-published paper entitled "<a href="http://publications.cetis.ac.uk/2012/516" target="_blank">Analytics for Learning and Teaching</a>". The point being made about avoiding definitions of kinds of analytics is expanded upon in  "<a href="http://publications.cetis.ac.uk/2012/521" target="_blank">What is Analytics? Definition and Essential Characteristics</a>".</p>
<p>The full set of papers in the series is available from the <a href="http://publications.cetis.ac.uk/c/analytics" target="_blank">CETIS Publications</a> site.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/11/30/what-does-analytics-mean-or-is-it-just-another-vacuuous-buzz-word/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Modelling Social Networks</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/11/23/modelling-social-networks/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/11/23/modelling-social-networks/#comments</comments>
		<pubDate>Fri, 23 Nov 2012 20:06:31 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=579</guid>
		<description><![CDATA[Social network analysis has become rather popular over the last five (or so) years; the proliferation of different manifestations of the social web has propelled it from being a relatively esoteric method in the social sciences to become something that has touched many people, if only superficially. The network visualisation - not necessarily a social [...]]]></description>
			<content:encoded><![CDATA[<p>Social network analysis has become rather popular over the last five (or so) years; the proliferation of different manifestations of the social web has propelled it from being a relatively esoteric method in the social sciences to become something that has touched many people, if only superficially. The network visualisation - not necessarily a social network, e.g. <a href="http://www.chrisharrison.net/index.php/Visualizations/InternetMap">Chris Harrison's internet map</a> - has become a symbol of the transformation in connectivity and modes of interaction that modern hardware, software and infrastructure have brought.</p>
<p>This is all very well, but I want more than network visualisations and computed statistics such as network density or betweenness centrality. The alluring visualisation that is the sociogram tends to leave me rather nonplussed.</p>
<div id="attachment_586" class="wp-caption aligncenter" style="width: 269px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/11/face.png"><img class="size-medium wp-image-586" title="How I often feel about the sociogram" src="http://blogs.cetis.ac.uk/adam/files/2012/11/face-259x300.png" alt="How I often feel about the sociogram" width="259" height="300" /></a><p class="wp-caption-text">How I often feel about the sociogram</p></div>
<p>Now, don't get me wrong: I'm not against this stuff and I'm not attacking the impressive work of people like <a href="http://mashe.hawksey.info/" target="_blank">Martin Hawksey</a> or <a href="http://blog.ouseful.info/" target="_blank">Tony Hirst</a>, the usefulness of tools like <a href="http://www.snappvis.org/" target="_blank">SNAPP</a> or recent work on the Open University (UK) <a href="http://stadium.open.ac.uk/stadia/preview.php?whichevent=2065&amp;s=29&amp;option=&amp;record=0&amp;schedule=2620" target="_blank">SocialLearn data using the NAT</a> tool. I just want more and I want an approach which opens up the possibility of model building and testing, of hypothesis testing, etc. I want to be able to do this to make more sense of the data.</p>
<pre><strong>Warning:
this article assumes familiarity with Social Network Analysis.</strong></pre>
<h2>Tools and Method</h2>
<p>Several months ago, I became rather excited to find that exactly this kind of approach - social network modelling - has been a productive area of social science research and algorithm development for several years and that there is now a quite mature package called "<a href="http://cran.r-project.org/web/packages/ergm/index.html" target="_blank">ergm</a>" for <a href="http://cran.r-project.org/" target="_blank">R</a>. This package allows its user to propose a model for small-scale social processes and to evaluate the degree of fit to an observed social network. The mathematical formulation involves an exponential to calculate probability hence the approach is known as "Exponential Random Graph Models" (ERGM). The word "random" captures the idea that the actual social network is only one of many possibilities that could emerge from the same social forces, processes, etc and that this randomness is captured in the method.</p>
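<p>For readers who prefer to see the exponential written down, the general shape of the model, as it is usually presented in the ERGM literature, is sketched below in LaTeX notation. The symbols are my own labelling rather than anything taken from the ergm package: y is a possible network, g(y) is the vector of counted statistics, theta is the vector of weightings to be estimated and kappa is the normalising sum over all possible networks on the same set of people.</p>
<pre>P_{\theta}(Y = y) = \frac{\exp\left(\theta^{\top} g(y)\right)}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y'} \exp\left(\theta^{\top} g(y')\right)</pre>
<p>The sum in kappa runs over a vast number of networks for anything but a tiny group of people, which is why estimation has to rely on simulation rather than direct calculation.</p>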
<p>I have added some of what I have found to be the most useful papers and a related book to a <a href="http://www.mendeley.com/groups/2831251/social-network-modelling/papers/" target="_blank">Mendeley group</a>; please consult these for an outline of the historical development of the ERGM method and for articles introducing the R package.</p>
<p>The essential idea is quite simple, although the algorithms required to turn it into a reality are quite scary (and I don't pretend to understand enough to do proper research using the method). The idea is to think about some arguable and real-world social phenomena at a small scale and to compute what weightings apply to each of these on the basis of a match between simulations of the overall networks that could emerge from these small-scale phenomena and a given observed network. Each of the small-scale phenomena must be expressed in a way that a statistic can be evaluated for it and this means it must be formulated as a sub-graph that can be counted.</p>
<div id="attachment_591" class="wp-caption aligncenter" style="width: 168px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/11/elements.png"><img class="size-full wp-image-591" title="elements" src="http://blogs.cetis.ac.uk/adam/files/2012/11/elements.png" alt="Example sub-graphs that illustrate small-scale social process." width="158" height="308" /></a><p class="wp-caption-text">Example sub-graphs that illustrate small-scale social process.</p></div>
<p>The diagram above illustrates three kinds of sub-graph that match three different kinds of evolutionary force on an emerging network. Imagine the arrows indicate something like "I consider them my friend", although we can use the same formalism for less personal kinds of tie such as "I rely on" or even the relation between people and resources.</p>
<ul>
<li>The idea of mutuality is captured by the reciprocal relationships between A and B. Real friendship networks should be high in mutuality whereas workplace social networks may be less mutual.</li>
<li>The idea of transitivity is captured in the C-D-E triangle. This might be expressed as "my friend's friend is my friend".</li>
<li>The idea of homophily is captured in the bottom pair of subgraphs, which show preference for ties to the same colour of person. Colour represents any kind of attribute, maybe a racial label for studies of community polarisation or maybe gender, degree subject, football team... This might be captured as "birds of a feather flock together".</li>
</ul>
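<p>To make "a sub-graph that can be counted" concrete, here is a minimal sketch in base R that counts the three kinds of sub-graph described above on a toy network. The five people, their ties and their colours are invented for illustration and the variable names are my own; this is not how the ergm package computes its statistics internally, just the same counts done by hand.</p>
<pre># Toy directed network: A[i, j] == 1 means "i considers j a friend".
# The people, ties and colours below are invented for illustration.
people &lt;- c("A", "B", "C", "D", "E")
A &lt;- matrix(0, 5, 5, dimnames = list(people, people))
A["A", "B"] &lt;- 1; A["B", "A"] &lt;- 1   # a reciprocated pair
A["C", "D"] &lt;- 1; A["D", "E"] &lt;- 1   # a two-step path C -&gt; D -&gt; E ...
A["C", "E"] &lt;- 1                     # ... closed by the tie C -&gt; E
colour &lt;- c(A = "red", B = "red", C = "blue", D = "blue", E = "red")

# Number of ties in the network.
n_edges &lt;- sum(A)

# Mutuality: pairs with ties in both directions (each pair counted once).
n_mutual &lt;- sum(A * t(A)) / 2

# Transitivity: triples i -&gt; j -&gt; k where the tie i -&gt; k is also present.
n_transitive &lt;- sum((A %*% A) * A)

# Homophily: ties whose two ends share the same colour.
same_colour &lt;- outer(colour, colour, "==")
n_match &lt;- sum(A * same_colour)

print(c(edges = n_edges, mutual = n_mutual,
        transitive = n_transitive, match = n_match))</pre>
<p>Each of these numbers is one statistic; the modelling step then asks what weighting on each statistic makes the observed network most plausible.</p>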
<p>One of the interesting possibilities of social network modelling is that it may be able to discover the likely role of different social processes, which we cannot directly test, with qualitatively similar outcomes. For example, both homophily and transitivity favour the formation of cohesive groups. A full description of research using ERGMs to deal with this kind of question is "Birds of a Feather, or Friend of a Friend? Using Exponential Random Graph Models to Investigate Adolescent Social Networks" (Goodreau, Kitts &amp; Morris): <a href="http://www.mendeley.com/groups/2831251/social-network-modelling/papers/">see the Mendeley group</a>.</p>
<h2>A First Experiment</h2>
<p>In the spirit of active learning, I wanted to have a go. This meant using relatively easily-available data about a community that I knew fairly well. Twitter follower networks are fashionable and not too hard to get, although the API is a bit limiting, so I wrote <a href="https://github.com/arc12/SNA" target="_blank">some R to crawl follower/friends and create a suitable data structure</a> for use with the ERGM package.</p>
<p>Several evenings later I concluded that a network defined as <a href="https://twitter.com/ectel2012/followers" target="_blank">followers of the EC-TEL 2012 conference</a> was unsuitable. The problem seems to be that the network is not at all homogeneous while at the same time there are essentially no useful person attributes to use; the location data is useless and the number of tweets is not a good indicator of anything. Without some quantitative or categorical attribute you are forced to use models that assume homogeneity. Hence nothing I tried was a sensible fit.</p>
<p>Lesson learned: knowledge of person (vertex) attributes is likely to be important.</p>
<p>My second attempt was to consider the Twitter network between CETIS staff and colleagues in the JISC Innovation Group. In this case, I know how to assign one attribute that might be significant: team membership.</p>
<p>Without looking at the data, it seems reasonable to hypothesise as follows:</p>
<ol>
<li>We might expect a high density network since:
<ul>
<li>Following in Twitter is not an indication of a strong tie; it is a low cost action and one that may well persist due to a failure to un-follow.</li>
<li>All of the people involved work directly or indirectly (CETIS) for JISC and within the same unit so we might expect them to follow one another.</li>
</ul>
</li>
<li>We might expect a high degree of mutuality since this is a professional peer network in a university/college setting.</li>
<li>The setting and the nature of Twitter may lead to a network that does not follow organisational hierarchy.</li>
<li>We might expect teams to form clusters with more in-team ties than out-of-team ties, i.e. a homophily effect.</li>
<li>There is no reason to believe any team will be more sociable than another.</li>
<li>Since CETIS was created primarily to support the eLearning Team we might expect there to be a preferential mixing-effect.</li>
</ol>
<div id="attachment_597" class="wp-caption aligncenter" style="width: 310px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/11/jisc_and_cetis.png"><img class="size-medium wp-image-597" title="jisc_and_cetis" src="http://blogs.cetis.ac.uk/adam/files/2012/11/jisc_and_cetis-300x295.png" alt="CETIS and JISC Innovation Group Twitter follower network. Colours indicate the team and arrows show the &quot;follows&quot; relationship in the direction of the arrow." width="300" height="295" /></a><p class="wp-caption-text">CETIS and JISC Innovation Group Twitter follower network. Colours indicate the team and arrows show the &quot;follows&quot; relationship in the direction of the arrow.</p></div>
<p>Nonplussed? What of the hypotheses?</p>
<p>Well... I suppose it is possible to assert that this is quite a dense network that seems to show a lot of mutuality and, assuming the Fruchterman-Reingold layout algorithm hasn't distorted reality, which shows some hints at team cohesiveness and a few less-connected individuals. I think JISC management should be quite happy with the implications of this picture, although it should be noted that there are some people who do not use Twitter and that this says nothing about what Twitter mediates.</p>
<p>A little more attention to the visualisation can reveal a little more. The graph below (which is a link to a full-size image) was created using Gephi with nodes coloured according to team again but now sized according to the <a href="http://en.wikipedia.org/wiki/Centrality#Eigenvector_centrality" target="_blank">eigenvector centrality</a> measure (area proportional to centrality), which gives an indication of the influence of that person's communications within the given network.</p>
<div id="attachment_600" class="wp-caption aligncenter" style="width: 310px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/11/gephi.png"><img class="size-medium wp-image-600" title="gephi" src="http://blogs.cetis.ac.uk/adam/files/2012/11/gephi-300x300.png" alt="Visualising the CETIS and JISC Innovation network with centrality measures." width="300" height="300" /></a><p class="wp-caption-text">Visualising the CETIS and JISC Innovation network with centrality measures. The author is among those who do not tweet.</p></div>
<p>This does, at least, indicate who is most, least and middling in centrality. Since I know most of these people, I can confirm there are no surprises.</p>
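<p>For anyone curious about what the eigenvector centrality measure involves, the following is a minimal sketch in base R. It is not what Gephi does internally and it makes a simplifying assumption that the real follower network does not satisfy: ties are treated as undirected, so the matrix is symmetric. The five-person network is invented for illustration.</p>
<pre># Toy undirected network: person 1 is tied to 2, 3 and 4; person 4 to 5.
A &lt;- matrix(0, 5, 5)
ties &lt;- rbind(c(1, 2), c(1, 3), c(1, 4), c(4, 5))
A[ties] &lt;- 1
A[ties[, 2:1]] &lt;- 1   # add the reverse direction so A is symmetric

# For a symmetric matrix eigen() returns eigenvalues in decreasing order,
# so the first column of $vectors belongs to the largest eigenvalue.
ev &lt;- eigen(A, symmetric = TRUE)

# The sign of an eigenvector is arbitrary, so take absolute values and
# rescale so that the most central person scores 1.
centrality &lt;- abs(ev$vectors[, 1])
centrality &lt;- centrality / max(centrality)

print(round(centrality, 3))</pre>
<p>In this toy case person 1 comes out as most central, followed by person 4, whose single extra tie counts for more than it might seem because it is a tie to someone who is not otherwise connected.</p>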
<p>Trying out several candidate models in order to try to decide on the previously enumerated hypotheses (and some others omitted for brevity) leads to the following tentative conclusions, i.e. to a model that appeared to be consistent with the observed network. "Appeared to be consistent" means that my inexperienced eye considered that there was acceptable goodness of fit between a range of statistics computed on the observed network and ensembles of networks simulated using the given model and best-fit parameters.</p>
<p>Keeping the same numbering as the hypotheses:</p>
<ol>
<li>ERGM isn't needed to judge network density but the method does show the degree to which connections can adequately be put down to pure chance.</li>
<li>There is indeed a large positive coefficient for mutuality, i.e. that reciprocal "follows" are not just a consequence of chance in a relatively dense network.</li>
<li>It is not possible to make conclusions about organisational hierarchy.</li>
<li>Density is greater within teams to a statistically significant degree, i.e. team homophily seems to be affecting the network. This seems to be strongest for the Digital Infrastructure team, then CETIS, then the eLearning team but the standard errors are too large to claim this with confidence. The two other teams were considered too small to draw a conclusion.</li>
<li>None of CETIS, the eLearning team or the Digital Infrastructure team seem to be more sociable. The two other teams were considered too small to draw a conclusion. This is known as a "main effect".</li>
<li>There is no statistically significant preference for certain teams to follow each other. In the particular case of CETIS, this makes sense to an insider since we have worked closely with JISC colleagues across several teams.</li>
</ol>
<p>One factor that was not previously mentioned but which turned out to be critical to getting the model to fit was individual effects. Not everyone is the same. This is the same issue as was outlined for the EC-TEL 2012 followers: heterogeneity. In the present case, however, only a minority of people stand out sufficiently to require individual-level treatment and so it is reasonable to say that, while these are necessary for goodness of fit, they are adjustments. To be specific, there were four people who were less likely to follow and another four who were less likely to be followed. I will not reveal the names but suffice to say that, surprising though the result was at first, it is explainable for the people in CETIS.</p>
<h2>A Technical Note</h2>
<p>This is largely for anyone who might play with the R package. The Twitter rules prevent me from distributing the data but I am happy to assist anyone wishing to experiment (I can provide csv files of nodes and edges, a .RData file containing a <a href="http://cran.r-project.org/web/packages/network/index.html" target="_blank">network object</a> suitable for use with the ERGM package or the Gephi file to match the picture above).</p>
<p>The final model I settled on was:</p>
<pre>twitter.net ~ edges +
sender(base=c(-4,-21,-29,-31)) +
receiver(base=c(-14,-19,-23,-28)) +
nodematch("team", diff=TRUE, keep=c(1,3,4)) +
mutual</pre>
<p>This means:</p>
<ul>
<li>edges =&gt; the baseline random chance that A follows B, not conditional on anything else.</li>
<li>sender =&gt; only these four vertices are given special treatment in terms of their propensity to follow.</li>
<li>receiver =&gt; special treatment for propensity to be followed.</li>
<li>nodematch =&gt; consider the team attribute for teams 1, 3 and 4 and use a different parameter for each team separately (i.e. differential homophily).</li>
<li>mutual =&gt; the propensity for a person to reciprocate being followed.</li>
</ul>
<p>And for completeness the estimated model parameters for my last run. The parameter for "edges" indicates the baseline random chance and, if the other model elements are ignored, an estimate of -1.64 indicates that there is about a 16% chance of a randomly chosen A-&gt;B tie being present (the estimate = logit(p)). The interpretation of the other parameters is non-trivial but in general terms, a randomly chosen network containing a higher value statistic for a given sub-graph type will be more probable than one containing a lower value when the estimated parameter is positive and less probable when it is negative. The parameters are estimated such that the observed network has the maximum likelihood according to the model chosen.</p>
<pre>                         Estimate Std. Error MCMC %  p-value
edges                     -1.6436     0.1580      1  &lt; 1e-04 ***
sender4                   -1.4609     0.4860      2 0.002721 **
sender21                  -0.7749     0.4010      0 0.053583 .
sender29                  -1.9641     0.5387      0 0.000281 ***
sender31                  -1.5191     0.4897      0 0.001982 **
receiver14                -2.9072     0.7394      9  &lt; 1e-04 ***
receiver19                -1.3007     0.4506      0 0.003983 **
receiver23                -2.5929     0.5776      0  &lt; 1e-04 ***
receiver28                -2.5625     0.6191      0  &lt; 1e-04 ***
nodematch.team.CETIS       1.9119     0.3049      0  &lt; 1e-04 ***
nodematch.team.DI          2.6977     0.9710      1 0.005577 **
nodematch.team.eLearning   1.1195     0.4271      1 0.008901 **
mutual                     3.7081     0.2966      2  &lt; 1e-04 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1</pre>
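<p>As a rough check on the arithmetic, the conversion from an estimate to a probability can be done in base R with plogis, which is the inverse of the logit. The first calculation reproduces the baseline figure quoted above. The second is a sketch under stated assumptions: it uses the usual conditional reading of these coefficients (the log-odds of one tie, holding the rest of the network fixed, is the sum of the estimates for every statistic that the tie would increase by one) and it assumes that neither person is one of the eight given individual sender or receiver terms.</p>
<pre># Estimates copied from the table above.
theta &lt;- c(edges = -1.6436,
           nodematch.team.CETIS = 1.9119,
           mutual = 3.7081)

# Baseline chance of a tie if every other model element is ignored.
p_baseline &lt;- plogis(theta[["edges"]])

# A tie between two CETIS people that reciprocates an existing tie adds
# one to each of the three statistics, so the three estimates are summed.
p_cetis_mutual &lt;- plogis(sum(theta))

print(round(c(baseline = p_baseline, cetis_mutual = p_cetis_mutual), 3))</pre>
<p>The baseline comes out at about 0.16, matching the 16% given above, whereas the reciprocating within-CETIS tie comes out at about 0.98, which gives a feel for how strongly mutuality and team membership shift the odds.</p>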
<h2>Outlook</h2>
<p>The point of this was a learning experience; so what did I learn?</p>
<ol>
<li>It does seem to work!</li>
<li>Size is an issue. Depending on the model used, a 30 node network can take several tens of seconds to either determine the best fit parameters or to fail to converge.</li>
<li>Checking goodness of fit is not simple; the parameters for a proposed model are only determined for the statistics that are in the model and so goodness of fit testing requires consideration of other statistics. This can come down to "doing it by eye" with various plots.</li>
<li>Proper use should involve some experimental design to make sure that useful attributes are available and that the network is properly sampled if not determined a priori.</li>
<li>There are some pathologies in the algorithms with certain kinds of model. These are documented in the literature but still require care.</li>
</ol>
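<p>The goodness of fit point can be illustrated without the ergm package at all. The sketch below, in base R, fits the simplest possible model (every tie independently present with the same probability, which is what an "edges only" model amounts to) to an invented network and then compares a statistic that is not in the model, the number of reciprocated pairs, with an ensemble of simulated networks. The data, the function names and the choice of 1000 simulations are all my own assumptions for illustration.</p>
<pre>set.seed(1)   # make the simulation repeatable

# Count pairs with ties in both directions.
count_mutual &lt;- function(A) sum(A * t(A)) / 2

# Simulate one directed network in which every possible tie is present
# independently with probability p.
simulate_bernoulli &lt;- function(n, p) {
  A &lt;- matrix(rbinom(n * n, 1, p), n, n)
  diag(A) &lt;- 0   # nobody follows themselves
  A
}

# Invented "observed" network of 30 people in which every tie is
# deliberately reciprocated.
n &lt;- 30
half &lt;- simulate_bernoulli(n, 0.1)
observed &lt;- pmax(half, t(half))

# Fitting the edges-only model: the best-fit probability is the density.
p_hat &lt;- sum(observed) / (n * (n - 1))

# Compare a statistic that is NOT in the model with the simulated ensemble.
simulated &lt;- replicate(1000, count_mutual(simulate_bernoulli(n, p_hat)))
obs_mutual &lt;- count_mutual(observed)
frac_as_extreme &lt;- mean(simulated &gt;= obs_mutual)

print(c(observed = obs_mutual, simulated_mean = mean(simulated)))</pre>
<p>The simulated networks match the observed density by construction but contain far fewer reciprocated pairs, so the edges-only model is a poor fit and a mutuality term is called for. With more realistic models the same comparison is made over several statistics at once, which is where the plots and the "by eye" judgement come in.</p>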
<p>The outlook, as I see it, is promising but the approach is far from being ready for "real users" in a learning analytics context. In the near term I can, however, see this being applied by organisations whose business involves social learning and as a learning science tool. In short: this is a research tool that is worthy of wider application.</p>
<p><em>This is an extended description of a lightning talk given at the inaugural <a href="http://www.solaresearch.org/flare/solar-flare-uk/" target="_blank">SoLAR Flare UK event</a> held on November 19th 2012. It may contain errors and omissions.<br />
</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/11/23/modelling-social-networks/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Open Source and Open Standards in the Public Sector</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/11/22/open-source-and-open-standards-in-the-public-sector/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/11/22/open-source-and-open-standards-in-the-public-sector/#comments</comments>
		<pubDate>Thu, 22 Nov 2012 16:13:54 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[open source]]></category>

		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=569</guid>
		<description><![CDATA[Yesterday I attended day 1 of a conference entitled "Public Sector: Open Source" and, while Open Source Software (OSS) was the primary subject, Open Standards were very much on the agenda. I went in particular because of an interest in what the UK Government Cabinet Office is doing in this area.
I have previously been quite [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday I attended day 1 of a conference entitled "<a href="https://osepa.shef.ac.uk/" target="_blank">Public Sector: Open Source</a>" and, while Open Source Software (OSS) was the primary subject, Open Standards were very much on the agenda. I went in particular because of an interest in what the UK Government Cabinet Office is doing in this area.</p>
<p>I have previously been quite positive about both the information principles and the open standards consultation (blog posts <a href="http://blogs.cetis.ac.uk/adam/2012/01/30/information-principles-for-the-public-sector-the-case-of-principle-4/">here</a> and <a href="http://blogs.cetis.ac.uk/adam/2012/05/02/uk-government-open-standards-consultation-cetis-response/">here</a> respectively). We provided a response to the consultation and were pleased to see the <a href="http://www.cabinetoffice.gov.uk/news/government-bodies-must-comply-open-standards-principles" target="_blank">Nov 1st announcement</a> that government bodies must comply with a set of open standards principles.</p>
<p>The speaker from the Cabinet Office was Tariq Rashid (IT Reform group) and we were treated to a quite candid assessment of the challenges faced by government IT, with particular reference to OSS. His assessment of the issues and how to deal with them was cogent and believable, if also a little scary.</p>
<p>Here are a few of the things that caught my attention.</p>
<h2>Outsource the Brawn not the Brain</h2>
<p>Over a period of many years the supply of well-informed and deeply technical capability in government has been depleted such that too many decisions are made without there being an appropriate "<a href="http://en.wikipedia.org/wiki/Intelligent_customer" target="_blank">intelligent customer</a>". To quote Tariq: "we shouldn't be spending money unless we know what the alternatives are." The particular point being made was about OSS alternatives - and they have produced an <a href="https://update.cabinetoffice.gov.uk/resource-library/open-source-procurement-toolkit">Open Source Procurement Toolkit</a> to challenge myths and to guide people to alternatives - but the same line of argument extends to there being a poor understanding of the sources of technical lock-in (as opposed to commercial lock-in) and how chains of dependency could introduce inertia through decisions that are innocuous from a naive analysis.</p>
<p>By my analysis, the Cabinet Office IT reform team are the exception that proves the general point. It is also a point that universities and colleges should be wary of as their senior management tries to cut out "expensive people we don't really need".</p>
<h2>The Current Procurement Approach is Pathological</h2>
<p>There is something slightly ironic in the fact that it takes a Tory government to seriously attack an approach which sees the greatest fraction of the incredible £21 billion p.a. central government spend on IT go to a handful of big IT houses (yes, countable on 2 hands).</p>
<p>In short: the procurement approach, which typically involves a large amount of bundling-up, reduces competition and inhibits SMEs and providers of innovative solutions as well as blocking more agile approaches.</p>
<p>At the intersection between procurement approach and brain-outsourcing is the critical issue that the IT that is usually acquired lacks a long term view of architecture; this becomes reduced to the scope of tendered work and built around the benefits of the supplier.</p>
<h2>Emphasis on Procurement</h2>
<p>Most of the presentations placed most emphasis on the benefits of OSS in terms of procurement and cost and this was a central theme of Tariq's talk also. Having spent long enough consorting with OSS-heads I found this to be rather narrow. What, for example, about the opportunities for public sector bodies to engage in acts of co-creation, either to lead or significantly contribute to OSS projects? There are many examples of commercial entities making significant investments in developer salaries while taking a hands-off approach to governance of the open source product (e.g. IBM and the <a href="http://en.wikipedia.org/wiki/Eclipse_%28software%29" target="_blank">Eclipse platform</a>).</p>
<p>For now, it seems, this kind of engagement is one step ahead of what is feasible in central government; there is a need for thinking to move on, to mature, from where it is now. I also suspect that there is plenty of low-hanging fruit - easy cases to make for cost savings in the near term - whereas co-creation is a longer term strategy. Tariq added that it might be only 2-3 years before government was ready to begin making direct contributions to <a href="http://www.libreoffice.org/" target="_blank">LibreOffice</a>, which is already being trialled in some departments.</p>
<p>Another of the speakers, representing <a href="http://www.sambruk.se/ovrigt/inenglish.4.72ebdc8412fd172bb7480001338.html" target="_blank">sambruk</a> (one of the partners in <a href="http://www.osepa.eu/" target="_blank">OSEPA</a>, the project that organised the conference) seems to be heading towards more of a consortium model that could lead to something akin to the <a href="http://www.sakaiproject.org/" target="_blank">Sakai</a> or <a href="http://kuali.org/" target="_blank">Kuali</a> model for Swedish municipality administration.</p>
<h2>Conclusion</h2>
<p>For all that the Cabinet Office has a fairly small budget, its gatekeeper role - it must approve all spending proposals over £5 million and has some good examples of having prompted significant savings (e.g. £12 -&gt; £2 million on a UK Borders procurement) - makes it a force to be reckoned with. Coupled with an attitude (as I perceive it) of wanting to understand the options and best current thinking on topics such as open source and open standards, this makes for a potent force in changing government IT.</p>
<p>The challenge for universities and colleges is to effect the same kind of transformation <span style="text-decoration: underline;">without</span> an equivalent to the Cabinet Office and in the face of sector fragmentation (and, at best, some fairly loose alliances of sovereign city states).</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/11/22/open-source-and-open-standards-in-the-public-sector/feed/</wfw:commentRss>
		</item>
		<item>
		<title>How to do Analytics Right&#8230;</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/11/21/how-to-do-analytics-right/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/11/21/how-to-do-analytics-right/#comments</comments>
		<pubDate>Wed, 21 Nov 2012 12:36:21 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=561</guid>
		<description><![CDATA[There is, of course, no simple recipe, no cookie-cutter template and perfection is unattainable... but there are some good examples.
The Signals Project at Purdue University is among the most celebrated examples of analytics in Higher Education at the moment so I was intrigued as to what the person behind it would have to say [...]]]></description>
			<content:encoded><![CDATA[<p>There is, of course, no simple recipe, no cookie-cutter template and perfection is unattainable... but there are some good examples.</p>
<p>The <a href="http://www.itap.purdue.edu/studio/signals/" target="_blank">Signals Project at Purdue</a> University is among the most celebrated examples of analytics in Higher Education at the moment so I was intrigued as to what the person behind it would have to say when I met him just prior to his presentation at the recent SURF Education Day (actually "<a href="http://www.deonderwijsdagen.nl/">Dé Onderwijsdagen 2012</a>"; SURF is a similar organisation to JISC but in the Netherlands). This person is John Campbell and he is not at all the slightly exhausting (to dour Brits) kind of American IT leader, full of hyperbole and sweeping statement; his is a level-headed and grounded story. It is also a story from which I think we can draw some tips on how to do analytics right. These are my take-home thoughts.</p>
<h2>Analytics = Actionable Intelligence</h2>
<p>Anyone who has read my previous blog posts on analytics will know I'm rather passionate about "actionable insight" as a key point about analytics so I was naturally pleased to hear John's similar take on the subject. We vigorously agreed that more reports is not what we need. If you can't use the results of analysis to act differently it isn't worth the effort. The corollary is that we should design systems around the people who need to take action.</p>
<h2>Take a Multi-disciplinary Approach</h2>
<p>Putting analytics into practice (at scale) is not "just" addressing IT or statistical matters but requires domain knowledge of the area to be addressed and an understanding of the operational and cultural realities of the context of use. John stressed the varied team as a means to taking this kind of rounded approach. Important actors in this kind of team are people who understand how to influence change in organisational culture: politics.</p>
<p>You do still need good technical knowledge to avoid false insights, of course.</p>
<h2>Take Account of "User" Psychology</h2>
<p>The people who use the analytics - whether driving it or intended to be influenced by it - are the engine for change. This is really pointing out aspects of a multi-disciplinary approach; think soft systems, participatory design, and a team with some direct experience as a teacher/tutor/etc.</p>
<p>Signals has several examples, all elementary in some respects but significant by their presence:</p>
<ul>
<li>teaching staff trigger the analysis and can over-ride the results (although rarely do);</li>
<li>it is emphasised to students that Signals is NOT about grades but about engagement;</li>
<li>there are helpful suggestions given to students in addition to the traffic-light and, although these come from a repertoire, the teachers have a hand in targeting these.</li>
</ul>
<h2>Start Off Manually</h2>
<p>OK, a process based on spreadsheets and people manually pushing and pulling data between databases and analysis software is not scalable but this can be an important stage. Is it really wise to start investing money and reputation in a big system before you have properly established what you really need, what your data quality can sustain, what works in practice?</p>
<p>This provides an opportunity to move from research into practice, to properly adapt (rather than blindly adopt or superficially replicate) effective practice from elsewhere, etc. A manual start-off helps to expose limitations and risks (see next point).</p>
<h2>KISS</h2>
<p>The old adage "keep it simple stupid" (a modern vernacular expression of <a href="http://en.wikipedia.org/wiki/Occam%27s_razor">Occam's razor</a>) is not what John actually said, but he got close. Signals uses some well established and thoroughly mainstream statistical methods. It does not use the latest fancy predictive algorithms.</p>
<p>Why? Because fancy treatments would be like putting F1 tyres on a Citroen 2CV: worse than pointless. The data quality and a range of systematic biases* mean that the simpler method and a traffic-light result is appropriate technology. John made it clear that quoting a percentage chance of drop-out (etc) is simply an indefensible level of precision given the data; red, amber and green with teacher over-ride is defensible.</p>
<p>(*- VLE data, for example, does not mean the same across all courses/modules, teachers)</p>
<h2>Be Part of a Community</h2>
<p>OK... I liked this one because it is the kind of thing that JISC and CETIS have been promoting across all of their areas of work for many years. Making sense of what is possible, imagining and realising new ideas works so much better when ideas, experiences and reflections are shared.</p>
<p>This is why we were pleased to be part of the <a href="http://www.solaresearch.org/flare/solar-flare-uk/">first SoLAR Flare UK event</a> earlier this week and hope to be working with that community for some time.</p>
<h2>Conclusion</h2>
<p>Many have attempted, and will attempt, to replicate the success of Signals in addressing student retention but not all will succeed. The points I mentioned above are indicative of an approach that worked in totality; a superficial attempt to replicate Signals will probably fail. This is about matching an appropriate level of technology with organisational culture and context. It is innovation in socio-technical practice. So: doing analytics right is about holism directed towards action.</p>
<p><em>The views above include my own and are not necessarily John Campbell's.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/11/21/how-to-do-analytics-right/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Snapshots on the Changing Landscape of &#8220;Open &#8230;&#8221;</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/11/08/snapshots-on-the-changing-landscape-of-open/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/11/08/snapshots-on-the-changing-landscape-of-open/#comments</comments>
		<pubDate>Thu, 08 Nov 2012 20:03:37 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[jiscobs]]></category>

		<category><![CDATA[telmap]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=553</guid>
		<description><![CDATA[A little bit of text mining on a fairly large number of blogs with an educational technology (or technology enhanced learning...) focus makes a neat set of snapshots on "open ...".
Considering the words following "open" from January 2009 to the end of October 2012 shows the following distribution (where words with a relative frequency of &#60;2% [...]]]></description>
			<content:encoded><![CDATA[<p>A little bit of text mining on a fairly large number of blogs with an educational technology (or technology enhanced learning...) focus makes a neat set of snapshots on "open ...".</p>
<p>Considering the words following "open" from January 2009 to the end of October 2012 shows the following distribution (where words with a relative frequency of &lt;2% are ignored, as are low-value words like "and"). Hence it shows the share taken by each of the dominant themes.</p>
<div id="attachment_557" class="wp-caption aligncenter" style="width: 617px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/11/share-of-open.png"><img class="size-full wp-image-557" title="Share of &quot;Open ...&quot; from Jan 2009 to Oct 2012" src="http://blogs.cetis.ac.uk/adam/files/2012/11/share-of-open.png" alt="Share of &quot;Open ...&quot; from Jan 2009 to Oct 2012" width="607" height="331" /></a><p class="wp-caption-text">Share of &quot;Open ...&quot; from Jan 2009 to Oct 2012</p></div>
<p>The share for "online+course" is largely attributable to MOOCs and similar, although some of it is likely to be the use of "open online" referring to something else. This probably confirms the guesswork of followers of Ed Tech fashion but it may be a bit more of a surprise to see that open educational/content has taken such a tumble. I wonder whether some of the "open education" share has been diverted into "open online/course". I'm also pleased to see "open standards" gaining more of a foothold but am left with a feeling that "open data" got a bit over-hyped in 2011.</p>
<p>About the data: 28116 blog posts were harvested and these contained 13723 uses of "open". The blog post harvesting was done by the <a href="http://dbis.rwth-aachen.de/cms/projects/mediabase">Mediabase</a> and the analysis was done by the author, both as part of the EC funded <a href="http://www.learningfrontiers.eu/">TELMap project</a>.</p>
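<p>For anyone wanting to try something similar, the counting step might look roughly like the following Python sketch. It is not the code used for the analysis above: the stop-word list, the function name and the choice to compute shares after dropping low-value words are assumptions made for illustration.</p>

```python
import re
from collections import Counter

# Illustrative stop list and threshold; the original analysis may have used others.
STOP_WORDS = {"and", "the", "to", "for", "up", "in", "a", "of"}
MIN_SHARE = 0.02


def share_of_open(posts, min_share=MIN_SHARE):
    """Return {word: share} for words that directly follow 'open'."""
    counts = Counter()
    for text in posts:
        # Capture the single word that follows a standalone "open".
        for word in re.findall(r"\bopen\s+([a-z]+)", text.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Keep only words whose relative frequency reaches the threshold.
    return {w: n / total for w, n in counts.items() if n / total >= min_share}


sample = [
    "Open data is great. Open standards matter.",
    "We like open data and open source",
]
print(share_of_open(sample))
```

<p>Running this per calendar period, rather than over the whole corpus at once, would give one snapshot per period.</p>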
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/11/08/snapshots-on-the-changing-landscape-of-open/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Innovation Networks</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/07/23/innovation-networks/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/07/23/innovation-networks/#comments</comments>
		<pubDate>Mon, 23 Jul 2012 12:48:59 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[innovation]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=548</guid>
		<description><![CDATA[Realising benefits from applying ICT in post-compulsory education is something that might reasonably be described as innovation and networks of people and organisations provide an interesting means to achieve this aim. This is something that JISC and CETIS have been involved with for many years and we collectively have many lessons learned.
I recently set out [...]]]></description>
			<content:encoded><![CDATA[<p>Realising benefits from applying ICT in post-compulsory education is something that might reasonably be described as innovation and networks of people and organisations provide an interesting means to achieve this aim. This is something that JISC and CETIS have been involved with for many years and we collectively have many lessons learned.</p>
<p>I recently set out to think about both innovation and innovation networks in a more structured way, being aware that these are complex topics with considerable existing literature, and to try to capture some of the "what it means for us" in the form of an essay. This is available in <a href="http://blogs.cetis.ac.uk/adam/files/2012/07/innovation-networks-0p2.pdf">PDF</a> and <a href="http://blogs.cetis.ac.uk/adam/files/2012/07/innovation-networks-0p2.doc">DOC</a> formats.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/07/23/innovation-networks/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Exploratory Data Analysis</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/05/18/exploratory-data-analysis/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/05/18/exploratory-data-analysis/#comments</comments>
		<pubDate>Fri, 18 May 2012 12:26:14 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=542</guid>
		<description><![CDATA[It doesn't take much to trigger me into a rant about the weaknesses of reports on data and "dashboards" purporting to be "analytics" or "business intelligence". Lots of pie charts and line graphs with added bling are as the proverbial red rag to a bull.
Until recently my response was to demand more rigorous statistics: hypothesis [...]]]></description>
			<content:encoded><![CDATA[<p>It doesn't take much to trigger me into a rant about the weaknesses of reports on data and "<a href="https://www.google.com/search?q=BI+dashboard&amp;hl=en&amp;tbm=isch" target="_blank">dashboards</a>" purporting to be "analytics" or "business intelligence". Lots of pie charts and line graphs with added bling are as the proverbial red rag to a bull.</p>
<p>Until recently my response was to demand more rigorous statistics: hypothesis testing, confidence limits, tests for reverse causality (but recognising that causality is a slippery concept in complex systems). Having recently spent some time thinking about using data analysis to gain actionable insights, particularly in the setting of an educational institution, it has become clear to me that this response is too shallow. It embeds an assumption of a linear process: ask a question, operationalise it in terms of data and statistics and crunch some numbers. As <a href="http://blogs.cetis.ac.uk/adam/2012/05/18/a-poem-for-analytics/">my previous post</a> indicates, I don't suppose all questions are approachable. Actually, thinking back to the ways I've done a little text and data mining in the past, it wasn't quite like this either.</p>
<p>The label "<a href="http://en.wikipedia.org/wiki/Exploratory_data_analysis" target="_blank">exploratory data analysis</a>" captures the antithesis to the linear process. It was popularised in statistical circles by <a href="http://en.wikipedia.org/wiki/John_Tukey" target="_blank">John W Tukey</a> in the early 1960s and he used it as the title of a highly influential book. Tukey was trying to challenge a statistical community that was very focused on hypothesis testing and other forms of "confirmatory data analysis". He argued that statisticians should do both, approaching data with flexibility and an open frame of mind, and he saw having a well-stocked toolkit of graphical methods as being essential for exploration (Tukey was responsible for inventing a number of plot types that are now widely used).</p>
<p>Tukey read a paper entitled "<a href="http://cm.bell-labs.com/cm/ms/departments/sia/tukey/memo/techtools.html" target="_blank">The Technical Tools of Statistics</a>" at the 125th Anniversary Meeting of the American Statistical Association in 1964. It anticipated the development of computational tools (e.g. <a href="http://cran.r-project.org/" target="_blank">R</a> and <a href="http://rapid-i.com/content/view/181/190/" target="_blank">RapidMiner</a>), is well worth a read and has timeless gems like:</p>
<p><em>"Some of my friends felt that I should be very explicit in warning you of how much time and money can be wasted on computing, how much clarity and insight can be lost in great stacks of computer output. In fact, I ask you to remember only two points:<br />
</em></p>
<ol>
<li><em>The tool that is so dull that you cannot cut yourself on it is not likely to be sharp enough to be either useful or helpful.</em></li>
<li><em>Most uses of the classical tools of statistics have been, are, and will be, made by those who know not what they do."</em></li>
</ol>
<p>There is a correspondence between the open-minded and flexible approach to exploratory data analysis that Tukey advocated and the <a href="http://en.wikipedia.org/wiki/Grounded_theory" target="_blank">Grounded Theory</a> (GT) Method of the social sciences. To me, as a non-social scientist, GT seems to be trying a bit too hard to be a Methodology (academic disputes and all) but the premise of using both inductive and deductive reasoning and going into a research question free of the prejudice of a hypothesis that you intend to test (prove? how often is data analysed to find a justification for a prejudice?) is appealing.</p>
<p>Although GT is really focussed on qualitative research, some of the practical methods that the GT originators and practitioners have proposed might be applicable to data captured in IT systems and for practitioners of analytics. I quite like the dictum of "no talk" (see the <a href="http://en.wikipedia.org/wiki/Grounded_theory" target="_blank">Wikipedia entry</a> for an explanation).</p>
<p>My take-home, then, is something like: if we are serious about analytics we need to be thinking about exploratory data analysis <span style="text-decoration: underline;">and</span> confirmatory data analysis, and the label "analytics" is certainly inappropriate if neither is occurring. For exploratory data analysis we need: visualisation tools, an open mind and an inquisitive nature.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/05/18/exploratory-data-analysis/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A Poem for Analytics</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/05/18/a-poem-for-analytics/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/05/18/a-poem-for-analytics/#comments</comments>
		<pubDate>Fri, 18 May 2012 10:32:59 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=538</guid>
		<description><![CDATA[There are many traps for the unwary in the practice of analytics, which I take to be the process of developing actionable insights through problem definition and the application of statistical models. The technical traps are most obvious but the epistemological traps are better disguised.
That these traps exist and are seemingly not recognised in the [...]]]></description>
			<content:encoded><![CDATA[<p>There are many traps for the unwary in the practice of analytics, which I take to be the process of developing actionable insights through problem definition and the application of statistical models. The technical traps are most obvious but the epistemological traps are better disguised.</p>
<p>That these traps exist and are seemingly not recognised in the commercial and corporate rhetoric around analytics worries the more philosophically minded; Virginia Tech's Gardner Campbell has shared some clear and well-received thoughts on the potential for damaging reductionism in Learning Analytics. I particularly like <a href="http://annezelenka.com/2012/03/04/on-the-reductionism-of-analytics-in-education/" target="_blank">Anne Zelenka's blogged reaction</a> to Gardner's LAK12 MOOC (I believe <a href="http://lak12.wikispaces.com/Recordings" target="_blank">there is a recording</a> but Elluminate recordings don't seem to play on Linux) and my colleague <a href="http://blogs.cetis.ac.uk/sheilamacneill/2012/03/09/learning-analytics-where-do-you-stand/" target="_blank">Sheila has also blogged on the topic</a>.</p>
<p>I don't see reduction as being the issue <em>per se</em> but careless reductionism, and failing to remember that our models are surrogates for what might be, do worry me. Analytics does give us power for "myth busting" and a means to reduce the degree to which anecdote, prejudice and the opinion of the powerful determine action but let us be very wary indeed.</p>
<p>This all reminded me of the following poem by my favourite poet and mythographer, Robert Graves. Let us be slow.</p>
<p><em><strong>In Broken Images</strong></em></p>
<p><em>He is quick, thinking in clear images;<br />
I am slow, thinking in broken images.</em></p>
<p><em>He becomes dull, trusting to his clear images;<br />
I become sharp, mistrusting my broken images,</em></p>
<p><em>Trusting his images, he assumes their relevance;<br />
Mistrusting my images, I question their relevance.</em></p>
<p><em>Assuming their relevance, he assumes the fact,<br />
Questioning their relevance, I question the fact.</em></p>
<p><em>When the fact fails him, he questions his senses;<br />
When the fact fails me, I approve my senses.</em></p>
<p><em>He continues quick and dull in his clear images;<br />
I continue slow and sharp in my broken images.</em></p>
<p><em>He in a new confusion of his understanding;<br />
I in a new understanding of my confusion.</em></p>
<p><em>Robert Graves</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/05/18/a-poem-for-analytics/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Making Sense of &#8220;Analytics&#8221;</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/05/02/making-sense-of-analytics/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/05/02/making-sense-of-analytics/#comments</comments>
		<pubDate>Wed, 02 May 2012 15:22:47 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=527</guid>
		<description><![CDATA[There is currently a growing interest in increasing the degree to which data from various sources can be put to use by organisations to be more effective and a growing number of strategies for doing this. The term “analytics” is frequently being applied to descriptions of these situations but often without clarity as to what [...]]]></description>
			<content:encoded><![CDATA[<p>There is currently a growing interest in increasing the degree to which data from various sources can be put to use by organisations to be more effective and a growing number of strategies for doing this. The term “analytics” is frequently being applied to descriptions of these situations but often without clarity as to what the word is intended to mean. This makes it difficult to make sense of what is happening, to decide what to appropriate from other sectors, and to make creative leaps forward in exploring how to adopt analytics.</p>
<p>I have just completed a public draft of a paper entitled "Making Sense of Analytics: a framework for thinking about analytics" [link removed - please <a href="http://publications.cetis.ac.uk/c/analytics">visit our publications site</a> to access the final versions] in an attempt to help anyone who is grappling with these questions in relation to post-compulsory education (as I am). It does so by:</p>
<ul>
<li>considering the definition of “analytics”;</li>
<li>outlining analytics in relation to research management, teaching and learning or whole-institution strategy and operational concerns;</li>
<li>describing some of the key characteristics of analytics (the Framework).</li>
</ul>
<p>The Framework is intended to support critical evaluation of examples of analytics, whether from commerce/industry or the research community, without resorting to definition of application or product categories. The intention behind this approach is to avoid discussion of "what it is" and to focus on "what it does" and "how it does it".</p>
<p>This is a draft. Please feel free to comment via this blog or directly to me. A revised version will be published in June.</p>
<p><em>This paper is the first of a series that CETIS is producing and commissioning. These will be emerging during the coming months and collected together in a unified online resource in July/August. This is referred to briefly by Sheila MacNeill in her recent post "<a href="http://blogs.cetis.ac.uk/sheilamacneill/2012/03/09/learning-analytics-where-do-you-stand/" target="_blank">Learning Analytics, where do you stand?</a>"</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/05/02/making-sense-of-analytics/feed/</wfw:commentRss>
		</item>
		<item>
		<title>UK Government Open Standards Consultation - CETIS Response</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/05/02/uk-government-open-standards-consultation-cetis-response/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/05/02/uk-government-open-standards-consultation-cetis-response/#comments</comments>
		<pubDate>Wed, 02 May 2012 09:12:59 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=521</guid>
		<description><![CDATA[Earlier this year the UK Government Cabinet Office published what I thought was a rather good set of proposals for the role of open standards in government IT. They describe it as a "formal public consultation on the definition and mandation of open standards for software interoperability, data and document formats in government IT." There [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this year the UK Government Cabinet Office published what I thought was a rather good set of <a href="http://consultation.cabinetoffice.gov.uk/openstandards/" target="_blank">proposals for the role of open standards in government IT</a>. They describe it as a "formal public consultation on the definition and mandation of open standards for software interoperability, data and document formats in government IT." There are naturally points where we have critical comments but the direction of travel is broadly one that CETIS supports. The topic of mandation is, however, one to be approached with a great deal of caution in our view.</p>
<p>Our full response, which should be read alongside the consultation document (which includes the questions), is <a href="http://blogs.cetis.ac.uk/adam/files/2012/05/open-standards-consultation-cetis-web.doc">available for your information</a>.</p>
<p>The consultation has now been extended to June 4th 2012 following the revelation of a conflict of interest; the chair of a public consultation meeting in April was found to be also working for Microsoft. This is the latest in a long series of concerns about Microsoft lobbying <a href="http://www.computerweekly.com/blogs/editors-blog/2012/04/open-means-open---why-transpar.html" target="_blank">reported in Computer Weekly</a> and elsewhere. I am actually encouraged by the Cabinet Office response both to FoI requests linked to meetings with Microsoft and to this recent revelation; they do seem to be trying to do the right thing.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/05/02/uk-government-open-standards-consultation-cetis-response/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Analytics and Big Data - Reflections from the Teradata Universe Conference 2012</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/04/27/analytics-and-big-data-reflections-from-the-teradata-universe-conference-2012/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/04/27/analytics-and-big-data-reflections-from-the-teradata-universe-conference-2012/#comments</comments>
		<pubDate>Fri, 27 Apr 2012 18:39:23 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[analytics]]></category>

		<category><![CDATA[tduniv]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=492</guid>
		<description><![CDATA[As part of our current work on investigating trends in analytics and in contextualising it to post-compulsory education - which we are calling our Analytics Reconnoitre - I attended the Teradata Universe Conference recently. Teradata Universe is very much not an academic conference; this was a trip to the far side of the moon, to [...]]]></description>
			<content:encoded><![CDATA[<p>As part of our current work on investigating trends in analytics and in contextualising it to post-compulsory education - which we are calling our Analytics Reconnoitre - I attended the <a href="http://www.teradataemea.com/universe/">Teradata Universe Conference</a> recently. Teradata Universe is very much not an academic conference; this was a trip to the far side of the moon, to the land of corporate IT, grey suits galore and a dress code...</p>
<p>Before giving some general impressions and then following with some more in-depth reflections and arising thoughts, I should be clear about the terms "analytics" and "big data".</p>
<p>My working definition for Analytics, which I will explain in more detail in a forthcoming white paper and associated blog post is:<br />
<em>"Analytics is the process of developing <span style="text-decoration: underline;">actionable insights</span> through problem definition and the application of statistical models and analysis against existing and/or simulated future data."</em></p>
<p>I am interpreting Big Data as being data that is at such a scale that conventional databases (single server relational databases) can no longer be used.</p>
<p><a href="http://www.teradata.com/">Teradata</a> has a 30-year history of selling and supporting Enterprise <a href="http://en.wikipedia.org/wiki/Data_warehouse">Data Warehouses</a> so it should not have been a surprise that infrastructure figured in the conference. What was surprising was the degree to which infrastructure (and infrastructural projects) figured compared to applications and analytical techniques. There were some presentations in which brief case studies outlined applications but I did not hear any reference to algorithmic or methodological development, nor indeed any reference to existing techniques from the data mining (a.k.a. "knowledge discovery in databases") repertoire.</p>
<p>My overall impression is that the corporate world is generally grappling with pretty fundamental data management issues and generally focused on reporting and descriptive statistics rather than inferential and predictive methods. I don't believe this is due to complacency but simply to the reality of where they are now. As the saying goes "if I was going there, I wouldn't start here".</p>
<h2>The Case for "Data Driven Decisions"</h2>
<p><a href="http://digital.mit.edu/erik/">Erik Brynjolfsson</a>, Director of the <a href="http://digital.mit.edu/">MIT Center for Digital Business</a>, gave an interesting talk entitled "Strength in Numbers: How do Data-Driven Decision-Making Practices affect Performance?"</p>
<p>The phrase "data driven decisions" raises my hackles since it implies automation and the elimination of the human component. This is not an approach to strive for. Stephen Brobst, Teradata CTO, touched on this issue in the last plenary of the conference when he asserted that "Success = Science + Art" and backed up the assertion with examples. Whereas my objections to data driven decisions revolve around the way I anticipate such an approach would lead to staff alienation and significant disruption to the effective working of an organisation, Brobst was referring to the pitfall of incremental improvement leading to missed opportunities for breakthrough innovation.</p>
<p>As an example of a case where incremental improvement found a locally optimal solution but a globally sub-optimal one, Brobst cited actuarial practice in car insurance. Conventionally, risk estimation uses features of the car, the driver's driving history and location and over time the fit between these parameters and statistical risk has been honed to a fine point. It turns out that credit risk data is actually a substantially better fit to car accident risk, a fact that was first exploited by Progressive Insurance back in 1996.</p>
<p>Rather than "data driven decisions", I advocate human decisions supported by the use of good tools to provide us with data-derived insights. Paul Miller argues the same case against just letting the data speak for itself <a href="http://cloudofdata.com/2012/03/hubris-and-the-data-scientist/">on his "cloud of data" blog</a>.</p>
<p>This is, I should add, something Brynjolfsson and co-workers also advocate; they are only adopting terminology from the wider business world. See, for example, an article in The Futurist (Brynjolfsson, Erik and McAfee, Andrew, "<a href="http://digital.mit.edu/erik/MA2012_Brynjolfsson_McAfee.pdf">Thriving in the Automated Economy</a>" The Futurist, March-April 2012.). In this article, Brynjolfsson and McAfee make the case for partnering humans and machines throughout the world of work and leisure. They cite an interesting example of the current best chess "player" in the world, which is two amateur American chess players using three computers. They go on to make some specific recommendations to try to make sure that we avoid some socio-economic pathologies that might arise from a humans vs technology race (as opposed to humans with machines), although not everyone will find all the recommendations ethically acceptable.</p>
<p>To return to the topic of Brynjolfsson's talk: it is expanded in a <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1819486">paper of the same title</a> (Brynjolfsson, Erik, Hitt, Lorin and Kim, Heekyung "Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance", April, 2011). The abstract reads:<br />
<em>"We examine whether performance is higher in firms that emphasize decisionmaking based on data and business analytics (which we term a data-driven decisionmaking approach or DDD). Using detailed survey data on the business practices and information technology investments of 179 large publicly traded firms, we find that firms that adopt DDD have output and productivity that is 5-6% higher than what would be expected given their other investments and information technology usage. Using instrumental variables methods, we find evidence that these effects do not appear to be due to reverse causality. Furthermore, the relationship between DDD and performance also appears in other performance measures such as asset utilization, return on equity and market value. Our results provide some of the first large scale data on the direct connection between data-driven decisionmaking and firm performance."</em></p>
<p>This is an important piece of research, adding to a relatively small existing body - which shows correlation between <span style="text-decoration: underline;">high</span> levels of analytics use and factors such as growth (see the paper) - and one which I have no doubt will be followed up. They have taken a thorough approach to the statistics of correlation and tested for reverse causation. The limitation of the conclusion is clear from the abstract, however: it applies to "large publicly traded firms". What of smaller firms? Furthermore, business sector (industry) is treated as a "control" but my hunch is that the 5-6% figure conceals some really interesting variation. The study also fails to establish mechanism, i.e. to demonstrate what it is about the context of firm A and the interventions undertaken that leads to enhanced productivity etc. These kinds of issues with evaluation in the social sciences are the subject of writings by Nick Tilley and Ray Pawson (see for example, "<a href="http://evidence-basedmanagement.com/wp-content/uploads/2011/11/nick_tilley.pdf">Realistic Evaluation: An Overview</a>") which I hold in high regard. My hope is that future research will attend to these issues. For now we must settle for less complete, but still useful, knowledge.</p>
<p>I expect that as our Analytics Reconnoitre proceeds we will return to this and related research to explore further whether any kind of business case for data-driven decisions can be robustly made for Higher or Further Education, or whether we need to gather more evidence by doing. I suspect the latter to be the case and that for now we will have to resort to arguments on the basis of analogy and plausibility of benefits.</p>
<h2>Zeitgeist: Data Scientists</h2>
<p>"Data Scientist" is a term which seems to be capturing the imagination in the corporate big data and analytics community but which has not been much used in our community.</p>
<p>A facetious definition of data scientist is "a business analyst who lives in California". Stephen Brobst gave his distinctions between data scientist and business analyst in his talk. His characterisation of a business analyst is someone who: is interested in understanding the answers to a business question; uses BI tools with filters to generate reports. A data scientist, on the other hand, is someone who: wants to know what the question should be; embodies a combination of curiosity, data gathering skills, statistical and modelling expertise and strong communication skills. Brobst argues that the working environment for a data scientist should allow them to self-provision data, rather than having to rely on what is formally supported in the organisation, to enable them to be inquisitive and creative.</p>
<p>Michael Rappa from the Institute for Advanced Analytics doesn't mention curiosity but offers a similar conception of the skill-set for a data scientist in an <a href="http://www.forbes.com/sites/danwoods/2012/03/05/what-is-a-data-scientist-michael-rappa-north-carolina-state-university/3/">interview in Forbes magazine</a>. The Guardian Data Blog has also reported on various <a href="http://www.guardian.co.uk/news/datablog/2012/mar/02/data-scientist">views of what comprises a data scientist</a> in March 2012, following the Strata Conference.</p>
<p>While it can be a sign of hype for new terminology to be spawned, the distinctions being drawn by Brobst and others are appealing to me because they are putting space between mainstream practice of business analysis and some arguably more effective practices. As universities and colleges move forward, we should be cautious of adopting the prevailing view from industry - the established business analyst role with a focus on reporting and descriptive statistics - and missing out on a set of more effective practices. Our lack of baked-in BI culture might actually be a benefit if it allows us to more quickly adopt the data scientist perspective alongside necessary management reporting. Furthermore, our IT environment is such that self-provisioning is more tractable.</p>
<h2>Experimentation, Culture and HiPPOs</h2>
<p>Like most stereotypes, the HiPPO is founded on reality; this is decision-making based on the Highest Paid Person's Opinion. While it is likely that UK universities and colleges are some cultural distance from the world of corporate America that stimulated the coining of "HiPPO", we are certainly not immune from decision-making on the basis of management intuition, and anecdote suggests that many HEIs are falling into a more autocratic and executive style of management in response to a changing financial regime. As a matter of pride, though, academia really should try to be more evidence-based.</p>
<p>Avanish Kaushik (Digital Marketing Evangelist at Google) talked of <a href="http://www.kaushik.net/avinash/seven-steps-to-creating-a-data-driven-decision-making-culture/">HiPPOs and data driven decision making</a> (sic) culture back in 2006 yet in 2012 these issues are still main stage items at Teradata 2012. Cultural inertia. In addition to proposing seven steps to becoming more data-driven, Kaushik's posting draws the kind of distinctions between reporting and analysis that accords with the business analyst vs data scientist distinctions, above.</p>
<p>Stephen Brobst's talk - "Experimentation is the Key to Business Success" - took a slightly different approach to challenging the HiPPO principle. Starting from an observation that business culture expects its leadership to have the answers to important and difficult questions, something even argumentative academics can still be found to do, Brobst argued for experimentation to <span style="text-decoration: underline;">acquire</span> the data necessary for informed decision-making. He gained a further nod from me by asserting that the experiment should be designed on the basis of theorisation about mechanism (see earlier reference to the work of Tilley and Paulson).</p>
<p>Procter and Gamble's approach to pricing a new product by establishing price elasticity through a set of trial markets with different price points is one example. It is hard to see this being tractable for fee-setting in most normal courses in most universities, but it may be for some, and it becomes a lot more realistic with large-scale distance education. Initiatives like Coursera have the opportunity to build out for-fee services with much better intelligence on pricing than mainstream HE can dream of.</p>
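<p>To make the trial-market idea concrete, here is a minimal sketch of how price elasticity might be estimated from a handful of price points. The figures are invented for illustration (they are not Procter and Gamble data), the function names are my own, and the midpoint (arc) formula is only one of several ways to calculate elasticity.</p>

```python
def arc_elasticity(p1, q1, p2, q2):
    """Midpoint (arc) price elasticity of demand between two price points."""
    pct_change_quantity = (q2 - q1) / ((q1 + q2) / 2)
    pct_change_price = (p2 - p1) / ((p1 + p2) / 2)
    return pct_change_quantity / pct_change_price


def elasticities(markets):
    """Elasticity between each pair of neighbouring price points."""
    ordered = sorted(markets)
    pairs = zip(ordered, ordered[1:])
    return [arc_elasticity(p1, q1, p2, q2) for (p1, q1), (p2, q2) in pairs]


def best_revenue_price(markets):
    """Price point with the highest observed revenue (price x quantity)."""
    return max(markets, key=lambda market: market[0] * market[1])[0]


# Hypothetical trial markets: (price, units sold per 1000 customers)
trial_markets = [(4.0, 120), (5.0, 100), (6.0, 70)]

if __name__ == "__main__":
    print(elasticities(trial_markets))
    print(best_revenue_price(trial_markets))
```

<p>In this made-up case demand is inelastic between the two lower prices (magnitude below 1) and elastic between the two higher ones, which is why the middle price point gives the highest revenue.</p>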
<h2>Big Data and Nanodata Velocity</h2>
<p>There is quite a lot of talk about Big Data - data that is at such a scale that conventional databases can no longer be used - but I am somewhat sceptical that the quantity of talk is merited. One presenter at Teradata Universe actually proclaimed that big data was largely an urban myth but this was not the predominant impression; others boasted about how many petabytes of data they had (1PB = 1,000TB = 1,000,000GB). There seems to be an unwarranted implication that big data is necessary for gaining insights. While it is clear that more data points improve statistical significance and that if you have a high volume of transactions/interactions then even small % improvements can have significant bottom line value (e.g. a 0.1% increase in purchase completion at Amazon), there remains a great deal of opportunity to be more analytical in the way decisions are made using smaller scale data sources. The absence of big data in universities and colleges is an asset, not an impediment.</p>
<p>Erik Brynjolfsson chose the term "nanodata" to draw attention to the fine-grained components of most Big Data stores. Virtually all technology-mediated interactions are capable of capturing such "nanodata" and many do. The availability of nanodata is, of course, one of the key drivers of innovation in analytics. Brynjolfsson also pointed to data "velocity", i.e. the near-real-time availability of nanodata.</p>
<p>The insights gained from using <a href="http://www.google.org/flutrends/">Google search terms to understand influenza</a> are a fairly well-known example of using the "digital exhaust" of our collective activities to short-cut traditional epidemiological approaches (although I do not suggest it should replace them). Brynjolfsson cited a similar approach used in work with former co-worker Lynn Wu on <a href="http://www.nber.org/confer/2009/PRf09/Wu_Brynjolfsson.pdf">house prices and sales</a> (pdf), which anticipated official figures rather well. The US Federal Reserve Bank, we were told, was envious.</p>
<p>It has taken a long time to start to realise the vision of <a href="http://en.wikipedia.org/wiki/Project_Cybersyn">Cybersyn</a>. Yet still our national and institutional decision-making relies on slow-moving and broadly obsolete data; low velocity information is tolerated when maybe it should not be. In  some cases the opportunities from more near-real-time data may be  neglected low-hanging fruit, and it doesn't necessarily have to be Big  Data. Maybe there should be talk of  Fast Data?</p>
<h2>Data Visualisation</h2>
<p><a href="http://www.perceptualedge.com/">Stephen Few</a>, author and educator on the topic of "visual business intelligence", gave both a keynote and a workshop that could be very concisely summarised as a call to: 1) take more account of how human perception works when visualising data; 2) make more use of visualisation for sense-making. Stephen Brobst (Teradata CTO) made the latter point too: that data scientists use data visualisation tools for exploration, not just for communication.</p>
<p>Few gave an accessible account of visual perception as applied to data visualisation with some clear examples and reference to cognitive psychology. His "<a href="http://www.perceptualedge.com/">Perceptual Edge</a>" website/blog covers a great deal of this - see for example "<a href="http://www.perceptualedge.com/articles/ie/visual_perception.pdf">Tapping the Power of Visual Perception</a>" (pdf) - as does his accessible book, "<a href="http://www.amazon.co.uk/gp/product/0970601980?ie=UTF8&amp;tag=perceedge-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0970601980">Now You See It</a>". I will not repeat that information here.</p>
<p>His argument that "visual reasoning" is powerful is easily demonstrated by comparing what can be immediately understood from the right kind of graphical presentation with tabulation of the data. The corollary is that visual reasoning usually happens transparently (subconsciously), and hence we need to guard against visualisation techniques that mislead, confuse or overwhelm.</p>
<p>I did feel that he advocated visual reasoning beyond the point at which it is reliable by itself. For example, I find <a href="http://en.wikipedia.org/wiki/Parallel_coordinates">parallel coordinates</a> quite difficult. I would have liked to see more emphasis on visualising the results of statistical tests on the data (e.g. correlation, statistical significance) particularly as I am a firm believer that we should know of the strength of an inference before deciding on action. Is that correlation really significant? Are those events really independent in time?</p>
<p>Few's second key point - about the use of data visualisation for sense-making - began with claims that the BI industry has largely failed to support it. He summarised the typical pathway for data as: collect &gt; clean &gt; transform &gt; integrate &gt; store &gt; report. At this point there is a wall, Few claims, that blocks a productive sense-making pathway: explore &gt; analyse &gt; communicate &gt; monitor &gt; predict.</p>
<p>Visualisation tools tend to have been created with before-the-wall use cases, to be about the plot in the report. I rather agree with Few's criticism that such tool vendors tend to err towards a "bling your graph" feature-set or flashy dashboards but there is hope in the form of tools such as Tibco <a href="http://spotfire.tibco.com/">Spotfire</a> and <a href="http://www.tableausoftware.com/">Tableau</a>, while Open Source aficionados or the budget-less can use <a href="http://www.ggobi.org/">ggobi</a> for complex data visualisation, <a href="http://www.gnu.org/software/octave/">Octave</a> or <a href="http://cran.r-project.org/">R</a> (among others). The problem with all of these is complexity; the challenge to visualisation tool developers is to create more accessible tools for sense-making. Without this kind of advance it requires too much skill acquisition to move beyond reporting to real analytics and that limits the number of people doing analytics that an organisation can sustain.</p>
<p>It is worth noting that "<a href="https://silverspotfire.tibco.com/us/get-spotfire?product=6">Spotfire Personal</a>" is free for one year and that "<a href="http://www.tableausoftware.com/public/download">Tableau Public</a>" is free and intended to let data-bloggers et al publish to their public web servers, although I have not yet tried them.</p>
<h2>Analytics &amp; Enterprise Architecture</h2>
<p>The presentation by Adam Gade (CIO of Maersk, the shipping company)  was ostensibly about their use of data but it could equally have been  entitled "Maersk's Experiences with Enterprise Architecture". Although  at no point did Gade utter the words "Enterprise Architecture" (EA), many of the issues he raised have appeared in <a href="http://emergingpractices.jiscinvolve.org/wp/doing-ea-workshop/">talks at the JISC Enterprise Architecture Practice Group</a>:  governance, senior management buy-in, selection of high-value targets,  tactical application, ... etc. It is interesting to note that Adam Gade  has a marketing and sales background - not the norm for a CIO - yet  seems to have been rather successful; maybe he could sell the idea  internally?</p>
<p>The link between EA and Analytics is not one which has been widely  made (in my experience and on the basis of Google search results) but I  think it is an important one which I will talk of a little more in a  forthcoming blog post, along with an exploration of the Zachman  Framework in the context of an analytics project. It is also worth  noting that one of the enthusiastic adopters of our ArchiMate (TM)  modelling tool, "<a href="http://archi.cetis.ac.uk/">Archi</a>", is <a href="http://www.progressive.com/">Progressive Insurance</a> which established a reputation as a leader in putting analytics to work  in the US insurance industry (see, for example the book <a href="http://www.amazon.co.uk/Analytics-Work-Smarter-Decisions-Results/dp/1422177696">Analytics at Work</a>, which I recommend, and the <a href="http://www.accenture.com/SiteCollectionDocuments/PDF/Accenture_Analytics_At_Work_Smarter_Decisions.pdf">summary from Accenture</a>, pdf).</p>
<p>Adam Gade also talked of the importance of "continuous delivery",  i.e. that analytics or any other IT-based projects start demonstrating  benefits early rather than only after the "D-Day". I've come across a  similar idea - "time to value" - being argued for as being more tactically  important than return on investment (RoI). RoI is, I think, a rather  over-used concept and a rather poor choice if you do not have good  baseline cost models, which seems to be the case in F/HEIs. Modest  investments returning tangible benefits quickly seems like a more  pragmatic approach than big ideas.</p>
<h2>Conclusions - Thoughts on What this Means for Post-compulsory Education</h2>
<p>For all that the general perception is that universities and colleges are relatively undeveloped in terms of putting business intelligence and analytics to good use, I think there are some important "but ..." points to make. The first "but" is that we shouldn't measure ourselves against the most effective users from the commercial sector. The second is that the absence of entrenched practices means that there should be less inertia to adopting the most modern practices. Third, we don't have data at the scale that forces us to acquire new infrastructure.</p>
<p>My overall impression is that there is opportunity if we make our own path, learning from (but not following) others. Here are my current thoughts on this path:</p>
<p><strong>Learn from the Enterprise Architecture pioneers in F/HE</strong></p>
<p>Analytics  and EA are intrinsically related and the organisational soft issues in  adopting EA in F/HE have many similarities to those for adopting  analytics. One resonant message from the EA early adopters, which can be adapted for analytics, was "use just enough EA".</p>
<p><strong>Don't get hung up on Big Data</strong></p>
<p>While Big Data is a relevant technology trend, the possession of big data is not a pre-requisite for making effective use of analytics. The fact that we do not have Big Data is a freedom not a limitation.</p>
<p><strong>Don't focus on IT infrastructure (or tools)</strong></p>
<p>Avoid the temptation (and sales pitches) to focus on IT infrastructure as a means to get going with analytics. While good tools are necessary, they are not the right place to start.</p>
<p><strong>Develop a culture of being evidence-based</strong></p>
<p>The success of analytics depends on people being prepared to engage critically with evidence based on data (including its potential weaknesses or bias, and without being over-trusting of numbers) and to take action on the analysis rather than being slaves to anecdote and the HiPPO. This should ideally start from the senior management. "In God we trust, all others bring data" (<a href="http://en.wikipedia.org/wiki/W._Edwards_Deming#Quotations_and_concepts">probably mis-attributed to W. Edwards Deming</a>).</p>
<p><strong>Experiment with being more analytical at craft-scale</strong></p>
<p>Rather than thinking in terms of infrastructure or major initiatives, get some practical value with the infrastructure you have. Invest in someone with "data scientist" skills as master crafts-person and give them access to all data but don't neglect the value of developing apprentices and of developing wider appreciation of the capabilities and limitations of analytics.</p>
<p><strong>Avoid replicating the "analytics = reporting" pitfall</strong></p>
<p>While the corporate sector fights its way out of that hole, let us avoid following them into it.</p>
<p><strong>Ask questions that people can relate to and that have efficiency or effectiveness implications</strong></p>
<p>Challenge custom and practice or anecdote on matters such as: "do we assess too much?", "are our assessment instruments effective and efficient?", "could we reduce heating costs with different use of estate?", "could research groups like mine gain greater REF impact through publishing in OA journals?", "how important is word of mouth or twitter reputation in recruiting hard-working students?", "can we use analytics to better model our costs?"</p>
<p><strong>Look for opportunities to exploit near-real-time data</strong></p>
<p>Are decisions being made on old data, or no changes being made because the data is essentially obsolete? Can the "digital exhaust" of day-to-day activity be harnessed as a proxy for a measure of real interest in near-real-time?</p>
<p><strong>Secure access to sector data</strong></p>
<p>Sector organisations have a role to play in making sure that F/HEIs have access to the kind of external data needed to make the most of analytics. This might be open data or provisioned as a sector shared service. The data might be geospatial, socio-economic or sector-specific. JISC, HESA, TheIA, LSIS and others have roles to play.</p>
<p><strong>Be open-minded about "analytics"</strong></p>
<p>The emerging opportunities for analytics lie at the intersection of practices and technologies. Different communities are converging and we need to be thinking about creative borrowing and blurring of boundaries between web analytics, BI, learning analytics, bibliometrics, data mining, ... etc. Take a wide view.</p>
<p><strong>Collaborate with others to learn by doing</strong></p>
<p>We don't yet know the pathway for F/HE and there is much to be gained from sharing experiences in dealing with both the "soft" organisational issues and the challenge of selecting and using the right technical tools. While we may be competing for students or research funds, we will all fail to make the most from analytics and to effectively navigate the rapids of environmental factors if we fail to collaborate; competitive advantage is to be had from how analytics is applied  but that can only occur if capability exists.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/04/27/analytics-and-big-data-reflections-from-the-teradata-universe-conference-2012/feed/</wfw:commentRss>
		</item>
		<item>
		<title>New Draft British Standard - Exchanging Course Related Information</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/02/29/new-draft-british-standard-exchanging-course-related-information/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/02/29/new-draft-british-standard-exchanging-course-related-information/#comments</comments>
		<pubDate>Wed, 29 Feb 2012 12:37:05 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=488</guid>
		<description><![CDATA[The two parts of a draft British Standard (BS), "BS 8581 - Exchanging course related information – Course advertising profile" have recently been released for public comment on the British Standard Institute "Draft Review" website.
This standard is heavily based on the XCRI-CAP 1.2 specification, which has been developed and piloted over the past few years [...]]]></description>
			<content:encoded><![CDATA[<p>The two parts of a draft British Standard (BS), "BS 8581 - Exchanging course related information – Course advertising profile" have recently been released for public comment on the British Standard Institute "Draft Review" website.</p>
<p>This standard is heavily based on the <a href="http://www.xcri.co.uk/">XCRI-CAP</a> 1.2 specification, which has been developed and piloted over the past few years with support from JISC and CETIS, and would create a British Standard that is consistent with the European Standard "Metadata for Learning Opportunities - Advertising" (EN 15982, also to be adopted as BS) but extends it and provides more detail suited to UK application.</p>
<p>The two parts for public review, which closes on April 30th 2012, are:</p>
<ul>
<li><a href="http://drafts.bsigroup.com/Home/Details/952">BS 8581-1 - Exchanging course related information – Course advertising profile – Specification</a></li>
<li><a href="http://drafts.bsigroup.com/Home/Details/953">BS 8581-2 - Exchanging course related information – Course advertising profile – Code of practice</a></li>
</ul>
<p>Registration is required to access the drafts and comment.</p>
<p>Background information and details of implementations of XCRI may be found on the <a href="http://www.xcri.co.uk/">XCRI-CAP website.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/02/29/new-draft-british-standard-exchanging-course-related-information/feed/</wfw:commentRss>
		</item>
		<item>
		<title>EdTech Blogs - a visualisation playground</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/02/21/edtech-blogs-a-visualisation-playground/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/02/21/edtech-blogs-a-visualisation-playground/#comments</comments>
		<pubDate>Tue, 21 Feb 2012 23:57:33 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[telmap]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=462</guid>
		<description><![CDATA[During the CETIS Conference today (Feb 22nd), I showed a few graphs, plots and other visualisations that show the results of text mining around 7500 blog posts, mostly from 2011 and into early 2012. These were crawled by the RWTH Aachen University "Mediabase".
There are far too many to show here and each of three analyses [...]]]></description>
			<content:encoded><![CDATA[<p>During the CETIS Conference today (Feb 22nd), I showed a few graphs, plots and other visualisations that show the results of text mining around 7500 blog posts, mostly from 2011 and into early 2012. These were crawled by the RWTH Aachen University "<a href="http://dbis.rwth-aachen.de/cms/projects/mediabase" target="_blank">Mediabase</a>".</p>
<p>There are far too many to show here and each of three analyses has its separate auto-generated output, which is linked to below. Each of these outlines key aspects of the method and headline statistics. I am <span style="text-decoration: underline;">quite aware</span> that it is bad practice just to publish a load of visualisations without either an explicit or implicit story. If this bothers you, you might want to stop now, or visit my short piece "<a href="http://www.learningfrontiers.eu/?q=story/east-and-west-two-worlds-technology-enhanced-learning">East and West: two worlds of technology enhanced learning</a>", which uses the first method outlined below but is not such a "bag of parts". If you want to weave your own story... read on!</p>
<h2>Stage 1: Dominant Themes</h2>
<p>The starting point is simply to look at the dominant themes in blog posts from 2011 and early 2012 through the lens of frequent terms used. Common words with little significance (stop words) are removed and similar words are aggregated (e.g. learn, learner, learning). This set of blog posts is then split into two sets: those from CETIS and those from a broadly representative set of Ed Tech blogs. The frequent terms are then filtered into those that are statistically more significant in the CETIS set and those that are statistically more significant in the Ed Tech set.</p>
<p>The results of doing this are: "<a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/Compair/CETIS%20Conf%202012/Report.html" target="_blank">Comparison: CETIS Blogging vs EdTech Bloggers Generally (Jan 2011-Feb 2012)</a>"</p>
<div id="attachment_464" class="wp-caption aligncenter" style="width: 310px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/02/co-occurrence-b.png"><img class="size-medium wp-image-464" title="co-occurrence-b" src="http://blogs.cetis.ac.uk/adam/files/2012/02/co-occurrence-b-300x300.png" alt="Co-occurrence Pattern - Ed Tech Blogger Frequent Terms" width="300" height="300" /></a><p class="wp-caption-text">Co-occurrence Pattern - Ed Tech Blogger Frequent Terms. (see the &quot;results&quot; link above for explanation and more...)</p></div>
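<p>For readers who want a feel for the Stage 1 mechanics, the following is a minimal Python sketch and not the R code actually used. The stop word list, the crude suffix-stripping "stemmer", the function names and the choice of a two-cell log-likelihood statistic with a 3.84 cut-off (roughly p = 0.05) are all my assumptions for illustration; the real analysis may well use a different significance test.</p>

```python
import math
import re
from collections import Counter

# Illustrative stop word list only; a real one would be much longer
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def stem(word):
    """Crude suffix stripping so that learn, learner, learning aggregate."""
    for suffix in ("ing", "ers", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_counts(posts):
    """Count stemmed, non-stop-word terms across a list of post texts."""
    counts = Counter()
    for post in posts:
        for token in re.findall(r"[a-z]+", post.lower()):
            if token not in STOP_WORDS:
                counts[stem(token)] += 1
    return counts


def log_likelihood(a, b, total_a, total_b):
    """Two-cell log-likelihood for a term seen a times in set A, b times in set B."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    for observed, expected in ((a, expected_a), (b, expected_b)):
        if observed > 0:
            score += observed * math.log(observed / expected)
    return 2 * score


def distinctive_terms(posts_a, posts_b, critical=3.84):
    """Split terms into those significantly more frequent in A and in B."""
    counts_a, counts_b = term_counts(posts_a), term_counts(posts_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    more_a, more_b = [], []
    for term in set(counts_a) | set(counts_b):
        a, b = counts_a[term], counts_b[term]
        if log_likelihood(a, b, total_a, total_b) > critical:
            if a / total_a > b / total_b:
                more_a.append(term)
            else:
                more_b.append(term)
    return sorted(more_a), sorted(more_b)
```

<p>Calling <code>distinctive_terms(cetis_posts, edtech_posts)</code> with two lists of post texts (hypothetical variable names) returns the terms that are over-represented in each set; terms used at similar rates in both sets are dropped.</p>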
<h2>Stage 2: Emerging and Declining Themes</h2>
<h3>Stage 2a: Finding Rising and Falling Terms</h3>
<p>In this case, I home in on CETIS blogs only, but go back further in time: to January 2009. The blog posts are split into two sets: one contains posts from the last 6 months and the other contains posts since the end of January 2009. The distribution of terms appearing in each set is compared to find those which are statistically significant in the change, taking into account the sample size. This process identifies four classes of term: terms that appear anew in recent months, terms that rose from very low frequencies, those that rose from moderate or higher frequencies and those that fell (or vanished).</p>
<p>The results of doing this are: "<a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/Rising%20and%20Falling%20Terms/TELBlogs/CETIS%202012-02/Report.html" target="_blank">Rising and Falling Terms - CETIS Blogs Jan 31 2012</a>". This has a VERY LARGE number of plots, many of which can be skipped over but are of use when trying to dig deeper. This auto-generated report also contains links to the relevant blog posts and ratings for "novelty" and "subjectivity".</p>
<div id="attachment_467" class="wp-caption aligncenter" style="width: 310px"><a href="http://blogs.cetis.ac.uk/adam/files/2012/02/fallingp-vals.png"><img class="size-medium wp-image-467" title="fallingp-vals" src="http://blogs.cetis.ac.uk/adam/files/2012/02/fallingp-vals-300x300.png" alt="Significant Falling Terms" width="300" height="300" /></a><p class="wp-caption-text">Significant Falling Terms</p></div>
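<p>The four classes of term in Stage 2a can be sketched in the same spirit. Again this is an illustration in Python rather than the R code behind the report: the two-proportion z-test, the 1.96 cut-off, the "low frequency" threshold and the function names are assumptions of mine, and with small counts a z-test is only a rough approximation.</p>

```python
import math


def two_proportion_z(recent, recent_total, earlier, earlier_total):
    """z statistic for the change in a term's share between two sets of posts."""
    p_recent = recent / recent_total
    p_earlier = earlier / earlier_total
    pooled = (recent + earlier) / (recent_total + earlier_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / recent_total + 1 / earlier_total))
    return (p_recent - p_earlier) / se if se > 0 else 0.0


def classify_term(recent, recent_total, earlier, earlier_total,
                  z_crit=1.96, low_rate=0.001):
    """Place a term in one of the Stage 2a classes (thresholds are hypothetical)."""
    z = two_proportion_z(recent, recent_total, earlier, earlier_total)
    if z_crit >= abs(z):
        return "no significant change"
    if 0 > z:
        return "falling"
    if earlier == 0:
        return "new"
    if low_rate > earlier / earlier_total:
        return "rising from low frequency"
    return "rising from moderate or higher frequency"


if __name__ == "__main__":
    # Invented counts: term occurrences and total terms in each set of posts
    print(classify_term(30, 10000, 0, 50000))
    print(classify_term(5, 10000, 250, 50000))
```

<p>Because the standard error shrinks as the totals grow, the same change in a term's share is more likely to be flagged in a larger set of posts, which is the sense in which the sample size is taken into account.</p>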
<h3>Stage 2b: Visualising Changes Over Time</h3>
<p>Various terms were chosen from Stage 2a and the changes in time rendered using the (in-) famous "bubble chart". Although these should not be taken too seriously since the quantity of data per time step is rather small, these allow for quite a lot of experimentation with a range of related factors: term frequency, number of documents containing the term, positive/negative sentiment in posts containing the term. Four separate charts were created for CETIS blogs from 2009-2012: <a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/History%20Visualiser/CETIS%20Conf%202012%20-%20CETIS%20Blogs/Rising.html">Rising</a>, <a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/History%20Visualiser/CETIS%20Conf%202012%20-%20CETIS%20Blogs/Established.html">Established</a>, <a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/History%20Visualiser/CETIS%20Conf%202012%20-%20CETIS%20Blogs/Falling.html">Falling</a> and <a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/History%20Visualiser/CETIS%20Conf%202012%20-%20CETIS%20Blogs/Familiar.html">Familiar</a> (dominant terms from Stage 1). The <a href="http://arc12.github.com/Text-Mining-Weak-Signals-Output/History%20Visualiser/CETIS%20Conf%202012%20-%20EdTech%20Blogs/Familiar.html" target="_blank">dominant non-CETIS terms are also available</a>, but only for 2011.</p>
<p><script src="//www.gmodules.com/ig/ifr?url=http://hosting.gmodules.com/ig/gadgets/file/104028096938989264141/Familiar_gadget.xml&amp;synd=open&amp;w=480&amp;h=380&amp;title=History+Visualiser%3A+CETIS+Blogs+-+Jan+2009-Feb+2012+-+select+dominant+terms+(c.f.+EdTech+generally)&amp;border=%23ffffff%7C3px%2C1px+solid+%23999999&amp;output=js"></script></p>
<h2>Final Words</h2>
<p>Due to some problems with the blog crawler, a number of blogs could not be processed or had incompletely extracted postings so this is not truly representative. The results are not expected to change dramatically but there will be some terms appearing and some disappearing when these issues are fixed. This posting will be altered and the various auto-generated reports will be re-generated in due course.</p>
<p>The R code used, and results from using the same methods on conference abstracts/papers are <a href="https://github.com/arc12/Text-Mining-Weak-Signals/wiki">available from my GitHub</a>. This site also includes some notes on the technicalities of the methods used (i.e. separate from the way these were actually coded).</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/02/21/edtech-blogs-a-visualisation-playground/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Network of Society of Scholars (Fiction)</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/02/20/the-network-of-society-of-scholars-fiction/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/02/20/the-network-of-society-of-scholars-fiction/#comments</comments>
		<pubDate>Mon, 20 Feb 2012 11:23:24 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[jiscobs]]></category>

		<category><![CDATA[telmap]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=469</guid>
		<description><![CDATA[As preparation for the session "Emerging Reality: Making sense new models of learning organisations" at this week's CETIS Conference (which is a session hosted by the TELMap Project), I have created the following scenario to try to make real some plausible drivers/issues/etc. The session will be debating the plausibility of these and other issues and [...]]]></description>
			<content:encoded><![CDATA[<p>As preparation for the session "<a href="http://wiki.cetis.ac.uk/Emerging_Reality:_Making_sense_new_models_of_learning_organisations" target="_blank">Emerging Reality: Making sense new models of learning organisations</a>" at this week's CETIS Conference (which is a session hosted by the TELMap Project), I have created the following scenario to try to make real some plausible drivers/issues/etc. The session will be debating the plausibility of these and other issues and hence their potential as shapers of future learning organisations. I will emend this posting once the outcomes of the workshop are published.</p>
<p>The scenario is pure fiction, an informal speculation about something that <span style="text-decoration: underline;">might</span> happen by around 2020-5.</p>
<h2>The Scenario</h2>
<h3>What is a “Society of Scholars”?</h3>
<p>A Society of Scholars is based in one or more large old houses with a combination of study-bedrooms, communal cooking and social spaces, a library and a central seminar. Students and some of the Fellows live there while many Fellows live with their families and study there.</p>
<p>There are no fixed courses but a framework within which depth and breadth of scholarship is guided and measured. This framework is validated by an established University, which awards the degree and provides QA (all for a fee). The Network of Societies of Scholars™ has additional ethical codes and strict membership rules.<br />
The physical co-location is a central part of the Society, combined with the wider (virtual) network of peers.</p>
<h3>Genesis</h3>
<p>Societies of Scholars sprang out of an initial “wild card” experiment where a small group of progressive academics with experience of inquiry-based learning pooled their redundancy payments from one of many rounds of staff-culling. A few sold their houses.</p>
<p>Their idea was to strip out the accumulation of both central services and formality of teaching and learning setting and to get back to basics while reducing cost and being able to do more of what they enjoy: thinking and talking. In doing this they hoped to attract students who were otherwise being asked to pay ever higher fees to endure ever more “commoditised” offerings and suffer poor employment prospects. The promise of high wages to pay off high debt is elusive for many who follow the conventional route. Graduate employment and student satisfaction are worse for those who opt for the newer “no frills degree course” offerings, which have cut costs without re-inventing the educational experience.</p>
<p>For several years they struggled to attract students but gradually a few gifted students, who had managed to develop ultra-high web reputations, started to attract more applications. The turning point was the winning of an international prize for work on “Smart Cities”, which led to a media frenzy in 2018. This triggered a spate of endowments of new Societies by successful entrepreneurs and the establishment of satellite societies to Cambridge and Oxford Universities in the UK and ETH Zurich in Switzerland with others quickly following (all recognising the threat but also the early-mover opportunity).</p>
<h3>Character of a Society of Scholars</h3>
<p>Societies are highly reputation conscious, as are the individuals within them. They are highly effective at using the web, what we called “new media” in the noughties, and at media management generally.</p>
<p>With the exception of assessment, Fellows and Students undertake essentially the same kind of activities; the Students strive to emulate the attitude and work of the Fellows. Both divide their time between private study, informal and formal discussion. Collaboration works. There is no “Fellows teach Students”; all teach each other through the medium of the seminar. All consider “teaching the world” to be an important (but not dominating) part of what they do.</p>
<p>The selection process plays a key role in shaping the character of the Society. Students are admitted NOT primarily on the basis of examination grades but on evidence of self-discipline, self-awareness and especially self-directed intellectual activity.</p>
<h3>Course and Assessment</h3>
<p>There are no specified courses and all Students follow a unique pathway of their own. Fellows offer guidance and almost all Students piece together a collection of topics that are identifiable (e.g. similar to a conference theme, a textbook, etc). There is no fixed minimum or maximum period of study.</p>
<p>Societies typically focus on 3-4 disciplines but always adopt a multi-disciplinary perspective, for example computer science, electronic engineering, built environment and social theory was the combination that led to the “Smart Cities” prize.</p>
<p>Online resources are exploited to the fullest extent. Free or cheap MOOCs (massive open online courses, especially the form pioneered by Stanford University and <a href="http://www.udacity.com/" target="_blank">udacity.com</a>) are combined with the for-fee examinations offered alongside them.</p>
<p>Wikipedia is considered to be a “has been”; Society members (across the Network) and others collaborate on DIY textbooks using a system built on top of “<a href="http://en.wikipedia.org/wiki/Git_%28software%29">git</a>” (permitting multiple versions, derivatives, etc.; see <a href="https://github.com/" target="_blank">GitHub</a> for a "social coding" example) and a decentralised network of small servers. While being widely useful, this activity is also a valued learning activity with the side effect of promoting coherence in the study pathway.</p>
<p>Assessment is complicated primarily by the idiosyncrasy of all pathways but also by the need to connect achievement to the breadth/depth framework. An award is typically evidenced by a mixture of: externally taught and examined modules; public examinations of the University of London; a patchwork of personal work (a “portfolio”); contributions to the DIY textbooks; seminar performance.</p>
<h3>Demand and Expansion</h3>
<p>Societies of Scholars are niche occupiers in a much wider higher education landscape. Demand is no more than 5% and supply  only about 3% in 2025. There is a feeling that graduates of the Societies are the “new elite”.</p>
<p>While some politicians call for the massification of the Society concept, society at large recognises that they need a special kind of student: more of an intellectual entrepreneur. The rise of the Society of Scholars has, however, started to change the way society understands (and answers) questions like: “what is the purpose of education?”; “how does learning happen?”... The long-term effect of this change on the face of education is not known yet (2025).</p>
<p>Employers in particular have understood what Societies offer and, while graduate unemployment for those following a conventional route to a degree remains close to 2012 levels, Society graduates are highly employable. Employers value: creativity, good communication skills, media-savvy people, multi-disciplinary thinking, self-motivation, intellectual flexibility, collaborative and community-oriented lifestyle.</p>
<h2>The Drivers/Issues</h2>
<p><em>This is a summary of some of the drivers/issues, implicit or explicit, that are assumed in the scenario and that determine its plausibility (or that of alternatives). They are intentionally phrased as statements that could be disagreed with, argued for, ...</em></p>
<ol>
<li>Physical co-location and (especially intimate) face-to-face interactions will continue to be seen to be an essential aspect of high quality education. Students who can afford (or otherwise access) this will generally do so. Employers will value awards arising from courses containing it more highly than those that do not. <a href="http://www.telegraph.co.uk/education/universityeducation/6677998/Funding-cuts-threaten-Oxford-University-tutorial-system.html" target="_blank">Telegraph newspaper article</a>.</li>
<li>Graduate unemployment will be an issue for years to come. Effective undergraduates will find ways to distinguish themselves. <a href="http://www.hesa.ac.uk/content/view/2188/393/" target="_blank">HESA Statistics</a></li>
<li>Wikipedia (and similar centralised "web commons" services) are unsustainable in their current form. As the demand from users rises and the support from contributors and sponsors wanes (it becomes less cool to be a Wikipedian) a point of unsustainability is reached. One option is to monetise but another is to "go feral" and transition to peer-to-peer or decentralised approaches. <a href="http://www.digitaltrends.com/computing/analysts-advise-wikipedia-to-stop-asking-for-donations/" target="_blank">Digital Trends article</a>.</li>
<li>Universities and colleges will increase the supply of course and educational components, disaggregated from "the course", "the programme" and "the institutional offering". Examinations, Award Granting and Quality Assurance are all potentially independent marketable offerings. <a href="http://www.bbc.co.uk/news/10278662" target="_blank">David Willetts article on the BBC</a> (see "Flexible Learning")</li>
<li>Cheap large-scale online courses are capable of replacing a significant percentage of conventional teaching time. The “Introduction to AI” course demonstrated this: see <a href="http://is.gd/JOseb" target="_blank">http://is.gd/JOseb</a>.</li>
<li>Employers are conservative when it comes to education. While employers bemoan the narrow knowledge of graduates, poor "soft skills", etc., their shortlisting criteria continue to favour candidates with conventional degree titles and high grades from research-intensive universities. They will generally fail to take advantage of rich portfolio evidence.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/02/20/the-network-of-society-of-scholars-fiction/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Stanford &#8220;Introduction to AI&#8221; Course - the sign of a disruptive innovation?</title>
		<link>http://blogs.cetis.ac.uk/adam/2012/02/08/the-stanford-introduction-to-ai-course-the-sign-of-a-disruptive-innovation/</link>
		<comments>http://blogs.cetis.ac.uk/adam/2012/02/08/the-stanford-introduction-to-ai-course-the-sign-of-a-disruptive-innovation/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 15:41:29 +0000</pubDate>
		<dc:creator>adam</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blogs.cetis.ac.uk/adam/?p=460</guid>
		<description><![CDATA[Over on the JISC Observatory website a recent interview with Seb Schmoller has just been published in which he talks about his experiences - from the perspective of an online distance educator - of the recent large scale open online course "Introduction to AI" run in association with Stanford University. As the interview unfolded it [...]]]></description>
			<content:encoded><![CDATA[<p>Over on the JISC Observatory website a <a href="http://blog.observatory.jisc.ac.uk/2012/02/08/observations-on-learning-technology-innovation-%E2%80%93-an-interview-with-seb-schmoller/" target="_blank">recent interview with Seb Schmoller</a> has just been published in which he talks about his experiences - from the perspective of an online distance educator - of the recent large-scale open online course "Introduction to AI" run in association with Stanford University. As the interview unfolded it occurred to me that the aspects of the course that had struck Seb as being of potentially profound importance fitted the criteria for a "low end <a href="http://en.wikipedia.org/wiki/Disruptive_innovation">disruptive innovation</a>" in the terminology of innovation theorist Clayton M Christensen. Low end disruption refers to the way apparently well-run businesses can be disrupted by newcomers with cheaper but good-enough offerings that focus on core customer needs and often make use of generic off-the-shelf technologies.</p>
<p>Interesting stuff to ponder...</p>
<p>(<a href="http://blog.observatory.jisc.ac.uk/2012/02/08/observations-on-learning-technology-innovation-%E2%80%93-an-interview-with-seb-schmoller/">interview on the JISC Observatory site</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cetis.ac.uk/adam/2012/02/08/the-stanford-introduction-to-ai-course-the-sign-of-a-disruptive-innovation/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
