ebooks 2013

Every year for the past dozen or so years the Department of Information Sciences at UCL have organised a meeting on ebooks. I’ve only been to one of them before, two or three years ago, when the big issues were around what publishers’ DRM requirements for ebooks meant for libraries. I came away from that musing on what the web would look like if it had been designed by publishers and librarians (imagine questions like: “when you lend out our web page, how will you know that the person looking at the screen is a member of your library?”…). So I wasn’t sure what to expect when I decided to go to this year’s meeting. It turned out to be far more interesting than I had hoped, and I latched on to three themes of particular interest to me: changing paradigms (what is an ebook?), eTextBooks and discovery.

Changing paradigms

With the earliest printed books, or incunabula, such as the Gutenberg Bible, printers sought to mimic the handwritten manuscripts with which 15th-century scholars were familiar, in much the same way as publishers now seek to replicate printed books as ebooks.

In the first presentation of the day Lorraine Estelle, chief executive of Jisc Collections, focussed on access to electronic resources. Access not lending; resources not ebooks. She highlighted the problems of applying yesterday’s language and thinking in this context, like having a “horseless carriage” and buying it hay. [This is my chance to make the analogy between incunabula and ebooks again, see right.] The sort of discussions I recalled from the previous meeting I attended reflect this thinking: publishers wanting a digital copy of a book to be equivalent to the physical book, only lendable to one person at a time and requiring replacement after a certain number of loans.

We need to treat digital content as offering new possibilities and requiring new ways of working. This might be uncomfortable for publishers (some more than others) and there was some discussion about how we cannot assume that all students will naturally see the advantages, especially if they have mostly encountered problematic content that presents little that could not be put on paper but is encumbered with DRM to the point that it is questionable as to whether they really own the book. But there is potential as well as resistance. Of course there can be more interesting, more interactive content–Will Russell of the Royal Society of Chemistry described how they have been publishing to mobile devices, with tools such as Chem Goggles that will recognise a chemical structure and display information about the chemical. More radically, there can also be new business models: Lorraine suggested Institutions could become publishers of their own teaching content, and later in the day Caren Milloy, also of Jisc Collections, and Brian Hole of Ubiquity Press pointed to the possibilities of open access scholarly publishing.

Caren’s work with the OAPEN Library is worth looking through for useful information relating to quality assurance in open monographs, such as notifying readers of updates or errata. Caren also talked about the difficulties in advertising that a free online version of a resource is available when much of the dissemination and discovery ecosystem (you know, Amazon, Google…) is geared around selling stuff, difficulties that work with EDItEUR on the ONIX metadata scheme will hopefully address soon.

Brian described how Ubiquity Press can publish open access ebooks by driving down costs and being transparent about what they charge for. They work from XML source, created overseas, from which they can publish in various formats including print on demand, and explore economies of scale by working with university presses, resulting in a charge to the author (or their funders) of about £150 for a chapter, assuming there is nothing too complex in that chapter.

eTextBooks

All through the day there were mentions of eTextBooks, starting again with Lorraine who highlighted the paperless medic and how his quest to work only with digital resources is complicated by the non-articulation of the numerous systems he has to use. When she said that what he wanted was all his content (ebooks, lecture handouts, his own notes etc.) on the same platform, integrated with knowledge about when and where he had to be for lectures and when he had exams, I really started to wonder how much functionality can you put into an eContent platform before it really becomes a single-person content-oriented VLE. And when you add in the ability to share notes with the social and communication capability of most mobile devices, what then do you have?

A couple of presentations addressed eTextBooks directly, from a commercial point of view. Jenni Evans spoke about Vital Source and Andrejs Alferovs about Kortext, both of which are in the business of working with institutions distributing online textbooks to students. Both seem to have a good grasp of what students want, which I think should be useful requirements to feed into eTextBook standardization efforts such as eTernity. These include:

  • ability to print
  • offline access
  • availability across multiple devices
  • reliable access under load
  • integration with VLE
  • integration with syllabus/curriculum
  • epub3 interactive content
  • long term access
  • ability for student to highlight/annotate text and share this with chosen friends
  • ability to search text and annotations

Discovery

There was also a theme of resource discovery running through the day, and I have already mentioned in passing that this referenced Google and Amazon, but also social media. Nick Canty spoke about a survey of library use of social media. I thought it interesting that there seemed to be some sophisticated use of the immediacy of Twitter to direct people to more permanent content, e.g. to engagement on Facebook or the library website.

Both Richard Wallis of OCLC and Robert Faber of OUP emphasized that users tend to use Google to search and gave figures for how much of the access to library catalogue pages came direct from Google and other external systems, not from their own catalogue search interface. For example the Bibliothèque nationale de France found that 80% of access to their catalogue pages came directly from web search engines, not catalogue searches, and Robert gave similar figures for access to Oxford Journals. The immediate consequence of this is that if most people are trying to find content using external systems then you need to make sure that at least some (as much as possible, in fact) of your content is visible to them–this feeds in to arguments about how open access helps solve discoverability problems. But Richard went further: he spoke about how the metadata describing the resources needs to be in a language that Google/Bing/Yahoo understand, and that language is schema.org. He did a very good job of distinguishing between the usefulness of specialist metadata schema for exchanging precise information between libraries or publishers and their limits when trying to pass general information to Google:

it’s no use using a language only you speak.

Richard went on to speak about the Google Knowledge Graph and their “things not strings” approach facilitated by linked data. He urged libraries to stop copying text and to start linking, for example not to copy an author name from an authority file but to link to the entry in that file; in Eric Miller’s words, to move from cataloguing to “catalinking”.
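To make the “things not strings” idea concrete, here is a minimal sketch in Python contrasting a record that copies the author's name as text with one that links to an authority entry. The schema.org terms used (Book, Person, name, author, sameAs) are real, but the JSON-LD style of serialisation is my assumption rather than something from the talk, and the book title, author name and authority URI are invented placeholders.

```python
import json

# "Strings": the author's name is simply copied into the record as text.
copied = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "An Example Monograph",  # invented title
    "author": "Example, Ann",        # invented name, copied as a string
}

# "Things": the author is an entity identified by a link to an authority entry.
# The URI is a placeholder, not a real authority record.
linked = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "An Example Monograph",
    "author": {
        "@type": "Person",
        "name": "Ann Example",
        "sameAs": "http://example.org/authority/ann-example",
    },
}


def author_links(record):
    """Return any authority links given for a record's author."""
    author = record.get("author")
    if isinstance(author, dict) and "sameAs" in author:
        return [author["sameAs"]]
    return []


print(json.dumps(linked, indent=2))
```

A consumer of the first record can only compare strings; a consumer of the second can follow the link and find out what else is known about the same person.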

ebooks?

So was this really about ebooks? Probably not, and the point was made that over the years the name of the event has variously stressed ebooks and econtent and that over that time what is meant by “ebook” has changed. I must admit that for me there is something about the idea of an [e]book that I prefer over a “content aggregation”, but if we use the term ebook, let’s use it acknowledging that the book of the future will be as different from what we have now as what we have now is from the medieval scroll.

Picture Credit
Scanned image of a page of the Epistle of St Jerome in the Gutenberg Bible taken from Wikipedia. No Copyright.

Learning Resource Metadata is Go for Schema

The Learning Resource Metadata Initiative aimed to help people discover useful learning resources by adding to the schema.org ontology properties to describe educational characteristics of creative works. Well, as of the release of schema draft version 1.0a a couple of weeks ago, the LRMI properties are in the official schema.org ontology.

Schema.org represents two things: 1, an ontology for describing resources on the web, with a hierarchical set of resource types each with defined properties that relate to their characteristics and relationships with other things in the schema hierarchy; and 2, a syntax for embedding these into HTML pages–well, two syntaxes, microdata and RDFa lite. The important factor in schema.org is that it is backed by Google, Yahoo, Bing and Yandex, which should be useful for resource discovery. The inclusion of the LRMI properties means that you can now use schema.org to mark up your descriptions of the following characteristics of a creative work:

audience: the educational audience for whom the resource was created, who might have educational roles such as teacher, learner, parent.

educational alignment: an alignment to an established educational framework, for example a curriculum or frameworks of educational levels or competencies. Expressed through an abstract thing called an Alignment Object which allows a link to and description of the node in the framework to which the resource aligns, and specifies the nature of the alignment, which might be that the resource ‘assesses’, ‘teaches’ or ‘requires’ the knowledge/skills/competency to which the resource aligns or that it has the ‘textComplexity’, ‘readingLevel’, ‘educationalSubject’ or ‘educationLevel’ expressed by that node in the educational framework.

educational use: a text description of the purpose of the resource in education, for example assignment, group work.

interactivity type: the predominant mode of learning supported by the learning resource. Acceptable values are ‘active’, ‘expositive’, or ‘mixed’.

is based on url: a resource that was used in the creation of this resource. Useful for when a learning resource is a derivative of some other resource.

learning resource type: the predominant type or kind characterizing the learning resource. For example, ‘presentation’, ‘handout’.

time required: approximate or typical time it takes to work with or through this learning resource for the typical intended target audience.

typical age range: the typical range of ages of the content’s intended end user.

Of course, much of the other information one would want to provide about a learning resource (what it is about, who wrote it, who published it, when it was written/published, where it is available, what it costs) was already in schema.org.
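As an illustration of what marking up a page with these properties could look like, here is a minimal sketch that renders a resource description as HTML microdata. The schema.org type and property names (CreativeWork, EducationalAudience, AlignmentObject, educationalAlignment, alignmentType, targetName, targetUrl and so on) are the ones in the ontology; the render function, the resource itself and the example.org URL are made up for the example, and a real page would weave the attributes into its existing markup rather than generate a bare block like this.

```python
from html import escape


def render(props, itemtype, itemprop=None):
    """Render a dict of schema.org properties as lines of HTML microdata.

    Nested dicts become nested items and must carry an "@type" key.
    """
    prop_attr = f' itemprop="{itemprop}"' if itemprop else ""
    lines = [f'<div{prop_attr} itemscope itemtype="http://schema.org/{itemtype}">']
    for name, value in props.items():
        if isinstance(value, dict):
            nested = dict(value)
            nested_type = nested.pop("@type")
            lines.extend("  " + line for line in render(nested, nested_type, name))
        elif name.lower().endswith("url"):
            # URL-valued properties go in an href rather than in visible text
            lines.append(f'  <link itemprop="{name}" href="{escape(str(value))}">')
        else:
            # Text values are displayed to the reader as well as to the crawler
            lines.append(f'  <span itemprop="{name}">{escape(str(value))}</span>')
    lines.append("</div>")
    return lines


# A made-up resource, described with LRMI properties
resource = {
    "name": "Introduction to Fractions",
    "learningResourceType": "handout",
    "educationalUse": "group work",
    "interactivityType": "expositive",
    "typicalAgeRange": "9-11",
    "timeRequired": "PT30M",  # ISO 8601 duration: 30 minutes
    "audience": {"@type": "EducationalAudience", "educationalRole": "teacher"},
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "targetName": "Fractions (hypothetical curriculum node)",
        "targetUrl": "http://example.org/curriculum/fractions",
    },
}

markup = "\n".join(render(resource, "CreativeWork"))
print(markup)
```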

Unfortunately one really important property suggested by LRMI hasn’t yet made the cut, that is useRightsURL, a link to the licence under which the resource may be used, for example the Creative Commons licence under which it has been released. This was held back because of obvious overlaps with non-educational resources. The managers of schema.org want to make sure that there is a single solution that works across all domains.

Guides and tools

To promote the uptake of these properties, the Association of Educational Publishers has released two new user guides.

The Smart Publisher’s Guide to LRMI Tagging (pdf)

The Content Developer’s Guide to the LRMI and Learning Registry (pdf)

There is also the InBloom Tagger described and demonstrated in this video.

LRMI in the Learning Registry

As the last two resources show, LRMI metadata is used by the Learning Registry and services built on it. For what it is worth, I am not sure that is a great example of its potential. For me the strong point of LRMI/schema.org is that it allows resource descriptions in human readable web pages to be interpreted as machine readable metadata, helping create services to find those pages; crucially the metadata is embedded in the web page in a way that Google trusts because the values of the metadata are displayed to users. Take away the embedding in human readable pages, which is what seems to happen when used with the Learning Registry, and I am not sure there is much of an advantage for LRMI compared to other metadata schema, though to be fair I’m not sure that there is any comparative disadvantage either, and the effect on uptake will be positive for both sides. Of course the Learning Registry is metadata agnostic, so having LRMI/schema.org metadata in there won’t get in the way of using other metadata schema.

Disclosure (or bragging)

I was lucky enough to be on the LRMI technical working group that helped make this happen. It makes me very happy to see this progress.

Brief reflections on Open Practice and OER Sustainability

Lorna and I ran a session at the CETIS conference on the topic of Open Practice and OER Sustainability. We had 10-minute presentations from ten brilliant people who have been involved in the UKOER programme, each giving a view from their own perspective on the general problem of “what now that the Jisc money has gone?” It’s fruitless to try to summarise that in full, so what I will do is add links to presentations to the session page linked to above and give my own very cursory summary of a few of the themes. Lorna has also written a summary on her own blog.

“Scratch your own itch”

One of the most telling comments on sustainability, from Julian Tenney talking about the Xerte project, was that a project would most likely be sustainable if it was about doing something that the people involved needed doing anyway. Not necessarily something that would be done anyway (though in Xerte’s case mostly it was), but definitely not something that was being done just because the money was there. I agree with a comment that was made that there is a problem with the way that Universities treat project funding in this respect (at least in research departments): the emphasis is always on chasing money, getting the next grant. There were many examples of what it might be that “needs doing anyway”, at personal, subject community, institutional, and national/sector-wide level: the sharing of resources between humanities teachers using HumBox, extra mural studies of the Department for Continuing Education at Oxford University, the institutional teaching and learning policy at Leeds Met University, FE colleges in Scotland working in ever closer union, and student progression from College to University.

(By: Nick Sheppard, Leeds Metropolitan University)

Nick Sheppard asked for a technical infrastructure to support these institutional and other policies. He (and others) asked for APIs and other links between repositories (and the rest of the web, I assume) so that the greatest advantage could be had for effort. Sarah Currier told us about the new offers from Mimas to make your OER effort “Jorum Powered” through a hosted repository, a web interface into Jorum, or by building custom applications using the new Jorum API.

But with technical infrastructure come technical requirements. David Kernohan was worried that these requirements are only bearable by an academic with help, and that once the Jisc funding goes that support will also go. Suzanne Hardy also touched on this.

(By: David Kernohan, Jisc. The teddy bear is an academic.)

The concept involved here was identified by Yvonne Howard as relative advantage: the advantage of something has to be compared to the costs, and the costs have to be minimised, as can be done through clever technology such as maximum use of machine created metadata.

“It’s like MOOCs stole OER’s girlfriend”

So far I’ve mentioned advantages for many people but glossed over the fact that different people will see different advantages, and for that reason they will pursue different directions, as we have seen with MOOCs. Amber Thomas of Warwick University (but yes, the same Amber as was of JISC) described MOOCs and OERs as distant cousins who used to get on but are now no longer friendly for some reason. And it’s not like the O for Open in the two really stands for the same thing; as Pat Lockley said, their open is not necessarily our open. But, he asked, what is open? A footpath through private land, or a National Park with the right to roam where you please (if you can manage to get there)?

(this last photo is mine and is covered by the CC-BY licence of this blog; the others aren’t and are used according to their various licences or permissions from their creators.)

eTextBooks Europe

I went to a meeting for stakeholders interested in the eTernity (European textbook reusability networking and interoperability) initiative. The hope is that eTernity will be a project of the CEN Workshop on Learning Technologies with the objective of gathering requirements and proposing a framework to provide European input to ongoing work by ISO/IEC JTC 1/SC36, WG6 & WG4 on eTextBooks (which is currently based around Chinese and Korean specifications). Incidentally, as part of the ISO work there is a questionnaire asking for information that will be used to help decide what that standard should include. I would encourage anyone interested to fill it in.

The stakeholders present represented many perspectives from throughout Europe: publishers, publishing industry specification bodies (e.g. IDPF who own EPUB3, and DAISY), national bodies with some sort of remit for educational technology, and elearning specification and standardisation organisations. I gave a short presentation on the OER perspective.

Many issues were raised through the course of the day, including (in no particular order)

  • Interactive and multimedia content in eTextbooks
  • Accessibility of eTextbooks
  • eTextbooks shouldn’t be monolithic and immutable chunks of content; it should be possible to link directly to specific locations or to disaggregate the content
  • The lifecycle of an eTextbook. This goes beyond initial authoring and publishing
  • Quality assurance (of content and pedagogic approach)
  • Alignment with specific curricula
  • Personalization and adaptation to individual needs and requirements
  • The ability to describe the learning pathway embodied in an eTextbook, and vary either the content used on this pathway or to provide different pathways through the same content
  • The ability to describe a range of IPR and licensing arrangements of the whole and of specific components of the eTextbook
  • The ability to interact with learning systems with data flowing in both directions

If you’re thinking that sounds like a list of the educational technology issues that we have been busy with for the last decade or two, then I would agree with you. Furthermore, there is a decade or two’s worth of educational technology specs and standards that address these issues. Of course not all of those specs and standards are necessarily the right ones for now, and there are others that have more traction within digital publishing. EPUB3 was well represented in the meeting (DITA is the other publishing standard mentioned in the eTernity documentation, but no one was at the meeting to talk about that) and it doesn’t seem impossible to meet the educational requirements outlined in the meeting within the general EPUB3 framework. The question is which issues should be prioritised and how should they be addressed.

Of course a technical standard is only an enabler: it doesn’t in itself make any change to teaching and learning; change will only happen if developers create tools and authors create resources that exploit the standard. For various reasons that hasn’t happened with some of the existing specs and standards. A technical standard can facilitate change but there needs to be a will or a necessity to change in the first place. One thing that made me hopeful about this was a point made by Owen White of Pearson that he did not think of the business he is in as being centred around content creation and publishing but around education and learning, and that leads away from the view of eBooks as isolated static aggregations.

For more information keep an eye on the eTernity website

At the end of the JLeRN experiment

The JLeRN experiment was a toe dipped in the Learning Registry, a trial of a different approach to sharing information about learning resources and how they are used, one that focusses on getting the information out there and not on worrying over the schemas and formats in which the information is conveyed. That experiment (JLeRN, not the Learning Registry as a whole) is drawing to a close, so we had a meeting earlier this week to review what had been done, what had been learnt and what was left to do and learn.

Sarah Currier had arranged for projects that had worked with JLeRN to blog something about what they had done before the meeting; here’s the email with a summary of them. If you haven’t come across JLeRN before you might want to have a look through them before reading on. What I want to describe here is my own understanding of where the Learning Registry is and to report some of the issues about it raised at the meeting.

The Learning Registry: Nodes or a network?

The learning registry as a network from a presentation by Dan Rehak and others. © Copyright 2011 US Advanced Distributed Learning Initiative: CC-BY-3.0.

From the outset the Learning Registry was conceived as a network: the software created would be nodes that connected together to share data about resources. Some of the details have been put on the back burner since those early descriptions, for example the ideas of communities and gateway nodes haven’t been much developed.

The community map on the Learning Registry website shows three nodes (the red pins), including the JLeRN node; Steve Midgely told us via email “There are a few development nodes out there that we know of: Agilix, Illinois Dept of Commerce and California Dept of Ed. To my knowledge there are no production nodes beyond the ones we currently run. Several companies have expressed interest in taking over our production nodes including Dell, Cisco and Amazon.” To that tally I can add the EngRich node at Liverpool. Steve adds that the only network he knows of is the LR public network. Now, I’m not sure about the other nodes, but I do know that the JLeRN and EngRich nodes haven’t interacted with the public network in any meaningful way (yet).

So I think we have to say that, to date, there isn’t really much to prove the concept of the Learning Registry as a network. There are, however, some developments in the works that I think will change that, for example the Learning Registry Index, see below.

Services
The other aspect of the development of the Learning Registry against the vision shown in the diagram above is that of services being built to interact with the data in the nodes (these are shown as squares in the diagram above). This is crucial since the Learning Registry is no more than plumbing to shift data around; it does nothing with that data that would interest a teacher or learner. It is left to others to develop services that meet user needs–Pat Lockley summed this up quite nicely in his presentation showing how the Learning Registry was targeted at developers and promoted relationships between developers, service managers and users more than was the case with traditional repository software.

“I think the major point of my slides was to suggest the learning registry is a “developer’s repository” – not that you need a developer to use it, more that you develop services around a node. Also, I feel there is a greater role for the developer in the ecosystems around a node than around a repository – the services on offer, and the scope of services you create seem richer – partially as any data can be stored.”

Well, there are some services for getting data in: there is the OAI-PMH to Learning Registry Publish Utility, and there is Pat’s RSS importer, Ramanathan, and his Google Analytics data importer, Pliny. Also at least two projects–Scott Wilson’s SPAWS and Liverpool University’s EngRich–had involved the submission of data to Learning Registry nodes as part of the services they created.

But putting data in is meeting a service manager’s needs; it’s no good in itself since it doesn’t meet any user needs. There are a few user oriented services built off data in the Learning Registry. Pat showed us a couple of Chrome plugins, demos here and here. These are great as proofs of concept, and really important as such: they help show non-technical people what the Learning Registry is for. But there then follows some expectation management while you explain the limitations of the demonstrators. Other projects had embedded means of getting data out of the Learning Registry nodes into their project outputs, for example EngRich have an iLike widget for the Liverpool student portal that shows what resources students on specific courses have recommended based on data in their Learning Registry node.

Steve Midgely provided us with some very promising information, “the Gates foundation is funding several groups to build index and search services on top of Learning Registry (called Learning Registry Index) and that will require running nodes of some kind.”

Does it work?
One message that I picked up during the meeting and elsewhere is that the Learning Registry, as software, works. The people who set up nodes seem to have done so quickly, the people who used the APIs didn’t report problems in doing so. That’s a good place to be starting.

At a deeper level I guess we need to wait until there are more services built off the data in the Learning Registry to find out whether the Learning Registry works as a concept. Some known problems have been deliberately pushed out of scope in the development of the Learning Registry; a key one is not worrying about the formats and schemas of the data that goes in. This is good if you are submitting data, but unless some level of agreement is reached it does place the onus for making sense of the data on the people who are creating services that use the data. So far, the extent to which this (reaching agreement or making sense of arbitrary data) is possible in the context of the Learning Registry is untested.
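To show where that onus falls, here is a rough sketch of what a service built on schemaless data has to do: keep one small adapter for each schema it understands and set aside everything else. The record layout, the schema labels and the field names are all invented for illustration; they are not the Learning Registry's actual document format.

```python
# Records as a service might find them in a schemaless store (invented layout)
records = [
    {"schema": "dc",
     "data": {"dc:title": "Beam bending basics",
              "dc:identifier": "http://example.org/r/1"}},
    {"schema": "lrmi",
     "data": {"name": "Stress and strain",
              "url": "http://example.org/r/2"}},
    {"schema": "unknown",
     "data": {"heading": "Untitled notes"}},
]

# One small adapter per schema this particular service has chosen to understand
ADAPTERS = {
    "dc": lambda d: {"title": d.get("dc:title"), "url": d.get("dc:identifier")},
    "lrmi": lambda d: {"title": d.get("name"), "url": d.get("url")},
}


def normalise(records):
    """Split records into those the service can interpret and those it cannot."""
    understood, skipped = [], []
    for record in records:
        adapter = ADAPTERS.get(record["schema"])
        if adapter is None:
            skipped.append(record)
        else:
            understood.append(adapter(record["data"]))
    return understood, skipped


understood, skipped = normalise(records)
print(len(understood), "understood,", len(skipped), "skipped")
```

Every new schema that turns up in the data means another adapter, or another record that the service silently ignores, which is why some level of agreement still matters even when the store itself does not require it.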

Other questions remain over how the learning registry will function as a network, for example how duplicate and complementary records about the same resource will be dealt with when many people might be providing information about the same resource.

Why use it?
Owen Stephens and David Kay were at the meeting asking some very pertinent questions. Neither is particularly caught up in the education technology world, having more of a background in information systems for libraries, where of course there are different approaches to solving similar problems. So, why use the Learning Registry rather than raw CouchDB, or some other schemaless, NoSQL, document store (e.g. MongoDB, which is popular for research data management), or free text indexing and search software such as Lucene/Solr, or RDF triple stores, or just a traditional relational database with SQL? To some extent the aim at the moment is to try and answer some of those questions: we won’t know if we don’t try it. But it’s valid to ask how far we have got in answering them, and here is my appraisal.

RDF?
Schemaless sharing of data still appeals to me because I don’t think we know what schema we want to use to share some of the interesting information about the use of resources for teaching and learning. I think the RDF approach will influence the data that is submitted, for example there is interest in using the Learning Registry to store LRMI style metadata. LRMI is adding properties to schema.org so that educational characteristics of resources can be described, and schema.org is only a step or two away from semantic web approaches such as RDF. But some influences of RDF we don’t want. For example there is a tendency at times for RDF approaches to fixate on ontologies. That would stall us. So, for example in LRMI it is possible to say that a resource “aligns” with some point in an educational framework: i.e. it is useful for teaching some topic in a standard curriculum, or assessing some skill required by a competency framework. That’s really useful, but the vocabulary for the nature of the alignment has had to be left open (“teaches” and “assesses” are two suggested terms, others are that the resource has a certain “text complexity” or requires a “reading level” or other “educational level”)–the understanding of what education is about varies so much over the world and between settings that agreement on a closed ontology seems unattainable. Still, you could use RDF if you didn’t specify an ontology, and if you could make sense of the RDF without one.

Another weakness of RDF in this context, as I understand it, is its limited ability to deal with subjective opinions. As soon as a teacher or learner sees an assertion that resource X is good for teaching topic Y (to continue the example used above) they should be asking “says who”. Engineering students at Liverpool are more interested in what other Engineering students find useful, especially those at Liverpool, than they are in the opinions of physics students. Yes, you can have named graphs in RDF and provide information about who asserted which triples, but it goes beyond what is usual, whereas it is built in from the start in the Learning Registry concept of paradata.
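Here is a minimal sketch of why keeping “says who” with every assertion matters: the same set of recommendations gives different answers depending on whose opinions you choose to count. The actor/verb/object layout is only loosely modelled on the idea of paradata; the field names, the recommended_by helper and the data are all invented, and this should not be read as the Learning Registry's paradata format.

```python
from collections import Counter

# Each assertion records who made it as well as what was asserted (invented data)
assertions = [
    {"actor": {"course": "Engineering", "institution": "Liverpool"},
     "verb": "recommended", "object": "http://example.org/r/1"},
    {"actor": {"course": "Engineering", "institution": "Liverpool"},
     "verb": "recommended", "object": "http://example.org/r/1"},
    {"actor": {"course": "Physics", "institution": "Liverpool"},
     "verb": "recommended", "object": "http://example.org/r/2"},
    {"actor": {"course": "Engineering", "institution": "Example University"},
     "verb": "recommended", "object": "http://example.org/r/2"},
    {"actor": {"course": "Engineering", "institution": "Liverpool"},
     "verb": "viewed", "object": "http://example.org/r/2"},
]


def recommended_by(assertions, **actor_filter):
    """Count recommendations per resource, keeping only matching actors."""
    counts = Counter()
    for assertion in assertions:
        if assertion["verb"] != "recommended":
            continue
        actor = assertion["actor"]
        if all(actor.get(key) == value for key, value in actor_filter.items()):
            counts[assertion["object"]] += 1
    return counts


everyone = recommended_by(assertions)
engineers = recommended_by(assertions, course="Engineering")
local_engineers = recommended_by(
    assertions, course="Engineering", institution="Liverpool")
print(local_engineers.most_common())
```

Counting everyone, the two resources tie; counting only Liverpool engineers, only the first resource has any recommendations at all.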

All of that is somewhat conjectural though, because as yet there is little in the Learning Registry that is not metadata that could be expressed in some standard schema such as LOM XML or DC RDF.

Other schemaless data stores
Why not use just CouchDB, without the Learning Registry API, or MongoDB, or Lucene? All of these would make sense for single instance data stores, which is pretty much what we have now with single more-or-less isolated nodes rather than a network. And, yes, I am sure that some way of sharing data between them could be worked up if that is what you wanted. So again any advantage of the Learning Registry is still putative at this stage.

One advantage of the Learning Registry is that, as I mentioned above, it does seem to work: it does seem to come out of the package as a functional way of storing and sharing data that is tailored to education. So as an introduction to NoSQL databases it’s not a bad place for the education community to start.

In summary
In a post about the end of the JLeRN project David Kay has quoted Simon Schama on his not being sure whether the French Revolution was over. I’ll quote what Chairman Mao supposedly said when asked what he thought of the French Revolution: “it’s too early to tell”. The things to look out for are a functioning network of nodes and user-facing services being delivered from data in those nodes. Then we can ask whether that data could be shared in any other way. For the time being I think that the main achievement of JLeRN and the UK’s involvement in the Learning Registry is that it has started people thinking about alternatives to relational databases and they have taken first steps into working with these. Too often, I think, data has been squeezed into a relational database where the only benefit of doing so is that it is what the developer happens to be familiar with. If all you have is a hammer then you can have real problems dealing with screws.

[updated to correct an attribution error as to who was comparing JLeRN to the French revolution]

Some adventures with HTML5

A couple of weeks ago I hosted an online webinar for JISC OER Rapid Innovation projects. Here I will attempt to summarise what was said about HTML5.

Rapid Innovation projects are short projects, typically only a few months long, that JISC fund to do some development; they’re not the place for open-ended explorations of new concepts, but that doesn’t mean that they aren’t projects from which we can learn a lot. They are quite a good test bed for assumptions that certain developments should be quite easily achievable: you think that the state of technology X is such that a couple of months of developer effort should be enough to realise idea Y: a rapid innovation project is a way of testing this. The aim of this webinar was to collect reflections from this round of projects on a number of technologies that several projects had tried. HTML5, with associated aspects of Javascript, video and accessibility was one of those technologies.

One of the projects that had the strongest dependency on HTML5 was XENITH (Xerte Experience Now Improved: Targeting HTML5) which was predicated on converting the Xerte online toolkit (a popular wizard-based approach to creating OERs) from Flash output to HTML5. This seems even more important now than it did when the project started, as we have seen an accelerating shift away from Flash to HTML5 on mobile platforms. Tellingly, we were told that once busy Flash mailing lists now have very little traffic, a sign that developers are deserting Flash tools.

Julian Tenney, the XERTE project manager (and Flash developer by background), reported that he had initially been nervous about the feasibility of replacing the functionality of the Flash player with HTML5, but he said he was “much much more comfortable with it now, it seems that [the project] haven’t really hit an awful lot of problems.” The project was running ahead of expectations, with a solid core implemented with a good interface and more than half of the 75 templates for different types of page converted to HTML5. The project has used JQuery as the general JavaScript framework, which is a popular choice. After a fair amount of investigation into how to support audio and video playback they adopted JW Player, which did most of what the project needed without them having to create anything new from scratch.

One advantage that HTML5 has over Flash, highlighted by EA Draffan of the Synote Mobile project, is that in principle it should help make resources accessible to all. XERTE has a good record for supporting access, for example it will work through the JAWS screen reader, and Julian pointed to a disadvantage of HTML5: that accessibility is left to the browser and is not, as in the case of Flash, under the control of the developer. This sentiment was echoed by Josef Baker, who has been working on displaying maths in HTML5 compared to PDF for the Maxtract project, and who had found that neither accessible PDF nor HTML5 worked as well for blind and visually impaired users as plain text.

This problem seems most acute with video playback, where making resources accessible for anyone can be a problem on some devices. Several projects reported that there is still a problem getting acceptable behaviour for video playback across different browser/platform combinations, an issue which Synote have documented. Several people voiced concern at this inconsistency concerning video: the plethora of JavaScript libraries for controlling video; the lack of any one video format that would work across platforms; the poor performance on small mobile screens; and the lack of mature development framework elements (especially compared to apps). Simon Morris of the ensemble project and associated rapid innovation project for OER data infrastructure was especially critical of how difficult it is to develop tools for sophisticated manipulation of the video stream. While it seems possible to create HTML5 applications to do this that work for a specific target browser and device, the difficulty seems to be getting something that will work across multiple platforms. He was doubtful about whether the document-centric layout engines for HTML5 would ever be as easy to use for graphics-oriented purposes as those available for native mobile apps. He also pointed to the example of controlling video from YouTube, where the API functionality to do such things as tracking which part of the video was being viewed was only available in Flash and not in HTML5. According to Simon, there are deep-seated problems associated with the file format standards with respect to pseudo-streaming: for example, the information that allows one to jump in to a video at an arbitrary point is held at the end of mpeg video files, meaning the entire video has to be loaded before the viewer can jump to the bit they want to see.
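
The last point can be checked for a given file. What follows is a rough sketch, on the assumption that the files in question are MP4 (ISO base media format) files, in which the seek index lives in a top-level `moov` box and the media data in an `mdat` box; encoders often write `moov` after `mdat`. The function names are my own, and the code works on bytes already in memory purely to keep the example short.

```python
import struct


def top_level_boxes(data):
    """Yield (box_type, offset, size) for each top-level box in MP4 bytes."""
    offset = 0
    while offset + 8 <= len(data):
        # Each box starts with a 4-byte big-endian size and a 4-byte type.
        size, box_type = struct.unpack(">I4s", data[offset:offset + 8])
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            size = struct.unpack(">Q", data[offset + 8:offset + 16])[0]
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - offset
        if size < 8:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def index_before_media(data):
    """True if the moov (index) box comes before the mdat (media) box."""
    order = [t for t, _, _ in top_level_boxes(data) if t in ("moov", "mdat")]
    return order[:1] == ["moov"]


def make_box(box_type, payload):
    """Build a tiny synthetic box, for demonstration only."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    ftyp = make_box(b"ftyp", b"isom")
    moov = make_box(b"moov", b"\x00" * 8)
    mdat = make_box(b"mdat", b"\x00" * 32)
    print(index_before_media(ftyp + mdat + moov))  # index at the end
    print(index_before_media(ftyp + moov + mdat))  # index at the front
```

For a real file you would read the box headers and seek past the payloads rather than load everything into memory; the ordering test is the same.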

It seems clear that libraries such as JQuery have helped overcome many of the inconsistencies that get in the way of creating good user experiences in HTML5. HTML5 video still has a way to go, especially on mobiles. There was disagreement on whether the problems described were signs of immaturity and indicated a need to support the further development of JavaScript libraries that aim to iron out platform inconsistencies for video in a similar way to JQuery, or an obstacle to using HTML5 that would be difficult to overcome while native apps provide an alternative. The “native vs HTML5 web app” question is one that goes far beyond the experiences of a few projects with video.

Examples of good licence embedding

I was asked last week to provide some good examples of embedded licences in OERs. I’m pleased to do that (with the proviso that this is just my personal opinion of “good”) since it makes a change from carping about how some of the outputs of the UKOER programme demonstrate a neglect of seemingly obvious points about self-description. For example anyone who gets hold of a copy of the resource would want to see that it is an OER, so it seems obvious that the Creative Commons licence should be clearly displayed on the resource; they would also want to see something about who created, owned or published the resource, partly to comply with the attribution condition of Creative Commons licences but also to conform with good academic and information literacy practice around provenance and citation. With few exceptions, the machine readable metadata hidden in the OERs’ files (such as MS Office file properties, id3 tags, EXIF etc.) are an irremediable mess, especially for licence and attribution information, which cannot on the whole be created automatically, and so are generally ignored. Also, the metadata stored in a content management system such as a repository and displayed on the landing page for the resource are not relevant when the resource is copied and used in some other system. So what I’m looking at here is human readable information about licence and attribution that travels with the resource when it is copied. Different approaches are required for different resource types, so I’ll take them in turn.

Text, e.g. office documents, MS Word, Powerpoint, PDF
Pretty simple really, you can have a title section with the name of resource creator and a footer with the copyright and licensing information. You can also have a more extensive “credits” page at the end of the document. Running page headers and footers work well if you think that people might take just a few pages rather than the whole document.
Example text OER with attribution and licence information. Note that the licence statement and logo link to the legal deed on the Creative Commons website.
Example OER powerpoint with licence and attribution information. Note how the final slide gives licence and attribution information of third party resources used.

Web pages
Basically a special case of a text document, the attribution and licence information can be included in a title or footer section, scroll down to the bottom of this page to see an example. For HTML there is a good case for making this information machine readable by wrapping the information in microdata or RDFa tags. Plugins exist for many web content management systems to do this, and the Creative Commons licensing generator will produce an HTML snippet that includes such tags.
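
As a rough illustration of what such machine readable markup can look like, the sketch below builds an attribution footer with RDFa attributes. It is loosely modelled on the kind of snippet the Creative Commons generator produces, not a copy of it: the `licence_snippet` function and its parameters are hypothetical, and the example title, author and URL are placeholders.

```python
from html import escape


def licence_snippet(title, author, licence_url, licence_name):
    """Return an HTML footer carrying attribution and licence info as RDFa."""
    return (
        '<p xmlns:dct="http://purl.org/dc/terms/" '
        'xmlns:cc="http://creativecommons.org/ns#">'
        # Title and author are marked up so that software can pick them out.
        '<span property="dct:title">{title}</span> by '
        '<span property="cc:attributionName">{author}</span> '
        # rel="license" is the hook that tools look for.
        'is licensed under a <a rel="license" href="{url}">{name}</a>.'
        '</p>'
    ).format(
        title=escape(title),
        author=escape(author),
        url=escape(licence_url, quote=True),
        name=escape(licence_name),
    )


if __name__ == "__main__":
    print(licence_snippet(
        "Example OER",                                   # placeholder title
        "A. N. Author",                                  # placeholder author
        "http://creativecommons.org/licenses/by/3.0/",
        "Creative Commons Attribution 3.0 Unported License",
    ))
```

The text stays human readable in the page, which is the main requirement here; the attributes are an extra for software that knows to look for them.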

Images
Example of photo with attribution and licence information.
Really the only option for putting the essentially textual information about licence and attribution into an image is to add it as a bar to the image. The Attribute Images and related projects at Nottingham have been doing good work on automating this.

Audio
A spoken introduction can provide the information required. BBC podcasts give good examples, though they are not OERs; also the introduction to the video below works as audio.

Video
An introductory screen or credits at the end (with optional voice over) can provide the required information. See for example this video from MIT OCW (be sure to skip to the end to see credits to third party resources used).

Podcasts (and other RSS feeds)

As well as having <copyright> and <creativeCommons:license> tags in the RSS feed at channel and item level, Oxford University’s OER podcasts use an image for the channel that includes the Creative Commons logo. This is useful because the image is displayed by many feed readers and podcast applications. Of course the recordings should have licence information in them just as any other audio or video OER.
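
For anyone generating such a feed, here is a minimal sketch using Python’s standard library. It is not Oxford’s actual feed: the `build_feed` function, the titles and the example.org URLs are all made up, and I am assuming the usual namespace URI for the creativeCommons RSS module. The <copyright> element is written at channel level only, since that is where RSS 2.0 defines it, while the licence tag is written on the channel and on each item.

```python
import xml.etree.ElementTree as ET

# Namespace of the creativeCommons RSS module (assumed to be the usual one).
CC_NS = "http://backend.userland.com/creativeCommonsRssModule"
ET.register_namespace("creativeCommons", CC_NS)


def build_feed(title, link, description, copyright_text, licence_url, items):
    """Return an RSS 2.0 document, as a string, with licence info included."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    # Channel-level rights statement and licence.
    ET.SubElement(channel, "copyright").text = copyright_text
    ET.SubElement(channel, "{%s}license" % CC_NS).text = licence_url
    for item_title, item_url in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_url
        # Repeat the licence on each item so that it travels with the item.
        ET.SubElement(item, "{%s}license" % CC_NS).text = licence_url
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_feed(
        "Example OER podcast",                      # hypothetical feed
        "http://example.org/podcast",
        "A made-up feed for illustration.",
        "Copyright Example University",
        "http://creativecommons.org/licenses/by-nc-sa/2.0/uk/",
        [("Lecture 1", "http://example.org/podcast/lecture1.mp3")],
    ))
```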