CETIS “What metadata…?” meeting summary

Yesterday we had a meeting in London with about 25 people thinking about the question “What metadata is really useful?”

My thinking behind having a meeting on this subject was that resource description can be a lot of effort, so we need to be careful that the decisions we make about how it is done are evidence-based. Given the right data we should be able to get evidence about what metadata is really used for, as opposed to what we might speculate that it is useful for (with the caveat that we need to allow for innovation, which sometimes involves supporting speculative usage scenarios). So, what data do we have and what evidence could we get that would help us decide such things as whether providing a description of a characteristic such as the “typical learning time for using a resource” either is or isn’t helpful enough to justify the effort? Pierre Far went to an even more basic level and asked in his presentation, why do we use XML for sharing metadata? Is it the result of a reasoned appraisal of the alternatives, such as JSON, or did it just seem the right thing to do at some point?

Dan Rehak made the very useful point to me that we need a reason for wanting to answer such questions, i.e. what is it we want to do? What is the challenge? Most of the people in the room were interested in disseminating educational resources (often OERs): some have an interest in disseminating resources that have been provided by their own project or organization, others have an interest in services that help users find resources from a wide range of providers. So I had “help users find resources they need” as the sort of reason for asking these questions; but I think Dan was after something new, less generic, and (though he would never say this) less vague and unimaginative. What he suggested as a challenge was something like “how do you build a recommender system for learning materials?” That is a good angle, and I know it’s one that Dan is interested in at the moment; I hope that others can either buy into that challenge or have something equally interesting that they want to do.

I have suggested that user surveys, existing metadata and search logs are potential sources of data reflecting real use and real user behaviour, and no one has disagreed, so I structured much of the meeting around discussion of those. We had short overviews of examples of previous work on each of these, and some discussion about that, followed by group discussions in more depth for each. I didn’t want this to be an academic exercise; I wanted the group discussions to turn up ideas that could be taken forward and acted on, and I was happy at the end of the day. Here’s a sampler of the ideas turned up during the day:
* continue to build the resources with background information that I gathered for the meeting.
* promote the use of common survey tools, for example the online tool used by David Davies for the MeDeV subject centre (results here).
* textual analysis of metadata records to show what is being described in what terms.
* sharing search logs in a common format so that they can be analysed by others (echoes here of Dave Pattern’s sharing of library usage data and subsequent work on business intelligence that can be extracted from it).
* analysis of search logs to show which queries yield zero hits, which would identify topics on which there is unmet demand.
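
To give a flavour of how little is needed for the last of these, here is a minimal sketch in Python. It assumes a shared log format that nobody has agreed yet: a CSV file with a `query` column and a `hits` column. The column names, the function name `zero_hit_queries` and the sample data are all invented for illustration and do not come from any real service.

```python
import csv
import io
from collections import Counter


def zero_hit_queries(log_file, top_n=10):
    """Return the most frequent queries that produced no results.

    Assumes (hypothetically) a CSV log with 'query' and 'hits' columns.
    """
    counts = Counter()
    for row in csv.DictReader(log_file):
        if int(row["hits"]) == 0:
            # Fold case and collapse whitespace so that trivially
            # different spellings of the same query are pooled.
            normalised = " ".join(row["query"].lower().split())
            counts[normalised] += 1
    return counts.most_common(top_n)


# Invented sample data, standing in for a real shared log.
sample_log = io.StringIO(
    "query,hits\n"
    "photosynthesis,12\n"
    "Bayesian statistics,0\n"
    "bayesian  statistics,0\n"
    "medieval law,0\n"
)

print(zero_hit_queries(sample_log))
```

A real analysis would need more careful normalisation (spelling variants, stemming, filtering out robots), but even a crude count like this would give a first list of topics where users are looking and finding nothing.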

In the coming weeks we shall be working through the ideas generated at the meeting in more depth, with the intention of seeing which can actually be brought to fruition. In the meantime, keep an eye on the wiki page for the meeting, which I shall be turning into a more detailed record of the event.