Analysing OCWSearch logs

We have a meeting coming up to investigate what data we have (or could acquire) to answer the question of what metadata is really required to support the discovery, selection, use and management of educational resources. While I was writing a blog post about that, OCWSearch published the list of top searches for their collection (I think Pierre Far is the person to thank for that). So, what does this tell us about metadata requirements?

I’ve been through the terms in the top half of the list and tried to judge what characteristic or property of the resource the searcher was searching on. (The list is said to be roughly in descending order of popularity, but it would be really good to know how popular each search term actually was.)

There were just under 170 search terms in total. It doesn’t surprise me that the vast majority (over 95%) of them are subject searches. Both higher-level, broad subject terms (disciplines, e.g. “Mathematics”) and lower-level, finer-grained subject terms (topics, e.g. “Applied Geometric Algebra”) crop up in abundance. I’m not sure you can say much about their relative importance.

What’s left is (to me) more interesting. We have:

  • Resource types, specifically: “online text book”, “audio”, “online classes”.
  • People, who seem to be staff at MIT. While it’s possible someone is searching for material about them or about their theories, I think it is more likely that people are searching for them as resource creators.
  • Level, specifically: 101, Advanced (x2), college-level. These are often used in conjunction with subject terms.
  • Course codes, e.g. HSM 260, 15.822, Psy 315. (These also imply a level and a subject.)
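I did the judging above by eye, but with a longer list it could be partly automated. What follows is only a rough sketch under my own assumptions, not a description of how the figures above were produced: the keyword lists are seeded from the examples in this post, the course-code pattern is guessed from the three codes quoted, and `classify`, `tally` and `known_people` are names I have made up for illustration. Anything left over once the recognised parts are stripped out is simply assumed to be a subject term.

```python
import re
from collections import Counter

# Hypothetical keyword lists, seeded only from the examples quoted in the post.
RESOURCE_TYPES = ("online text book", "audio", "online classes")
LEVELS = ("101", "advanced", "college-level")

# Guessed shape of a course code: "HSM 260", "Psy 315" or "15.822".
COURSE_CODE = re.compile(r"(?<!\w)(?:[A-Za-z]{2,4} \d{3}|\d{1,2}\.\d{2,3})(?!\w)")


def _phrases(phrases):
    """Build a case-insensitive pattern matching any phrase as a whole word."""
    alternatives = "|".join(re.escape(p) for p in phrases)
    return re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)


def classify(term, known_people=()):
    """Return the set of resource properties a search term seems to target."""
    rules = [
        ("course code", COURSE_CODE),
        ("resource type", _phrases(RESOURCE_TYPES)),
        ("level", _phrases(LEVELS)),
    ]
    if known_people:  # a list of names would have to be supplied by hand
        rules.append(("person", _phrases(known_people)))

    labels = set()
    remainder = term
    for label, pattern in rules:
        # Strip each recognised part so it is not counted again as a subject.
        remainder, hits = pattern.subn(" ", remainder)
        if hits:
            labels.add(label)

    # Assumption: any words left over are a subject term.
    if re.search(r"[A-Za-z]", remainder):
        labels.add("subject")
    return labels


def tally(terms, known_people=()):
    """Count how many search terms touch each property."""
    counts = Counter()
    for term in terms:
        counts.update(classify(term, known_people))
    return counts


if __name__ == "__main__":
    sample = ["Mathematics", "Applied Geometric Algebra", "Advanced calculus",
              "Psy 315", "15.822", "online text book"]
    for label, count in tally(sample).most_common():
        print(label, count)
```

A term can pick up more than one label, which matches the observation that level is often used together with a subject. The rules are crude (a person's name that isn't in `known_people` will be counted as a subject, for instance), so the output would still need checking by hand.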

I think with more data and more time spent on the analysis we could get some interesting results from this sort of approach.

2 thoughts on “Analysing OCWSearch logs”

  1. Perhaps unsurprising to see subject and level, given a likely starting point of “I’m teaching {level} {subject} next semester, I wonder what other people use to teach that”. That was certainly the experience from Curriculum Online. Interesting, though, that resource creators are being used. Will there ever be OER superstars?

    Anyway, it’s definitely a good idea to grab as much search data as possible to see what can be discerned.

  2. Pingback: Turning metadata, usage data and stuff into something interesting for OCW
