Developing Semantic-Web-friendly specifications

This serves as a personal position statement for the CETIS Future of Interoperability Standards Meeting, 2010-01-12.

Why and how the Semantic Web

We want interoperability specifications and standards with a Semantic Web underlay,

  • because that is
    • the fundamental common denominator,
    • well-adapted to evolving systems,
    • good for reuse,
    • post-modern;
  • using and enabling a “linked data” strategy, with emphasis on:
    • URI-identified resources,
      • with types of resource that are widely agreed for a domain;
    • links between them,
      • using common DC-like relationships/properties/predicates;
  • but with no immediate need for RDF all at once…
    • for RDF, think more Turtle than RDF/XML;
    • any XML should be RDF friendly:
      • able to be clearly mapped and transformed to triples;
      • there may be blank nodes
        • which may be filled in one day;
    • can approach RDF via RDFa and/or GRDDL approaches;
    • may not need XML, as long as what there is can be transformed to RDF;
  • using the DCMI Abstract Model as a reference point.
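To make the "linked data" points above concrete, here is a minimal sketch in Python of URI-identified resources, DC-like links between them, and a blank node that may be filled in one day. The `http://example.org/` URIs and the course data are invented for illustration; only the Dublin Core terms namespace is real. The output is line-per-statement N-Triples, which is also valid Turtle.

```python
# Dublin Core terms namespace (real); the example.org URIs are made up.
DCTERMS = "http://purl.org/dc/terms/"
EX = "http://example.org/"

def uri(value):
    """Wrap a URI in angle brackets, as N-Triples and Turtle both do."""
    return f"<{value}>"

def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

# Each triple is (subject, predicate, object). "_:b0" is a blank node
# standing in for a creator we cannot yet identify by URI.
triples = [
    (uri(EX + "course/101"), uri(DCTERMS + "title"), literal("Introductory Logic")),
    (uri(EX + "course/101"), uri(DCTERMS + "isPartOf"), uri(EX + "programme/7")),
    (uri(EX + "course/101"), uri(DCTERMS + "creator"), "_:b0"),
]

def serialise(statements):
    """Emit one 'subject predicate object .' statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)

print(serialise(triples))
```

Nothing here needs an RDF toolkit, which is rather the point: data kept in this shape can be moved into full RDF later without rethinking it.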

Where a community of practice exists

Where there is good existing practice with electronic tools, experience suggests that it is effective to start with an informal, community-driven specification initiative, and the community in question would be in the best position to decide if and when to offer the specification to a formal body for standardisation. Such an initiative could:

  • start from existing data;
  • inclusively unify current good practice.

This unification would involve:

  • establishing common conceptual models as groundwork (see below);
  • identifying elements that are close enough, and merging them;
  • retaining elements that are likely to be used in more than one system.

Good qualities for a target specification include:

  • the appropriate reuse of existing RDF-friendly specs;
  • ease of implementation;
  • graceful degradation for lesser-used features;
  • being able to be repurposed and reused in the same way that it reuses other specs.

Common conceptual models

Where there is as yet insufficient practice to fuel a specification effort by a community of practice, it is useful to get together as many people as are interested, from formal and informal groupings, and:

  • seek first to agree on a clear common conceptual model where everyone’s point of view is represented, filling out hidden, elided concepts,
    • recognising that this involves all in development, as
    • it is a challenge to loosen up a conceptual scheme, so
    • people need support and the right context.

Such a conceptual modelling process can work well by being primed with personal discussions between those able to develop their conceptual models. It is essential to the viability of a common conceptual model that everyone with a significant variant opinion is drawn in to the process of working towards a common model. Each of these discussions needs to focus on mutual understanding and a mutual development of positions so that each position comes to include a partial model that is shared between the parties. This takes time – typically several hours, not a few minutes – but is very promising.

This deep communication and shared modelling process is certainly not well-adapted to formal committee procedure. Nor is it suitable for a collective process of a community of interest or of practice. But both formal and informal bodies can perfectly well encourage dialogues of this kind to happen, and seek to check whether they have in fact taken place sufficiently to provide the basis of a usable common model.

Clearly, some people find loosening their conceptual structures more difficult than others. Bodies, formal and informal, should ideally stress that this is necessary, provide encouragement (and perhaps even education or training) in how to do it, and finally discourage those who are unable or unwilling to do this from participating in these processes at all.

Information models, specifications and standards

After the agreement of a common conceptual model, information models can be based on it, as the basis for specifications and eventually standards. This does not mean that the whole conceptual model needs to be represented in any information model, nor even that complete parts of the conceptual model need to be. If no relevant information attaches to a particular concept in the conceptual model, it is quite reasonable to leave it out of a practical information model (resulting in what I have termed “elision”) as long as the conceptual model is kept in mind to refer back to.

Derivative information models should, rather:

  • feel comfortable to practitioners;
  • not be hard to implement;
  • but still be interoperable.

Notes and references

Development of a conceptual model

Reflecting on the challenging field of conceptual models, I thought of the idea of exposing my evolving conceptual model that extends across the areas of learner mobility, learning, evaluation/assessment, credit, qualifications and awards, and intended learning outcomes — which could easily be detailed to cover knowledge, skill and competence.

[Figure: eurolmcm10 (the conceptual model)]

This is more or less the whole thing as it is at present. It will evolve, and I would like that to illustrate how a model can evolve as a result of taking into account other ideas. It also wants a great deal of explanation. I invite questions as comments (or directly) so that I can judge what explanation is helpful. I also warmly welcome views that might be contrasting, to help my conceptual model to grow and develop.

It originates in work with the European Learner Mobility team specifying a model for European Learner Mobility documents — that currently include the Diploma Supplement (DS) and Certificate Supplement. This in turn is based on the European draft standard Metadata for Learning Opportunities (MLO), which is quite similar to the UK’s (and CETIS’s) XCRI. (Note: some terminology has been modified from MLO.) Alongside the DS, the model is intended to cover the UK’s HEAR — Higher Education Achievement Report. And the main advance from previous models of these things, including transcripts of course results, is that it aims to cover intended learning outcomes in a coherent way.

This work is evolving already with valued input from colleagues I talk to in

but I wanted to publish it here so that anyone can contribute, and anyone in any of these groups can refer to it and pass it round — even if as a “straw man”.

It would have been better to start from the beginning, so that I could explain the origin of each part. However that is not feasible, so I will have to be content with starting from where I am, and hoping that the reasoning supporting each feature will become clear in time, as there is an interest. Of course, at any time, the reasoning may not adequately support the feature, and on realising that I will want to change the model.

Please comment if there are discrepancies between this model and your model of the same things, and we can explore the language expressing the divergence of opinion, and the possibility for unification.

Obviously this relates to the SC36 model I discussed yesterday.

See also the next version.

More PDP and e-portfolios - Reading

Yesterday I went to an interesting event in Reading (at the University) called “Future-proofing PDP and ePortfolios”. My role was only to answer questions in a Q&A session on interoperability, but as there were few technical people around it was called “Can I take away what I’ve put into our PDP system?”

Two really interesting points emerged.

  1. Many institutions feel stuck with Blackboard at present, even for their portfolio functionality. Generally, they are unhappy with this.
  2. There are a few interesting tools that work in Blackboard, and people are keen on using the wiki facility for building portfolio presentations.

The wiki tool in question is from Learning Objects. The general idea is that learners find wiki technology an easy way to write a presentation, and if that is what they want to do with a “portfolio”, it should work fine - as indeed should any wiki technology. I don’t know how important the integration with the rest of the e-learning system would be.

But this in turn brings up the question, if wikis are used as a platform for constructing e-portfolio presentations, can we make them interoperable with other e-portfolio systems? It would be great if we could. I intend to ask around, and think around, this issue, and write more. The basic idea would be to get a new version of LEAP2 out - LEAP2R - that would be LEAP2 in RDFa - and then see if a wiki system can be tweaked to export and import XHTML+RDFa in LEAP2R format. We would of course also build transforms to convert between LEAP2A and LEAP2R.

Tags - some sense at last?

Through the RDFa blog I see a new specification, Common Tag, that various service providers have agreed, for a way of associating local text tags to URIs. This could be very helpful for many reasons, but in general, for making your own tags understandable to the outside world, including software working through the Semantic Web.
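The underlying idea, that a local text tag becomes understandable to outsiders once it is tied to a URI, can be sketched in a few lines of Python. This does not reproduce the Common Tag vocabulary itself; the two sites, their tags and the `http://example.org/concept/` URIs are all hypothetical.

```python
# Hypothetical tag-to-URI associations kept by two different sites.
SITE_A = {
    "eportfolio": "http://example.org/concept/e-portfolio",
    "pdp": "http://example.org/concept/personal-development-planning",
}
SITE_B = {
    "e-portfolios": "http://example.org/concept/e-portfolio",
}

def same_meaning(tag_a, tag_b):
    """True when a tag from site A and a tag from site B share a URI."""
    uri_a, uri_b = SITE_A.get(tag_a), SITE_B.get(tag_b)
    return uri_a is not None and uri_a == uri_b

print(same_meaning("eportfolio", "e-portfolios"))
print(same_meaning("pdp", "e-portfolios"))
```

The tag strings differ between the sites, yet software can still tell that two of them mean the same thing, because the comparison is made on the URIs rather than on the text.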

Doing XML semantically

When looking at XML specifications, first look for what are the resources, or objects, or entities. When you have one of these contained in another, ask, what is their relationship? That will help inform a sensible version of the XML spec, if you really must have one.

Didn’t I do well getting the core ideas into less than however many words? OK, now for the full version…

Yesterday we (Scott and I) were visited by Karim Derrick of TAG Learning. Karim and TAG are championing a BSI initiative, scheduled to be BS 8518, for the transfer of assessment data - particularly focused on coursework. They are being generous: they are doing the development work, based on their own and their clients’ needs, and handing it over to BSI for standardisation, so that all can benefit.

One of the things that we are keen on in CETIS is doing standards and specifications in a sensible way. We have long had a strong line in discouraging people from doing ill-advised things (perhaps a bit like the supposed Google message of not being evil) but I’m not very well-adapted for that, so I welcome the complementary approach of positively trying to encourage people to do sensible things, which I think is gaining strength in CETIS. The inherent challenge is coming to some kind of collective view on how to standardise the subject matter in hand - even if this is, wait (until something happens), and only then, do it. Within this line of doing good things, one that we seem to agree on is to do with XML specifications. And so I come back to the main thrust of this post.

Doing XML semantically is what has happened in XCRI (thanks to Scott Wilson and others) and now, with my involvement, in LEAP2A. It is easy in an Atom-based specification to follow this pattern, because Atom’s simple basic structure invites any kind of portfolio item to be an entry, and the relationships between them to be Atom links. For the same reason, Atom tends to be easy to read. But it is not too difficult to do this as well in your own XML language, if you just take a little care. You should look at every element, to see whether it is a thing, a relationship, or data - in RDF terms, a resource, a property or predicate, or a literal.

TAG’s draft specification has pupils rather than students, as it is designed primarily for schools. Pupils are things, in these terms! It has centres, which are often where the teaching and the coursework assessment take place. What is the relationship between a student and a centre? Just taking leave of the TAG proposal for a minute, and thinking of other possibilities, if there were always only one centre, and all the students belonged to that centre, there would be no need even to represent the students within (in XML terms) the centre. If there are different groups of students within a centre, it might make sense to have, within the centre element, elements defining what the relationship is between the centre and each particular group of students.

Then, one part of the draft has pupil elements containing marksheets. Again, what is the relationship? If only one relationship is possible, you don’t need a container element standing between the pupil and individual marksheet elements. If there is more than one possible relationship, then it would make sense to have a pupil element containing a wrapper for marksheets, and that wrapper would be associated with the relationship (property, or predicate in RDF terms).

I hope that gives some kind of hint, at least, on how to do XML in a way that makes sense both from the domain point of view, and semantically. The payoff is this. If the mapping to RDF is clear, then someone should be able, without too much difficulty, to create an XSLT to do the transform. Then, if someone else wants to do a different XML spec, or has already done so, and it also transforms to RDF, there is a good basis for knowing whether similar information presented in the two XML specs is actually the same, or not.
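As a rough illustration of how mechanical the transform becomes once the mapping is clear, here is a sketch in Python rather than XSLT. The XML is hypothetical: its element names, the `id` attributes and the `example.org` URIs are invented for the example and are not taken from the TAG draft. The convention assumed is that an element with an `id` is a thing, and a child element without one is a relationship wrapping either nested things or literal text.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: element and attribute names are invented for
# illustration and are not taken from the TAG draft.
SAMPLE = """
<centre id="http://example.org/centre/1">
  <hasPupil>
    <pupil id="http://example.org/pupil/42">
      <name>Sam Example</name>
      <hasMarksheet>
        <marksheet id="http://example.org/marksheet/9"/>
      </hasMarksheet>
    </pupil>
  </hasPupil>
</centre>
"""

def to_triples(thing):
    """Walk a 'thing' element and return (subject, predicate, object) triples.

    Assumed convention: an element with an id attribute is a thing
    (resource); a child without an id is a relationship (predicate) that
    holds either nested things or literal text.
    """
    subject = thing.get("id")
    triples = []
    for relation in thing:
        nested = list(relation)
        if nested:
            for child in nested:
                triples.append((subject, relation.tag, child.get("id")))
                triples.extend(to_triples(child))
        else:
            triples.append((subject, relation.tag, (relation.text or "").strip()))
    return triples

for triple in to_triples(ET.fromstring(SAMPLE)):
    print(triple)
```

Because every element is unambiguously a thing, a relationship or data, the function needs no knowledge of the domain at all. An XML language where that distinction is blurred would need special cases for each element.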

One particularly attractive version of this is to have an RDFa representation, which of course of its very nature yields RDF on transformation. So you can present exactly the same information in XHTML, readable by anyone in a browser, and formatted to make it easy to read and to understand, and still have all the information just as machine-processable as any XML spec. That’s just what I want to do for LEAP2.
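A very small sketch in Python of the "same page, two readers" idea follows. It is not a conforming RDFa processor: it only understands an `about` attribute setting the subject and `property` attributes holding full URIs with the element text as the literal value, which is a deliberate simplification. The portfolio item and its `example.org` URI are made up, and no LEAP2 vocabulary is used.

```python
import xml.etree.ElementTree as ET

# A human-readable XHTML fragment carrying RDFa-style attributes.
XHTML = """
<div xmlns="http://www.w3.org/1999/xhtml"
     about="http://example.org/portfolio/item/1">
  <h2 property="http://purl.org/dc/terms/title">My placement report</h2>
  <p>Written on
     <span property="http://purl.org/dc/terms/created">2009-06-01</span>.</p>
</div>
"""

def extract(element, subject=None):
    """Collect (subject, property, literal) triples from the fragment.

    Simplified rules (an assumption, not the full RDFa algorithm):
    'about' sets the subject for an element and its descendants, and
    'property' emits a triple whose object is the element's text.
    """
    subject = element.get("about", subject)
    triples = []
    prop = element.get("property")
    if prop and subject:
        text = "".join(element.itertext()).strip()
        triples.append((subject, prop, text))
    for child in element:
        triples.extend(extract(child, subject))
    return triples

for triple in extract(ET.fromstring(XHTML)):
    print(triple)
```

A browser shows the heading and the sentence, while the function recovers a title and a creation date for the item from the same markup.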

All this is an extension of what I wrote earlier.

Interoperability through semantics

I was on a call this afternoon with the HR-XML people, discussing that old chestnut, contact information. The really interesting comment that came up was that many people don’t get any kind of intermediate domain model - rather, they just want to see their implementation (or practice) reflected directly in the specification, and so they are disappointed when (inevitably) it doesn’t happen. The HR-XML solution may be serviceable in the end, but what interested me more was the process which is really needed to do interoperability properly. I’ve been going on about semantic web approaches to interoperability for a while, but I hadn’t really thought through the processes which are necessary to implement it. So it’s a step forward for me.

Here’s how I now see it. Lots of people start off with their own way of seeing, thinking about, or conceptualising things. The primary task of the interoperability analyst or consultant (inventing a term that I’d feel comfortable with for myself) is to create a model into which all the initial variants can be mapped, one way or another. We don’t want one single uniform model into which everyone’s concepts are forced to fit, but rather a common model containing all the differences of view. Now, as I see it, that’s one of the big advantages of the semantic web: it’s so flexible and adaptable that you really can make a model which is a superset of just about any set of models that you can pick. Just what sort of model is needed becomes clearer when we think of the detailed steps of its creation and use.

If one group of people have a particular way of seeing things, the mapping to this common model must be acceptable to them. It won’t always be so immediately, so one has to allow for an educational process, possibly a Socratic one, of leading them to that acceptance. But you don’t have to show them all the other mappings at the same time, just theirs. Relating to other models comes later.

From the mappings to the common model, it is possible, likely even, that there will be some correspondence between concepts, so that different people can recognise they are talking about the same thing. One way of confirming this is to show the various people user interfaces of their systems, dealing with that kind of information. You could easily get remarks such as “yes, we have that, too”. Though one has to look out for, and evaluate, the “except that…” riders.

On the other hand, there are bound to be several concepts which don’t match directly in the common model. To complete the road to interoperability, what is needed is to ascertain, and get agreed, the logical connections between the common model concepts into which the original people’s concepts map. This, of course, is the area of ontologies, but it has a very different feel to the normal one of formalising the logical relationships between the concepts in just one group’s model. We are aiming at a common ontology, not in the sense that everyone must understand and use all the concepts, but that everyone agrees on the way that the concepts interrelate; the way that “their” concepts touch on “foreign” concepts, all within the same ontology.

Once the implications have been agreed between the different concepts in the common model, the way is open to create transforms between information seen in one view and information seen in another view. Each different group can, if they want, keep their own XML schemas to represent their own way of conceptualising the domain, but there will be (approximate) ways of translating this to other conceptualisations, perhaps via an intermediate RDF form. But, perhaps more ambitiously, once these implications are agreed, it is likely that people will be free to migrate towards more coherent views of the domain - actually to change the way they see things.
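A minimal sketch in Python of translating between two views by way of a common model, assuming the simplest possible case where each local field maps to exactly one common concept. The group names, field names and concepts are hypothetical and are not drawn from HR-XML or any other specification.

```python
# Hypothetical mappings from each group's local field names to concepts in
# a shared model; none of these names come from HR-XML or any real spec.
TO_COMMON = {
    "group_a": {"surname": "familyName", "tel": "phoneNumber"},
    "group_b": {"last_name": "familyName", "phone": "phoneNumber",
                "fax": "faxNumber"},
}

def translate(record, source, target):
    """Translate a record from one group's view to another's via the common model.

    Returns the translated record plus the common concepts that the target
    view has no field for, so that the loss is visible rather than silent.
    """
    # Invert the target mapping: common concept -> target field name.
    from_common = {concept: field for field, concept in TO_COMMON[target].items()}
    translated, untranslatable = {}, []
    for field, value in record.items():
        concept = TO_COMMON[source][field]
        if concept in from_common:
            translated[from_common[concept]] = value
        else:
            untranslatable.append(concept)
    return translated, untranslatable

print(translate({"last_name": "Example", "fax": "0123"}, "group_b", "group_a"))
```

Each group only ever maintains its own mapping to the common model, never a mapping to every other group. The list of untranslatable concepts is where the "approximate" nature of the translation shows up: group A here simply has no notion of a fax number.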

It is potentially a long process, and supporting it is not straightforward. I could imagine a year’s full-time postgraduate study - an MSc if you like - being needed to study, understand and put together the different roots and branches of philosophy, logic, communication, consensus process, IT, and education that are needed. But if we had trained practitioners in this area, not just the naturally gifted ones, perhaps we could have enough people to get beyond the pitfalls of processes that are too often bogged down in mutual misunderstanding or incomprehension, or just plain amateurishness.

TRACE project, Brussels, 2007-11-19

Monday 19th November: I was invited as an expert to the final meeting of the TRACE project, held in Brussels. TRACE stands for Transparent Competences in Europe. The project web site is meant to be at http://trace.education-observatories.net/ . I didn’t realise how many competence projects there were in Europe at the moment, as well as TEN Competence which some CETIS people are involved with.

The meeting consisted of some presentations of the project work, followed by a general discussion which particularly involved the invited experts.

TRACE has created a prototype system to illustrate the competence transparency concept. In essence, this does employment matching based on inferences using domain knowledge embedded in an ontology, as well as job offers on the one side, and CV-based personal competence profiles on the other. They didn’t try to do the full two-way matching thing as the Dutch Centre for Work and Income do. On the surface, the TRACE matching looks like a simpler version of what is done by the Belgian company Actonomy.

The meeting seemed to recognise that factors other than competences are also important in employment matching, but this has not been explored in the context of the TRACE project; nor has the idea that a system which can be used for competence-based matching in the employment domain could easily and advantageously be used for several other applications. It would be good to get a wider framework together, and this might go some way towards countering social exclusion worries.

Karsten Lundqvist, working at Reading with the project leader Prof. Keith Baker, was mainly responsible for the detailed ontology work, and he recognises that the relationships chosen for representation in the top-level ontology are essential to what the ontology can support, and what domain ontologies can represent. They have a small number of relationships in their ontology:

  • has part
  • part of
  • more specific
  • more general
  • synonym
  • antonym
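To show how much even these six relationships can support, here is a sketch in Python that stores statements as triples, adds the inverse of each one, and follows "more general" links transitively. Several things here are my assumptions rather than anything TRACE stated: that a triple reads "subject has the <relationship> object", that the relationships pair up as inverses in the way shown, and that "more general" is transitive. The competence terms are invented.

```python
# Each relationship paired with its assumed inverse; synonym and antonym
# are treated as symmetric, so they are their own inverses.
INVERSE = {
    "has part": "part of",
    "part of": "has part",
    "more specific": "more general",
    "more general": "more specific",
    "synonym": "synonym",
    "antonym": "antonym",
}

def with_inverses(statements):
    """Return the statements together with the inverse of each one."""
    closed = set(statements)
    for subject, relation, obj in statements:
        closed.add((obj, INVERSE[relation], subject))
    return closed

def more_general(term, statements):
    """Collect every term reachable from `term` via 'more general' links."""
    found, frontier = set(), [term]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in statements:
            if subject == current and relation == "more general" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found

# Invented example: read as "TIG welding has the more general term welding".
statements = {
    ("TIG welding", "more general", "welding"),
    ("welding", "more general", "metalwork"),
}
closed = with_inverses(statements)
print(sorted(more_general("TIG welding", closed)))
```

A matcher built on this could offer someone with "TIG welding" on their CV for a job asking for "metalwork", which is the kind of inference the prototype appears to rely on.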

While these are reasonable first guesses at useful relationships, some of my previous work (presented at a TEN Competence meeting) proposes slightly different ones. I made the point in this meeting that it would be a good idea to check the relevance, appropriateness and meaningfulness of chosen relationships with people engaged in the domain itself. I’d say it is important in this kind of system to gain the trust of the end users by itself being transparently understandable.

But further than this, comprehensible relationships as well as terms are vital to the end of getting communities to take responsibility for ontologies. People in the community must be able to discuss the ontology. And, if the ontology is worked into a structure to support communications, by being the basis of tags, people who work in the field will have plenty of motivation to understand the ontology. Put the motivation to understand together with structures and concepts that are easily understandable, and there is nothing in the way of widespread use of ontologies by communities, for a variety of purposes.

Putting together the main points that occurred to me, most of which I was able to say at the meeting:

  • relationships chosen for a top-level ontology for competence are central, providing the building blocks for domain ontologies where the common knowledge of a community is represented;
  • we need further exploration about which relationships are most suitable and comprehensible for the community;
  • this will enable community development and maintenance of their own ontologies;
  • the UK already has some consensus-building communities, in the Sector Skills Councils;
  • SSCs produce National Occupational Standards, and it is worthwhile studying what is already produced and how, rather than reinventing the complete set of wheels (see my work for ioNW2);
  • to get practical success, we should acknowledge the human tendency for everyone to produce their own knowledge structures, including domain ontologies;
  • but we need to help people interrelate different domain ontologies, by providing in particular relationships suited to cross-link nodes in different ontologies (see my previous work on this).

All in all, an interesting and stimulating meeting.

Uniform tags, not uniform templates

On Friday (16th November) I was at a meeting of the Academy Subject Centre e-portfolio projects, in Wolverhampton. Among more gently interesting topics, the big thing that came out in the end was the desire to be able to navigate around e-portfolio related information and resources without being swamped or overloaded. Initially discussing case studies, people at the meeting agreed that we don’t really want uniform templates, but rather uniform tags. Exactly how these might work was not discussed, but the question got me thinking.

Ideally, to get a coherent and consistent set of domain tags, they need to be based on an agreed domain model. This could be a domain ontology, but what one calls it is less important than the reality of it being: (a) widely and commonly agreed; and (b) able to be put in a machine-processable form for use on the web - the Semantic Web in fact.

JISC could perhaps fund one or more projects, absolutely not, under any circumstances, to invent their own domain models or ontologies in isolation, but to explore what common ground there is in a particular domain, and to explore also the processes which can result in broadening the area of agreement in the domain model. A by-product of this would be an evaluation of the usefulness of the tools needed to facilitate broadening of consensus in the domain model. Here, graphical representations will be central.