A simpler sourcing maturity assessment approach

Knowing how to procure your IT services, software and hardware is a vital function in any organisation. Assessing an organisation’s maturity in this area can be complex, which is why SURF developed a simpler approach.

There are a number of perspectives to take on IT and its place in an organisation, but for further and higher education institutions, the procurement or sourcing of services – in the widest sense of the word ‘services’ – may be among the most important ones. With the ongoing move to cloud provisioning, determining where a particular service is going to come from and how it is managed is crucial.

A number of approaches to measuring and improving an organisation’s maturity in this area exist but, as Bert van Zomeren points out in the EUNIS paper that presents the SURF Sourcing Maturity Assessment Approach, these are quite complex. They can be so sophisticated that organisations hire consultancies to do the exercise for them. The SURF method doesn’t go quite as deep as those exercises, but it is a much easier first step.

The heart of the approach is simple: a champion identifies the key stakeholders in the organisation with regard to the sourcing process, each of the stakeholders fills out the questionnaire, the results are analysed, the stakeholders meet, and appropriate adjustments to the process are agreed upon.

As in many of these approaches, the questions in the questionnaire describe an ideal situation, and respondents are asked to rate, on a scale, how closely they think their organisation resembles that ideal. Some of these ideals may be uncontroversial, but others may well provoke debate – adapting processes to suit services, rather than the other way round, for example. Still, such a debate can be a valuable input into the wider maturation process.

I’ve just translated the questionnaire into English, and it has been made available as a combination of a Google form and a spreadsheet. To test it yourself, sign in to Google Drive, add the form and spreadsheet to your drive, then make copies. The spreadsheet has two sheets: one that gathers the data and another that turns the data into a crude but extensible report.
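
For a quick first look at the responses, something as simple as the sketch below will do. It assumes a hypothetical CSV export of the response sheet with a ‘Stakeholder’ column followed by one column per question, each scored 1 to 5; the real spreadsheet’s layout may well differ, so adjust the column names accordingly.

    # Hypothetical analysis of a CSV export of the questionnaire responses.
    # Assumes a 'Stakeholder' column followed by one column per question,
    # each scored 1-5; adjust to the real spreadsheet's layout.
    import csv
    from statistics import mean

    with open("sourcing_maturity_responses.csv") as f:
        rows = list(csv.DictReader(f))

    questions = [col for col in rows[0] if col != "Stakeholder"]

    for q in questions:
        scores = [int(r[q]) for r in rows if r[q].strip()]
        spread = max(scores) - min(scores)
        flag = "  <- worth discussing" if spread >= 2 else ""
        print(f"{q}: mean {mean(scores):.1f}, spread {spread}{flag}")

Questions where the spread between stakeholders is large are exactly the ones worth putting on the agenda for the stakeholder meeting.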

It’d probably be a good idea to read van Zomeren and Levinson’s short EUNIS paper before you start. There is a much more extensive guide to the approach in Dutch as well, but we thought we’d gather some feedback before translating that too. A guide of that sort will almost certainly be necessary in order to use the simpler sourcing maturity assessment approach in anger at an institution.

Doing analytics with open source linked data tools

Like most places, the University of Bolton keeps its data in many stores. That’s inevitable with multiple systems, but it makes getting a complete picture of courses and students difficult. We are testing an approach that promises to integrate all this data, and more, quickly and cheaply.

Integrating a load of data in a specialised tool or data warehouse is not new, and many institutions have been using such warehouses for a while. What Bolton is trying in its JISC-sponsored course data project is to see whether a warehouse of this kind can be built out of Linked Data components. Using such tools promises three major advantages over existing data warehouse technology:

It expects data to be messy, and it expects it to change. As a consequence, adding new data sources, coping with changes in data sources, or generating new reports or queries should not be a big deal. There are no schemas to break, so no major re-engineering is required.

It is built on the same technology as the emergent web of data, which means that a growing number of datasets – particularly from the UK government – can easily be thrown into the mix to answer bigger questions, and that public excerpts from Bolton’s data should be easy to contribute back.

It is standards based. At every step – extracting the data, transforming it, loading it, querying, analysing and visualising it – there is a choice of open and closed source tools. If one turns out not to be up to the job, we should be able to slot another in.

We did spend a day kicking the tyres, though, and making some initial choices. Since the project is just a pilot of a Linked Enterprise Data (LED) approach, we’ve limited ourselves to evaluating only open source tools. We know there are plenty of good closed source options in all of the following areas, but we want to test the whole approach before committing to licence fees.

Data sources

Before we can mash, query and visualise, we need to extract data from the sources, and we’ve settled on two tools for that: Google Refine and D2RQ. They do slightly different jobs.

Refine is Google’s power tool for anyone who has to deal with malformed data, or who just wants to transform or excerpt it from one format to another. It takes in CSV or output from a range of APIs, and puts it in table form. In that table form, you can perform a wide range of transformations on the data, and then export it in a range of formats. The RDF plug-in from DERI Galway allows you to specify exactly how the RDF – the linked data format, and the heart of the approach – should look when exported.
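
Refine’s RDF export is driven entirely through its GUI, but the shape of the mapping is easy to illustrate in code. The sketch below does a comparable CSV-to-RDF conversion with Python’s rdflib; the file name, column names, base URI and choice of vocabulary are all invented for the example.

    # Illustrative CSV-to-RDF conversion; Refine's RDF extension does the
    # equivalent through its GUI. File, columns and URIs are invented.
    import csv
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, RDFS

    BASE = Namespace("http://example.bolton.ac.uk/id/")
    AIISO = Namespace("http://purl.org/vocab/aiiso/schema#")

    g = Graph()
    g.bind("aiiso", AIISO)

    with open("courses.csv") as f:
        for row in csv.DictReader(f):  # e.g. columns: code, title, department
            course = BASE["course/" + row["code"].strip()]
            dept = BASE["department/" + row["department"].replace(" ", "_")]
            g.add((course, RDF.type, AIISO.Course))
            g.add((course, RDFS.label, Literal(row["title"])))
            g.add((course, AIISO.part_of, dept))
            g.add((dept, RDF.type, AIISO.Department))

    g.serialize(destination="courses.ttl", format="turtle")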

What Refine doesn’t really do (yet?) is transform data automatically, as a piece of middleware. All your operations are saved as a script that can be re-applied, but it won’t re-apply the operations entirely automagically. D2RQ does do that, and works more like middleware.

Although I’ve known D2RQ for a couple of years, it still looks like magic to me: you download and unzip it, then tell it where your common-or-garden relational database is and what username and password it can use to get in. It’ll go off, inspect the contents of the database, and come back with a mapping of the contents to RDF. Then you start the server that comes with it, and the relational database can be browsed and queried like any other Linked Data source.
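
A rough sketch of that workflow, driven from Python for consistency with the other examples, is below. The database details are made up, the tool names are the scripts that ship with the D2RQ distribution as I remember them, and the exact flags may differ between versions, so check the D2RQ documentation before relying on it.

    # Rough sketch of the D2RQ workflow; database details are invented and
    # the command-line flags are from memory, so verify against the docs.
    import subprocess

    JDBC_URL = "jdbc:mysql://localhost/coursedb"  # hypothetical database

    # 1. Inspect the database and generate a default mapping file.
    subprocess.run(
        ["./generate-mapping", "-u", "reader", "-p", "secret",
         "-o", "mapping.ttl", JDBC_URL],
        check=True,
    )

    # 2a. Serve the database as browsable, SPARQL-queryable Linked Data...
    # subprocess.run(["./d2r-server", "mapping.ttl"], check=True)

    # 2b. ...or dump it to a Turtle file for loading into another store.
    subprocess.run(
        ["./dump-rdf", "-f", "TURTLE", "-o", "dump.ttl", "mapping.ttl"],
        check=True,
    )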

Since practically all relevant data in Bolton are in a range of relational databases, we’re expecting to use D2R to create RDF data dumps that will be imported into the data warehouse via a script. For a quick start, though, we’ve already made some transforms with Refine. We might also use scripts such as Oxford’s XCRI XML to RDF transform.

Storage, querying and visualisation

We expected to pick different tools for each of these functions, but ended up choosing one that does it all – after a fashion. Callimachus is designed specifically for rapid development of LED applications, and the standard download includes a version of the Sesame triplestore (or RDF database) for storage. Other triple stores can also be used with Callimachus, but Sesame was on the list anyway, so we’ll see how far that takes us.

Callimachus itself is more of a web application on top of that store, which allows quick visualisations of data excerpts – be they straight records from one dataset or a collection of data about one thing from multiple sets. The queries that power the Callimachus visualisations have limitations compared to the full power of SPARQL, the linked data query language, but are good enough to knock up some pages quickly. For the more involved visualisations, Callimachus’ SPARQL 1.1 implementation allows the results of a query to be put out as common-or-garden JSON, for which many different tools exist.
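
For those more involved visualisations, the pattern is: send a SPARQL query, get the results back as JSON, and hand them to whatever charting or mapping library you like. A minimal sketch with the SPARQLWrapper library, against a hypothetical endpoint URL and vocabulary:

    # Minimal sketch: query a SPARQL endpoint and get plain JSON back, ready
    # for any visualisation tool. Endpoint URL and vocabulary are placeholders.
    import json
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://localhost:8080/callimachus/sparql")  # hypothetical
    sparql.setQuery("""
        SELECT ?course ?title WHERE {
            ?course a <http://purl.org/vocab/aiiso/schema#Course> ;
                    <http://www.w3.org/2000/01/rdf-schema#label> ?title .
        } LIMIT 20
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()

    rows = [{"course": b["course"]["value"], "title": b["title"]["value"]}
            for b in results["results"]["bindings"]]
    print(json.dumps(rows, indent=2))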

Next steps

We’ve already made some templates that pull together course information from a variety of sources, on which I’ll report later. While that’s going on, the other main task will be to set up the process of extracting data from the relational databases using D2R, and then loading it into Callimachus using timed scripts.
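
A sketch of what such a timed script might look like (run nightly from cron or similar): dump each database to RDF with D2RQ, then push the result into the triple store over HTTP. It assumes the underlying Sesame store exposes the standard OpenRDF repository API; Callimachus may prefer its own import route, so treat the URL, repository name and content type as placeholders.

    # Hypothetical nightly ETL: dump relational data to RDF with D2RQ, then
    # replace the contents of a Sesame repository over HTTP. URLs, repository
    # name and mapping files are placeholders.
    import subprocess
    import requests

    MAPPINGS = ["prod-mapping.ttl", "student-records-mapping.ttl"]
    REPO_URL = "http://localhost:8080/openrdf-sesame/repositories/warehouse/statements"

    for i, mapping in enumerate(MAPPINGS):
        dump_file = f"dump-{i}.ttl"
        subprocess.run(["./dump-rdf", "-f", "TURTLE", "-o", dump_file, mapping],
                       check=True)
        with open(dump_file, "rb") as data:
            # PUT replaces the repository contents; POST appends to them.
            method = requests.put if i == 0 else requests.post
            method(REPO_URL, data=data,
                   headers={"Content-Type": "text/turtle"}).raise_for_status()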

Approaches to building interoperability and their pros and cons

System A needs to talk to System B. Standards are the ideal to achieve that, but pragmatics often dictate otherwise. Let’s have a look at what approaches there are, and their pros and cons.

When I looked at the general area of interoperability a while ago, I observed that useful technology becomes ubiquitous and predictable enough over time for the interoperability problem to go away. The route to get to such commodification is largely down to which party – vendors, customers, domain representatives – is most powerful and what their interests are. Which describes the process very nicely, but doesn’t help solve the problem of connecting stuff now.

So I thought I’d try to list what the choices are, and what their main pros and cons are:

A priori, global
Also known as de jure standardisation. Experts, user representatives and possibly vendor representatives get together to codify the whole or part of a service interface between systems that are emerging or don’t exist yet; it can concern the syntax, semantics or transport of data. Intended to facilitate the building of innovative systems.
Pros:

  • Has the potential to save a lot of money and time in systems development
  • Facilitates easy, cheap integration
  • Facilitates structured management of network over time

Cons:

  • Viability depends on the business model of all relevant vendors
  • Fairly unlikely to fit either actually available data or integration needs very well

A priori, local
i.e. some type of Service Oriented Architecture (SOA). Local experts design an architecture that codifies syntax, semantics and operations into services. Usually built into agents that connect to each other via an ESB.
Pros:

  • Can be tuned for locally available data and to meet local needs
  • Facilitates structured management of network over time
  • Speeds up changes in the network (relative to ad hoc, local)

Cons:

  • Requires major and continuous governance effort
  • Requires upfront investment
  • Integration of a new system still takes time and effort

Ad hoc, local
Custom integration of whatever is on an institution’s network by the institution’s experts in order to solve a pressing problem. Usually built on top of existing systems using whichever technology is to hand.
Pros:

  • Solves the problem of the problem owner fastest in the here and now.
  • Results accurately reflect the data that is actually there, and the solutions that are really needed

Cons:

  • Non-transferable beyond local network
  • Needs to be redone every time something changes on the local network (considerable friction and cost for new integrations)
  • Can create hard to manage complexity

Ad hoc, global
Custom integration between two separate systems, done by one or both vendors. Usually built as a separate feature or piece of software on top of an existing system.
Pros:

  • Fast point-to-point integration
  • Reasonable to expect upgrades for future changes

Cons:

  • Depends on business relations between vendors
  • Increases vendor lock-in
  • Can create hard to manage complexity locally
  • May not meet all needs, particularly cross-system BI

Post hoc, global
Also known as standardisation, consortium style. Service provider and consumer vendors get together to codify a whole service interface between existing systems; syntax, semantics, transport. The resulting specs usually get built into systems.
Pros:

  • Facilitates easy, cheap integration
  • Facilitates structured management of network over time

Cons:

  • Takes a long time to start, and is slow to adapt
  • Depends on business model of all relevant vendors
  • Liable to fit either available data or integration needs poorly

Clearly, no approach offers instant nirvana, but it does make me wonder whether there are ways of combining approaches such that we can connect short-term gain with long-term goals. I suspect that if we could closely couple what we learn from ad hoc, local integration solutions to the design of post hoc, global solutions, we could improve both approaches.

Let me know if I missed anything!

The cloud is for the boring

Members of the Strategic Technologies Group of the JISC’s FSD programme met at King’s Anatomy Theatre to, ahem, dissect the options for shared services and the cloud in HE.

The STG’s programme included updates on members’ projects, a preview of the synthesis of the Flexible Service Delivery programme of which the STG is a part, and a preview of the University Modernisation Fund programme that will start later in the year.

The main event, though, was a series of parallel discussions on business problems where shared services or cloud solutions could make a difference. The one I was at considered a case from the CUMULUS project: how to extend, rather than replace, a Student Record System in a modular way.

View from the King's anatomy theatre up to the clouds

In the event, a lot of the discussion revolved around which services could profitably be shared in some fashion. When the group looked at what is already being run on shared infrastructure and what has proven very difficult, the pattern was actually very simple: the more predictable, uniform, mature, well understood and inessential to the central business of research and education a service is, the better it fares; the more variable, historically grown, institution specific and bound up with the real or perceived mission of the institution or parts thereof, the worse.

Going round the table to sort the soporific cloudy sheep from the exciting, disputed, in-house goats, we came up with the following lists:

Cloud:

  • Email
  • Travel expenses
  • HR
  • Finance
  • Student network services
  • Telephone services
  • File storage
  • Infrastructure as a Service

In house:

  • Course and curriculum management (including modules etc)
  • Admissions process
  • Research processes

This ought not to be a surprise, of course: the point of shared services – whether in the cloud or anywhere else – is economies of scale. That means that the service needs to be the same everywhere, doesn’t change much or at all, doesn’t give the users a competitive advantage and has well understood and predictable interfaces.

Enterprise Architecture throws out bath water, saves baby in the nick of time

Enterprise architecture started as a happily unreconstructed techy activity. When that didn’t always work, a certain Maoist self-criticism kicked in, with an exaltation of “the business” above all else, and taboos on even thinking about IT. Today’s Open Group sessions threatened to take that reaction to its logical extreme. Fortunately, it didn’t quite end up that way.

The trouble with realising that getting anywhere with IT involves changing the rest of the organisation as well is that it gets you out of your assigned role. Because the rest of the organisation is guaranteed to have different perspectives on how it wants to change (or not), what the organisation’s goals are and how to think about its structure, communication is likely to be difficult. Cue frustration on both sides.

That can be addressed by going out of your way to meet “the business”, talk its language, worry about its concerns and generally go as native as you can. This is popular to the point of architects getting as far away from dirty, *dirty* IT as possible on the org chart.

So when I saw the sessions on “business architecture”, my heart sank. More geeks pretending to be suits, like a conference hall full of dogs trying to walk on their hind legs, and telling each other how it’s the future.

When we got to the actual case reports in the plenary and business transformation track, however, it turned out that EA self-negation is not quite what’s happening in reality. Yes, speaker after speaker emphasised the need to talk to other parts of the organisation in their own language, and the need to provide them with only the relevant information. Tom Coenen did a particularly good job of stressing the importance of listening while the rest of the organisation does the talking.

But, crucially, that doesn’t negate that – behind the scenes – architects still model. Yes, for their own sake, and solely in order to deliver the goals agreed with everyone else, but even so. And, yes, there are servers full of software artefacts in those models, because they are needed to keep the place running.

This shouldn’t be surprising. Enterprise architects are not hired to decide what the organisation’s goals are, what its structure should be or how it should change. Management does that. EA can merely support that process by applying its own expertise in its own way, and worry about communication with the rest of the organisation both when requirements go in and when a roadmap comes out (both iteratively, natch).

And ‘business architecture’? Well, there still doesn’t appear to be a consensus among the experts on what it means, or how it differs from EA. If anything, it appears to be a description of an organisation using a controlled vocabulary that looks as close as possible to non-domain-specific natural language. That could help with inter-disciplinary communication, but the required discussion about concepts and the words to refer to them makes me wonder whether having a team who can communicate as well as they can model might not be quicker and more precise.

Bare bones TOGAF

Do stakeholder analysis. Cuddle the uninterested powerful ones, forget about the enthusiasts without power. Agree a goal. Deliver an implementable roadmap. The rest is just nice-to-have.

That was one message from today’s slot on The Open Group’s Architecture Framework (TOGAF) at the Open Group’s quarterly meeting in Amsterdam. In one session, two self-described “evil consultants” ran a workshop on how to extract the most value from an Enterprise Architecture (EA) approach to institutional change.

While they agreed on the undivided primacy of keeping the people with power happy when doing EA, their approaches otherwise differed markedly.

Dave Hornford zeroed in mercilessly on the do-able roadmap as the centre of the practice. But before that: find those all-powerful stakeholders and get them to agree on the organisational vision and its goal. If there is no agreement, celebrate. You’ve just saved the organisation an awful lot of money on an expensive and unimplementable EA venture.

Once past that hurdle, Dave contended that the roadmap should identify what the organisation really needs – which may not always be sensible or pretty.

Jason Uppal took a slightly wider view, focussing on the balance between quick wins and how to make EA the norm in an organisation.

The point about ‘quick wins’ is that both ‘quick’ and ‘win’ are relative. It is possible to go after a long term value proposition with a particular change, as long as you have a series of interim solutions that provide value now. Even if you throw them away again later. And the first should preferably have no cost.

That way, EA can become part of the organisation’s practice: by providing value. This does presuppose that the EA practice is neither a project nor a programme – just a practice.

An outline of the talks is on the Open Group’s website.

Linked Data meshup on a string

I wanted to demo my meshup of a triplised version of CETIS’ PROD database with the impressive Linked Data Research Funding Explorer at the Linked Data meetup yesterday. I couldn’t find a good slot and still make my train home, so here’s a broad outline:

The data

The Department for Business Innovation and Skills (BIS) asked Talis if they could use the Linked Data Principles and practice demonstrated in their work with data.gov.uk to produce an application that would visualise some grant data. What popped out was a nice app with visuals by Iconomical, based on a couple of newly available data sets that sit on Talis’ own store for now.

The data concerns research investment in three disciplines, illustrated per project by grant level and number of patents, as they change over time, plotted on a map.

CETIS have PROD: a database of JISC projects, with a varying amount of information about the technologies they use, the programmes they were part of, and any cross-links between them.

The goal

Simple: it just ought to be possible to plot the JISC projects alongside the advanced tech of the Research Funding Explorer. If not, then at least the data in PROD should be augmentable with the data that drives the Research Funding Explorer.

Tools

Anything I could get my hands on – chiefly the D2R toolkit and the OpenLink RDF browser.

The recipe

For one thing, though PROD pushes out Description Of A Project (DOAP, an RDF vocabulary) files per project, it doesn’t quite make all of its contents available as linked data right now. The D2R toolkit was used to map (part of) the contents to known vocabularies, and then make the contents of a copy of PROD available through a SPARQL interface. Bang, we’re on the linked data web. That was easy.

Since I don’t have access to the slick visualisation of the Research Funding Explorer, I’d have to settle for augmenting PROD’s data. This is useful for two reasons: 1) PROD has rather, erm, variable institutional names. Synching these with canonical names from a set that will go into data.gov.uk is very handy. 2) PROD doesn’t know much about geography, but Talis’ data set does.

To make this work, I made a SPARQL query that grabs basic project data from PROD, and institutional names and locations from the Talis data set, and then visualised the results.
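
Since the two datasets sat in separate stores, the practical version of that query was really two queries whose results get joined on the institution name. A rough sketch of the idea, with both endpoint URLs and the PROD institution predicate invented for the example:

    # Rough sketch of the meshup: pull projects from PROD's D2R endpoint and
    # institution locations from the Talis store, then join on institution
    # name. Endpoint URLs and the PROD predicate are placeholders.
    from SPARQLWrapper import SPARQLWrapper, JSON

    def select(endpoint, query):
        s = SPARQLWrapper(endpoint)
        s.setQuery(query)
        s.setReturnFormat(JSON)
        return s.query().convert()["results"]["bindings"]

    projects = select("http://localhost:2020/sparql", """
        PREFIX doap: <http://usefulinc.com/ns/doap#>
        SELECT ?project ?name ?inst WHERE {
            ?project a doap:Project ; doap:name ?name ;
                     <http://example.org/prod/institution> ?inst .
        }
    """)

    places = select("http://api.talis.com/stores/example/services/sparql", """
        PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?label ?lat ?long WHERE {
            ?place rdfs:label ?label ; geo:lat ?lat ; geo:long ?long .
        }
    """)

    coords = {p["label"]["value"].lower(): (p["lat"]["value"], p["long"]["value"])
              for p in places}

    for pr in projects:
        print(pr["name"]["value"], pr["inst"]["value"],
              coords.get(pr["inst"]["value"].lower()))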

Results

A partial map of England, Wales and southern Scotland with markers indicating where projects took place
An excerpt of PROD project data, augmented with proper institutional names and geographic positions from Talis’ Research Grant Explorer, visualised in OpenLink RDF browser.

A star shaped overview of various attributes of a project, with the name property highlighted
Zooming in on a project, this time to show the attributes of a single project. Still in OpenLink RDF browser.

A two column list of one project's attributes and their values
A project in D2R’s web interface; not shiny, but very useful.

Getting from blagging a copy of the SQL tables from the live PROD database to the screenshots above took about two days. Opening up the live server straight to the web would have cut that time by more than half. If I’d waited for the Research Grant Explorer data to be published at data.gov.uk, it would have been a matter of about 45 minutes.

Lessons learned

Opening up any old database as linked data is incredibly easy.

Cross-searching multiple independent linked data stores can be surprisingly difficult. This is why a single SPARQL endpoint across them all, such as the one presented by uberblic‘s Georgi Kobilarov yesterday, is interesting. There are many other good ways to tackle the problem too, but whichever approach you use, making your linked data available as simple big graphs per major class of thing (entity) in your dataset helps a lot. I was stymied somewhat by the fact that I wanted to make use of data that either wasn’t published properly yet (Talis’ research grant set), or wasn’t published at all (our own PROD triples).

A bit of judicious SPARQLing can alleviate a lot of inconsistent data problems. This is relevant to a recent discussion on Twitter around Brian Kelly’s Linked Data challenge. One conclusion was that the challenge was difficult because the data was ‘bad’. IMHO, this is the web, so data isn’t really bad, just permanently inconsistent and incomplete. If you’re willing to put in some effort when querying, a lot can be rectified. We, however, clearly need to clean up PROD’s data to make it easier on everyone.
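
To give a flavour of what ‘judicious SPARQLing’ means in practice: make the fragile parts of the graph pattern OPTIONAL, and do fuzzy matching in a FILTER rather than expecting exact values. The query below is an invented illustration (the institution predicate is a placeholder), not one of the actual PROD queries; it would be sent to an endpoint with SPARQLWrapper as in the earlier sketch.

    # Invented illustration of querying defensively over inconsistent data:
    # OPTIONAL keeps projects with missing institution data, and a
    # case-insensitive regex copes with variable institution names.
    QUERY = """
    PREFIX doap: <http://usefulinc.com/ns/doap#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    SELECT ?project ?name ?instLabel WHERE {
        ?project a doap:Project ; doap:name ?name .
        OPTIONAL {
            ?project <http://example.org/prod/institution> ?inst .
            ?inst rdfs:label ?instLabel .
        }
        FILTER ( !BOUND(?instLabel) ||
                 REGEX(STR(?instLabel), "bolton", "i") )
    }
    """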

SPARQL-panning for gold in multiple datastores (or even feeds or webpages) is way too much fun to seem like work. To me, anyway.

What’s next

What needs to happen is to make all the contents of PROD and related JISC project information available as proper linked data. I can see three stages for this:

  1. We clean up the PROD data a little more at source, and load it into the Data Incubator to polish and debate the database-to-triple mapping. Other meshups would also be much easier at that point.
  2. We properly publish PROD as linked data either on a cloud platform such as Talis’, or else directly from our own server via D2R or OpenLink Virtuoso. Simal would be another great possibility for an outright replacement of PROD, if it’s far enough along at that point.
  3. JISC publishes the public part of its project information as Linked Data, and PROD just augments (rather than replicates) it.

Pinning enterprise architecture to the org chart

Recent discussion during the Open Group’s Seattle conference shows that we’re still not done debating the place of Enterprise Architecture (EA) in an organisation.

For one thing, EA is still a bit of a minority sport, as Tim Westbrock reminded everyone: 99+% of organisations don’t do EA – or at least not consciously. Nonetheless, impressive, linear, multi-digit growth in downloads of, and training in, The Open Group’s Architecture Framework (TOGAF) indicates that an increasing number of organisations want to surface their structure.

Question is: where does that activity sit?

Traditionally, most EA practice comes out of the IT department, because the people in it recognise that an adequate IT infrastructure requires a holistic view of the organisation and its mission. As a result, extraordinary amounts of time and energy are spent on thinking about, engaging with, thinking as or generally fretting about “the business” in EA circles. To the point that IT systems or infrastructure are considered unmentionable.

While morally laudable, this anxiety strikes me as a tad futile if “the business” is unwilling or unable to understand anything about IT – as it frequently seems to be – but that’s just my humble opinion.

Mike Rollins of the Burton Group seems to be thinking along similar lines, in his provocative notion that EA is not something that you are, but something that you do. That is, in order for an architectural approach to be effective, you shouldn’t have architects (in the IT department or elsewhere), but you should integrate doing EA into the general running of the organisation.

Henry Peyret of Forrester wasn’t quite so willing to tell an audience of a few hundred people to quit their jobs, but he also emphasised the need to embed EA in the general work of the organisation. In practical terms, that means the EA team should split its time evenly between strategic work and regular project work.

Tim Westbrock did provide a sharper contrast with the notion of letting EA become an integral part of the whole organisation inasmuch as he argued that, in a transformative scenario, the business and IT domains become separate. The context, though, was his plea for ‘business architecture’, which, simplifying somewhat, looks like EA done by non-IT people using business concepts and language. In such a situation, the scope of the IT domain is pretty much limited to running the infrastructure and coaching ‘the business’ in the early phases of the deployment of a new application that they own.

Stuart MacGregor of realIRM was one of the few who didn’t agonise so much about who’d do EA and where, but he did make a strong case for two things: building and deploying EA capacity for the long term, and spending a lot of time on the soft, even emotional, side of engaging with other people in the organisation. A consequence of the commitment to the long term is to wean EA practices off their addiction to ‘quick wins’ and searches for ‘burning platforms’. Short-term fixes nearly always have unintended consequences, and don’t necessarily do anything to fix the underlying issues.

Much further beyond concerns of who and where is Len Fehskens’ (of The Open Group) very deep consideration of the concepts and history of ‘architecture’ as applied to the enterprise. For cyberneticians and soft systems adepts, Len’s PowerPoint treatise is probably the place to start. Just expect your hackles to be raised.

Resources

Tim Westbrock’s slides on Architecting the Business is Different than Architecting IT

Mike Rollins’ slides on Enterprise Architecture: Disappearing into the Business

Henry Peyret’s slides on the Next generation of Enterprise Architects

Stuart MacGregor’s slides on Business transformation Powered by EA

Len Fehskens’ slides on Rethinking Architecture

SOA only really works webscale

Just sat through a few more SOA talks today and, as usual, the presentations circled round to governance pretty quickly and stayed there.

The issue is this: SOA promises to make life more pleasant by removing duplication of data and functionality. Money is saved and information is more accurate and flows more freely, because we tap directly into the source systems, via their services.

So much for the theory. The problem is that organisations engaged in SOA exercises have a well-documented tendency to re-invent their old monolithic applications as sets of isolated services that make most sense to themselves. And there goes the re-use argument: everyone uses their own set of services, with lots of data and functionality duplication.

Unless, of course, your organisation has managed to set up a Governance Police that makes everyone use the same set of centrally sanctioned services. Which is, let’s say, not always politically feasible.

Which made me think of how this stuff works on the original service oriented architecture: the web. The most obvious attribute of the web, of course, is that there is no central authority over service provision and use. People just use what is most useful to them – and that is precisely the point. Instead of governance, the web has survival of the fittest: the search engine that gives the best answers gets used by everyone.

Trying to recreate that sort of Darwinian jungle within the enterprise seems both impossible and a little misguided. No organisation has the resources to just punt twenty versions of a single service in the full knowledge that at least nineteen will fail.

Or does it? Once you think about the issue webscale, such a trial-and-error approach begins to look more do-able. For a start, an awful lot of current services are commodities that are the same across the board: email, calendars, CRM, etc. These are already being sourced from the web, and there are plenty more that could be punted by entrepreneurial shared-service providers with the nous to serve the education sector (student record systems, HR, etc.).

That leaves the individual HE institutions to concentrate on those services that provide data and functionality that are unique to themselves. Those services will survive, because users need them, and they’re also so crucial that institutions can afford to experiment before a version is found that does the job best.

I’ll weasel out of naming what those services will be: I don’t know. But I suspect it will be those that deal with the institution’s community (‘social network’ if you like) itself.