Category Archives: web standards

Four Stars of Open Standards

I’m writing this at UKGovCamp, a wonderful unconference. This post constitutes notes, which I will flesh out and polish later.

I’m in a session on open standards in government, convened by my good friend Terence Eden, who is the Open Standards Lead at Government Digital Service, part of the United Kingdom government’s Cabinet Office.

Inspired by Tim Berners-Lee’s “Five Stars of Open Data”, I’ve drafted “Four Stars of Open Standards”.

These are:

  1. Publish your content consistently
  2. Publish your content using a shared standard
  3. Publish your content using an open standard
  4. Publish your content using the best open standard

Bonus points for:

  • making clear which standard you use
  • publishing your content under an open licence
  • contributing your experience to the development of the standard.

Point one, if you like, is about having your own local standard: if you publish three related data sets, for instance, be consistent between them.

Point two could simply mean agreeing a common standard with others in your organisation, neighbouring local authorities, or suchlike.

In points three and four, I’ve taken “open” to be the term used in the “Open Definition”:

Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).


United Kingdom parliamentary URL structure: change needed

In Wikidata, Wikipedia’s sister project for storing statements of fact, we record a number of unique identifiers.

For example, Tim Berners-Lee has the VIAF identifier “85312226” and is known to IMDb as “nm3805083”.

We know that we can convert these to URLs by adding the prefixes:

  • https://viaf.org/viaf/
  • http://www.imdb.com/Name?

respectively. We only need to store those prefixes in Wikidata once each.
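As a minimal sketch of that idea in Python: each prefix is stored once, and every URL is derived from it on demand. The prefixes and identifiers are the ones quoted above; the names `URL_PREFIXES` and `identifier_url` are illustrative assumptions, not Wikidata’s actual data model.

```python
# Illustrative only: URL_PREFIXES and identifier_url are hypothetical names.
# Each prefix is stored once; identifiers are stored per person.
URL_PREFIXES = {
    "viaf": "https://viaf.org/viaf/",
    "imdb": "http://www.imdb.com/Name?",
}


def identifier_url(scheme: str, identifier: str) -> str:
    """Build a URL by joining the stored prefix to the identifier."""
    return URL_PREFIXES[scheme] + identifier


print(identifier_url("viaf", "85312226"))   # https://viaf.org/viaf/85312226
print(identifier_url("imdb", "nm3805083"))  # http://www.imdb.com/Name?nm3805083
```

The point of the design is that nothing about the person (name, title, role) is needed to build the link, only the identifier.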


HOUSES OF PARLIAMENT DSC 7057 pano 2

The Houses of Parliament in August 2014,
picture by Henry Kellner, CC BY-SA 3.0

The United Kingdom Parliament website also uses identifiers for MPs and members of the House of Lords.

For example, Tom Watson, an MP, is “1463”, and Jim Knight, aka The Lord Knight of Weymouth, is “4160”.

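However, the respective URLs embed each member’s house and name or title (Tom’s, for example, is http://www.parliament.uk/biographies/commons/tom-watson/1463), meaning that the prefixes are not consistent, and require you to know the name or exact title.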

Yet more ridiculous is that, if Tom Watson ever gets appointed to the House of Lords, even though his unique ID won’t change, the URL required to find his biography on the parliamentary website will change; and, because we don’t know whether he would be, say, Lord Watson of Sandwell Valley or Lord Watson of West Bromwich, we can’t predict what it will be.

When building databases, like Wikidata, this is all extremely unhelpful.

What we would like the parliamentary authorities to do — and what would benefit others wanting to make use of parliamentary URLs — is to use a standard, predictable type of URL, for example http://www.parliament.uk/biographies/1463 which uses the unique identifier, but does not require the individual’s house, name or title, and does not change if they shift to “the other place”.

If necessary they could then make that redirect to the longer URLs they prefer (though I wouldn’t recommend it).

I’ve asked them; but they don’t currently do this. In fact they explained their preference for the longer URLs thus:

…we are unable [sic] to shorten the url any further as the purpose of the current pattern of the web address is to display a pathway to the page.

The url also identifies the page i.e the indication of biographies including the name of the respective Member as to make it informative for online users who may view the page.

I find these arguments unconvincing, to say the least.


Screenshot, with Watson's name in the largest font on the page

There’s a big enough clue on the page, without needing to read the URL to identify its subject

Furthermore, the most verbose parts of the URLs are non-functioning; if we truncate Tom’s URL by simply dropping the final digit: http://www.parliament.uk/biographies/commons/tom-watson/146, then we get the biography of a different MP. On the other hand, if we change it to, say: http://www.parliament.uk/biographies/commons/t/1463, we still get Tom’s page. Try them for yourself.
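To illustrate, here is a rough Python sketch. It assumes, going only by the two examples above, that the final path segment is the part that selects the page. The function names are hypothetical, and `proposed_url` builds the short pattern I am asking for, not one the site is known to support.

```python
from urllib.parse import urlparse


def member_id(url: str) -> str:
    """Return the final path segment of a biography URL.

    Assumption: as the examples in the post suggest, this segment is the
    member's unique identifier and the rest of the path is decorative.
    """
    return urlparse(url).path.rstrip("/").split("/")[-1]


def proposed_url(identifier: str) -> str:
    """Build the short, predictable URL pattern suggested in the post."""
    return "http://www.parliament.uk/biographies/" + identifier


verbose = "http://www.parliament.uk/biographies/commons/tom-watson/1463"
shortened = "http://www.parliament.uk/biographies/commons/t/1463"

# Both forms carry the same identifier, so both map to the same short URL.
print(member_id(verbose), member_id(shortened))
print(proposed_url(member_id(verbose)))
```

A database such as Wikidata would then need to store only the identifier and one prefix, regardless of house, name or title.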

So, how can we help the people running the Parliamentary website to change their minds, and to use a more helpful URL structure? Who do we need to persuade?

ORCID plugin for WordPress

ORCID, the “Open Research Contributor ID”, is an identifier for contributors to academic papers, journals, and other publications. It’s the equivalent, for such people, of an ISBN for a book or a DOI for a paper. ORCID is an open data project, run by a not-for-profit foundation.

I’ve been working with ORCID for over a year, on their “works metadata working group”, as an outreach ambassador, and integrating ORCID into Wikipedia and Wikidata (link is a PDF).

I’m currently at the ORCID outreach event at the University of Illinois in Chicago, USA, and participating in the codefest (a hackathon by another name).

I came up with the idea for a plugin for WordPress, which would allow authors to add their ORCID identifier to their profile, and which would allow users to add their ORCIDs to comments.

Roy Boverhof (sponsored by Elsevier) has kindly coded it (it’s his first WordPress plugin!). I’ve installed it and used it on this post, so you can see my ORCID (“0000-0001-5882-6823”) above, and Roy’s in his comment.

If you have an ORCID, please leave a comment here, and include it in the field provided.

The plugin is very much in beta mode (it’s not yet tested in multiple browsers, for instance; and we need to add documentation and additional functionality such as check-digit validation), but you can get it from Roy’s GitHub repository (there’s a “download zip” button on the right hand side, in the default view) and install it on any self-hosted WordPress installation using Plugins > Add New. (If upgrading from a previous version, please delete the original first.)
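For anyone curious what check-digit validation might involve, here is a minimal sketch in Python. It is not the plugin’s code (a WordPress plugin would be written in PHP), and the function names are my own. It assumes the ISO 7064 MOD 11-2 checksum that ORCID documents for its identifiers, in which the final character is a digit or “X”.

```python
import re

# Four hyphen-separated groups of four; the last character may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check both the layout and the check digit of a hyphenated ORCID."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0001-5882-6823"))  # True
print(is_valid_orcid("0000-0001-5882-6824"))  # False: wrong check digit
```

A check like this catches most single-character typos in a comment form before the identifier is stored or linked.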

Your feedback will be welcome, in comments below, as will code contributions at GitHub.

Thanks, Roy!

Update, 2014-05-22: There were prizes for the best product; all of them were great, but we came second!

Update, 2014-05-28: New version, with various improvements. Please delete the old version before installing the new one, per the above (revised) instructions.

Update, 2014-05-28b: And again! Now at version 0.5

A reply from the UK government to my request for road gritting open data

The government website data.gov.uk, to quote its about page, hosts datasets, as open data, “from all central government departments and a number of other public sector bodies and local authorities” (my emphasis). This is a good thing.

The site’s FAQ says “If there are particular datasets that you believe should be made available more quickly, please use the data request process” (link in original). This is also good.

Accordingly, in September 2012 (that’s sixteen months ago) I submitted a request asking for:

Lists of roads gritted by councils and other bodies, in times of freezing temperatures, with priorities and criteria if applicable.

I specified that those “other bodies” included the Highways Agency, which is “an Executive Agency of the Department for Transport (DfT), and is responsible for operating, maintaining and improving the strategic road network in England on behalf of the Secretary of State for Transport” (again, quoting the HA’s about page) and thus an agency of the UK government. They grit most motorways and certain trunk (“A”) roads.

Highways Agency 1995 Foden Telstar gritter truck, 4 February 2009

A Highways Agency gritting vehicle at work

The data would allow my fellow volunteers and me to label (“tag”) gritting routes in OpenStreetMap, improving satnav routing. Here’s a map of some we’ve already done.

I have, today, received a reply from the Cabinet Office, which I reproduce here in full and verbatim:

Hi Andy,

I am getting in touch with you about your data request. I sincerely apologise for the length of time it has taken to get you a response to your request. Local Authorities are responsible for winter gritting within their boundaries. Local Authorities are data owners and they are responsible for the format, access and cost of their data. This means that you will need to get in touch with the Local Authority’s [sic] who’s [sic] data you are seeking directly for access to their data. Some Local Authorities do publish information about gritting on data.gov.uk, but they do not have a reporting requirement. You may find the following links helpful.

http://data.gov.uk/data/search?q=gritting – Data.gov.uk Local Authorities gritting data
http://www.local.gov.uk/community-safety/-/journal_content/56/10180/3510492/ARTICLE – Local Government Association information on how Local Authorities gritting responsibilities.
https://www.gov.uk/roads-council-will-grit – Access to each local Authority page on gritting
http://www.highways.gov.uk/our-road-network/managing-our-roads/operating-our-network/how-we-manage-our-roads/area-teams/area-9/area-9-our-winter-work/ – Highways Agency information
http://www.highways.gov.uk/about-us/contact-us/ – Contacting the Highways Agency
http://www.highways.gov.uk/freedom-of-information-2/ – information on submitting an Freedom of Information request to the Highways Agency

I am very sorry for the amount of time it has taken us to get back to you. I hope this helps.

Kind Regards,

[name redacted]
Transparency Team
4th Floor
1 Horse Guards Road, London, SW1A 2HQ
Email: [redacted]@cabinet-office.gsi.gov.uk
Find out more about Open Data @ Data.gov.uk

I note the following:

  • Although an apology for the — inordinate — delay in replying is given, no reason for that is offered.
  • It should — surely? — be possible to make one centralised request rather than having to make the same request to every local authority (at the relevant tier) in the country?
  • No mention is made of Highways Agency data, other than links to their web pages, including their FoI page.
  • The reply was sent to me by email, but is not in the Comments section of the page for the request, so is not available to other interested people, including the person who commented in support of it. (I’ll post a link to this post there.)

What do you think?

I hope my recent request, for The National Heritage List for England, receives more prompt consideration and achieves a more positive outcome.

For I’m a Jolly Good Fellow (of the RSA)

I may have been overlooked, once again, in the new year’s honours list, but in mid-December I received an unsolicited and very flattering email; I’d been nominated, by their Regional Programme Manager, to become a Fellow of the Royal Society for the encouragement of Arts, Manufactures and Commerce (the Royal Society of Arts, for short, or RSA, for shorter). The nomination was “for your work on open data, Wikipedia and social media”.

You could have knocked me down with a metaphor.

Royal Society of Arts - from the Strand, London

RSA headquarters
Photo by Elliott Brown, on Flickr, CC-BY

Founded in 1754, the RSA is an independent enlightenment organisation committed to finding practical solutions to today’s social challenges (their email pointed out). That sounded right up my street. I was delighted to accept, and confirmation arrived by e-mail on Wednesday.

I’m in some illustrious company. My fellow fellows include Sir Tim Berners-Lee, Dr Sue Black, Stephen Hawking and Gareth Malone. Past fellows have included Charles Dickens, Benjamin Franklin and Karl Marx!

As a fellow, I shall have use of facilities at the RSA headquarters, off The Strand, pictured above. I shall henceforth refer to this, tongue firmly in cheek, as “my London club”.

My fellowship also means that I now have extra initials after my name. I’m “Andy Mabbett, FRSA”.

But you can still call me Andy.

Don’t link to my Twitter profile!

From time to time, people are kind enough to mention me, with a link, in their blog posts. Usually, in a positive way. I’m very grateful when they do.

A rusty chain

Lovely links (geddit?)
Photo by pratanti, on Flickr, CC-BY

But…

They often link to my Twitter account, like this:

Here’s something about Andy Mabbett.

or like this:

Here’s something about Andy Mabbett (he’s @pigsonthewing on Twitter).

(the relevant HTML markup being, in the first example,
<a href="http://twitter.com/pigsonthewing">Andy Mabbett</a>).

Now, like I say, I’m very grateful for the attention. But I do wish they would link to my website, instead:

Here’s something about Andy Mabbett.

or even both:

Here’s something about Andy Mabbett (he’s @pigsonthewing on Twitter).

(the relevant HTML markup being
<a href="http://pigsonthewing.org.uk">Andy Mabbett</a>).

Why?

For two reasons. Firstly, though Twitter is fun, and I use it a lot, it’s ephemeral, and not everyone reading those posts will want to use it. My website, on the other hand, has more about me and the work I do. Secondly, I need the Google juice (the value afforded to incoming web links by PageRank, the Google search algorithm) more than Twitter does.

This isn’t just about me, though. The same applies every time a blogger or other web page author — and that probably includes you — links to anyone or any organisation, with their own website or blog. Please don’t just link to their page on Twitter, Facebook, LinkedIn, or on some other social networking site. Of course, do that as well, or if it’s the only online presence they have.

But if they have a website, as I do, please make that the primary destination to which you link. And hopefully, they will reciprocate.

Thank you.

Requesting open-licensed, open-format recordings of the voices of Wikipedia subjects for Wikimedia Commons

The Idea

A little while ago, my friend and fellow Wikipedia editor (he’s the Wikipedian in Residence at the British Library!) mentioned to me that Wikipedia could do with more sound files. We discussed recordings of music, industrial and everyday sounds (what does a printing press sound like? Or a Volkswagen Beetle? What do different kinds of breakfast cereal sound like when milk is added?), as well as people’s voices, so that we have a record of what they sound like.

A giant ear-trumpet

Beethoven’s Trumpet (With Ear) By John Baldessari, at the Saatchi Gallery.
Photo by Jim Linwood, on Flickr, CC-BY

In the spirit of Wikipedia, all such recordings would be open-licensed, to allow others to use them, freely. They can then be uploaded to Wikimedia Commons (the media repository for Wikipedia and its related projects) in an open format, namely Ogg Vorbis (that’s like mp3, but without patent encumbrances).

So I’m working on a new initiative to provide short (under ten-second) open-licensed audio clips of examples of the speaking voices of notable people (i.e. people who have Wikipedia articles about them).

What To Do

As a pilot, I’m asking some of my (cough) celebrity friends to kindly record the following, or a variation of their choice, with no background noise:

Hello, my name is [name]. I was born in [place] and I have been [job or position] since [year]

(but without mentioning Wikipedia!) They can do that in a quiet room, with a modern mobile phone or a computer.

[Stop Press: See update 4, below, for update regarding use of “Vocaroo”, to avoid this step]

Once they’ve done that, they can convert the file to Ogg Vorbis using this free tool and then upload it to Wikimedia Commons under an open licence with no “non-commercial (NC)” or “no derivatives (ND)” restrictions (e.g. CC-BY or CC-BY-SA), and add the category “Voice intro project”.

If that’s too much fuss, they can e-mail it, or its URL, to me (andy@pigsonthewing.org.uk), using common file formats like mp3 or .wav, stating that it’s under one of those licences, and CC the mail to: permissions-en@wikimedia.org to formally record the open licence. Then I or other Wikipedia editors will make the conversion.

Alternatively, perhaps, they can point to a suitable, open-licensed, example of their speaking voice, which is already online.
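For editors who end up making the conversion themselves, here is one possible approach, sketched in Python. It assumes ffmpeg is installed with Vorbis support; ffmpeg is my substitution here, not the free tool mentioned above, and the function and file names are made up for illustration. The function only builds the command, so nothing is run until you choose to.

```python
from pathlib import Path


def ogg_vorbis_command(source: str) -> list[str]:
    """Build an ffmpeg command that re-encodes `source` as Ogg Vorbis.

    The output file has the same name as the input, with an .ogg suffix.
    """
    target = Path(source).with_suffix(".ogg")
    return ["ffmpeg", "-i", source, "-c:a", "libvorbis", str(target)]


# "voice-intro.mp3" is a placeholder file name.
print(ogg_vorbis_command("voice-intro.mp3"))
```

To perform the conversion, the resulting list can be passed to `subprocess.run(command, check=True)` on a machine where ffmpeg is available.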

Anyone Can Help

If you’re not the subject of a Wikipedia article, you can still help, by recording and uploading to Wikimedia Commons audio files, as described above, of machinery or everyday activities and occurrences.

Updates

  1. A couple of Wikipedia article subjects have asked why they would do this. In short, so that there is a public — and freely reusable — record of what they sound like, for current and future generations. And so that we know how they pronounce their names.
  2. The uploaded files are now gathered in a Wikimedia Commons category. Thank you to the early contributors.
  3. I’ve been asked about multi-lingual recordings. The best thing would be separate files, one in each language, please.
  4. If you have a microphone on your computer (doesn’t work on iPhone/iPad), it’s possible to record directly into the Vocaroo website, and just email or tweet me a link. But you still need to agree to an open licence!

How should a hackday be run?

I’m working with a large public-sector organisation who have a considerable — and potentially very useful — body of data. They’re keen to open it up, and would like to encourage people to use it by having a hack event of some kind. At the same time, it’s gratifying that they’re clear that they don’t wish to unfairly exploit anyone.

We’re considering a number of options, and would welcome comments and additional suggestions.

The event could be held in the Midlands, over one day or two, on weekdays, a weekend, or Friday-Saturday. Or a competition could be announced online, with a virtual or real-life “Dragons’ Den” type event, for people to present things they’ve worked on at home.

Cray-2 super computer

You won’t need one of these to take part…
Computer Museum: Cray-2 by cmnit, on Flickr, CC-BY

Should we set a specific challenge, or just ask people to do something interesting with the data?

I’ve suggested prizes might be offered for both the most complete solution, and the best idea, whether complete or not. There might be prizes in other categories, such as the best idea by a young person or the most accessible product, or different categories for commercial and hobbyist entrants.

The data holders might also like to consider developing business relationships with the developers of one or more of the products, separate from any prize giving; otherwise, rights in all the entries would of course remain with their developers.

How would you like such an event to happen? We’re aware of the Hackday Manifesto, but what else is best practice, and what other pitfalls should be avoided?

Over to you…

Politician pin ups – open-licensed pictures, please

Politicians, like visits to the dentist and taxes, are a necessary evil. We all moan about them, but someone has to take care of the machinery of state.

So it’s important that we hold them to account, and elsewhere document their activities in a neutral way. Hyperlocal bloggers do the former, and the latter takes place on Wikipedia, and on sites like the excellent OpenlyLocal (both of whose content is open-licensed).

To illustrate such articles, bloggers and Wikipedians need photographs of the politicians (and senior officers). While it’s possible for individuals to take such pictures (and even open-license them, as I described previously), it would be better if such pictures were available from official channels. Such organisations already take or commission professional quality shots and make them available to the press. If they don’t already, they should make sure that their contract with photographers pays for full rights, enabling open-licensing.

I recently asked Birmingham City Council’s press office to make their pictures of members of BCC’s cabinet available under an open licence, and, to their credit, they did so. I was then able to use one of them on Wikipedia:

Wikipedia article using a picture open-licensed by Birmingham City Council

Some might ask “but what if the pictures are misused, to misrepresent those people?” Well, if someone’s going to do that, then they won’t bother about copyright anyway, and other laws (libel, human rights) already enable redress.

So come on, all you councils, civil service departments, police forces/authorities and so on: let us have pictures of your elected members and senior officers, free (i.e. with no “non-commercial” or “no derivatives” restrictions) for reuse on our blogs, Wikipedia and other sites. Major companies, too, could do this for their most-public board members.

Then there’s all public bodies’ other photographs. After all, West Midlands Police kindly agreed to my request to open-license the fantastic aerial shots from their helicopter…

St. Martin in the Bullring Church, Birmingham
Birmingham’s Bull Ring, from the West Midlands Police helicopter. Although this picture is ©WM Police, I can use it, here and on Wikipedia, because they kindly make it available under a CC-BY-SA licence