Category Archives: web standards

United Kingdom parliamentary URL structure: change needed

In Wikidata, Wikipedia’s sister project for storing statements of fact as linked, open data, we record a number of unique identifiers.

For example, Tim Berners-Lee has the VIAF identifier “85312226” and is known to the Internet Movie Database as “nm3805083”.

We know that we can convert these to URLs by adding a prefix, so

85312226 becomes https://viaf.org/viaf/85312226/
nm3805083 becomes http://www.imdb.com/Name?nm3805083

by adding the prefixes:

https://viaf.org/viaf/
http://www.imdb.com/Name?

respectively. We only need to store those prefixes in Wikidata once each.
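
As a rough illustration of the prefix approach, here is a minimal Python sketch. The dictionary keys ("viaf", "imdb") and the function name are made up for this example and are not Wikidata's own names; only the prefixes and identifiers come from the text above. Note that plain concatenation does not produce the trailing slash shown in the VIAF example.

```python
# Prefixes as quoted in the post; the dictionary keys are hypothetical labels.
URL_PREFIXES = {
    "viaf": "https://viaf.org/viaf/",
    "imdb": "http://www.imdb.com/Name?",
}


def identifier_to_url(scheme: str, identifier: str) -> str:
    """Build a URL by putting the stored prefix in front of the identifier."""
    return URL_PREFIXES[scheme] + identifier


print(identifier_to_url("viaf", "85312226"))
print(identifier_to_url("imdb", "nm3805083"))
```

Each prefix is stored once, and every identifier of that type can then be turned into a link without any further information.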

The Houses of Parliament in August 2014,
picture by Henry Kellner, CC BY-SA 3.0

The United Kingdom Parliament website also uses identifiers for MPs and members of the House of Lords.

For example, Tom Watson, an MP, is “1463”, and Jim Knight, aka The Lord Knight of Weymouth, is “4160”.

However, the respective URLs are:

meaning that the prefixes are not consistent, and require you to know the name or exact title.

Yet more ridiculous is that, if Tom Watson ever gets appointed to the House of Lords, even though his unique ID won’t change, the URL required to find his biography on the parliamentary website will change — and, because we don’t know whether he would be, say Lord Watson of Sandwell Valley, or Lord Watson of West Bromwich, we can’t predict what it will be.

When building databases, like Wikidata, this is all extremely unhelpful.

What we would like the parliamentary authorities to do — and what would benefit others wanting to make use of parliamentary URLs — is to use a standard, predictable type of URL, for example http://www.parliament.uk/biographies/1463 which uses the unique identifier, but does not require the individual’s house, name or title, and does not change if they shift to “the other place”.

If necessary they could then make that redirect to the longer URLs they prefer (though I wouldn’t recommend it).
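
To make the suggestion concrete, here is a small Python sketch of how a stable, ID-only URL could sit in front of the longer form. It is only an assumption about how such a scheme might work, not a description of anything Parliament runs: the lookup table, the function names and the "lords" entry used at the end are all hypothetical.

```python
BIOGRAPHY_BASE = "http://www.parliament.uk/biographies/"

# Hypothetical lookup table: member ID -> (house, name slug).
MEMBERS = {
    "1463": ("commons", "tom-watson"),
}


def stable_url(member_id: str) -> str:
    """The predictable URL: needs only the unique identifier."""
    return BIOGRAPHY_BASE + member_id


def redirect_target(member_id: str) -> str:
    """The longer URL that the stable one could redirect to."""
    house, slug = MEMBERS[member_id]
    return f"{BIOGRAPHY_BASE}{house}/{slug}/{member_id}"


print(stable_url("1463"))
print(redirect_target("1463"))

# If the member changes house, only the lookup table changes.
# The slug below is invented purely for illustration.
MEMBERS["1463"] = ("lords", "hypothetical-title")
print(stable_url("1463"))       # unchanged
print(redirect_target("1463"))  # changed
```

The point of the design is that anyone storing links only ever needs the stable form; the verbose form becomes an internal detail of the website.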

I’ve asked them; but they don’t currently do this. In fact they explained their preference for the longer URLs thus:

…we are unable [sic] to shorten the url any further as the purpose of the current pattern of the web address is to display a pathway to the page.

The url also identifies the page i.e the indication of biographies including the name of the respective Member as to make it informative for online users who may view the page.

I find these arguments unconvincing, to say the least.

Screenshot, with Watson's name in the largest font on the page

There’s a big enough clue on the page, without needing to read the URL to identify its subject

Furthermore, the most verbose parts of the URLs are non-functioning; if we truncate Tom’s URL by simply dropping the final digit: http://www.parliament.uk/biographies/commons/tom-watson/146, then we get the biography of a different MP. On the other hand, if we change it to, say: http://www.parliament.uk/biographies/commons/t/1463, we still get Tom’s page. Try them for yourself.
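
The behaviour described above is what you would expect if the server looked only at the final path segment. The following Python sketch mimics that; it is a guess at the logic, not the site's actual code, and the function name is invented for the example.

```python
from urllib.parse import urlparse


def member_id_from_url(url: str) -> str:
    """Return the last path segment, ignoring everything before it."""
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


full = "http://www.parliament.uk/biographies/commons/tom-watson/1463"
shortened = "http://www.parliament.uk/biographies/commons/t/1463"
truncated = "http://www.parliament.uk/biographies/commons/tom-watson/146"

print(member_id_from_url(full))       # same ID as the next line
print(member_id_from_url(shortened))
print(member_id_from_url(truncated))  # a different ID, so a different MP
```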

So, how can we help the people running the Parliamentary website to change their minds, and to use a more helpful URL structure? Who do we need to persuade?

ORCID plugin for WordPress

29 Replies

0000-0001-5882-6823

ORCID, the “Open Researcher and Contributor ID”, is an identifier for contributors to academic papers, journals, and other publications. It’s the equivalent, for such people, of an ISBN for a book or a DOI for a paper. ORCID is an open data project, run by a not-for-profit foundation.

I’ve been working with ORCID for over a year, on their “works metadata working group”, as an outreach ambassador, and integrating ORCID into Wikipedia and Wikidata (link is a PDF).

I’m currently at the ORCID outreach event at the University of Illinois in Chicago, USA, and participating in the codefest (a hackathon by another name).

I came up with the idea for a plugin for WordPress, which would allow authors to add their ORCID identifier to their profile, and which would allow users to add their ORCIDs to comments.

Roy Boverhof (kindly sponsored by Elsevier) has coded it (it’s his first WordPress plugin!); I’ve installed it and used it on this post, so you can see my ORCID (“0000-0001-5882-6823”) above, and Roy’s in his comment.

If you have an ORCID, please leave a comment here, and include it in the field provided.

The plugin is very much in beta mode (it’s not yet tested in multiple browsers, for instance; and we need to add documentation and additional functionality such as check-digit validation), but you can get it from Roy’s GitHub repository (there’s a “download zip” button on the right hand side, in the default view) and install it on any self-hosted WordPress installation using Plugins > Add New. (If upgrading from a previous version, please delete the original first.)
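
For anyone curious what check-digit validation involves, here is a minimal sketch in Python (not the plugin's own code, which is a WordPress plugin). It follows ORCID's published scheme, ISO 7064 MOD 11-2, in which the final character is computed from the first fifteen digits and may be "X". The function names are invented for this example.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, the characters and the final check character."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15]


print(is_valid_orcid("0000-0001-5882-6823"))  # the ORCID shown on this post
print(is_valid_orcid("0000-0001-5882-6824"))  # last digit altered
```

A check like this catches most typing errors in a comment form, but it cannot tell whether an ORCID actually belongs to the person entering it.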

Your feedback will be welcome, in comments below, as will code contributions at GitHub.

Thanks, Roy!

Update, 2014-05-22: There were prizes for the best product; all of them were great, but we came second!

Update, 2014-05-28: New version, with various improvements. Please delete the old version before installing the new one, per the above (revised) instructions.

Update, 2014-05-28b: And again! Now at version 0.5

Update, 2025-12-22: Yes, that’s right: after more than a decade, there is a new version, 1.0, with significant improvements, new features and updates. As always, see Roy’s GitHub repository for details and to download the file (the link is now hidden behind GitHub’s “Code” button), and install as described above.

A reply from the UK government to my request for road gritting open data

6 Replies

0000-0001-5882-6823

The government website data.gov.uk, to quote its about page, hosts datasets, as open data, “from all central government departments and a number of other public sector bodies and local authorities” (my emphasis). This is a good thing.

The site’s FAQ says “If there are particular datasets that you believe should be made available more quickly, please use the data request process” (link in original). This is also good.

Accordingly, in September 2012 (that’s sixteen months ago) I submitted a request asking for:

Lists of roads gritted by councils and other bodies, in times of freezing temperatures, with priorities and criteria if applicable.

I specified that those “other bodies” included the Highways Agency, which is “an Executive Agency of the Department for Transport (DfT), and is responsible for operating, maintaining and improving the strategic road network in England on behalf of the Secretary of State for Transport” (again, quoting the HA’s about page) and thus an agency of the UK government. They grit most motorways and certain trunk (“A”) roads.

A Highways Agency gritting vehicle at work

The data would allow my fellow volunteers and me to label (“tag”) gritting routes in OpenStreetMap, improving satnav routing. Here’s a map of some we’ve already done.

I have, today, received a reply from the Cabinet Office, which I reproduce here in full and verbatim:

Hi Andy,

I am getting in touch with you about your data request. I sincerely apologise for the length of time it has taken to get you a response to your request. Local Authorities are responsible for winter gritting within their boundaries. Local Authorities are data owners and they are responsible for the format, access and cost of their data. This means that you will need to get in touch with the Local Authority’s [sic] who’s [sic] data you are seeking directly for access to their data. Some Local Authorities do publish information about gritting on data.gov.uk, but they do not have a reporting requirement. You may find the following links helpful.

http://data.gov.uk/data/search?q=gritting – Data.gov.uk Local Authorities gritting data
http://www.local.gov.uk/community-safety/-/journal_content/56/10180/3510492/ARTICLE – Local Government Association information on how Local Authorities gritting responsibilities.
https://www.gov.uk/roads-council-will-grit – Access to each local Authority page on gritting
http://www.highways.gov.uk/our-road-network/managing-our-roads/operating-our-network/how-we-manage-our-roads/area-teams/area-9/area-9-our-winter-work/ – Highways Agency information
http://www.highways.gov.uk/about-us/contact-us/ – Contacting the Highways Agency
http://www.highways.gov.uk/freedom-of-information-2/ – information on submitting an Freedom of Information request to the Highways Agency

I am very sorry for the amount of time it has taken us to get back to you. I hope this helps.

Kind Regards,

[name redacted]
Transparency Team
4th Floor
1 Horse Guards Road, London, SW1A 2HQ
Email: [redacted]@cabinet-office.gsi.gov.uk
Find out more about Open Data @ Data.gov.uk

I note the following:

Although an apology for the — inordinate — delay in replying is given, no reason for that is offered.
It should — surely? — be possible to make one centralised request rather than having to make the same request to every local authority (at the relevant tier) in the country?
No mention is made of Highways Agency data, other than links to their web pages, including their FoI page.
The reply was sent to me by email, but is not in the Comments section of the page for the request, so is not available to other interested people, including the person who commented in support of it. (I’ll post a link to this post there.)

What do you think?

I hope my recent request, for The National Heritage List for England, receives more prompt consideration and achieves a more positive outcome.

BBC open licences voice samples from radio programmes; ‘Speakerthon’ event invitation

For I’m a Jolly Good Fellow (of the RSA)

7 Replies

0000-0001-5882-6823

I may have been overlooked, once again, in the new year’s honours list, but in mid-December I received an unsolicited and very flattering email; I’d been nominated, by their Regional Programme Manager, to become a Fellow of the Royal Society for the encouragement of Arts, Manufactures and Commerce (the Royal Society of Arts, for short, or RSA, for shorter). The nomination was “for your work on open data, Wikipedia and social media”.

You could have knocked me down with a metaphor.

Royal Society of Arts - from the Strand, London

RSA headquarters
Photo by Elliott Brown, on Flickr, CC-BY

Founded in 1754, the RSA is an independent enlightenment organisation committed to finding practical solutions to today’s social challenges (their email pointed out). That sounded right up my street. I was delighted to accept, and confirmation arrived by e-mail on Wednesday.

I’m in some illustrious company. My fellow fellows include Sir Tim Berners-Lee, Dr Sue Black, Stephen Hawking and Gareth Malone. Past fellows have included Charles Dickens, Benjamin Franklin and Karl Marx!

As a fellow, I shall have use of facilities at the RSA headquarters, off The Strand, pictured above. I shall henceforth refer to this, tongue firmly in cheek, as “my London club”.

My fellowship also means that I now have extra initials after my name. I’m “Andy Mabbett, FRSA”.

But you can still call me Andy.

Don’t link to my Twitter profile!

2 Replies

0000-0001-5882-6823

From time to time, people are kind enough to mention me, with a link, in their blog posts. Usually, in a positive way. I’m very grateful when they do.

Lovely links (geddit?)
Photo by pratanti, on Flickr, CC-BY

But…

They often link to my Twitter account, like this:

Here’s something about Andy Mabbett.

or like this:

Here’s something about Andy Mabbett (he’s @pigsonthewing on Twitter).

(the relevant HTML markup being, in the first example,
<a href="http://twitter.com/pigsonthewing">Andy Mabbett</a>).

Now, like I say, I’m very grateful for the attention. But I do wish they would link to my website, instead:

Here’s something about Andy Mabbett.

or even both:

Here’s something about Andy Mabbett (he’s @pigsonthewing on Twitter).

(the relevant HTML markup being
<a href="http://pigsonthewing.org.uk">Andy Mabbett</a>).

Why?

For two reasons. Firstly, though Twitter is fun, and I use it a lot, it’s ephemeral, and not everyone reading those posts will want to use it. My website, on the other hand, has more about me and the work I do. Secondly, I need the Google juice (the value afforded to incoming web links by PageRank, the Google search algorithm) more than Twitter does.

This isn’t just about me, though. The same applies every time a blogger or other web page author — and that probably includes you — links to anyone or any organisation, with their own website or blog. Please don’t just link to their page on Twitter, Facebook, LinkedIn, or on some other social networking site. Of course, do that as well, or if it’s the only online presence they have.

But if they have a website, as I do, please make that the primary destination to which you link. And hopefully, they will reciprocate.

Thank you.

Requesting open-licensed, open-format recordings of the voices of Wikipedia subjects for Wikimedia Commons

39 Replies

0000-0001-5882-6823

The Idea

A little while ago, my friend and fellow Wikipedia editor Andrew Gray (he’s the Wikipedian in Residence at the British Library!) mentioned to me that Wikipedia could do with more sound files. We discussed recordings of music, industrial and everyday sounds (what does a printing press sound like? Or a Volkswagen Beetle? What do different kinds of breakfast cereal sound like when milk is added?), as well as people’s voices, so that we have a record of what they sound like.

Beethoven’s Trumpet (With Ear) By John Baldessari, at the Saatchi Gallery.
Photo by Jim Linwood, on Flickr, CC-BY

In the spirit of Wikipedia, all such recordings would be open-licensed, to allow others to use them, freely. They can then be uploaded to Wikimedia Commons (the media repository for Wikipedia and its related projects) in an open format, namely Ogg Vorbis (that’s like mp3, but without patent encumbrances).

So I’m working on a new initiative to provide short (under ten-second) open-licensed audio clips of examples of the speaking voices of notable people (i.e. people who have Wikipedia articles about them).

What To Do

As a pilot, I’m asking some of my (cough) celebrity friends to kindly record the following, or a variation of their choice, with no background noise:

Hello, my name is [name]. I was born in [place] and I have been [job or position] since [year]

(but without mentioning Wikipedia!) They can do that in a quiet room, with a modern mobile phone or a computer.

[Stop Press: See update 4, below, for update regarding use of “Vocaroo”, to avoid this step]

Once they’ve done that, they can convert the file to Ogg Vorbis using this free tool and then upload it to Wikimedia Commons under an open licence with no “non-commercial (NC)” or “no derivatives (ND)” restrictions (e.g. CC BY or CC BY-SA), and add the category “Voice intro project”.

If that’s too much fuss, they can e-mail it, or its URL, to me (andy@pigsonthewing.org.uk), using common file formats like mp3 or .wav, stating that it’s under one of those licences, and CC the mail to: permissions-en@wikimedia.org to formally record the open licence. Then I or other Wikipedia editors will make the conversion.

Alternatively, perhaps, they can point to a suitable, open-licensed, example of their speaking voice, which is already online.
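
For those comfortable with a command line, the conversion step can also be scripted. The sketch below is an alternative to the tool mentioned above, not that tool itself. It assumes ffmpeg is installed and was built with the libvorbis encoder; the function names and the file name in the example are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def vorbis_command(source: str) -> List[str]:
    """Build an ffmpeg command that re-encodes the audio as Ogg Vorbis."""
    target = str(Path(source).with_suffix(".ogg"))
    return ["ffmpeg", "-i", source, "-c:a", "libvorbis", target]


def convert_to_vorbis(source: str) -> None:
    """Run the conversion; raises an error if ffmpeg fails or is missing."""
    subprocess.run(vorbis_command(source), check=True)


# Show the command without running it.
print(" ".join(vorbis_command("intro.wav")))
```

Calling convert_to_vorbis("intro.wav") would then write intro.ogg next to the original recording, ready for upload.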

Anyone Can Help

If you’re not the subject of a Wikipedia article, you can still help, by recording and uploading to Wikimedia Commons audio files, as described above, of machinery or everyday activities and occurrences.

Updates

1. A couple of Wikipedia article subjects have asked why they would do this. In short, so that there is a public, and freely reusable, record of what they sound like, for current and future generations. And so that we know how they pronounce their names.
2. The uploaded files are now gathered in a Wikimedia Commons category. Thank you to the early contributors.
3. I’ve been asked about multi-lingual recordings. The best thing would be separate files, one in each language, please.
4. If you have a microphone on your computer (doesn’t work on iPhone/iPad), it’s possible to record directly into the Vocaroo website, and just email or tweet me a link. But you still need to agree to an open licence!

How should a hackday be run?

13 Replies

0000-0001-5882-6823

I’m working with a large public-sector organisation who have a considerable — and potentially very useful — body of data. They’re keen to open it up, and would like to encourage people to use it by having a hack event of some kind. At the same time, it’s gratifying that they’re clear that they don’t wish to unfairly exploit anyone.

We’re considering a number of options, and would welcome comments and additional suggestions.

The event could be held in the Midlands; over one day or two, on weekdays, weekend, or Friday-Saturday. Or a competition could be announced online, with a virtual or real-life “dragons den” type event, for people to present things they’ve worked on at home.

You won’t need one of these to take part…
Computer Museum: Cray-2 by cmnit, on Flickr, CC-BY

Should we set a specific challenge, or just ask people to do something interesting with the data?

I’ve suggested prizes might be offered for both the most complete solution, and the best idea, whether complete or not. There might be prizes in other categories, such as the best idea by a young person or the most accessible product, or different categories for commercial and hobbyist entrants.

The data holders might also like to consider developing business relationships with the developers of one or more of the products, separate to any prize giving; rights in all the entries would of course remain with their developers, otherwise.

How would you like such an event to happen? We’re aware of the Hackday Manifesto, but what else is best practice, and what other pitfalls should be avoided?

Over to you…

Andy Mabbett, aka pigsonthewing.

Freelance Wikipedia, Wikidata and OpenStreetMap consultant and Wikimedian in Residence, from Birmingham, England.
