Tag Archives: url

United Kingdom parliamentary URL structure: change needed

In Wikidata, Wikipedia’s sister project for storing statements of fact, we record a number of unique identifiers.

For example, Tim Berners-Lee has the VIAF identifier “85312226” and is known to the IMDb as “nm3805083”.

We know that we can convert these to URLs by adding the prefixes:

  • https://viaf.org/viaf/
  • http://www.imdb.com/Name?

respectively. We only need to store those prefixes in Wikidata once each.
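As a minimal sketch of that idea (the dictionary keys and the function name are my own, chosen for illustration), each prefix is stored once and every URL is built by simple concatenation:

```python
# Prefixes quoted above; each is stored once, not once per person.
# The keys "viaf" and "imdb" are illustrative labels, not Wikidata property names.
URL_PREFIXES = {
    "viaf": "https://viaf.org/viaf/",
    "imdb": "http://www.imdb.com/Name?",
}


def identifier_to_url(source: str, identifier: str) -> str:
    """Build a URL by putting the stored prefix in front of the identifier."""
    return URL_PREFIXES[source] + identifier


if __name__ == "__main__":
    print(identifier_to_url("viaf", "85312226"))
    print(identifier_to_url("imdb", "nm3805083"))
```

Nothing about the person, other than the identifier itself, is needed to build the link.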


Picture from August 2014 by Henry Kellner, CC BY-SA 3.0

The United Kingdom Parliament website also uses identifiers for MPs and members of the House of Lords.

For example, Tom Watson, an MP, is “1463”, and Jim Knight, aka The Lord Knight of Weymouth, is “4160”.

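However, the respective URLs also include each member’s house and their name or title (Tom Watson’s, for example, is http://www.parliament.uk/biographies/commons/tom-watson/1463), meaning that the prefixes are not consistent, and require you to know the name or exact title.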

Yet more ridiculous is that, if Tom Watson ever gets appointed to the House of Lords, even though his unique ID won’t change, the URL required to find his biography on the parliamentary website will change — and, because we don’t know whether he would be, say Lord Watson of Sandwell Valley, or Lord Watson of West Bromwich, we can’t predict what it will be.

When building databases, like Wikidata, this is all extremely unhelpful.

What we would like the parliamentary authorities to do — and what would benefit others wanting to make use of parliamentary URLs — is to use a standard, predictable type of URL, for example http://www.parliament.uk/biographies/1463 which uses the unique identifier, but does not require the individual’s house, name or title, and does not change if they shift to “the other place”.

If necessary they could then make that redirect to the longer URLs they prefer (though I wouldn’t recommend it).
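To make the suggestion concrete, here is a rough sketch of how such a redirect might behave. It is not how the parliamentary site works: the lookup table, the function names and the choice of a 301 status are all assumptions of mine, and the table holds only the one long URL discussed in this post.

```python
STABLE_PREFIX = "http://www.parliament.uk/biographies/"

# Hypothetical lookup table, keyed by member ID. A real site would hold this
# in its own database; only Tom Watson's entry is filled in here.
LONG_URLS = {
    "1463": "http://www.parliament.uk/biographies/commons/tom-watson/1463",
}


def stable_url(member_id: str) -> str:
    """The short form proposed above: fixed prefix plus unique ID."""
    return STABLE_PREFIX + member_id


def resolve(url: str):
    """Return (status, location) for a request to a stable URL."""
    if not url.startswith(STABLE_PREFIX):
        return (404, None)
    member_id = url[len(STABLE_PREFIX):]
    target = LONG_URLS.get(member_id) if member_id.isdigit() else None
    if target is None:
        return (404, None)
    # The short URL never changes; only the redirect target would need
    # updating if a member moved house or changed title.
    return (301, target)


if __name__ == "__main__":
    print(resolve(stable_url("1463")))
```

The point of the design is that third parties only ever store the short form, so a change of house or title touches one table on the server rather than every link on the web.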

I’ve asked them; but they don’t currently do this. In fact they explained their preference for the longer URLs thus:

…we are unable [sic] to shorten the url any further as the purpose of the current pattern of the web address is to display a pathway to the page.

The url also identifies the page i.e the indication of biographies including the name of the respective Member as to make it informative for online users who may view the page.

I find these arguments unconvincing, to say the least.

Screenshot, with Watson's name in the largest font on the page

There’s a big enough clue on the page, without needing to read the URL to identify its subject

Furthermore, the most verbose parts of the URLs are non-functioning; if we truncate Tom’s URL by simply dropping the final digit: http://www.parliament.uk/biographies/commons/tom-watson/146, then we get the biography of a different MP. On the other hand, if we change it to, say: http://www.parliament.uk/biographies/commons/t/1463, we still get Tom’s page. Try them for yourself.
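That behaviour suggests that only the final path segment is actually used. Assuming that is so (it is an inference from the experiment above, not something the site documents), the identifier can be recovered from any variant of the URL with a few lines; the function name is my own:

```python
from urllib.parse import urlparse


def member_id_from_url(url: str) -> str:
    """Return the numeric ID found in the last path segment of a biography URL."""
    path = urlparse(url).path
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    if not last_segment.isdigit():
        raise ValueError(f"no numeric member ID at the end of {url!r}")
    return last_segment


if __name__ == "__main__":
    for example in (
        "http://www.parliament.uk/biographies/commons/tom-watson/1463",
        "http://www.parliament.uk/biographies/commons/t/1463",
        "http://www.parliament.uk/biographies/commons/tom-watson/146",
    ):
        print(example, "->", member_id_from_url(example))
```

The first two URLs yield the same ID, while the truncated one yields a different ID, even though it still contains Tom’s name.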

So, how can we help the people running the Parliamentary website to change their minds, and to use a more helpful URL structure? Who do we need to persuade?

Twitter: canonical URLs and Protocols

On Twitter, I’m twitter.com/pigsonthewing, but in my preferred twitter client, Dabr, I’m dabr.co.uk/user/pigsonthewing. We might refer to the former as the “canonical” URL.

There are a number of other web-based Twitter clients, too, and people using them can find my Twitter stream at a different URL on each.

Likewise each of my Twitter posts, or “tweets”, has a URL on each of some of those domains (though not on all, it seems), and all of those URLs lead to the same tweet. We can again regard the one on twitter.com as canonical.

Anyone using one of those services who wants to link to my profile or one of my tweets will either post the URL as it appears in their service, which isn’t much use to people not using that service, or expend time and effort translating the URL into the generic, canonical Twitter format, which even then may not be of much use to someone using something else.

In the short term, we could do with some recognition of this fact from the above services, which might provide a link to the “standard” or canonical URL for that tweet; when doing so on an individual page, they should mark up that link with rel="alternate" and/or rel="canonical".

Better still, there could be browser tools (such as a Firefox plug-in or Greasemonkey script) to do that task, automagically.
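The translation such a tool would perform is simple enough to sketch. The version below covers only the two profile URL patterns mentioned in this post; the table, the function name and the use of https in the result are my own assumptions, and any other client would need its own entry.

```python
from urllib.parse import urlparse

# Path prefix that precedes the username on each domain. Only the two
# patterns quoted in this post are listed.
PROFILE_PATH_PREFIXES = {
    "twitter.com": "/",
    "dabr.co.uk": "/user/",
}


def canonical_profile_url(url: str) -> str:
    """Translate a client-specific profile URL into the twitter.com form."""
    if "://" not in url:
        url = "http://" + url  # accept scheme-less URLs as written in the post
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    prefix = PROFILE_PATH_PREFIXES.get(host)
    if prefix is None or not parsed.path.startswith(prefix):
        raise ValueError(f"unrecognised profile URL: {url!r}")
    username = parsed.path[len(prefix):].strip("/")
    if not username or "/" in username:
        raise ValueError(f"no single username found in {url!r}")
    # Emitting https here is an assumption; the post gives no scheme.
    return "https://twitter.com/" + username


if __name__ == "__main__":
    print(canonical_profile_url("dabr.co.uk/user/pigsonthewing"))
    print(canonical_profile_url("twitter.com/pigsonthewing"))
```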

Ultimately, though, as Twitter becomes ever more widespread, perhaps we need a pair of protocols for linking to Twitter profiles and posts. Using this, authors would be able to mark up links to me and my comments on Twitter as, say:

<a href="twitter:pigsonthewing">Andy Mabbett</a> said <a href="twitterpost:1827840116">something witty</a>.

Then, each reader could set their computer to open those links in their choice of browser-based, desktop or mobile phone client. The setting to do so could even be changed by the installation package for such tools, to aid non-technical users.
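Here is a minimal sketch of what a handler for those links might do, assuming the two scheme names used in the example above. Neither scheme is registered anywhere, the function names are mine, and the post template points at a made-up domain (client.example) purely for illustration.

```python
def parse_twitter_link(href: str):
    """Split a proposed twitter: or twitterpost: link into (kind, value)."""
    scheme, separator, value = href.partition(":")
    if not separator or not value:
        raise ValueError(f"not a scheme:value link: {href!r}")
    if scheme == "twitter":
        return ("profile", value)
    if scheme == "twitterpost":
        if not value.isdigit():
            raise ValueError(f"post ID must be numeric: {href!r}")
        return ("post", value)
    raise ValueError(f"unsupported scheme: {scheme!r}")


def resolve(href: str, templates: dict) -> str:
    """Turn a link into a URL for whichever client the reader has chosen."""
    kind, value = parse_twitter_link(href)
    return templates[kind].format(value)


# One reader's preference. The profile pattern is Dabr's, as quoted earlier;
# the post pattern is hypothetical.
READER_TEMPLATES = {
    "profile": "https://dabr.co.uk/user/{}",
    "post": "https://client.example/status/{}",
}

if __name__ == "__main__":
    print(resolve("twitter:pigsonthewing", READER_TEMPLATES))
    print(resolve("twitterpost:1827840116", READER_TEMPLATES))
```

The author of a page writes the link once, and each reader’s own templates decide where it opens.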

Footnote: if you know of another URL for my Twitter stream, please let me know!