Category Archives: web standards

My Open Data Challenge to UK Local Government: a Wikipedia Page for Every Council

At yesterday’s excellent West Midlands “Open Data: Challenges & Opportunities” event, hosted by the West Midlands Regional Observatory, Chris Taggart, who runs the very useful Openly Local website, which aggregates data about councils and their elected members, mentioned the problems he has extracting linked data about councils from Wikipedia, via DBpedia, because Wikipedia tends to conflate places with their local authorities.

See, for example, the Wikipedia article on the Metropolitan Borough of Dudley; or those on , which (at the time of writing) has only a small section on its town council and Lichfield district (so a challenge there for Stuart Harrison, , and his colleagues!); and compare them with the separate articles about and ; or , the , and . The former, all-on-one-page, pattern is far more common. (Disclosure: I created some, and have edited all, of those articles.)

I suggested at the event that this problem could be solved if staff from each UK council simply started a Wikipedia article about their council, where none already exists.

As each UK council is, inherently, (to use the Wikipedia jargon) notable, there should be no issue with this, provided that they are mindful of Wikipedia’s policy on conflicts of interest (which explicitly allows for such editing), and the requirement that articles maintain a neutral point-of-view, and be referenced. Short “stub” articles can be created in the first instance.

(If council staff are hesitant to do so themselves, then I can help to pair them up with volunteer Wikipedia editors who will assist them, or create articles directly.)

Update: Added Dudley & Lichfield district examples.

Manu Sporny recommends me on LinkedIn

I hope you will forgive me for immodestly repeating Manu Sporny’s kind and fulsome recommendation of me, from my LinkedIn profile, for the benefit of those of you who don’t have accounts there:

I had worked with Andy in the Microformats community, developing international standards for the Web. During this time Andy not only excelled at providing technical feedback and review, but led several bold initiatives to standardize the classification of planetary-geo-location and living species on the web. While a logically consistent and wise technical contributor, his influence on the direction of the community was also vital. Andy’s role in questioning and influencing the core philosophy and community process was and continues to be deeply appreciated.

I’m genuinely touched by that. Thank you, Manu!

Manu Sporny is CEO of Digital Bazaar.

Machine Tagging Flickr

I’ve posted some more thoughts on machine- (or triple-) tags and microformats on Flickr, in their Flickr Ideas group.

Update: There is now a tool to automatically generate tags for Flickr images of living things: iNaturalist tagger.

Triple-tag references to Twitter posts

Further to my post about a protocol for Twitter posts, you can also triple-tag blog posts, Flickr images and similar web utterances, which refer to a specific twitter post (or status) like this: twitter:status=1975532392 – and this post is tagged with that!

[Update: See also my Flickr screenshot of a Twitter post, triple tagged with #twitter:status=1828036334 to reference the same post.]
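As a rough illustration of how software might read such a tag, here is a minimal Python sketch that splits a triple tag into its namespace, predicate and value, and pulls out the Twitter status number. The function names and the exact pattern are my own assumptions for the example, not part of any Flickr or Twitter API.

```python
import re
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# namespace:predicate=value, with an optional leading "#" as used in the
# Flickr screenshot example. The character classes are an assumption.
_TRIPLE_TAG = re.compile(
    r"^#?(?P<namespace>[A-Za-z][A-Za-z0-9_]*)"
    r":(?P<predicate>[A-Za-z][A-Za-z0-9_]*)"
    r"=(?P<value>.+)$"
)


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Return the three parts of a triple tag, or None if it is not one."""
    match = _TRIPLE_TAG.match(tag.strip())
    if match is None:
        return None
    return TripleTag(**match.groupdict())


def tweet_id(tag: str) -> Optional[int]:
    """Return the status number from a twitter:status=... tag, else None."""
    parsed = parse_triple_tag(tag)
    if parsed is None:
        return None
    if (parsed.namespace, parsed.predicate) != ("twitter", "status"):
        return None
    if re.fullmatch(r"[0-9]+", parsed.value) is None:
        return None
    return int(parsed.value)
```

Calling `tweet_id("twitter:status=1975532392")` gives the status number as an integer, while an ordinary tag such as `birds` gives `None`.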

Twitter: canonical URLs and Protocols

On Twitter, I’m twitter.com/pigsonthewing, but in my preferred twitter client, Dabr, I’m dabr.co.uk/user/pigsonthewing. We might refer to the former as the “canonical” URL.

There are a number of other web-based Twitter clients, too, and people using them can find my twitter stream, variously, at:

Likewise each of my Twitter posts, or “tweets”, has a URL on each of some of those domains (though not on all, it seems). For example:

Twitter

Dabr

are all the same tweet. We can again regard the first of them, on twitter.com, as canonical.

Anyone using one of those services, and who wants to link to my profile or one of my tweets will either post the URL as it appears in their service, which isn’t much use to people not using that service, or expend time and effort translating the URL into the generic, canonical, Twitter format — which even then may not be of much use to someone using something else.

In the short term, we could do with some recognition of this fact from the above services, which might provide a link to the “standard” or canonical URL for that tweet; and when doing so on an individual page, should link to it using rel="alternate" and/or rel="canonical".
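To make that concrete, here is a minimal Python sketch of what a client such as Dabr might do for profile pages. It only handles the two profile URL patterns mentioned above; the function names and the choice of an https:// prefix are assumptions made for the example, and tweet URLs are left out because their patterns on each service are not given here.

```python
from html import escape
from typing import Optional
from urllib.parse import urlsplit


def canonical_profile_url(client_url: str) -> Optional[str]:
    """Map a known client profile URL to its twitter.com equivalent."""
    # Accept URLs written without a scheme, as in the text above.
    if "//" not in client_url:
        client_url = "//" + client_url
    parts = urlsplit(client_url)
    host = parts.hostname or ""
    segments = [s for s in parts.path.split("/") if s]

    # Dabr profile: dabr.co.uk/user/<name>
    if host in ("dabr.co.uk", "www.dabr.co.uk"):
        if len(segments) == 2 and segments[0] == "user":
            return "https://twitter.com/" + segments[1]
    # Already canonical: twitter.com/<name>
    if host in ("twitter.com", "www.twitter.com") and len(segments) == 1:
        return "https://twitter.com/" + segments[0]
    return None


def canonical_link_element(client_url: str) -> Optional[str]:
    """Build the <link rel="canonical"> element for a client page."""
    canonical = canonical_profile_url(client_url)
    if canonical is None:
        return None
    return '<link rel="canonical" href="{}">'.format(escape(canonical, quote=True))
```

For `dabr.co.uk/user/pigsonthewing` this produces a link element pointing at `https://twitter.com/pigsonthewing`; any URL it does not recognise yields `None` rather than a guess.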

Better still, there could be browser tools (such as a Firefox plug-in or Greasemonkey script) to do that task, automagically.

Ultimately, though, as Twitter becomes ever more widespread, perhaps we need a pair of protocols for linking to Twitter profiles and posts. Using this, authors would be able to mark up links to me and my comments on Twitter as, say:

<a href="twitter:pigsonthewing">Andy Mabbett</a> said <a href="twitterpost:1827840116">something witty</a>.

Then, each reader could set their computer to open those links in their choice of browser-based or desktop/mobile phone client. The setting to do so could even be changed in the installation package for such tools, to aid non-technical users.
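A handler for such links could be very small. The Python sketch below assumes the two proposed schemes, `twitter:` and `twitterpost:`, and a per-client table of URL templates. The profile templates for twitter.com and Dabr follow the URLs given earlier; the "example" client and its post template are purely hypothetical, included only to show how a post link would resolve.

```python
from typing import Dict, Optional
from urllib.parse import quote

# Per-client URL templates, keyed by proposed scheme.
# "example" is a hypothetical client; its URL patterns are invented.
CLIENT_TEMPLATES: Dict[str, Dict[str, str]] = {
    "twitter.com": {"twitter": "https://twitter.com/{value}"},
    "dabr": {"twitter": "https://dabr.co.uk/user/{value}"},
    "example": {
        "twitter": "https://client.example/user/{value}",
        "twitterpost": "https://client.example/status/{value}",
    },
}


def resolve(link: str, client: str) -> Optional[str]:
    """Turn a twitter: or twitterpost: link into a URL for the chosen client.

    Returns None when the link is malformed, or when the client has no
    template for that scheme.
    """
    scheme, separator, value = link.partition(":")
    if not separator or not value:
        return None
    template = CLIENT_TEMPLATES.get(client, {}).get(scheme)
    if template is None:
        return None
    # Percent-encode the value so it cannot alter the URL structure.
    return template.format(value=quote(value, safe=""))
```

With this table, `resolve("twitter:pigsonthewing", "dabr")` points at my Dabr profile, and switching the reader's preferred client is just a matter of passing a different key.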

Footnote: if you know of another URL for my Twitter stream, please let me know!

Marking up the scientific names of living things

As any web manager worth their salt knows, it’s <span lang="fr">très important</span> that changes in language be marked up with HTML’s “lang” attribute, using an IETF language tag (such as “fr” for French, as shown above). This allows software like text readers for blind people to pronounce them correctly (instead of sounding like an outtake from ‘Allo ‘Allo!) and means that translation software can handle them appropriately.

But what happens when a page like this one includes the scientific (or taxonomic) name of a living thing, such as Circus cyaneus (the Hen Harrier)? It’s not English, and should not be translated into, say, German, as Zirkus cyaneus.

It’s not really Latin, either, though some people mistakenly refer to scientific names as “Latin names”. Many of them are neologisms: new words, with no real Latin content, but based on Latinised Greek (for example Brachypelma albopilosum), people’s names (Ardeola grayii, in honour of John Edward Gray, a biologist), place names (Nepenthes sumatrana, from Sumatra), culture (Ba humbugi, a quote from Charles Dickens’ ‘A Christmas Carol’) or even humour (Phthiria relativitae, a play on “The Theory of Relativity”).

Back in 2003, on the IETF mailing list which discusses such language codes, I proposed that there should be a specific language code, or sub-code, so that scientific names such as these could be marked up and recognised by software. There wasn’t much interest (possibly because I made the proposal as an amateur, rather than a professional or academic taxonomist), and distractions in my work and domestic life meant that I didn’t, unfortunately, have time to pursue the matter.

However, the need for such a code has now been recognised by Gregor Hagedorn, of the Julius Kuehn Institute, Germany’s Federal Research Centre for Cultivated Plants, in Berlin, who has rekindled my proposal.

With the support of Gregor and other taxonomists, via the Taxacom mailing list, I’m hopeful we can at last make a case that such a code is needed.

hAccessibility: BBC drop hCalendar microformat

Almost two years after I first raised the issue (to a reaction from the cabal that runs the microformats “community” which began with denial and moved to hostility) the BBC have stopped using the hCalendar microformat due to accessibility concerns.

Maybe now something can be done to incorporate one of the several, more accessible proposed work-arounds, into the relevant standards?

Thanks to Bruce Lawson and Patrick Lauke for breaking the news.

Update: Patrick now has a post on the subject, at webstandards.org.

Unknown Beethoven symphony discovered!

I heard a new (to me) piece of music the other evening. It was on ClassicFM’s rather lovely ‘The Full Works’, the late evening show which plays whole pieces, rather than the shorter snippets featured during the day. The piece was clearly (to my admittedly untutored ears) Beethoven, and symphonic, but, familiar as I am with Beethoven’s symphonies, I’d never heard it before, and couldn’t place it. The use of horns was typically Beethovian, the woodwind was very Beethovian, the strings were quite Beethovian, and the structure of the piece itself was absolutely Beethovian. No doubt about it, it was a Beethovian piece. But what was it?

As soon as I could, I pulled the car over and parked at the side of the road, whipped out my trusty Nokia N95, and used ClassicFM’s useful, if appallingly inaccessible and not really mobile-friendly, on-line playlist to check what it was. And it wasn’t Beethoven at all. To my surprise, it was Georges Bizet’s Symphony in C Major. Remarkably, it was written as a student exercise in 1855, when he was just 17, and lay forgotten and unperformed until 1935. You’d never tell, if you heard this impressive work.

Well worth seeking out, I reckon. Especially if you like Beethoven.

hAccessibility – Unhappy First Birthday

It’s one year today since Bruce Lawson and James Craig published “hAccessibility”, about the misuse of the ‘abbr’ element in microformats (an issue I first raised on 20 September 2006 in Accessify Forums).

As recent events show, the microformats cabal still has its collective head up its own^W^W^W in the sand.

Despite suggestions for a workaround, a solution seems no nearer, thanks to their apparent indifference. Shame on them.