Category Archives: web standards

My Open Data Challenge to UK Local Government: a Wikipedia Page for Every Council

At yesterday’s excellent West Midlands “Open Data: Challenges & Opportunities” event, hosted by the West Midlands Regional Observatory, Chris Taggart (@countculture), who runs the very useful Openly Local website, aggregating data about councils and their elected members, mentioned the problems he has extracting linked data about councils from Wikipedia, via DBpedia, because Wikipedia tends to conflate places with their local authorities.

See, for example, the Wikipedia article on the Metropolitan Borough of Dudley; or those on Lichfield, which (at the time of writing) has only a small section on its town council and Lichfield district (so a challenge there for Stuart Harrison, @pezholio, and his colleagues!); and compare them with the separate articles about Birmingham and Birmingham City Council; or Walsall, the Metropolitan Borough of Walsall, and Walsall Metropolitan Borough Council. The former, all-on-one-page, pattern is far more common. (Disclosure: I created some, and have edited all, of those articles.)

I suggested at the event that this problem could be solved if staff from each UK council simply started a Wikipedia article about their council, where none already exists.

As each UK council is, inherently, (to use the Wikipedia jargon) notable, there should be no issue with this, provided that they are mindful of Wikipedia’s policy on conflicts of interest (which explicitly allows for such editing), and the requirement that articles maintain a neutral point-of-view, and be referenced. Short “stub” articles can be created in the first instance.

(If council staff are hesitant to do so themselves, then I can help to pair them up with volunteer Wikipedia editors who will assist them, or create articles directly.)

Update: Added Dudley & Lichfield district examples.

Manu Sporny recommends me on LinkedIn

0000-0001-5882-6823

I hope you will forgive me for immodestly repeating Manu Sporny’s kind and fulsome recommendation of me, from my LinkedIn profile, for the benefit of those of you who don’t have accounts there:

I had worked with Andy in the Microformats community, developing international standards for the Web. During this time Andy not only excelled at providing technical feedback and review, but led several bold initiatives to standardize the classification of planetary-geo-location and living species on the web. While a logically consistent and wise technical contributor, his influence on the direction of the community was also vital. Andy’s role in questioning and influencing the core philosophy and community process was and continues to be deeply appreciated.

I’m genuinely touched by that. Thank you, Manu!

Manu Sporny is CEO of Digital Bazaar.

Machine Tagging Flickr

Triple-tag references to Twitter posts

Further to my post about a protocol for Twitter posts, you can also triple-tag blog posts, Flickr images and similar web utterances, which refer to a specific twitter post (or status) like this: twitter:status=1975532392 – and this post is tagged with that!

[Update: See also my Flickr screenshot of a Twitter post, triple tagged with #twitter:status=1828036334 to reference the same post.]
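For anyone wanting to consume such tags, here is a minimal sketch in Python of splitting a triple tag into its namespace, predicate and value. The function name and the exact pattern are my own assumptions for illustration, not part of any published machine-tag specification; the optional leading "#" is there only because the Flickr example above is written that way.

```python
import re

# namespace:predicate=value, optionally preceded by "#".
# The character classes are an assumption, not taken from a specification.
TRIPLE_TAG = re.compile(
    r"^#?(?P<namespace>[a-z][a-z0-9_]*):(?P<predicate>[a-z][a-z0-9_]*)=(?P<value>.+)$",
    re.IGNORECASE,
)


def parse_triple_tag(tag):
    """Return (namespace, predicate, value), or None if tag is not a triple tag."""
    match = TRIPLE_TAG.match(tag.strip())
    if match is None:
        return None
    return (
        match.group("namespace").lower(),
        match.group("predicate").lower(),
        match.group("value"),
    )


if __name__ == "__main__":
    print(parse_triple_tag("twitter:status=1975532392"))
    print(parse_triple_tag("#twitter:status=1828036334"))
    print(parse_triple_tag("birmingham"))  # an ordinary tag, so None
```

Ordinary tags simply return None, so a script can run this over every tag on a post and keep only the ones that reference a tweet.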

Twitter: canonical URLs and Protocols

On Twitter, I’m twitter.com/pigsonthewing, but in my preferred twitter client, Dabr, I’m dabr.co.uk/user/pigsonthewing. We might refer to the former as the “canonical” URL.

There are a number of other web-based Twitter clients, too, and people using them can find my twitter stream, variously, at:

Likewise each of my Twitter posts, or “tweets”, has a URL on each of some of those domains (though not on all, it seems). For example:

are all the same tweet. We can again regard the first of them, on twitter.com, as canonical.

Anyone using one of those services, and who wants to link to my profile or one of my tweets will either post the URL as it appears in their service, which isn’t much use to people not using that service, or expend time and effort translating the URL into the generic, canonical, Twitter format — which even then may not be of much use to someone using something else.

In the short term, we could do with some recognition of this fact from the above services, which might provide a link to the “standard” or canonical URL for that tweet; and when doing so on an individual page, should link to it using rel="alternate" and/or rel="canonical".

Better still, there could be browser tools (such as a Firefox plug-in or Greasemonkey script) to do that task, automagically.
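The translation step itself is simple. As a rough sketch, in Python rather than as a browser script, the function below maps a profile URL from Dabr to the twitter.com form. Only the two profile patterns mentioned above are handled; the function name, the "https://" scheme on the result and the treatment of "www." hosts are my assumptions, and tweet URLs are left out because I have not verified their patterns on each service.

```python
from urllib.parse import urlsplit


def canonical_profile_url(url):
    """Return the twitter.com profile URL for a known client URL, else None."""
    if "://" not in url:
        url = "http://" + url  # allow bare forms such as "dabr.co.uk/user/name"
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    segments = [segment for segment in parts.path.split("/") if segment]

    # twitter.com/<name> is already canonical.
    if host in ("twitter.com", "www.twitter.com") and len(segments) == 1:
        return "https://twitter.com/" + segments[0]

    # dabr.co.uk/user/<name> is the Dabr profile pattern quoted in this post.
    if host in ("dabr.co.uk", "www.dabr.co.uk") and len(segments) == 2 and segments[0] == "user":
        return "https://twitter.com/" + segments[1]

    return None  # unrecognised service or page type


if __name__ == "__main__":
    print(canonical_profile_url("dabr.co.uk/user/pigsonthewing"))
```

Further services could be added as extra branches once their URL patterns are known.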

Ultimately, though, as Twitter becomes ever more widespread, perhaps we need a pair of protocols for linking to Twitter profiles and posts. Using this, authors would be able to mark up links to me and my comments on Twitter as, say:

<a href="twitter:pigsonthewing">Andy Mabbett</a> said <a href="twitterpost:1827840116">something witty</a>.

Then, each reader could set their computer to open those links in their choice of browser-based or desk-top/mobile phone client. The setting to do so could even be changed in the installation package for such tools, to aid non-technical users.
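To make the idea concrete, here is a minimal sketch of how a handler for the two proposed schemes might turn a link into a URL for the reader's chosen client. Everything in it is hypothetical: the schemes are only a proposal, the function and variable names are mine, and the tweet template uses a placeholder "client.example" host because I have not checked the real pattern. Only the Dabr profile pattern comes from this post.

```python
from urllib.parse import quote

# Per-client URL templates, keyed by the proposed scheme name.
# The "twitterpost" template is a placeholder, not a real service URL.
DABR_TEMPLATES = {
    "twitter": "https://dabr.co.uk/user/{value}",
    "twitterpost": "https://client.example/status/{value}",
}


def resolve_twitter_link(link, templates):
    """Expand a 'twitter:name' or 'twitterpost:id' link using the given templates."""
    scheme, separator, value = link.partition(":")
    if not separator or not value:
        raise ValueError("not a scheme:value link: %r" % link)
    template = templates.get(scheme.lower())
    if template is None:
        raise ValueError("unsupported scheme: %r" % scheme)
    # Percent-encode the value so it cannot alter the structure of the URL.
    return template.format(value=quote(value, safe=""))


if __name__ == "__main__":
    print(resolve_twitter_link("twitter:pigsonthewing", DABR_TEMPLATES))
    print(resolve_twitter_link("twitterpost:1827840116", DABR_TEMPLATES))
```

Swapping in a different dictionary of templates is all it would take to send the same links to another client, which is the point of the proposal.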

Footnote: if you know of another URL for my Twitter stream, please let me know!

Dates and coordinates in HTML5

I’m grateful to Bruce Lawson of Opera for alerting me to discussion of the <time> element on the HTML5 mailing list (where I’ve posted a copy of this blog post) and encouraging me to participate; and indebted to him for the engaging discussions which have led me to the ideas expressed below. So please blame him if you don’t like what I have to say 😉

I’ve read up on what prior discussion I can find on that mailing list; but may have missed some. I’ll be happy to have anything I’ve overlooked pointed out to me.

I have considerable experience of marking up dates in microformats, both for forthcoming events on the West Midland Bird Club’s diary pages; and for historic events, on Wikipedia and Wikimedia Commons.

I’ve been a staunch and early critic of the accessibility problems caused by abusing the <abbr> element for things like machine-readable dates (as has Bruce). The HTML5 time element has the potential to resolve that problem, but only if it caters for all the cases in which microformats are — or could potentially be — used.

It seems to me that there are several outstanding, and overlapping, issues for <time> in HTML5, which include use-cases, imprecise dates, Gregorian vs. non-Gregorian dates and BCE (aka “BC”) dates. First, though, I should like to make the observation that, while hCalendar microformats are most commonly used to allow event details to be added to calendar apps, and that use case drove their development, they should not be seen simply as a tool to that end. I see them, and hope that others do, as a way of adding semantic meaning to mark-up; and that’s how I view the “time” element, too. Once we indicate that the semantic meaning of a string of text is a date, it’s up to other people to decide what they use that for: “let a thousand flowers bloom”, as the adage goes.

Use-cases for machine-readable date mark-up are many. As well as the aforesaid calendar interactions, they can be used for sorting, and for searching (“find me all the pages about events in 1923”); recent developments in Yahoo’s YQL searching API, which now supports searching for microformats, have opened up a whole new set of possibilities, which are only just beginning to be explored. They can be mapped visually on a “SIMILE” or similar time-line. They can be translated into other languages more effectively than raw prose; they can be disambiguated (does “5/6/09” mean “5th June 2009” or “6th May 2009”?); and they can be presented in the user’s preferred format (I might want to see “5th June 2009”; you might see “June 5, 2009”; such presentational preferences have generated arguments of little-endian proportions on Wikipedia).

hCalendar microformats are already used to mark up imprecise dates (“June 1977”; “2009”). ISO8601 already supports them. Why not HTML5? Though care needs to be taken, it’s even possible to mark up words like “today” with a precise date, if that’s generated real-time, server-side.
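As an illustration of how little is needed to handle such values, here is a minimal Python sketch that accepts the year-only, year-and-month and full-date forms and reports the precision it found. The function name and the returned tuple are my own choices, and it covers only the hyphenated calendar-date forms, not the whole of ISO 8601.

```python
import re
from datetime import date

# YYYY, YYYY-MM or YYYY-MM-DD (hyphenated calendar dates only).
PARTIAL_DATE = re.compile(r"^(?P<year>\d{4})(?:-(?P<month>\d{2})(?:-(?P<day>\d{2}))?)?$")


def parse_partial_date(value):
    """Return (year, month, day, precision); missing parts are None."""
    match = PARTIAL_DATE.match(value)
    if match is None:
        raise ValueError("unsupported date form: %r" % value)
    year = int(match.group("year"))
    month = int(match.group("month")) if match.group("month") else None
    day = int(match.group("day")) if match.group("day") else None
    if month is not None and not 1 <= month <= 12:
        raise ValueError("month out of range: %r" % value)
    if day is not None:
        date(year, month, day)  # raises ValueError for impossible dates
    precision = "day" if day is not None else "month" if month is not None else "year"
    return year, month, day, precision


if __name__ == "__main__":
    for text in ("1977-06", "2009", "2009-06-05"):
        print(text, parse_partial_date(text))
```

A consumer that only wants full dates can check the precision and ignore the rest; one that sorts or searches by year can use all three forms.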

The issue of non-Gregorian (chiefly Julian) dates is a vexing one; and has already caused problems on Wikipedia. So far as I am aware, there is no ISO-, RFC- or similar standard for such dates, other than converting them to Gregorian dates. It is not the job of the HTML5 working group to solve this problem; but I think the group should recognise that at some point a solution must be forthcoming. One way to do so would be to allow something like:

<time schema="[schema-name]" datetime="[value]">[date in plain text]</time>

where the schema defaults to ISO 8601 if not stated, and the whole element is treated as simply:

[date in plain text]

if the schema is unrecognised; thereby ensuring backwards compatibility. That way, if a hypothetical ISO- or other standard for Julian dates emerges in the future, authors may simply start to use it without any revision to HTML 5 being required.
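The fallback rule can be sketched in a few lines of Python. This is only an illustration of the behaviour I am proposing, not anything in the HTML5 draft: the "schema" attribute itself is hypothetical, as are the schema names "iso8601" and "julian" and the function name.

```python
from datetime import date

# Parsers for the schemas this consumer understands. Only ISO 8601 full
# dates are handled here; the key "iso8601" is an assumed schema name.
KNOWN_SCHEMAS = {
    "iso8601": date.fromisoformat,
}


def interpret_time(text, datetime_value=None, schema="iso8601"):
    """Return (text, parsed_date); parsed_date is None when we must fall back."""
    parser = KNOWN_SCHEMAS.get(schema.lower())
    if parser is None or datetime_value is None:
        # Unrecognised schema: treat the element as its plain text.
        return text, None
    try:
        return text, parser(datetime_value)
    except ValueError:
        # Recognised schema but unparseable value: also fall back to text.
        return text, None


if __name__ == "__main__":
    print(interpret_time("5th June 2009", "2009-06-05"))
    print(interpret_time("25 December 1066", "1066-12-25", schema="julian"))
```

The visible text always survives, so a consumer written today loses nothing when it meets a schema invented tomorrow.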

As for BCE dates, they’re already allowed in ISO 8601 (since there was no year 0, the year 3 BCE is given as -0002 in ISO 8601). I see no reason why they should be disallowed in <time> elements in HTML5. We wouldn’t, to take an extreme example, say that “<P>” can be used for paragraphs in English but not French; or paragraphs about literature but not music, so why make an effectively arbitrary limit on the dates which can be marked up semantically? Surely the use case for marking-up a sortable table of Roman emperors, should allow all such emperors, and not just those who ruled from 0001AD, to be included?
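The arithmetic behind that offset is a single subtraction. A minimal Python sketch, with a function name of my own choosing, and assuming a four-digit, zero-padded year with a leading minus sign for negative values:

```python
def bce_to_iso_year(bce_year):
    """Convert a BCE year (1 = 1 BCE) to an ISO 8601 style year string."""
    if bce_year < 1:
        raise ValueError("BCE years start at 1")
    astronomical_year = 1 - bce_year  # 1 BCE -> 0, 2 BCE -> -1, 3 BCE -> -2
    sign = "-" if astronomical_year < 0 else ""
    return "%s%04d" % (sign, abs(astronomical_year))


if __name__ == "__main__":
    for year in (1, 3, 44):
        print(year, "BCE ->", bce_to_iso_year(year))
```

A sortable table of emperors then needs nothing special: the signed years sort numerically straight through from BCE to CE.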

Coordinates

Another abuse of ABBR in microformats is for coordinates:

<abbr class="geo" title="52.548;-1.932">Great Barr</abbr>

Bruce and I agree that this could be resolved, and HTML5 usefully extended, by a “location” element:

<location latitude="52.548" longitude="-1.932">Great Barr</location>
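Converting existing mark-up would be mechanical. As a rough sketch in Python, the functions below split the "latitude;longitude" value used in the abbr title and rebuild it as the proposed element. The "location" element is only our proposal, and the function names are mine.

```python
from html import escape


def parse_geo_title(title):
    """Split a 'lat;long' title value into a (latitude, longitude) pair of floats."""
    lat_text, separator, lon_text = title.partition(";")
    if not separator:
        raise ValueError("expected 'latitude;longitude': %r" % title)
    latitude, longitude = float(lat_text), float(lon_text)
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError("coordinates out of range: %r" % title)
    return latitude, longitude


def to_location_element(title, name):
    """Build the proposed (hypothetical) <location> element from an abbr title."""
    parse_geo_title(title)  # validate before emitting any mark-up
    lat_text, _, lon_text = title.partition(";")
    return '<location latitude="%s" longitude="%s">%s</location>' % (
        escape(lat_text.strip()), escape(lon_text.strip()), escape(name))


if __name__ == "__main__":
    print(parse_geo_title("52.548;-1.932"))
    print(to_location_element("52.548;-1.932", "Great Barr"))
```

The original coordinate strings are reused in the output, rather than the parsed floats, so that no precision is gained or lost in the conversion.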

Using the “schema” attribute shown above, for non-Gregorian dates, we can extend that for Martian, Lunar (and eventually other bodies):

<location schema="moon" latitude="52.548" longitude="23.47297">Sea of Tranquility</location>

and for non-WGS84 coordinates, in a manner similar to that I described in my proposals to extend the related Geo microformat.

Now all we need to do is to work around the abuse of ABBR for durations, in the draft hAudio microformat:

<abbr title="PT3M23S">3 minutes 23 seconds</abbr>
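For reference, the machine-readable half of that example is an ISO 8601 duration. A minimal Python sketch of reading it follows; it handles only whole hours, minutes and seconds in the "PT…" form, which is enough for track lengths but is not the full ISO 8601 duration syntax, and the function name is my own.

```python
import re
from datetime import timedelta

# PT, then optional hours, minutes and seconds, each a whole number.
DURATION = re.compile(r"^PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?$")


def parse_duration(value):
    """Convert a 'PT3M23S' style duration into a timedelta."""
    match = DURATION.match(value)
    if match is None or not any(match.groupdict().values()):
        raise ValueError("unsupported duration: %r" % value)
    # Keep only the components that were present and pass them by name.
    parts = {name: int(text) for name, text in match.groupdict().items() if text is not None}
    return timedelta(**parts)


if __name__ == "__main__":
    length = parse_duration("PT3M23S")
    print(length, int(length.total_seconds()), "seconds")
```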

Marking up the scientific names of living things

hAccessibility: BBC drop hCalendar microformat

Almost two years after I first raised the issue (to a reaction from the cabal that runs the microformats “community” which began with denial and moved to hostility) the BBC have stopped using the hCalendar microformat due to accessibility concerns.

Maybe now something can be done to incorporate one of the several, more accessible proposed work-arounds, into the relevant standards?

Thanks to Bruce Lawson and Patrick Lauke for breaking the news.

Update: Patrick now has a post on the subject, at webstandards.org

Unknown Beethoven symphony discovered!

I heard a new (to me) piece of music the other evening. It was on ClassicFM’s rather lovely “The Full Works”, the late evening show which plays whole pieces, rather than the shorter snippets featured during the day. The piece was clearly (to my admittedly untutored ears) Beethoven, and symphonic, but, familiar as I am with Beethoven’s symphonies, I’d never heard it before, and couldn’t place it. The use of horns was typically Beethovian, the woodwind was very Beethovian, the strings were quite Beethovian, and the structure of the piece itself was absolutely Beethovian. No doubt about it, it was a Beethovian piece. But what was it?

As soon as I could, I pulled the car over and parked at the side of the road, whipped out my trusty Nokia N95, and used ClassicFM’s useful, if appallingly inaccessible and not really mobile-friendly, on-line playlist to check what it was. And it wasn’t Beethoven at all. To my surprise, it was Georges Bizet’s Symphony in C Major. Remarkably, it was written as a student exercise in 1855, when he was just 17, and lay forgotten and unperformed until it was rediscovered in 1933; it was first performed in 1935. You’d never tell, if you heard this impressive work.

Well worth seeking out, I reckon. Especially if you like Beethoven.

hAccessibility – Unhappy First Birthday

It’s one year today since Bruce Lawson and James Craig published “hAccessibility”, about the misuse of the ‘abbr’ element in microformats (an issue I first raised on 20 September 2006 in Accessify Forums).

As recent events show, the microformats cabal still has its collective head up its own^W^W^W in the sand.

Despite suggestions for a workaround, a solution seems no nearer, thanks to their apparent indifference. Shame on them.

Andy Mabbett, aka pigsonthewing.

Freelance Wikipedia, Wikidata and OpenStreetMap consultant and Wikimedian in Residence, from Birmingham, England.
