Category Archives: ideas

My Open Data Challenge to UK Local Government: a Wikipedia Page for Every Council

At yesterday’s excellent West Midlands “Open Data: Challenges & Opportunities” event, hosted by the West Midlands Regional Observatory, Chris Taggart (@countculture), who runs the very useful Openly Local website, aggregating data about councils and their elected members, mentioned the problems he has extracting linked data about councils from Wikipedia, via DBPedia, because Wikipedia tends to conflate places with their local authorities.

See, for example, the Wikipedia article on the Metropolitan Borough of Dudley; or those on Lichfield, which (at the time of writing) has only a small section on its town council and Lichfield district (so a challenge there for Stuart Harrison, @pezholio, and his colleagues!); and compare them with the separate articles about Birmingham and Birmingham City Council; or Walsall, the Metropolitan Borough of Walsall, and Walsall Metropolitan Borough Council. The former, all-on-one-page, pattern is far more common. (Disclosure: I created some, and have edited all, of those articles.)

I suggested at the event that this problem could be solved if staff from each UK council simply started a Wikipedia article about their council, where none already exists.

As each UK council is, inherently, (to use the Wikipedia jargon) notable, there should be no issue with this, provided that they are mindful of Wikipedia’s policy on conflicts of interest (which explicitly allows for such editing), and the requirement that articles maintain a neutral point-of-view, and be referenced. Short “stub” articles can be created in the first instance.

(If council staff are hesitant to do so themselves, then I can help to pair them up with volunteer Wikipedia editors who will assist them, or create articles directly.)

Update: Added Dudley & Lichfield district examples.

Lists in Microformats: Suggested Optimisation

Based on my extensive experience of applying microformats to templates in Wikipedia (and other MediaWiki installations) I’ve come to the following conclusion…

For properties which can occur more than once (such as nickname or category in hCard), a list that has that property, or that sits inside a container that has it, should be parsed as a list of individual instances of that property.

For example:
<div class="category"> <ul> <li>ornithologist</li> <li>driver</li> <li>gardener</li> </ul> </div>
and:
<ul class="category"> <li>ornithologist</li> <li>driver</li> <li>gardener</li> </ul>
should be treated as equivalent to:
<ul> <li class="category">ornithologist</li> <li class="category">driver</li> <li class="category">gardener</li> </ul>
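
To make the suggestion concrete, here is a minimal sketch in Python of a parser that treats all three forms as equivalent. It is only an illustration, not part of any microformats specification: the names `property_values`, `_Node` and `_TreeBuilder` are hypothetical, and the sketch assumes well-formed markup in which every element has an explicit closing tag.

```python
from html.parser import HTMLParser


class _Node:
    """A very small element tree node (hypothetical helper for this sketch)."""

    def __init__(self, tag, classes, parent=None):
        self.tag = tag
        self.classes = classes
        self.parent = parent
        self.children = []  # child _Node objects and plain text strings

    def text(self):
        parts = [c if isinstance(c, str) else c.text() for c in self.children]
        return "".join(parts).strip()

    def descendants(self, tag):
        for child in self.children:
            if isinstance(child, _Node):
                if child.tag == tag:
                    yield child
                yield from child.descendants(tag)


class _TreeBuilder(HTMLParser):
    """Builds the tree; assumes every start tag has a matching end tag."""

    def __init__(self):
        super().__init__()
        self.root = _Node("root", [])
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        node = _Node(tag, classes, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.children.append(data)


def property_values(markup, prop):
    """Return every value of a repeatable property such as 'category'."""
    builder = _TreeBuilder()
    builder.feed(markup)
    builder.close()
    values = []

    def walk(node):
        for child in node.children:
            if isinstance(child, str):
                continue
            if prop in child.classes:
                items = list(child.descendants("li"))
                if items:
                    # The property is on a list or its container:
                    # each list item counts as one instance.
                    values.extend(item.text() for item in items)
                else:
                    values.append(child.text())
            else:
                walk(child)

    walk(builder.root)
    return values
```

Called with `"category"` on each of the three snippets above, the function returns the same three values, which is the equivalence being proposed.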

Twitter: A microformat in lieu of a protocol

In May of this year I wrote about the problem that the URL for a given Twitter user’s profile, or for an individual post or “status”, differs depending on the Twitter client in use. I suggested a new protocol for Twitter links. [You might want to read that, before the rest of this post]. I can’t believe I didn’t think of this simpler solution sooner!

The answer (in the short term) is to use a microformat (or a microformat-like “poshformat”, if you prefer to call it that) for each case. Let’s say we use the classes twitter-user & twitter-status.

User-agents (that’s jargon for browsers) could then employ a script (such as those used by GreaseMonkey, or a Firefox extension) to ignore the encoded URL and substitute the equivalent for the user’s preferred Twitter client instead.

For links to user profiles:

<a href="http://twitter.com/pigsonthewing"> Andy Mabbett </a>

would become:

<a class="twitter-user" href="http://twitter.com/pigsonthewing"> Andy Mabbett </a>

and:

<a href="http://accessibletwitter.com/app/user.php?uid=pigsonthewing"> Andy Mabbett</a>

would become:

<a class="twitter-user" href="http://accessibletwitter.com/app/user.php?uid=pigsonthewing"> Andy Mabbett</a>

Likewise, for individual statuses:

<a href="http://twitter.com/pigsonthewing/status/1828036334"> something witty</a>

would become:

<a class="twitter-status" href="http://twitter.com/pigsonthewing/status/1828036334"> something witty</a>

and:

<a href="http://accessibletwitter.com/app/status.php?1828036334"> something witty</a>

would become:

<a class="twitter-status" href="http://accessibletwitter.com/app/status.php?1828036334"> something witty</a>

and:

<a href="http://m.slandr.net/single.php?id=1828036334"> something witty</a>

would become:

<a class="twitter-status" href="http://m.slandr.net/single.php?id=1828036334"> something witty</a>

To simplify matters, the rules for extracting the user ID or the status update could be the same in both cases:

Parse the value of the href attribute of the element to which the class applies.
If there is an equals sign, use everything after that.
Otherwise, if there is a question mark, use everything after that.
Otherwise, use everything after the last slash.

That would deal with all the examples in my earlier post.
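
As a rough illustration, the extraction rules might look like this in Python. The function name `extract_id` is hypothetical. The sketch tests for an equals sign before a question mark, because otherwise `user.php?uid=pigsonthewing` would yield `uid=pigsonthewing` rather than the user name; where more than one equals sign is present it assumes the first one is meant.

```python
def extract_id(href):
    """Pull the user name or status ID out of a Twitter-client URL."""
    href = href.strip()
    if "=" in href:
        # e.g. user.php?uid=pigsonthewing or single.php?id=1828036334
        return href.partition("=")[2]
    if "?" in href:
        # e.g. status.php?1828036334
        return href.partition("?")[2]
    # e.g. twitter.com/pigsonthewing or twitter.com/pigsonthewing/status/1828036334
    return href.rsplit("/", 1)[-1]
```

The same function serves both classes, since the rules do not depend on whether the link is a twitter-user or a twitter-status.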

So, if you’re using a user-agent which is aware of this microformat, and find on a page:

<a class="twitter-user" href="http://twitter.com/pigsonthewing"> Andy Mabbett</a> said <a class="twitter-status" href="http://m.slandr.net/single.php?id=1828036334"> something witty</a>

but your preferred Twitter client is Dabr (one I recommend, BTW!) then your browser would treat (and possibly render) that as:

<a href="http://dabr.co.uk/user/pigsonthewing"> Andy Mabbett</a> said <a class="twitter-status" href="http://dabr.co.uk/status/1828036334"> something witty</a>
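
The substitution a user-agent would perform can be sketched in Python as follows. This is an assumption-laden illustration rather than a real browser extension: `PREFERRED`, `extract_id` and `rewrite_links` are hypothetical names, the Dabr URL patterns are taken from the example above, and the regular expression only copes with anchors written exactly as in this post (a single class, placed before a double-quoted href). Unlike the example above, the sketch keeps the class on every rewritten link.

```python
import re

# URL templates for the reader's preferred client (Dabr, per the example above).
PREFERRED = {
    "twitter-user": "http://dabr.co.uk/user/{id}",
    "twitter-status": "http://dabr.co.uk/status/{id}",
}

# Assumes class comes before href and both use double quotes.
ANCHOR = re.compile(
    r'<a\s+class="(twitter-user|twitter-status)"\s+href\s*=\s*"([^"]*)"\s*>'
)


def extract_id(href):
    """Equals sign first, then question mark, then last slash."""
    href = href.strip()
    if "=" in href:
        return href.partition("=")[2]
    if "?" in href:
        return href.partition("?")[2]
    return href.rsplit("/", 1)[-1]


def rewrite_links(markup, templates=PREFERRED):
    """Point every twitter-user / twitter-status link at the preferred client."""

    def swap(match):
        cls, href = match.group(1), match.group(2)
        new_href = templates[cls].format(id=extract_id(href))
        return f'<a class="{cls}" href="{new_href}">'

    return ANCHOR.sub(swap, markup)
```

A real implementation would walk the DOM rather than use a regular expression, but the mapping from class and extracted ID to a client-specific URL would be the same.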

Simples!

Triple-tag references to Twitter posts

Further to my post about a protocol for Twitter posts, you can also triple-tag blog posts, Flickr images and similar web utterances, which refer to a specific Twitter post (or status) like this: twitter:status=1975532392 – and this post is tagged with that!

[Update: See also my Flickr screenshot of a Twitter post, triple tagged with #twitter:status=1828036334 to reference the same post.]
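
For anyone wanting to consume such tags, a minimal Python sketch of reading a triple tag might look like this. The names `TripleTag`, `parse_triple_tag` and `twitter_status_id` are hypothetical, and the pattern assumes the usual namespace:predicate=value shape, with an optional leading hash as in the Flickr example.

```python
import re
from typing import NamedTuple, Optional


class TripleTag(NamedTuple):
    namespace: str
    predicate: str
    value: str


# namespace:predicate=value, optionally prefixed with '#'
_PATTERN = re.compile(r"^#?([A-Za-z][\w-]*):([A-Za-z][\w-]*)=(.+)$")


def parse_triple_tag(tag: str) -> Optional[TripleTag]:
    """Return the parts of a triple tag, or None for an ordinary tag."""
    match = _PATTERN.match(tag.strip())
    return TripleTag(*match.groups()) if match else None


def twitter_status_id(tags):
    """Return the first twitter:status value found in a list of tags, if any."""
    for tag in tags:
        triple = parse_triple_tag(tag)
        if triple and triple.namespace == "twitter" and triple.predicate == "status":
            return triple.value
    return None
```

A site that aggregates tagged content could use the returned ID to link each blog post or image back to the status it refers to.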

How microformat developments are blocked

The hCard microformat can distinguish between a person and an organisation, by the use of the org property:

<div class="vcard"> <span class="fn">Andy Mabbett</span> </div>

<div class="vcard"> <span class="fn org">The Red Cross</span> </div>

but it cannot distinguish between an organisation and a place:

<div class="vcard"> <span class="fn org">The Wembley Stadium fan club</span> </div>

<div class="vcard"> <span class="fn org">Wembley Stadium</span> </div>

treating them both as organisations.
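
The limitation can be seen in a minimal Python sketch of how a parser might make the person/organisation distinction. The class `HCardClassifier` and the function `classify_hcards` are hypothetical, and the sketch makes two simplifying assumptions: it only checks whether fn and org sit on the same element, and it expects every element to have an explicit closing tag.

```python
from html.parser import HTMLParser


class HCardClassifier(HTMLParser):
    """Collects (name, kind) for each hCard found in the markup."""

    def __init__(self):
        super().__init__()
        self.cards = []
        self._depth = 0          # nesting depth inside the current vcard; 0 = outside
        self._fn_depth = None    # depth at which the fn element was opened
        self._capturing = False  # True while inside the fn element
        self._name = []
        self._kind = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
            if "fn" in classes and self._kind is None:
                self._fn_depth = self._depth
                self._capturing = True
                # fn and org on the same element means an organisation.
                self._kind = "organisation" if "org" in classes else "person"
        elif "vcard" in classes:
            self._depth = 1
            self._name, self._kind = [], None
            self._fn_depth, self._capturing = None, False

    def handle_endtag(self, tag):
        if not self._depth:
            return
        if self._capturing and self._depth == self._fn_depth:
            self._capturing = False
        self._depth -= 1
        if self._depth == 0:
            self.cards.append(("".join(self._name).strip(), self._kind))

    def handle_data(self, data):
        if self._capturing:
            self._name.append(data)


def classify_hcards(markup):
    parser = HCardClassifier()
    parser.feed(markup)
    parser.close()
    return parser.cards
```

Fed the four hCards above, the sketch reports a person, then three organisations: nothing in the markup lets it tell the stadium from its fan club.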

On 31 December 2007, I described a way in which the hCard microformat could be used to differentiate between hCards for places and organisations.

On 9 January 2008, having received favourable comment, I made a formal proposal to update the hCard specification.

Despite this ten-day gap, Brian Suda, one of the microformats “admins”, the cabal who control microformats, complained that he’d only had two days to consider the matter, and that “More time is needed to fully look over the implications of this change.”

No objections to the method, nor issues with it, have been raised.

Toby Inkster’s superb microformats parser Swignition (formerly called “Cognition”) has supported the method since version 0.1-alpha8, released in May 2008.

One year on from my formal proposal, what changes have been made to the hCard specification, in this regard? None.

Update: Three years on from my formal proposal, what changes have been made to the hCard specification, in this regard? None.

Marking up the scientific names of living things

Facebook should allow groups to be rationalised

Suggested method of publishing microformats in Twitter posts

Twitter posts like this one:

We’re still deep in the Sundarbans, near Tambulbunia, meeting experts on dolphins and tigers. l:Tambulbunia, Bangladesh=22.27722,89.71905

have a place name and corresponding coordinates (indicated by the prefix “l:”). This has allowed them to be plotted on a map.
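
As an illustration of how a mapping tool might read that suffix, here is a minimal Python sketch. The function name `parse_location` is hypothetical, and the pattern is inferred from the single example above (place name, an equals sign, then latitude and longitude separated by a comma); the real convention may allow other forms.

```python
import re

# l:<place name>=<latitude>,<longitude>  (shape inferred from the one example)
_LOCATION = re.compile(
    r"\bl:(?P<place>[^=]+)=(?P<lat>-?\d+(?:\.\d+)?),\s*(?P<lon>-?\d+(?:\.\d+)?)"
)


def parse_location(tweet):
    """Return (message, place, latitude, longitude), or None if no l: suffix."""
    match = _LOCATION.search(tweet)
    if match is None:
        return None
    message = tweet[: match.start()].strip()
    return (
        message,
        match.group("place").strip(),
        float(match.group("lat")),
        float(match.group("lon")),
    )
```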

It should be possible for the poster to send, say:

We’re still deep in the Sundarbans, near Tambulbunia, meeting experts on dolphins and tigers. #hcard: fn+locality:Tambulbunia: country-name:Bangladesh: geo:22.27722,89.71905

using colons as delimiters and have Twitter render that comment marked up as an hCard.
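
A third-party service could do the conversion along these lines. This Python sketch is only one possible reading of the syntax: it assumes that fields are separated by a colon followed by a space, that a plus sign joins class names sharing one value, and that geo carries "latitude,longitude". The names `parse_hcard_tag` and `render_hcard` are hypothetical.

```python
import html


def parse_hcard_tag(tweet):
    """Split a tweet into its message and a list of (class_names, value) fields."""
    message, marker, rest = tweet.partition("#hcard:")
    if not marker:
        return tweet, []
    fields = []
    for chunk in rest.split(": "):
        names, sep, value = chunk.strip().partition(":")
        if sep and value:
            fields.append((names.split("+"), value.strip()))
    return message.strip(), fields


def render_hcard(fields):
    """Render parsed fields as hCard markup."""
    parts = []
    for names, value in fields:
        if names == ["geo"] and "," in value:
            # Spell out latitude and longitude as geo sub-properties.
            lat, lon = (v.strip() for v in value.split(",", 1))
            inner = (
                f'<span class="latitude">{html.escape(lat)}</span>, '
                f'<span class="longitude">{html.escape(lon)}</span>'
            )
        else:
            inner = html.escape(value)
        class_attr = html.escape(" ".join(names))
        parts.append(f'<span class="{class_attr}">{inner}</span>')
    return '<span class="vcard">' + " ".join(parts) + "</span>"
```

Values that themselves contain a colon followed by a space would confuse this simple splitter, so a production version would need an escaping rule.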

In the short term, this could be achieved by a third-party site, like #hashtags.

UPDATE: being more mindful of the 140 character limit than I have in the above example, perhaps class names might be abbreviated (“loc” for “locality”, “ctry” for “country-name”, and so on).

Andy Mabbett, aka pigsonthewing.

Freelance Wikipedia, Wikidata and OpenStreetMap consultant and Wikimedian in Residence, from Birmingham, England.
