Twitter and semantic microblogging

“Life happens while you’re making other plans.”

The same may be said about the semantic web. While waiting for mainstream adoption of proper semantic web standards — such as RDF metadata embedded in web pages — the drive for organized structure and meaning in the great sea of web content is being addressed in perhaps the most unexpected of places: Twitter.

Technology adoption, like water, follows the path of least resistance.

The explosion of hashtags on Twitter (e.g., #iphone, #superbowl, #davos) has evolved into an interesting semantic phenomenon. First popularized on Twitter by Nate Ritter during the San Diego forest fires in 2007, hashtags became a way for users to track tweets on a particular subject. Combined with the many Twitter search and tracking services out there, such as Twemes, hashtags let people quickly find streams of conversation on topics of interest to them.

An increasingly common usage pattern is for users to post a URL (usually a short URL alias) along with one to three hashtags and maybe a few words of annotation, which amounts to creating an independent metadata layer over the web.
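To make the "metadata layer" idea concrete, here is a minimal sketch of how such a tweet could be split into the link it describes, its tags, and the free-text annotation. The sample tweet, the example.com URL, and the names `annotate` and `TweetAnnotation` are all made up for illustration; nothing here reflects how Twitter or any tracking service actually parses tweets.

```python
import re
from dataclasses import dataclass
from typing import List

URL_RE = re.compile(r"https?://\S+")
# A hashtag is "#" followed by word characters, not glued to a preceding word.
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")


@dataclass
class TweetAnnotation:
    urls: List[str]   # the resources being described
    tags: List[str]   # hashtags, lowercased so #RDF and #rdf compare equal
    note: str         # whatever free text is left over


def annotate(tweet: str) -> TweetAnnotation:
    """Split a tweet into URLs, hashtags, and a free-text note."""
    urls = URL_RE.findall(tweet)
    # Strip URLs first so a "#fragment" inside a link is not read as a hashtag.
    without_urls = URL_RE.sub(" ", tweet)
    tags = [tag.lower() for tag in HASHTAG_RE.findall(without_urls)]
    note = " ".join(HASHTAG_RE.sub(" ", without_urls).split())
    return TweetAnnotation(urls=urls, tags=tags, note=note)


# Hypothetical tweet following the pattern described above.
example = "Great primer on linked data http://example.com/abc123 #semweb #RDF"
result = annotate(example)
print(result)
```

Each parsed tweet is effectively a small record of the form (URL, tags, note), stored outside the page it describes. Note that this sketch does nothing about short URLs, so two different aliases for the same page would still look like two different resources.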

Twitter vs. Bookmarking

Now, I know, there are a ton of issues with this. As with tags on social bookmarking services and flickr, you end up with conflicting meanings for the same tag, duplicate tags for the same topic, and often a certain amount of ambiguity. And because source URLs are rarely posted in their original form, but rather as a variety of shortened aliases for the same page, there’s an extra layer of indirection muddying the water.

But there are a few important differences between Twitter and the social bookmarking services that preceded it, especially in its incentives for tag convergence:

Followers as an incentive. Followers — the number of people subscribed to your tweet stream — are a primary currency of Twitter. You gain new followers by posting tweets that appeal to others who share the same affinities. Using hashtags, and more importantly the right hashtags, helps people connect. This creates an incentive for communities to converge upon the same tag for something as quickly as possible.

Parsimony as an incentive. The 140-character limit of tweets, which has to cover any posted URL plus some spare room to permit easy re-tweeting, doesn’t leave room for the tag spam that afflicts most bookmarking services. You want to pick one or two of the right tags, and you can’t afford duplicates, which would also annoy human readers (hurting your number of followers). Again, this encourages tag convergence.

Real-time feedback. On bookmarking services and flickr, you rarely get feedback about your tags. On Twitter, however, you can get immediate reaction from the community, either positive reinforcement or negative signals for course correction. This can be quantitative, such as number of followers, replies, and re-tweets, but also qualitative commentary. This too is an incentive for tag convergence.
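The parsimony point above is easy to see with a little arithmetic. The sketch below estimates how many characters are left for a note once the URL, the hashtags, and some room for a manual re-tweet prefix are accounted for. The "RT @name: " prefix format and the 15-character username allowance are my assumptions, as are the function names and the example.com URL.

```python
from typing import List

TWEET_LIMIT = 140   # character limit cited in the text
MAX_USERNAME = 15   # assumption: longest username to leave room for


def retweet_reserve(username_len: int = MAX_USERNAME) -> int:
    """Characters taken by a manual re-tweet prefix such as 'RT @name: '."""
    return len("RT @") + username_len + len(": ")


def annotation_budget(url: str, tags: List[str],
                      username_len: int = MAX_USERNAME) -> int:
    """Characters left for a free-text note after the URL, tags, and RT room."""
    fixed = " ".join([url] + ["#" + tag for tag in tags])
    # +1 for the space separating the note from the URL and tags.
    return TWEET_LIMIT - retweet_reserve(username_len) - (len(fixed) + 1)


# Hypothetical short link with two tags, then the same link with four.
two_tags = annotation_budget("http://example.com/abc123", ["semweb", "rdf"])
four_tags = annotation_budget(
    "http://example.com/abc123",
    ["semweb", "rdf", "semanticweb", "linkeddata"],
)
print(two_tags, four_tags)
```

Every redundant tag comes straight out of the space available for saying something, which is the pressure toward one or two well-chosen tags described above.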

It’s this social dynamic that takes semantifying the web in an intriguing direction.

The Social Semantic Web

Twitter hashtags seem like a real step towards a social semantic web, what has also been called a socio-semantic web. To be sure, Twitter is still pretty far away from the true semantic web vision. Probably the two biggest complaints are the lack of formal ontologies and the lack of precision for machine-readability. But innovations will emerge to tackle these challenges.

For instance, there has been some experimentation with triple tags on Twitter, akin to the machine tags that were popularized on flickr. For example, #taxonomy:binomial=Alcedo_atthis or #geo:lat=-1.56403 #geo:lon=53.60913, which is tantalizingly similar to RDF notation.
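Part of the appeal of triple tags is how little machinery it takes to read them. Here is a rough sketch that pulls namespace:predicate=value tags out of a tweet. The regular expression, the `Triple` type, and the function names are my own illustration rather than any official grammar for machine tags, and the sample tweet text is invented around the tags quoted above.

```python
import re
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    namespace: str
    predicate: str
    value: str


# "#namespace:predicate=value", starting at the beginning of the tweet or after
# whitespace. Ordinary hashtags such as "#rdf" do not match this pattern.
TRIPLE_TAG_RE = re.compile(r"(?<!\S)#(\w+):(\w+)=(\S+)")


def parse_triple_tags(tweet: str) -> List[Triple]:
    """Return every triple tag found in the tweet, in order of appearance."""
    return [Triple(*m.groups()) for m in TRIPLE_TAG_RE.finditer(tweet)]


def as_properties(triples: List[Triple]) -> Dict[str, str]:
    """Flatten triples into a 'namespace:predicate' -> value mapping.

    Values stay as strings; interpreting them (e.g. as numbers) is left
    to whatever consumes the namespace.
    """
    return {f"{t.namespace}:{t.predicate}": t.value for t in triples}


tweet = ("Spotted a kingfisher #birds "
         "#taxonomy:binomial=Alcedo_atthis #geo:lat=-1.56403 #geo:lon=53.60913")
triples = parse_triple_tags(tweet)
properties = as_properties(triples)
print(properties)
```

What this does not give you is a shared vocabulary: nothing guarantees that two people mean the same thing by a given namespace or predicate, which is where formal ontologies would come in.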

Unfortunately, the downside of triple tags in Twitter is that they make tweets less human-readable, which causes a certain amount of hashtag backlash. Incremental solutions may involve hashtag filtering or other ways of managing the visibility of such metadata on Twitter (albeit with side effects on the incentive structure).

More innovative approaches may be necessary to reach the next level.

Benjamin Nowack has an excellent presentation on semantic microblogging on SlideShare that explores some ideas about “finding the sweet spot between simplicity and added value” in further semantic extensions to this environment.

For a slightly more technical analysis of future semantic microblogging technologies, you may want to take a look at this recent academic paper, Microblogging: A Semantic and Distributed Approach.

It’s hard to predict how hashtags on Twitter will evolve. They might be a fad, or a new wave of enhancements to Twitter (or to third-party services) might elevate them to a whole new level. Either way, they reveal some of the potential for intersecting social media and the semantic web on the road ahead.

