Which gets me to the subject of the Semantic Web. The Semantic Web is about people, and specifically about making people's voices clearly audible and indelible:
- Audible – There are six billion of us on the planet. Some people would have you believe that you should never ask any of us for advice, because we lie. But today I can't even hear your lies. The Internet has made it immensely easier to connect with expertise from other humans who want to share it, but we are still largely shackled by cultural, geographic, social, and technological constraints limiting who we can consult for advice. Today I get most of my lies from whichever barbarians have clawed their way to the top of the local and national media outlets. But sometimes when I see an advertisement for an interesting new product, I want to be able to pick up my remote control and click on “connect me to five people who hate the product and ask them why”. I am sure that there are at least five people who want to give me a perspective different from the one being broadcast, so why can I not hear their voices?
- Indelible – Few people think about the noble role that librarians play. Our ability to collect, organize, and preserve the voices and observations of those who came before us is critical to our continued survival as a species. The story of Babel is a metaphor for what later happened at Alexandria; a reminder that we all suffer when we lose our ability to pass lessons to future generations. It is possible for a single person to memorize the Quran and pass it on to others, but word-of-mouth is not enough to perpetuate the bulk of knowledge that enables the planet to support six billion people today. Without written language and our knowledge stewards, we would have to eliminate many billions of people, because we wouldn't be able to maintain the capabilities that support them all. Again, the Internet has had a profound impact on our ability to preserve our collective memory, but we are still very fragile. A true librarian has vivid memories of Babel and Alexandria (when we also considered ourselves invincible), and lives the motto “never again!”. The first lesson of history (that we must learn and never repeat) is that history lost is humanity lost.
The key point here is that the web, and especially the semantic web, is about capturing and communicating human knowledge. For people who have trouble understanding that “knowledge” is a truth-neutral word, it is fine to say that the semantic web is about capturing and communicating human voices. The web v1.0 was great, but still has many problems. (For example, you would think that the web would do a good job of documenting the history of the web, but the feedback loop created by copy/paste historians virtually erased Eric Bina from the history of the web initially, while elevating Marc Andreessen to the status of a god.) Most of the web's ability to filter voices is still based on information extraction from raw, unstructured text. Innovations like weblogs have made it easier for normal people to communicate their voices to the world, and people continue to evolve the web's ability to filter voices at the same time, as demonstrated by Mark Pilgrim's cool use of cite with trackback. So it is obvious to me that the current web is evolving to become more semantic anyway. In fact, I would argue that people like Dave Winer (who overtly disparages certain semantic web technologies while producing code that gives people voice) have done more to advance the semantic web (the web of renmin voice) than many of the semantic web advocates.
RDF is often a whipping boy, but it is a red herring in this discussion. To know why, you need to understand that RDF is simply a syntax for exchanging knowledge representations, and not even a particularly ambitious or cutting-edge syntax. When you want to represent the quantity “five”, you can use Roman numerals, Arabic numerals, or some other symbolism. When you want to represent the statement “the author of http://www.netcrucible.com/blog/