The vast majority of semantic technologists are directing their efforts to search. It’s an important use of their talents; search is a hard problem worth solving. But it seems to me that we need to take a broader view. Semantics is the stuff of thought, of meaning, of our most personal and deeply held beliefs. A fully realized semantic web will be much more than “better search”. But the future is hard to imagine. We need concrete examples of semantic applications to demonstrate the potential and fuel our imaginations.
So as a glimpse into this future, consider the world’s first mainstream semantic web: Wikipedia. Wikipedia is most often celebrated as the poster child of Web 2.0. As a social application, this is most certainly true. But its output, its content, may be more illustrative of Web 3.0 semantics. Its articles are abstractions: they summarize a huge body of content down to only the essential aspects of each subject. By maintaining organizational standards and templates, it has become machine-readable (derivatives such as DBpedia and many other research projects make it explicitly so). And the subject matter of Wikipedia is clearly the stuff of thought. Semantic representation and encyclopaedic content have a deep and obvious kinship.
These technical attributes put Wikipedia in the semantics camp. Web 2.0 principles may be driving Wikipedia’s prodigious output, but Web 3.0 semantic qualities are driving the phenomenal mainstream consumption of the information. If Wikipedia is truly illustrative of semantic data, then insights into Wikipedia’s consumption may point to killer applications of semantics.
Why do you use Wikipedia? Are you looking for an authoritative viewpoint on a subject or merely trying to clarify your own worldview? Are you feeding your thoughts or trying to digest the thoughts of others? Are you creating your next great idea or investigating whether someone already owns it? These are less problems of search than opportunities for personal and collective thinking.
Wikipedia is one of the top 10 websites in the world. Underneath its veil of neutrality is a seething mass of ideas, discussion, debate, personal and collective thinking. With its millions of articles, it is a giant in the world of content. But as a semantic web, Wikipedia represents merely the tip of the iceberg, a sliver of our personal and collective thoughts. Wikipedia may be the first mainstream semantic web, but it’s also a very small semantic web! As more of our thoughts and thinking migrate online, the world of semantics will dwarf the world of content. Wikipedia provides a glimpse into a bright future that isn’t about a “better search”, just semantics on its own terms.
Thanks to the team at Primal Fusion for their thoughts and contributions to this post.
Great article!!!
The stumbling block in developing the semantic web is that the meta-tags (labels put on text so that computers can decide whether something can be used as an “address” for example) have been developed in isolation from real problem solving.
Wikipedia is an example of the opposing process: it was designed to solve a real problem, which is to support collaborative development of reference material. Along the way, they dealt with navigation and cross-referencing problems.
I think that we’ll see more success in developing the semantic web when we get better meta-data dictionaries, and that is going to be linked to standards in business systems.
For example, what if the Postal Service allowed you to enter an address online, and then issued a tracking number that you put on your envelope? They could solve the problem of normalizing the addresses (‘St’ vs ‘Street’, for example). Then, when you wanted to find a mailing address, you could go online and search their database. When you wanted to insert an address, you would cut and paste the reference from their screen, so that it had the right postal service meta-tags on it.
You see what I’m getting at? The semantic web will come into being when we have institutions established with a remit to develop universal public information services.
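As a rough illustration of the postal example above, here is a minimal Python sketch of address normalization plus a stable reference that could be pasted in place of the raw text. Everything in it is an assumption made for illustration: the `SUFFIXES` table, the function names, and the hash-based reference are hypothetical, and a real postal authority would publish its own abbreviation list and identifier scheme.

```python
import hashlib

# Hypothetical abbreviation table, for illustration only.
SUFFIXES = {"st": "Street", "ave": "Avenue", "rd": "Road", "blvd": "Boulevard"}


def normalize_address(raw: str) -> str:
    """Collapse extra whitespace and expand known street abbreviations."""
    words = []
    for word in raw.split():
        key = word.rstrip(".").lower()
        # Naive lookup: this would also rewrite "St" in "St Louis",
        # which a real service would have to handle.
        words.append(SUFFIXES.get(key, word))
    return " ".join(words)


def address_reference(raw: str) -> str:
    """Derive a short, stable reference from the normalized address."""
    canonical = normalize_address(raw).lower()
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:12]
```

With this sketch, "12 Main St" and "12 main street" normalize to the same canonical form and so share one reference, which is the property the shared database would depend on.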
Thanks for the feedback. That’s an interesting insight on the solution of the semantic web looking for a problem. And I agree: There are organizations with the authority and opportunity to stake a claim on the semantic web. For me, one of the most interesting things about the Wikipedia success story is that Wikipedia has become an authoritative source for these universal subjects and quite incidentally stepped into that role within the semantic web.
I’m not sure I’d agree that Wikipedia is the first mainstream semantic web, though it certainly is the first mainstream semantic web that shows off its semantic webbiness.
Imagine a company that harvests the implicit semantic content of human-created hyperlinks between pages. The content of these links is analyzed for common terms, and those same terms are used to identify the target pages. The “semantics” of this semantic web are stored and computed in the brains of the authors of these billions of web pages. So, you have a web3.0 network built on a giant wetware network a la web2.0 crowdsourcing.
That’s Google. Yahoo was doing something similar with the surfer-categorized directories, but had to change to the link-harvesting method eventually because it’s so much more scalable.
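To make the link-harvesting idea concrete, here is a minimal Python sketch that collects the anchor text of hyperlinks and counts the terms authors use to describe each target page. It is a toy under stated assumptions, not a description of how Google or Yahoo actually work: the names `AnchorCollector` and `describe_targets` are hypothetical, and a real system would add crawling, stop-word handling, and ranking.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an <a href=...> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def describe_targets(pages):
    """Map each link target to a Counter of the terms used to link to it."""
    terms = defaultdict(Counter)
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        collector.close()
        for href, text in collector.links:
            terms[href].update(text.lower().split())
    return terms
```

Feeding in pages from many authors, the most common terms for a target act as a crowd-written label for that page, which is the sense in which the semantics are computed in the authors' heads.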
Using a “semantic web” to build a better search engine is a fool’s errand. Google and Yahoo are already using semantics in their networks. It would be incredibly difficult, if not impossible, to build better semantic analyzing mechanisms than a billion or so human brains in a distributed network.
Isaac, I’d be interested in hearing about other examples of early mainstream semantic webs. I’m defining “semantic web” very loosely here to focus on the attributes of abstraction in the form and subject matter of the content, in at least a standardized machine-readable presentation. By “mainstream” I mean a service that’s crossed over to mass market adoption.
I agree entirely with your characterization of Google. Most of the discussion around Google and semantics seems focused on whether they qualify as a semantic search engine. They’re among the most prodigious creators of semantic data; it seems they’ve earned the title. But it’s all in how you frame it.
I agree with you, Mr. Sweeney.