Now that I've blogged a bit about the difficulty of getting into the semantic web vocabulary, let me do my part and post a quick view and a few links to interesting introductory pages. This will also address Pieter Jansegers's comment to the post that started this discussion.
For the Semweb experts among you: I know you will be tempted to make corrections, or explain the differences between RDF and RDFa, the subtleties of N3, Turtle, and N-Triples as interchange formats, and the limitations of microformats. Feel free to do that in the comments, as long as you don't expect a reply from me, as indeed this is not the point of this post!
Now, if you're new to this, let me dive right in. We all know in the industry that the semantic web has a strange name. We'd change it if we could, but so far it's endured at least the test of time. Just like hippopotamus or floccinaucinihilipilification (which we hope the semantic web will never be subject to...). By the way, anywhere on this blog, you can double-click on any word to get its definition...
"Semantic" stands for "science of meaning". The goal of the "semweb" effort (semweb is how insiders abbreviate the long name, our secret handshake if you will) is to have machines get the meaning of things. In other words, to have machine understand human concepts and translate them into a machine-readable format for reuse. This way, they can process these concepts as we do (or better) and derive conclusions as we often attempt ourselves. Ultimately, they might even be able to reason like a (very logical) human being, but that's a lofty goal we generally keep for later stages of the web evolution, such as the intelligent web.
There are different themes and terms associated more or less closely with the Semantic Web, including Linked Data, Open Data, and the already-infamous Web 3.0. As an aside... my personal definitions of web 1.0, web 2.0 and web 3.0 are, respectively, "everything web" from 1990 to 1999, from 2000 to 2009, and from 2010 to 2019... not much room for confusion there... although experts will disagree.
The most important benefit provided so far by Semantic Technologies hinges on the idea of opening up your data, and having machines automatically create associations based on semantically recognizable data types and relationships such as dates, contacts, events, and locations. More importantly, additional vocabularies or "ontologies" (which describe the types of relationships existing in the data or between data types) and sets of processing rules can be imported, modified or created, to guide the machine on how to display, transform or exchange content.
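To make the idea of "relationships plus processing rules" a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration, not how any real semweb tool works: the facts, the relationship names (`takesPlaceIn`, `locatedIn`) and the `infer` function are all made up for this post. Each fact is stored as a (subject, relationship, object) triple, which is the basic shape RDF data takes, and one toy rule lets the machine derive facts nobody typed in.

```python
# Each fact is a (subject, relationship, object) triple.
# Every name here is hypothetical and chosen only for illustration.
facts = {
    ("SemWebMeetup", "type", "Event"),
    ("SemWebMeetup", "takesPlaceIn", "Cambridge"),
    ("Cambridge", "locatedIn", "Massachusetts"),
    ("Massachusetts", "locatedIn", "USA"),
}


def infer(triples):
    """Apply one toy rule until nothing new can be derived.

    Rule: if A takesPlaceIn (or is locatedIn) B, and B is locatedIn C,
    then A takesPlaceIn (or is locatedIn) C as well.
    """
    known = set(triples)
    while True:
        new = set()
        for (a, relation, b) in known:
            if relation not in ("locatedIn", "takesPlaceIn"):
                continue
            for (c, other, d) in known:
                if other == "locatedIn" and c == b:
                    new.add((a, relation, d))
        if new <= known:  # no fact we did not already have: stop
            return known
        known |= new


# Print only the facts the rule derived, not the ones we entered by hand.
for fact in sorted(infer(facts) - facts):
    print(fact)
```

Nobody stated that the meetup takes place in the USA; the machine works it out from the location relationships. A real ontology plays the role of the rule here, just with a far richer vocabulary.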
So far, the main technologies associated with the semweb are those found in the semantic web stack. If you have a little familiarity with the field, you will note that simpler technologies like microformats (including hCalendar, hCard, hResume) are not included in the stack, although they are often referred to as part of the semweb. That brings us to a point we want to just skim over, as the debate could capsize our post otherwise, but in short, there is a bit of a format war in the space, with the RDF suite located farther than microformats on the "complexity of implementation"/"semantic power" spectrum. Those technologies are also geared towards different usages and audiences. Ultimately we may also witness the emergence of some proprietary standards, as some companies believe the existing approach is doomed or impractical and are actively working on alternatives.
For now though, the RDF stack is imposing itself as the key semweb technology, and is pushed heavily by the web authorities at W3C (I have no stake in any particular technology, so this is just my attempt at capturing the general sentiment). Last, note that this stack only covers part of the technological set required to build a true semantic web. One key missing component is a technology for converting the existing content into semantically-enabled data. Many services take a crack at it, but the results generally require further enrichment before being of practical use.
Ok, after this long introduction, we are finally getting to what is probably the top reason you are reading this: practically, what can you do today to get on board the semweb bandwagon? You're gonna hate me, but I will redirect you to a bunch of existing pages which did a great job at this (I hate reinventing the wheel, especially for a bandwagon...)
So, I recommend:
- If you're a web user and new to this, trying out MIT Simile's Piggy Bank, which semantically extracts and links data across websites; Gnosis (pretty much the same idea, by Reuters' Open Calais); Twine, which also connects information semantically but from a wider range of formats, and offers an online platform to make use of it (contact me if you need an invite); and Freebase, a semantic knowledge base. You can also get your own "business card" in a standard RDF format called FOAF through an application like Foaf-a-Matic. You can find other recognized "semweb-inside" applications here.
- If you're a blogger, checking out Zemanta, which recognizes the context of your posts, and uses that to suggest relevant images, links, keywords and text to add to your blog. It works on most major content publishing platforms and web browsers. If you blog through WordPress, you can also try Tagaroo, an Open Calais tool which will semantically tag your blog posts.
- If you're a webmaster, experimenting with tools like Dapper, which lets you scrape semantically-specific content from anywhere on the web (including your own site) and turn it into reusable structured content (useful to redistribute parts of your own site); Reuters' Open Calais, which automatically creates RDF tags for your data; and Adobe's XML tool (based on RDF). You might take a look at Yahoo's SearchMonkey site owner page as well for possibilities to improve your search results through "structured data" such as the microformats (hCard etc.) I described previously. You could also check these two articles on planning a semantic website, a short one by Sitepoint and a more comprehensive one by IBM, as well as this recent Website Magazine article for more tips.
- If you're an application developer, experimenting with Reuters' Open Calais APIs, Freebase APIs, and Yahoo's SearchMonkey tools. If you're interested in developing intensive RDF applications, you can check RDF and OWL editors such as Altova's SemanticWorks, Intellidimension's RDF Gateway, and TopQuadrant's TopBraid Composer, as well as the open source Java framework Jena (from the HP Labs Semantic Web Programme).
- If you're interested in it for your company, there are many services marketed by companies for different applications. As a start, I recommend looking at the content on the Semantic Exchange website, and then contacting this group to guide you towards the right offers. Purchasing the Semantic Exchange comprehensive report and joining the group would be a good idea, too (for full disclosure, I do act for them as an advisor on market issues).
- Finally, for some additional introductions, reading this article by Scientific American to complete your picture of the semweb activities, this ReadWriteWeb post on the different applications of the semweb, and this one too, to give you some of the flavours and, if you need more details, Ivan Herman's Intro to the Semantic Web presentation.
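If you are curious what the FOAF "business card" mentioned in the list above looks like under the hood, here is a rough sketch in plain Python that writes one out in Turtle, one of the text formats for RDF. The FOAF terms used (`foaf:Person`, `foaf:name`, `foaf:mbox`, `foaf:homepage`, `foaf:knows`) come from the public FOAF vocabulary; the `foaf_card` function, the people and the addresses are invented for illustration, and this is not how Foaf-a-Matic itself is implemented. The sketch also assumes the names contain no double quotes, since it does no escaping.

```python
def foaf_card(person_id, name, email, homepage=None, knows=()):
    """Return a small FOAF description of one person as Turtle text.

    Hypothetical helper for illustration only; it does not escape
    special characters in the values it is given.
    """
    properties = [
        "a foaf:Person",
        f'foaf:name "{name}"',
        f"foaf:mbox <mailto:{email}>",
    ]
    if homepage:
        properties.append(f"foaf:homepage <{homepage}>")
    for friend in knows:
        # Each acquaintance is described inline as an anonymous person.
        properties.append(f'foaf:knows [ a foaf:Person ; foaf:name "{friend}" ]')

    lines = [
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "",
        f"<#{person_id}>",
        "    " + " ;\n    ".join(properties) + " .",
    ]
    return "\n".join(lines)


# Made-up example data.
print(foaf_card("me", "Jane Doe", "jane@example.org",
                homepage="http://example.org/jane",
                knows=["John Smith"]))
```

The point is less the code than the output: a handful of statements about who you are and who you know, in a form any FOAF-aware application can read.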
In addition, there are books on the topic you can read. I present some of them in the Amazon-powered box in the right column of this blog. Frankly, this is a bit of overkill for an introduction, so I'd suggest you stick to the above suggestions and email me any good introduction you come across, to complete the list.
Bottom line: this is the semantic web, not the cementic web, even if it can still feel heavy at times! If you just know that, you're already ahead of 99% of the folks out there. Congratulations!
