Richard MacManus, founder and editor of ReadWriteWeb, wrote a good post on Wednesday, Where Are All The RDF-based Semantic Web Apps?, in which he builds on a previous ReadWriteWeb post to describe the difference between Top-Down and Bottom-Up semantic web applications, observing that Top-Down applications may be winning the "battle". Top-Down is seen as data analysis methods and Bottom-Up as standardized metadata such as RDFa and eRDF. (Richard seems to give some importance to whether this metadata is embedded in the data or not, but I don't think it really matters.)
First, two questions:
- Richard mentioned that Twine uses RDF. Does it, really? I remember Spivack saying at the Semantic Technologies Conference that he moved away from RDF stores because of scalability issues. Did I misunderstand that? Anyone with any insight on this, please let me know. To me, this would be a clear indication of whether or not RDF is practical for real-world applications.
- In my previous post on Semantic Proxy, I asked how much Semantic Proxy leverages RDF triples by including smart predicates. Are these predicates currently mainly the verb "BE", as in "Paris Hilton IS a person", "Paris IS a city", "1945 IS a date", etc.? I haven't heard back from Reuters, and I hope to, as I think that's a pretty important sign of the validity and potential of RDF.
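To make the second question concrete, here is a minimal sketch in Python of the difference between type-only ("IS a") triples and triples with richer predicates. It uses plain tuples rather than any RDF library. The predicate names (`is_a`, `visited`, `capital_of`), the relational triples and the helper functions are all invented for illustration; none of this is actual output of Semantic Proxy or OpenCalais.

```python
from collections import Counter

# Hypothetical predicate name standing in for the verb "BE" / rdf:type.
TYPE_PREDICATE = "is_a"

# Hypothetical (subject, predicate, object) triples, made up for illustration.
triples = [
    ("Paris Hilton", "is_a", "Person"),
    ("Paris", "is_a", "City"),
    ("1945", "is_a", "Date"),
    ("Paris Hilton", "visited", "Paris"),
    ("Paris", "capital_of", "France"),
]


def split_by_predicate(triples):
    """Separate pure classification triples from relational ones."""
    type_only = [t for t in triples if t[1] == TYPE_PREDICATE]
    relational = [t for t in triples if t[1] != TYPE_PREDICATE]
    return type_only, relational


def predicate_share(triples):
    """Return the fraction of triples that use each predicate."""
    counts = Counter(predicate for _, predicate, _ in triples)
    total = sum(counts.values())
    return {predicate: n / total for predicate, n in counts.items()}


type_only, relational = split_by_predicate(triples)
print(len(type_only), "type-only triples;", len(relational), "relational triples")
```

A store in which nearly every triple falls into the first group is essentially a tagging system; the relational triples are what would let an agent connect one entity to another.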
Second, I'd like to contribute to the debate with some thoughts.
Unlike Richard, I don't think this is really a battle for domination of the Semantic Web space between two alternative approaches. The Bottom-Up and Top-Down technologies are hugely interdependent.
RDF and Linked Data efforts were launched as a result of the apparent "failure" of Artificial Intelligence efforts. The idea was that, for Intelligent Agents (the term now used for what really are semantic algorithms) to work well, there needed to be more standardized data types that the machine could interpret, process and build upon.
This is still true today. Intelligent Agents need a better way to represent information than HTML and XML, so they can make sense of that data. RDF, Linked Data and Microformats provide that, and there is little else that does so far. There are proprietary efforts to come up with better solutions to that problem and, interestingly, most of them are driven by companies also pushing the Top-Down approach.
These efforts are driven by, and in turn fuel, the concerns that the RDF format and its "stack" of related technologies (SPARQL, the query language for RDF, comparable to SQL; and OWL, the Web Ontology Language) do not get the job done. Posts like Ditching the Semantic Web call for dropping RDF altogether. Talis CTO Ian Davis notes in a more restrained way in the last issue of Nodalities that there are still only a handful of applications that incorporate RDF at their heart. So the question I ask everyone here: does RDF serve intelligent agents properly?
Because the point here is that, if it doesn't, then we need an alternative method to get the job of representing information done. Intelligent Agents, alias Top-Down technologies, simply can't do without it. HTML, XML and relational databases don't seem to cut it. So what does? Do we need a completely fresh approach, or changes to the existing RDF stack / Linked Data model? Those are questions I ask genuinely. I don't know the answer.
Now, let's see if we can turn the problem around: couldn't it be that RDF / Linked Data in fact does the job, but that the existing Intelligent Agents are, well, not intelligent enough to populate RDF triple stores in intelligent ways, by taking advantage of the Subject-Predicate-Object relationship? Do we need more sophisticated Intelligent Agents? How I would like to know whether OpenCalais and Semantic Proxy produce intelligent predicates! And how about Twine?
If not, why not? This is a different problem than the RDF scalability issue. How does Freebase do it?
When back in February I commented on the different approaches to building the semweb in the post Metadata even Machines Can Process, I distinguished between algorithms on one side, and standardized encoding on the other. I'm revising that now, as I don't think those are two opposed approaches. The question of where to draw the line is important, but there is no question that both need and reinforce each other.
Tying this back to the Semantic Value Chain, the Semantic Data Store will be coded into Semantic Formats and fed by Semantic Engines and Apps, to create Semantic Content and other Semantic Apps. If we are to realize this beautiful vision we call the semantic web, those are pieces of the same puzzle that need to work together. So do we have the right pieces?
