For me, this presentation was like opening the legendary Pandora’s Box. It ignited a reflection on what we as a company actually do with the Semantic Web. After all, Net7 has always characterized itself as a “Semantic Web Company”.
At about the same time I was contacted both by CTL and by a partner company to talk about this subject. CTL expected suggestions and stimuli for using Semantic Web technologies in their work in the Digital Humanities field, while the partner company was looking for professional training on these topics.
For 5 seconds I went into autopilot mode and started to think about explaining the Semantic Web in the standard fashion (RDF, ontologies, triple stores, SPARQL, RDFS, OWL, well, you get the idea…). Then three questions sprang to my mind...
The first: do these people really need this kind of information? Are they really going to use all of this in their daily jobs?
The second question is a bit discomforting: do we at Net7 really use the whole Semantic Web technology stack completely and, above all, consciously?
The third is even more serious: what’s the current state of the art of the Semantic Web? Is it still an important technology, with practical uses even for small and medium-sized projects, or should it stay confined to the Empyrean of research and huge knowledge management initiatives?
So it was really important for me to prepare a presentation that tried to answer these questions, covered topics that could be interesting and useful to the audience, and at the same time put my own knowledge of the field in a new perspective.
The presentation therefore came out as a reflection on the possible uses and advantages of “Semantics in the web”, first and foremost for me, to put my thoughts in order, with the hope that it can be useful for others as well. I tried to take a step back, hopefully in order to see further ahead.
To prepare it I read a great deal of material (see the bibliography at the end) and was heavily influenced by the presentations and articles of Jim Hendler (not to mention the fantastic book “Semantic Web for the Working Ontologist” that he co-authored). So, even if you never read these lines, thank you very much, Dr. Hendler, for your insightful thoughts!
Coming back to my presentation, it is no accident that I used the concept “Semantics in the Web” and not “Semantic Web” in the title. In the light of all the reading that I did, Semantics seems to me more important than the technology behind it.
I started the presentation with a small historical digression, from the very first vision of the World Wide Web in Tim Berners-Lee’s original 1989 proposal up to the seminal 2001 article in Scientific American, where Berners-Lee, James Hendler and Ora Lassila presented the Semantic Web.
I continued by explaining the key concepts of the Semantic Web, which served to show how Semantics, despite the Semantic Web’s vocal critics, can still count huge success stories in the web of today.
The funny thing is that the Semantic Web’s vision didn’t materialize exactly as its inventors expected. On the one hand, it is fundamental to understand how things in web history just happen through serendipity. On the other, it is crucial to always keep in mind Jim Hendler’s motto “a little semantics goes a long way”. Indeed, just a small portion of the Semantic Web “pyramid” (see slide number 42 in my presentation, taken from one of Jim Hendler’s keynotes) is in recurring use, while the rest (inference and the most sophisticated OWL constructs included) still has limited diffusion or is relegated to high-end research initiatives.
So the Semantic Web hasn’t failed; it has materialized a bit differently than expected. One should therefore really think of Semantics first, that is, exploit the knowledge that can be extracted from documents, linked data repositories and machine-readable annotations in web pages (SEO metadata included) before worrying about the orthodox application of the complete stack of Semantic Web technologies.
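As a rough illustration of what “Semantics first” can mean in practice, the sketch below pulls machine-readable annotations (JSON-LD blocks) out of a web page using only Python’s standard library. The sample page, the `JsonLdExtractor` class and the `extract_jsonld` helper are assumptions invented for this example; they are not taken from the presentation.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Hypothetical helper: collects the JSON-LD blocks embedded in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # JSON-LD annotations live in <script type="application/ld+json">.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # The script body may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.items.append(json.loads("".join(self._buffer)))


# Made-up page with a schema.org-style annotation (illustrative only).
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Net7"}
</script>
</head><body><p>Hello</p></body></html>
"""


def extract_jsonld(html):
    """Return the list of JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


annotations = extract_jsonld(SAMPLE_PAGE)
for item in annotations:
    print(item.get("@type"), "-", item.get("name"))
```

No triple store or ontology is involved here: the page already carries “a little semantics”, and a few lines of code are enough to start using it.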
On the other hand, the Semantic Web is still a promising and, in certain respects, undiscovered territory. While I honestly don’t see it as a key technology to power web portals (there are plenty of more mature technologies, even open source ones, that fit this purpose better: think of Drupal or Django), the idea of managing information through graphs makes a lot of sense in several areas, including:
- knowledge management with highly interconnected data (think of social network relationships). Here the capacity of triple stores to handle big graph data will really make the difference, especially if an open source product can be used for this purpose (we at Net7 have recently bet on Blazegraph and, while we have been quite satisfied so far, it must also be said that our graphs are not exactly “that big”). There is no doubt that solid open source products are fundamental to boosting the adoption of specific technologies and software architectures (think of LAMP).
- extraction of structured data from text: a great classic Semantic Web use case indeed
- linking independent repositories of information, implemented with traditional technologies in multiple legacy systems (another Semantic Web classic).
- raw data management and dissemination, with data that is:
  - formally described in great detail
  - openly distributed, after a specific anonymization process that removes “sensitive information” from it
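To make the idea of managing information as a graph a little more concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not a description of how Blazegraph or any real triple store works, and every name in it (the people, the predicates, the `match` and `reachable` helpers) is invented for illustration.

```python
from collections import deque

# Each fact is a (subject, predicate, object) triple; the data is made up.
TRIPLES = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "worksFor", "acme"),
    ("dave", "worksFor", "acme"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def reachable(triples, start, predicate):
    """Breadth-first search: every node reachable from start via predicate."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _, _, target in match(triples, s=node, p=predicate):
            if target not in seen and target != start:
                seen.add(target)
                queue.append(target)
    return seen


# Who works for whom?
print(match(TRIPLES, p="worksFor"))
# Everyone alice is connected to through a chain of "knows" links.
print(sorted(reachable(TRIPLES, "alice", "knows")))
```

The pattern-with-wildcards lookup is loosely what a basic SPARQL triple pattern expresses, and the traversal hints at why highly interconnected data is more natural to query as a graph than as a set of joined tables.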
I concluded my slides by also noting that Semantics is becoming more and more of a commodity, offered through specialized cloud services. Named Entity Recognition SaaS offerings, SpazioDati’s DataTXT and AlchemyAPI included, are a well-established reality. Cloud Machine Learning services are becoming mainstream (see in this regard this insightful article on ZDNet). Developers can therefore enjoy “a little semantics” in their applications without embracing the Semantic Web in full. As Jim Hendler says… a little semantics goes a long way!
- Tim Berners-Lee, James Hendler and Ora Lassila: The Semantic Web, Scientific American May 2001
- Dean Allemang, James Hendler: Semantic Web for the Working Ontologist 2nd Edition, Morgan Kaufmann, 2011
- James Hendler: The Semantic Web: It’s for real http://www.slideshare.net/jahendler/semantic-web-what-it-is-and-why-you-should-care
- Dominiek ter Heide: Three reasons why the Semantic Web has failed https://gigaom.com/2013/11/03/three-reasons-why-the-semantic-web-has-failed/
- Seth Grimes: Semantic Web Business: Going Nowhere Slowly http://www.informationweek.com/software/information-management/semantic-web-business-going-nowhere-slowly/d/d-id/1113323
- Clay Shirky: Ontology is Overrated: Categories, Links, and Tags http://www.shirky.com/writings/ontology_overrated.html
- Michela Finizio: Il miraggio dell’anagrafe unica: più di 54mila banche dati gestite dalla Pa http://www.infodata.ilsole24ore.com/2015/03/11/il-miraggio-dellanagrafe-unica-piu-di-54mila-banche-dati-gestite-dalla-pa/
- James Hendler: “Why the Semantic Web will Never Work” (note the quote marks!) http://www.slideshare.net/jahendler/why-the-semantic-web-will-never-work
- James Hendler: Semantic Web: The Inside Story http://www.slideshare.net/jahendler/semantic-web-the-inside-story
- James Hendler: The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007
- Tim Berners-Lee: Raw data, now http://www.wired.co.uk/news/archive/2012-11/09/raw-data
- Neelie Kroes: Digital Agenda and Open Data http://europa.eu/rapid/press-release_SPEECH-12-149_en.htm
- Google: Introducing the Knowledge Graph https://www.youtube.com/watch?v=mmQl6VGvX-c
- Kevan Lee: What Really Happens When Someone Clicks Your Facebook Like Button https://blog.bufferapp.com/facebook-like-button
- Vestforsk.no: Semantic Markup Report http://www.vestforsk.no/filearchive/semantic_markup_report.pdf
- European Commission: Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020 http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf