Search for Meaning & Bootstrapping the Semantic Web with Open Data - The Linking Open Data Project
Date: Thursday, April 17, 2008 - 6:45 pm
Location: hakia, Inc. 39 Broadway Floor 33 · New York, NY
Christian Hempelmann (hakia)
Richard Cyganiak (DERI)
Search for Meaning
Dr Christian Hempelmann, hakia
Abstract: This talk presents a linguistic approach to deep-meaning representation, ontological semantics (OntoSem), as applied to a specific, complex NLP application: a meaning-based Internet search engine.
It introduces the OntoSem resources and technology, which are available for licensing, gives a general overview of how OntoSem is used in our Internet search approach, and offers an in-depth account of selected key issues in web search and how we address them. For web search, the OntoSem technology parses natural-language web content and transposes it into a representation of its meaning, structured around the events described in the text and their participants. Queries can then be matched to this meaning representation in anticipation of any of the permutations in which they can surface in written text. These permutations centrally include overspecification (e.g., users need not list all synonyms, as non-semantic search engines require them to do) and, more importantly, underspecification (which is inherent in language). In the latter case, ambiguity can only be reduced by giving the search engine what humans use for disambiguation, namely knowledge of the world as represented in an ontology. One key assumption is that meaning for web search requires a complex description if it is to be generated automatically, and that it cannot, in principle, be extracted from surface text with statistical methods: meaning is content and does not lend itself to automatic extraction from natural language without rich knowledge resources. For more information, please visit:
Bootstrapping the Semantic Web with Open Data - The Linking Open Data Project
A prerequisite for the Semantic Web is the existence of large amounts of meaningfully interlinked RDF data on the Web. The W3C SWEO community project Linking Open Data has made various open datasets available on the Web as RDF and has developed automated mechanisms to interlink them with RDF statements. Collectively, the datasets currently consist of over one billion triples. We believe that large-scale interlinking will demonstrate the value of the Semantic Web compared to more centralized approaches. This presentation outlines the work to date and gives a short demonstration.
The Open Data Movement aims at making data freely available to everyone. There are already various interesting open datasets available on the Web. Examples include Wikipedia, Wikibooks, Geonames, MusicBrainz, WordNet, the DBLP bibliography, and many more, which are published under Creative Commons or Talis licenses.
The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources.
RDF links enable you to navigate from a data item within one data source to related data items within other sources using a Semantic Web browser. RDF links can also be followed by the crawlers of Semantic Web search engines, which may provide sophisticated search and query capabilities over crawled data. As query results are structured data and not just links to HTML pages, they can be used within other applications.
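The idea of following RDF links from one data source to another can be illustrated with a minimal Python sketch. It is not the project's tooling. It assumes triples held as plain (subject, predicate, object) tuples, uses owl:sameAs as the linking predicate (an assumption; the abstract does not name the predicates used), and all example.org URIs, the helper names linked_items and describe, and the sample values are hypothetical placeholders.

```python
from collections import deque

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Two hypothetical data sources; every example.org URI and value is a placeholder.
SOURCE_A = [
    ("http://a.example.org/item/1", RDFS_LABEL, "Example item"),
    # The RDF link: an item in source A points at a related item in source B.
    ("http://a.example.org/item/1", OWL_SAME_AS, "http://b.example.org/record/42"),
]
SOURCE_B = [
    ("http://b.example.org/record/42", "http://b.example.org/vocab/note", "Placeholder record"),
]


def linked_items(start, triples):
    """Return every URI reachable from `start` via owl:sameAs links, in either direction."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for subject, predicate, obj in triples:
            if predicate != OWL_SAME_AS:
                continue
            if subject == current:
                neighbour = obj
            elif obj == current:
                neighbour = subject
            else:
                continue
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def describe(start, triples):
    """Collect the non-link statements about `start` and every item linked to it."""
    items = linked_items(start, triples)
    return sorted(
        (s, p, o) for s, p, o in triples if s in items and p != OWL_SAME_AS
    )


if __name__ == "__main__":
    # A crawler or browser would fetch these over HTTP; here they are simply concatenated.
    merged = SOURCE_A + SOURCE_B
    for statement in describe("http://a.example.org/item/1", merged):
        print(statement)
```

Because the result is a list of structured statements rather than links to HTML pages, another application could consume it directly, which is the point made above about query results.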