Tuesday, August 27, 2013

Semantic web

The Semantic Web refers to the inclusion of machine-readable data on web pages. While any data can in principle be read by a machine, the Semantic Web generally relies on ontologies - formal, agreed-upon specifications that provide a shared understanding of terms.

An ontology is essentially metadata - data that describes other data. A database schema is an example of an ontology, although probably not one in a format that is useful for web data. So is a class in an object-oriented language such as Java.
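To make the "a class is metadata" point concrete, here is a minimal sketch. It uses Python rather than Java purely for brevity, and the Patient class and its fields are hypothetical, invented for illustration. The class definition itself says nothing about any particular patient; it only describes what a patient record looks like.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class Patient:
    # Hypothetical record type; the field names are illustrative only.
    name: str
    birth_date: date


def describe(cls):
    """Return the metadata a dataclass carries: field name -> type name."""
    return {f.name: f.type.__name__ for f in fields(cls)}


schema = describe(Patient)
```

Here schema holds the field names and their types, which is roughly the same information a database schema would carry for a patient table.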

However, metadata is generally categorized into levels. The lowest level is the raw data itself. The next level up is syntactic metadata: simple attributes of the data such as language, format, source, and creation date.

Next is structural metadata, such as DTDs, XSL, and clustering.

Then we have semantic metadata, where terms carry an agreed meaning. In metadata concerning medical treatment, for example, we might see data like:
region: upper abdomen, organ: liver, pathological structure: abscess
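
As a rough sketch of how a program might pick up terms like these, the snippet below splits that line into key/value pairs. The function name parse_metadata is my own, and the comma-and-colon layout is an assumption based only on the example above; real semantic metadata would normally arrive in a defined format such as RDF.

```python
def parse_metadata(line):
    """Split 'key: value, key: value' text into a dict of terms."""
    record = {}
    for pair in line.split(","):
        # partition splits on the first colon only, so values may contain colons
        key, _, value = pair.partition(":")
        record[key.strip()] = value.strip()
    return record


record = parse_metadata(
    "region: upper abdomen, organ: liver, pathological structure: abscess"
)
```

The keys only become semantic once both sides agree on what "organ" or "pathological structure" means, which is where the ontology comes in.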

Finally, at the highest level, we have a full-fledged ontology. Examples of ontologies might include anatomy or diagnostics. Some currently existing ontologies include PapiNet.org, a vocabulary for the paper industry; BPMI.org, a vocabulary for exchanging business process models; and XML-HR, vocabularies for human resources.
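
What an ontology adds over plain semantic metadata is relationships between terms that a program can reason over. The sketch below is a toy illustration only: the is-a links are made up for the anatomy example and are not taken from any real ontology, and a real system would use OWL and a reasoner rather than a dictionary.

```python
# Hypothetical is-a links for illustration; not from a published ontology.
IS_A = {
    "liver": "abdominal organ",
    "abdominal organ": "organ",
    "organ": "anatomical structure",
}


def is_a(term, ancestor, links=IS_A):
    """Follow is-a links upward and report whether ancestor is reached."""
    seen = set()
    while term in links and term not in seen:
        seen.add(term)  # guard against accidental cycles in the links
        term = links[term]
        if term == ancestor:
            return True
    return False
```

With these links, is_a("liver", "organ") holds even though no single link says so directly; the answer is inferred by chaining two links together.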

Interesting things:

  • Pellet - an OWL reasoner for Java. http://clarkparsia.com/pellet/
  • Simile - web widgets for supporting data visualizations. http://simile-widgets.org/
  • http://schema.org - a site that publishes shared vocabularies for microdata markup.
  • Google Brain - a machine learning project
  • Vivisimo - structured clustering
  • http://www.semantic-conference.com/primer.html