
Semantic Web Software

Sebastian Tramp / AKSW @ Information Systems Institute
16.11.2012

Working Definition

Semantic Web software is software that is used to work with RDF knowledge bases.

Linked Data Lifecycle

In: S. Auer et al.: Introduction to Linked Data and Its Lifecycle on the Web

Storage

A triplestore is a purpose-built database for the storage and retrieval of Resource Description Framework (RDF) metadata.

wikipedia:Triplestore via Jack Rusher (2003)

A quadstore can handle multiple knowledge bases / named graphs.
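A minimal sketch of the difference, assuming a toy in-memory store (the class `QuadStore` and the graph names are hypothetical and not taken from any real product): a quad is a triple plus the name of the graph it belongs to.

```python
from collections import defaultdict


class QuadStore:
    """Toy in-memory quad store: one set of triples per named graph."""

    def __init__(self):
        self.graphs = defaultdict(set)

    def add(self, graph, s, p, o):
        """Store the triple (s, p, o) inside the named graph."""
        self.graphs[graph].add((s, p, o))

    def match(self, s=None, p=None, o=None, graph=None):
        """Yield (graph, s, p, o) for every matching quad; None is a wildcard."""
        names = [graph] if graph is not None else list(self.graphs)
        for name in names:
            for ts, tp, to in self.graphs.get(name, ()):
                if s in (None, ts) and p in (None, tp) and o in (None, to):
                    yield (name, ts, tp, to)


# Hypothetical graph names, used only for illustration.
store = QuadStore()
store.add("http://example.org/graph/a", "ex:s1", "rdfs:label", "one")
store.add("http://example.org/graph/b", "ex:s1", "rdfs:label", "uno")

all_labels = sorted(store.match(p="rdfs:label"))
only_b = list(store.match(p="rdfs:label", graph="http://example.org/graph/b"))
```

Querying without a graph name searches every knowledge base; passing `graph` restricts the lookup to one of them.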

RDF Import / Parser

RDF can be serialized in different syntax. Widely in use are:

But there are many more: N3 (2005, spec TBL), TriX (2004, by HP), GRDDL (2007), RDF/JSON (2008 by Talis), RDFa (2008)

RDF Serializations

http://www.w3.org/DesignIssues/diagrams/n3/venn

Syntax: Turtle / N3

Three simple triples …

@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://sebastian.tramp.name> a foaf:Person;
    foaf:workInfoHomepage <http://aksw.org/SebastianTramp>;
    rdfs:label "Sebastian Tramp".
    

Syntax: RDF / XML

… can look …

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
   xmlns:foaf="http://xmlns.com/foaf/0.1/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <foaf:Person rdf:about="http://sebastian.tramp.name">
    <rdfs:label>Sebastian Tramp</rdfs:label>
    <foaf:workInfoHomepage rdf:resource="http://aksw.org/SebastianTramp"/>
  </foaf:Person>
</rdf:RDF>
    

Syntax: RDF / JSON

… soooo different.

{
  "http://sebastian.tramp.name" : {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [ {
        "value" : "http://xmlns.com/foaf/0.1/Person",
        "type" : "uri"
        }

      ],
    "http://www.w3.org/2000/01/rdf-schema#label" : [ {
        "value" : "Sebastian Tramp",
        "type" : "literal"
        }

      ],
    "http://xmlns.com/foaf/0.1/workInfoHomepage" : [ {
        "value" : "http://aksw.org/SebastianTramp",
        "type" : "uri"
        }

      ]
    }
  }
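Despite the different look, the structure is simply subject, then predicate, then a list of objects. A small sketch, using only the Python standard library, that flattens such a document back into triples (the function name is an assumption made for this example):

```python
import json


def rdf_json_to_triples(document):
    """Flatten RDF/JSON (subject -> predicate -> [objects]) into tuples."""
    data = json.loads(document)
    triples = []
    for subject, predicates in data.items():
        for predicate, objects in predicates.items():
            for obj in objects:
                # Each object carries its value and its kind ("uri" or "literal").
                triples.append((subject, predicate, obj["value"], obj["type"]))
    return triples


example = """
{
  "http://sebastian.tramp.name": {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"value": "http://xmlns.com/foaf/0.1/Person", "type": "uri"}
    ],
    "http://www.w3.org/2000/01/rdf-schema#label": [
      {"value": "Sebastian Tramp", "type": "literal"}
    ],
    "http://xmlns.com/foaf/0.1/workInfoHomepage": [
      {"value": "http://aksw.org/SebastianTramp", "type": "uri"}
    ]
  }
}
"""

triples = rdf_json_to_triples(example)
```

The result is the same three triples as in the Turtle and RDF/XML versions.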
    

RDF Import

Speed of triple import is a significant factor.

… the LUBM 8000 load speed was 160,739 triples-per-second on a single machine with 2 x Xeon 5520 and 72G RAM.

http://www.w3.org/wiki/LargeTripleStores

Note: N-Triples is the fastest syntax to import (and the biggest in file size).
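Why N-Triples is both big and fast can be seen from a rough serializer sketch: every line is one complete triple with full URIs and no prefixes, so a parser needs no state between lines. The helper names are assumptions for this example, and the escaping covers only the most common cases.

```python
def to_ntriples_term(value, kind):
    """Render a URI or a plain literal in N-Triples syntax."""
    if kind == "uri":
        return f"<{value}>"
    # Minimal escaping for plain literals: backslash, quote, newline.
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object, kind) tuples, one triple per line."""
    lines = []
    for s, p, o, kind in triples:
        lines.append(f"<{s}> <{p}> {to_ntriples_term(o, kind)} .")
    return "\n".join(lines)


sample = [
    ("http://sebastian.tramp.name",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://xmlns.com/foaf/0.1/Person", "uri"),
    ("http://sebastian.tramp.name",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Sebastian Tramp", "literal"),
]

output = to_ntriples(sample)
```

Because the subject URI is repeated on every line, the file grows quickly, but it can be split and loaded line by line.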

Querying

SPARQL Protocol and RDF Query Language

SPARQL Example

Give me all object properties which are in use on any resource of type person:

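PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?p ?label
WHERE
{
    ?s a foaf:Person.
    ?s ?p ?o.
    ?p a owl:ObjectProperty.
    OPTIONAL {
        ?p rdfs:label ?label.
    }
}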
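The "Protocol" part of SPARQL means that a query travels to an endpoint as an ordinary HTTP request. A sketch of how such a request could be prepared with the Python standard library; the endpoint URL `http://example.org/sparql` is a placeholder, the short query is a simplified stand-in, and nothing is sent over the network here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Simplified stand-in query, only for illustration.
QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?s WHERE { ?s a foaf:Person . } LIMIT 10"""


def build_request(endpoint, query):
    """Prepare a SPARQL protocol GET request: the query goes into ?query=..."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the JSON results format via the Accept header.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# Placeholder endpoint; replace it with a real SPARQL endpoint before sending.
request = build_request("http://example.org/sparql", QUERY)
```

Passing `request` to `urllib.request.urlopen` would actually execute the query against the endpoint.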
    

Querying

Triple stores are (still) slower than relational data management by a factor of 5-50, but performance is increasing steadily.

Benchmarks:

DBpedia

DBpedia runs on an 8-core box with 96GB memory and 8 x 500GB HD in RAID 1 (mirroring) to give 4 virtual partitions. The database is striped over these four channels.

Data

Access

DBpedia SPARQL Benchmark

DBPSB Results: Query Mixes per Hour (QMpH)
In: M. Morsey et al.: DBpedia SPARQL Benchmark – Performance Assessment with Real Queries on Real Data

Deployment

(Semantics-aware) deployment options:

Or all together in combination with a valid VoID description.

Visit thedatahub.org to browse publicly available datasets.

Linked Data: The Four Rules

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
  4. Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData.html

HTTP Content Negotiation

When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
In: How to publish Linked Data
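A simplified sketch of the server-side decision behind content negotiation: read the client's `Accept` header and pick a representation of the resource. The table `SUPPORTED` and both function names are assumptions for this example; a real server would also handle wildcards such as `text/*` and would usually answer with a redirect or a `406` where this sketch simply falls back to HTML.

```python
# Representations this hypothetical server can deliver.
SUPPORTED = {
    "text/turtle": "Turtle",
    "application/rdf+xml": "RDF/XML",
    "text/html": "HTML",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(header, default="text/html"):
    """Pick the best supported media type; fall back to the default."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return default


browser = negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8")
rdf_client = negotiate("application/rdf+xml;q=0.9, text/turtle")
```

The same URI thus yields an HTML page for a browser and RDF for a Semantic Web client.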

Extraction

How to generate RDF triples from non-RDF sources?
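One simple case is tabular data. A naive sketch, assuming a CSV file with one row per resource: one column supplies the subject URI, and every other column becomes a property. The namespace `http://example.org/` and the sample data are made up for illustration.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace for generated URIs


def csv_to_triples(text, key_column):
    """Turn each CSV row into triples: one subject per row, one property per column."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = BASE + "resource/" + row[key_column]
        for column, value in row.items():
            if column == key_column or not value:
                continue  # skip the key itself and empty cells
            triples.append((subject, BASE + "property/" + column, value))
    return triples


# Made-up sample data.
sample = "id,name,city\np1,Alice,Leipzig\np2,Bob,\n"
triples = csv_to_triples(sample, "id")
```

Real extraction tools add what this sketch leaves out: mapping columns to existing vocabularies, datatypes, and links to other resources.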

Enrichment

Improve the quality of a knowledge base by adding additional statements:

Link Generation

Inter-linking knowledge bases by finding and deploying relevant links and defining link specifications (e.g. LIMES and SILK)
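The basic idea of a link specification can be sketched as "compare a property of two resources and link them if the similarity exceeds a threshold". The example below is a naive illustration using `difflib` from the Python standard library; it is not how LIMES or SILK work internally, and the source URI and labels are made up.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a, b):
    """String similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """Compare every source label with every target label (naive O(n*m))."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if similarity(s_label, t_label) >= threshold:
                links.append((s_uri, OWL_SAME_AS, t_uri))
    return links


# Made-up labels, used only for illustration.
source = {"http://example.org/org/1": "University of Leipzig"}
target = {
    "http://dbpedia.org/resource/University_of_Leipzig": "University of Leipzig",
    "http://dbpedia.org/resource/Leipzig": "Leipzig",
}

links = discover_links(source, target)
```

Comparing all pairs does not scale to large knowledge bases, which is the problem that dedicated link discovery frameworks address.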

Supervised Machine Learning

Derive axioms about a class by using its instances as positive examples (e.g. DL-Learner, ORE, HANNE)

Editing

Editing Schema / Ontology vs. editing instances.

Collaboration

Any collaboration environment can be adapted for use in an RDF authoring application, but wikis are very successful and were adapted first:

Reasoning

Infer logical consequences from a set of asserted facts.

Two options:

Full DL reasoning does not scale to the web.
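As a lightweight contrast to full DL reasoning, here is a sketch of forward chaining with just two RDFS entailment rules (rdfs9 and rdfs11). The `ex:` names are hypothetical, and the nested loop is deliberately naive.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def infer(asserted):
    """Apply two RDFS rules until no new triple appears (forward chaining)."""
    facts = set(asserted)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # Only pairs where the object of one triple is a subclass matter.
                if p2 != SUBCLASS or s2 != o:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))  # rdfs11: subClassOf is transitive
                elif p == RDF_TYPE:
                    new.add((s, RDF_TYPE, o2))  # rdfs9: instances inherit superclasses
        if new <= facts:
            return facts  # fixpoint reached, nothing left to infer
        facts |= new


# Hypothetical asserted facts.
asserted = {
    ("ex:alice", RDF_TYPE, "ex:Student"),
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
}

inferred = infer(asserted) - asserted
```

From three asserted facts, three more follow: `ex:alice` is also a `ex:Person` and a `ex:Agent`, and `ex:Student` is a subclass of `ex:Agent`.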

Rule Execution

Custom rule execution is a powerful tool for knowledge workers.

Triple / Quad Stores

Supported open-source and commercial versions, as well as hosted services, are available:

Virtuoso

Virtuoso Conductor

Virtuoso OpenSource 6.1.3: Conductor web interface

Dydra

Our goal is to develop the github for graph data.

Arto Bendiken, 10/2010

Dydra Query Shell

dydra.com: standard query shell

Extraction: FOX

Federated Knowledge Extraction Framework

FOX Example Input

The foundation of the University of Leipzig in 1409 initiated the city's development into a centre of German law and the publishing industry. The philosopher and mathematician Gottfried Leibniz was born in Leipzig in 1646, and attended the university from 1661–1666.

FOX Example Output

[] a ann:Annotation , scmsann:ORGANIZATION ;
     scms:beginIndex "22"^^xsd:int ;
     scms:endIndex "43"^^xsd:int ;
     scms:means dbpedia:University_of_Leipzig ;
     scms:source <http://ns.aksw.org/scms/tools/FOX> ;
     ann:body "University of Leipzig"^^xsd:string .
This could be used to add
:mySource foaf:topic dbpedia:University_of_Leipzig
to the metadata of :mySource.
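How the annotation relates to the input text can be checked with a few lines of Python. This sketch assumes that `scms:endIndex` is exclusive and that the `dbpedia:` prefix expands to `http://dbpedia.org/resource/` (the output above omits its prefix declarations); `http://example.org/mySource` stands in for `:mySource`.

```python
text = ("The foundation of the University of Leipzig in 1409 initiated "
        "the city's development into a centre of German law and the "
        "publishing industry.")

# Values copied from the FOX output; the full DBpedia URI is an assumption.
annotation = {
    "beginIndex": 22,
    "endIndex": 43,
    "means": "http://dbpedia.org/resource/University_of_Leipzig",
}


def annotated_span(text, annotation):
    """Return the piece of text the annotation points at."""
    return text[annotation["beginIndex"]:annotation["endIndex"]]


def topic_triple(source_uri, annotation):
    """Build the foaf:topic statement for the annotated document."""
    return (source_uri, "http://xmlns.com/foaf/0.1/topic", annotation["means"])


span = annotated_span(text, annotation)
triple = topic_triple("http://example.org/mySource", annotation)
```

The character offsets select exactly the string given in `ann:body`.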

Enrichment: LIMES

LIMES is a link discovery framework for the Web of Data

LIMES Webinterface

limes.aksw.org

Editing: Protégé

a free, open source ontology editor and knowledge-base framework.

Protégé

Editing: TopBraid Composer

a modelling environment for developing Semantic Web ontologies and building semantic applications

TopBraid Composer

Collaboration: Semantic Media Wiki

Sebastian [[knows::Tilo]] since two years.
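In Semantic MediaWiki, the page on which an annotation appears is the subject of the statement. A rough sketch of pulling such statements out of wiki text with a regular expression; the pattern is a simplification of the real syntax, and the page title "Sebastian" is assumed for this example.

```python
import re

# [[property::value]] or [[property::value|shown text]]; simplified pattern.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_statements(page_title, wikitext):
    """Return (page, property, value) for every annotation in the text."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in ANNOTATION.finditer(wikitext)
    ]


statements = extract_statements(
    "Sebastian", "Sebastian [[knows::Tilo]] since two years."
)
```

Ordinary wiki links such as `[[Tilo]]` contain no `::` and are therefore ignored.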
    

Collaboration: OntoWiki

OntoWiki

More Tools

If we still have time …

Thank you for your attention!

My WebID:

This slide deck: