Publishing 5-star Open Data

1. Use case: event schedule and presenters of a summer school
2. ★ PDF
3. ★★ Excel (*.xls)
4. ★★★ OpenDocument (*.ods)
5. ★★★☆ CSV
6. ★★★★ CSV for the Web
- 6.1. Links using Web-scale identifiers
  - 6.1.1. Dereferencing Linked Data Identifiers
    - 6.1.1.1. Dereferencing Example with cURL
- 6.2. Datatypes
7. ★★★★☆ CSV with a schema
8. ★★★★★ RDF (and a comparison to CSV)
- 8.1. Publishing RDF
9. ★★★★★☆ Further possible improvements
10. Credits
11. License and Citation

A tutorial with practical examples of how to create and publish 5-star open data on the Web; originally written for the 2014 Web Intelligence Summer School “Web of Data”, supported by the Université franco-allemande / Deutsch-Französische Hochschule, and updated for the 2015 Web Intelligence Summer School “Answering Questions with the Web”.

Published online at http://clange.github.io/5stardata-tutorial/

1 Use case: event schedule and presenters of a summer school

Original data (HTML): schedule, presenters

2 ★ PDF

Cost and benefits of ★ Web data

schedule.pdf

To be honest, this PDF was exported from Excel, and an Excel file would already earn more than one star. But organisations very often “publish” data only as PDF.

3 ★★ Excel (*.xls)

Cost and benefits of ★★ Web data

schedule.xls

Even though the old binary *.xls format is proprietary, it is not impossible to read this file outside of Excel:

perl -MSpreadsheet::ParseExcel -le '
  print Spreadsheet::ParseExcel->new()
    ->parse("2star_Excel/schedule.xls")
    ->worksheet("Schedule")
    ->get_cell(1,0)
    ->value();'

Output:

25 Aug 2014 09:00

It is harder, but still possible, to have questions answered such as “when is the first coffee break”.

Think of an algorithm that does the following:

In the column titled “Event”, identify all cells whose value is “Coffee break”.
On each row of such a cell, get the entry of the cell in the “Time” column.
Sort these cells and return the smallest value.
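The three steps above can be sketched in a few lines of Python. This is only an illustration on a hand-made list of rows, not a reader for the actual schedule.xls: the coffee break times below are invented for the example, and the name first_event is hypothetical.

```python
from datetime import datetime

# Hypothetical in-memory rows; the coffee break times are made up
# for illustration and are not taken from the real schedule.
rows = [
    {"Time": "25 Aug 2014 09:00", "Event": "Introduction"},
    {"Time": "25 Aug 2014 16:00", "Event": "Coffee break"},
    {"Time": "25 Aug 2014 10:30", "Event": "Coffee break"},
]


def first_event(rows, title):
    """Return the earliest time of all rows whose Event equals title."""
    # Steps 1 and 2: select the matching rows and read their Time cell.
    times = [
        datetime.strptime(row["Time"], "%d %b %Y %H:%M")
        for row in rows
        if row["Event"] == title
    ]
    # Step 3: the smallest value is the first occurrence.
    return min(times) if times else None


first_coffee = first_event(rows, "Coffee break")
print(first_coffee)
```

Note that the sketch has to guess the date format ("%d %b %Y %H:%M") from looking at the cell text; nothing in the file says so explicitly.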

However, free software libraries do not support all features of this file format. Here is what happens when we ask a popular free tool to determine the type of this file:

file 2star_Excel/schedule.xls

Output:

2star_Excel/schedule.xls: Composite Document File V2 Document, corrupt: Can't read SSAT

4 ★★★ OpenDocument (*.ods)

Cost and benefits of ★★★ Web data

schedule.ods

We can process this file with standard tools:

unzip -l 3star_OpenDocument/schedule.ods

Archive:  3star_OpenDocument/schedule.ods
  Length      Date    Time    Name
---------  ---------- -----   ----
       46  08-21-2014 08:13   mimetype
    52832  08-21-2014 08:13   Thumbnails/thumbnail.png
    27279  08-21-2014 08:13   styles.xml
    15227  08-21-2014 08:13   content.xml
      852  08-21-2014 08:13   meta.xml
     8774  08-21-2014 08:13   settings.xml
      899  08-21-2014 08:13   manifest.rdf
        0  08-21-2014 08:13   Configurations2/accelerator/current.xml
        0  08-21-2014 08:13   Configurations2/progressbar/
        0  08-21-2014 08:13   Configurations2/statusbar/
        0  08-21-2014 08:13   Configurations2/images/Bitmaps/
        0  08-21-2014 08:13   Configurations2/floater/
        0  08-21-2014 08:13   Configurations2/toolbar/
        0  08-21-2014 08:13   Configurations2/popupmenu/
        0  08-21-2014 08:13   Configurations2/toolpanel/
        0  08-21-2014 08:13   Configurations2/menubar/
     1093  08-21-2014 08:13   META-INF/manifest.xml
---------                     -------
   107002                     17 files

content.xml contains the actual tabular data, so we can process it using XPath/XQuery/XSLT tools such as Zorba:

zorba --serialize-text -q '
  declare namespace office="urn:oasis:names:tc:opendocument:xmlns:office:1.0";
  declare namespace table="urn:oasis:names:tc:opendocument:xmlns:table:1.0";
  doc("3star_OpenDocument/content.xml")//office:spreadsheet
    /table:table[@table:name="Schedule"]
    /table:table-row[2]/table:table-cell[1]'

25 Aug 2014 09:00

LibreOffice even stored this timestamp in a machine-friendly way. We'll realise the advantages of this later.

zorba --serialize-text -q '
  declare namespace office="urn:oasis:names:tc:opendocument:xmlns:office:1.0";
  declare namespace table="urn:oasis:names:tc:opendocument:xmlns:table:1.0";
  string(doc("3star_OpenDocument/content.xml")//office:spreadsheet
    /table:table[@table:name="Schedule"]
    /table:table-row[2]/table:table-cell[1]/@office:date-value)'

2014-08-25T09:00:00

5 ★★★☆ CSV

We need one CSV file per sheet.

6 ★★★★ CSV for the Web

Cost and benefits of ★★★★ Web data

From here onwards, the original 5-star open data examples use RDF. We will continue with CSV for a while, taking it to its limits, to point out that open data on the Web is not only RDF. We will introduce RDF in a later section.

The following examples roughly conform to Linked CSV, which was one of the original proposals for an RDF-conforming specification of CSV. The CSV on the Web Working Group is now taking a different approach. Their Working Draft on Generating RDF from Tabular Data on the Web suggests leaving the CSV untouched but providing complementary, external metadata annotations, e.g., in the form of JSON. This tutorial sticks with the simpler Linked CSV approach, which is self-contained in CSV.

6.1 Links using Web-scale identifiers

An example from the 3.5-star CSV:

Time,Event,Type,Presenter,Location
...
27 Aug 2014 09:00,Wikidata,Keynote,Markus Krötzsch,
27 Aug 2014 10:15,Working with Wikidata: A Hands-on Guide for Researchers and Developers,Tutorial,Markus Krötzsch,

Name,Affiliation,Town,Country
...
Markus Krötzsch,TU Dresden,Dresden,Germany

How do we know that both rows refer to the same presenter?
How can we make this connection Web-safe? (There might be others by the same name; how about this person on Facebook?)

Give the presenter a unique identifier! On the Web, this means using a URI (Uniform Resource Identifier).

Time,Event,Type,Presenter,Location
...
2014-08-27T09:00:00+02:00,Wikidata,Keynote,http://purl.org/net/wiss2014/presenters/#markus,
2014-08-27T10:15:00+02:00,Working with Wikidata: A Hands-on Guide for Researchers and Developers,Tutorial,http://purl.org/net/wiss2014/presenters/#markus,

$id,Name,Affiliation,Town,Country
...
http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany

(The timestamp format has also changed; we'll discuss this next.)

It is good practice to …

use HTTP URLs for such URIs,
choose them from a namespace that you own,
publish a machine-comprehensible, self-describing representation of the things identified by these URIs at that same URL,
so that any client who wants to know something about these things can easily look it up by downloading.

This approach is called linked data.

Linked data is essential for the Semantic Web – “a framework that allows data to be shared and reused across application, enterprise, and community boundaries”.

6.1.1 Dereferencing Linked Data Identifiers

The presenters in the summer school are now identified by URIs such as http://purl.org/net/wiss2014/presenters/#markus. As these are HTTP URLs, they can be dereferenced in order to download a description of a person. This is easiest to do by entering the URL into the address bar of a web browser, but a command-line HTTP client such as wget or cURL gives you more control.

wget -O - --header 'Accept: text/csv' 'http://purl.org/net/wiss2014/presenters/#markus'

--2015-09-02 11:21:11--  http://purl.org/net/wiss2014/presenters/
Resolving purl.org (purl.org)... 132.174.1.35
Connecting to purl.org (purl.org)|132.174.1.35|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/ [following]
--2015-09-02 11:21:11--  http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/
Resolving www.iai.uni-bonn.de (www.iai.uni-bonn.de)... 131.220.8.244
Connecting to www.iai.uni-bonn.de (www.iai.uni-bonn.de)|131.220.8.244|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/index.csv [following]
--2015-09-02 11:21:11--  http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/index.csv
Reusing existing connection to www.iai.uni-bonn.de:80.
HTTP request sent, awaiting response... 200 OK
Length: 1499 (1.5K) [text/csv]
Saving to: 'STDOUT'
#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#soeren,Sören Auer,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#mathieu,Mathieu d'Aquin,,Milton Keynes,UK
,http://purl.org/net/wiss2014/presenters/#aba-sah,Aba-Sah Dadzie,University of Birmingham,Birmingham,UK
,http://purl.org/net/wiss2014/presenters/#jerome,Jérôme David,Université Pierre-Mendès-France;INRIA-LIG,Grenoble,France
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland
,http://purl.org/net/wiss2014/presenters/#paul,Paul Groth,VU Amsterdam,Amsterdam,Netherlands
,http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany
,http://purl.org/net/wiss2014/presenters/#christoph,Christoph Lange,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#axel,Axel Polleres,WU Wien,Vienna,Austria
,http://purl.org/net/wiss2014/presenters/#eric,Eric Prud'hommeaux,W3C,,
,http://purl.org/net/wiss2014/presenters/#harald,Harald Sack,"HPI, Universität Potsdam",Potsdam,Germany
,http://purl.org/net/wiss2014/presenters/#thomas,Thomas Steiner,Université Lyon;Google,Lyon,France
,http://purl.org/net/wiss2014/presenters/#antoine,Antoine Zimmermann,École des mines de Saint-Étienne,Saint-Étienne,France

     0K .                                                     100% 17.7M=0s

2015-09-02 11:21:11 (17.7 MB/s) - written to stdout [1499/1499]

I will not go into full detail, but here are some observations, in the order of appearance:

I actually published the data in a place easily accessible for me: my personal webspace at the University of Bonn.
To publish the data in a sustainable way, so that the identifiers keep working even if I leave the University of Bonn or the University of Bonn reorganises its IT infrastructure, I used the PURL (Persistent URL) redirection service.
The first redirect is due to the use of PURL.
The second redirect happens because we are using content negotiation to give data consumers a choice from multiple data formats. We will see another format, RDF/XML, below.
Instead of just the description of Markus Krötzsch, we get the descriptions of all presenters. This is because we lazily published all descriptions in the same file on the server and used hash (#) URIs for them. This approach is OK for small amounts of data. The part after the hash has to be interpreted by the client. Here, the client actually downloads http://purl.org/net/wiss2014/presenters/ from the server and then has to locate, inside the downloaded document, the fragment #markus by its own means.
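What "interpreted by the client" means for a hash URI can be sketched as follows. The snippet is a minimal illustration using only the Python standard library: it splits off the fragment, and then looks up the matching row in a small excerpt of the CSV response shown above instead of performing a real HTTP request. The variable names are my own.

```python
import csv
import io
from urllib.parse import urldefrag

uri = "http://purl.org/net/wiss2014/presenters/#markus"

# The client only ever requests the URI without its fragment.
document_url, fragment = urldefrag(uri)

# Excerpt of the CSV response shown above (header, type row, two presenters).
downloaded = (
    "#,$id,Name,Affiliation,Town,Country\n"
    "type,url,foaf:name,schema:affiliation,"
    "http://purl.org/net/wiss2014/vocab/#town,"
    "http://purl.org/net/wiss2014/vocab/#country\n"
    ",http://purl.org/net/wiss2014/presenters/#paul,"
    "Paul Groth,VU Amsterdam,Amsterdam,Netherlands\n"
    ",http://purl.org/net/wiss2014/presenters/#markus,"
    "Markus Krötzsch,TU Dresden,Dresden,Germany\n"
)

# Locating the described thing inside the document is the client's job:
# keep data rows (empty first column) whose $id equals the full URI.
matches = [
    row
    for row in csv.DictReader(io.StringIO(downloaded))
    if row["#"] == "" and row["$id"] == uri
]

print(document_url)
print(fragment)
print(matches[0]["Name"])
```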

Further background on publishing data on the Web can be found in the following specifications:

Cool URIs for the Semantic Web: how to choose the right URIs (hash vs. slash), how to design content negotiation
Best Practice Recipes for Publishing RDF Vocabularies (actually also addresses datasets, as vocabularies are just a special case of that): how to configure the Apache HTTP server for these settings

6.1.1.1 Dereferencing Example with cURL

Here is the same example as above, redone using cURL:

curl -i -H 'Accept: text/csv' -L 'http://purl.org/net/wiss2014/presenters/#markus'

HTTP/1.1 302 Moved Temporarily
Date: Wed, 02 Sep 2015 09:24:08 GMT
Server: 1060 NetKernel v3.3 - Powered by Jetty
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/
Content-Type: text/html; charset=iso-8859-1
X-Purl: 2.0; http://localhost:8080
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Length: 288

HTTP/1.1 302 Found
Date: Wed, 02 Sep 2015 09:24:08 GMT
Server: Apache
Location: http://www.iai.uni-bonn.de/~langec/wiss2014/presenters/index.csv
Content-Length: 248
Content-Type: text/html; charset=iso-8859-1

HTTP/1.1 200 OK
Date: Wed, 02 Sep 2015 09:24:08 GMT
Server: Apache
Last-Modified: Tue, 26 Aug 2014 04:44:11 GMT
ETag: "5db-50180f4611cc0"
Accept-Ranges: bytes
Content-Length: 1499
Content-Type: text/csv

#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#soeren,Sören Auer,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#mathieu,Mathieu d'Aquin,,Milton Keynes,UK
,http://purl.org/net/wiss2014/presenters/#aba-sah,Aba-Sah Dadzie,University of Birmingham,Birmingham,UK
,http://purl.org/net/wiss2014/presenters/#jerome,Jérôme David,Université Pierre-Mendès-France;INRIA-LIG,Grenoble,France
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland
,http://purl.org/net/wiss2014/presenters/#paul,Paul Groth,VU Amsterdam,Amsterdam,Netherlands
,http://purl.org/net/wiss2014/presenters/#markus,Markus Krötzsch,TU Dresden,Dresden,Germany
,http://purl.org/net/wiss2014/presenters/#christoph,Christoph Lange,Universität Bonn;Fraunhofer IAIS,Bonn,Germany
,http://purl.org/net/wiss2014/presenters/#axel,Axel Polleres,WU Wien,Vienna,Austria
,http://purl.org/net/wiss2014/presenters/#eric,Eric Prud'hommeaux,W3C,,
,http://purl.org/net/wiss2014/presenters/#harald,Harald Sack,"HPI, Universität Potsdam",Potsdam,Germany
,http://purl.org/net/wiss2014/presenters/#thomas,Thomas Steiner,Université Lyon;Google,Lyon,France
,http://purl.org/net/wiss2014/presenters/#antoine,Antoine Zimmermann,École des mines de Saint-Étienne,Saint-Étienne,France

6.2 Datatypes

With an alternative export configuration, the 3.5-star CSV may have ended up like this:

Time,Event,Type,Presenter,Location
08/25/2014 09:00:00,Introduction,,,
08/25/2014 09:15:00,Keynote,Keynote,Stefan Decker,

08/25/2014 is sufficiently unambiguous, but what does 01/02/03 mean?

1 February 2003?
2 January 2003?
3 February 2001?
…?
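The ambiguity is easy to reproduce. The following sketch parses the same string under three format assumptions with Python's datetime.strptime; each call succeeds, and each yields a different day. The labels in the dictionary are only descriptive names chosen for this example.

```python
from datetime import datetime

ambiguous = "01/02/03"

# Three plausible conventions for the same string.
readings = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}

# Every format parses without error, so a client cannot detect
# the intended meaning from the data alone.
parsed = {
    name: datetime.strptime(ambiguous, fmt).date()
    for name, fmt in readings.items()
}

for name, day in parsed.items():
    print(name, day.isoformat())
```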

If we don't know how to interpret date entries, we can't answer queries such as “when is the first coffee break”.

Also, if your family in a different time zone wanted to phone you during the lunch break, how would they know that 09:00:00 is in CEST?

So let's use an ISO 8601 conforming date and time format, with time zone information:

Time,Event,Type,Presenter,Location
2014-08-25T09:00:00+02:00,Introduction,,,
2014-08-25T09:15:00+02:00,Keynote,Keynote,http://purl.org/net/wiss2014/presenters/#stefan,
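With the offset included, a client no longer has to guess. As a small sketch (standard library only, Python 3.7 or later), the first timestamp can be parsed without any format hint and converted to UTC, or to whatever time zone the caller is in:

```python
from datetime import datetime, timezone

# No format string needed: the value describes itself.
start = datetime.fromisoformat("2014-08-25T09:00:00+02:00")

# The +02:00 offset (CEST) travels with the value ...
offset_hours = start.utcoffset().total_seconds() / 3600

# ... so converting to another time zone is unambiguous.
start_utc = start.astimezone(timezone.utc)

print(offset_hours)
print(start_utc.isoformat())
```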

7 ★★★★☆ CSV with a schema

Let's continue to make our CSV even more self-describing, by introducing a schema (also called vocabulary on the Web of Data, or ontology, especially when it involves more complex formal logic).

7.1 A vocabulary of domain-specific concepts

We introduced linked data style URIs for the presenters (so that they describe themselves); let's also do it for other concepts, e.g. the types of presentations.

Let's introduce a domain-specific vocabulary.

Instead of a string "Keynote" let's use a self-describing URI:

,2014-08-25T09:15:00+02:00,Keynote,http://purl.org/net/wiss2014/vocab/#Keynote,http://purl.org/net/wiss2014/presenters/#stefan,

And let's create another CSV file for the vocabulary, where we define our terms:

$id,label,description,see also
#Keynote,keynote,a talk that establishes a theme,http://en.wikipedia.org/wiki/Keynote

The relative URI #Keynote works out if this file is published at http://purl.org/net/wiss2014/vocab/.
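How a relative URI is resolved against the location of the file can be checked with urljoin from the Python standard library. The second base URL below is a made-up example, included only to show that the result depends on where the file is published.

```python
from urllib.parse import urljoin

# Base URL at which the vocabulary file is published.
base = "http://purl.org/net/wiss2014/vocab/"
keynote = urljoin(base, "#Keynote")

# Hypothetical alternative location: the same relative URI
# would then identify a different resource.
elsewhere = urljoin("http://example.org/other/", "#Keynote")

print(keynote)
print(elsewhere)
```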

7.2 An explicit description of types

We introduced ISO 8601 timestamps, but how does a client know, without having to resort to heuristics, that the first column of schedule.csv is intended to be an ISO 8601 timestamp?

Time,Event,Type,Presenter,Location
2014-08-25T09:00:00+02:00,Introduction,,,

We also introduced a vocabulary, but how do we make explicit what we mean by “label”, “description” and “see also”?

Let's explicitly indicate the types!

For the timestamps and other entries in the schedule:

#,Time,Event,Type,Presenter,Location
type,time,string,url,url,string
,2014-08-25T09:00:00+02:00,Introduction,,,

(We'll get to the structure of the new, first column later.)

For the properties of vocabulary terms:

$id,label,description,see also
url,rdfs:label,rdfs:comment,rdfs:seeAlso
#Keynote,keynote,a talk that establishes a theme,http://en.wikipedia.org/wiki/Keynote

rdfs: is a well-known prefix that abbreviates a URI. rdfs:label (actually: http://www.w3.org/2000/01/rdf-schema#label) once more is a vocabulary term, in the widely used standard vocabulary RDF Schema. Its rdfs:comment is “A human-readable name for the subject.”. So, RDF Schema is a vocabulary for describing vocabularies. Such vocabularies are also known as ontology languages.

7.3 Distinguishing data and metadata

When a CSV has a type declaration row such as url,rdfs:label,rdfs:comment,rdfs:seeAlso, how do we know that this is metadata rather than data?

Let's make it explicit!

#,Time,Event,Type,Presenter,Location
type,time,string,url,url,string
,2014-08-25T09:00:00+02:00,Introduction,,,

When the first column has a type entry, we are in the type declaration row.
An empty first column means “data”.
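A client could apply these two rules roughly as follows. This is a minimal sketch of the convention described here, not a complete Linked CSV parser; the function name split_rows is hypothetical, and first-column markers other than "type" and the empty string are simply ignored.

```python
import csv
import io

linked_csv = (
    "#,Time,Event,Type,Presenter,Location\n"
    "type,time,string,url,url,string\n"
    ",2014-08-25T09:00:00+02:00,Introduction,,,\n"
)


def split_rows(text):
    """Separate the type declaration row from the data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    columns = header[1:]  # skip the marker column "#"
    types, data = {}, []
    for row in reader:
        marker, cells = row[0], row[1:]
        if marker == "type":
            # Metadata: one type per column.
            types = dict(zip(columns, cells))
        elif marker == "":
            # Data: an ordinary record.
            data.append(dict(zip(columns, cells)))
    return types, data


types, data = split_rows(linked_csv)
print(types["Time"])
print(data[0]["Event"])
```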

7.4 More precise types for data columns

Is the title of an event really just a string?
Is the presenter really just a URI (that happens to point to a presenter)?

No! – Let's also reuse some standard vocabularies here!

Schedule:

#,Time,Event,Type,Presenter,Location
type,dct:date,dct:title,rdf:type,http://id.loc.gov/vocabulary/relators/pre,http://linkedevents.org/ontology/atPlace
,2014-08-25T09:15:00+02:00,Keynote,http://purl.org/net/wiss2014/vocab/#Keynote,http://purl.org/net/wiss2014/presenters/#stefan,

Presenters:

#,$id,Name,Affiliation,Town,Country
type,url,foaf:name,schema:affiliation,http://purl.org/net/wiss2014/vocab/#town,http://purl.org/net/wiss2014/vocab/#country
,http://purl.org/net/wiss2014/presenters/#soeren,Sören Auer,Universität Bonn;Fraunhofer IAIS,Bonn,Germany

We found a lot of reusable terms in standard vocabularies.
Linked Open Vocabularies (LOV) is a search engine that helps with this task.
Where we didn't find perfectly reusable terms, we defined our own, in our vocabulary.

8 ★★★★★ RDF (and a comparison to CSV)

Cost and benefits of ★★★★★ Web data

The RDF data model is more widely used for linked data than CSV.

Whenever a URI conforms to linked data, you can expect RDF there (usually in the ugly but widely supported RDF/XML encoding).

Let's therefore redo our example in RDF, and discuss some differences from CSV.

data.ttl (Turtle, human-friendly)
presenters.rdf, schedule.rdf, vocab.rdf (RDF/XML, widely understood by machines)

(For purely pragmatic reasons, the Turtle, which is what I edit, is all-in-one, whereas the RDF/XML is in split files for easier deployment.)

<#day1intro>
        dct:date "2014-08-25T09:00:00+02:00"^^xsd:dateTime ;
        dct:title "Introduction" .

CSV is based on records (one per row, with a fixed number of columns).

RDF is based on triples (subject–predicate–object statements).

Usually more than one triple belongs to a subject (resource), which is why it's convenient to group them.

Usually every resource has an identifier. (In the CSV, our events didn't have any.)

You can state the datatype of an object precisely, but you also always have to do so, unless the datatype is string.

<#day1keynote>
        a wv:Keynote ;
        dct:date "2014-08-25T09:15:00+02:00"^^xsd:dateTime ;
        dct:title "Keynote" ;
        marcrel:pre <http://purl.org/net/wiss2014/presenters/#stefan> .

It's no problem for different resources to have different numbers of properties.

Compare sparsely populated CSV:

#,Time,Event,Type,Presenter,Location
type,dct:date,dct:title,rdf:type,http://id.loc.gov/vocabulary/relators/pre,schema:location
,2014-08-25T09:00:00+02:00,Introduction,,,

On the other hand, the CSV data model has a built-in order of rows and columns, which RDF does not have. Order can be expressed in RDF, but doing so adds considerable complexity. In the specification on Generating RDF from Tabular Data on the Web, compare the “minimal” RDF representation of a CSV table to the “standard” representation that preserves information about the tabular structure.

For one subject and predicate, there can be multiple objects. In the CSV we had to cheat:

,2014-08-26T18:00:00+02:00,Hackathon dinner,http://purl.org/net/wiss2014/vocab/#Dinner;http://purl.org/net/wiss2014/vocab/#Hackathon,,Maison des Élèves
,http://purl.org/net/wiss2014/presenters/#stefan,Stefan Decker,INSIGHT;National University of Ireland,Galway,Ireland

In RDF, that's no problem:

<#day2hackathondinner>
        rdf:type wv:Dinner, wv:Hackathon ;
        dct:date "2014-08-26T18:00:00+02:00"^^xsd:dateTime ;
        dct:title "Hackathon dinner" ;
        schema:location "Maison des Élèves" .

<http://purl.org/net/wiss2014/presenters/#stefan>
        foaf:name "Stefan Decker" ;
        schema:affiliation "INSIGHT", "National University of Ireland" ;
        wv:town "Galway" ;
        wv:country "Ireland" .
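To see how the semicolon "cheat" in the CSV relates to the RDF version, here is a small sketch that expands one multi-valued cell into one subject–predicate–object triple per value. The helper cell_to_triples is hypothetical, and treating ";" as a separator is the ad-hoc convention of our CSV files rather than anything a generic CSV tool would know about.

```python
def cell_to_triples(subject, predicate, cell, separator=";"):
    """Turn one CSV cell with several values into one triple per value."""
    return [
        (subject, predicate, value.strip())
        for value in cell.split(separator)
        if value.strip()
    ]


triples = cell_to_triples(
    "http://purl.org/net/wiss2014/presenters/#stefan",
    "schema:affiliation",
    "INSIGHT;National University of Ireland",
)

for triple in triples:
    print(triple)
```

A value that itself contains a semicolon would be split in the wrong place, which is one reason why this convention is a workaround and not a clean solution.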

Vocabulary definitions are no problem in RDF either, as RDF Schema itself has an RDF-based syntax:

wv:Hackathon
        rdfs:label "hackathon" ;
        rdfs:comment "an event of intensive collaboration on a software project" ;
        rdfs:seeAlso <http://dbpedia.org/resource/Hackathon> .

Here, we introduced a custom prefix to abbreviate the URI of our vocabulary. Here's how prefixes are declared:

@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix marcrel: <http://id.loc.gov/vocabulary/relators/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix wv: <http://purl.org/net/wiss2014/vocab/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

This is just syntactic sugar, not part of the RDF data model.
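That prefixes are mere abbreviations can be illustrated by expanding them by hand. The following sketch uses the prefix table declared above; the function expand is a hypothetical helper, and it deliberately ignores the finer points of Turtle syntax (escaping, the empty prefix, and so on).

```python
# Prefix table copied from the Turtle declarations above.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "marcrel": "http://id.loc.gov/vocabulary/relators/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/",
    "wv": "http://purl.org/net/wiss2014/vocab/#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(name):
    """Replace a known prefix by its namespace URI; leave other names alone."""
    prefix, _, local = name.partition(":")
    if prefix in PREFIXES and not local.startswith("//"):
        return PREFIXES[prefix] + local
    return name


print(expand("rdfs:label"))
print(expand("wv:Hackathon"))
print(expand("http://dbpedia.org/resource/Hackathon"))
```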

Note that the rdfs:seeAlso link points to DBpedia. DBpedia is a linked dataset extracted from Wikipedia.

8.1 Publishing RDF

Linked data clients usually expect data to be published as RDF, and RDF/XML is the most widely supported serialization of RDF. Therefore, we have also published our data as RDF/XML:

wget --quiet -O - --header 'Accept: application/rdf+xml' 'http://purl.org/net/wiss2014/presenters/#markus'

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://purl.org/net/wiss2014/presenters/#stefan">
    <ns1:country xmlns:ns1="http://purl.org/net/wiss2014/vocab/#">Ireland</ns1:country>
    <ns2:town xmlns:ns2="http://purl.org/net/wiss2014/vocab/#">Galway</ns2:town>
    <ns3:affiliation xmlns:ns3="http://schema.org/">INSIGHT</ns3:affiliation>
    <ns4:affiliation xmlns:ns4="http://schema.org/">National University of Ireland</ns4:affiliation>
    <ns5:name xmlns:ns5="http://xmlns.com/foaf/0.1/">Stefan Decker</ns5:name>
  </rdf:Description>
</rdf:RDF>

A few notes:

This RDF/XML was auto-generated from the Turtle source and therefore looks a bit unfriendly.
Additionally, it is good practice to also publish a human-comprehensible version of your data in HTML. Here, we did not do this.
We configured RDF/XML to be the content served by default. Therefore, it is also served when no specific content type is requested via the Accept HTTP request header.

This is the .htaccess configuration file that implements this behaviour in the Apache web server:

AddType application/rdf+xml .rdf
AddType text/csv .csv

RewriteEngine On
RewriteBase /~langec/wiss2014/
RewriteCond %{HTTP_ACCEPT} !application/rdf\+xml.*(text/csv)
RewriteCond %{HTTP_ACCEPT} text/csv
RewriteRule ^(presenters|schedule|vocab)/$ $1/index.csv [R=302]

RewriteRule ^(presenters|schedule|vocab)/$ $1/index.rdf [R=302]
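The decision logic of these rules can be modelled in a few lines of Python. This is a simplified sketch of the two RewriteCond patterns only, under the assumption that both conditions must hold for the CSV rule to fire; it does not reproduce everything Apache does, and the function name negotiate is hypothetical. Like the rules themselves, it ignores q-values in the Accept header.

```python
import re


def negotiate(accept_header):
    """Pick the file to redirect to, mimicking the two RewriteCond lines."""
    # First condition (negated): RDF/XML is not listed before CSV.
    rdf_before_csv = re.search(r"application/rdf\+xml.*(text/csv)", accept_header)
    # Second condition: CSV is acceptable at all.
    wants_csv = re.search(r"text/csv", accept_header)
    if wants_csv and not rdf_before_csv:
        return "index.csv"
    # Fallback rule: RDF/XML is served by default.
    return "index.rdf"


for header in ["text/csv", "", "application/rdf+xml, text/csv", "text/html"]:
    print(repr(header), "->", negotiate(header))
```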

9 ★★★★★☆ Further possible improvements

Additional stars have been suggested for publishing data …

… that uses standard schemas – we've done this already.
… whose quality has been checked – our group does research on this.

Also recall that our original use case started from an HTML homepage. With the following standards it's possible to embed linked data into HTML:

Microformats (very basic)
Microdata (more powerful; emphasizes syntactic conciseness)
RDFa (widest support of the RDF data model) – try it with http://rdfa.info/play/!

10 Credits

The idea for this tutorial was inspired by Antoine Zimmermann. The motivation was to prepare something for the 2014 Web Intelligence Summer School “Web of Data” that's not too heavily biased towards RDF.

This summer school was funded by