All that SPARQLs is not gold

From ActiveArchives

Tagging Matters

Morning

Introductions

Introduction to the larger context of the Active Archives project

Round of introductions (relations to Linked Data, acknowledging the different positions vis-à-vis Linked Data ...)

Screening: Web 3.0 (Kate Ray's documentary on the Semantic Web)

Why we (AA team) think Linked Data is interesting.

  • Decentralized data sharing
  • Searching your own way
  • "Private" writing spaces

Issues:

  • Historical link with "hard" AI / logic / decision-making utility (just the facts)
  • Centralizing (implication that centralized crawlers / knowledge stores are essential)
  • Complete disregard for the work of writing / tagging (cf. Bowker & Star, "Sorting Things Out": the invisible work of the database)

Morning Exercise

Installing Redland

Project Gutenberg offers their full catalog as RDF, either as a single text file (over 200MB!), or as one file per book. We can start by downloading a single book's RDF file.

Alice in Wonderland on Project Gutenberg

Books by anonymous authors: http://www.gutenberg.org/ebooks/search.html/?default_prefix=author_id&sort_order=title&query=216

Working in pairs with "rapper" (librdf/Redland/Linux)

  • Download the RDF of book "x" (pg11.rdf in the examples below) and page through it
less pg11.rdf
  • Use rapper to convert the RDF/XML format into Turtle
rapper --output turtle pg11.rdf

To save the output to a file instead:

rapper --output turtle pg11.rdf > pg11.ttl

Graphviz is a visualization program for graph data. Rapper can output a "dot" file which graphviz can then use to turn the data into a diagram.

  • Use rapper to output graphviz dot format, and then use graphviz to draw an SVG file
rapper --output dot pg
fdp -Tsvg pg12345.dot > pg12345.svg
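
For the book used in the earlier steps, the two commands might be chained as in the sketch below. The file names are assumptions derived from the book number 11, and the commands are only run when rapper, fdp and the RDF file are actually present.

```shell
# Hypothetical file names, derived from the book number used on this page.
book=11
rdf="pg${book}.rdf"
dot="pg${book}.dot"
svg="pg${book}.svg"

if command -v rapper >/dev/null 2>&1 && command -v fdp >/dev/null 2>&1 && [ -f "$rdf" ]; then
    rapper --output dot "$rdf" > "$dot"   # RDF/XML in, Graphviz dot out
    fdp -Tsvg "$dot" > "$svg"             # lay out the graph and render it as SVG
else
    echo "skipping: need rapper, fdp and $rdf" >&2
fi
```

fdp is one of several Graphviz layout programs; others such as dot or neato accept the same -Tsvg option and may give a more readable picture for some graphs.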

Outcome: the SVG reveals some things, but also obscures others through its density of linking and its completeness.

Next step: Filter & Combine with more context

Afternoon

SPARQL

SPARQL is a query language for Linked Data. It supports four query forms (SELECT, CONSTRUCT, ASK and DESCRIBE), of which SELECT and CONSTRUCT are the most useful here. The key difference is that SELECT returns tabular results, while CONSTRUCT returns a new RDF graph. SELECT therefore comes closest to the SQL language of relational databases (which are themselves table-oriented), and is often the most direct means of requesting information from an RDF file or store (database).

How could we filter the single RDF source and give it more context, using Linked Data?

Save this code as, say, "alice_01.rq":

SELECT ?p ?o FROM <pg11.rdf>
WHERE {
<http://www.gutenberg.org/ebooks/11> ?p ?o .
}

To perform the query, use roqet!

roqet alice_01.rq
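
One way to start filtering is a FILTER clause. The sketch below writes a second query file (the name alice_02.rq is an assumption) that keeps only the rows whose object is a literal, such as a title or a date, and drops the links to other resources. It relies only on the standard SPARQL test isLiteral, so it assumes nothing about which vocabulary the Gutenberg file uses.

```shell
# Write the filtered query to a file; alice_02.rq is a hypothetical name.
cat > alice_02.rq <<'EOF'
SELECT ?p ?o FROM <pg11.rdf>
WHERE {
  <http://www.gutenberg.org/ebooks/11> ?p ?o .
  FILTER ( isLiteral(?o) )
}
EOF

# Run it only when roqet and the data file are actually present.
if command -v roqet >/dev/null 2>&1 && [ -f pg11.rdf ]; then
    roqet alice_02.rq
fi
```

Swapping isLiteral for isURI would give the complementary view: only the links that lead out of the book's description.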

Roqet supports a number of results formats ("simple" is the default). A few are more useful for reading in the terminal or for passing on to other tools: table, tsv and csv.

roqet alice_01.rq -r table
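
Tab-separated output is also convenient to hand on to other command-line tools. As a sketch, the block below counts how often each predicate occurs. So that it runs without Redland installed, it works on a small hand-written file (sample.tsv, invented for illustration and not real roqet output, and assuming a first line that holds the variable names). With roqet available, the sample could be replaced by the output of roqet alice_01.rq -r tsv.

```shell
# Hypothetical sample standing in for: roqet alice_01.rq -r tsv > alice_01.tsv
# Two tab-separated columns; the first line holds the variable names.
printf '%s\t%s\n' \
    '?p' '?o' \
    '<http://example.org/title>' '"Alice"' \
    '<http://example.org/subject>' '"Fantasy"' \
    '<http://example.org/subject>' '"Children"' > sample.tsv

# Skip the header line, keep the predicate column, count the repeats.
counts=$(tail -n +2 sample.tsv | cut -f1 | sort | uniq -c | sort -rn)
echo "$counts"
```

The most frequent predicate ends up on the first line, which gives a quick impression of what kind of statements dominate a file.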

Output of "html" (an HTML table) is also possible, useful for viewing in a browser.

roqet alice_01.rq -r html > alice_01.html
firefox alice_01.html

NB: Because the results of a SELECT query are tabular, a graph format like "turtle" doesn't make sense for them (it actually makes the results harder to read). A tabular format like tab-separated values (tsv) is the most straightforward.
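
For comparison, here is a sketch of the CONSTRUCT form. Instead of a table it returns a new RDF graph, here just the statements whose subject is the book itself, so a graph format like Turtle becomes the natural output. The file names alice_construct.rq and alice_book.ttl are assumptions, and so is the -r turtle option, which presumes that your roqet build accepts serializer names for graph results.

```shell
# Write the CONSTRUCT query to a file; the file name is hypothetical.
cat > alice_construct.rq <<'EOF'
CONSTRUCT { <http://www.gutenberg.org/ebooks/11> ?p ?o }
FROM <pg11.rdf>
WHERE {
  <http://www.gutenberg.org/ebooks/11> ?p ?o .
}
EOF

# Run it only when roqet and the data file are actually present.
if command -v roqet >/dev/null 2>&1 && [ -f pg11.rdf ]; then
    roqet alice_construct.rq -r turtle > alice_book.ttl
fi
```

Because the result is itself RDF, it can be fed back into rapper, for instance to draw a smaller and more legible diagram than the one made from the whole file.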

Exercise: SPARQL Stories

morphology_pp92-93.gif

Tools & Resources
