RDF Glossary

From ActiveArchives
Discussions around RDF and the Semantic Web tend to rely on a lot of jargon and acronyms. This page attempts to clarify some of the key terms with specific relevance to the AA project. The W3C also maintains a comprehensive list of Semantic Web tools.

RDF

Resource Description Framework, a "standard model for data interchange on the Web". See the RDF description on the W3C site. RDF has many layers and (optional) tools, which can make the topic seem overwhelming at first.

At its core, however, RDF proposes a quite simple representation of data: it adds a notion of "type" or "flavor" to the common web link. By giving links a "relation" property, information can be represented as "triples" made of three parts: the subject (where the link is coming from), the object (where the link is pointing to), and the predicate (the "type" or flavor of the link), each representable by a URL. (In practice, objects can also be "literals": data in the form of text, a number or a date.) Because a program can "crawl" from one URL to another, reading the associated "flavored links" or "triples" as it goes, RDF can be said to create a web of "Linked Data".
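The idea of triples and crawling can be sketched in a few lines of Python. This is only an illustration: the example.org URLs, the memberOf predicate and the helper names are made up for the sketch, and the test for "is this object a URL or a literal" is deliberately crude. Only the two Dublin Core predicate URLs are real vocabulary terms.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URL the link starts from
    predicate: str  # URL naming the "flavor" of the link
    obj: str        # URL the link points to, or a literal value


EX = "http://example.org/"  # hypothetical namespace
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

triples = [
    Triple(EX + "video/1", DC_TITLE, "Interview, part 1"),       # literal object
    Triple(EX + "video/1", DC_CREATOR, EX + "people/alice"),     # URL object
    Triple(EX + "people/alice", EX + "terms/memberOf", EX + "groups/aa"),
]


def links_from(subject, triples):
    """Return (predicate, object) pairs for every triple starting at subject."""
    return [(t.predicate, t.obj) for t in triples if t.subject == subject]


def crawl(start, triples):
    """Follow URL objects outward from start; return the set of URLs reached."""
    seen, queue = set(), [start]
    while queue:
        node = queue.pop()
        if node in seen:
            continue
        seen.add(node)
        for _, obj in links_from(node, triples):
            if obj.startswith("http"):  # crude check: treat anything else as a literal
                queue.append(obj)
    return seen


reached = crawl(EX + "video/1", triples)
```

Starting from the video, the crawl reaches the person and then the group, while the title is read as a literal and not followed.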


N3

N3 or "Notation 3" is an RDF representation that is compact and relatively easy to read. See http://www.w3.org/DesignIssues/Notation3.html
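To give a feel for the notation, here is a minimal sketch that writes one triple in the simplest N3 form, "<subject> <predicate> object ." with full URLs in angle brackets and literals in double quotes. The function name and the example.org URL are assumptions made for the illustration, and the escaping only covers backslashes and double quotes.

```python
def to_n3_line(subject, predicate, obj, obj_is_literal=False):
    """Format one triple as a single N3 statement using full URLs."""
    if obj_is_literal:
        # Escape backslashes first, then double quotes, inside the literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    else:
        obj_term = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


line = to_n3_line(
    "http://example.org/video/1",              # hypothetical resource
    "http://purl.org/dc/elements/1.1/title",   # Dublin Core "title"
    "Interview, part 1",
    obj_is_literal=True,
)
print(line)
```

N3 also allows prefixes to be declared (for example "@prefix dc: <http://purl.org/dc/elements/1.1/> .") so that the predicate can be shortened to dc:title, which is where much of its compactness comes from.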


RDFa

A means of encoding RDF data in the contents of a "regular" HTML webpage. Inspired by the ad hoc practices of the microformats community.
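Assuming the entry above describes RDFa, the following sketch shows roughly how triples can be read back out of ordinary HTML. It uses only the "about" and "property" attributes and Python's standard HTML parser; the class name and the sample page are invented for the illustration. A real RDFa processor also handles nesting, "rel", "resource", "content" and prefix expansion, none of which is attempted here (the dc: prefix is left as written).

```python
from html.parser import HTMLParser


class SimpleRDFaReader(HTMLParser):
    """Collect (subject, property, text) triples from about/property attributes."""

    def __init__(self):
        super().__init__()
        self.current_subject = None   # most recent "about" value seen
        self.pending_property = None  # property waiting for its text
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.current_subject = attrs["about"]
        if "property" in attrs:
            self.pending_property = attrs["property"]

    def handle_data(self, data):
        text = data.strip()
        if self.pending_property and text:
            self.triples.append((self.current_subject, self.pending_property, text))
            self.pending_property = None


page = """
<div about="http://example.org/video/1">
  <h1 property="dc:title">Interview, part 1</h1>
  <p>Recorded by <span property="dc:creator">Alice</span>.</p>
</div>
"""

reader = SimpleRDFaReader()
reader.feed(page)
reader.close()
```

The page still reads as a normal webpage in a browser; the extra attributes are what let a program recover the title and creator without guessing.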


Scraping

The practice of programmatically adding (guessing) structure based on an unstructured textual source. The semantic web aims to "solve" the problem of scraping by allowing explicit structures to be embedded in webpages and other online resources.
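A small sketch of what "guessing" structure means in practice: the function below looks for something shaped like a date in free text. The sentence, the function name and the pattern are all made up for the illustration, and the pattern would miss or misread many real-world date formats, which is exactly the fragility that explicit, embedded structure is meant to avoid.

```python
import re

text = "Interview with Alice Example, recorded on 12 March 2009 in Brussels."


def scrape_date(text):
    """Guess a 'day Month year' date in free text; return (day, month, year) or None."""
    match = re.search(r"\b(\d{1,2}) ([A-Z][a-z]+) (\d{4})\b", text)
    if match is None:
        return None
    day, month, year = match.groups()
    return int(day), month, int(year)


guess = scrape_date(text)
```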

Relational database

Traditionally, information in a "database" (online or otherwise) is stored in tables (like spreadsheets) of data, organized by columns (different "fields" or types of information) and rows (the specific "records" of data related to a single item), and references between these tables (the relations).

RDF is a break from this tradition in that its graph-based approach allows for much more flexibility than the tabular form of a relational database.
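The contrast can be sketched with Python's built-in sqlite3 module on one side and a plain set of triples on the other. The table layout, names and data are invented for the illustration. In the tabular version, columns are fixed in advance and a relation is followed with a JOIN; in the graph version, a new kind of information is just one more triple.

```python
import sqlite3

# Relational form: fixed columns, rows, and a reference between tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT, "
    "creator_id INTEGER REFERENCES people(id))"
)
conn.execute("INSERT INTO people VALUES (1, 'Alice')")
conn.execute("INSERT INTO videos VALUES (1, 'Interview, part 1', 1)")

row = conn.execute(
    "SELECT videos.title, people.name "
    "FROM videos JOIN people ON videos.creator_id = people.id"
).fetchone()

# Graph form: the same facts as (subject, predicate, object) triples.
graph = {
    ("video/1", "title", "Interview, part 1"),
    ("video/1", "creator", "people/alice"),
    ("people/alice", "name", "Alice"),
}

# Recording a location needs no schema change, only another triple.
graph.add(("video/1", "location", "Brussels"))
```

Adding "location" to the relational version would mean altering the videos table (or creating a new one) before any data could be stored.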

Linked (Open) Data

Linked (Open) Data (or LOD) means that archived items are served in both human-readable and machine-readable forms. Harvester responds according to the headers of the HTTP request, so that a browser will see a descriptive page about a given resource, whereas a machine agent will receive RDF in XML format (or, indeed, whatever it has requested, so long as Harvester knows how to produce it).
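This behaviour is commonly called content negotiation. The sketch below is not Harvester's actual code; it assumes that the header in question is the HTTP Accept header, and the function name and the SUPPORTED table are hypothetical. It also ignores the "q" preference weights that a full implementation would honour.

```python
# Media types this hypothetical server knows how to produce.
SUPPORTED = {
    "text/html": "html",                # descriptive page for browsers
    "application/rdf+xml": "rdfxml",    # RDF in XML format for machine agents
}


def choose_format(accept_header, default="html"):
    """Pick the first supported media type listed in an Accept header."""
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.8" and normalise the media type.
        media_type = item.split(";")[0].strip().lower()
        if media_type in SUPPORTED:
            return SUPPORTED[media_type]
    return default


browser = choose_format("text/html,application/xhtml+xml,*/*;q=0.8")
agent = choose_format("application/rdf+xml")
```

The same URL can therefore answer a browser with a page and a harvesting program with RDF, falling back to the page when nothing requested is supported.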


SPARQL

A formal language for querying (requesting specific information from) an RDF database. (The name echoes SQL, the language used extensively today to work with a #Relational database.)
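The core idea behind such a query, matching a pattern of fixed values and open "variables" against the triples, can be imitated in plain Python. This is a toy illustration rather than SPARQL itself: the function name and data are invented, and None stands in for a query variable.

```python
triples = [
    ("video/1", "creator", "people/alice"),
    ("video/2", "creator", "people/bob"),
    ("video/1", "title", "Interview, part 1"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "Who created what?" leaves subject and object open.
all_creators = match(triples, predicate="creator")
# "Who created video/1?" fixes the subject as well.
creator_of_first = match(triples, subject="video/1", predicate="creator")
```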

SPARQL endpoint

A (public) URL from which an outside program can request and retrieve specific information using #SPARQL. In this way, an organization can choose to make the contents of its archives available to outside tools that use the RDF/SPARQL standards.

Examples of SPARQL queries on #dbpedia
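As a rough sketch of how a program talks to an endpoint: the query text is sent as the "query" parameter of an ordinary HTTP request. The code below only builds the request URL and does not contact any server. The endpoint address is assumed to be dbpedia's public one, and the resource URL in the query is chosen purely for illustration.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # assumed endpoint address

# Ask for up to five labels attached to one (illustrative) resource.
QUERY = """
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Brussels>
      <http://www.w3.org/2000/01/rdf-schema#label> ?label .
}
LIMIT 5
"""


def build_query_url(endpoint, query):
    """Return a GET URL carrying the SPARQL query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
```

Fetching this URL with any HTTP client would return the matching results in whichever result format the endpoint offers.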


dbpedia

dbpedia is a project with roots in European academia that offers a large collection of RDF-encoded data under a free license. The data have been gathered from a variety of sources (including Wikipedia). dbpedia offers a SPARQL endpoint as the (main) means of accessing its collection.

Freebase (Metaweb)

Freebase is a commercial service offering data in RDF format. The project is similar to dbpedia but offers a much more polished user interface for members of the public to help add to and correct its collection of "facts" (much in the way that Wikipedia allows anyone to be an editor of its articles). Metaweb is the company behind Freebase and was bought by Google in July 2010.



OAI-PMH

Open Archives Initiative - Protocol for Metadata Harvesting

A protocol for finding and retrieving metadata. It's not meant for queries, per se. That is to say, it can return all items with a specific metadata schema, but not all the items that have a certain 'Creator' or 'Date'.
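The limitation shows up directly in the shape of a harvesting request. The sketch below builds a ListRecords request URL; the base URL and function name are hypothetical, while the verb and argument names (metadataPrefix, from, until, set) come from the protocol. Note that from and until filter on when a record was last changed in the repository, not on a descriptive 'Date' field, and there is no argument for something like 'Creator'.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for the given repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date    # record datestamp lower bound
    if until_date:
        params["until"] = until_date  # record datestamp upper bound
    if set_spec:
        params["set"] = set_spec      # repository-defined grouping
    return base_url + "?" + urlencode(params)


# Hypothetical repository address.
everything = list_records_url("http://example.org/oai")
recent = list_records_url("http://example.org/oai", from_date="2010-01-01")
```

Anything finer-grained than this, such as "all items by a given creator", has to be done by the harvesting side after the records have been fetched.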

