RDF 1.2 Interoperability

Interoperability between RDF [=Basic=] and RDF [=Full=]

Should we make this section normative?

This section provides transformations between [=Full=] [=RDF graphs=] (respectively, [=RDF datasets=]) and [=Basic=] [=RDF graphs=] (respectively, [=RDF datasets=]), to provide some level of interoperability between the different classes of Conformance.

Should we go even further and aim to provide interoperability between RDF 1.1 and RDF 1.2 [=Full=]?

AT RISK: The Working Group may decide to replace the terms `rdf:TripleTerm`, `rdf:ttSubject`, `rdf:ttPredicate`, and `rdf:ttObject` used in this section with other terms, possibly in a different namespace.

These transformations are designed to be:

Information preserving: It must be possible to reconstruct the input graph (respectively, dataset) from the output graph (respectively, dataset). Note, however, that these transformations are not designed to preserve semantics: the output graph is not semantically [=equivalent=] to the input graph, at least not in the entailment regimes defined in [[RDF12-SEMANTICS]].
Idempotent: Applying a transformation several times to a graph (respectively, dataset) should have the same effect as applying it once. Moreover, [=basic encoding=] a graph (respectively, dataset) that already complies with RDF [=Basic=] (i.e., that contains no [=triple term=]) must result in the same graph (respectively, dataset).
Universal: It should be possible to transform any [=Full=] graph (respectively, dataset) to a [=Basic=] graph (respectively, dataset) using this method. There is one minor caveat to this property, described in the Limitations section.

From [=Full=] to [=Basic=]

Transforming an [=RDF graph=] so that it can be consumed by an RDF [=Basic=] implementation is called basic encoding. [=Basic encoding=] consists of repeating the following steps until no [=triple term=] [=appears=] in the graph, at which point the graph complies with RDF [=Basic=]: picking a [=triple term=] tt that [=appears=] in the graph; minting a fresh [=blank node=] b (i.e., a blank node not yet in use in the graph); replacing all occurrences of tt [=appearing=] in the graph with b; and then adding the following [=triples=] to the graph (where s, p, and o are respectively the [=subject=], [=predicate=], and [=object=] of tt):

(b, `rdf:type`, `rdf:TripleTerm`)
(b, `rdf:ttSubject`, s)
(b, `rdf:ttPredicate`, p)
(b, `rdf:ttObject`, o)

Note that this transformation is information preserving only when the input graph either has no [=triple term=] [=appearing=] in it, or contains no [=asserted triple=] (b, `rdf:type`, `rdf:TripleTerm`) where b is a [=blank node=]. Implementations that encounter a graph containing both MUST report an error. This limitation is discussed in the Limitations section.

The blank nodes generated to replace [=triple terms=] should not be confused with the [=reifiers=] that are typically associated with these [=triple terms=].
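The repeat-until-done description above can be sketched in Python as follows. This is only an illustrative sketch, not a normative algorithm: the `BNode` and `TripleTerm` classes, the representation of a graph as a set of 3-tuples with IRIs as plain strings, and the names `basic_encode_fixpoint`, `blank_nodes`, and `substitute` are all assumptions made for this example. It also assumes that [=triple terms=] occur only in object position.

```python
from dataclasses import dataclass
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE, TRIPLE_TERM = RDF + "type", RDF + "TripleTerm"
TT_SUBJECT, TT_PREDICATE, TT_OBJECT = (
    RDF + "ttSubject", RDF + "ttPredicate", RDF + "ttObject")

@dataclass(frozen=True)
class BNode:            # hypothetical blank node representation
    label: str

@dataclass(frozen=True)
class TripleTerm:       # hypothetical triple term representation
    s: object
    p: object
    o: object

def blank_nodes(term):
    """Yield every blank node in a term, including nested ones."""
    if isinstance(term, BNode):
        yield term
    elif isinstance(term, TripleTerm):
        for part in (term.s, term.p, term.o):
            yield from blank_nodes(part)

def substitute(term, tt, b):
    """Replace tt by b wherever it appears in object position, at any depth."""
    if term == tt:
        return b
    if isinstance(term, TripleTerm):
        return TripleTerm(term.s, term.p, substitute(term.o, tt, b))
    return term

def basic_encode_fixpoint(graph):
    g = set(graph)
    has_triple_term = any(isinstance(o, TripleTerm) for _, _, o in g)
    has_marker = any(isinstance(s, BNode) and p == RDF_TYPE and o == TRIPLE_TERM
                     for s, p, o in g)
    if has_triple_term and has_marker:
        # The unsupported situation: the encoding would not be reversible.
        raise ValueError("graph mixes triple terms and rdf:TripleTerm blank nodes")
    used = {b for triple in g for term in triple for b in blank_nodes(term)}
    labels = count()
    while True:
        # Pick any triple term that is the object of a triple.
        tt = next((o for _, _, o in g if isinstance(o, TripleTerm)), None)
        if tt is None:
            return g                      # no triple term left: graph is Basic
        b = BNode(f"tt{next(labels)}")    # mint a blank node not yet in use
        while b in used:
            b = BNode(f"tt{next(labels)}")
        used.add(b)
        g = {(s, p, substitute(o, tt, b)) for s, p, o in g}
        g |= {(b, RDF_TYPE, TRIPLE_TERM), (b, TT_SUBJECT, tt.s),
              (b, TT_PREDICATE, tt.p), (b, TT_OBJECT, tt.o)}
```

A nested [=triple term=] is handled by later iterations: once the outer term is replaced, its object becomes the object of an `rdf:ttObject` triple and is picked up in turn.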

[=Basic encoding=] an [=RDF dataset=] consists of [=basic encoding=] its [=default graph=] and each of its [=named graphs=]. In this case, the fresh [=blank node=] assigned to each [=triple term=] must not already be used in any graph of the dataset.

A detailed algorithm for this transformation is found in the Algorithms section.

Example

The examples in this section are expressed in the Turtle concrete syntax [[RDF12-TURTLE]].

From [=Basic=] to [=Full=]

Reverting a [=basic encoded=] graph to its original form consists of locating each [=asserted triple=] (b, `rdf:type`, `rdf:TripleTerm`) that has a [=blank node=] b as its subject, along with the three associated [=asserted triples=] that have the same [=blank node=] b as their subjects, i.e., (b, `rdf:ttSubject`, s), (b, `rdf:ttPredicate`, p), and (b, `rdf:ttObject`, o); removing these four [=triples=] from the graph; and replacing all remaining occurrences of b [=appearing=] in the graph with the [=triple term=] (s, p, o).

An implementation MUST report an error if, for a given b, it cannot unambiguously determine s, p, or o (i.e., if any of `rdf:ttSubject`, `rdf:ttPredicate`, or `rdf:ttObject` is missing or duplicated for b). An implementation MUST also report an error if the input graph contains both a [=triple term=] and an [=asserted triple=] (b, `rdf:type`, `rdf:TripleTerm`) where b is a [=blank node=]. Note that none of these situations can occur if the input graph was produced by the [=basic encoding=] transformation.

To revert a [=basic encoded=] [=RDF dataset=] to its original form, the transformation above is applied to its [=default graph=] and to each of its [=named graphs=].

Note that this transformation has no effect on any [=RDF graph=] or [=RDF dataset=] that does not use the `rdf:TripleTerm` type, including [=Full=] graphs or datasets containing [=triple terms=]. This makes this transformation idempotent as intended.
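The reverse transformation and its two error conditions can be sketched as follows. As before, this is a non-normative illustration under stated assumptions: `BNode`, `TripleTerm`, `basic_decode`, and the set-of-tuples graph representation are hypothetical names chosen for this example. The sketch substitutes b only in object position, and the guard against cyclic encodings is an addition of this sketch that the text above does not address.

```python
from dataclasses import dataclass

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE, TRIPLE_TERM = RDF + "type", RDF + "TripleTerm"
TT_PROPS = (RDF + "ttSubject", RDF + "ttPredicate", RDF + "ttObject")

@dataclass(frozen=True)
class BNode:            # hypothetical blank node representation
    label: str

@dataclass(frozen=True)
class TripleTerm:       # hypothetical triple term representation
    s: object
    p: object
    o: object

def basic_decode(graph):
    g = set(graph)
    # Blank nodes b with an asserted triple (b, rdf:type, rdf:TripleTerm).
    markers = {s for s, p, o in g
               if isinstance(s, BNode) and p == RDF_TYPE and o == TRIPLE_TERM}
    if not markers:
        return g        # rdf:TripleTerm is not used: nothing to revert
    if any(isinstance(o, TripleTerm) for _, _, o in g):
        raise ValueError("graph mixes triple terms and rdf:TripleTerm blank nodes")

    parts = {}
    for b in markers:
        values = []
        for prop in TT_PROPS:
            found = [o for s, p, o in g if s == b and p == prop]
            if len(found) != 1:
                raise ValueError(f"{prop} is missing or duplicated for {b}")
            values.append(found[0])
        parts[b] = tuple(values)

    def resolve(term, seen=()):
        """Rebuild the triple term encoded by a marker blank node."""
        if not (isinstance(term, BNode) and term in parts):
            return term
        if term in seen:    # assumption of this sketch: reject cycles
            raise ValueError("cyclic encoding")
        s, p, o = parts[term]
        return TripleTerm(s, p, resolve(o, seen + (term,)))

    out = set()
    for s, p, o in g:
        is_encoding_triple = s in markers and (
            p in TT_PROPS or (p == RDF_TYPE and o == TRIPLE_TERM))
        if not is_encoding_triple:
            out.add((s, p, resolve(o)))   # replace b in object position
    return out
```

Because the function returns its input unchanged when no blank node is typed as `rdf:TripleTerm`, applying it to its own output has no further effect.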

Limitations

The two transformations above explicitly do not support graphs or datasets containing at the same time a [=triple term=] and an [=asserted triple=] (b, `rdf:type`, `rdf:TripleTerm`) where b is a [=blank node=]. This means that the [=basic encoding=] transformation is not strictly universal.

This limitation should not be an issue in practice. The `rdf:TripleTerm` type is unlikely to be used in any published graph or dataset, as it was not defined prior to this specification. For this reason, using it would actually have been bad practice. For future graphs and datasets, this type should be considered to be reserved for use within the [=basic encoding=] transformation, and not used otherwise.

This is one reason why this transformation introduces new vocabulary terms (`rdf:TripleTerm`, `rdf:ttSubject`, `rdf:ttPredicate`, `rdf:ttObject`), rather than repurposing the existing reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`). Unlike `rdf:TripleTerm`, `rdf:Statement` is known to be found in widely used datasets (e.g., UniProt), so reserving its use for the [=basic encoding=] transformation was not an option.

Another consequence of this restriction is that implementers need to take care when merging graphs in an application that handles [=basic encoded=] graphs or datasets. The concern is that merging a [=Full=] [=RDF graph=] containing at least one [=triple term=] with a [=basic encoded=] [=RDF graph=] (which might contain [=blank node=] instances of `rdf:TripleTerm`) could result in a "hybrid" graph that can be transformed into neither a consistent [=Full=] nor a consistent [=Basic=] [=RDF graph=]. Therefore, such applications should [=basic encode=] every graph before merging. Conversely, applications supporting RDF [=Full=] should make sure to apply the reverse transformation to any graph that is known or likely to have been [=basic encoded=], to avoid creating such "hybrid" graphs. Since these transformations are designed to be idempotent, there is no harm in applying them more often than necessary.
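To illustrate how a "hybrid" graph arises, the following non-normative sketch classifies a graph by what it contains. The `classify` function, its return values, and the `BNode` and `TripleTerm` classes are assumptions made for this example, with a graph modeled as a set of 3-tuples and IRIs as plain strings. Plain set union stands in for graph merging here, which is only adequate because the example graphs share no blank node labels.

```python
from dataclasses import dataclass

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE, TRIPLE_TERM = RDF + "type", RDF + "TripleTerm"

@dataclass(frozen=True)
class BNode:            # hypothetical blank node representation
    label: str

@dataclass(frozen=True)
class TripleTerm:       # hypothetical triple term representation
    s: object
    p: object
    o: object

def classify(graph):
    """Return "full", "basic", "hybrid", or None for a plain graph."""
    # A nested triple term implies a triple term in object position,
    # so checking the objects of asserted triples is sufficient.
    has_triple_term = any(isinstance(o, TripleTerm) for _, _, o in graph)
    has_marker = any(isinstance(s, BNode) and p == RDF_TYPE and o == TRIPLE_TERM
                     for s, p, o in graph)
    if has_triple_term and has_marker:
        return "hybrid"     # neither transformation supports this graph
    if has_triple_term:
        return "full"
    return "basic" if has_marker else None

full = {(":a", ":says", TripleTerm(":s", ":p", ":o"))}
encoded = {(":c", ":says", BNode("b0")),
           (BNode("b0"), RDF_TYPE, TRIPLE_TERM),
           (BNode("b0"), RDF + "ttSubject", ":s"),
           (BNode("b0"), RDF + "ttPredicate", ":p"),
           (BNode("b0"), RDF + "ttObject", ":o")}
merged = full | encoded     # merging without encoding or decoding first
```

In this example `merged` is classified as `"hybrid"`, which is the situation the advice above is meant to prevent.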

Algorithms

The `basic-encode` algorithm

The algorithm expects one input variable Gᵢ which is an [=RDF graph=]. It returns a [=Basic=] [=RDF graph=].

Let Gₒ be an empty [=RDF graph=].
Let M be an empty map from [=triple terms=] to [=blank nodes=].
Let inputKind be `null`.
For each [=triple=] (s, p, o) in Gᵢ:
1. If s is a [=blank node=], p is `rdf:type` and o is `rdf:TripleTerm`, then:
  1. If inputKind is `"full"` then exit with an error.
  2. Otherwise, set inputKind to `"basic"`.
2. If o is a [=triple term=], then:
  1. If inputKind is `"basic"` then exit with an error.
  2. Otherwise, set inputKind to `"full"`.
  3. Let b, M' and G' be the result of invoking `basic-encode-triple-term` passing o as t and M as Mᵢ.
  4. Merge M' into M.
  5. Merge G' into Gₒ.
  6. Set o to b.
3. Add the [=triple=] (s, p, o) to Gₒ.
Return Gₒ.

The `basic-encode-triple-term` algorithm

This algorithm is responsible for incrementally populating the mapping M and the graph G used internally by the `basic-encode` algorithm. It receives a [=triple term=] as input and processes it recursively (in case its object is itself a [=triple term=]). It returns, among other things, the [=blank node=] minted to replace the [=triple term=] in the transformed [=Basic=] [=RDF graph=].

This algorithm expects two input variables: a [=triple term=] t, and a map Mᵢ from [=triple terms=] to [=blank nodes=]. It returns a [=blank node=] b, a map Mₒ from [=triple terms=] to [=blank nodes=], and a [=Basic=] [=RDF graph=] G.

Let Mₒ be an empty map.
Let G be an empty [=RDF graph=].
Let b be the [=blank node=] associated with t in Mᵢ, if any.
Otherwise:
1. Let s, p and o be the subject, predicate and object of t, respectively.
2. If o is a [=triple term=], then:
  1. Let b', M' and G' be the result of invoking `basic-encode-triple-term` passing o as t and Mᵢ as Mᵢ.
  2. Set o to b'.
  3. Merge M' into Mₒ.
  4. Merge G' into G.
3. Let b be a fresh blank node.
4. Add the association (t, b) to Mₒ.
5. Add the triples (b, `rdf:type`, `rdf:TripleTerm`), (b, `rdf:ttSubject`, s), (b, `rdf:ttPredicate`, p), and (b, `rdf:ttObject`, o) to G.
Return b, Mₒ and G.
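The two algorithms above can be transcribed into Python roughly as follows. This is a non-normative sketch: the `BNode` and `TripleTerm` classes, the set-of-tuples graph representation, and the `fresh_bnode_factory` helper are assumptions of this example. In particular, the extra `fresh` parameter is not part of the algorithm's stated inputs; it is simply one way to obtain fresh blank nodes.

```python
from dataclasses import dataclass
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_TYPE, TRIPLE_TERM = RDF + "type", RDF + "TripleTerm"
TT_SUBJECT, TT_PREDICATE, TT_OBJECT = (
    RDF + "ttSubject", RDF + "ttPredicate", RDF + "ttObject")

@dataclass(frozen=True)
class BNode:            # hypothetical blank node representation
    label: str

@dataclass(frozen=True)
class TripleTerm:       # hypothetical triple term representation
    s: object
    p: object
    o: object

def fresh_bnode_factory(graph):
    """Hypothetical helper: returns a function minting unused blank nodes."""
    used = set()
    def collect(term):
        if isinstance(term, BNode):
            used.add(term.label)
        elif isinstance(term, TripleTerm):
            for part in (term.s, term.p, term.o):
                collect(part)
    for triple in graph:
        for term in triple:
            collect(term)
    counter = count()
    def fresh():
        while True:
            label = f"tt{next(counter)}"
            if label not in used:
                used.add(label)
                return BNode(label)
    return fresh

def basic_encode_triple_term(t, m_in, fresh):
    m_out, g = {}, set()
    b = m_in.get(t)                     # reuse the blank node of t, if any
    if b is None:
        s, p, o = t.s, t.p, t.o
        if isinstance(o, TripleTerm):   # recurse into a nested triple term
            b_nested, m_nested, g_nested = basic_encode_triple_term(o, m_in, fresh)
            o = b_nested
            m_out.update(m_nested)
            g |= g_nested
        b = fresh()
        m_out[t] = b
        g |= {(b, RDF_TYPE, TRIPLE_TERM), (b, TT_SUBJECT, s),
              (b, TT_PREDICATE, p), (b, TT_OBJECT, o)}
    return b, m_out, g

def basic_encode(g_in):
    g_out, m, input_kind = set(), {}, None
    fresh = fresh_bnode_factory(g_in)
    for s, p, o in g_in:
        if isinstance(s, BNode) and p == RDF_TYPE and o == TRIPLE_TERM:
            if input_kind == "full":
                raise ValueError("rdf:TripleTerm blank node in a Full graph")
            input_kind = "basic"
        if isinstance(o, TripleTerm):
            if input_kind == "basic":
                raise ValueError("triple term in a basic encoded graph")
            input_kind = "full"
            b, m_new, g_new = basic_encode_triple_term(o, m, fresh)
            m.update(m_new)
            g_out |= g_new
            o = b
        g_out.add((s, p, o))
    return g_out
```

The map `m` is what ensures that a [=triple term=] occurring several times in the input is replaced by the same [=blank node=] each time, regardless of the order in which the triples are visited.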

The `basic-decode` algorithm

Write this algorithm

Introduction

Notation and Terminology