RDF-star and SPARQL-star

1. Introduction

1.1 Background and Motivation

This section is non-normative.

1.2 Overview

This section is non-normative.

The RDF data model lets you state facts in three-part subject-predicate-object statements known as triples. For example, with a single RDF triple you can say that employee38 has a familyName of "Smith". A triple's predicate is a property specified with an IRI (an Internationalized version of a URI) to identify the namespace of the property name. A triple's subject and object can each be an IRI referencing any entity, and the object can also be a literal value such as "Smith" or data of other types such as dates, numbers, or Boolean values.

The subject and object of a triple can themselves reference triples. In the statement "employee22 claims that employee38 has a jobTitle of 'Assistant Designer'", the object of the triple that has employee22 as its subject references the statement "employee38 has a jobTitle of 'Assistant Designer'". This use of a triple as the subject or object resource of another triple so that we can say things about that triple is known as reification.

The concept of reification has always been part of RDF, but expressing it in RDF concrete syntaxes such as Turtle, N-Triples, and RDF/XML has been verbose and cumbersome. This specification describes a new, more compact conceptual data model and Turtle concrete syntax for reification known as RDF-star and Turtle-star, respectively. This model and syntax enable the creation of concise triples that reference other triples as subject and object resources.

Triples that include a triple as a subject or an object are known as RDF-star triples. The following dataset shows the example RDF-star triples from above using the Turtle-star syntax, which uses double angle brackets to enclose a triple serving as a subject or object resource:

Example 1

@prefix :    <http://www.example.org/> .

:employee38 :familyName "Smith" .
:employee22 :claims << :employee38 :jobTitle "Assistant Designer" >> .

After declaring a prefix so that IRIs can be abbreviated, the first triple in this example asserts that employee38 has a familyName of "Smith". Note that this dataset does not assert that employee38 has a jobTitle of "Assistant Designer"; it says that employee22 has made that claim. In other words, the triple "employee38 has a jobTitle of 'Assistant Designer'" is not what we call an asserted triple, like "employee38 has a familyName of 'Smith'" above; it is known as an embedded triple. (If we added the triple :employee38 :jobTitle "Assistant Designer" below the triple about employee22's claim in the example above, then this triple about employee38's jobTitle would be both an embedded triple and an asserted one.)

This specification also describes an extension to the SPARQL Protocol and RDF Query Language known as SPARQL-star for the querying of RDF-star triples. For example, the following SPARQL-star query asks "who has made any claims about employee38?"

Example 2

PREFIX : <http://www.example.org/> 

SELECT ?claimer WHERE {
   ?claimer :claims << :employee38 ?property ?value >>
}

SPARQL query triple patterns that include a triple pattern as a subject or object are known as SPARQL-star triple patterns.

Issue 93: annotation-syntax example in the overview/primer part

The overview/primer part of the report should also contain an example that highlights the availability of the annotation syntax (not only on the data-level but also on the query-level).

Issue 94: SPARQL-star update example in the overview/primer part

The overview/primer part of the report should also contain an example that hints at the fact that the spec also covers updates.

For the remainder of this document, examples will assume that the following prefixes have been declared to represent the IRIs shown with them here:

`:`	`<http://www.example.org/>`
`rdfs:`	`<http://www.w3.org/2000/01/rdf-schema#>`
`owl:`	`<http://www.w3.org/2002/07/owl#>`
`prov:`	`<http://www.w3.org/ns/prov#>`
`dc:`	`<http://purl.org/dc/elements/1.1/>`
`dct:`	`<http://purl.org/dc/terms/>`

1.3 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, and MUST NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Issue 3: Do we need more things in the 'conformance' section? later process

For the moment, we only have the boilerplate text generated by respec.

2. Concepts and Abstract Syntax

In the following, we introduce a number of definitions specific to RDF-star, which rely on the following notions (extending some of them) defined in RDF 1.1 Concepts and Abstract Syntax [RDF11-CONCEPTS]: blank node, default graph, graph name, IRI, literal, named graphs, object, predicate, RDF dataset, RDF graph, RDF triple, and subject.

An RDF-star graph is a set of RDF-star triples.

An RDF-star triple is a 3-tuple defined recursively as follows:

any RDF triple is an RDF-star triple;
if t and t' are RDF-star triples, s is an IRI or a blank node, p is an IRI, o is an IRI, a blank node or a literal, then (t, p, o), (s, p, t) and (t, p, t') are RDF-star triples.

As for RDF triples, we call the 3 components of an RDF-star triple its subject, predicate and object, respectively. From the definitions above, it follows that any RDF graph is also an RDF-star graph. Note also that, by definition, an RDF-star triple cannot contain itself and cannot be nested infinitely.

IRIs, literals, blank nodes and RDF-star triples are collectively known as RDF-star terms.

For every RDF-star triple t, we define its constituent terms (or simply constituents) as the set containing its subject, its predicate, its object, plus all the constituent terms of its subject and/or its object if they are themselves RDF-star triples. By extension, we define the constituent terms of an RDF-star graph to be the union set of the constituent terms of all its triples.

Consider the following RDF-star triple (represented in Turtle-star):

Example 3

<< _:a :name "Alice" >> :statedBy :bob.

Its set of constituent terms comprises the IRIs :name, :statedBy, :bob, the blank node _:a, the literal "Alice", and the triple << _:a :name "Alice" >>.
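As a non-normative illustration, the recursive definition of constituent terms can be sketched in Python. The representation is an assumption made only for this sketch: a triple is a 3-tuple, every other term is a plain string in Turtle-like notation, and the function names are hypothetical.

```python
# Assumed representation: a triple is a 3-tuple; IRIs, blank nodes and
# literals are plain strings written in Turtle-like notation.

def is_triple(term):
    return isinstance(term, tuple) and len(term) == 3

def constituents(triple):
    """Return the set of constituent terms of an RDF-star triple."""
    s, p, o = triple
    terms = {s, p, o}
    # Only the subject and the object can themselves be triples.
    for component in (s, o):
        if is_triple(component):
            terms |= constituents(component)
    return terms

def graph_constituents(graph):
    """Union of the constituent terms of all triples of a graph."""
    result = set()
    for triple in graph:
        result |= constituents(triple)
    return result

# The triple of Example 3.
embedded = ("_:a", ":name", '"Alice"')
example3 = (embedded, ":statedBy", ":bob")
```

Under these assumptions, `constituents(example3)` contains the six terms listed above. The asserted triple itself is not among them, because a triple is a constituent only where it occurs as a subject or object.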

An RDF-star triple used as the subject or object of another RDF-star triple is called an embedded triple. An RDF-star triple that is an element of an RDF-star graph is called an asserted triple. Note that, in a given RDF-star graph, the same triple MAY be both embedded and asserted.

An RDF-star dataset is a collection of RDF-star graphs, and comprises:

Exactly one default graph, being an RDF-star graph. The default graph does not have a name and MAY be empty.
Zero or more named graphs. Each named graph is a pair consisting of either an IRI or a blank node (called the graph name), and an RDF-star graph. Graph names are unique within an RDF-star dataset.

Again, this definition is an extension of the notion of RDF dataset, hence it follows that any RDF dataset is also an RDF-star dataset.

2.1 Triples and occurrences

This section is non-normative.

According to the definitions above, an RDF-star triple is an abstract entity whose identity is entirely defined by its subject, predicate and object. Conversely, given three RDF-star terms s, p, and o, there is exactly one RDF-star triple with subject s, predicate p and object o. This unique triple (s, p, o) can be embedded as the subject or object of multiple other triples, but must be assumed to represent the same thing everywhere it occurs, just like the same IRI p is assumed to represent the same thing everywhere it occurs.

In some situations, however, it might be necessary to distinguish the occurrences of a triple in different graphs. Consider the following sentence: "The triple <http://example.org/s> <http://example.org/p> <http://example.org/o> in (the graph represented by) file1.ttl was added by Alice, and the same triple in file2.ttl was added by Bob." Note that the words "same triple" in this sentence may be confusing, because although the triple (as an abstract entity) is the same, its respective occurrences are different things, each within a different file and with a different author (this is known, in philosophy and linguistics, as the type-token distinction). As the embedded triple represents a unique thing, adequately conveying the meaning of the sentence above requires additional nodes for representing the two distinct occurrences. One possible solution is illustrated in the following example (using the Turtle-star concrete syntax described in the next section).

Example 4

_:a :occurrenceOf << :s :p :o >> ;
    :in <file1.ttl> ;
    dct:creator :alice.
_:b :occurrenceOf << :s :p :o >> ;
    :in <file2.ttl> ;
    dct:creator :bob.

3. Concrete Syntaxes

3.1 Turtle-star

In this section, we present Turtle-star, an extension of the Turtle format [TURTLE] allowing the representation of RDF-star graphs. For the sake of conciseness, we only describe here the differences between Turtle-star and Turtle.

3.1.1 Grammar

Turtle-star is defined to follow the same grammar as Turtle, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.

[8]	`objectList`	::=	object annotation`?` `(` `','` object annotation`?` `)*`
[10]	`subject`	::=	iri `\|` BlankNode `\|` collection `\|` embTriple
[12]	`object`	::=	iri `\|` BlankNode `\|` collection `\|` blankNodePropertyList `\|` literal `\|` embTriple
[27]	`embTriple`	::=	`'<<'` embSubject verb embObject `'>>'`
[28]	`embSubject`	::=	iri `\|` BlankNode `\|` embTriple
[29]	`embObject`	::=	iri `\|` BlankNode `\|` literal `\|` embTriple
[31]	`annotation`	::=	`'{\|'` predicateObjectList `'\|}'`

Note

The changes are that subject and object productions have been extended to accept embedded triples, which are described by the new productions 27 to 29. Note that embedded triples accept a more restricted range of subject and object expressions than asserted triples. Additionally, the objectList production now accepts an optional annotation after each object.

3.1.2 Parsing

A Turtle-star parser is similar to a Turtle parser as defined in Section 7 of the Turtle specification [TURTLE], with an additional item in its state:

RDF-star Term curObject — The curObject is bound to the embObject production.

Additionally, the curSubject can be bound to any RDF-star term (including an embedded triple).

A Turtle-star document defines an RDF-star graph composed of a set of RDF-star triples. The subject and embSubject productions set the curSubject. The verb production sets the curPredicate. The object and embObject productions set the curObject. Finishing the object production, an RDF-star triple curSubject curPredicate curObject is generated and added to the RDF-star graph.

Beginning the embTriple production records the curSubject and curPredicate. Finishing the embTriple production yields the RDF-star triple curSubject curPredicate curObject and restores the recorded values of curSubject and curPredicate.

Beginning the annotation production records the curSubject and curPredicate, and sets the curSubject to the RDF-star triple curSubject curPredicate curObject. Finishing the annotation production restores the recorded values of curSubject and curPredicate.

All other productions MUST be handled as specified by Section 7 of the Turtle specification [TURTLE], while still applying the changes above recursively.
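The state handling for the annotation production can be sketched as follows. This is a non-normative illustration, not a parser: the class `ParserState` and its method names are hypothetical, terms are plain strings, triples are 3-tuples, and the input `:bob :age 42 {| :source <http://example.org/~bob/> |}` is assumed to have been tokenized already.

```python
class ParserState:
    """Hypothetical holder for curSubject, curPredicate and curObject."""

    def __init__(self):
        self.graph = set()
        self.cur_subject = None
        self.cur_predicate = None
        self.cur_object = None
        self._saved = []  # recorded (curSubject, curPredicate) pairs

    def finish_object(self, obj):
        # Finishing the object production emits a triple into the graph.
        self.cur_object = obj
        self.graph.add((self.cur_subject, self.cur_predicate, self.cur_object))

    def begin_annotation(self):
        # Record the state, then make the triple just emitted the subject.
        self._saved.append((self.cur_subject, self.cur_predicate))
        self.cur_subject = (self.cur_subject, self.cur_predicate, self.cur_object)

    def end_annotation(self):
        # Restore the recorded curSubject and curPredicate.
        self.cur_subject, self.cur_predicate = self._saved.pop()


state = ParserState()
state.cur_subject = ":bob"
state.cur_predicate = ":age"
state.finish_object("42")
state.begin_annotation()
state.cur_predicate = ":source"
state.finish_object("<http://example.org/~bob/>")
state.end_annotation()
```

After these calls, `state.graph` holds the asserted triple and the triple that annotates it, and the subject and predicate are back to `:bob` and `:age`, so a further object in the same objectList would attach to them.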

3.2 N-Triples-star

This section describes N-Triples-star, a minimal extension of the N-Triples format [N-TRIPLES] allowing a subject or an object of an RDF-star triple to be an embedded triple.

3.2.1 Grammar

N-Triples-star is defined to follow the same grammar as the N-Triples Grammar, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.

[3]	`subject`	::=	IRIREF `\|` BLANK_NODE_LABEL `\|` embTriple
[5]	`object`	::=	IRIREF `\|` BLANK_NODE_LABEL `\|` literal `\|` embTriple
[7]	`embTriple`	::=	"<<" subject predicate object ">>"

Note

As with Turtle-star, the changes are that subject and object productions have been extended to accept embedded triples, which are described by the new production 7. N-Triples-star does not include an annotation form.

3.2.2 Parsing

In contrast to [N-TRIPLES], N-Triples-star allows recursion on the subject and object productions.

An N-Triples-star document defines an RDF-star graph composed of a set of RDF-star triples. The triple production produces an RDF-star triple composed of a subject, predicate and object.

In addition to the Term Constructors defined in [N-TRIPLES], an additional constructor is defined for embTriple of type RDF-star triple defined by the terms constructed for subject, predicate and object.

All other productions MUST be handled as specified by Section 8.1 of the N-Triples specification [N-TRIPLES], while still applying the changes above recursively.
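The recursion on the subject and object productions can be illustrated with a small recursive-descent sketch. It is non-normative and deliberately simplified: the token patterns are an assumption that ignores comments and does not validate escapes or IRI contents, terms are kept as their source strings, and the function names are hypothetical.

```python
import re

# Simplified tokens (assumption): no comments, no escape validation.
TOKEN = re.compile(r"""
    \s*(?:
        (?P<open><<)
      | (?P<close>>>)
      | (?P<iri><[^<>\s]*>)
      | (?P<bnode>_:[A-Za-z0-9_]+)
      | (?P<literal>"(?:[^"\\]|\\.)*"(?:\^\^<[^<>\s]*>|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?)
      | (?P<dot>\.)
    )""", re.VERBOSE)

def tokenize(line):
    line = line.strip()
    tokens, pos = [], 0
    while pos < len(line):
        match = TOKEN.match(line, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    tokens.append(("end", ""))  # sentinel so lookups never run off the list
    return tokens

def expect(tokens, i, kind):
    if tokens[i][0] != kind:
        raise ValueError(f"expected {kind}, found {tokens[i][0]}")
    return tokens[i][1], i + 1

def parse_term(tokens, i, allow_literal):
    """Parse a subject (allow_literal=False) or an object (allow_literal=True)."""
    kind, text = tokens[i]
    if kind == "open":  # embTriple ::= "<<" subject predicate object ">>"
        s, i = parse_term(tokens, i + 1, allow_literal=False)
        p, i = expect(tokens, i, "iri")
        o, i = parse_term(tokens, i, allow_literal=True)
        _, i = expect(tokens, i, "close")
        return (s, p, o), i
    if kind in ("iri", "bnode") or (kind == "literal" and allow_literal):
        return text, i + 1
    raise ValueError(f"unexpected {kind} token")

def parse_statement(line):
    """Parse one N-Triples-star statement into a nested 3-tuple."""
    tokens = tokenize(line)
    s, i = parse_term(tokens, 0, allow_literal=False)
    p, i = expect(tokens, i, "iri")
    o, i = parse_term(tokens, i, allow_literal=True)
    _, i = expect(tokens, i, "dot")
    expect(tokens, i, "end")
    return (s, p, o)
```

Embedded triples come back as nested tuples, so an embedded subject shows up as a tuple in the first position of the result. A literal in subject position is rejected with a `ValueError`, mirroring the grammar.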

3.3 N-Quads-star

The [N-QUADS] format is extended to describe the N-Quads-star format using the same production updates described in the N-Triples-star Grammar.

Note

As RDF-star describes embedded triples and not embedded quads, the graphLabel component of an N-Quads statement does not apply to the embTriple component.

An N-Quads-star document defines an RDF-star dataset composed of a single default graph, and zero or more named graphs, all of which are RDF-star graphs.

A conforming N-Quads-star parser MUST parse any valid N-Quads document and additionally parse the subject and object productions from N-Triples-star to generate RDF-star triples which are added to either the default graph or associated named graph, as appropriate.

3.4 Other Concrete Syntaxes

This section is non-normative.

While this document specifies a small number of concrete syntaxes, nothing prevents other concrete syntaxes of RDF-star from being proposed. In particular, other existing concrete syntaxes for RDF, such as RDF/XML [RDF-SYNTAX-GRAMMAR], could be extended to support RDF-star.

4. SPARQL-star Query Language

This section introduces SPARQL-star, which is an RDF-star-aware extension of the RDF query language SPARQL [SPARQL11-QUERY]; i.e., SPARQL-star can be used to query RDF-star graphs.

4.1 Initial Definitions

In the following, we introduce a number of SPARQL-star-specific definitions, which rely on the following notions, defined in SPARQL 1.1 Query Language [SPARQL11-QUERY]: RDF term, query variable, triple pattern, property path pattern, property path expression, and solution mapping.

A SPARQL-star triple pattern is a 3-tuple that is defined recursively as follows:

Every SPARQL triple pattern is a SPARQL-star triple pattern;
If t and t' are SPARQL-star triple patterns, x is an RDF term or a query variable, and p is an IRI or a query variable, then (t, p, x), (x, p, t), and (t, p, t') are SPARQL-star triple patterns.

As for RDF-star triples, a SPARQL-star triple pattern MUST NOT contain itself.

A SPARQL-star basic graph pattern (BGP-star) is a set of SPARQL-star triple patterns.

A SPARQL-star property path pattern is a 3-tuple (s,p,o) where

s is either an RDF term, a query variable, or a SPARQL-star triple pattern,
p is a property path expression, and
o is either an RDF term, a query variable, or a SPARQL-star triple pattern.

Issue 7: Property path patterns in SPARQL* sparql*

I have added the definition of a SPARQL* property path pattern into the draft just for the sake of having such a definition. We need to think about whether it is useful to add this to SPARQL*, in which case we need to define the semantics of such SPARQL* property path patterns.

In fact, no matter what we decide, even for standard property path patterns, the semantics may have to be extended to use them over RDF* graphs.

A SPARQL-star solution mapping μ is a partial function from the set of all query variables to the set of all RDF-star terms. The domain of μ, denoted by dom(μ), is the set of query variables for which μ is defined.

Note

The notion of a SPARQL-star solution mapping extends the notion of a standard SPARQL solution mapping; that is, every SPARQL solution mapping is a SPARQL-star solution mapping. However, in contrast to SPARQL solution mappings, SPARQL-star solution mappings may map variables also to RDF-star triples.

All notions related to SPARQL solution mappings carry over naturally to SPARQL-star solution mappings. In particular, the definition of compatibility extends naturally to SPARQL-star solution mappings: two SPARQL-star solution mappings μ₁ and μ₂ are compatible if, for every variable v that is both in dom(μ₁) and in dom(μ₂), μ₁(v) and μ₂(v) are the same RDF-star term. In this case, μ₁ ∪ μ₂ is also a SPARQL-star solution mapping. Moreover, for any SPARQL-star solution mapping μ we write card[Ω](μ) to denote the cardinality of μ in a multiset Ω of such mappings. Finally, given a BGP-star B and a SPARQL-star solution mapping μ, we write μ(B) to denote the result of replacing every variable v in B for which μ is defined with μ(v).

Next, we aim to carry over the notion of solutions for BGPs to BGP-star. To this end, we first define an auxiliary concept that carries over the notion of an RDF instance mapping [RDF11-MT] to RDF-star.

An RDF-star instance mapping σ is a partial function from the set of all blank nodes to the set of all RDF-star terms. The domain of σ, denoted by dom(σ), is the set of blank nodes for which σ is defined.

Similar to the corresponding notation for solution mappings, for an RDF-star instance mapping σ and a BGP-star B we write σ(B) to denote the result of replacing every blank node b in B for which σ is defined with σ(b).

Now we are ready to define the notion of solution for BGP-star.

Given a BGP-star B and an RDF-star graph G, a SPARQL-star solution mapping μ is a solution for the BGP-star B over G if it has the following two properties:

dom(μ) is equivalent to the set of query variables in B, and
there exists an RDF-star instance mapping σ such that dom(σ) is equivalent to the set of blank nodes in B and μ(σ(B)) is a subgraph of G.
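A minimal, non-normative sketch of this definition is given below. It assumes that the BGP-star contains no blank nodes (so the instance mapping σ is empty), represents variables as strings starting with `?`, triples and triple patterns as 3-tuples, and all other terms as plain strings. The function names are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match_term(pattern, term, mu):
    """Extend mapping mu so that mu(pattern) equals term, or return None."""
    if is_var(pattern):
        if pattern in mu:
            return mu if mu[pattern] == term else None
        return {**mu, pattern: term}  # a variable may bind to a whole triple
    if isinstance(pattern, tuple):
        if not isinstance(term, tuple):
            return None
        for sub_pattern, sub_term in zip(pattern, term):
            mu = match_term(sub_pattern, sub_term, mu)
            if mu is None:
                return None
        return mu
    return mu if pattern == term else None

def solutions(bgp, graph):
    """All solution mappings for a list of triple patterns over a set of triples."""
    results = [{}]
    for pattern in bgp:
        results = [extended
                   for mu in results
                   for triple in graph
                   for extended in [match_term(pattern, triple, mu)]
                   if extended is not None]
    return results

# Data and query of Examples 1 and 2.
graph = {
    (":employee38", ":familyName", '"Smith"'),
    (":employee22", ":claims", (":employee38", ":jobTitle", '"Assistant Designer"')),
}
bgp = [("?claimer", ":claims", (":employee38", "?property", "?value"))]
```

Here `solutions(bgp, graph)` yields a single mapping that binds `?claimer` to `:employee22`. Replacing the embedded pattern with a single variable would bind that variable to the whole embedded triple.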

4.2 Grammar

SPARQL-star is defined to follow the same grammar as SPARQL, except for the EBNF productions specified below, which replace the productions having the same number (if any) in the original grammar.

[60]	`Bind`	::=	`'BIND'` `'('` ( Expression `\|` EmbTP ) `'AS'` Var `')'`
[75]	`TriplesSameSubject`	::=	VarOrTermOrEmbTP PropertyListNotEmpty `\|` TriplesNode PropertyList
[80]	`Object`	::=	GraphNode `\|` EmbTP
[81]	`TriplesSameSubjectPath`	::=	VarOrTermOrEmbTP PropertyListPathNotEmpty `\|` TriplesNode PropertyListPath
[105]	`GraphNodePath`	::=	VarOrTermOrEmbTP `\|` TriplesNodePath
[174]	`EmbTP`	::=	`'<<'` EmbSubjectOrObject Verb EmbSubjectOrObject `'>>'`
[175]	`EmbSubjectOrObject`	::=	Var `\|` BlankNode `\|` iri `\|` RDFLiteral `\|` NumericLiteral `\|` BooleanLiteral `\|` EmbTP
[176]	`VarOrTermOrEmbTP`	::=	Var `\|` GraphTerm `\|` EmbTP

This introduces a notation for embedded triple patterns (productions [174] and following), which is similar to the one defined for embedded triples in § 3.1 Turtle-star, but accepting also variables. These embedded triple patterns are allowed in the subject ([75], [81]) and object ([80], [105]) positions of SPARQL-star triple patterns, as well as in BIND statements ([60]).

Issue 6: FIND instead of BIND sparql*

Instead of reusing the keyword BIND for SPARQL* (as in my original proposal), we may want to consider using a different keyword for this functionality because the behavior is a bit different. For instance, @klinovp has mentioned this issue in an email on the mailing list. In another email, @afs has proposed to use the keyword FIND instead.

Issue 9: Include Annotation syntax in Turtle* and SPARQL* concrete-syntax sparql*

This has already been discussed on the mailing list.

The idea would be to have a notation like

:bob :age 42 {| :source <http://example.org/~bob/> |}.

as shortcut for

:bob :age 42.
<< :bob :age 42 >> :source <http://example.org/~bob/>.

4.3 Translation to the Algebra

Based on the SPARQL grammar, the SPARQL specification defines the process of converting graph patterns and solution modifiers in a SPARQL query string into a SPARQL algebra expression [SPARQL11-QUERY, Section 18.2]. This process must be adjusted to consider the extended grammar introduced above. In the following, any step of the conversion process that requires adjustment is discussed.

4.3.1 Variable Scope

As a basis of the translation, the SPARQL specification introduces a notion of in-scope variables. To cover the new syntax elements introduced in § 4.2 Grammar this notion MUST be extended as follows.

A variable is in-scope of a BGP-star B if the variable occurs in B, which includes an occurrence in any embedded triple pattern in B (independent of the level of nesting).
A variable is in-scope of a property path pattern if the variable occurs in that pattern, which includes an occurrence in any embedded triple pattern in the pattern (independent of the level of nesting).
A variable is in-scope of a BIND clause of the form BIND ( T AS v ) (where T is an embedded triple pattern) if the variable is variable v or the variable occurs in the embedded triple pattern T. As for standard BIND clauses with expressions, variable v must not [be] in-scope from the preceding elements in the group graph pattern in which [the BIND clause] is used [SPARQL11-QUERY, Section 18.2.1].

4.3.2 Expand Syntax Forms

The translation process starts with expanding abbreviations for IRIs and triple patterns [SPARQL11-QUERY, Section 18.2.2.1]. This step MUST be extended in two ways:

Abbreviations for triple patterns with embedded triple patterns MUST be expanded as if each embedded triple pattern was a variable (or an RDF term).
For instance, the following syntax expression:
Example 5
```
<<?c a owl:Class>> dct:source ?src ;
    :entailing <<?c a rdfs:Class>> .
```
must be expanded to
Example 6
```
<<?c a owl:Class>> dct:source ?src .
<<?c a owl:Class>> :entailing <<?c a rdfs:Class>> .
```
Abbreviations for IRIs in all embedded triple patterns MUST be expanded.
For instance, the embedded triple pattern
Example 7
```
<<?c a rdfs:Class>>
```
must be expanded to
Example 8
```
<<?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2000/01/rdf-schema#Class>>>
```

4.3.3 Translate Property Path Patterns

The translation of property path patterns has to be adjusted because the extended grammar allows for SPARQL-star property path patterns whose subject or object is a SPARQL-star triple pattern.

The translation as specified in the W3C specification distinguishes four cases. The first three of these cases do not require adjustment because they are taken care of either by recursion or by the adjusted translation of basic graph patterns (as defined in § 4.3.4 Translate Basic Graph Patterns below). However, the fourth case MUST be adjusted as follows.

Let X P Y be a string that corresponds to the fourth case in [SPARQL11-QUERY, Section 18.2.2.4]. Given the grammar introduced in § 4.2 Grammar, X and Y may be an RDF term, a variable, or an embedded triple pattern, respectively (and P is a property path expression). The string X P Y is translated to the algebra expression Path(X’,P,Y’) where X’ and Y’ are the result of calling a function named Lift for X and Y, respectively. For some input string Z (such as X or Y) that can be an RDF term, a variable, or an embedded triple pattern, the function Lift is defined recursively as follows:

If Z is an embedded triple pattern <<S,P,O>> then return the SPARQL-star triple pattern (Lift(S), P, Lift(O));
Otherwise, return Z.
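The function Lift can be sketched in a few lines of Python. This is a non-normative illustration: the class `EmbTP` is a hypothetical syntax-tree node for an embedded triple pattern as written in a query string, SPARQL-star triple patterns are represented as 3-tuples, and the tuple tagged `"Path"` merely stands in for the algebra expression Path(X’,P,Y’).

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class EmbTP:
    """Hypothetical syntax node for << subject verb obj >>."""
    subject: Any
    verb: Any
    obj: Any

def lift(z):
    """Turn an embedded triple pattern into a SPARQL-star triple pattern."""
    if isinstance(z, EmbTP):
        return (lift(z.subject), z.verb, lift(z.obj))
    return z  # RDF terms and variables are returned unchanged

def translate_path(x, p, y):
    """Translate the string X P Y into a stand-in for Path(X', P, Y')."""
    return ("Path", lift(x), p, lift(y))

nested = EmbTP("?s", ":p", EmbTP("?a", ":q", "?b"))
```

Because `lift` recurses into both the subject and the object, embedded triple patterns at any level of nesting end up as nested tuples.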

Note

The purpose of this translation step is to convert any property path pattern as can be written based on the extended grammar for SPARQL-star (cf. § 4.2 Grammar) into a SPARQL-star property path pattern as considered in the algebra. To this end, the function Lift translates every embedded triple pattern as can be written in the SPARQL-star syntax into a SPARQL-star triple pattern.

4.3.4 Translate Basic Graph Patterns

After translating property path patterns, the translation process collects any adjacent triple patterns [...] to form a basic graph pattern [SPARQL11-QUERY, Section 18.2.2.5]. This step has to be adjusted because triple patterns in the extended syntax may have an embedded triple pattern in their subject position or in their object position (or in both). To ensure that every result of this step is a BGP-star, before adding a triple pattern to its corresponding collection, its subject and object MUST be replaced by the result of calling function Lift for the subject and the object, respectively.

4.3.5 Translate BIND Clauses with an Embedded Triple Pattern

The extended grammar in § 4.2 Grammar allows for BIND clauses with an embedded triple pattern. The translation of such a BIND clause to a SPARQL algebra expression requires a new algebra symbol:

TR( SPARQL-star triple pattern, variable )

Then, any string of the form BIND( T AS v ) with T being an embedded triple pattern (i.e., not a standard BIND expression) is translated to the algebra expression TR(T’, v) where T’ is the result of the function Lift for T.

Note that the translation of BIND clauses with an embedded triple pattern as defined in this section is used during the translation of group graph patterns. The case of BIND clauses with an embedded triple pattern is covered in this translation of group graph patterns by the last, “catch all other” IF statement (i.e., the IF statement with the condition E is any other form) and not by the IF statement for BIND clauses with an expression.

4.4 Evaluation Semantics

The SPARQL specification defines a function eval(D(G), algebra expression) as the evaluation of an algebra expression with respect to a dataset D having active graph G [SPARQL11-QUERY, Section 18.6]. Recall that the dataset D in the context of SPARQL-star is an RDF-star dataset and, thus, the active graph G is an RDF-star graph, and so is any other graph in dataset D. The definition of the eval function is recursive; the two base cases of this definition for SPARQL-star are given as follows:

For every BGP-star B, eval(D(G), B) is a multiset Ω that consists of all SPARQL-star solution mappings that are a solution for the BGP-star B over G. For every such mapping μ, card[Ω](μ) is the number of distinct RDF-star instance mappings σ such that dom(σ) is equivalent to the set of blank nodes in B and μ(σ(B)) is a subgraph of G. (For any SPARQL-star solution mapping μ' that is not a solution for B over G, we have that card[Ω](μ')=0; i.e., μ' is not in Ω.)
For any algebra expression E of the form TR(tp, ?v) where tp is a SPARQL-star triple pattern and ?v is a variable (as introduced in § 4.3.5 Translate BIND Clauses with an Embedded Triple Pattern), eval(D(G), E) is a multiset Ω that consists of as many SPARQL-star solution mappings as there are solution mappings in Ω', where Ω'=eval(D(G),{tp}), such that for every μ' in Ω' there exists a μ in Ω that has the following four properties:
1. dom(μ) = dom(μ') ∪ {?v}
2. μ and μ' are compatible
3. μ(?v) = μ'(tp)
4. card[Ω](μ) = card[Ω'](μ')
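The TR case can be illustrated with the following non-normative sketch. It assumes that the multiset Ω' has already been computed and is given as a Python list of dictionaries (duplicates in the list represent cardinalities greater than one), that variables are strings starting with `?`, and that triple patterns are 3-tuples. The function names are hypothetical.

```python
def substitute(pattern, mu):
    """Compute mu(tp): replace every bound variable in the pattern."""
    if isinstance(pattern, tuple):
        return tuple(substitute(part, mu) for part in pattern)
    return mu.get(pattern, pattern)

def eval_tr(omega_prime, tp, v):
    """Extend every mapping of omega_prime with a binding of v to mu'(tp).

    Producing one output mapping per input mapping preserves cardinalities.
    """
    return [{**mu_prime, v: substitute(tp, mu_prime)} for mu_prime in omega_prime]

omega_prime = [{"?s": ":alice", "?o": '"Alice"'}]
tp = ("?s", ":name", "?o")
omega = eval_tr(omega_prime, tp, "?t")
```

Each resulting mapping agrees with its source mapping on all previously bound variables and additionally binds `?t` to the instantiated triple, which corresponds to the four properties listed above.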

For any other algebra expression, the SPARQL specification defines algebra operators [SPARQL11-QUERY]. These definitions can be extended naturally to operate over multisets of SPARQL-star solution mappings (instead of ordinary solution mappings). Given this extension, the recursive steps of the definition of the eval function for SPARQL-star are the same as in the SPARQL specification.

4.5 Query Result Formats

In SPARQL, queries can take four forms: SELECT, CONSTRUCT, DESCRIBE, and ASK (see SPARQL 1.1 Query, Section 16 [SPARQL11-QUERY]). The first of these returns a sequence of solution mappings that contain variable bindings. The second and third both return an RDF graph, and the last returns a boolean value.

The result of the ASK query form is not changed by the introduction of RDF-star, and the result of the CONSTRUCT and DESCRIBE forms can be represented by Turtle-star. However, since the SELECT form deals with returning individual RDF terms, the specific serialization formats for representing such query results need to be extended so that the new embedded triple RDF term can be represented. In this section, we propose extensions for the two most common formats for this purpose: SPARQL 1.1 Query Results JSON Format, and SPARQL Query Results XML Format (Second Edition).

Issue 43: New mime types and XML namespace for the extended query result formats concrete-syntax sparql*

In addition to defining the extended formats for serializing the result of a SPARQL* SELECT query (#12 and #13), we have to decide whether we need/want new mime types for these extended formats? Similarly, do we need/want to introduce another namespace for the extended XML result format?

4.5.1 SPARQL-star Query Results JSON Format

The result of a SPARQL SELECT query is serialized in JSON as defined in SPARQL 1.1 Query Results JSON Format, which specifies a JSON representation of variable bindings to RDF terms (see [sparql11-results-json, Section 3.2]). To accommodate the new RDF term for embedded triples that RDF-star introduces, the table of RDF term JSON representations in sparql11-results-json, Section 3.2.2 is extended with the following entry:

An embedded triple with subject RDF term S, predicate RDF term P and object RDF term O

{
  "type": "triple",
  "value": {
     "subject": S,
     "predicate": P,
     "object": O
  }
}

where S, P and O are encoded using the same format, recursively.

Consider the following RDF term, an embedded triple in Turtle-star syntax:

Example 9

<< <http://example.org/alice> <http://example.org/name> "Alice" >>

This term is represented in JSON as follows:

Example 10

{
  "type": "triple",
  "value": {
     "subject": {
        "type": "uri",
        "value": "http://example.org/alice"
     },
     "predicate": {
        "type": "uri",
        "value": "http://example.org/name"
     },
     "object": {
        "type": "literal",
        "value": "Alice",
        "datatype": "http://www.w3.org/2001/XMLSchema#string"
     }
  }
}
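The recursive encoding can be sketched as follows. This is a non-normative illustration that assumes a hypothetical tagged-tuple representation of terms: `("uri", value)`, `("bnode", label)`, `("literal", value, datatype, lang)` and `("triple", s, p, o)`.

```python
import json

def term_to_json(term):
    """Encode a term (assumed tagged-tuple form) as a JSON-ready dict."""
    kind = term[0]
    if kind == "triple":
        _, s, p, o = term
        return {
            "type": "triple",
            "value": {
                "subject": term_to_json(s),
                "predicate": term_to_json(p),
                "object": term_to_json(o),
            },
        }
    if kind == "literal":
        _, value, datatype, lang = term
        encoded = {"type": "literal", "value": value}
        if lang is not None:
            encoded["xml:lang"] = lang
        elif datatype is not None:
            encoded["datatype"] = datatype
        return encoded
    if kind in ("uri", "bnode"):
        return {"type": kind, "value": term[1]}
    raise ValueError(f"unknown term kind: {kind!r}")

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
alice = ("triple",
         ("uri", "http://example.org/alice"),
         ("uri", "http://example.org/name"),
         ("literal", "Alice", XSD_STRING, None))
serialized = json.dumps(term_to_json(alice), indent=2)
```

With these assumptions, `serialized` parses back to the same structure as the JSON document shown above. An embedded triple in subject or object position is simply encoded by a nested call.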

4.5.2 SPARQL-star Query Results XML Format

The result of a SPARQL SELECT query is serialized in XML as defined in SPARQL Query Results XML Format (Second Edition). This format proposes an XML representation of variable bindings to RDF terms.

To accommodate the new RDF term for embedded triples that RDF-star introduces, the list of RDF terms and their XML representations in [rdf-sparql-XMLres, Section 2.3.1] is extended as follows:

An embedded triple with subject term S, predicate term P, and object term O

<binding>
  <triple>
    <subject>S</subject>
    <predicate>P</predicate>
    <object>O</object>
  </triple>
</binding>

where S, P and O are encoded recursively, using the same format, without the enclosing <binding> tag.

Consider the following RDF term, an embedded triple in Turtle-star syntax:

Example 11

<< <http://example.org/alice> <http://example.org/name> "Alice" >>

This term is represented in XML as follows:

Example 12

<triple>
    <subject>
        <uri>http://example.org/alice</uri>
    </subject>
    <predicate>
        <uri>http://example.org/name</uri>
    </predicate>
    <object>
        <literal datatype='http://www.w3.org/2001/XMLSchema#string'>Alice</literal>
    </object>
</triple>
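A corresponding non-normative sketch for the XML form uses Python's standard `xml.etree.ElementTree` module. Terms are again assumed to be tagged tuples of the hypothetical forms `("uri", value)`, `("bnode", label)`, `("literal", value, datatype, lang)` and `("triple", s, p, o)`; the enclosing `<binding>` element is left to the caller.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified attribute name that ElementTree serializes as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def term_to_xml(term):
    """Encode a term (assumed tagged-tuple form) as an XML element."""
    kind = term[0]
    if kind == "triple":
        element = ET.Element("triple")
        for tag, part in zip(("subject", "predicate", "object"), term[1:]):
            ET.SubElement(element, tag).append(term_to_xml(part))
        return element
    if kind == "literal":
        _, value, datatype, lang = term
        element = ET.Element("literal")
        if lang is not None:
            element.set(XML_LANG, lang)
        elif datatype is not None:
            element.set("datatype", datatype)
        element.text = value
        return element
    if kind in ("uri", "bnode"):
        element = ET.Element(kind)
        element.text = term[1]
        return element
    raise ValueError(f"unknown term kind: {kind!r}")

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
alice = ("triple",
         ("uri", "http://example.org/alice"),
         ("uri", "http://example.org/name"),
         ("literal", "Alice", XSD_STRING, None))
root = term_to_xml(alice)
```

Calling `ET.tostring(root, encoding="unicode")` then produces a `<triple>` element with the same structure as the example above, apart from whitespace.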

6. RDF-star Semantics

In this section, we provide a model-theoretic semantics for RDF-star, based on the one defined in RDF 1.1 Semantics [RDF11-MT]. More precisely, we define a mapping from RDF-star's abstract syntax into standard RDF's abstract syntax, and define the semantics of RDF-star graphs in terms of the semantics of the mapped RDF graphs.

In the following, we introduce a number of definitions specific to RDF-star, which rely on the following notions, defined in RDF 1.1 Concepts and Abstract Syntax [RDF11-CONCEPTS] and RDF 1.1 Semantics [RDF11-MT]: datatype, lexical form, simple literal, ill-typed, merging, satisfiability, unsatisfiability, entailment, and equivalence.

6.1 Mapping RDF-star abstract syntax to RDF

We consider six IRIs ST, PT, OT, SS, PS and OS that will have a special meaning in our mapping.

We define a mapping L that maps any IRI or literal t to a literal with

- xsd:string as its datatype, and
- the canonical N-Triples representation of t as its lexical form [N-TRIPLES]. If t is itself a literal with the xsd:string datatype, the representation MUST be a simple literal.
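A minimal sketch of the mapping L in Python follows. It assumes that terms are modelled as tagged tuples (a representation chosen here for illustration) and that the canonical N-Triples form escapes only the backslash, double quote, line feed, and carriage return inside string literals. Language-tagged strings are left out of the sketch.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

def escape(lexical_form):
    """Escape the characters assumed to need ECHAR in canonical N-Triples."""
    return (lexical_form
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r"))

def ntriples(term):
    """N-Triples text of an IRI ("uri", iri) or a literal ("literal", lex, dt)."""
    if term[0] == "uri":
        return "<%s>" % term[1]
    if term[0] == "literal":
        quoted = '"%s"' % escape(term[1])
        # An xsd:string literal is written as a simple literal (no ^^ suffix).
        return quoted if term[2] == XSD_STRING else "%s^^<%s>" % (quoted, term[2])
    raise ValueError("L is only defined for IRIs and literals")

def L(term):
    """Map an IRI or literal to an xsd:string literal holding its N-Triples form."""
    return ("literal", ntriples(term), XSD_STRING)

print(L(("uri", "http://example.org/charlie")))
print(L(("literal", "42", "http://www.w3.org/2001/XMLSchema#integer")))
```

Note that L is not defined for blank nodes, which is why the corresponding triples are skipped for blank nodes in the transformation that follows.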

Given an RDF-star graph G, the following steps transform it into an RDF graph that we call unstar(G).

1. Pick an RDF-star triple (s, p, o) in the constituents of G such that neither s nor o is an embedded triple.
2. Mint a fresh blank node b, and replace with b all occurrences of (s, p, o) in the subject or object position of an asserted or embedded triple of G.
3. Add the following asserted triples to G:
   - (b, ST, s)
   - (b, PT, p)
   - (b, OT, o) unless o is an ill-typed literal
   - (b, SS, L(s)) unless s is a blank node
   - (b, PS, L(p))
   - (b, OS, L(o)) unless o is a blank node
4. Repeat the steps above until there are no embedded triples left in G.

After these steps, unstar(G) is an RDF graph, as it contains no embedded triples. Note that if G was already an RDF graph, then unstar(G) = G.
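The transformation can be sketched in Python as follows. Several points are assumptions of this sketch rather than part of the specification: terms are tagged tuples and a graph is a set of asserted triples; the six special IRIs are given placeholder values under urn:example:unstar#; only embedded triples are picked in the first step; blank node labels of the form unstarN are assumed not to occur in the input; and is_ill_typed is a hypothetical hook that treats every literal as well-typed by default.

```python
import itertools

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
# Placeholder IRIs: the special IRIs are deliberately left unspecified.
ST, PT, OT, SS, PS, OS = (("uri", "urn:example:unstar#" + n)
                          for n in ("ST", "PT", "OT", "SS", "PS", "OS"))

def is_triple(term):
    return term[0] == "triple"

def L(term):
    """Map an IRI or literal to an xsd:string literal (language tags omitted)."""
    if term[0] == "uri":
        text = "<%s>" % term[1]
    else:
        lex = (term[1].replace("\\", "\\\\").replace('"', '\\"')
               .replace("\n", "\\n").replace("\r", "\\r"))
        text = '"%s"' % lex if term[2] == XSD_STRING else '"%s"^^<%s>' % (lex, term[2])
    return ("literal", text, XSD_STRING)

def embedded(triple):
    """Yield every triple nested in subject or object position, at any depth."""
    for part in (triple[1], triple[3]):
        if is_triple(part):
            yield part
            yield from embedded(part)

def replace_in(triple, target, b):
    """Replace target by b in subject/object positions, never at top level."""
    def sub(term):
        if term == target:
            return b
        return replace_in(term, target, b) if is_triple(term) else term
    return ("triple", sub(triple[1]), triple[2], sub(triple[3]))

def unstar(graph, is_ill_typed=lambda literal: False):
    g = set(graph)
    fresh = itertools.count()
    while True:
        # Step 1: embedded triples whose subject and object are not triples.
        candidates = [e for t in g for e in embedded(t)
                      if not is_triple(e[1]) and not is_triple(e[3])]
        if not candidates:
            return g  # no embedded triples left
        target = candidates[0]
        _, s, p, o = target
        # Step 2: mint a blank node and substitute it everywhere.
        b = ("bnode", "unstar%d" % next(fresh))
        g = {replace_in(t, target, b) for t in g}
        # Step 3: describe the replaced triple.
        g.add(("triple", b, ST, s))
        g.add(("triple", b, PT, p))
        if not (o[0] == "literal" and is_ill_typed(o)):
            g.add(("triple", b, OT, o))
        if s[0] != "bnode":
            g.add(("triple", b, SS, L(s)))
        g.add(("triple", b, PS, L(p)))
        if o[0] != "bnode":
            g.add(("triple", b, OS, L(o)))
```

Because the loop only runs while embedded triples remain, a graph without embedded triples is returned unchanged, in line with the remark that unstar(G) = G for plain RDF graphs.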

6.2 Entailment of RDF-star graphs

Following RDF 1.1 Semantics, we extend the notions of satisfiability and entailment for RDF-star graphs. Given two RDF-star graphs G and H:

We say that G is (simply) satisfiable (resp. unsatisfiable) if and only if unstar(G) is (simply) satisfiable (resp. unsatisfiable).
We say that H is (simply) entailed by (resp. equivalent to) G if and only if unstar(H) is (simply) entailed by (resp. equivalent to) unstar(G).
Other notions of satisfiability and entailment, such as RDF entailment or RDFS entailment, can be extended in the same way for RDF-star graphs.

6.3 Remarks

This section is non-normative.

6.3.1 Combining RDF-star graphs

Care must be taken when RDF graphs that result from RDF-star graphs are combined through union or merging. Given two RDF-star graphs G and H, it may be the case that unstar(G ∪ H) ≠ unstar(G) ∪ unstar(H). More precisely, if G and H contain the same embedded triple, this triple will be mapped to a single blank node in unstar(G ∪ H), but to two potentially different blank nodes in unstar(G) ∪ unstar(H). These blank nodes will need to be unified in order to get the correct entailments.
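The difference can be illustrated with a deliberately simplified Python sketch. The function unstar_lite is a hypothetical stand-in for the mapping: it handles a single level of embedding, emits only the ST, PT and OT triples under placeholder names, and models terms as plain strings with embedded triples as 3-tuples.

```python
def unstar_lite(graph, prefix):
    """Simplified mapping: one nesting level, placeholder ST/PT/OT predicates."""
    out, nodes = set(), {}

    def node(term):
        if isinstance(term, tuple):  # an embedded triple
            if term not in nodes:
                # One blank node per distinct embedded triple within this call.
                b = "_:%s%d" % (prefix, len(nodes))
                nodes[term] = b
                out.update({(b, "ST", term[0]), (b, "PT", term[1]), (b, "OT", term[2])})
            return nodes[term]
        return term

    for s, p, o in graph:
        out.add((node(s), p, node(o)))
    return out

def bnodes(graph):
    """Collect the blank node labels used in a mapped graph."""
    return {t for triple in graph for t in triple if t.startswith("_:")}

shared = (":bob", ":age", "42")
G = {(":alice", ":says", shared)}
H = {(":carol", ":doubts", shared)}

together = unstar_lite(G | H, "u")                       # unstar(G ∪ H)
separately = unstar_lite(G, "g") | unstar_lite(H, "h")   # unstar(G) ∪ unstar(H)

print(len(bnodes(together)), len(bnodes(separately)))
```

In the combined mapping, both asserted triples point at the same blank node, whereas mapping G and H separately yields two blank nodes that describe the same embedded triple and would have to be unified.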

6.3.2 Considerations on interoperability

The special properties (ST, PT, etc.), used in the mapping for representing embedded triples in plain RDF, are deliberately not specified. As a consequence, although any RDF-star graph G is semantically equivalent to an RDF graph unstar(G), that latter graph is implementation dependent, as different systems will use a different concrete IRI for each special property.

This makes it impossible for RDF-star-aware systems to reliably exchange RDF-star graphs in their mapped form using non-RDF-star concrete syntaxes (unless of course the RDF-star graph contains no embedded triple). However, such systems can always use Turtle-star or other extended concrete syntaxes, so this does not limit interoperability among them. On the other hand, it prevents the unrestricted use of the special properties, because such use may lead to surprising corner cases, as illustrated in Example 13. Supporting these corner cases would be a significant burden on RDF-star implementations, for very limited utility.

Furthermore, it is expected that some implementations will not rely on the mapping, but represent and work directly with the abstract syntax of RDF-star. For these implementations, having to handle both the native and the mapped representation of embedded triples would be even more challenging.

Example 13

:alice :says << :bob :age 42 >>.
<< :bob :age 42 >> :ST :charlie;
                   :SS "<http://example.org/charlie>".

# assuming that :ST and :SS stand for the corresponding special IRIs,
# the graph above entails the graph below

:alice :says << :charlie :age 42 >>.

The exchange of a mapped graph unstar(G) using standard RDF concrete syntaxes, with non-standard IRIs in place of the special properties, is however possible and useful when communicating with legacy RDF systems. For those systems, the special properties have no special meaning, so using non-standard IRIs makes no difference to them.

Issue 95: alternative semantic characterisations for RDF* semantics

There are several recent proposals for semantics.

The proposals can be found in
1/ #81
2/ #88
3/ https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0057.html
4/ https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0059.html
and
5/ https://lists.w3.org/Archives/Public/public-rdf-star/2021Feb/0038.html

All proposals create RDF graphs and the semantics of RDF* is defined as a semantics for these RDF graphs.

The differences between them lie in four areas:
A) RDF* graphs as an abstract syntax
B) hidden vocabulary for RDF* reification
C) special datatype(s) for embedded triples
D) extended semantics for RDF* reification vocabulary

First, there is the question of whether there is a notion of an RDF* graph. RDF* surface syntax can be expressed by parsing RDF* surface syntax into RDF* graphs and then transforming these RDF* graphs into RDF graphs. Alternatively, RDF* surface syntax can be expressed by parsing surface syntax directly into RDF graphs.

Second, there is the question of whether the vocabulary used to reify embedded triples is hidden or not. If so, then RDF* embedded triples cannot be constructed by means other than embedded triple syntax. If not, then regular RDF constructs (almost certainly reification) can be used to get the same effect as embedded triples.

Third, there is the question of whether one or more special datatypes are needed for the subject, predicate, or object of embedded triples.

Fourth, there is the question of whether the semantics of RDF* needs an extension of RDF semantics on the resultant RDF graphs (aside from the semantics of any new datatypes).

Here is a table of my understanding of how the proposals stand on the above differences.

Characteristic	1/ P81	2/ P88	3/ 57	4/ 59	5/ 38
A) RDF* graphs	YES	YES	YES	YES	NO
B) hidden vocabulary	YES	YES	NO	NO	NO
C) special datatype(s)	YES	NO	YES	NO	NO
D) extended semantics	YES	NO	NO	NO	NO

RDF-star and SPARQL-star

Draft Community Group Report 18 February 2021
