SHACL Advanced Features 1.1

This document describes advanced features of the Shapes Constraint Language (SHACL) [[!shacl]] including features to define custom targets, annotation properties, user-defined functions, node expressions and rules. While many of these features rely on SPARQL, they also define extension points that can be used by other implementation languages.

Prefix	Namespace
`rdf:`	`http://www.w3.org/1999/02/22-rdf-syntax-ns#`
`rdfs:`	`http://www.w3.org/2000/01/rdf-schema#`
`sh:`	`http://www.w3.org/ns/shacl#`
`xsd:`	`http://www.w3.org/2001/XMLSchema#`
`ex:`	`http://example.com/ns#`

Custom Targets

In general, targets define a mechanism that is used by SHACL engines to determine the focus nodes that should be validated against a given shape. SHACL Core [[!shacl]] defines a fixed set of Core targets by means of properties such as sh:targetClass. These Core targets were designed to cover a large number of use cases while retaining a simple declarative data model. However, in some use cases, these Core targets are not sufficient. For example, it is impossible to state that a shape should apply only to a subset of the instances of a class, e.g. persons born in the USA. Neither is it possible to state that a shape should apply to all subjects in a graph, or to nodes selected by completely different, application-specific mechanisms.

This section defines richer mechanisms to define targets, called custom targets. Custom targets are the values of the property sh:target in the shapes graph. The property sh:target has a similar status to, for example, sh:targetClass, and all subjects of sh:target triples are also shapes.

The values of sh:target at a shape are IRIs or blank nodes.

A SHACL engine that supports custom targets uses the values of the custom target node to compute the target nodes for the associated shape. The algorithm that is used for this computation depends on the rdf:type of the custom target. The following sub-sections define two such algorithms:

SPARQL-based targets
SPARQL-based target types

However, other types of targets can be supported by other extension languages such as JavaScript. The class sh:Target is the recommended base class for such extensions.

The behavior of a SHACL engine that is unable to handle a given custom target is left undefined. SHACL Core processors do not even need to be aware of the existence of the sh:target property. Engines that are aware of this property and cannot handle a given custom target SHOULD at least report a warning.

SPARQL-based Targets

Custom targets that are SHACL instances of sh:SPARQLTarget are called SPARQL-based targets.

SPARQL-based targets have exactly one value for the property sh:select. SPARQL-based targets may have values for the property sh:prefixes and these values are IRIs or blank nodes. Using the values of sh:prefixes as defined by 5.2.1 Prefix Declarations for SPARQL Queries, the values of sh:select must be valid SPARQL 1.1 SELECT queries with a single result variable this. SPARQL-based targets have at most one value for the property sh:ask.

The following example declares a well-formed SPARQL-based target that produces all persons born in the USA:

ex:
	sh:declare [
		sh:prefix "ex" ;
		sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
	] .
				
ex:USCitizenShape
	a sh:NodeShape ;
	sh:target [
		a sh:SPARQLTarget ;
		sh:prefixes ex: ;
		sh:select """
			SELECT ?this
			WHERE {
				?this a ex:Person .
				?this ex:bornIn ex:USA .
			}
			""" ;
	] ;
	...

TEXTUAL DEFINITION

Let Q be the SPARQL SELECT query derived from the values of sh:select and sh:prefixes of the SPARQL-based target T. The target nodes of T are the bindings of the variable this returned by Q against the data graph.

While the SELECT queries can be used to identify all focus nodes for a given shape, SHACL processors sometimes also need to compute the inverse direction and find all shapes against which a given node needs to be validated. For this reason, the following semantic restriction is recommended for SELECT queries used in SPARQL-based targets. Informally, SHACL Full processors should be able to derive an equivalent ASK query from the SELECT query, pre-bind the potential focus node, and check whether the potential focus node needs to be validated against the shape that has the given target. Formally, let A be a SPARQL ASK query that is produced by replacing the SelectClause with ASK in the outermost SELECT query. Let rs be the set of RDF terms returned as bindings for the variable this in the solutions of the SELECT query. Then A returns true if and only if the variable this is pre-bound with a value from rs.

If the SELECT query of a SPARQL-based target does not fulfill this requirement, it needs to be accompanied by a SPARQL ASK query as the value for sh:ask. A SHACL engine can then determine whether a given shape applies to a given node by executing the ASK query with the variable this pre-bound to the node. If the ASK query evaluates to true then the node is in the target of the shape.
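The relationship between the SELECT query and the derived ASK query can be illustrated with a small, informative Python sketch. It is not a SPARQL implementation: the data graph is modeled as a set of tuples, the nodes ex:Alice and ex:Bob are invented for illustration, and both functions are hand-written stand-ins for the SELECT query of the example above and its ASK counterpart.

```python
# Invented toy data graph as (subject, predicate, object) tuples.
DATA = {
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Alice", "ex:bornIn", "ex:USA"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Bob", "ex:bornIn", "ex:Germany"),
}


def select_targets(graph):
    """Stand-in for the SELECT query: all ?this that are persons born in the USA."""
    persons = {s for (s, p, o) in graph if p == "rdf:type" and o == "ex:Person"}
    born_in_usa = {s for (s, p, o) in graph if p == "ex:bornIn" and o == "ex:USA"}
    return persons & born_in_usa


def ask_with_prebound_this(graph, this):
    """Stand-in for the derived ASK query with the variable this pre-bound."""
    return (this, "rdf:type", "ex:Person") in graph and (
        this, "ex:bornIn", "ex:USA") in graph


# The recommended restriction: ASK is true exactly for the SELECT bindings.
all_subjects = {s for (s, _, _) in DATA}
consistent = all(
    ask_with_prebound_this(DATA, node) == (node in select_targets(DATA))
    for node in all_subjects
)
```

When the two checks can disagree for some node, the restriction is not fulfilled and a separate sh:ask query would be needed.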

SPARQL-based Target Types

In some cases it would be too repetitive to declare SPARQL-based targets with similar SPARQL queries that only differ in a few aspects. SHACL-SPARQL defines a mechanism for user-defined constraint components, allowing users to reuse the same SPARQL query in a parameterized form. The SPARQL-based target types introduced in this section follow a similar design.

The class sh:TargetType can be used to declare high-level vocabularies for targets in a shapes graph. The class sh:SPARQLTargetType is declared as rdfs:subClassOf sh:TargetType for SPARQL-based target types. Other extension languages may define alternative execution instructions for target types with the same IRI, making them potentially more platform independent than pure SPARQL-based targets. Instances of the class sh:SPARQLTargetType specify a SPARQL SELECT query via the property sh:select, and this query has to fulfill the same syntactic and semantic rules as SPARQL-based targets.

Similar to SPARQL-based constraint components, such targets take parameters and the parameter values become pre-bound variables in the associated SPARQL queries. The parameter values of such targets cannot be blank nodes, and the same target cannot have more than one value per parameter. A target that lacks a value for a non-optional parameter is ignored, producing no target nodes. Similar to SPARQL-based constraint components, target types may also have values for the property sh:labelTemplate.

The following example declares a new SPARQL-based target type that takes one parameter ex:country that gets mapped into the variable country in the corresponding SPARQL query to determine the resulting target nodes.

ex:PeopleBornInCountryTarget
	a sh:SPARQLTargetType ;
	rdfs:subClassOf sh:Target ;
	sh:labelTemplate "All persons born in {$country}" ;
	sh:parameter [
		sh:path ex:country ;
		sh:description "The country that the focus nodes are 'born' in." ;
		sh:class ex:Country ;
		sh:nodeKind sh:IRI ;
	] ;
	sh:select """
		PREFIX ex: <http://example.com/ns#>
		SELECT ?this
		WHERE {
			?this a ex:Person .
			?this ex:bornIn $country .
		}
		""" .

Once such a target type has been defined in a shapes graph, it can be used by multiple shapes:

ex:GermanCitizenShape
	a sh:NodeShape ;
	sh:target [
		a ex:PeopleBornInCountryTarget ;
		ex:country ex:Germany ;
	] ;
	...
	
ex:USCitizenShape
	a sh:NodeShape ;
	sh:target [
		a ex:PeopleBornInCountryTarget ;
		ex:country ex:USA ;
	] ;
	...

The set of focus nodes produced by such a target type consists of all bindings of the variable this in the result set, when the SPARQL SELECT query has been executed with the pre-bound parameter values.
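As an informative illustration, the following Python sketch mimics how a parameter value is pre-bound before the query of ex:PeopleBornInCountryTarget is evaluated. The function, the dictionary-based representation of a target and the data are assumptions made for this sketch only; a real engine would execute the SPARQL query instead.

```python
# Invented toy data graph as (subject, predicate, object) tuples.
DATA = {
    ("ex:Anna", "rdf:type", "ex:Person"),
    ("ex:Anna", "ex:bornIn", "ex:Germany"),
    ("ex:Bill", "rdf:type", "ex:Person"),
    ("ex:Bill", "ex:bornIn", "ex:USA"),
    ("ex:Rex", "ex:bornIn", "ex:USA"),  # not an ex:Person
}


def people_born_in_country(graph, target):
    """Evaluate a target given as a dict of parameter values.

    A target without a value for the non-optional parameter ex:country
    is ignored and produces no target nodes.
    """
    country = target.get("ex:country")
    if country is None:
        return []
    persons = {s for (s, p, o) in graph if p == "rdf:type" and o == "ex:Person"}
    return sorted(
        s for (s, p, o) in graph
        if p == "ex:bornIn" and o == country and s in persons
    )


german_targets = people_born_in_country(DATA, {"ex:country": "ex:Germany"})
us_targets = people_born_in_country(DATA, {"ex:country": "ex:USA"})
```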

Property	Summary and Syntax Rules
`sh:annotationProperty`	The property that shall be set. Each result annotation has exactly one value for the property `sh:annotationProperty` and this value is an IRI.
`sh:annotationVarName`	The name of the SPARQL variable to take the annotation values from. Each result annotation has at most one value for the property `sh:annotationVarName` and this value is a literal with datatype `xsd:string`.
`sh:annotationValue`	Constant RDF terms that shall be used as default values.

Node Expressions

This section defines a feature called node expressions. Node expressions are declared as RDF nodes in a shapes graph and instruct a SHACL engine how to compute a list of nodes for a given focus node. Each node expression has one of the following types, each of which is defined together with its evaluation semantics in the following sub-sections.

Node Expression Type	Syntax (Informative)	Summary (Informative)
Focus Node Expression	`sh:this`	The list consisting of the current focus node.
Constant Term Expression	Any IRI or literal except `sh:this`	The list consisting of the given term.
Function Expression	Blank node with a list-valued triple	The results of evaluating a given SHACL Function.
Path Expression	Blank node with `sh:path`	The values of a given property path.
Exists Expression	Blank node with `sh:exists`	The list consisting of either `true` or `false` depending on whether input nodes exist.
If Expression	Blank node with `sh:if`	The results of either `sh:then` or `sh:else` depending on whether the `sh:if` node expression is `true`.
Filter Shape Expression	Blank node with `sh:filterShape`	The sub-list of the input nodes that conform to a given shape.
Intersection Expression	Blank node with `sh:intersection`	The intersection of two or more input node lists.
Union Expression	Blank node with `sh:union`	The union (concatenation) of two or more input node lists.
Minus Expression	Blank node with `sh:minus` and `sh:nodes`	The input nodes except those that are in another "minus" list.
Distinct Expression	Blank node with `sh:distinct`	The sub-list of the input nodes that are distinct, eliminating duplicates.
Count Expression	Blank node with `sh:count`	The number of input nodes as a single `xsd:integer` node.
Min Expression	Blank node with `sh:min`	The smallest of the input nodes.
Max Expression	Blank node with `sh:max`	The largest of the input nodes.
Sum Expression	Blank node with `sh:sum`	The sum of the input nodes.
Group Concat Expression	Blank node with `sh:groupConcat`	A string concatenation of all input nodes.
OrderBy Expression	Blank node with `sh:orderBy` and `sh:nodes`	The input nodes, ordered by a given expression.
Limit Expression	Blank node with `sh:limit` and `sh:nodes`	Only the first N of the input nodes.
Offset Expression	Blank node with `sh:offset` and `sh:nodes`	The input nodes except the first N ones.
SPARQL ASK Expression	Blank node with `sh:ask`	An `xsd:boolean` based on the result of a SPARQL ASK query.
SPARQL SELECT Expression	Blank node with `sh:select`	The results of a SPARQL SELECT query.

The basic idea of these expressions is that they can be used to derive a list of RDF nodes from a given focus node, for example all values of a given property of the focus node. Some of these expressions can use the output of another expression as their input, leading to evaluation chains and trees.

The following example declares a node expression that produces the display labels of all values of the property ex:customer that conform to a given shape ex:GoodCustomerShape. The assumption here is that there is a SHACL function ex:displayLabel which declares a single parameter.

[
	ex:displayLabel ( [
		sh:filterShape ex:GoodCustomerShape ;
		sh:nodes [ sh:path ex:customer ] ;
	] )
] .

To evaluate this example, an engine gets all values of ex:customer of the focus node, then filters them according to the shape ex:GoodCustomerShape and repeatedly calls the SHACL function ex:displayLabel with all values that pass the filter shape as arguments.

Important use cases of such expressions are expression constraints and SHACL rules, yet the basic functionality and vocabulary may find many other application areas.

Each of the following sub-sections defines a node expression type with its syntax rules and evaluation semantics based on a mapping operation Eval($expr, $this) where the first argument $expr is the given expression, $this is the current focus node and which produces a list of RDF nodes. Unless specified otherwise, the members of these lists are handled in their natural order.

A node expression cannot recursively have itself as a "nested" node expression, e.g. as value of sh:nodes.

Focus Node Expressions

The IRI sh:this is the (only) node declaring a focus node expression.

EVALUATION OF FOCUS NODE EXPRESSIONS

For the focus node expression sh:this, Eval(sh:this, $this) produces the list of length 1 with $this as its only member.

Constant Term Expressions

Any literal or IRI except sh:this declares a constant term expression.

EVALUATION OF CONSTANT TERM EXPRESSIONS

For the constant term expression $expr, Eval($expr, $this) produces the list of length 1 with $expr as its only member.

Exists Expressions

An exists expression is a blank node with exactly one value for sh:exists (which is a well-formed node expression).

EVALUATION OF EXISTS EXPRESSIONS

For the exists expression $expr with N being the node expression that is the value of sh:exists, Eval($expr, $this) produces the list consisting of exactly the node true if Eval(N, $this) produces at least one node, and the list consisting of exactly the node false otherwise.

In the following example, sh:exists is used to test whether the current focus node has any value for an example property. This is comparable to the SPARQL expression shown after it.

[
    sh:exists [ sh:path ex:someProperty ]
] .

{
    BIND (EXISTS { $this ex:someProperty ?any } AS ?result)
}

If Expressions

An if expression is a blank node with exactly one value for sh:if (which is a well-formed node expression), at most one value for sh:then (which is a well-formed node expression) and at most one value for sh:else (which is a well-formed node expression).

EVALUATION OF IF EXPRESSIONS

For the if expression $expr, let IF be the node expression that is the value of sh:if, and let THEN and ELSE be the corresponding values of sh:then and sh:else, respectively. If Eval(IF, $this) produces the list consisting of exactly the node true, then Eval($expr, $this) produces Eval(THEN, $this) (or the empty list if THEN is absent); otherwise it produces Eval(ELSE, $this) (or the empty list if ELSE is absent).

The following example produces the string node "married" if the focus node has a spouse, "not married" otherwise.

[
    sh:if [ sh:exists [ sh:path ex:spouse ] ] ;
    sh:then "married" ;
    sh:else "not married" ;
] .

{
    BIND (IF(EXISTS { $this ex:spouse ?any }, 
            "married", 
            "not married") AS ?result)
}
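The interplay of path, exists and if expressions can be sketched in Python as follows. This is an informative approximation only: node lists are plain Python lists, the data is invented, and both branches are passed in as already evaluated lists, whereas an engine would only need to evaluate the branch that is selected.

```python
# Invented toy data graph as (subject, predicate, object) tuples.
DATA = {
    ("ex:Alice", "ex:spouse", "ex:Bob"),
    ("ex:Carol", "ex:name", "Carol"),
}


def eval_path(graph, this, predicate):
    """Values of the predicate at the focus node (sorted only for repeatability)."""
    return sorted(o for (s, p, o) in graph if s == this and p == predicate)


def eval_exists(nodes):
    """[True] if the input expression produced at least one node, else [False]."""
    return [len(nodes) > 0]


def eval_if(condition, then_nodes=None, else_nodes=None):
    """Select the then-branch only if the condition is exactly the list [True]."""
    branch = then_nodes if condition == [True] else else_nodes
    return list(branch) if branch is not None else []


def marital_status(graph, this):
    has_spouse = eval_exists(eval_path(graph, this, "ex:spouse"))
    return eval_if(has_spouse, ["married"], ["not married"])
```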

Filter Shape Expressions

A filter shape expression is a blank node with exactly one value for sh:filterShape (which is a well-formed shape) and at most one value for sh:nodes (which is a well-formed node expression).

EVALUATION OF FILTER SHAPE EXPRESSIONS

For the filter shape expression $expr with S being the shape that is the value of sh:filterShape and N being the node expression that is the value of sh:nodes (defaulting to the focus node expression if absent), Eval($expr, $this) produces the list of nodes for each node n produced by Eval(N, $this) where n conforms to S.

The following example returns all values of ex:child that conform to the given SHACL shape. In order to conform to the shape, the child must have "male" as one of its values for ex:gender.

[
    sh:nodes [ sh:path ex:child ] ;
    sh:filterShape [
        sh:property [
            sh:path ex:gender ;
            sh:hasValue "male" ;
        ] ;
    ] ;
]
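A minimal Python sketch of this filtering step is shown below for illustration. Shape conformance is reduced to a hand-written predicate that stands in for the sh:hasValue constraint of the example; the data and function names are assumptions of the sketch, not part of SHACL.

```python
# Invented toy data graph as (subject, predicate, object) tuples.
DATA = {
    ("ex:Parent", "ex:child", "ex:Tom"),
    ("ex:Parent", "ex:child", "ex:Ann"),
    ("ex:Tom", "ex:gender", "male"),
    ("ex:Ann", "ex:gender", "female"),
}


def eval_path(graph, this, predicate):
    """Values of the predicate at the focus node (sorted only for repeatability)."""
    return sorted(o for (s, p, o) in graph if s == this and p == predicate)


def conforms_has_male_gender(graph, node):
    """Stand-in for the filter shape: ex:gender must have the value "male"."""
    return (node, "ex:gender", "male") in graph


def eval_filter_shape(graph, nodes, conforms):
    """Keep the input nodes that conform to the shape, preserving their order."""
    return [node for node in nodes if conforms(graph, node)]


children = eval_path(DATA, "ex:Parent", "ex:child")
sons = eval_filter_shape(DATA, children, conforms_has_male_gender)
```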

Function Expressions

A function expression is a blank node that does not fulfill any of the syntax rules of the other node expression types and which is the subject of exactly one triple T where the object is a well-formed SHACL list, and each member of that list is a well-formed node expression.

EVALUATION OF FUNCTION EXPRESSIONS

For the function expression $expr, Eval($expr, $this) produces the list of nodes returned by evaluating the SHACL function specified as predicate of the triple T mentioned above. The arguments of the function call(s) are based on the results of the node expressions listed in the object list of T so that the first list member is used for the first argument, etc. This is done for all combinations of nodes produced by each node expression. If one of the node expressions produces the empty list and the corresponding function parameter is non-optional (see sh:optional), then the result is the empty list.

As illustrated in the following example, function expressions are comparable to SPARQL BIND clauses.

[  ex:concat ( 
		[ sh:path ex:firstName ]
		" "
		[ sh:path ex:lastName ] 
	)
] .

{
	$this ex:firstName ?a .
	$this ex:lastName ?b .
	BIND (ex:concat(?a, " ", ?b) AS ?result) .
}
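The phrase "all combinations of nodes" can be read as a cross product over the argument lists. The following informative Python sketch shows this reading; the function concat is a plain Python stand-in for a SHACL function such as ex:concat, and optional parameters are not modeled, so any empty argument list yields the empty result.

```python
from itertools import product


def concat(*parts):
    """Plain Python stand-in for a SHACL function like ex:concat."""
    return "".join(parts)


def eval_function(func, argument_lists):
    """Call func once for every combination of nodes from the argument lists.

    product() yields nothing if any argument list is empty, which matches
    the rule for non-optional parameters.
    """
    return [func(*args) for args in product(*argument_lists)]


first_names = ["John", "Johnny"]
last_names = ["Doe"]
full_names = eval_function(concat, [first_names, [" "], last_names])
```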

Path Expressions

A path expression is a blank node with exactly one value for the property sh:path (which is a well-formed property path) and at most one value for sh:nodes (which is a well-formed node expression).

EVALUATION OF PATH EXPRESSIONS

For the path expression $expr that has the property path P as its value for sh:path and the node expression N as its value for sh:nodes (defaulting to the focus node expression if absent), Eval($expr, $this) produces the list of values of all nodes produced by Eval(N, $this) for the property path P. The order of results of these lists is undefined and they do not contain duplicates.

As illustrated in the following examples, path expressions are comparable to SPARQL basic graph patterns.

[  sh:path ex:firstName ] .

[  sh:nodes [ sh:path ex:children ] ;
   sh:path rdfs:label ;
] .

{   $this ex:firstName ?result . }

{   $this ex:children ?a .
    ?a rdfs:label ?result .
}

Intersection Expressions

An intersection expression is a blank node with exactly one value for the property sh:intersection which is a well-formed SHACL list with at least two members (which are well-formed node expressions).

EVALUATION OF INTERSECTION EXPRESSIONS

For the intersection expression $expr that has the list L as its value for sh:intersection, Eval($expr, $this) produces the list of nodes that are produced by the first node expression from the list and are also in the result lists produced by the other members of L.
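As an informative sketch, the evaluation can be approximated in Python by walking the result list of the first member and keeping the nodes that occur in every other result list. Representing nodes as strings and keeping the order of the first list are assumptions of this sketch.

```python
def eval_intersection(result_lists):
    """Nodes of the first list that also occur in all other lists."""
    first, *others = result_lists
    other_sets = [set(result) for result in others]
    return [node for node in first if all(node in s for s in other_sets)]


common = eval_intersection([
    ["ex:a", "ex:b", "ex:c"],
    ["ex:c", "ex:a"],
    ["ex:a", "ex:c", "ex:d"],
])
```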

Union Expressions

A union expression is a blank node with exactly one value for the property sh:union which is a well-formed SHACL list with at least two members (which are well-formed node expressions).

EVALUATION OF UNION EXPRESSIONS

For the union expression $expr that has the list L as its value for sh:union, Eval($expr, $this) produces the list of nodes that are the concatenation of the result lists produced by all of the members of L, preserving their original order.

As illustrated in the following example, union expressions are comparable to SPARQL UNION clauses. Similar to SPARQL, they may produce duplicate results.

[  sh:union (
		[ sh:path ex:firstName ]
		[ sh:path ex:givenName ]
   )
] .

{
	$this ex:firstName ?result . 
} UNION {
	$this ex:givenName ?result . 
}

Minus Expressions

A minus expression is a blank node with exactly one value for the property sh:minus which is a well-formed node expression and exactly one value for the property sh:nodes which is a well-formed node expression.

EVALUATION OF MINUS EXPRESSIONS

For the minus expression $expr that has the node expression M as its value for sh:minus and the node expression N as its value for sh:nodes, Eval($expr, $this) produces the list of nodes that are in Eval(N, $this) but not in Eval(M, $this).

In the following example, sh:minus returns all values of the property ex:children except those that are also values of ex:sons.

[
    sh:nodes [ sh:path ex:children ] ;
    sh:minus [ sh:path ex:sons ] ;
] .

{
    $this ex:children ?result .
    MINUS {
        $this ex:sons ?result .
    }
}
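In Python terms, and only as an informative sketch with invented values, the example corresponds to removing every node of the sh:minus result from the sh:nodes result:

```python
def eval_minus(nodes, minus):
    """Input nodes (sh:nodes) except those that also occur in the minus list."""
    excluded = set(minus)
    return [node for node in nodes if node not in excluded]


children = ["ex:Ann", "ex:Tom", "ex:Max"]
sons = ["ex:Tom", "ex:Max"]
daughters = eval_minus(children, sons)
```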

Distinct Expressions

A distinct expression is a blank node with exactly one value for the property sh:distinct which is a well-formed node expression.

EVALUATION OF DISTINCT EXPRESSIONS

For the distinct expression $expr that has the node expression N as its value for sh:distinct, Eval($expr, $this) produces the list of nodes that are in Eval(N, $this) dropping any duplicate nodes.

The following example returns all values of ex:customer and ex:client but without duplicates.

[
    sh:distinct [
        sh:union (
            [ sh:path ex:customer ]
            [ sh:path ex:client ]
        )
    ]
] .

SELECT DISTINCT (?c AS ?result)
WHERE {
    { $this ex:customer ?c }
    UNION
    { $this ex:client ?c }
}
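The combination of union and distinct can be sketched in Python as follows. The sketch is informative only; the values are invented, and keeping the first occurrence of each duplicate is an assumption, since the definition above only requires that duplicates are dropped.

```python
def eval_union(result_lists):
    """Concatenate the result lists in order; duplicates are preserved."""
    return [node for result in result_lists for node in result]


def eval_distinct(nodes):
    """Drop duplicates, keeping the first occurrence of each node."""
    return list(dict.fromkeys(nodes))


customers = ["ex:Acme", "ex:Globex"]
clients = ["ex:Globex", "ex:Initech"]
combined = eval_union([customers, clients])
unique = eval_distinct(combined)
```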

Count Expressions

A count expression is a blank node with exactly one value for the property sh:count which is a well-formed node expression.

EVALUATION OF COUNT EXPRESSIONS

For the count expression $expr that has the node expression N as its value for sh:count, Eval($expr, $this) produces the list consisting of exactly one literal with datatype xsd:integer representing the size of the input list produced by Eval(N, $this).

The following example returns the number of values of ex:customer.

[
    sh:count [ sh:path ex:customer ]
] .

SELECT (COUNT(?c) AS ?result)
WHERE {
    $this ex:customer ?c .
}

Min Expressions

A min expression is a blank node with exactly one value for the property sh:min which is a well-formed node expression.

EVALUATION OF MIN EXPRESSIONS

For the min expression $expr that has the node expression N as its value for sh:min, Eval($expr, $this) produces the empty list if the input list produced by Eval(N, $this) is empty, or the list consisting of exactly one node that is the minimum input node as defined by SPARQL MIN.

The following example returns the minimum of all values of ex:turnOver of all values of ex:client at the focus node.

[
    sh:min [ sh:path ( ex:client ex:turnOver ) ]
] .

SELECT (MIN(?t) AS ?result)
WHERE {
    $this ex:client/ex:turnOver ?t .
}

Max Expressions

A max expression is a blank node with exactly one value for the property sh:max which is a well-formed node expression.

EVALUATION OF MAX EXPRESSIONS

For the max expression $expr that has the node expression N as its value for sh:max, Eval($expr, $this) produces the empty list if the input list produced by Eval(N, $this) is empty, or the list consisting of exactly one node that is the maximum input node as defined by SPARQL MAX.

Sum Expressions

A sum expression is a blank node with exactly one value for the property sh:sum which is a well-formed node expression.

EVALUATION OF SUM EXPRESSIONS

For the sum expression $expr that has the node expression S as its value for sh:sum, and for which the results of Eval(S, $this) are the nodes n, Eval($expr, $this) produces the list consisting of exactly one node that is the sum of all n as defined by SPARQL SUM, producing the empty list if any of the nodes cannot be summed up.
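The aggregating expressions (count, min, max and sum) all reduce an input list to a list with at most one member. The following informative Python sketch shows that result shape for numeric inputs only; it does not reproduce the full comparison and arithmetic rules of SPARQL, and the helper is_number is an assumption of the sketch.

```python
def eval_count(nodes):
    """Always exactly one integer: the size of the input list."""
    return [len(nodes)]


def eval_min(nodes):
    """Empty list for empty input, otherwise the smallest node."""
    return [min(nodes)] if nodes else []


def eval_max(nodes):
    """Empty list for empty input, otherwise the largest node."""
    return [max(nodes)] if nodes else []


def is_number(node):
    # bool is excluded because it is a subclass of int in Python.
    return isinstance(node, (int, float)) and not isinstance(node, bool)


def eval_sum(nodes):
    """Empty list if any node cannot be summed up, otherwise the total."""
    if not all(is_number(node) for node in nodes):
        return []
    return [sum(nodes)]


turnovers = [1200, 800, 4300]
```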

Group Concat Expressions

A group concat expression is a blank node with exactly one value for the property sh:groupConcat which is a well-formed node expression. A group concat expression can have a single value for the property sh:separator which is a literal with datatype xsd:string.

EVALUATION OF GROUP CONCAT EXPRESSIONS

For the group concat expression $expr that has the node expression C as its value for sh:groupConcat, Eval($expr, $this) produces the list consisting of exactly one literal with datatype xsd:string consisting of the string representation of the nodes from the input list produced by Eval(C, $this). If S is the value of sh:separator then S will be inserted in between each pair of strings. It is up to the implementation to decide how to represent individual nodes as strings. For example, resources may be represented by their values of rdfs:label in the user's preferred language. However, if an individual node is a literal with datatype xsd:string then the lexical form of that node is used.

Tip: in order to control exactly how individual values get rendered, wrap the input node expression with a function expression that produces a string literal.

The following example returns a comma-separated list of the labels of the values of ex:child at the focus node, ordered by their individual labels. Here we assume each child has at most one label, to make the result predictable.

[
    sh:groupConcat [
        sh:path rdfs:label ;
        sh:nodes [ 
            sh:nodes [ sh:path ex:child ] ;
            sh:orderBy [ sh:path rdfs:label ] ;
        ] ;
    ] ;
    sh:separator ", " ;
] .

SELECT (GROUP_CONCAT(?label; separator=", ") AS ?result)
WHERE {
    $this ex:child ?child .
    ?child rdfs:label ?label .
} ORDER BY ?label

OrderBy Expressions

An orderBy expression is a blank node with exactly one value for the property sh:orderBy which is a well-formed node expression and with exactly one value for the property sh:nodes which is a well-formed node expression. An orderBy expression can have one value for the property sh:desc which is either true or false.

EVALUATION OF ORDERBY EXPRESSIONS

For the orderBy expression $expr that has the node expression N as its value for sh:nodes and the node expression O as its value for sh:orderBy, Eval($expr, $this) produces a list consisting of exactly the results n of Eval(N, $this) ordered by the first node from Eval(O, n) according to the node comparison policy of the SPARQL ORDER BY operator. If sh:desc is true for the expression, then the order is descending, otherwise ascending.

The following example returns the list of values of ex:child at the focus node, ordered by their individual labels. Here we assume each child has at most one label, to make the result predictable.

[
    sh:nodes [ sh:path ex:child ] ;
    sh:orderBy [ sh:path rdfs:label ] ;
] .

SELECT DISTINCT ?child
WHERE {
    $this ex:child ?child .
    OPTIONAL {
        ?child rdfs:label ?label .
    }
} ORDER BY ?label
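The ordering step can be approximated in Python as shown below. This informative sketch sorts each input node by the first result of the sh:orderBy expression evaluated with that node as focus node. Placing nodes without a sort key first in ascending order mirrors how SPARQL ORDER BY treats unbound values; the label table is invented.

```python
# Invented labels; ex:c3 has no label at all.
LABELS = {"ex:c1": ["Zoe"], "ex:c2": ["Adam"], "ex:c3": []}


def label_of(node):
    """Stand-in for evaluating [ sh:path rdfs:label ] with node as focus node."""
    return LABELS.get(node, [])


def eval_order_by(nodes, key_expr, desc=False):
    """Order the nodes by the first result of key_expr for each node."""
    def sort_key(node):
        keys = key_expr(node)
        # Nodes without a key sort before all nodes that have one.
        return (1, keys[0]) if keys else (0, "")
    return sorted(nodes, key=sort_key, reverse=desc)


children = ["ex:c1", "ex:c2", "ex:c3"]
ascending = eval_order_by(children, label_of)
descending = eval_order_by(children, label_of, desc=True)
```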

Limit Expressions

A limit expression is a blank node with exactly one value for the property sh:limit which is a literal with datatype xsd:integer and with exactly one value for the property sh:nodes which is a well-formed node expression.

EVALUATION OF LIMIT EXPRESSIONS

For the limit expression $expr that has the node expression N as its value for sh:nodes and L as its value for sh:limit, Eval($expr, $this) produces the list consisting of exactly the first L results of Eval(N, $this).

The following example returns at most two values of ex:parent at the focus node.

[
    sh:limit 2 ;
    sh:nodes [ sh:path ex:parent ] ;
] .

SELECT ?parent
WHERE {
    $this ex:parent ?parent .
} LIMIT 2

Offset Expressions

An offset expression is a blank node with exactly one value for the property sh:offset which is a literal with datatype xsd:integer and with exactly one value for the property sh:nodes which is a well-formed node expression.

EVALUATION OF OFFSET EXPRESSIONS

For the offset expression $expr that has the node expression N as its value for sh:nodes and O as its value for sh:offset, Eval($expr, $this) produces the list consisting of the results of Eval(N, $this) except for the first O ones.
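Limit and offset expressions correspond to list slicing, and nesting them yields simple paging. The following Python sketch is informative only; it assumes non-negative integer values and uses invented node names.

```python
def eval_limit(nodes, limit):
    """Only the first `limit` input nodes."""
    return list(nodes[:limit])


def eval_offset(nodes, offset):
    """The input nodes except the first `offset` ones."""
    return list(nodes[offset:])


parents = ["ex:p1", "ex:p2", "ex:p3", "ex:p4", "ex:p5"]
# A limit expression whose sh:nodes is an offset expression: page 2 of size 2.
second_page = eval_limit(eval_offset(parents, 2), 2)
```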

SPARQL ASK Expressions

A SPARQL ASK expression is a blank node with exactly one value for the property sh:ask which is a string literal. The blank node may have values for the property sh:prefixes and these values are IRIs or blank nodes. Using the values of sh:prefixes as defined by 5.2.1 Prefix Declarations for SPARQL Queries, the value of sh:ask must be a valid SPARQL 1.1 ASK query. The blank node may also have one value for the property sh:nodes which is a well-formed node expression.

EVALUATION OF SPARQL ASK EXPRESSIONS

For the SPARQL ASK expression $expr that represents the SPARQL ASK query A and that has the (optional) node expression N as its value for sh:nodes: If N is not present, then Eval($expr, $this) produces the list of length 1 consisting of either true or false based on the result of executing A with the node $this as the pre-bound variable $this. If N is present, then Eval($expr, $this) produces a corresponding list of boolean literals, one for each node produced by Eval(N, $this), with that node as the pre-bound variable $this.

The following example produces true if the focus node has the given rdf:type.

[
    sh:ask "ASK { $this a <http://example.org/ns#Person> }"
] .

SPARQL SELECT Expressions

A SPARQL SELECT expression is a blank node with exactly one value for the property sh:select which is a string literal. The blank node may have values for the property sh:prefixes and these values are IRIs or blank nodes. Using the values of sh:prefixes as defined by 5.2.1 Prefix Declarations for SPARQL Queries, the value of sh:select must be a valid SPARQL 1.1 SELECT query with exactly one result variable. The blank node may also have one value for the property sh:nodes which is a well-formed node expression.

EVALUATION OF SPARQL SELECT EXPRESSIONS

For the SPARQL SELECT expression $expr that represents the SPARQL SELECT query S and that has the (optional) node expression N as its value for sh:nodes: If N is not present, then Eval($expr, $this) produces the list of all bindings of the result variable produced by executing S with the node $this as the pre-bound variable $this. If N is present, then Eval($expr, $this) produces a list concatenating the corresponding results when S is executed for each node produced by Eval(N, $this) as the pre-bound variable $this.

The following example is an excerpt of a Turtle file declaring a SPARQL SELECT expression using namespace prefixes.

<http://example.org/ns>
    a owl:Ontology ;
    sh:declare [
        a sh:PrefixDeclaration ;
        sh:prefix "ex" ;
        sh:namespace "http://example.org/ns#"^^xsd:anyURI ;
    ] .

[
    sh:prefixes <http://example.org/ns> ;
    sh:select """
        SELECT DISTINCT ?client
        WHERE {
            $this ex:client ?client .
            ?client ex:address/ex:country ?country .
            ?country ex:memberOf ex:WHO .
        }
    """ ;
] .

Property	Summary and Syntax Rules
`sh:expression`	The node expression that must return `true`. The values of `sh:expression` at a shape must be well-formed node expressions.

SHACL Rules

SHACL defines an RDF vocabulary to describe shapes - collections of constraints that apply to a set of nodes. Shapes can be associated with nodes using a flexible target mechanism, e.g. for all instances of a class. One focus area of SHACL is data validation. However, the same principles of describing data patterns in shapes can also be exploited for other purposes. SHACL rules build on SHACL to form a light-weight RDF vocabulary for the exchange of rules that can be used to derive inferred RDF triples from existing asserted triples.

The SHACL rules feature defined in this section includes a general framework using properties such as sh:values, sh:rule and sh:condition, plus an extension mechanism for specific rule types. This document defines two such rule types: Triple rules and SPARQL rules. Other documents, including SHACL JavaScript Extensions [[shacl-js]], can define additional types of rules.

Examples of SHACL Rules

The following example illustrates the use of a triple rule that adds an rdf:type triple so that those SHACL instances of ex:Rectangle where the ex:width equals the ex:height are also marked to be instances of ex:Square. The rule applies only to well-formed rectangles that conform to the ex:Rectangle shape, e.g. by having exactly one width and height, both integers.

ex:Rectangle
	a rdfs:Class, sh:NodeShape ;
	rdfs:label "Rectangle" ;
	sh:property [
		sh:path ex:height ;
		sh:datatype xsd:integer ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
		sh:name "height" ;
	] ;
	sh:property [
		sh:path ex:width ;
		sh:datatype xsd:integer ;
		sh:maxCount 1 ;
		sh:minCount 1 ;
		sh:name "width" ;
	] ;
	sh:rule [
		a sh:TripleRule ;
		sh:subject sh:this ;
		sh:predicate rdf:type ;
		sh:object ex:Square ;
		sh:condition ex:Rectangle ;
		sh:condition [
			sh:property [
				sh:path ex:width ;
				sh:equals ex:height ;
			] ;
		] ;
	] .

ex:InvalidRectangle
	a ex:Rectangle .

ex:NonSquareRectangle
	a ex:Rectangle ;
	ex:height 2 ;
	ex:width 3 .
	
ex:SquareRectangle
	a ex:Rectangle ;
	ex:height 4 ;
	ex:width 4 .

For the data graph above, a SHACL rules engine will produce the following inferred triples:

	ex:SquareRectangle rdf:type ex:Square .

No inferences will be made for ex:NonSquareRectangle because its width is not equal to its height. No inferences will be made for ex:InvalidRectangle because although it has equal width and height (namely none), it does not pass the sh:condition of being a well-formed rectangle.
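The outcomes described above can be traced with a minimal Python sketch. It models the data graph as plain dictionaries mapping property names to sets of values, which is an assumption made purely for illustration and not part of SHACL; the helper names `conforms_to_rectangle` and `width_equals_height` are hypothetical. The sketch shows that sh:equals compares value sets, so two empty sets count as equal, and that it is the ex:Rectangle condition that excludes ex:InvalidRectangle.

```python
# Toy data graph: node -> property -> set of values (illustrative stand-in for RDF triples).
data = {
    "ex:InvalidRectangle": {},
    "ex:NonSquareRectangle": {"ex:height": {2}, "ex:width": {3}},
    "ex:SquareRectangle": {"ex:height": {4}, "ex:width": {4}},
}

def conforms_to_rectangle(node):
    """Mimics the ex:Rectangle shape: exactly one integer value each for height and width."""
    for prop in ("ex:height", "ex:width"):
        values = data[node].get(prop, set())
        if len(values) != 1 or not all(isinstance(v, int) for v in values):
            return False
    return True

def width_equals_height(node):
    """Mimics sh:equals: the two value sets must be identical (two empty sets are equal)."""
    return data[node].get("ex:width", set()) == data[node].get("ex:height", set())

# Both sh:condition values must hold before the rdf:type triple is inferred.
inferred = {
    (node, "rdf:type", "ex:Square")
    for node in data
    if conforms_to_rectangle(node) and width_equals_height(node)
}
print(inferred)
```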

The following example illustrates a simple use case of a SPARQL rule that applies to all instances of the class ex:Rectangle and computes the values of the ex:area property by multiplying the rectangle's width and height:

ex:RectangleShape
	a sh:NodeShape ;
	sh:targetClass ex:Rectangle ;
	sh:property [
		sh:path ex:width ;
		sh:datatype xsd:integer ;
		sh:minCount 1 ;
		sh:maxCount 1 ;
	] ;
	sh:property [
		sh:path ex:height ;
		sh:datatype xsd:integer ;
		sh:minCount 1 ;
		sh:maxCount 1 ;
	] .

ex:RectangleRulesShape
	a sh:NodeShape ;
	sh:targetClass ex:Rectangle ;
	sh:rule [
		a sh:SPARQLRule ;
		sh:prefixes ex: ;
		sh:construct """
			CONSTRUCT {
				$this ex:area ?area .
			}
			WHERE {
				$this ex:width ?width .
				$this ex:height ?height .
				BIND (?width * ?height AS ?area) .
			}
			""" ;
		sh:condition ex:RectangleShape ;    # Rule only applies to Rectangles that conform to ex:RectangleShape
	] ;
.

An engine that is capable of executing such rules uses the target statements associated with the shapes in the shapes graph to determine which rules need to be executed on which target nodes. For those target nodes that conform to all condition shapes, it executes the provided CONSTRUCT queries to produce the inferred triples. During the execution of the query, the variable this is pre-bound to the current focus node. For the following data graph, the triples below would be produced.

ex:ExampleRectangle
	a ex:Rectangle ;
	ex:width 7 ;
	ex:height 8 .

ex:InvalidRectangle    # Lacks a value for ex:height, so sh:condition is not met
	a ex:Rectangle ;
	ex:width 7 .

Inferred triples:

	ex:ExampleRectangle ex:area 56 .
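The sequence of target selection, condition check and query execution that leads to this result can be sketched in Python without a SPARQL engine. The set-of-tuples graph and the functions `values`, `conforms` and `construct_area` are hypothetical stand-ins for the data graph, the ex:RectangleShape condition and the CONSTRUCT query; a real engine would evaluate the shape and the query itself.

```python
# Data graph from the example, written as (subject, predicate, object) tuples.
graph = {
    ("ex:ExampleRectangle", "rdf:type", "ex:Rectangle"),
    ("ex:ExampleRectangle", "ex:width", 7),
    ("ex:ExampleRectangle", "ex:height", 8),
    ("ex:InvalidRectangle", "rdf:type", "ex:Rectangle"),
    ("ex:InvalidRectangle", "ex:width", 7),
}

def values(node, prop):
    """All objects of triples with the given subject and predicate."""
    return [o for (s, p, o) in graph if s == node and p == prop]

def conforms(node):
    """Stand-in for sh:condition ex:RectangleShape: exactly one integer width and height."""
    return all(
        len(values(node, prop)) == 1 and isinstance(values(node, prop)[0], int)
        for prop in ("ex:width", "ex:height")
    )

def construct_area(this):
    """Mirrors the CONSTRUCT query with the variable this bound to the focus node."""
    return {
        (this, "ex:area", w * h)
        for w in values(this, "ex:width")
        for h in values(this, "ex:height")
    }

# Target nodes correspond to sh:targetClass ex:Rectangle.
targets = sorted(s for (s, p, o) in graph if p == "rdf:type" and o == "ex:Rectangle")
inferred = set()
for node in targets:
    if conforms(node):
        inferred |= construct_area(node)
print(inferred)
```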

The following variation produces the same results as the SPARQL rule, but uses a Triple rule. While not as expressive as CONSTRUCT-based rules, Triple rules are more declarative and may be executed on platforms that do not support SPARQL.

ex:RectangleRulesShape
	a sh:NodeShape ;
	sh:targetClass ex:Rectangle ;
	sh:rule [
		a sh:TripleRule ;
		sh:subject sh:this ;
		sh:predicate ex:area ;    # Computes the values of the ex:area property at the focus nodes
		sh:object [
			ex:multiply ( [ sh:path ex:width ] [ sh:path ex:height ] ) ;
		] ;
		sh:condition ex:RectangleShape ;    # Rule only applies to Rectangles that conform to ex:RectangleShape
	] .

Finally, the following variation makes the same inferences using a property value rule. These rules use a more compact syntax than triple rules and are directly attached to property shapes with the property sh:values.

ex:RectangleRulesShape
	a sh:NodeShape ;
	sh:targetClass ex:Rectangle ;
	sh:property [
		sh:path ex:area ;    # Computes the values of the ex:area property at the focus nodes
		sh:values [
			ex:multiply ( [ sh:path ex:width ] [ sh:path ex:height ] ) ;
		] ;
	] .

General Syntax of SHACL Rules

The values of the property sh:rule at a shape are called SHACL rules. SHACL has a flexible design in which multiple types of rules can be supported, including Triple rules and SPARQL rules. Each rule type is identified by an IRI that is used as rdf:type of rules. Each rule type also defines execution instructions that can be implemented by rule engines.

Each SHACL rule has at least one rdf:type which is an IRI.

Rules can have multiple types, e.g. to provide instructions that work either in SPARQL or JavaScript, depending on the capabilities of the engine. The creator of such rules needs to make sure that such rules have consistent semantics. Rule R has rule type T if R is a SHACL instance of T.

All rules may have the properties defined in the rest of this section.

sh:condition

A rule may have values for the property sh:condition to specify shapes that the focus nodes must conform to before the rule gets executed.

The values of sh:condition at a rule must be well-formed shapes.

sh:order

Rules and shapes may specify their relative execution order as defined in this section.

Each rule or shape may have at most one value for the property sh:order. The values of sh:order at rules and shapes are literals with a numeric datatype such as xsd:decimal.

If unspecified, then the default execution order is 0. These values are used by a rules engine to determine the order of rules. When the rules associated with a shape are executed, rules with larger values will be executed after those with smaller values.

ex:RuleOrderExampleShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;
	sh:rule [
		a sh:SPARQLRule ;
		rdfs:label "Infer uncles, i.e. male siblings of the parents of $this" ;
		sh:prefixes ex: ;
		sh:order 1 ;   # Will be evaluated before 2
		sh:construct """
			CONSTRUCT {
				$this ex:uncle ?uncle .
			}
			WHERE {
				$this ex:parent ?parent .
				?parent ex:sibling ?uncle .
				?uncle ex:gender ex:male .
			}
			"""
	] ;
	sh:rule [
		a sh:SPARQLRule ;
		rdfs:label "Infer cousins, i.e. the children of the uncles" ;
		sh:prefixes ex: ;
		sh:order 2 ;
		sh:construct """
			CONSTRUCT {
				$this ex:cousin ?cousin .
			}
			WHERE {
				$this ex:uncle ?uncle .
				?cousin ex:parent ?uncle .
			}
			"""
	] .

sh:deactivated

Rules may be deactivated by setting sh:deactivated to true. Deactivated rules are ignored by the rules engine.

Each rule may have at most one value for the property sh:deactivated. The values of sh:deactivated are either of the xsd:boolean literals true or false.
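As a rough illustration of how sh:order and sh:deactivated interact, the following Python sketch filters and sorts the rules of a single shape. Rules are represented as plain dictionaries with hypothetical `label`, `order` and `deactivated` keys; this representation is an assumption of the sketch, not something defined by SHACL. Values are converted to `Decimal` because sh:order may be any numeric literal.

```python
from decimal import Decimal

def rules_in_execution_order(rules):
    """Drop deactivated rules and sort the rest by sh:order, defaulting to 0."""
    active = [r for r in rules if not r.get("deactivated", False)]
    return sorted(active, key=lambda r: Decimal(str(r.get("order", 0))))

rules = [
    {"label": "cousins", "order": 2},
    {"label": "uncles", "order": 1},
    {"label": "unordered"},  # no sh:order, so 0 is assumed
    {"label": "disabled", "order": -1, "deactivated": True},
]
sequence = [r["label"] for r in rules_in_execution_order(rules)]
print(sequence)
```

Rules that share the same order value have no defined relative order; the stable sort used here merely keeps them in input order.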

The sh:Rules Entailment Regime

SHACL defines the property sh:entailment to link a shapes graph with entailment regimes. The IRI sh:Rules represents the SHACL rules entailment regime. In the following example, the shapes graph indicates to a SHACL validation engine that the SHACL rules inside of the shapes graph need to be executed prior to starting the validation.

<http://example.org/my-shapes>
	a owl:Ontology ;
	sh:entailment sh:Rules .

Following the general policy for SHACL, validation engines that do not support the SHACL rules entailment regime MUST signal a failure if this triple is present. Validation engines that do support the SHACL rules entailment regime execute the rules following the rules execution instructions prior to performing the actual validation.
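A validation engine might implement this policy along the following lines. This is only a sketch: the function `check_entailments` and the exception `UnsupportedEntailment` are hypothetical names, and the way an engine collects the values of sh:entailment from the shapes graph is not shown.

```python
SH_RULES = "http://www.w3.org/ns/shacl#Rules"

class UnsupportedEntailment(Exception):
    """Signals a failure: the shapes graph requires a regime the engine lacks."""

def check_entailments(declared, supported):
    """Fail on unsupported regimes; return True if rules must run before validation."""
    missing = sorted(set(declared) - set(supported))
    if missing:
        raise UnsupportedEntailment(", ".join(missing))
    return SH_RULES in declared

# An engine that supports the SHACL rules entailment regime:
run_rules_first = check_entailments([SH_RULES], supported={SH_RULES})
print(run_rules_first)
```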

General Execution Instructions for SHACL Rules

A SHACL rules engine is a computer procedure that takes as input a data graph and a shapes graph and is capable of adding triples to the data graph. The new triples that are produced by a rules engine are called the inferred triples.

Note that, from a logical perspective, the data graph will be modified if triples get inferred. This means that rules can trigger after other triples have been inferred. However, in cases where the original data should not be modified, implementations may construct a logical data graph that has the original data as one subgraph and a dedicated inferences graph as another subgraph, and where the inferred triples get added to the inferences graph only.

In order to count as a SHACL rules engine, an implementation must be capable of inferring triples according to the following procedure (given in pseudo-code), or a different algorithm as long as the result is the same as specified. Note that this algorithm only covers a single "iteration" over all rules, without prescribing the behavior if the same rule needs to be applied multiple times after other rules have fired. The latter is left to future work.

	for each shape S in the shapes graph, ordered by execution order {
		for each non-deactivated rule R in the shape, ordered by execution order {
			for each target node T of S that conforms to all conditions of R {
				execute R using T as focus node following the execution instructions of its rule types
			}
		}
	}

The triples that are inferred by a rule do not immediately become part of the data graph, i.e. the triples produced by one rule cannot always be queried by other rules. The following policies reduce the likelihood of race conditions and better support parallel execution.

If two shapes have the same execution order then their newly inferred triples are not visible to each other.
If two rules have the same execution order then their newly inferred triples are not visible to each other.
If the same rule is executed on multiple target nodes then the newly inferred triples are not visible to the other target nodes.
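The procedure and the visibility policies above can be approximated by the following Python sketch. It is one possible reading, not a normative algorithm: graphs are sets of tuples, shapes and rules are dictionaries, and the keys `order`, `targets`, `rules`, `conditions` and `execute` are hypothetical names chosen for this illustration. Inferred triples are buffered per group of equal execution order and only published once the whole group has run.

```python
from itertools import groupby

def order_of(item):
    return item.get("order", 0)

def groups_by_order(items):
    """Yield lists of items sharing the same execution order, lowest order first."""
    for _, group in groupby(sorted(items, key=order_of), key=order_of):
        yield list(group)

def run_once(data, shapes):
    """Single pass over all rules; returns only the newly inferred triples."""
    graph = set(data)
    for shape_group in groups_by_order(shapes):
        from_shapes = set()
        for shape in shape_group:
            visible = set(graph)  # same-order shapes all start from this snapshot
            active = [r for r in shape["rules"] if not r.get("deactivated", False)]
            for rule_group in groups_by_order(active):
                from_rules = set()
                for rule in rule_group:
                    for node in shape["targets"](visible):
                        if all(c(visible, node) for c in rule.get("conditions", [])):
                            from_rules |= rule["execute"](visible, node)
                visible |= from_rules  # published only after the whole group has run
            from_shapes |= visible - graph
        graph |= from_shapes
    return graph - set(data)

def infer_uncles(g, this):
    return {(this, "ex:uncle", u)
            for (s, p, parent) in g if s == this and p == "ex:parent"
            for (s2, p2, u) in g if s2 == parent and p2 == "ex:sibling"
            if (u, "ex:gender", "ex:male") in g}

def infer_cousins(g, this):
    return {(this, "ex:cousin", c)
            for (s, p, u) in g if s == this and p == "ex:uncle"
            for (c, p2, o) in g if p2 == "ex:parent" and o == u}

data = {
    ("ex:Alice", "ex:parent", "ex:Bob"),
    ("ex:Bob", "ex:sibling", "ex:Carl"),
    ("ex:Carl", "ex:gender", "ex:male"),
    ("ex:Dave", "ex:parent", "ex:Carl"),
}

def person_shape(uncle_order, cousin_order):
    return {"targets": lambda g: ["ex:Alice"],
            "rules": [{"order": cousin_order, "execute": infer_cousins},
                      {"order": uncle_order, "execute": infer_uncles}]}

ordered = run_once(data, [person_shape(1, 2)])     # cousins rule sees the uncle triple
same_order = run_once(data, [person_shape(0, 0)])  # cousins rule cannot see it
print(sorted(ordered), sorted(same_order))
```

With distinct order values the cousin triple is inferred from the uncle triple produced earlier in the same pass; with equal order values only the uncle triple is produced, because the two rules do not see each other's results.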

If a rules engine is not able to execute a given rule because it does not support any of the rule types of the rule, then it reports a failure.
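Since a rule can have several rule types, an engine only needs to support one of them. A minimal sketch of that selection step follows; `choose_rule_type` and `UnsupportedRule` are hypothetical names, and `sh:JSRule` merely stands in for a rule type defined by another document.

```python
class UnsupportedRule(Exception):
    """Reported when none of a rule's types can be executed by this engine."""

def choose_rule_type(rule_types, supported_types):
    """Return the first rule type the engine supports, or report a failure."""
    for rule_type in rule_types:
        if rule_type in supported_types:
            return rule_type
    raise UnsupportedRule("no supported rule type among: " + ", ".join(rule_types))

# An engine without SPARQL support can still run a rule that is also a triple rule.
chosen = choose_rule_type(["sh:SPARQLRule", "sh:TripleRule"], {"sh:TripleRule"})
print(chosen)
```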

At no time are inferred triples visible to the shapes graph, i.e. it is impossible for rules to modify the definitions of rules or shapes.

Triple Rules

This section defines a rule type called triple rules, identified by the IRI sh:TripleRule. Triple rules have the following properties:

Property	Summary and Syntax Rules
`sh:subject`	The node expression used to compute the subjects of the triples. Each triple rule must have exactly one value of the property `sh:subject` (which must be a well-formed node expression).
`sh:predicate`	The node expression used to compute the predicates of the triples. Each triple rule must have exactly one value of the property `sh:predicate` (which must be a well-formed node expression).
`sh:object`	The node expression used to compute the objects of the triples. Each triple rule must have exactly one value of the property `sh:object` (which must be a well-formed node expression).

EXECUTION OF TRIPLE RULES

Let S, P and O be the sets of nodes produced by evaluating the node expressions that are the values of sh:subject, sh:predicate and sh:object respectively at the triple rule. For each combination of members s of S, p of P and o of O, infer a triple with subject s, predicate p and object o.
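The combination step is a Cartesian product of the three node sets, which can be sketched in Python as follows. Evaluating the node expressions themselves is out of scope here; the function name `execute_triple_rule` and the extra object `ex:Shape` are hypothetical and serve only to show more than one combination.

```python
from itertools import product

def execute_triple_rule(subjects, predicates, objects):
    """Infer one (s, p, o) triple for every combination of the three node sets."""
    return set(product(subjects, predicates, objects))

inferred = execute_triple_rule(
    {"ex:SquareRectangle"},     # result of evaluating sh:subject
    {"rdf:type"},               # result of evaluating sh:predicate
    {"ex:Square", "ex:Shape"},  # result of evaluating sh:object
)
print(sorted(inferred))
```

If any of the three node expressions produces no nodes, the product is empty and nothing is inferred.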

Property Value Rules

Property value rules provide syntactic sugar for triple rules, supporting the use case where triple rules are used to infer values of a single property at a single focus node.

A property value rule is represented by a value V of the property sh:values of a property shape P that has an IRI path as its value for sh:path. The values of the property sh:values must be well-formed node expressions. For each node shape N that has P as value of sh:property, there is an implicit triple rule T equivalent to the following triples, where path is the value of sh:path at P (unless either P or N has true as the value of sh:deactivated):

N sh:rule T
T rdf:type sh:TripleRule
T sh:subject sh:this
T sh:predicate path
T sh:object V

An example of a property value rule has already been provided above.
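The mapping from a property value rule to its implicit triple rule can be sketched as a small Python function. Shapes are modeled as dictionaries keyed by property name, and non-IRI paths are assumed to be represented by something other than a string; both choices, like the name `implicit_triple_rules`, are assumptions of this sketch.

```python
def implicit_triple_rules(node_shape, property_shape):
    """Return the triple rules implied by sh:values on property_shape, as plain dicts."""
    if node_shape.get("sh:deactivated") or property_shape.get("sh:deactivated"):
        return []
    path = property_shape.get("sh:path")
    if not isinstance(path, str):  # only IRI paths qualify in this sketch
        return []
    return [
        {
            "rdf:type": "sh:TripleRule",
            "sh:subject": "sh:this",
            "sh:predicate": path,
            "sh:object": value_expr,
        }
        for value_expr in property_shape.get("sh:values", [])
    ]

area_shape = {
    "sh:path": "ex:area",
    # Opaque placeholder for the ex:multiply node expression of the earlier example.
    "sh:values": ["<ex:multiply(ex:width, ex:height)>"],
}
rules = implicit_triple_rules({"sh:targetClass": "ex:Rectangle"}, area_shape)
print(rules)
```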

SPARQL Rules

This section defines a rule type called SPARQL rules, identified by the IRI sh:SPARQLRule. SPARQL rules have the following properties:

Property	Summary and Syntax Rules
`sh:construct`	The SPARQL CONSTRUCT query. SPARQL rules must have exactly one value for the property `sh:construct`. The values of `sh:construct` are literals with datatype `xsd:string`.
`sh:prefixes`	The prefixes to use to turn the `sh:construct` into a SPARQL query. SPARQL rules may use the property `sh:prefixes` to declare a dependency on prefixes based on the mechanism defined in Prefix Declarations for SPARQL Queries from the SHACL specification [[!shacl]]. This mechanism allows users to abbreviate URIs in the `sh:construct` strings.

EXECUTION OF SPARQL RULES

Let Q be the SPARQL CONSTRUCT query derived from the values of the properties sh:construct and sh:prefixes of the SPARQL rule in the shapes graph. For each focus node, execute the query Q pre-binding the variable this to the focus node, and infer the constructed triples.
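The derivation of Q from sh:prefixes and sh:construct amounts to prepending PREFIX declarations to the query string, as in the following sketch. The function `build_construct_query` is a hypothetical helper, and the dictionary of declarations is assumed to have been collected from the prefix declarations in the shapes graph beforehand. Pre-binding of the variable this is left to the SPARQL engine and is not modeled here.

```python
def build_construct_query(declarations, construct):
    """Prepend one PREFIX line per declaration to the sh:construct string."""
    header = "\n".join(
        f"PREFIX {prefix}: <{namespace}>"
        for prefix, namespace in sorted(declarations.items())
    )
    return f"{header}\n{construct}" if header else construct

construct = """CONSTRUCT {
    $this ex:area ?area .
}
WHERE {
    $this ex:width ?width .
    $this ex:height ?height .
    BIND (?width * ?height AS ?area) .
}"""

query = build_construct_query({"ex": "http://example.com/ns#"}, construct)
print(query)
```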

Document Conventions

Terminology

Introduction