This draft has been submitted to the W3C SHACL Community Group for publication as a Community Report, as input for a potential revision of the SHACL Advanced Features document.
Changes since the SHACL-AF Working Group Note release of 08 June 2017
Some examples in this document use Turtle [[!turtle]]. The reader is expected to be familiar with SHACL [[!shacl]] and SPARQL [[!sparql11-query]].
Within this document, the following namespace prefix bindings are used:
Prefix | Namespace |
---|---|
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
sh: | http://www.w3.org/ns/shacl# |
xsd: | http://www.w3.org/2001/XMLSchema# |
ex: | http://example.com/ns# |
Throughout the document, color-coded boxes containing RDF graphs in Turtle will appear. These fragments of Turtle documents use the prefix bindings given above.
# This box represents a shapes graph <s> <p> <o> .
// This box contains JavaScript code
# This box represents a data graph.
# This box represents an output results graph
Formal definitions appear in blue boxes:
# This box contains textual definitions.
Grey boxes such as this include syntax rules that apply to the shapes graph.
true denotes the RDF term "true"^^xsd:boolean.
false denotes the RDF term "false"^^xsd:boolean.
The terminology used throughout this document is consistent with the definitions in the main SHACL [[!shacl]] specification, which references terms from RDF [[!rdf11-concepts]]. This includes the terms binding, blank node, conformance, constraint, constraint component, data graph, datatype, failure, focus node, RDF graph, ill-formed, IRI, literal, local name, member, node, node shape, object, parameter, pre-binding, predicate, property path, property shape, RDF term, SHACL instance, SHACL list, SHACL subclass, shape, shapes graph, solution, subject, target, triple, validation, validation report, validation result, validator, value, value node.
The SHACL specification [[!shacl]] is divided into SHACL Core and SHACL-SPARQL:
This document extends the functionality of SHACL by defining RDF vocabularies to cover the following features:
Taken together or individually, these features greatly extend the application scenarios of SHACL, and SHACL-SPARQL in particular.
Some of the features presented here (including node expressions, expression constraints and triple rules) do not necessarily require a SPARQL processor and could be used as extensions of pure SHACL Core implementations. Other features (including custom target types, SHACL functions, and general SHACL rules) define extension mechanisms that can also be used with languages other than SPARQL, such as JavaScript (as defined by the SHACL-JS document [[shacl-js]]).
A SHACL-SPARQL processor that also supports all features defined in this document is called an Advanced SHACL-SPARQL processor.
In general, targets define a mechanism that is used by SHACL engines to determine the focus nodes
that should be validated against a given shape.
SHACL Core [[!shacl]] defines a fixed set of Core targets by means of properties such as sh:targetClass.
These Core targets were designed to cover a large number of use cases while retaining a simple declarative data model.
However, in some use cases, these Core targets are not sufficient. For example, it is impossible to state that
a shape should apply only to a subset of instances of a class, e.g. persons born in the USA.
Neither is it possible to state that a shape should apply to all subjects in a graph, or to nodes selected by completely
different, application-specific mechanisms.
This section defines richer mechanisms to define targets, called custom targets.
Custom targets are the values of the property sh:target in the shapes graph.
The property sh:target has a status similar to that of, for example, sh:targetClass, and all subjects of sh:target triples are also shapes.
The values of sh:target at a shape are IRIs or blank nodes.
A SHACL engine that supports custom targets uses the values of the custom target node
to compute the target nodes for the associated shape.
The algorithm that is used for this computation depends on the rdf:type of the custom target.
The following sub-sections define two such algorithms:
However, other types of targets can be supported by other extension languages such as JavaScript.
The class sh:Target is the recommended base class for such extensions.
The behavior of a SHACL engine that is unable to handle a given custom target is left undefined.
SHACL Core processors do not even need to be aware of the existence of the sh:target property.
Engines that are aware of this property and cannot handle a given custom target SHOULD at least report a warning.
Custom targets that are SHACL instances of sh:SPARQLTarget are called SPARQL-based targets.
SPARQL-based targets have exactly one value for the property sh:select.
SPARQL-based targets may have values for the property sh:prefixes and these values are IRIs or blank nodes.
Using the values of sh:prefixes as defined by 5.2.1 Prefix Declarations for SPARQL Queries, the values of sh:select must be valid SPARQL 1.1 SELECT queries with a single result variable this.
SPARQL-based targets have at most one value for the property sh:ask.
The following example declares a well-formed SPARQL-based target that produces all persons born in the USA:
    ex:
        sh:declare [
            sh:prefix "ex" ;
            sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
        ] .

    ex:USCitizenShape
        a sh:NodeShape ;
        sh:target [
            a sh:SPARQLTarget ;
            sh:prefixes ex: ;
            sh:select """
                SELECT ?this
                WHERE {
                    ?this a ex:Person .
                    ?this ex:bornIn ex:USA .
                }
                """ ;
        ] ;
        ...
Let Q be the SPARQL SELECT query derived from the values of sh:select and sh:prefixes of the SPARQL-based target T.
The target nodes of T are the bindings of the variable this returned by Q against the data graph.
While the SELECT queries can be used to identify all focus nodes for a given shape, SHACL processors sometimes
also need to compute the inverse direction and find all shapes against which a given node needs to be validated.
For this reason, the following semantic restriction is recommended for SELECT queries used in SPARQL-based targets.
Informally, SHACL Full processors should be able to derive an equivalent ASK query from the SELECT query,
pre-bind the potential focus node,
and check whether the potential focus node needs to be validated against the shape that has the given target.
Formally, let A be a SPARQL ASK query that is produced by replacing the SelectClause with ASK in the outermost SELECT query.
Let rs be the set of RDF terms returned as bindings for the variable this in the solutions of the SELECT query.
Then A returns true if and only if the variable this is pre-bound with a value from rs.
If the SELECT query of a SPARQL-based target does not fulfill this requirement, it needs to be accompanied
by a SPARQL ASK query as the value for sh:ask.
A SHACL engine can then determine whether a given shape applies to a given node by executing the ASK
query with the variable this pre-bound to the node.
If the ASK query evaluates to true then the node is in the target of the shape.
In some cases it would be too repetitive to declare SPARQL-based targets with similar SPARQL queries that only differ in a few aspects. SHACL-SPARQL defines a mechanism for user-defined constraint components, allowing users to reuse the same SPARQL query in a parameterized form. The SPARQL-based target types introduced in this section follow a similar design.
The class sh:TargetType can be used to declare high-level vocabularies for targets in a shapes graph.
The class sh:SPARQLTargetType is declared as rdfs:subClassOf sh:TargetType for SPARQL-based target types.
Other extension languages may define alternative execution instructions for target types with the same IRI,
making them potentially more platform independent than pure SPARQL-based targets.
Instances of the class sh:SPARQLTargetType specify a SPARQL SELECT query via the property sh:select,
and this query has to fulfill the same syntactic and semantic rules as the queries of SPARQL-based targets.
Similar to SPARQL-based constraint components, such targets take parameters and
the parameter values become pre-bound variables in the associated SPARQL queries.
The parameter values of such targets cannot be blank nodes, and the same target cannot have more than one value per parameter.
A target that lacks a value for a non-optional parameter is ignored, producing no target nodes.
Similar to SPARQL-based constraint components, target types may also have values for the property sh:labelTemplate.
The following example declares a new SPARQL-based target type that takes one parameter ex:country, which is mapped to the variable country in the corresponding SPARQL query to determine the resulting target nodes.
    ex:PeopleBornInCountryTarget
        a sh:SPARQLTargetType ;
        rdfs:subClassOf sh:Target ;
        sh:labelTemplate "All persons born in {$country}" ;
        sh:parameter [
            sh:path ex:country ;
            sh:description "The country that the focus nodes are 'born' in." ;
            sh:class ex:Country ;
            sh:nodeKind sh:IRI ;
        ] ;
        sh:select """
            PREFIX ex: <http://example.com/ns#>
            SELECT ?this
            WHERE {
                ?this a ex:Person .
                ?this ex:bornIn $country .
            }
            """ .
Once such a target type has been defined in a shapes graph, it can be used by multiple shapes:
    ex:GermanCitizenShape
        a sh:NodeShape ;
        sh:target [
            a ex:PeopleBornInCountryTarget ;
            ex:country ex:Germany ;
        ] ;
        ...

    ex:USCitizenShape
        a sh:NodeShape ;
        sh:target [
            a ex:PeopleBornInCountryTarget ;
            ex:country ex:USA ;
        ] ;
        ...
The set of focus nodes produced by such a target type consists of all bindings of the variable this in the result set,
when the SPARQL SELECT query has been executed with the pre-bound parameter values.
This section extends the general mechanism from SHACL-SPARQL [[!shacl]] to produce validation reports as a result of the validation.
Implementations that support this feature make it possible to inject so-called annotation properties
into the validation result nodes created for each solution produced by the SELECT queries of a
SPARQL-based constraint or constraint component.
Any such annotation property needs to be declared via a value of sh:resultAnnotation at the subject of the sh:select or sh:ask triple.
The values of sh:resultAnnotation are called result annotations and are either IRIs or blank nodes.
Result annotations have the following properties:
Property | Summary and Syntax Rules |
---|---|
sh:annotationProperty | The property that shall be set. Each result annotation has exactly one value for the property sh:annotationProperty and this value is an IRI. |
sh:annotationVarName | The name of the SPARQL variable to take the annotation values from. Each result annotation has at most one value for the property sh:annotationVarName and this value is a literal with datatype xsd:string. |
sh:annotationValue | Constant RDF terms that shall be used as default values. |
For each solution of a SELECT result set, a SHACL processor that supports annotations walks through the declared result annotations. The mapping from result annotations to SPARQL variables uses the following rules:
1. Use the value of the property sh:annotationVarName, if present.
2. Otherwise, use the local name of the value of sh:annotationProperty as the variable name.

If a variable name could be determined, then the SHACL processor copies the binding for the given variable
as a value for the property specified using sh:annotationProperty
into the validation result that is being produced for the current solution.
If the variable has no binding in the result set solution,
then the values of sh:annotationValue are used, if present.
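The mapping rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: result annotations and SPARQL solutions are modelled as plain dictionaries, the keys `property`, `varName` and `values` are hypothetical stand-ins for sh:annotationProperty, sh:annotationVarName and sh:annotationValue, and the fallback variable name is taken to be the local name of the annotation property, computed by a simplistic helper.

```python
def local_name(iri: str) -> str:
    """Hypothetical helper: the text after the last '#' or '/' of an IRI."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def annotation_values(annotation: dict, solution: dict) -> list:
    """Values to attach to one validation result for one result annotation.

    annotation: {"property": IRI, "varName": optional str, "values": optional list}
    solution:   maps SPARQL variable names to their bound terms
    """
    # Prefer the explicit variable name, else fall back to the property's local name.
    var_name = annotation.get("varName") or local_name(annotation["property"])
    if var_name in solution:
        # A binding exists: copy it into the validation result.
        return [solution[var_name]]
    # No binding: use the declared constant defaults, if any.
    return list(annotation.get("values", []))
```

For instance, an annotation for `http://example.com/ns#time` without an explicit variable name would pick up the binding of the variable `time`.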
Here is an example illustrating the use of result annotations.
    ex:AnnotationExample
        a sh:NodeShape ;
        sh:targetNode ex:ExampleResource ;
        sh:sparql [   # _:b1
            sh:resultAnnotation [
                sh:annotationProperty ex:time ;
                sh:annotationVarName "time" ;
            ] ;
            sh:select """
                SELECT $this ?message ?time
                WHERE {
                    BIND (CONCAT("The ", "message.") AS ?message) .
                    BIND (NOW() AS ?time) .
                }
                """ ;
        ] .
Validation produces the following validation report:
    [   a sh:ValidationReport ;
        sh:conforms false ;
        sh:result [
            a sh:ValidationResult ;
            sh:focusNode ex:ExampleResource ;
            sh:resultMessage "The message." ;
            sh:resultSeverity sh:Violation ;
            sh:sourceConstraint _:b1 ;
            sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;
            sh:sourceShape ex:AnnotationExample ;
            ex:time "2015-03-27T10:58:00"^^xsd:dateTime ; # Example
        ]
    ] .
SHACL functions declare operations that produce an RDF term based on zero or more parameters and a data graph. Each SHACL function has an IRI. The actual execution logic (or algorithm) of a SHACL function can be declared in a variety of execution languages. This document defines one specific kind of SHACL function, the SPARQL-based function. JavaScript-based functions are defined in the separate SHACL-JS document [[shacl-js]]. The same function IRI can potentially be executed on a multitude of platforms, if it declares execution instructions for these platforms.
SHACL functions can be called within FILTER or BIND clauses and similar features of SPARQL queries. SHACL functions can also be used declaratively in frameworks such as the SHACL node expressions which are used in SHACL rules. In those scenarios they may be used to perform data transformations such as string concatenation.
The following example illustrates the declaration of a SHACL function based on a simple mathematical SPARQL query.
    ex:multiply
        a sh:SPARQLFunction ;
        rdfs:comment "Multiplies its two arguments $op1 and $op2." ;
        sh:parameter [
            sh:path ex:op1 ;
            sh:datatype xsd:integer ;
            sh:description "The first operand" ;
        ] ;
        sh:parameter [
            sh:path ex:op2 ;
            sh:datatype xsd:integer ;
            sh:description "The second operand" ;
        ] ;
        sh:returnType xsd:integer ;
        sh:select """
            SELECT ($op1 * $op2 AS ?result)
            WHERE {
            }
            """ .
Using the declaration above, SPARQL engines that support SHACL functions install a new SPARQL function based on the
SPARQL 1.1 Extensible Value Testing mechanism.
Such engines are then able to handle expressions such as ex:multiply(7, 8), producing 56,
as illustrated in the following SPARQL query.
    SELECT ?subject ?area
    WHERE {
        ?subject ex:width ?width .
        ?subject ex:height ?height .
        BIND (ex:multiply(?width, ?height) AS ?area) .
    }
The following sections introduce the general properties that such functions may have, before the specific characteristics of SPARQL-based functions are defined.
The parameters of a SHACL function are declared using the property sh:parameter.
This corresponds closely to the parameter declarations of SPARQL-based constraint components, and the same syntax rules apply.
Parameters are ordered, corresponding to the notation of function calls in SPARQL such as ex:exampleFunction(?param1, ?param2).
The ordering of function parameters is determined as follows:
- If any of the parameters has a value for sh:order then all of them are ordered in ascending order by the parameters' numeric values of sh:order, using 0 as default value if unspecified.
- If none of the parameters has a value for sh:order then all of them are ordered in ascending order of the local names of their declared sh:path values.
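As a rough illustration of this ordering, the following Python sketch sorts parameter declarations that are modelled as dictionaries. The keys `path` and `order` are hypothetical stand-ins for sh:path and sh:order, and `local_name` is a simplistic helper that is assumed here, not something defined by this document.

```python
def local_name(iri: str) -> str:
    """Hypothetical helper: the text after the last '#' or '/' of an IRI."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def order_parameters(params: list) -> list:
    """Return the parameter declarations in function-call order."""
    if any("order" in p for p in params):
        # At least one sh:order is present: sort by it, defaulting to 0.
        return sorted(params, key=lambda p: p.get("order", 0))
    # No sh:order anywhere: sort by the local name of the sh:path IRI.
    return sorted(params, key=lambda p: local_name(p["path"]))
```

With the ex:multiply declaration from this section, which has no sh:order values, ex:op1 would come before ex:op2 because of the local names "op1" and "op2".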
Each parameter may have its property sh:optional set to true to indicate that the parameter is not mandatory.
If a function gets invoked without all its mandatory parameters then it returns no result node
(an error in SPARQL, producing unbound in a BIND statement).
A function may declare a single return type via sh:returnType.
A function has at most one value for sh:returnType.
The values of sh:returnType are IRIs.
The return type may serve for documentation purposes only.
However, in some execution languages such as JavaScript, the declared sh:returnType may inform
a processor how to cast a native value into an RDF term.
SHACL instances of sh:SPARQLFunction that are IRIs in a shapes graph are called SPARQL-based functions.
SPARQL-based functions have exactly one value for either sh:ask or sh:select.
The values of these properties are strings that can be parsed into SPARQL queries of type ASK (for sh:ask)
or SELECT (for sh:select) using the SHACL-SPARQL prefix declaration mechanism.
SELECT queries return exactly one result variable and do not use the SELECT * syntax.
When the function is executed, the SPARQL processor needs to pre-bind variables based on the provided arguments
of the function call.
In the SHACL functions example above, the value for the parameter declared as ex:op1 is pre-bound to the SPARQL variable $op1, etc.
For ASK queries, the function's return value is the result of the ASK query execution, i.e. true or false.
For SELECT queries, the function's return value is the binding of the (single) result variable of the first solution in the result set.
Since all other bindings will be ignored, such SELECT queries should return at most one solution.
If the result variable is unbound, then the function generates a SPARQL error.
This section defines a feature called node expressions. Node expressions are declared as RDF nodes in a shapes graph and instruct a SHACL engine how to compute a list of nodes for a given focus node. Each node expression has one of the following types, each of which is defined together with its evaluation semantics in the following sub-sections.
Node Expression Type | Syntax (Informative) | Summary (Informative) |
---|---|---|
Focus Node Expression | sh:this | The list consisting of the current focus node. |
Constant Term Expression | Any IRI or literal except sh:this | The list consisting of the given term. |
Function Expression | Blank node with a list-valued triple | The results of evaluating a given SHACL Function. |
Path Expression | Blank node with sh:path | The values of a given property path. |
Exists Expression | Blank node with sh:exists | The list consisting of either true or false depending on whether input nodes exist. |
If Expression | Blank node with sh:if | The results of either sh:then or sh:else depending on whether the sh:if node expression is true. |
Filter Shape Expression | Blank node with sh:filterShape | The sub-list of the input nodes that conform to a given shape. |
Intersection Expression | Blank node with sh:intersection | The intersection of two or more input node lists. |
Union Expression | Blank node with sh:union | The union (concatenation) of two or more input node lists. |
Minus Expression | Blank node with sh:minus and sh:nodes | The input nodes except those that are in another "minus" list. |
Distinct Expression | Blank node with sh:distinct | The sub-list of the input nodes that are distinct, eliminating duplicates. |
Count Expression | Blank node with sh:count | The number of input nodes as a single xsd:integer node. |
Min Expression | Blank node with sh:min | The smallest of the input nodes. |
Max Expression | Blank node with sh:max | The largest of the input nodes. |
Sum Expression | Blank node with sh:sum | The sum of the input nodes. |
Group Concat Expression | Blank node with sh:groupConcat | A string concatenation of all input nodes. |
OrderBy Expression | Blank node with sh:orderBy and sh:nodes | The input nodes, ordered by a given expression. |
Limit Expression | Blank node with sh:limit and sh:nodes | Only the first N of the input nodes. |
Offset Expression | Blank node with sh:offset and sh:nodes | The input nodes except the first N ones. |
SPARQL ASK Expression | Blank node with sh:ask | An xsd:boolean based on the result of a SPARQL ASK query. |
SPARQL SELECT Expression | Blank node with sh:select | The results of a SPARQL SELECT query. |
The basic idea of these expressions is that they can be used to derive a list of RDF nodes from a given focus node, for example all values of a given property of the focus node. Some of these expressions can use the output of another expression as their input, leading to evaluation chains and trees.
The following example declares a node expression that produces the display labels of all values of
the property ex:customer that conform to a given shape ex:GoodCustomerShape.
The assumption here is that there is a SHACL function ex:displayLabel which declares a single parameter.
[ ex:displayLabel ( [ sh:filterShape ex:GoodCustomerShape ; sh:nodes [ sh:path ex:customer ] ; ] ) ] .
To evaluate this example, an engine gets all values of ex:customer of the focus node,
then filters them according to the shape ex:GoodCustomerShape
and calls the SHACL function ex:displayLabel once for each value that passes the filter shape,
using that value as the argument.
Important use cases of such expressions are expression constraints and SHACL rules, yet the basic functionality and vocabulary may find many other application areas.
Each of the following sub-sections defines a node expression type with its syntax rules
and evaluation semantics based on a mapping operation Eval($expr, $this), where the
first argument $expr is the given expression and $this is the current focus node,
and which produces a list of RDF nodes.
Unless specified otherwise, the members of these lists are handled in their natural order.
A node expression cannot recursively have itself as a "nested" node expression, e.g. as value of sh:nodes.
The IRI sh:this is the (only) node declaring a focus node expression.
For the focus node expression sh:this, Eval(sh:this, $this) produces
the list of length 1 with $this as its only member.
Any literal or IRI except sh:this declares a constant term expression.
For a constant term expression $expr, Eval($expr, $this) produces
the list of length 1 with $expr as its only member.
An exists expression is a blank node with exactly one value for sh:exists (which is a well-formed node expression).
For an exists expression $expr with N being the node expression that is the value of sh:exists, Eval($expr, $this)
produces the list consisting of exactly the node true if Eval(N, $this) produces at least one node,
and the list consisting of exactly the node false otherwise.
In the following example, sh:exists is used to test whether the current focus node has any value for an example property.
This is comparable to the SPARQL expression on the right.
[ sh:exists [ sh:path ex:someProperty ] ] . |
{ BIND (EXISTS { $this ex:someProperty ?any } AS ?result) } |
An if expression is a blank node
with exactly one value for sh:if (which is a well-formed node expression),
at most one value for sh:then (which is a well-formed node expression)
and at most one value for sh:else (which is a well-formed node expression).
For an if expression $expr with IF being the node expression that is the value of sh:if,
and THEN and ELSE being the corresponding values of sh:then and sh:else, respectively:
if Eval(IF, $this) produces the list consisting of exactly the node true
then Eval($expr, $this) produces Eval(THEN, $this) (or the empty list if THEN is absent),
otherwise it produces Eval(ELSE, $this) (or the empty list if ELSE is absent).
The following example produces the string node "married" if the focus node has a spouse, "not married" otherwise.
[ sh:if [ sh:exists [ sh:path ex:spouse ] ] ; sh:then "married" ; sh:else "not married" ; ] . |
{ BIND (IF(EXISTS { $this ex:spouse ?any }, "married", "not married") AS ?result) } |
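The exists and if semantics can be sketched in Python by modelling each node expression as a function from a focus node to a list of nodes. Everything named here is an assumption made for illustration: the `DATA` dictionary is a toy stand-in for a data graph, `path` only handles a single predicate on the focus node, and Python booleans stand in for the xsd:boolean literals.

```python
from typing import Callable, List, Optional

# A node expression maps a focus node to a list of nodes.
NodeExpr = Callable[[str], List[object]]

# Hypothetical in-memory data graph: subject -> property -> list of values.
DATA = {
    "ex:alice": {"ex:spouse": ["ex:bob"]},
    "ex:carol": {},
}


def path(prop: str) -> NodeExpr:
    """Simplified path expression: values of one property at the focus node."""
    return lambda this: list(DATA.get(this, {}).get(prop, []))


def constant(term: object) -> NodeExpr:
    """Constant term expression: always the single given term."""
    return lambda this: [term]


def exists(n: NodeExpr) -> NodeExpr:
    """[True] if the nested expression yields at least one node, else [False]."""
    return lambda this: [len(n(this)) > 0]


def if_expr(cond: NodeExpr,
            then: Optional[NodeExpr] = None,
            otherwise: Optional[NodeExpr] = None) -> NodeExpr:
    """Evaluate `then` when cond yields exactly [True], else `otherwise`."""
    def run(this: str) -> List[object]:
        branch = then if cond(this) == [True] else otherwise
        # An absent branch produces the empty list.
        return branch(this) if branch is not None else []
    return run


marital_status = if_expr(
    exists(path("ex:spouse")),
    constant("married"),
    constant("not married"),
)
```

Calling `marital_status("ex:alice")` then mirrors the "married" / "not married" example above for a focus node that has a spouse.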
A filter shape expression is a blank node
with exactly one value for sh:filterShape (which is a well-formed shape)
and at most one value for sh:nodes (which is a well-formed node expression).
For a filter shape expression $expr with S being the shape that is the value of sh:filterShape
and N being the node expression that is the value of sh:nodes (defaulting to the focus node expression if absent),
Eval($expr, $this) produces the list of all nodes n produced by Eval(N, $this) where n conforms to S.
The following example returns all values of ex:child that conform to the given SHACL shape.
In order to conform to the shape, the child must have "male" as one of its values for ex:gender.
[ sh:nodes [ sh:path ex:child ] ; sh:filterShape [ sh:property [ sh:path ex:gender ; sh:hasValue "male" ; ] ; ] ; ]
A function expression is a blank node
that does not fulfill any of the syntax rules of the other node expression types and which
is the subject of exactly one triple T where the object is a well-formed SHACL list,
and each member of that list is a well-formed node expression.
For a function expression $expr, Eval($expr, $this) produces
the list of nodes returned by evaluating the SHACL function specified as predicate of the triple T mentioned above.
The arguments of the function call(s) are based on the results of the node expressions listed
in the object list of T so that the first list member is used for the first argument, etc.
This is done for all combinations of nodes produced by each node expression.
If one of the node expressions produces the empty list and the corresponding function parameter
is non-optional (see sh:optional), then the result is the empty list.
As illustrated in the following example, function expressions are comparable to SPARQL BIND clauses.
[ ex:concat ( [ sh:path ex:firstName ] " " [ sh:path ex:lastName ] ) ] . |
{ $this ex:firstName ?a . $this ex:lastName ?b . BIND (ex:concat(?a, " ", ?b) AS ?result) . } |
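The "all combinations" rule for function expressions amounts to a Cartesian product over the argument lists. The Python sketch below illustrates this under stated assumptions: the SHACL function is modelled as a plain Python callable, the argument lists are already-evaluated node expression results, and a missing optional argument is passed as `None`, which is a choice made for this sketch rather than something this document prescribes.

```python
from itertools import product
from typing import Callable, List, Optional, Sequence


def eval_function(func: Callable[..., object],
                  arg_lists: Sequence[List[object]],
                  optional: Optional[Sequence[bool]] = None) -> List[object]:
    """Call func once per combination of argument values."""
    flags = list(optional) if optional is not None else [False] * len(arg_lists)
    slots = []
    for values, is_optional in zip(arg_lists, flags):
        if not values:
            if not is_optional:
                # Empty input for a mandatory parameter: empty result.
                return []
            # Assumption: an absent optional argument is passed as None.
            values = [None]
        slots.append(values)
    # The first argument list varies slowest, the last one fastest.
    return [func(*args) for args in product(*slots)]
```

For a concatenation function with two first names and one last name, this yields two results, one per first name.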
A path expression is a blank node
with exactly one value of the property sh:path (which is a well-formed property path)
and at most one value for sh:nodes (which is a well-formed node expression).
For a path expression $expr that has the property path P as its value for sh:path
and the node expression N as its value for sh:nodes (defaulting to the focus node expression if absent),
Eval($expr, $this) produces the list of values of all nodes produced by Eval(N, $this) for the property path P.
The order of results of these lists is undefined and they do not contain duplicates.
As illustrated in the following examples, path expressions are comparable to SPARQL basic graph patterns.
[ sh:path ex:firstName ] . [ sh:nodes [ sh:path ex:children ] ; sh:path rdfs:label ; ] . |
{ $this ex:firstName ?result . } { $this ex:children ?a . ?a rdfs:label ?result . } |
An intersection expression is a blank node
with exactly one value for the property sh:intersection which is a well-formed SHACL list
with at least two members (which are well-formed node expressions).
For an intersection expression $expr that has the list L as its value for sh:intersection, Eval($expr, $this) produces
the list of nodes that are produced by the first node expression from the list and are also
in the result lists produced by the other members of L.
A union expression is a blank node
with exactly one value for the property sh:union which is a well-formed SHACL list
with at least two members (which are well-formed node expressions).
For a union expression $expr that has the list L as its value for sh:union, Eval($expr, $this) produces
the list of nodes that are the concatenation of the result lists produced by all of the members of L,
preserving their original order.
As illustrated in the following example, union expressions are comparable to SPARQL UNION clauses. Similar to SPARQL, they may produce duplicate results.
[ sh:union ( [ sh:path ex:firstName ] [ sh:path ex:givenName ] ) ] . |
{ $this ex:firstName ?result . } UNION { $this ex:givenName ?result . } |
A minus expression is a blank node
with exactly one value for the property sh:minus which is a well-formed node expression
and exactly one value for the property sh:nodes which is a well-formed node expression.
For a minus expression $expr that has the node expression N as its value for sh:nodes
and the node expression M as its value for sh:minus,
Eval($expr, $this) produces the list of nodes that are in Eval(N, $this) but not in Eval(M, $this).
In the following example, the minus expression returns all values of the property ex:children
except those that are also values of ex:sons.
[ sh:nodes [ sh:path ex:children ] ; sh:minus [ sh:path ex:sons ] ; ] . |
{ $this ex:children ?result . MINUS { $this ex:sons ?result . } } |
A distinct expression is a blank node
with exactly one value for the property sh:distinct which is a well-formed node expression.
For a distinct expression $expr that has the node expression N as its value for sh:distinct,
Eval($expr, $this) produces the list of nodes that are in Eval(N, $this), dropping any duplicate nodes.
The following example returns all values of ex:customer and ex:client but without duplicates.
[ sh:distinct [ sh:union ( [ sh:path ex:customer ] [ sh:path ex:client ] ) ] ] . |
SELECT DISTINCT (?c AS ?result) WHERE { { $this ex:customer ?c } UNION { $this ex:client ?c } } |
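The list semantics of the intersection, union, minus and distinct expressions can be sketched in Python on already-evaluated result lists. This is a sketch under assumptions: nodes are plain hashable Python values, `minus` removes the excluded nodes from the sh:nodes list as in the ex:children / ex:sons example, and `distinct` keeps the first occurrence of each node, an ordering choice the text does not spell out.

```python
from typing import Hashable, List, Sequence


def intersection(lists: Sequence[List[Hashable]]) -> List[Hashable]:
    """Nodes of the first list that also occur in every other list."""
    first, rest = lists[0], lists[1:]
    return [n for n in first if all(n in other for other in rest)]


def union(lists: Sequence[List[Hashable]]) -> List[Hashable]:
    """Concatenation in original order; duplicates are kept."""
    return [n for result in lists for n in result]


def minus(nodes: List[Hashable], excluded: List[Hashable]) -> List[Hashable]:
    """Input nodes (sh:nodes) without those in the excluded (sh:minus) list."""
    return [n for n in nodes if n not in excluded]


def distinct(nodes: List[Hashable]) -> List[Hashable]:
    """Drop duplicates, keeping the first occurrence (assumed ordering)."""
    return list(dict.fromkeys(nodes))
```

Wrapping `union` in `distinct` corresponds to the ex:customer / ex:client example above.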
A count expression is a blank node
with exactly one value for the property sh:count which is a well-formed node expression.
For a count expression $expr that has the node expression N as its value for sh:count,
Eval($expr, $this) produces the list consisting of exactly one literal with datatype xsd:integer
representing the size of the input list produced by Eval(N, $this).
The following example returns the number of values of ex:customer.
[ sh:count [ sh:path ex:customer ] ] . |
SELECT (COUNT(?c) AS ?result) WHERE { $this ex:customer ?c . } |
A min expression is a blank node
with exactly one value for the property sh:min which is a well-formed node expression.
For a min expression $expr that has the node expression N as its value for sh:min,
Eval($expr, $this) produces the empty list if the input list produced by Eval(N, $this) is empty,
or the list consisting of exactly one node that is the minimum input node as defined by SPARQL MIN.
The following example returns the minimum of all values of ex:turnOver of all values of ex:client at the focus node.
[ sh:min [ sh:path ( ex:client ex:turnOver ) ] ] . |
SELECT (MIN(?t) AS ?result) WHERE { $this ex:client/ex:turnOver ?t . } |
A max expression is a blank node
with exactly one value for the property sh:max which is a well-formed node expression.
For a max expression $expr that has the node expression N as its value for sh:max,
Eval($expr, $this) produces the empty list if the input list produced by Eval(N, $this) is empty,
or the list consisting of exactly one node that is the maximum input node as defined by SPARQL MAX.
A sum expression is a blank node
with exactly one value for the property sh:sum which is a well-formed node expression.
For a sum expression $expr that has the node expression S as its value for sh:sum,
and for which the results of Eval(S, $this) are the nodes n,
Eval($expr, $this) produces the list consisting of exactly one node that is the sum of all n as defined
by SPARQL SUM, producing the empty list if any of the nodes cannot be summed up.
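The aggregate expressions (count, min, max, sum) can be approximated in Python as follows. The sketch rests on simplifying assumptions: nodes are plain Python values that are mutually comparable, Python's `min`, `max` and `sum` stand in for the SPARQL aggregates, and "cannot be summed up" is interpreted as "is not a number".

```python
from numbers import Number
from typing import List


def count(nodes: List[object]) -> List[int]:
    """Always exactly one integer, even for an empty input."""
    return [len(nodes)]


def minimum(nodes: list) -> list:
    """Empty list for empty input, else the single smallest node."""
    return [min(nodes)] if nodes else []


def maximum(nodes: list) -> list:
    """Empty list for empty input, else the single largest node."""
    return [max(nodes)] if nodes else []


def total(nodes: list) -> list:
    """Single sum, or the empty list if any node is not numeric."""
    def summable(n: object) -> bool:
        # bool is a Number subclass in Python; exclude it explicitly.
        return isinstance(n, Number) and not isinstance(n, bool)
    if not all(summable(n) for n in nodes):
        return []
    return [sum(nodes)]
```

Note the asymmetry that follows from the definitions: count of an empty input is a list with one zero, while min and max of an empty input are the empty list.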
A group concat expression is a blank node
with exactly one value for the property sh:groupConcat which is a well-formed node expression.
A group concat expression can have a single value for the property sh:separator
which is a literal with datatype xsd:string.
For a group concat expression $expr that has the node expression C as its value for sh:groupConcat,
Eval($expr, $this) produces the list consisting of exactly one literal with datatype xsd:string
consisting of the string representation of the nodes from the input list produced by Eval(C, $this).
If S is the value of sh:separator then S will be inserted in between each pair of strings.
It is up to the implementation to decide how to represent individual nodes as strings.
For example, resources may be represented by their values of rdfs:label in the user's preferred language.
However, if an individual node is a literal with datatype xsd:string then the lexical form of that node is used.
Tip: in order to control exactly how individual values get rendered, wrap the input node expression with a function expression that produces a string literal.
The following example returns a comma-separated list of the labels of the values of ex:child
at the focus node, ordered by their individual labels.
Here we assume each child has at most one label, to make the result predictable.
[ sh:groupConcat [ sh:path rdfs:label ; sh:nodes [ sh:nodes [ sh:path ex:child ] ; sh:orderBy [ sh:path rdfs:label ] ; ] ; ] ; sh:separator ", " ; ] . |
SELECT (GROUP_CONCAT(?label; separator=", ") AS ?result) WHERE { $this ex:child ?child . ?child rdfs:label ?label . } ORDER BY ?label |
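In Python terms, a group concat expression behaves much like `str.join` over the input list. The sketch below assumes that `str()` is an acceptable string representation of a node and that an absent sh:separator means no separator at all; both are choices made for this illustration, since the representation of non-string nodes is left to the implementation.

```python
from typing import List


def group_concat(nodes: List[object], separator: str = "") -> List[str]:
    """Exactly one string: the node representations joined by the separator."""
    # str() stands in for the implementation-defined string rendering.
    return [separator.join(str(n) for n in nodes)]
```

An empty input still produces one (empty) string, in line with the "exactly one literal" wording above.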
An orderBy expression is a blank node
with exactly one value for the property sh:orderBy
which is a well-formed node expression and
with exactly one value for the property sh:nodes
which is a well-formed node expression.
An orderBy expression can have one value for the property sh:desc
which
is either true
or false
.
For an orderBy expression $expr that has the node expression N as its value for sh:nodes and the node expression O as its value for sh:orderBy, Eval($expr, $this) produces a list consisting of exactly the results n of Eval(N, $this) ordered by the first node from Eval(O, n) according to the node comparison policy of the SPARQL ORDER BY operator.
If sh:desc
is true
for the expression, then the order is descending, otherwise ascending.
The following example returns the list of values of ex:child
at the focus node, ordered by their individual labels.
Here we assume each child has at most one label, to make the result predictable.
[
    sh:nodes [ sh:path ex:child ] ;
    sh:orderBy [ sh:path rdfs:label ] ;
] .

SELECT DISTINCT ?child
WHERE {
    $this ex:child ?child .
    OPTIONAL { ?child rdfs:label ?label . }
}
ORDER BY ?label
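The ordering step of an orderBy expression can be modeled in Python as a sort on the first result of the ordering expression. This is a minimal sketch under stated assumptions: `eval_order_expr` is a hypothetical callable standing in for Eval(O, n), all sort keys are assumed to be mutually comparable Python values, and nodes without any key are placed first in ascending order, which mirrors how SPARQL ORDER BY treats unbound values.

```python
def order_by(nodes, eval_order_expr, desc=False):
    """Model of sh:orderBy: sort nodes by the first result of Eval(O, n)."""
    def sort_key(node):
        keys = eval_order_expr(node)
        # Only the first result is used. Nodes without a result get a
        # lower rank so they sort before all nodes that have a key.
        return (1, keys[0]) if keys else (0, "")
    return sorted(nodes, key=sort_key, reverse=desc)


# Hypothetical labels of three children; ex:c3 has no label.
labels = {"ex:c1": ["Zoe"], "ex:c2": ["Adam"], "ex:c3": []}
children = ["ex:c1", "ex:c2", "ex:c3"]

print(order_by(children, lambda n: labels[n]))
print(order_by(children, lambda n: labels[n], desc=True))
```

The `desc` flag corresponds to sh:desc and simply reverses the ascending order.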
A limit expression is a blank node
with exactly one value for the property sh:limit
which is a literal with datatype xsd:integer
and
with exactly one value for the property sh:nodes
which is a well-formed node expression.
For a limit expression $expr that has the node expression N as its value for sh:nodes and L as its value for sh:limit, Eval($expr, $this) produces the list consisting of exactly the first L results of Eval(N, $this), or all results if there are fewer than L.
The following example returns at most two values of ex:parent
at the focus node.
[
    sh:limit 2 ;
    sh:nodes [ sh:path ex:parent ] ;
] .

SELECT ?parent
WHERE {
    $this ex:parent ?parent .
}
LIMIT 2
An offset expression is a blank node
with exactly one value for the property sh:offset
which is a literal with datatype xsd:integer
and
with exactly one value for the property sh:nodes
which is a well-formed node expression.
For an offset expression $expr that has the node expression N as its value for sh:nodes and O as its value for sh:offset, Eval($expr, $this) produces the list consisting of the results of Eval(N, $this) except for the first O ones.
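Limit and offset expressions both select a contiguous part of the input list, which the following Python sketch models with list slicing. The function names are illustrative only, and rejecting negative values is an assumption, since the text above does not say how they are handled.

```python
def limit(nodes, count):
    """Model of sh:limit: keep at most the first `count` results."""
    if count < 0:
        raise ValueError("negative limits are not modeled in this sketch")
    return nodes[:count]


def offset(nodes, count):
    """Model of sh:offset: drop the first `count` results."""
    if count < 0:
        raise ValueError("negative offsets are not modeled in this sketch")
    return nodes[count:]


parents = ["ex:p1", "ex:p2", "ex:p3", "ex:p4", "ex:p5"]
print(limit(parents, 2))             # ['ex:p1', 'ex:p2']
print(offset(parents, 3))            # ['ex:p4', 'ex:p5']
print(limit(offset(parents, 2), 2))  # ['ex:p3', 'ex:p4']
```

Nesting the two, as in the last call, corresponds to a limit expression whose sh:nodes is an offset expression, which gives simple paging over an ordered input.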
A SPARQL ASK expression is a blank node
with exactly one value for the property sh:ask
which is a string literal.
The blank node may have values for the property sh:prefixes
and these values are IRIs or blank nodes.
Using the values of sh:prefixes
as defined by
5.2.1 Prefix Declarations for SPARQL Queries,
the value of sh:ask
must be a valid SPARQL 1.1 ASK query.
The blank node may also have exactly one value for the property sh:nodes
which is a well-formed node expression.
For a SPARQL ASK expression $expr that represents the SPARQL ASK query A and that has the (optional) node expression N as its value for sh:nodes:
If N is not present, then Eval($expr, $this) produces the list of length 1 consisting of either true or false based on the result of executing A with the node $this as the pre-bound variable $this.
If N is present, then Eval($expr, $this) produces the list of boolean literals obtained by executing A once for each node produced by Eval(N, $this), with that node as the pre-bound variable $this.
The following example produces true
if the focus node has the given rdf:type
.
[ sh:ask "ASK { $this a <http://example.org/ns#Person> }" ] .
A SPARQL SELECT expression is a blank node
with exactly one value for the property sh:select
which is a string literal.
The blank node may have values for the property sh:prefixes
and these values are IRIs or blank nodes.
Using the values of sh:prefixes
as defined by
5.2.1 Prefix Declarations for SPARQL Queries,
the value of sh:select
must be a valid SPARQL 1.1 SELECT query with exactly one result variable.
The blank node may also have exactly one value for the property sh:nodes
which is a well-formed node expression.
For a SPARQL SELECT expression $expr that represents the SPARQL SELECT query S and that has the (optional) node expression N as its value for sh:nodes:
If N is not present, then Eval($expr, $this) produces the list of all bindings of the result variable produced by executing S with the node $this as the pre-bound variable $this.
If N is present, then Eval($expr, $this) produces the list that concatenates the results of executing S once for each node produced by Eval(N, $this), with that node as the pre-bound variable $this.
The following example is an excerpt of a Turtle file declaring a SPARQL SELECT expression using namespace prefixes.
<http://example.org/ns>
    a owl:Ontology ;
    sh:declare [
        a sh:PrefixDeclaration ;
        sh:prefix "ex" ;
        sh:namespace "http://example.org/ns#"^^xsd:anyURI ;
    ] .

[
    sh:prefixes <http://example.org/ns> ;
    sh:select """
        SELECT DISTINCT ?client
        WHERE {
            $this ex:client ?client .
            ?client ex:address/ex:country ?country .
            ?country ex:memberOf ex:WHO .
        }
        """ ;
] .
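SPARQL ASK and SPARQL SELECT expressions share the same two evaluation modes, with and without sh:nodes. The following Python sketch models only that control flow. It does not run SPARQL: the `ask` and `select` callables are hypothetical stand-ins for executing the query with its argument pre-bound to $this, and `eval_nodes` stands in for Eval(N, $this).

```python
def eval_ask(ask, this, eval_nodes=None):
    """Model of a SPARQL ASK expression: one boolean per evaluated node."""
    if eval_nodes is None:
        # No sh:nodes: a single result for the focus node itself.
        return [ask(this)]
    return [ask(node) for node in eval_nodes(this)]


def eval_select(select, this, eval_nodes=None):
    """Model of a SPARQL SELECT expression: bindings of the result variable."""
    if eval_nodes is None:
        return list(select(this))
    results = []
    for node in eval_nodes(this):
        # Concatenate the bindings produced for each input node, in order.
        results.extend(select(node))
    return results


# Hypothetical data: clients per person, and the members of a team.
clients = {"ex:Alice": ["ex:Acme"], "ex:Bob": ["ex:Initech", "ex:Globex"]}
team = {"ex:Team": ["ex:Alice", "ex:Bob"]}

select = lambda this: clients.get(this, [])
ask = lambda this: this in clients

print(eval_select(select, "ex:Alice"))
print(eval_select(select, "ex:Team", eval_nodes=lambda t: team[t]))
print(eval_ask(ask, "ex:Team", eval_nodes=lambda t: team[t]))
```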
Based on node expressions, this section introduces a constraint component called
expression constraints.
Expression constraints can be used in any shape to declare the condition that the
node expression specified via sh:expression
has true
as its (only) result.
The evaluation of these node expressions is repeated for all value nodes of the shape, each used
as the focus node.
Constraint Component IRI: sh:ExpressionConstraintComponent
Property | Summary and Syntax Rules |
---|---|
sh:expression | The node expression that must return true. The values of sh:expression at a shape must be well-formed node expressions. |
For each value node v where Eval(v, $expression) returns a node set that is not equal to { true }, there is a validation result that has v as its sh:value and $expression as its sh:sourceConstraint.
If the $expression has values for sh:message in the shapes graph then these values become the (only) values for sh:resultMessage in the validation result.
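The validation behavior of an expression constraint can be sketched in Python as follows. This is an illustrative model only: `eval_expression` is a hypothetical callable standing in for evaluating the node expression with a value node as focus node, validation results are modeled as plain dictionaries, and the string "$expression" stands in for the expression node itself.

```python
def validate_expression(value_nodes, eval_expression, messages=()):
    """Return one modeled validation result per non-conforming value node."""
    results = []
    for v in value_nodes:
        outcome = list(eval_expression(v))
        # Conforming means the result set is exactly { true }:
        # at least one result, and every result is the boolean True.
        conforms = bool(outcome) and all(node is True for node in outcome)
        if not conforms:
            result = {"sh:value": v, "sh:sourceConstraint": "$expression"}
            if messages:
                result["sh:resultMessage"] = list(messages)
            results.append(result)
    return results


# Hypothetical outcomes of the expression for three value nodes.
outcomes = {"ex:a": [True], "ex:b": [False], "ex:c": []}
for r in validate_expression(outcomes, lambda v: outcomes[v], messages=["Name too long"]):
    print(r)
```

Note that an empty result counts as a violation in this model, because the empty set is not equal to { true }.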
The remainder of this section is informative.
The following example assumes that there are SHACL functions ex:concat
,
ex:strlen
and ex:lessThan
and uses them to verify that the
combined string length of ex:firstName
and ex:lastName
is less than 30.
ex:FilterExampleShape a sh:NodeShape ; sh:expression [ ex:lessThan ( [ ex:strlen ( [ ex:concat ( [ sh:path ex:firstName] [ sh:path ex:lastName ] ) ] ) ] 30 ); ] .
SHACL defines an RDF vocabulary to describe shapes - collections of constraints that apply to a set of nodes. Shapes can be associated with nodes using a flexible target mechanism, e.g. for all instances of a class. One focus area of SHACL is data validation. However, the same principles of describing data patterns in shapes can also be exploited for other purposes. SHACL rules build on SHACL to form a light-weight RDF vocabulary for the exchange of rules that can be used to derive inferred RDF triples from existing asserted triples.
The SHACL rules feature defined in this section includes a general framework using properties such as sh:values, sh:rule and sh:condition, plus an extension mechanism for specific rule types.
This document defines two such rule types: Triple rules and SPARQL rules.
Other documents, including SHACL JavaScript Extensions [[shacl-js]], can define additional types of rules.
The following example illustrates the use of a triple rule that adds an rdf:type
triple so that those SHACL instances of ex:Rectangle
where the
ex:width
equals the ex:height
are also marked to be instances of ex:Square
.
The rule applies only to well-formed rectangles that conform to the ex:Rectangle
shape,
e.g. by having exactly one width and height, both integers.
ex:Rectangle a rdfs:Class, sh:NodeShape ; rdfs:label "Rectangle" ; sh:property [ sh:path ex:height ; sh:datatype xsd:integer ; sh:maxCount 1 ; sh:minCount 1 ; sh:name "height" ; ] ; sh:property [ sh:path ex:width ; sh:datatype xsd:integer ; sh:maxCount 1 ; sh:minCount 1 ; sh:name "width" ; ] ; sh:rule [ a sh:TripleRule ; sh:subject sh:this ; sh:predicate rdf:type ; sh:object ex:Square ; sh:condition ex:Rectangle ; sh:condition [ sh:property [ sh:path ex:width ; sh:equals ex:height ; ] ; ] ; ] .
ex:InvalidRectangle a ex:Rectangle . ex:NonSquareRectangle a ex:Rectangle ; ex:height 2 ; ex:width 3 . ex:SquareRectangle a ex:Rectangle ; ex:height 4 ; ex:width 4 .
For the data graph above, a SHACL rules engine will produce the following inferred triples:
ex:SquareRectangle rdf:type ex:Square .
No inferences will be made for ex:NonSquareRectangle
because its width is not equal to its height.
No inferences will be made for ex:InvalidRectangle
because although it has equal width
and height (namely none), it does not pass the sh:condition
of being a well-formed rectangle.
The following example illustrates a simple use case of a SPARQL rule that applies to all instances of
the class ex:Rectangle
and computes the values of the ex:area
property by multiplying
the rectangle's width and height:
ex:RectangleShape
    a sh:NodeShape ;
    sh:targetClass ex:Rectangle ;
    sh:property [
        sh:path ex:width ;
        sh:datatype xsd:integer ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] ;
    sh:property [
        sh:path ex:height ;
        sh:datatype xsd:integer ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] .

ex:RectangleRulesShape
    a sh:NodeShape ;
    sh:targetClass ex:Rectangle ;
    sh:rule [
        a sh:SPARQLRule ;
        sh:prefixes ex: ;
        sh:construct """
            CONSTRUCT {
                $this ex:area ?area .
            }
            WHERE {
                $this ex:width ?width .
                $this ex:height ?height .
                BIND (?width * ?height AS ?area) .
            }
            """ ;
        sh:condition ex:RectangleShape ;    # Rule only applies to Rectangles that conform to ex:RectangleShape
    ] ;
.
An engine that is capable of executing such rules uses the target statements associated
with the shapes in the shapes graph to determine which rules need to be executed on which target nodes.
For those target nodes that conform to any condition shapes, it executes the provided
CONSTRUCT queries to produce the inferred triples.
During the execution of the query, the variable this
is pre-bound to the current focus node.
For the following data graph, the triples below would be produced.
ex:ExampleRectangle
    a ex:Rectangle ;
    ex:width 7 ;
    ex:height 8 .

ex:InvalidRectangle    # Lacks a value for ex:height, so sh:condition is not met
    a ex:Rectangle ;
    ex:width 7 .
Inferred triples:
ex:ExampleRectangle ex:area 56 .
The following variation produces the same results as the SPARQL rule, but uses a Triple rule. While not as expressive as CONSTRUCT-based rules, Triple rules are more declarative and may be executed on platforms that do not support SPARQL.
ex:RectangleRulesShape
    a sh:NodeShape ;
    sh:targetClass ex:Rectangle ;
    sh:rule [
        a sh:TripleRule ;
        sh:subject sh:this ;
        sh:predicate ex:area ;    # Computes the values of the ex:area property at the focus nodes
        sh:object [
            ex:multiply ( [ sh:path ex:width ] [ sh:path ex:height ] ) ;
        ] ;
        sh:condition ex:RectangleShape ;    # Rule only applies to Rectangles that conform to ex:RectangleShape
    ] .
Finally, the following variation produces the same inferences using a property value rule.
These rules use a more compact syntax than triple rules and are directly attached to property shapes with the property sh:values.
ex:RectangleRulesShape
    a sh:NodeShape ;
    sh:targetClass ex:Rectangle ;
    sh:property [
        sh:path ex:area ;    # Computes the values of the ex:area property at the focus nodes
        sh:values [
            ex:multiply ( [ sh:path ex:width ] [ sh:path ex:height ] ) ;
        ] ;
    ] .
The values of the property sh:rule
at a shape are called SHACL rules.
SHACL has a flexible design in which multiple types of rules can be supported,
including Triple rules and SPARQL rules.
Each rule type is identified by an IRI that is used as rdf:type
of rules.
Each rule type also defines execution instructions that can be implemented by rule engines.
Each SHACL rule has at least one rdf:type
which is an IRI.
Rules can have multiple types, e.g. to provide instructions that work either in SPARQL or JavaScript,
depending on the capabilities of the engine.
The creator of such rules needs to make sure that such rules have consistent semantics.
Rule R
has rule type T
if R
is a SHACL instance of T
.
All rules may have the properties defined in the rest of this section.
A rule may have values for the property sh:condition
to specify shapes
that the focus nodes must conform to before the rule gets executed.
The values of sh:condition
at a rule must be well-formed shapes.
Rules and shapes may specify their relative execution order as defined in this section.
Each rule or shape may have at most one value for the
property sh:order
.
The values of sh:order
at rules and shapes
are literals with a numeric datatype such as xsd:decimal
.
If unspecified, then the default execution order is 0
.
These values are used by a rules engine to determine the order of rules.
When the rules associated with a shape are executed, rules with larger values will be executed after
those with smaller values.
ex:RuleOrderExampleShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:rule [
        a sh:SPARQLRule ;
        rdfs:label "Infer uncles, i.e. male siblings of the parents of $this" ;
        sh:prefixes ex: ;
        sh:order 1 ;    # Will be evaluated before 2
        sh:construct """
            CONSTRUCT {
                $this ex:uncle ?uncle .
            }
            WHERE {
                $this ex:parent ?parent .
                ?parent ex:sibling ?uncle .
                ?uncle ex:gender ex:male .
            }
            """
    ] ;
    sh:rule [
        a sh:SPARQLRule ;
        rdfs:label "Infer cousins, i.e. the children of the uncles" ;
        sh:prefixes ex: ;
        sh:order 2 ;
        sh:construct """
            CONSTRUCT {
                $this ex:cousin ?cousin .
            }
            WHERE {
                $this ex:uncle ?uncle .
                ?cousin ex:parent ?uncle .
            }
            """
    ] .
Rules may be deactivated by setting sh:deactivated
to true
.
Deactivated rules are ignored by the rules engine.
Each rule may have at most one value for the
property sh:deactivated
.
The values of sh:deactivated
are either
of the xsd:boolean
literals true
or false
.
SHACL defines the property sh:entailment
to link a shapes graph with entailment regimes.
The IRI sh:Rules
represents the SHACL rules entailment regime.
In the following example, the shapes graph indicates to a SHACL validation engine that the SHACL rules
inside of the shapes graph need to be executed prior to starting the validation.
<http://example.org/my-shapes> a owl:Ontology ; sh:entailment sh:Rules .
Following the general policy for SHACL, validation engines that do not support the SHACL rules entailment regime MUST signal a failure if this triple is present. Validation engines that do support the SHACL rules entailment regime execute the rules following the rules execution instructions prior to performing the actual validation.
A SHACL rules engine is a computer procedure that takes as input a data graph and a shapes graph and is capable of adding triples to the data graph. The new triples that are produced by a rules engine are called the inferred triples.
Note that, from a logical perspective, the data graph will be modified if triples get inferred. This means that rules can trigger after other triples have been inferred. However, in cases where the original data should not be modified, implementations may construct a logical data graph that has the original data as one subgraph and a dedicated inferences graph as another subgraph, and where the inferred triples get added to the inferences graph only.
In order to count as a SHACL rules engine, an implementation must be capable of inferring triples according to the following procedure (given in pseudo-code), or a different algorithm as long as the result is the same as specified. Note that this algorithm only covers a single "iteration" over all rules, without prescribing the behavior if the same rule needs to be applied multiple times after other rules have fired. The latter is left to future work.
for each shape S in the shapes graph, ordered by execution order {
    for each non-deactivated rule R in the shape, ordered by execution order {
        for each target node T of S that conforms to all conditions of R {
            execute R using T as focus node following the execution instructions of its rule types
        }
    }
}
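The procedure above can be sketched in Python as a single pass over shapes, rules and target nodes. This is a simplified model and not a conforming rules engine: graphs are sets of (subject, predicate, object) tuples, and shapes and rules are hypothetical dictionaries whose `targets`, `conditions` and `execute` entries are plain Python callables. Collecting the inferred triples in a separate set, rather than adding them to the data graph during the pass, is a choice made for this sketch.

```python
def values(graph, subject, predicate):
    """Objects of all triples in graph with the given subject and predicate."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


def run_rules(shapes, data_graph):
    """One iteration over all rules; returns the set of inferred triples."""
    inferred = set()
    # A missing "order" entry models the default execution order 0.
    for shape in sorted(shapes, key=lambda s: s.get("order", 0)):
        active = [r for r in shape["rules"] if not r.get("deactivated", False)]
        for rule in sorted(active, key=lambda r: r.get("order", 0)):
            for target in shape["targets"](data_graph):
                conditions = rule.get("conditions", [])
                if all(cond(data_graph, target) for cond in conditions):
                    inferred.update(rule["execute"](data_graph, target))
    return inferred


data = {
    ("ex:InvalidRectangle", "rdf:type", "ex:Rectangle"),
    ("ex:NonSquareRectangle", "rdf:type", "ex:Rectangle"),
    ("ex:NonSquareRectangle", "ex:height", 2),
    ("ex:NonSquareRectangle", "ex:width", 3),
    ("ex:SquareRectangle", "rdf:type", "ex:Rectangle"),
    ("ex:SquareRectangle", "ex:height", 4),
    ("ex:SquareRectangle", "ex:width", 4),
}


def well_formed(graph, node):
    # Stands in for conformance to the ex:Rectangle shape.
    return (len(values(graph, node, "ex:width")) == 1
            and len(values(graph, node, "ex:height")) == 1)


def is_square(graph, node):
    # Stands in for the sh:equals condition on width and height.
    return set(values(graph, node, "ex:width")) == set(values(graph, node, "ex:height"))


rectangle_shape = {
    "targets": lambda g: sorted(s for (s, p, o) in g
                                if p == "rdf:type" and o == "ex:Rectangle"),
    "rules": [{
        "conditions": [well_formed, is_square],
        "execute": lambda g, t: [(t, "rdf:type", "ex:Square")],
    }],
}

print(run_rules([rectangle_shape], data))
# {('ex:SquareRectangle', 'rdf:type', 'ex:Square')}
```

The data and conditions mirror the rectangle example from the beginning of this section, so only ex:SquareRectangle yields an inferred triple.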
The triples that are inferred by a rule do not immediately become part of the data graph, i.e. the triples produced by one rule cannot always be queried by other rules. This policy reduces the likelihood of race conditions and better supports parallel execution.
If a rules engine is not able to execute a given rule because it does not support any of the rule types of the rule, then it reports a failure.
At no time are inferred triples visible to the shapes graph, i.e. it is impossible for rules to modify the definitions of rules or shapes.
This section defines a rule type called triple rules, identified by
the IRI sh:TripleRule
.
Triple rules have the following properties:
Property | Summary and Syntax Rules |
---|---|
sh:subject | The node expression used to compute the subjects of the triples. Each triple rule must have exactly one value of the property sh:subject (which must be a well-formed node expression). |
sh:predicate | The node expression used to compute the predicates of the triples. Each triple rule must have exactly one value of the property sh:predicate (which must be a well-formed node expression). |
sh:object | The node expression used to compute the objects of the triples. Each triple rule must have exactly one value of the property sh:object (which must be a well-formed node expression). |
Let S, P and O be the sets of nodes produced by evaluating the node expressions that are the values of sh:subject, sh:predicate and sh:object respectively at the triple rule.
For each combination of members s of S, p of P and o of O, infer a triple with subject s, predicate p and object o.
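The combination step of a triple rule is a Cartesian product of the three node sets, which the following Python sketch illustrates. The function name is hypothetical, and the three arguments are assumed to be the already evaluated results of the sh:subject, sh:predicate and sh:object node expressions.

```python
from itertools import product


def infer_triples(subjects, predicates, objects):
    """Model of triple rule execution: one triple per (s, p, o) combination."""
    return {(s, p, o) for s, p, o in product(subjects, predicates, objects)}


# The area rule from the earlier example: one subject, predicate and object.
print(infer_triples(["ex:ExampleRectangle"], ["ex:area"], [56]))
# {('ex:ExampleRectangle', 'ex:area', 56)}

# Two subjects and two objects give 2 * 1 * 2 = 4 triples.
print(len(infer_triples(["ex:a", "ex:b"], ["ex:p"], [1, 2])))  # 4
```

If any of the three node sets is empty, the product is empty and no triples are inferred.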
Property value rules provide syntactic sugar for triple rules, supporting the use case where triple rules are used to infer values of a single property at a single focus node.
A property value rule is represented by a value V
of the property
sh:values
of a property shape P
that has an IRI path
as its value for sh:path
.
The values of the property sh:values
must be well-formed node expressions.
For each node shape N
that has P
as value of sh:property
, there is an implicit triple rule T
equivalent to the following triples (unless either P
or N
have true
as the value of sh:deactivated
):
N sh:rule T
T rdf:type sh:TripleRule
T sh:subject sh:this
T sh:predicate path
T sh:object V
An example property value rule has already been provided.
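The mapping from property value rules to implicit triple rules can be sketched in Python as follows. This is an illustrative model under stated assumptions: shapes are hypothetical dictionaries, each property shape carries its IRI path under `path` and its sh:values under `values`, and node expressions are represented by opaque strings.

```python
def implicit_triple_rules(node_shape):
    """Expand the sh:values of a node shape's property shapes into triple rules."""
    rules = []
    if node_shape.get("deactivated", False):
        return rules  # a deactivated node shape contributes no implicit rules
    for prop in node_shape.get("properties", []):
        if prop.get("deactivated", False):
            continue  # likewise for deactivated property shapes
        for value_expr in prop.get("values", []):
            rules.append({
                "rdf:type": "sh:TripleRule",
                "sh:subject": "sh:this",
                "sh:predicate": prop["path"],
                "sh:object": value_expr,
            })
    return rules


shape = {
    "properties": [
        {"path": "ex:area", "values": ["multiply(width, height)"]},
        {"path": "ex:width"},  # no sh:values, so no implicit rule
    ],
}
for rule in implicit_triple_rules(shape):
    print(rule)
```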
This section defines a rule type called SPARQL rules,
identified by the IRI sh:SPARQLRule
.
SPARQL rules have the following properties:
Property | Summary and Syntax Rules |
---|---|
sh:construct | The SPARQL CONSTRUCT query. SPARQL rules must have exactly one value for the property sh:construct. The values of sh:construct are literals with datatype xsd:string. |
sh:prefixes | The prefixes to use to turn the sh:construct into a SPARQL query. SPARQL rules may use the property sh:prefixes to declare a dependency on prefixes based on the mechanism defined in Prefix Declarations for SPARQL Queries from the SHACL specification [[!shacl]]. This mechanism allows users to abbreviate URIs in the sh:construct strings. |
Let Q be the SPARQL CONSTRUCT query derived from the values of the properties sh:construct and sh:prefixes of the SPARQL rule in the shapes graph.
For each focus node, execute the query Q pre-binding the variable this to the focus node, and infer the constructed triples.
This section enumerates all normative syntax rules from this document. This section is automatically generated from other parts of this spec and hyperlinks are provided back into the prose if the context of the rule is unclear. Nodes that violate these rules in a shapes graph are ill-formed.
Syntax Rule Id | Syntax Rule Text |
---|
The features defined in this document share certain security and privacy considerations with those mentioned in [[!shacl]]. The general advice is for users to only use trusted and controlled shapes graphs.
Many people contributed to this document, including members of the RDF Data Shapes Working Group. The sections , and had been part of earlier drafts of the main SHACL specification [[!shacl]] but were moved out in part due to time constraints in the Working Group. Dimitris Kontokostas was the main contributor to the section.