This document defines the SPARQL-related features of the SHACL Shapes Constraint Language. SHACL is a language for validating RDF graphs against a set of conditions. These conditions are provided as shapes and other constructs expressed in the form of an RDF graph. RDF graphs that are used in this manner are called "shapes graphs" in SHACL and the RDF graphs that are validated against a shapes graph are called "data graphs". As SHACL shapes graphs are used to validate that data graphs satisfy a set of conditions, they can also be viewed as a description of the data graphs that do satisfy these conditions. Such descriptions may be used for a variety of purposes besides validation, including user interface building, code generation and data integration.
The introduction includes a Terminology section.
Sections 2 and 3 describe the features that SHACL-SPARQL has in addition to the Core language. These advanced features are SPARQL-based constraints and constraint components.
The syntax of SHACL is RDF. The examples in this document use Turtle [[!turtle]] and (in one instance) JSON-LD [[json-ld]]. Other RDF serializations such as RDF/XML may be used in practice. The reader should be familiar with basic RDF concepts [[!rdf11-concepts]] such as triples and with SPARQL [[!sparql11-query]].
This document specifies the SPARQL-related features of SHACL (Shapes Constraint Language).
Throughout this document, the following terminology is used.
Terminology that is linked to portions of RDF 1.1 Concepts and Abstract Syntax is used in SHACL as defined there. Terminology that is linked to portions of SPARQL 1.1 Query Language is used in SHACL as defined there. A single linkage is sufficient to provide a definition for all occurrences of a particular term in this document.
Definitions are complete within this document, i.e., if there is no rule to make some situation true in this document then the situation is false.
An RDF term n has a value v for property p in an RDF graph if there is an RDF triple in the graph with subject n, predicate p, and object v.
The phrase "Every value of P in graph G ..." means "Every object of a triple in G with predicate P ...".
(In this document, the verbs specify or declare are sometimes used to express the fact that an RDF term has values for a given predicate in a graph.)
An RDF term n has value v for SPARQL property path expression p in an RDF graph G if there is a solution mapping in the result of the SPARQL query SELECT ?s ?o WHERE { ?s p' ?o } on G that binds ?s to n and ?o to v, where p' is SPARQL surface syntax for p.
A SHACL list in an RDF graph G is an IRI or a blank node that is either rdf:nil (provided that rdf:nil has no value for either rdf:first or rdf:rest), or has exactly one value for the property rdf:first in G and exactly one value for the property rdf:rest in G that is also a SHACL list in G, and the list does not have itself as a value of the property path rdf:rest+ in G.
The members of any SHACL list except rdf:nil in an RDF graph G consist of its value for rdf:first in G followed by the members in G of its value for rdf:rest in G.
The SHACL list rdf:nil has no members in any RDF graph.
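As an informal illustration of the list definitions, the following sketch shows one node that is a SHACL list and one that is not. The node names ex:l1, ex:l2 and ex:bad are assumptions made for this example only.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex:  <http://example.com/ns#> .

# ex:l1 is a SHACL list whose members are ex:a followed by ex:b
ex:l1 rdf:first ex:a ;
      rdf:rest  ex:l2 .
ex:l2 rdf:first ex:b ;
      rdf:rest  rdf:nil .

# ex:bad is not a SHACL list because it has two values for rdf:rest
ex:bad rdf:first ex:c ;
       rdf:rest  rdf:nil , ex:l2 .
```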
A node Sub in an RDF graph is a SHACL subclass of another node Super in the graph if there is a sequence of triples in the graph each with predicate rdfs:subClassOf such that the subject of the first triple is Sub, the object of the last triple is Super, and the object of each triple except the last is the subject of the next.
If Sub is a SHACL subclass of Super in an RDF graph then Super is a SHACL superclass of Sub in the graph.
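The subclass definition can be pictured with a small, hypothetical class hierarchy (the class names are assumptions made for this sketch). Following the chain of rdfs:subClassOf triples, ex:GraduateStudent is a SHACL subclass of both ex:Student and ex:Person.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.com/ns#> .

# A sequence of two rdfs:subClassOf triples links ex:GraduateStudent to ex:Person
ex:GraduateStudent rdfs:subClassOf ex:Student .
ex:Student         rdfs:subClassOf ex:Person .

# Conversely, ex:Person is a SHACL superclass of ex:Student and ex:GraduateStudent
```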
A node n in an RDF graph G is a SHACL instance of a SHACL class C in G if one of the SHACL types of n in G is C.
Within this document, the following namespace prefix bindings are used:
Prefix | Namespace |
---|---|
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
sh: | http://www.w3.org/ns/shacl# |
xsd: | http://www.w3.org/2001/XMLSchema# |
ex: | http://example.com/ns# |
Note that the URI of the graph defining the SHACL vocabulary itself is equivalent to the namespace above, i.e. it includes the #. References to the SHACL vocabulary, e.g. via owl:imports, should include the #.
Throughout the document, color-coded boxes containing RDF graphs in Turtle will appear. These fragments of Turtle documents use the prefix bindings given above.
# This box represents an input shapes graph
# Triples that can be omitted are marked as grey e.g.
<s> <p> <o> .
# This box represents an input data graph.
# When highlighting is used in the examples:
# Elements highlighted in blue are focus nodes
ex:Bob a ex:Person .
# Elements highlighted in red are focus nodes that fail validation
ex:Alice a ex:Person .
# This box represents an output results graph
SHACL Definitions appear in blue boxes:
# This box contains SPARQL or textual definitions.
Grey boxes such as this include syntax rules that apply to the shapes graph.
true denotes the RDF term "true"^^xsd:boolean.
false denotes the RDF term "false"^^xsd:boolean.
This document defines the SHACL-SPARQL language that extends SHACL Core. This specification describes conformance criteria for:
This document includes syntactic rules that shapes and other nodes need to fulfill in the shapes graph. These rules are typically of the form A shape must have... or The values of X are literals or All objects of triples with predicate P must be IRIs. The complete list of these rules can be found in the appendix. Nodes that violate any of these rules are called ill-formed. Nodes that violate none of these rules are called well-formed. A shapes graph is ill-formed if it contains at least one ill-formed node.
The remainder of this section is informative.
SHACL Core processors that do not also support SHACL-SPARQL ignore any SHACL-SPARQL constructs such as sh:sparql triples.
The following example data graph contains three SHACL instances of the class ex:Person.
ex:Alice
    a ex:Person ;
    ex:ssn "987-65-432A" .

ex:Bob
    a ex:Person ;
    ex:ssn "123-45-6789" ;
    ex:ssn "124-35-6789" .

ex:Calvin
    a ex:Person ;
    ex:birthDate "1971-07-07"^^xsd:date ;
    ex:worksFor ex:UntypedCompany .
The following conditions are shown in the example:
A SHACL instance of ex:Person can have at most one value for the property ex:ssn, and this value is a literal with the datatype xsd:string that matches a specified regular expression.
A SHACL instance of ex:Person can have unlimited values for the property ex:worksFor, and these values are IRIs and SHACL instances of ex:Company.
A SHACL instance of ex:Person cannot have values for any other property apart from ex:ssn, ex:worksFor and rdf:type.
The aforementioned conditions can be represented as shapes and constraints in the following shapes graph:
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;    # Applies to all persons
    sh:property [                 # _:b1
        sh:path ex:ssn ;          # constrains the values of ex:ssn
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:pattern "^\\d{3}-\\d{2}-\\d{4}$" ;
    ] ;
    sh:property [                 # _:b2
        sh:path ex:worksFor ;
        sh:class ex:Company ;
        sh:nodeKind sh:IRI ;
    ] ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) .
The example below shows the same shape definition as a possible JSON-LD [[json-ld]] fragment. Note that we have left out a @context declaration, and depending on the @context the rendering may look quite different. Therefore this example should be understood as an illustration only.
{
    "@id" : "ex:PersonShape",
    "@type" : "NodeShape",
    "targetClass" : "ex:Person",
    "property" : [
        {
            "path" : "ex:ssn",
            "maxCount" : 1,
            "datatype" : "xsd:string",
            "pattern" : "^\\d{3}-\\d{2}-\\d{4}$"
        },
        {
            "path" : "ex:worksFor",
            "class" : "ex:Company",
            "nodeKind" : "sh:IRI"
        }
    ],
    "closed" : true,
    "ignoredProperties" : [ "rdf:type" ]
}
We can use the shape declaration above to illustrate some of the key terminology used by SHACL.
The target for the shape ex:PersonShape is the set of all SHACL instances of the class ex:Person. This is specified using the property sh:targetClass. During the validation, these target nodes become focus nodes for the shape.
The shape ex:PersonShape is a node shape, which means that it applies to the focus nodes. It declares constraints on the focus nodes, for example using the parameters sh:closed and sh:ignoredProperties. The node shape also declares two other constraints with the property sh:property, and each of these is backed by a property shape. These property shapes declare additional constraints using parameters such as sh:datatype and sh:maxCount.
Some of the property shapes specify parameters from multiple constraint components in order to restrict multiple aspects of the property values. For example, in the property shape for ex:ssn, parameters from three constraint components are used. The parameters of these constraint components are sh:datatype, sh:pattern and sh:maxCount. For each focus node the property values of ex:ssn will be validated against all three components.
SHACL validation based on the provided data graph and shapes graph would produce the following validation report. See the section Validation Report for details on the format.
[
    a sh:ValidationReport ;
    sh:conforms false ;
    sh:result
    [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode ex:Alice ;
        sh:resultPath ex:ssn ;
        sh:value "987-65-432A" ;
        sh:sourceConstraintComponent sh:PatternConstraintComponent ;
        sh:sourceShape ... blank node _:b1 on ex:ssn above ... ;
    ] ,
    [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode ex:Bob ;
        sh:resultPath ex:ssn ;
        sh:sourceConstraintComponent sh:MaxCountConstraintComponent ;
        sh:sourceShape ... blank node _:b1 on ex:ssn above ... ;
    ] ,
    [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode ex:Calvin ;
        sh:resultPath ex:worksFor ;
        sh:value ex:UntypedCompany ;
        sh:sourceConstraintComponent sh:ClassConstraintComponent ;
        sh:sourceShape ... blank node _:b2 on ex:worksFor above ... ;
    ] ,
    [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode ex:Calvin ;
        sh:resultPath ex:birthDate ;
        sh:value "1971-07-07"^^xsd:date ;
        sh:sourceConstraintComponent sh:ClosedConstraintComponent ;
        sh:sourceShape ex:PersonShape ;
    ]
] .
The validation results are enclosed in a validation report.
The first validation result is produced because ex:Alice has a value for ex:ssn that does not match the regular expression specified by the property sh:pattern.
The second validation result is produced because ex:Bob has more than the permitted number of values for the property ex:ssn as specified by the sh:maxCount of 1.
The third validation result is produced because ex:Calvin has a value for ex:worksFor that does not have an rdf:type triple that makes it a SHACL instance of ex:Company.
The fourth validation result is produced because the shape ex:PersonShape has the property sh:closed set to true but ex:Calvin uses the property ex:birthDate, which is neither one of the predicates from any of the property shapes of the shape, nor one of the properties listed using sh:ignoredProperties.
SHACL uses the RDF and RDFS vocabularies, but full RDFS inferencing is not required. However, SHACL processors MAY operate on RDF graphs that include entailments [[!sparql11-entailment]], either pre-computed before being submitted to a SHACL processor or performed on the fly as part of SHACL processing (without modifying either data graph or shapes graph).
To support processing of entailments, SHACL includes the property sh:entailment to indicate what inferencing is required by a given shapes graph. The values of the property sh:entailment are IRIs. Common values for this property are covered by [[!sparql11-entailment]].
SHACL implementations MAY, but are not required to, support entailment regimes. If a shapes graph contains any triple with the predicate sh:entailment and object E and the SHACL processor does not support E as an entailment regime for the given data graph then the processor MUST signal a failure. Otherwise, the SHACL processor MUST provide the entailments for all of the values of sh:entailment in the shapes graph, and any inferred triples MUST be returned by all queries against the data graph during the validation process.
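As a minimal sketch, a shapes graph could request RDFS entailment as shown below. The subject IRI http://example.com/ns is an assumption made for this example; the object is the IRI that [[!sparql11-entailment]] assigns to the RDFS entailment regime.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .

# A processor that does not support RDFS entailment for the data graph
# would have to signal a failure when it encounters this triple.
<http://example.com/ns>
    a owl:Ontology ;
    sh:entailment <http://www.w3.org/ns/entailment/RDFS> .
```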
For SHACL Core this specification uses parts of SPARQL 1.1 in non-normative alternative definitions of the semantics of constraint components and targets. While these may help some implementers, SPARQL is not required for the implementation of the SHACL Core language.
SHACL-SPARQL is based on SPARQL 1.1 and uses it as a mechanism to declare constraints and constraint components. Implementations that cover only the SHACL Core features are not required to implement these mechanisms.
SPARQL variables using the $ marker represent external bindings that are pre-bound or, in the case of $PATH, substituted in the SPARQL query before execution (as explained later in this document).
The definition of some constraints requires or is simplified through access to the shapes graph during query execution. SHACL-SPARQL processors MAY pre-bind the variable shapesGraph to provide access to the shapes graph. Access to the shapes graph is not a requirement for supporting the SHACL Core language. The variable shapesGraph can also be used in SPARQL-based constraints and SPARQL-based constraint components. However, such constraints may not be interoperable across different SHACL-SPARQL processors or not applicable to remote RDF datasets.
Note that at the time of writing, SPARQL EXISTS has been imperfectly defined and implementations vary. While a W3C Community Group is working on improving this situation, users of SPARQL are advised that the use of EXISTS may have inconsistent results and should be approached with care.
SHACL-SPARQL supports a constraint component that can be used to express restrictions based on a SPARQL SELECT query.
Constraint Component IRI: sh:SPARQLConstraintComponent
Property | Summary |
---|---|
sh:sparql | A SPARQL-based constraint declaring the SPARQL query to evaluate. |
The syntax rules and validation process for SPARQL-based constraints are defined in the rest of this section.
The following example illustrates the syntax of a SPARQL-based constraint.
ex:ValidCountry a ex:Country ;
    ex:germanLabel "Spanien"@de .

ex:InvalidCountry a ex:Country ;
    ex:germanLabel "Spain"@en .

ex:LanguageExampleShape
    a sh:NodeShape ;
    sh:targetClass ex:Country ;
    sh:sparql [
        a sh:SPARQLConstraint ;   # This triple is optional
        sh:message "Values are literals with German language tag." ;
        sh:prefixes ex: ;
        sh:select """
            SELECT $this (ex:germanLabel AS ?path) ?value
            WHERE {
                $this ex:germanLabel ?value .
                FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
            }
            """ ;
    ] .
The target of the shape above includes all SHACL instances of ex:Country. For those nodes (represented by the variable this), the SPARQL query walks through the values of ex:germanLabel and verifies that they are literals with a German language code.
The validation results for the aforementioned data graph are shown below:
[
    a sh:ValidationReport ;
    sh:conforms false ;
    sh:result [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Violation ;
        sh:focusNode ex:InvalidCountry ;
        sh:resultPath ex:germanLabel ;
        sh:value "Spain"@en ;
        sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;
        sh:sourceShape ex:LanguageExampleShape ;
        # ...
    ]
] .
The SPARQL query returns result set solutions for all bindings of the variable value that violate the constraint. There is a validation result for each solution in that result set, applying the mapping rules explained later. In this example, each validation result will have the binding for the variable this as the sh:focusNode, ex:germanLabel as sh:resultPath and the violating value as sh:value.
The following example illustrates a scenario similar to the one above, but with a property shape.
ex:LanguageExamplePropertyShape
    a sh:PropertyShape ;
    sh:targetClass ex:Country ;
    sh:path ex:germanLabel ;
    sh:sparql [
        a sh:SPARQLConstraint ;   # This triple is optional
        sh:message "Values are literals with German language tag." ;
        sh:prefixes ex: ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this $PATH ?value .
                FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
            }
            """ ;
    ] .
Shapes may have values for the property sh:sparql, and these values are either IRIs or blank nodes. These values are called SPARQL-based constraints.
SPARQL-based constraints have exactly one value for the property sh:select. The value of sh:select is a literal of datatype xsd:string. The class sh:SPARQLConstraint is defined in the SHACL vocabulary and may be used as the type of these constraints (although no type is required). Using the prefix handling rules, the value of sh:select is a valid SPARQL 1.1 SELECT query. The SPARQL query derived from the value of sh:select projects the variable this in the SELECT clause.
The following two properties are similar to their use in shapes:
SPARQL-based constraints may have values for the property sh:message and these are either xsd:string literals or literals with a language tag.
SPARQL-based constraints may have at most one value for the property sh:deactivated and this value is either true or false.
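The two properties can be combined as in the following sketch, which is a variation of the language example with messages in two languages and the constraint switched off. The German message text is an assumption made for illustration.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

ex:LanguageExampleShape
    a sh:NodeShape ;
    sh:targetClass ex:Country ;
    sh:sparql [
        sh:deactivated true ;   # the constraint is kept in the graph but not evaluated
        sh:message "Values are literals with German language tag."@en ;
        sh:message "Werte sind Literale mit deutschem Sprach-Tag."@de ;
        sh:prefixes ex: ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this ex:germanLabel ?value .
                FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
            }
            """ ;
    ] .
```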
SELECT queries used in the context of property shapes use a special variable named PATH as a placeholder for the path used by the shape. The only legal use of the variable PATH in the SPARQL queries of SPARQL-based constraints and SELECT-based validators is in the predicate position of a triple pattern. A query that uses the variable PATH in any other position is ill-formed.
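The following sketch shows a legal use of PATH in a property shape whose path is a sequence path. The shape name and the property ex:capital are assumptions made for this example. Before execution, a processor would replace $PATH with SPARQL surface syntax for the path, for instance ex:capital/ex:germanLabel.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

ex:CapitalLabelShape
    a sh:PropertyShape ;
    sh:targetClass ex:Country ;
    sh:path ( ex:capital ex:germanLabel ) ;   # SHACL sequence path
    sh:sparql [
        sh:message "German label of the capital lacks a German language tag." ;
        sh:prefixes ex: ;
        # PATH appears only in the predicate position of a triple pattern.
        # Using it in a FILTER, or as subject or object, would make the query ill-formed.
        sh:select """
            SELECT $this ?value
            WHERE {
                $this $PATH ?value .
                FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
            }
            """ ;
    ] .
```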
A shapes graph may include declarations of namespace prefixes so that these prefixes can be used to abbreviate the SPARQL queries derived from the same shapes graph. The syntax of such prefix declarations is illustrated by the following example.
ex:
    a owl:Ontology ;
    owl:imports sh: ;
    sh:declare [
        sh:prefix "ex" ;
        sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
    ] ;
    sh:declare [
        sh:prefix "schema" ;
        sh:namespace "http://schema.org/"^^xsd:anyURI ;
    ] .
The values of the property sh:declare are IRIs or blank nodes, and these values are called prefix declarations. The SHACL vocabulary includes the class sh:PrefixDeclaration as type for such prefix declarations although no rdf:type triple is required for them. Prefix declarations have exactly one value for the property sh:prefix. The values of sh:prefix are literals of datatype xsd:string. Prefix declarations have exactly one value for the property sh:namespace. The values of sh:namespace are literals of datatype xsd:anyURI. Such a pair of values specifies a single mapping of a prefix to a namespace.
The recommended subject for values of sh:declare is the IRI of the named graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this is not required.
Prefix declarations can be used by SPARQL-based constraints, the validators of SPARQL-based constraint components, and by similar features defined by SHACL extensions. These nodes can use the property sh:prefixes to specify a set of prefix mappings. An example use of the sh:prefixes property can be found in the example above. The values of sh:prefixes are either IRIs or blank nodes. A SHACL processor collects a set of prefix mappings as the union of all individual prefix mappings that are values of the SPARQL property path sh:prefixes/owl:imports*/sh:declare of the SPARQL-based constraint or validator. If such a collection of prefix declarations contains multiple namespaces for the same value of sh:prefix, then the shapes graph is ill-formed. (Note that SHACL processors MAY ignore prefix declarations that are never reached).
A SHACL processor transforms the values of sh:select (and similar properties such as sh:ask) into SPARQL by prepending PREFIX declarations for all prefix mappings. Each value of sh:prefix is turned into the PNAME_NS, while each value of sh:namespace is turned into the IRIREF in the PREFIX declaration. For the example shapes graph above, a SHACL-SPARQL processor would produce lines such as PREFIX ex: <http://example.com/ns#>. The SHACL-SPARQL processor MUST produce a failure if the resulting query string cannot be parsed into a valid SPARQL 1.1 query.
In the rest of this document, the sh:prefixes statements may have been omitted for brevity.
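The collection of prefix mappings along sh:prefixes/owl:imports*/sh:declare can be sketched with two ontologies, one importing the other. The ontology IRIs http://example.com/app and http://example.com/base and the shape ex:NameShape are assumptions made for this example.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.com/ns#> .

<http://example.com/base>
    a owl:Ontology ;
    sh:declare [
        sh:prefix "schema" ;
        sh:namespace "http://schema.org/"^^xsd:anyURI ;
    ] .

<http://example.com/app>
    a owl:Ontology ;
    owl:imports <http://example.com/base> ;
    sh:declare [
        sh:prefix "ex" ;
        sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
    ] .

ex:NameShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:sparql [
        sh:message "Values of schema:name are literals." ;
        # Both the "ex" and the "schema" mapping are reached from here,
        # the second one through owl:imports.
        sh:prefixes <http://example.com/app> ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this schema:name ?value .
                FILTER (!isLiteral(?value))
            }
            """ ;
    ] .

# Lines a processor would be expected to prepend to the query:
#   PREFIX ex: <http://example.com/ns#>
#   PREFIX schema: <http://schema.org/>
```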
This section explains the validator of sh:SPARQLConstraintComponent. Note that this validator only explains one possible implementation strategy, and SHACL processors may choose alternative approaches as long as the outcome is equivalent.
There are no validation results if the SPARQL-based constraint has true as a value for the property sh:deactivated. Otherwise, execute the SPARQL query specified by the SPARQL-based constraint $sparql, pre-binding the variables this and, if supported, shapesGraph and currentShape as described below. If the shape is a property shape, then prior to execution substitute the variable PATH where it appears in the predicate position of a triple pattern with a valid SPARQL surface syntax string of the SHACL property path specified via sh:path at the property shape. There is one validation result for each solution that does not have true as the binding for the variable failure. These validation results MUST have the property values explained below. A failure MUST be produced if and only if one of the solutions has true as the binding for failure.
When the SPARQL queries of SPARQL-based constraints and the validators of SPARQL-based constraint components are processed, the SHACL-SPARQL processor pre-binds values for the variables in the following table.
Variable | Interpretation |
---|---|
this | The focus node. |
shapesGraph (Optional) | Can be used to query the shapes graph as in GRAPH $shapesGraph { ... }. If the shapes graph is a named graph in the same dataset as the data graph then it is the IRI of the shapes graph in the dataset. Not all SHACL-SPARQL processors need to support this variable. Processors that do not support the variable shapesGraph MUST report a failure if they encounter a query that references this variable. Use of GRAPH $shapesGraph { ... } should be handled with extreme caution. It may result in constraints that are not interoperable across different SHACL-SPARQL processors and that may not run on remote RDF datasets. |
currentShape (Optional) | The current shape. Typically used in conjunction with the variable shapesGraph. The same support policies as for shapesGraph apply for this variable. |
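As a sketch of how the optional variables might be used together, the constraint below reads a limit that is attached to the shape itself. The properties ex:maxAge and ex:age and the shape name are assumptions made for this example, and the query only runs on processors that support shapesGraph and currentShape.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.com/ns#> .

ex:
    sh:declare [
        sh:prefix "ex" ;
        sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
    ] .

ex:AgeLimitShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    ex:maxAge 150 ;   # hypothetical annotation that the query looks up in the shapes graph
    sh:sparql [
        sh:message "Age {?value} exceeds the limit declared at the shape." ;
        sh:prefixes ex: ;
        sh:select """
            SELECT $this ?value
            WHERE {
                GRAPH $shapesGraph { $currentShape ex:maxAge ?max }
                $this ex:age ?value .
                FILTER (?value > ?max)
            }
            """ ;
    ] .
```

Processors without support for these variables would report a failure for this shape, which is why such constraints are best kept out of shapes graphs that need to be portable.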
The property values of the validation result nodes are derived by the following rules, through a combination of result solutions and the values of the constraint itself. The rules are meant to be executed from top to bottom, so that the first bound value will be used.
Property | Production Rules |
---|---|
sh:focusNode | |
sh:resultPath | |
sh:value | |
sh:resultMessage | These message literals may include the names of any SELECT result variables via {?varName} or {$varName}. If the constraint is based on a SPARQL-based constraint component, then the component's parameter names can also be used. These {?varName} and {$varName} blocks SHOULD be replaced with suitable string representations of the values of said variables. |
sh:sourceConstraint | |
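The variable blocks in messages can be illustrated with the following sketch. The shape name and the property ex:age are assumptions made for this example, and the exact string representation of the substituted values is left to the processor.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

ex:NonNegativeAgeShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:sparql [
        # {$this} and {?value} refer to variables projected by the SELECT query
        sh:message "{$this} has the negative age {?value}" ;
        sh:prefixes ex: ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this ex:age ?value .
                FILTER (?value < 0)
            }
            """ ;
    ] .
```

For a solution that binds this to ex:Bob and value to -3, a processor might produce a message along the lines of "ex:Bob has the negative age -3".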
SPARQL-based constraints provide a lot of flexibility but may be hard to understand for some people or lead to repetition. This section introduces SPARQL-based constraint components as a way to abstract the complexity of SPARQL and to declare high-level reusable components similar to the Core constraint components. Such constraint components can be declared using the SHACL RDF vocabulary and thus shared and reused.
The following example demonstrates how SPARQL can be used to specify new constraint components using the SHACL-SPARQL language. The example implements sh:pattern and sh:flags using a SPARQL ASK query to validate that each value node matches a given regular expression. Note that this is only an example implementation and should not be considered normative.
sh:PatternConstraintComponent
    a sh:ConstraintComponent ;
    sh:parameter [
        sh:path sh:pattern ;
    ] ;
    sh:parameter [
        sh:path sh:flags ;
        sh:optional true ;
    ] ;
    sh:validator shimpl:hasPattern .

shimpl:hasPattern
    a sh:SPARQLAskValidator ;
    sh:message "Value does not match pattern {$pattern}" ;
    sh:ask """
        ASK {
            FILTER (!isBlank($value) &&
                IF(bound($flags), regex(str($value), $pattern, $flags),
                                  regex(str($value), $pattern)))
        }
        """ .
Constraint components provide instructions to validation engines on how to identify and validate constraints within a shape. In general, if a shape S has a value for a property p, and there is a constraint component C that specifies p as a parameter, and S has values for all mandatory parameters of C, then the set of these parameter values (including the optional parameters) declare a constraint and the validation engine uses a suitable validator from C to perform the validation of this constraint. In the example above, sh:PatternConstraintComponent declares the mandatory parameter sh:pattern, the optional parameter sh:flags, and a validator that can be used to perform validation against either node shapes or property shapes.
A SPARQL-based constraint component is an IRI that has SHACL type sh:ConstraintComponent in the shapes graph.
The mechanism to declare new constraint components in this document is limited to those based on SPARQL. However, the general syntax of declaring parameters and validators has been designed to also work for other extension languages such as JavaScript.
The parameters of a constraint component are declared via the property sh:parameter. The values of sh:parameter are called parameter declarations. The class sh:Parameter may be used as type of parameter declarations but no such triple is required. Each parameter declaration has exactly one value for the property sh:path. At parameter declarations, the value of sh:path is an IRI.
The local name of an IRI is defined as the longest NCNAME at the end of the IRI, not immediately preceded by the first colon in the IRI. The parameter name of a parameter declaration is defined as the local name of the value of sh:path.
To ensure that a correct mapping from parameters into SPARQL variables is possible, the following syntax rules apply:
Every parameter name is a valid SPARQL VARNAME.
Parameter names must not be one of the following: this, shapesGraph, currentShape, path, PATH, value.
A constraint component where two or more parameter declarations use the same parameter names is ill-formed.
The values of sh:optional must be literals with datatype xsd:boolean. A parameter declaration can have at most one value for the property sh:optional. If set to true then the parameter declaration declares an optional parameter. Every constraint component has at least one non-optional parameter.
The class sh:Parameter is defined as a SHACL subclass of sh:PropertyShape, and all properties that are applicable to property shapes may also be used for parameters. This includes descriptive properties such as sh:name and sh:description but also constraint parameters such as sh:class. Shapes that do not conform with the constraints declared for the parameters are ill-formed. Some implementations MAY use these constraint parameters to prevent the execution of constraint components with invalid parameter values.
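A component with one mandatory and one optional parameter could be declared as in the sketch below. The component and its parameters ex:lowerBound and ex:upperBound are assumptions made for this example; the parameter names lowerBound and upperBound are valid SPARQL variable names and are not among the reserved names.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.com/ns#> .

ex:RangeConstraintComponent
    a sh:ConstraintComponent ;
    sh:parameter [
        sh:path ex:lowerBound ;        # mandatory: sh:optional is absent
        sh:datatype xsd:integer ;      # constraint on the parameter value itself
        sh:name "lower bound" ;
    ] ;
    sh:parameter [
        sh:path ex:upperBound ;
        sh:datatype xsd:integer ;
        sh:optional true ;
        sh:name "upper bound" ;
    ] ;
    sh:validator [
        a sh:SPARQLAskValidator ;
        sh:message "Value is outside the range starting at {$lowerBound}" ;
        # $upperBound is only pre-bound when the shape provides a value for it
        sh:ask """
            ASK {
                FILTER ($value >= $lowerBound &&
                        (!bound($upperBound) || $value <= $upperBound))
            }
            """ ;
    ] .
```

A shape that specifies ex:lowerBound "five" would not conform to the sh:datatype declared for the parameter and would therefore be ill-formed.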
The property sh:labelTemplate can be used at any constraint component to suggest how constraints could be rendered to humans. The values of sh:labelTemplate are strings (possibly with language tag) and are called label templates.
The remainder of this section is informative.
Label templates can include the names of the parameters that are declared for the constraint component using the syntaxes {?varName} or {$varName}, where varName is the parameter name. At display time, these {?varName} and {$varName} blocks SHOULD be replaced with the actual parameter values. There may be multiple label templates for the same subject, but they should not have the same language tags.
For every supported shape type (i.e., property shape or node shape) the constraint component declares a suitable validator. For a given constraint, a validator is selected from the constraint component using the following rules, in order:
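1. For node shapes, use the value of sh:nodeValidator, if present.
2. For property shapes, use the value of sh:propertyValidator, if present.
3. Otherwise, use the value of sh:validator.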
If no suitable validator can be found, a SHACL-SPARQL processor ignores the constraint.
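The selection of validators can be illustrated with a hypothetical constraint component that declares one validator per shape type (ex:LanguageComponent and its parameter ex:lang are assumptions made for this sketch). A constraint in a node shape would be validated with the value of sh:nodeValidator, a constraint in a property shape with the value of sh:propertyValidator.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

ex:LanguageComponent
    a sh:ConstraintComponent ;
    sh:parameter [ sh:path ex:lang ] ;
    # Used for node shapes: the focus node itself is tested
    sh:nodeValidator [
        a sh:SPARQLSelectValidator ;
        sh:message "Focus node is not a literal with language {$lang}" ;
        sh:select """
            SELECT $this
            WHERE {
                FILTER (!isLiteral($this) || !langMatches(lang($this), $lang))
            }
            """ ;
    ] ;
    # Used for property shapes: the values of the path are tested
    sh:propertyValidator [
        a sh:SPARQLSelectValidator ;
        sh:message "Value is not a literal with language {$lang}" ;
        sh:select """
            SELECT $this ?value
            WHERE {
                $this $PATH ?value .
                FILTER (!isLiteral(?value) || !langMatches(lang(?value), $lang))
            }
            """ ;
    ] .
```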
SHACL-SPARQL includes two types of validators, based on SPARQL SELECT (for sh:nodeValidator and sh:propertyValidator) or SPARQL ASK queries (for sh:validator).
Validators with SHACL type sh:SPARQLSelectValidator are called SELECT-based validators. The values of sh:nodeValidator must be SELECT-based validators. The values of sh:propertyValidator must be SELECT-based validators. SELECT-based validators have exactly one value for the property sh:select. The value of sh:select is a valid SPARQL SELECT query using the aforementioned prefix handling rules. The SPARQL query derived from the value of sh:select projects the variable this in its SELECT clause.
The remainder of this section is informative.
The following example illustrates the declaration of a constraint component based on a SPARQL SELECT query. It is a generalized variation of the earlier SPARQL-based constraint example. That SPARQL query included two constants: the specific property ex:germanLabel and the language tag de. Constraint components make it possible to generalize such scenarios, so that constants get pre-bound with parameters. This allows the query logic to be reused in multiple places, without having to write any new SPARQL.
ex:LanguageConstraintComponentUsingSELECT
    a sh:ConstraintComponent ;
    rdfs:label "Language constraint component" ;
    sh:parameter [
        sh:path ex:lang ;
        sh:datatype xsd:string ;
        sh:minLength 2 ;
        sh:name "language" ;
        sh:description "The language tag, e.g. \"de\"." ;
    ] ;
    sh:labelTemplate "Values are literals with language \"{$lang}\"" ;
    sh:propertyValidator [
        a sh:SPARQLSelectValidator ;
        sh:message "Values are literals with language \"{?lang}\"" ;
        sh:select """
            SELECT DISTINCT $this ?value
            WHERE {
                $this $PATH ?value .
                FILTER (!isLiteral(?value) || !langMatches(lang(?value), $lang))
            }
            """
    ] .
Once a constraint component has been declared (in a shapes graph), its parameters can be used as illustrated in the following example.
ex:LanguageExampleShape
    a sh:NodeShape ;
    sh:targetClass ex:Country ;
    sh:property [
        sh:path ex:germanLabel ;
        ex:lang "de" ;
    ] ;
    sh:property [
        sh:path ex:englishLabel ;
        ex:lang "en" ;
    ] .
The example shape above specifies the condition that all values of ex:germanLabel carry the language tag de while all values of ex:englishLabel have en as their language. These details are specified via two property shapes that have values for the ex:lang parameter required by the constraint component.
Many constraint components are of the form in which all value nodes are tested individually against some boolean condition. Writing SELECT queries for these becomes burdensome, especially if a constraint component can be used for both property shapes and node shapes. SHACL-SPARQL provides an alternative, more compact syntax for validators based on ASK queries.
Validators with SHACL type sh:SPARQLAskValidator are called ASK-based validators. The values of sh:validator must be ASK-based validators. ASK-based validators have exactly one value for the property sh:ask. The value of sh:ask must be a literal with datatype xsd:string. The value of sh:ask must be a valid SPARQL ASK query using the aforementioned prefix handling rules.
The remainder of this section is informative.
The ASK queries return true if and only if a given value node (represented by the pre-bound variable value) conforms to the constraint.
The following example declares a constraint component using an ASK query.
ex:LanguageConstraintComponentUsingASK
    a sh:ConstraintComponent ;
    rdfs:label "Language constraint component" ;
    sh:parameter [
        sh:path ex:lang ;
        sh:datatype xsd:string ;
        sh:minLength 2 ;
        sh:name "language" ;
        sh:description "The language tag, e.g. \"de\"." ;
    ] ;
    sh:labelTemplate "Values are literals with language \"{$lang}\"" ;
    sh:validator ex:hasLang .

ex:hasLang
    a sh:SPARQLAskValidator ;
    sh:message "Values are literals with language \"{$lang}\"" ;
    sh:ask """
        ASK {
            FILTER (isLiteral($value) && langMatches(lang($value), $lang))
        }
        """ .
Note that the validation condition implemented by an ASK query is "in the inverse direction" from its SELECT counterpart: ASK queries return true for value nodes that conform to the constraint, while SELECT queries return those value nodes that do not conform.
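To make the condition tested by the ASK query in the example concrete, the following Python sketch mirrors it outside of SPARQL. The `Literal` class and all function names are hypothetical stand-ins, not part of SHACL or of any RDF library, and `lang_matches` only approximates SPARQL's langMatches using basic filtering in the style of RFC 4647.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal (not a real library type)."""
    lexical: str
    lang: Optional[str] = None  # language tag, or None if the literal has none


def lang_matches(tag: str, lang_range: str) -> bool:
    """Approximation of langMatches: case-insensitive basic filtering."""
    if not tag:
        return False  # an empty tag matches no range, not even "*"
    if lang_range == "*":
        return True
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def has_lang(value: object, lang: str) -> bool:
    """ASK direction: True for value nodes that conform."""
    return isinstance(value, Literal) and lang_matches(value.lang or "", lang)


def violating_values(values: Iterable[object], lang: str) -> List[object]:
    """SELECT direction: returns the value nodes that do not conform."""
    return [v for v in values if not has_lang(v, lang)]
```

The two functions illustrate the inverse directions: `has_lang` answers yes for conforming values, whereas `violating_values` lists exactly those values for which `has_lang` answers no.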
This section defines the validator of SPARQL-based constraint components. Note that this validator only explains one possible implementation strategy, and SHACL processors may choose alternative approaches as long as the outcome is equivalent.
As the first step, a validator MUST be selected based on the rules outlined for validators above. Then the following rules apply, producing a set of solutions of SPARQL queries:
- For ASK-based validators: for each value node v where the SPARQL ASK query returns false with v pre-bound to the variable value, create one solution consisting of the bindings ($this, focus node) and ($value, v). Let QS be a list of these solutions.
- For SELECT-based validators: if the shape is a property shape, then prior to execution substitute the variable PATH where it appears in the predicate position of a triple pattern with a valid SPARQL surface syntax string of the SHACL property path specified via sh:path at the property shape. Let QS be the solutions produced by executing the SPARQL query.
The SPARQL query executions above MUST pre-bind the variables this and, if supported, shapesGraph and currentShape as described for SPARQL-based constraints.
In addition, each value of a parameter of the constraint component in the constraint MUST be pre-bound as a variable that has the parameter name as its name.
The production rules for the validation results are identical to those for SPARQL-based constraints, using the solutions QS as produced above.
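The production of the solutions QS can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative algorithm: `ask` is a hypothetical callable standing in for the execution of the sh:ask query with pre-bound variables, and `substitute_path` performs a naive textual substitution that does not verify that PATH occurs in the predicate position of a triple pattern.

```python
import re
from typing import Any, Callable, Dict, Iterable, List

Bindings = Dict[str, Any]


def ask_solutions(
    focus_node: Any,
    value_nodes: Iterable[Any],
    ask: Callable[[Bindings], bool],
    parameters: Bindings,
) -> List[Bindings]:
    """Build QS for an ASK-based validator.

    `ask` (hypothetical) runs the ASK query with the given pre-bound
    variables and returns its boolean result.
    """
    qs: List[Bindings] = []
    for v in value_nodes:
        # Parameter values are pre-bound under their parameter names;
        # this and value are set last so that they cannot be shadowed.
        prebound = {**parameters, "this": focus_node, "value": v}
        if not ask(prebound):
            # One solution per value node for which ASK returns false.
            qs.append({"this": focus_node, "value": v})
    return qs


def substitute_path(select_query: str, path_syntax: str) -> str:
    """Naively replace the variable PATH ($PATH or ?PATH) in a SELECT query."""
    return re.sub(r"[?$]PATH\b", lambda _m: path_syntax, select_query)
```

For example, `substitute_path("SELECT $this ?value WHERE { $this $PATH ?value }", "ex:germanLabel")` yields a query whose triple pattern uses `ex:germanLabel` as its predicate.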
Some features of SHACL-SPARQL rely on the concept of pre-binding of variables as defined in this section.
The definition of pre-binding used by SHACL requires the following restrictions on SPARQL queries.
SHACL-SPARQL processors MUST report a failure when they are operating on a shapes graph that contains SHACL-SPARQL queries (via sh:select and sh:ask) that violate any of these restrictions.
Note that the term potentially pre-bound variables includes the variables this, shapesGraph, currentShape, value (for ASK queries), and any variables that represent the parameters of the constraint component that uses the query.
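- SPARQL queries must not contain a MINUS clause
- SPARQL queries must not contain a federated query (SERVICE)
- SPARQL queries must not contain a VALUES clause
- SPARQL queries must not use the syntax form AS ?var for any potentially pre-bound variable
- Subqueries must return all potentially pre-bound variables, except shapesGraph and currentShape, which are optional as already mentioned for SPARQL-based constraints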
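A rough pre-check for the first four restrictions could look like the Python sketch below. It is keyword-based and purely illustrative: the function name is hypothetical, it may over-report keywords that appear inside IRIs, prefixed names, string literals or comments, and it does not examine the variables returned by subqueries. A real processor would more likely inspect the parsed query.

```python
import re
from typing import Iterable, List

# Keywords that must not occur in SHACL-SPARQL queries (matched textually).
_FORBIDDEN = {
    "MINUS": re.compile(r"\bMINUS\b", re.IGNORECASE),
    "SERVICE": re.compile(r"\bSERVICE\b", re.IGNORECASE),
    "VALUES": re.compile(r"\bVALUES\b", re.IGNORECASE),
}


def restriction_violations(query: str, prebound: Iterable[str]) -> List[str]:
    """Return human-readable descriptions of the restrictions violated."""
    problems = [
        f"contains {keyword}"
        for keyword, pattern in _FORBIDDEN.items()
        if pattern.search(query)
    ]
    for name in prebound:
        # The AS keyword is case-insensitive; variable names are not.
        rebinding = rf"(?i:\bAS)\s+[?$]{re.escape(name)}\b"
        if re.search(rebinding, query):
            problems.append(f"rebinds pre-bound variable {name} via AS")
    return problems
```

An empty result only means that this heuristic found nothing; it does not establish that the query satisfies all restrictions.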
For solution mapping μ, define Table(μ) to be the multiset formed from μ.
Table(μ) = { μ }
Card[μ] = 1
Define the Values Insertion function Replace(X, μ) to replace each occurrence Y of a Basic Graph Pattern, Property Path Expression, Graph(Var, pattern) in X with join(Y, Table(μ)).
The evaluation of the SPARQL Query Q = (E, DS, QF) with pre-bound variables μ is defined as the evaluation of SPARQL query Q' = (Replace(E, μ), DS, QF).
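The Values Insertion function can be illustrated on a toy algebra in Python. The tuple encoding below (operator name first, operands after) and the operator names "BGP", "Path", "Graph", "Join" and "Table" are assumptions made for this sketch and do not correspond to any particular SPARQL engine. The sketch also assumes that occurrences nested inside a Graph pattern are rewritten as well.

```python
from typing import Any, Dict, Tuple

# Operators whose occurrences are joined with Table(mu).
_WRAPPED = {"BGP", "Path", "Graph"}


def table(mu: Dict[str, Any]) -> Tuple[Any, ...]:
    """Table(mu): the multiset holding the single solution mapping mu."""
    return ("Table", tuple(sorted(mu.items())))


def replace(x: Any, mu: Dict[str, Any]) -> Any:
    """Replace(X, mu) over a toy algebra encoded as nested tuples."""
    if not isinstance(x, tuple) or not x:
        return x  # leaf: variable name, pattern text, filter expression
    op, *args = x
    rewritten = (op, *(replace(a, mu) for a in args))
    if op in _WRAPPED:
        return ("Join", rewritten, table(mu))
    return rewritten


def prebind(query: Tuple[Any, Any, Any], mu: Dict[str, Any]) -> Tuple[Any, Any, Any]:
    """Q = (E, DS, QF) becomes Q' = (Replace(E, mu), DS, QF)."""
    e, ds, qf = query
    return (replace(e, mu), ds, qf)
```

Because every basic graph pattern, property path expression and Graph pattern is joined with Table(μ), the pre-bound values constrain each of those patterns wherever they occur in the query, including inside nested groups.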
This section enumerates all normative syntax rules of SHACL. This section is automatically generated from other parts of this spec, and hyperlinks are provided back into the prose if the context of the rule is unclear. Nodes that violate these rules in a shapes graph are ill-formed.
| Syntax Rule Id | Syntax Rule Text |
|---|---|
Like most RDF-based technologies, SHACL processors may operate on graphs that are combined
from various sources. Some applications may have an open "linked data" architecture and dynamically
assemble RDF triples from sources that are outside of an organization's network of trust.
Since RDF allows anyone to add statements about any resource, triples may modify the originally
intended semantics of shape definitions or nodes in a data graph and thus lead to misleading results.
Protection against this (and the following) scenario can be achieved by only using trusted
and verified RDF sources and eliminating the possibility that graphs are dynamically added via
owl:imports and sh:shapesGraph.
SHACL-SPARQL includes all the security issues of SPARQL.
The original 1.0 version of SHACL was produced by the RDF Data Shapes Working Group. See its SHACL 1.0 Acknowledgements section.
The detailed list of changes and their diffs can be found in the Git repository.