This document defines the SPARQL-related features of the SHACL Shapes Constraint Language. SHACL is a language for validating RDF graphs against a set of conditions. These conditions are provided as shapes and other constructs expressed in the form of an RDF graph. RDF graphs that are used in this manner are called "shapes graphs" in SHACL and the RDF graphs that are validated against a shapes graph are called "data graphs". As SHACL shapes graphs are used to validate that data graphs satisfy a set of conditions, they can also be viewed as a description of the data graphs that do satisfy these conditions. Such descriptions may be used for a variety of purposes besides validation, including user interface building, code generation and data integration.

Document Outline

The introduction includes a Terminology section.

Sections 2 and 3 cover the features that SHACL-SPARQL adds to the Core language: SPARQL-based constraints and SPARQL-based constraint components.

The syntax of SHACL is RDF. The examples in this document use Turtle [[!turtle]] and (in one instance) JSON-LD [[json-ld]]. Other RDF serializations such as RDF/XML may be used in practice. The reader should be familiar with basic RDF concepts [[!rdf11-concepts]] such as triples and with SPARQL [[!sparql11-query]].

Introduction

This document specifies the SPARQL-related features of SHACL, the Shapes Constraint Language.

Terminology

Throughout this document, the following terminology is used.

Terminology that is linked to portions of RDF 1.1 Concepts and Abstract Syntax is used in SHACL as defined there. Terminology that is linked to portions of SPARQL 1.1 Query Language is used in SHACL as defined there. A single linkage is sufficient to provide a definition for all occurrences of a particular term in this document.

Definitions are complete within this document, i.e., if there is no rule to make some situation true in this document then the situation is false.

Basic RDF Terminology
This document uses the terms RDF graph, RDF triple, IRI, literal, blank node, node of an RDF graph, RDF term, and subject, predicate, and object of RDF triples, and datatype as defined in RDF 1.1 Concepts and Abstract Syntax [[!rdf11-concepts]]. Language tags are defined as in [[!BCP47]].
Property Value and Path
A property is an IRI. An RDF term n has a value v for property p in an RDF graph if there is an RDF triple in the graph with subject n, predicate p, and object v. The phrase "Every value of P in graph G ..." means "Every object of a triple in G with predicate P ...". (In this document, the verbs specify or declare are sometimes used to express the fact that an RDF term has values for a given predicate in a graph.)
SPARQL property paths are defined as in SPARQL 1.1. An RDF term n has value v for SPARQL property path expression p in an RDF graph G if there is a solution mapping in the result of the SPARQL query SELECT ?s ?o WHERE { ?s p' ?o } on G that binds ?s to n and ?o to v, where p' is SPARQL surface syntax for p.
SHACL Lists
A SHACL list in an RDF graph G is an IRI or a blank node that is either rdf:nil (provided that rdf:nil has no value for either rdf:first or rdf:rest), or has exactly one value for the property rdf:first in G and exactly one value for the property rdf:rest in G that is also a SHACL list in G, and the list does not have itself as a value of the property path rdf:rest+ in G.
The members of any SHACL list except rdf:nil in an RDF graph G consist of its value for rdf:first in G followed by the members in G of its value for rdf:rest in G. The SHACL list rdf:nil has no members in any RDF graph.
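The following non-normative Python sketch illustrates the SHACL list definition above. It assumes a deliberately simple model that is not part of SHACL: an RDF graph is a set of (subject, predicate, object) tuples of strings, and the helper names values and shacl_list_members are hypothetical. The requirement that a list node is an IRI or a blank node is not checked in this model.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF + "first", RDF + "rest", RDF + "nil"


def values(graph, node, prop):
    """Objects of all triples in `graph` with subject `node` and predicate `prop`."""
    return {o for (s, p, o) in graph if s == node and p == prop}


def shacl_list_members(graph, head):
    """Return the members of the SHACL list `head`; raise ValueError if it is not one."""
    # rdf:nil only counts as a SHACL list if it has no rdf:first or rdf:rest values.
    if values(graph, RDF_NIL, RDF_FIRST) or values(graph, RDF_NIL, RDF_REST):
        raise ValueError("rdf:nil must not have values for rdf:first or rdf:rest")
    members, visited, node = [], set(), head
    while node != RDF_NIL:
        if node in visited:
            # The list would have itself as a value of the path rdf:rest+.
            raise ValueError("list reaches itself through rdf:rest+")
        visited.add(node)
        firsts = values(graph, node, RDF_FIRST)
        rests = values(graph, node, RDF_REST)
        if len(firsts) != 1 or len(rests) != 1:
            raise ValueError("need exactly one rdf:first and one rdf:rest value")
        members.append(next(iter(firsts)))
        node = next(iter(rests))
    return members
```

Raising an error for anything that is not a SHACL list keeps the sketch short; an implementation could equally return a flag and let the caller decide how to report the ill-formed node.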
Binding, Solution
A binding is a pair (variable, RDF term), consistent with the term's use in SPARQL. A solution is a set of bindings, informally often understood as one row in the body of the result table of a SPARQL query. Variables are not required to be bound in a solution.
SHACL Subclass, SHACL superclass
A node Sub in an RDF graph is a SHACL subclass of another node Super in the graph if there is a sequence of triples in the graph each with predicate rdfs:subClassOf such that the subject of the first triple is Sub, the object of the last triple is Super, and the object of each triple except the last is the subject of the next. If Sub is a SHACL subclass of Super in an RDF graph then Super is a SHACL superclass of Sub in the graph.
SHACL Type
The SHACL types of an RDF term in an RDF graph are its values for rdf:type in the graph together with the SHACL superclasses of these values in the graph.
SHACL Class
Nodes in an RDF graph that are SHACL subclasses, SHACL superclasses, or SHACL types of nodes in the graph are referred to as SHACL classes.
SHACL Class Instance
A node n in an RDF graph G is a SHACL instance of a SHACL class C in G if one of the SHACL types of n in G is C.
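The definitions of SHACL subclass, SHACL type and SHACL instance can be illustrated with a small non-normative Python sketch. It assumes that an RDF graph is modelled as a set of (subject, predicate, object) tuples of strings; the function names are hypothetical and not part of SHACL. No RDFS inferencing is involved: only rdfs:subClassOf triples that are actually present in the graph are followed.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def shacl_superclasses(graph, cls):
    """Nodes reachable from `cls` through one or more rdfs:subClassOf triples."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for (s, p, o) in graph:
            if s == current and p == RDFS_SUBCLASSOF and o not in found:
                found.add(o)
                stack.append(o)
    # The definition speaks of "another node", so a cycle does not make
    # a class its own superclass in this sketch.
    found.discard(cls)
    return found


def shacl_types(graph, node):
    """Values of rdf:type for `node` plus the SHACL superclasses of those values."""
    direct = {o for (s, p, o) in graph if s == node and p == RDF_TYPE}
    types = set(direct)
    for cls in direct:
        types |= shacl_superclasses(graph, cls)
    return types


def is_shacl_instance(graph, node, cls):
    """True if `cls` is one of the SHACL types of `node` in `graph`."""
    return cls in shacl_types(graph, node)
```

For example, with the triples ex:Alice rdf:type ex:Student and ex:Student rdfs:subClassOf ex:Person, the call is_shacl_instance(graph, "ex:Alice", "ex:Person") holds, which is why a shape with sh:targetClass ex:Person would also target ex:Alice.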
SHACL Core and SHACL-SPARQL
The SHACL specification is divided into SHACL Core and SHACL-SPARQL. SHACL Core consists of frequently needed features for the representation of shapes, constraints and targets. All SHACL implementations MUST at least implement SHACL Core. SHACL-SPARQL consists of all features of SHACL Core plus the advanced features of SPARQL-based constraints and an extension mechanism to declare new constraint components.

Document Conventions

Within this document, the following namespace prefix bindings are used:

Prefix Namespace
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
sh: http://www.w3.org/ns/shacl#
xsd: http://www.w3.org/2001/XMLSchema#
ex: http://example.com/ns#

Note that the URI of the graph defining the SHACL vocabulary itself is equivalent to the namespace above, i.e. it includes the #. References to the SHACL vocabulary, e.g. via owl:imports should include the #.

Throughout the document, color-coded boxes containing RDF graphs in Turtle will appear. These fragments of Turtle documents use the prefix bindings given above.

# This box represents an input shapes graph

# Triples that can be omitted are marked as grey e.g.
<s> <p> <o> .
# This box represents an input data graph.
# When highlighting is used in the examples:

# Elements highlighted in blue are focus nodes
ex:Bob a ex:Person .

# Elements highlighted in red are focus nodes that fail validation
ex:Alice a ex:Person .
# This box represents an output results graph

SHACL Definitions appear in blue boxes:

SPARQL or TEXTUAL DEFINITIONS
# This box contains SPARQL or textual definitions. 

Grey boxes such as this include syntax rules that apply to the shapes graph.

true denotes the RDF term "true"^^xsd:boolean. false denotes the RDF term "false"^^xsd:boolean.

This document defines the SHACL-SPARQL language that extends SHACL Core. This specification describes conformance criteria for:

This document includes syntactic rules that shapes and other nodes need to fulfill in the shapes graph. These rules are typically of the form "A shape must have...", "The values of X are literals" or "All objects of triples with predicate P must be IRIs". The complete list of these rules can be found in the appendix. Nodes that violate any of these rules are called ill-formed. Nodes that violate none of these rules are called well-formed. A shapes graph is ill-formed if it contains at least one ill-formed node.

The remainder of this section is informative.

SHACL Core processors that do not also support SHACL-SPARQL ignore any SHACL-SPARQL constructs such as sh:sparql triples.

SHACL Example

The following example data graph contains three SHACL instances of the class ex:Person.

ex:Alice
	a ex:Person ;
	ex:ssn "987-65-432A" .
  
ex:Bob
	a ex:Person ;
	ex:ssn "123-45-6789" ;
	ex:ssn "124-35-6789" .
  
ex:Calvin
	a ex:Person ;
	ex:birthDate "1971-07-07"^^xsd:date ;
	ex:worksFor ex:UntypedCompany .

The following conditions are shown in the example:

The aforementioned conditions can be represented as shapes and constraints in the following shapes graph:

ex:PersonShape
	a sh:NodeShape ;
	sh:targetClass ex:Person ;    # Applies to all persons
	sh:property [                 # _:b1
		sh:path ex:ssn ;           # constrains the values of ex:ssn
		sh:maxCount 1 ;
		sh:datatype xsd:string ;
		sh:pattern "^\\d{3}-\\d{2}-\\d{4}$" ;
	] ;
	sh:property [                 # _:b2
		sh:path ex:worksFor ;
		sh:class ex:Company ;
		sh:nodeKind sh:IRI ;
	] ;
	sh:closed true ;
	sh:ignoredProperties ( rdf:type ) .

The example below shows the same shape definition as a possible JSON-LD [[json-ld]] fragment. Note that we have left out a @context declaration, and depending on the @context the rendering may look quite different. Therefore this example should be understood as an illustration only.

{
	"@id" : "ex:PersonShape",
	"@type" : "NodeShape",
	"targetClass" : "ex:Person",
	"property" : [
		{
			"path" : "ex:ssn",
			"maxCount" : 1,
			"datatype" : "xsd:string" ,
			"pattern" : "^\\d{3}-\\d{2}-\\d{4}$"
		},
		{
			"path" : "ex:worksFor",
			"class" : "ex:Company",
			"nodeKind" : "sh:IRI"
		}
	],
	"closed" : true,
	"ignoredProperties" : [ "rdf:type" ]
}

We can use the shape declaration above to illustrate some of the key terminology used by SHACL. The target for the shape ex:PersonShape is the set of all SHACL instances of the class ex:Person. This is specified using the property sh:targetClass. During the validation, these target nodes become focus nodes for the shape. The shape ex:PersonShape is a node shape, which means that it applies to the focus nodes. It declares constraints on the focus nodes, for example using the parameters sh:closed and sh:ignoredProperties. The node shape also declares two other constraints with the property sh:property, and each of these is backed by a property shape. These property shapes declare additional constraints using parameters such as sh:datatype and sh:maxCount.

Some of the property shapes specify parameters from multiple constraint components in order to restrict multiple aspects of the property values. For example, in the property shape for ex:ssn, parameters from three constraint components are used. The parameters of these constraint components are sh:datatype, sh:pattern and sh:maxCount. For each focus node the property values of ex:ssn will be validated against all three components.

SHACL validation based on the provided data graph and shapes graph would produce the following validation report. See the section Validation Report for details on the format.

[	a sh:ValidationReport ;
	sh:conforms false ;
	sh:result
	[	a sh:ValidationResult ;
		sh:resultSeverity sh:Violation ;
		sh:focusNode ex:Alice ;
		sh:resultPath ex:ssn ;
		sh:value "987-65-432A" ;
		sh:sourceConstraintComponent sh:PatternConstraintComponent ;
		sh:sourceShape ... blank node _:b1 on ex:ssn above ... ;
	] ,
	[	a sh:ValidationResult ;
		sh:resultSeverity sh:Violation ;
		sh:focusNode ex:Bob ;
		sh:resultPath ex:ssn ;
		sh:sourceConstraintComponent sh:MaxCountConstraintComponent ;
		sh:sourceShape ... blank node _:b1 on ex:ssn above ... ;
	] ,
	[	a sh:ValidationResult ;
		sh:resultSeverity sh:Violation ;
		sh:focusNode ex:Calvin ;
		sh:resultPath ex:worksFor ;
		sh:value ex:UntypedCompany ;
		sh:sourceConstraintComponent sh:ClassConstraintComponent ;
		sh:sourceShape ... blank node _:b2 on ex:worksFor above ... ;
	] ,
	[	a sh:ValidationResult ;
		sh:resultSeverity sh:Violation ;
		sh:focusNode ex:Calvin ;
		sh:resultPath ex:birthDate ;
		sh:value "1971-07-07"^^xsd:date ;
		sh:sourceConstraintComponent sh:ClosedConstraintComponent ;
		sh:sourceShape ex:PersonShape ;
	] 
] .

The validation results are enclosed in a validation report. The first validation result is produced because ex:Alice has a value for ex:ssn that does not match the regular expression specified by the property sh:pattern. The second validation result is produced because ex:Bob has more than the permitted number of values for the property ex:ssn as specified by the sh:maxCount of 1. The third validation result is produced because ex:Calvin has a value for ex:worksFor that does not have an rdf:type triple that makes it a SHACL instance of ex:Company. The fourth validation result is produced because the shape ex:PersonShape has the property sh:closed set to true but ex:Calvin uses the property ex:birthDate, which is neither one of the predicates from any of the property shapes of the shape, nor one of the properties listed using sh:ignoredProperties.

Relationship between SHACL and RDFS inferencing

SHACL uses the RDF and RDFS vocabularies, but full RDFS inferencing is not required.

However, SHACL processors MAY operate on RDF graphs that include entailments [[!sparql11-entailment]] - either pre-computed before being submitted to a SHACL processor or performed on the fly as part of SHACL processing (without modifying either data graph or shapes graph). To support processing of entailments, SHACL includes the property sh:entailment to indicate what inferencing is required by a given shapes graph.

The values of the property sh:entailment are IRIs. Common values for this property are covered by [[!sparql11-entailment]].

SHACL implementations MAY, but are not required to, support entailment regimes. If a shapes graph contains any triple with the predicate sh:entailment and object E and the SHACL processor does not support E as an entailment regime for the given data graph then the processor MUST signal a failure. Otherwise, the SHACL processor MUST provide the entailments for all of the values of sh:entailment in the shapes graph, and any inferred triples MUST be returned by all queries against the data graph during the validation process.

Relationship between SHACL and SPARQL

For SHACL Core this specification uses parts of SPARQL 1.1 in non-normative alternative definitions of the semantics of constraint components and targets. While these may help some implementers, SPARQL is not required for the implementation of the SHACL Core language.

SHACL-SPARQL is based on SPARQL 1.1 and uses it as a mechanism to declare constraints and constraint components. Implementations that cover only the SHACL Core features are not required to implement these mechanisms.

SPARQL variables using the $ marker represent external bindings that are pre-bound or, in the case of $PATH, substituted in the SPARQL query before execution (as explained in ).

The definition of some constraints requires or is simplified through access to the shapes graph during query execution. SHACL-SPARQL processors MAY pre-bind the variable shapesGraph to provide access to the shapes graph. Access to the shapes graph is not a requirement for supporting the SHACL Core language. The variable shapesGraph can also be used in SPARQL-based constraints and SPARQL-based constraint components. However, such constraints may not be interoperable across different SHACL-SPARQL processors or not applicable to remote RDF datasets.

Note that at the time of writing, SPARQL EXISTS has been imperfectly defined and implementations vary. While a W3C Community Group is working on improving this situation, users of SPARQL are advised that the use of EXISTS may have inconsistent results and should be approached with care.

SPARQL-based Constraints

SHACL-SPARQL supports a constraint component that can be used to express restrictions based on a SPARQL SELECT query.

Constraint Component IRI: sh:SPARQLConstraintComponent

Parameters:
Property Summary
sh:sparql A SPARQL-based constraint declaring the SPARQL query to evaluate.

The syntax rules and validation process for SPARQL-based constraints are defined in the rest of this section.

An Example SPARQL-based Constraint

The following example illustrates the syntax of a SPARQL-based constraint.

ex:ValidCountry a ex:Country ;
	ex:germanLabel "Spanien"@de .
  
ex:InvalidCountry a ex:Country ;
	ex:germanLabel "Spain"@en .

ex:LanguageExampleShape
	a sh:NodeShape ;
	sh:targetClass ex:Country ;
	sh:sparql [
		a sh:SPARQLConstraint ;   # This triple is optional
		sh:message "Values are literals with German language tag." ;
		sh:prefixes ex: ;
		sh:select """
			SELECT $this (ex:germanLabel AS ?path) ?value
			WHERE {
				$this ex:germanLabel ?value .
				FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
			}
			""" ;
	] .

The target of the shape above includes all SHACL instances of ex:Country. For those nodes (represented by the variable this), the SPARQL query walks through the values of ex:germanLabel and verifies that they are literals with a German language tag. The validation results for the aforementioned data graph are shown below:

[	a sh:ValidationReport ;
	sh:conforms false ;
	sh:result [
		a sh:ValidationResult ;
		sh:resultSeverity sh:Violation ;
		sh:focusNode ex:InvalidCountry ;
		sh:resultPath ex:germanLabel ;
		sh:value "Spain"@en ;
		sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;
		sh:sourceShape ex:LanguageExampleShape ;
		# ...
	]
] .

The SPARQL query returns result set solutions for all bindings of the variable value that violate the constraint. There is a validation result for each solution in that result set, applying the mapping rules explained later. In this example, each validation result will have the binding for the variable this as the sh:focusNode, ex:germanLabel as sh:resultPath and the violating value as sh:value.

The following example illustrates a scenario similar to the one above, but with a property shape.

ex:LanguageExamplePropertyShape
	a sh:PropertyShape ;
	sh:targetClass ex:Country ;
	sh:path ex:germanLabel ;
	sh:sparql [
		a sh:SPARQLConstraint ;   # This triple is optional
		sh:message "Values are literals with German language tag." ;
		sh:prefixes ex: ;
		sh:select """
			SELECT $this ?value
			WHERE {
				$this $PATH ?value .
				FILTER (!isLiteral(?value) || !langMatches(lang(?value), "de"))
			}
			""" ;
	] .

Syntax of SPARQL-based Constraints

Shapes may have values for the property sh:sparql, and these values are either IRIs or blank nodes. These values are called SPARQL-based constraints.

SPARQL-based constraints have exactly one value for the property sh:select. The value of sh:select is a literal of datatype xsd:string. The class sh:SPARQLConstraint is defined in the SHACL vocabulary and may be used as the type of these constraints (although no type is required). Using the prefix handling rules, the value of sh:select is a valid SPARQL 1.1 SELECT query. The SPARQL query derived from the value of sh:select projects the variable this in the SELECT clause.

The following two properties are similar to their use in shapes:

SPARQL-based constraints may have values for the property sh:message and these are either xsd:string literals or literals with a language tag. SPARQL-based constraints may have at most one value for the property sh:deactivated and this value is either true or false.

SELECT queries used in the context of property shapes use a special variable named PATH as a placeholder for the path used by the shape.

The only legal use of the variable PATH in the SPARQL queries of SPARQL-based constraints and SELECT-based validators is in the predicate position of a triple pattern. A query that uses the variable PATH in any other position is ill-formed.

Prefix Declarations for SPARQL Queries

A shapes graph may include declarations of namespace prefixes so that these prefixes can be used to abbreviate the SPARQL queries derived from the same shapes graph. The syntax of such prefix declarations is illustrated by the following example.

ex:
	a owl:Ontology ;
	owl:imports sh: ;
	sh:declare [
		sh:prefix "ex" ;
		sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
	] ;
	sh:declare [
		sh:prefix "schema" ;
		sh:namespace "http://schema.org/"^^xsd:anyURI ;
	] .

The values of the property sh:declare are IRIs or blank nodes, and these values are called prefix declarations. The SHACL vocabulary includes the class sh:PrefixDeclaration as type for such prefix declarations although no rdf:type triple is required for them. Prefix declarations have exactly one value for the property sh:prefix. The values of sh:prefix are literals of datatype xsd:string. Prefix declarations have exactly one value for the property sh:namespace. The values of sh:namespace are literals of datatype xsd:anyURI. Such a pair of values specifies a single mapping of a prefix to a namespace.

The recommended subject for values of sh:declare is the IRI of the named graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this is not required.

Prefix declarations can be used by SPARQL-based constraints, the validators of SPARQL-based constraint components, and by similar features defined by SHACL extensions. These nodes can use the property sh:prefixes to specify a set of prefix mappings. An example use of the sh:prefixes property can be found in the example above.

The values of sh:prefixes are either IRIs or blank nodes. A SHACL processor collects a set of prefix mappings as the union of all individual prefix mappings that are values of the SPARQL property path sh:prefixes/owl:imports*/sh:declare of the SPARQL-based constraint or validator. If such a collection of prefix declarations contains multiple namespaces for the same value of sh:prefix, then the shapes graph is ill-formed. (Note that SHACL processors MAY ignore prefix declarations that are never reached).

A SHACL processor transforms the values of sh:select (and similar properties such as sh:ask) into SPARQL by prepending PREFIX declarations for all prefix mappings. Each value of sh:prefix is turned into the PNAME_NS, while each value of sh:namespace is turned into the IRIREF in the PREFIX declaration. For the example shapes graph above, a SHACL-SPARQL processor would produce lines such as PREFIX ex: <http://example.com/ns#>. The SHACL-SPARQL processor MUST produce a failure if the resulting query string cannot be parsed into a valid SPARQL 1.1 query.
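As a non-normative illustration of the prefix handling described above, the following Python sketch prepends PREFIX declarations to a query string. It assumes that the prefix declarations reachable through sh:prefixes/owl:imports*/sh:declare have already been collected into a list of (prefix, namespace) pairs; the function name prepend_prefixes is hypothetical. The sketch does not parse the resulting query, so the failure that a processor must produce for an unparsable query is out of its scope.

```python
def prepend_prefixes(query, declarations):
    """Prepend one PREFIX line per collected (prefix, namespace) pair to `query`."""
    mapping = {}
    for prefix, namespace in declarations:
        # Multiple namespaces for the same sh:prefix make the shapes graph ill-formed.
        if mapping.setdefault(prefix, namespace) != namespace:
            raise ValueError(f"prefix {prefix!r} is declared with multiple namespaces")
    # sh:prefix becomes the PNAME_NS, sh:namespace becomes the IRIREF.
    header = "".join(f"PREFIX {p}: <{ns}>\n" for p, ns in sorted(mapping.items()))
    return header + query
```

Applied to the declarations from the example shapes graph above, the result starts with the lines PREFIX ex: <http://example.com/ns#> and PREFIX schema: <http://schema.org/>, followed by the unchanged value of sh:select. Sorting the prefixes is only a choice made here for reproducible output.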

In the rest of this document, the sh:prefixes statements may have been omitted for brevity.

Validation with SPARQL-based Constraints

This section explains the validator of sh:SPARQLConstraintComponent. Note that this validator only explains one possible implementation strategy, and SHACL processors may choose alternative approaches as long as the outcome is equivalent.

TEXTUAL DEFINITION
There are no validation results if the SPARQL-based constraint has true as a value for the property sh:deactivated. Otherwise, execute the SPARQL query specified by the SPARQL-based constraint $sparql pre-binding the variables this and, if supported, shapesGraph and currentShape as described in the section Pre-bound Variables in SPARQL Constraints. If the shape is a property shape, then prior to execution substitute the variable PATH where it appears in the predicate position of a triple pattern with a valid SPARQL surface syntax string of the SHACL property path specified via sh:path at the property shape. There is one validation result for each solution that does not have true as the binding for the variable failure. These validation results MUST have the property values explained in the section Mapping of Solution Bindings to Result Properties. A failure MUST be produced if and only if one of the solutions has true as the binding for failure.
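The substitution of the variable PATH can be sketched in Python as follows. This is a non-normative simplification under stated assumptions: the value of sh:path is a single IRI (a predicate path), and the replacement is purely textual, so it does not verify that PATH occurs only in the predicate position and would also touch occurrences inside string literals or comments. The function name substitute_path is hypothetical.

```python
import re


def substitute_path(query, predicate_iri):
    """Textually replace $PATH (or ?PATH) in `query` with the IRI in angle brackets."""
    # A callable replacement avoids any escape processing of the IRI string.
    return re.sub(r"[?$]PATH\b", lambda _match: f"<{predicate_iri}>", query)
```

For the property shape ex:LanguageExamplePropertyShape, the triple pattern $this $PATH ?value becomes $this <http://example.com/ns#germanLabel> ?value before the query is executed. Paths other than predicate paths would need to be rendered in SPARQL property path syntax instead, which this sketch does not attempt.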

Pre-bound Variables in SPARQL Constraints ($this, $shapesGraph, $currentShape)

When the SPARQL queries of SPARQL-based constraints and the validators of SPARQL-based constraint components are processed, the SHACL-SPARQL processor pre-binds values for the variables in the following table.

Variable Interpretation
this The focus node.
shapesGraph (Optional) Can be used to query the shapes graph as in GRAPH $shapesGraph { ... }. If the shapes graph is a named graph in the same dataset as the data graph then it is the IRI of the shapes graph in the dataset. Not all SHACL-SPARQL processors need to support this variable. Processors that do not support the variable shapesGraph MUST report a failure if they encounter a query that references this variable. Use of GRAPH $shapesGraph { ... } should be handled with extreme caution. It may result in constraints that are not interoperable across different SHACL-SPARQL processors and that may not run on remote RDF datasets.
currentShape (Optional) The current shape. Typically used in conjunction with the variable shapesGraph. The same support policies as for shapesGraph apply for this variable.

Mapping of Solution Bindings to Result Properties

The property values of the validation result nodes are derived by the following rules, through a combination of result solutions and the values of the constraint itself. The rules are meant to be executed from top to bottom, so that the first bound value will be used.

Property Production Rules
sh:focusNode
  1. The binding for the variable this
sh:resultPath
  1. The binding for the variable path, if that is an IRI
  2. For results produced by a property shape, a SHACL property path that is equivalent to the value of sh:path of the shape
sh:value
  1. The binding for the variable value
  2. The value node
sh:resultMessage
  1. The binding for the variable message
  2. For SPARQL-based constraints: The values of sh:message of the SPARQL-based constraint. For SPARQL-based constraint components: The values of sh:message of the validator of the SPARQL-based constraint component.
  3. For SPARQL-based constraint components: The values of sh:message of the SPARQL-based constraint component.
These message literals may include the names of any SELECT result variables via {?varName} or {$varName}. If the constraint is based on a SPARQL-based constraint component, then the component's parameter names can also be used. These {?varName} and {$varName} blocks SHOULD be replaced with suitable string representations of the values of said variables.
sh:sourceConstraint
  1. The SPARQL-based constraint, i.e. the value of sh:sparql
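The production rules above can be sketched in Python as follows. This is a non-normative illustration with several assumptions that are not part of SHACL: a solution is a dict from variable names to RDF terms, IRIs are marked with a small hypothetical IRI class while all other terms are plain strings, and the caller passes the applicable sh:message values (of the constraint, validator or constraint component, in that order of preference) as messages. The names fill_template and result_properties are hypothetical.

```python
import re


class IRI(str):
    """Marker type for IRIs; other RDF terms are plain strings in this sketch."""


def fill_template(template, bindings):
    """Replace {?name} and {$name} blocks with the bound value, if there is one."""
    def replace(match):
        name = match.group(1)
        return str(bindings[name]) if name in bindings else match.group(0)
    return re.sub(r"\{[?$]([A-Za-z_][A-Za-z0-9_]*)\}", replace, template)


def result_properties(solution, constraint, messages=(), shape_path=None, value_node=None):
    """Derive validation result properties; the first applicable rule wins."""
    result = {"sh:focusNode": solution["this"], "sh:sourceConstraint": constraint}
    if isinstance(solution.get("path"), IRI):
        result["sh:resultPath"] = solution["path"]
    elif shape_path is not None:          # results produced by a property shape
        result["sh:resultPath"] = shape_path
    if "value" in solution:
        result["sh:value"] = solution["value"]
    elif value_node is not None:
        result["sh:value"] = value_node
    templates = [solution["message"]] if "message" in solution else list(messages)
    if templates:
        result["sh:resultMessage"] = [fill_template(t, solution) for t in templates]
    return result
```

With the solution for ex:InvalidCountry from the earlier example, the sketch yields ex:InvalidCountry as sh:focusNode, ex:germanLabel as sh:resultPath and "Spain"@en as sh:value. Unbound variables are simply absent from the dict, which mirrors the statement that variables are not required to be bound in a solution.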

SPARQL-based Constraint Components

SPARQL-based constraints provide a lot of flexibility but may be hard to understand for some people or lead to repetition. This section introduces SPARQL-based constraint components as a way to abstract the complexity of SPARQL and to declare high-level reusable components similar to the Core constraint components. Such constraint components can be declared using the SHACL RDF vocabulary and thus shared and reused.

An Example SPARQL-based Constraint Component

The following example demonstrates how SPARQL can be used to specify new constraint components using the SHACL-SPARQL language. The example implements sh:pattern and sh:flags using a SPARQL ASK query to validate that each value node matches a given regular expression. Note that this is only an example implementation and should not be considered normative.

sh:PatternConstraintComponent
	a sh:ConstraintComponent ;
	sh:parameter [
		sh:path sh:pattern ;
	] ;
	sh:parameter [
		sh:path sh:flags ;
		sh:optional true ;
	] ;
	sh:validator shimpl:hasPattern .

shimpl:hasPattern
	a sh:SPARQLAskValidator ;
	sh:message "Value does not match pattern {$pattern}" ;
	sh:ask """
		ASK { 
			FILTER (!isBlank($value) && 
				IF(bound($flags), regex(str($value), $pattern, $flags), regex(str($value), $pattern)))
		}""" .

Constraint components provide instructions to validation engines on how to identify and validate constraints within a shape. In general, if a shape S has a value for a property p, and there is a constraint component C that specifies p as a parameter, and S has values for all mandatory parameters of C, then the set of these parameter values (including the optional parameters) declare a constraint and the validation engine uses a suitable validator from C to perform the validation of this constraint. In the example above, sh:PatternConstraintComponent declares the mandatory parameter sh:pattern, the optional parameter sh:flags, and a validator that can be used to perform validation against either node shapes or property shapes.
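The way a validation engine identifies a constraint from parameter values can be illustrated with a short non-normative Python sketch. It assumes a simplified model in which a shape is a dict from property names to lists of values and the parameter declarations of a constraint component are given as (path, optional) pairs; the function name declared_constraint is hypothetical.

```python
def declared_constraint(shape, parameters):
    """Return the parameter values that declare a constraint, or None.

    `shape` maps property names to lists of values; `parameters` is a list of
    (path, optional) pairs taken from the component's parameter declarations.
    """
    mandatory = [path for path, optional in parameters if not optional]
    # The shape must have values for all mandatory parameters of the component.
    if not all(shape.get(path) for path in mandatory):
        return None
    # The constraint consists of all parameter values, including optional ones.
    return {path: list(shape[path]) for path, _ in parameters if shape.get(path)}
```

For sh:PatternConstraintComponent, a property shape with only sh:pattern declares a constraint, a shape with both sh:pattern and sh:flags declares a constraint that includes the flags, and a shape with only sh:flags declares none because the mandatory parameter is missing.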

Syntax of SPARQL-based Constraint Components

A SPARQL-based constraint component is an IRI that has SHACL type sh:ConstraintComponent in the shapes graph.

The mechanism to declare new constraint components in this document is limited to those based on SPARQL. However, the general syntax of declaring parameters and validators has been designed to also work for other extension languages such as JavaScript.

Parameter Declarations (sh:parameter)

The parameters of a constraint component are declared via the property sh:parameter. The values of sh:parameter are called parameter declarations. The class sh:Parameter may be used as type of parameter declarations but no such triple is required. Each parameter declaration has exactly one value for the property sh:path. At parameter declarations, the value of sh:path is an IRI.

The local name of an IRI is defined as the longest NCNAME at the end of the IRI, not immediately preceded by the first colon in the IRI. The parameter name of a parameter declaration is defined as the local name of the value of sh:path. To ensure that a correct mapping from parameters into SPARQL variables is possible, the following syntax rules apply:

Every parameter name is a valid SPARQL VARNAME. Parameter names must not be one of the following: this, shapesGraph, currentShape, path, PATH, value. A constraint component where two or more parameter declarations use the same parameter names is ill-formed.
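The syntax rules for parameter names can be approximated by the following non-normative Python sketch. It makes simplifying assumptions that should be noted: NCNAME and VARNAME are restricted to ASCII characters, and the condition about the first colon in the IRI is not handled, so the helper is only meant for IRIs whose local name follows a # or / separator. The names local_name and check_parameter_names are hypothetical.

```python
import re

RESERVED = {"this", "shapesGraph", "currentShape", "path", "PATH", "value"}


def local_name(iri):
    """Longest ASCII NCName-like suffix of `iri` (simplified, see assumptions)."""
    match = re.search(r"[A-Za-z_][A-Za-z0-9_.\-]*$", iri)
    return match.group(0) if match else ""


def check_parameter_names(parameter_paths):
    """Return the parameter names for the given sh:path IRIs or raise ValueError."""
    names = []
    for iri in parameter_paths:
        name = local_name(iri)
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):   # ASCII subset of VARNAME
            raise ValueError(f"{name!r} is not a valid SPARQL variable name")
        if name in RESERVED:
            raise ValueError(f"{name!r} is reserved and cannot be a parameter name")
        if name in names:
            raise ValueError(f"parameter name {name!r} is declared more than once")
        names.append(name)
    return names
```

The VARNAME check matters because an NCName may contain characters such as - or . that cannot appear in a SPARQL variable name: a parameter path ending in max-count has a local name but no usable variable.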

The values of sh:optional must be literals with datatype xsd:boolean. A parameter declaration can have at most one value for the property sh:optional. If set to true then the parameter declaration declares an optional parameter. Every constraint component has at least one non-optional parameter.

The class sh:Parameter is defined as a SHACL subclass of sh:PropertyShape, and all properties that are applicable to property shapes may also be used for parameters. This includes descriptive properties such as sh:name and sh:description but also constraint parameters such as sh:class. Shapes that do not conform with the constraints declared for the parameters are ill-formed. Some implementations MAY use these constraint parameters to prevent the execution of constraint components with invalid parameter values.

Label Templates (sh:labelTemplate)

The property sh:labelTemplate can be used at any constraint component to suggest how constraints could be rendered to humans. The values of sh:labelTemplate are strings (possibly with language tag) and are called label templates.

The remainder of this section is informative.

Label templates can include the names of the parameters that are declared for the constraint component using the syntaxes {?varName} or {$varName}, where varName is the parameter name. At display time, these {?varName} and {$varName} blocks SHOULD be replaced with the actual parameter values. There may be multiple label templates for the same subject, but they should not have the same language tags.

Validators

For every supported shape type (i.e., property shape or node shape) the constraint component declares a suitable validator. For a given constraint, a validator is selected from the constraint component using the following rules, in order:

  1. For node shapes, use one of the values of sh:nodeValidator, if present.
  2. For property shapes, use one of the values of sh:propertyValidator, if present.
  3. Otherwise, use one of the values of sh:validator.

If no suitable validator can be found, a SHACL-SPARQL processor ignores the constraint.
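The selection rules above can be sketched in Python as follows. The sketch is non-normative and assumes that a constraint component is represented as a dict from property names to lists of validators; the function name select_validator and the choice of the first listed value (the rules only say "one of the values") are assumptions made here.

```python
def select_validator(component, shape_kind):
    """Pick a validator for a 'node' or 'property' shape, or None if there is none."""
    specific = "sh:nodeValidator" if shape_kind == "node" else "sh:propertyValidator"
    # The shape-specific validator takes precedence over the generic sh:validator.
    for prop in (specific, "sh:validator"):
        candidates = component.get(prop, [])
        if candidates:
            return candidates[0]
    return None  # no suitable validator: the constraint is ignored
```

For example, ex:LanguageConstraintComponentUsingSELECT declares only an sh:propertyValidator, so the sketch returns that validator for property shapes and None for node shapes, in which case the constraint is ignored.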

SHACL-SPARQL includes two types of validators, based on SPARQL SELECT (for sh:nodeValidator and sh:propertyValidator) or SPARQL ASK queries (for sh:validator).

SELECT-based Validators

Validators with SHACL type sh:SPARQLSelectValidator are called SELECT-based validators. The values of sh:nodeValidator must be SELECT-based validators. The values of sh:propertyValidator must be SELECT-based validators. SELECT-based validators have exactly one value for the property sh:select. The value of sh:select is a valid SPARQL SELECT query using the aforementioned prefix handling rules. The SPARQL query derived from the value of sh:select projects the variable this in its SELECT clause.

The remainder of this section is informative.

The following example illustrates the declaration of a constraint component based on a SPARQL SELECT query. It is a generalized variation of the example from . That SPARQL query included two constants: the specific property ex:germanLabel and the language tag de. Constraint components make it possible to generalize such scenarios, so that constants get pre-bound with parameters. This allows the query logic to be reused in multiple places, without having to write any new SPARQL.

ex:LanguageConstraintComponentUsingSELECT
	a sh:ConstraintComponent ;
	rdfs:label "Language constraint component" ;
	sh:parameter [
		sh:path ex:lang ;
		sh:datatype xsd:string ;
		sh:minLength 2 ;
		sh:name "language" ;
		sh:description "The language tag, e.g. \"de\"." ;
	] ;
	sh:labelTemplate "Values are literals with language \"{$lang}\"" ;
	sh:propertyValidator [
		a sh:SPARQLSelectValidator ;
		sh:message "Values are literals with language \"{?lang}\"" ;
		sh:select """
			SELECT DISTINCT $this ?value
			WHERE {
				$this $PATH ?value .
				FILTER (!isLiteral(?value) || !langMatches(lang(?value), $lang))
			}
			"""
	] .

Once a constraint component has been declared (in a shapes graph), its parameters can be used as illustrated in the following example.

ex:LanguageExampleShape
	a sh:NodeShape ;
	sh:targetClass ex:Country ;
	sh:property [
		sh:path ex:germanLabel ;
		ex:lang "de" ;
	] ;
	sh:property [
		sh:path ex:englishLabel ;
		ex:lang "en" ;
	] .

The example shape above specifies the condition that all values of ex:germanLabel carry the language tag de while all values of ex:englishLabel have en as their language. These details are specified via two property shapes that have values for the ex:lang parameter required by the constraint component.

ASK-based Validators

Many constraint components are of the form in which all value nodes are tested individually against some boolean condition. Writing SELECT queries for these becomes burdensome, especially if a constraint component can be used for both property shapes and node shapes. SHACL-SPARQL provides an alternative, more compact syntax for validators based on ASK queries.

Validators with SHACL type sh:SPARQLAskValidator are called ASK-based validators. The values of sh:validator must be ASK-based validators. ASK-based validators have exactly one value for the property sh:ask. The value of sh:ask must be a literal with datatype xsd:string. The value of sh:ask must be a valid SPARQL ASK query using the aforementioned prefix handling rules.

The remainder of this section is informative.

The ASK queries return true if and only if a given value node (represented by the pre-bound variable value) conforms to the constraint.

The following example declares a constraint component using an ASK query.

ex:LanguageConstraintComponentUsingASK
	a sh:ConstraintComponent ;
	rdfs:label "Language constraint component" ;
	sh:parameter [
		sh:path ex:lang ;
		sh:datatype xsd:string ;
		sh:minLength 2 ;
		sh:name "language" ;
		sh:description "The language tag, e.g. \"de\"." ;
	] ;
	sh:labelTemplate "Values are literals with language \"{$lang}\"" ;
	sh:validator ex:hasLang .
	
ex:hasLang
	a sh:SPARQLAskValidator ;
	sh:message "Values are literals with language \"{$lang}\"" ;
	sh:ask """
		ASK {
			FILTER (isLiteral($value) && langMatches(lang($value), $lang))
		}
		""" .

Note that the validation condition implemented by an ASK query is "in the inverse direction" from its SELECT counterpart: ASK queries return true for value nodes that conform to the constraint, while SELECT queries return those value nodes that do not conform.
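The difference in direction can be illustrated with a small Python model that uses plain functions in place of a SPARQL engine. Everything here is an assumption of the sketch: literals are modeled as (lexical form, language tag) tuples, IRIs as plain strings, and lang_matches is a simplified stand-in for the SPARQL langMatches function.

```python
def lang_matches(tag, lang_range):
    """Simplified stand-in for SPARQL langMatches (basic filtering)."""
    if lang_range == "*":
        return tag != ""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


def ask_conforms(value, lang):
    """ASK direction: True when the value node conforms."""
    return isinstance(value, tuple) and lang_matches(value[1], lang)


def select_violations(values, lang):
    """SELECT direction: return the value nodes that do not conform."""
    return [
        v for v in values
        if not isinstance(v, tuple) or not lang_matches(v[1], lang)
    ]


values = [("Deutschland", "de"), ("Germany", "en"), "ex:SomeIRI", ("Schweiz", "de-CH")]

# With ASK, a violation is any value node for which the test is not true.
violations_from_ask = [v for v in values if not ask_conforms(v, "de")]
violations_from_select = select_violations(values, "de")
```

Both approaches flag the same value nodes; only the polarity of the condition written by the component author differs.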

Validation with SPARQL-based Constraint Components

This section defines the validator of SPARQL-based constraint components. Note that this validator only explains one possible implementation strategy, and SHACL processors may choose alternative approaches as long as the outcome is equivalent.

As the first step, a validator MUST be selected based on the rules outlined in the Validators section. Then the following rules apply, producing a set of solutions of SPARQL queries:

The SPARQL query executions above MUST pre-bind the variables this and, if supported, shapesGraph and currentShape as described in . In addition, each value of a parameter of the constraint component in the constraint MUST be pre-bound as a variable that has the parameter name as its name.

The production rules for the validation results are identical to those for SPARQL-based constraints, using the solutions QS as produced above.
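As an informal illustration of the pre-binding requirements above, the following Python sketch assembles the variable bindings for one query execution. The function name build_prebindings and the dictionary representation are assumptions of this sketch. The guard against parameters that reuse a reserved variable name is a defensive choice made here, not a statement of what this section requires.

```python
def build_prebindings(focus_node, parameter_values,
                      shapes_graph=None, current_shape=None):
    """Assemble the pre-bound variables for one query execution.

    parameter_values is a hypothetical dict from parameter name to value.
    shapes_graph and current_shape are only bound when the processor
    supports them, modeled here as passing a value other than None.
    """
    bindings = {"this": focus_node}
    if shapes_graph is not None:
        bindings["shapesGraph"] = shapes_graph
    if current_shape is not None:
        bindings["currentShape"] = current_shape
    for name, value in parameter_values.items():
        if name in ("this", "shapesGraph", "currentShape"):
            # Defensive check of this sketch only.
            raise ValueError(f"parameter name clashes with reserved variable: {name}")
        bindings[name] = value
    return bindings


bindings = build_prebindings("ex:Germany", {"lang": "de"})
```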

Appendix

Pre-binding of Variables in SPARQL Queries

Some features of SHACL-SPARQL rely on the concept of pre-binding of variables as defined in this section.

The definition of pre-binding used by SHACL requires the following restrictions on SPARQL queries. SHACL-SPARQL processors MUST report a failure when they are operating on a shapes graph that contains SHACL-SPARQL queries (via sh:select and sh:ask) that violate any of these restrictions. Note that the term potentially pre-bound variables includes the variables this, shapesGraph, currentShape, value (for ASK queries), and any variables that represent the parameters of the constraint component that uses the query.

DEFINITION: Values Insertion

For solution mapping μ, define Table(μ) to be the multiset formed from μ.

   Table(μ) = { μ }
   Card[μ] = 1

Define the Values Insertion function Replace(X, μ) to replace each occurrence Y of a Basic Graph Pattern, Property Path Expression, Graph(Var, pattern) in X with join(Y, Table(μ)).

DEFINITION: Pre-binding of variables

The evaluation of the SPARQL Query Q = (E, DS, QF) with pre-bound variables μ is defined as the evaluation of SPARQL query Q' = (Replace(E, μ), DS, QF).
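The effect of joining a pattern with Table(μ) can be illustrated with a toy Python model of the SPARQL join operator. Solution mappings are modeled as dictionaries and multisets as lists; the function names and the example data are assumptions of this sketch, which does not implement the Replace function over a real query algebra.

```python
def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(m2[var] == term for var, term in m1.items() if var in m2)


def join(omega1, omega2):
    """SPARQL algebra join: merge every compatible pair of solutions."""
    return [{**a, **b} for a in omega1 for b in omega2 if compatible(a, b)]


def table(mu):
    """Table(mu): the multiset that contains mu exactly once."""
    return [mu]


# Hypothetical solutions of the pattern { $this ex:germanLabel ?value }.
pattern_solutions = [
    {"this": "ex:Germany", "value": "Deutschland"},
    {"this": "ex:France", "value": "Frankreich"},
]

# Pre-bound variables for one execution.
mu = {"this": "ex:France", "lang": "de"}

restricted = join(pattern_solutions, table(mu))
```

Only the solution that agrees with μ on the variable this survives the join, and it also carries the binding for lang. This is how pre-binding both restricts the pattern and makes parameter values visible to expressions such as FILTER.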

Summary of SHACL Syntax Rules

This section enumerates all normative syntax rules of SHACL. This section is automatically generated from other parts of this spec and hyperlinks are provided back into the prose if the context of the rule is unclear. Nodes that violate these rules in a shapes graph are ill-formed.

Syntax Rule Id Syntax Rule Text

Security and Privacy Considerations

Like most RDF-based technologies, SHACL processors may operate on graphs that are combined from various sources. Some applications may have an open "linked data" architecture and dynamically assemble RDF triples from sources that are outside of an organization's network of trust. Since RDF allows anyone to add statements about any resource, triples may modify the originally intended semantics of shape definitions or nodes in a data graph and thus lead to misleading results. Protection against this (and the following) scenario can be achieved by only using trusted and verified RDF sources and eliminating the possibility that graphs are dynamically added via owl:imports and sh:shapesGraph.

SHACL-SPARQL includes all the security issues of SPARQL.

Acknowledgements

The original 1.0 version of SHACL was produced by the RDF Data Shapes Working Group. See its SHACL 1.0 Acknowledgements section.

Revision History

The detailed list of changes and their diffs can be found in the Git repository.