The Shapes Constraint Language (SHACL) [[!shacl]] is a language for validating RDF graphs against a set of conditions. SHACL consists of SHACL Core and SHACL-SPARQL, which covers advanced features that use SPARQL-based constraints. The syntax of SHACL is RDF.
This document defines the Compact Syntax for a subset of SHACL Core. The Compact Syntax offers an alternative notation to the general RDF-based notations for SHACL, aimed at human editors and readers.
Some examples in this document use Turtle [[!turtle]]. The reader is expected to be familiar with SHACL [[!shacl]].
Within this document, the following namespace prefix bindings are used:
Prefix | Namespace |
---|---|
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
sh: | http://www.w3.org/ns/shacl# |
xsd: | http://www.w3.org/2001/XMLSchema# |
ex: | http://example.com/ns# |
The following example illustrates key features of the SHACL Compact Syntax. It is an extended version of the Person Example from [[!shacl]].
BASE <http://example.com/ns>

IMPORTS <http://example.com/person-ontology>

PREFIX ex: <http://example.com/ns#>

shape ex:PersonShape -> ex:Person {
    closed=true ignoredProperties=[rdf:type] .

    ex:ssn xsd:string [0..1] pattern="^\\d{3}-\\d{2}-\\d{4}$" .
    ex:worksFor IRI ex:Company [0..*] .
    ex:address BlankNode [0..1] {
        ex:city xsd:string [1..1] .
        ex:postalCode xsd:integer|xsd:string [1..1] maxLength=5 .
    } .
}
Using the production rules defined in this document, this example is mapped to the following RDF graph, shown in Turtle:
@base <http://example.com/ns> .
@prefix ex: <http://example.com/ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://example.com/ns>
    rdf:type owl:Ontology ;
    owl:imports <http://example.com/person-ontology> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:closed true ;
    sh:ignoredProperties ( rdf:type ) ;
    sh:property [
        sh:path ex:ssn ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
        sh:pattern "^\\d{3}-\\d{2}-\\d{4}$" ;
    ] ;
    sh:property [
        sh:path ex:worksFor ;
        sh:class ex:Company ;
        sh:nodeKind sh:IRI ;
    ] ;
    sh:property [
        sh:path ex:address ;
        sh:maxCount 1 ;
        sh:nodeKind sh:BlankNode ;
        sh:node [
            sh:property [
                sh:path ex:city ;
                sh:datatype xsd:string ;
                sh:minCount 1 ;
                sh:maxCount 1 ;
            ] ;
            sh:property [
                sh:path ex:postalCode ;
                sh:or ( [ sh:datatype xsd:integer ] [ sh:datatype xsd:string ] ) ;
                sh:minCount 1 ;
                sh:maxCount 1 ;
                sh:maxLength 5 ;
            ] ;
        ] ;
    ] .
SHACL also supports a design pattern where shapes that are also declared to be classes apply to all instances of the class. The Compact Syntax includes the keyword shapeClass for this case, as shown in the snippet below:
shapeClass ex:Person { ... }
Compared to the example further above using shape, this would produce the following RDF triples (with no sh:targetClass triple):
ex:Person a sh:NodeShape, rdfs:Class ; ...
The SHACL Compact Syntax can be used both as an exchange format and as a temporary editing input format.
If SHACL Compact Syntax files are saved, the recommended file extension is .shaclc.
The following grammar (in ANTLR format) defines the parsing rules for the SHACL Compact Syntax. Valid SHACL Compact Syntax documents must be parseable against this grammar, must not cause any errors during the application of the production rules, and must produce no ill-formed nodes.
grammar SHACLC;

shaclDoc : directive* (nodeShape|shapeClass)* EOF;

directive : baseDecl | importsDecl | prefixDecl ;
baseDecl : KW_BASE IRIREF ;
importsDecl : KW_IMPORTS IRIREF ;
prefixDecl : KW_PREFIX PNAME_NS IRIREF ;

nodeShape : KW_SHAPE iri targetClass? nodeShapeBody ;
shapeClass : KW_SHAPE_CLASS iri nodeShapeBody ;
nodeShapeBody : '{' constraint* '}';
targetClass : '->' iri+ ;

constraint : ( nodeOr+ | propertyShape ) '.' ;
nodeOr : nodeNot ( '|' nodeNot) * ;
nodeNot : negation? nodeValue ;
nodeValue : nodeParam '=' iriOrLiteralOrArray ;

propertyShape : path ( propertyCount | propertyOr )* ;
propertyOr : propertyNot ( '|' propertyNot) * ;
propertyNot : negation? propertyAtom ;
propertyAtom : propertyType | nodeKind | shapeRef | propertyValue | nodeShapeBody ;
propertyCount : '[' propertyMinCount '..' propertyMaxCount ']' ;
propertyMinCount : INTEGER ;
propertyMaxCount : (INTEGER | '*') ;
propertyType : iri ;
nodeKind : 'BlankNode' | 'IRI' | 'Literal' | 'BlankNodeOrIRI' | 'BlankNodeOrLiteral' | 'IRIOrLiteral' ;
shapeRef : ATPNAME_LN | ATPNAME_NS | '@' IRIREF ;
propertyValue : propertyParam '=' iriOrLiteralOrArray ;
negation : '!' ;

path : pathAlternative ;
pathAlternative : pathSequence ( '|' pathSequence )* ;
pathSequence : pathEltOrInverse ( '/' pathEltOrInverse )* ;
pathElt : pathPrimary pathMod? ;
pathEltOrInverse : pathElt | pathInverse pathElt ;
pathInverse : '^' ;
pathMod : '?' | '*' | '+' ;
pathPrimary : iri | '(' path ')' ;

iriOrLiteralOrArray : iriOrLiteral | array ;
iriOrLiteral : iri | literal ;

iri : IRIREF | prefixedName ;
prefixedName : PNAME_LN | PNAME_NS ;

literal : rdfLiteral | numericLiteral | booleanLiteral ;
booleanLiteral : KW_TRUE | KW_FALSE ;
numericLiteral : INTEGER | DECIMAL | DOUBLE ;
rdfLiteral : string (LANGTAG | '^^' datatype)? ;
datatype : iri ;
string : STRING_LITERAL_LONG1 | STRING_LITERAL_LONG2 | STRING_LITERAL1 | STRING_LITERAL2 ;

array : '[' iriOrLiteral* ']' ;

nodeParam : 'targetNode' | 'targetObjectsOf' | 'targetSubjectsOf' |
    'deactivated' | 'severity' | 'message' |
    'class' | 'datatype' | 'nodeKind' |
    'minExclusive' | 'minInclusive' | 'maxExclusive' | 'maxInclusive' |
    'minLength' | 'maxLength' | 'pattern' | 'flags' | 'languageIn' |
    'equals' | 'disjoint' |
    'closed' | 'ignoredProperties' | 'hasValue' | 'in' ;

propertyParam : 'deactivated' | 'severity' | 'message' |
    'class' | 'datatype' | 'nodeKind' |
    'minExclusive' | 'minInclusive' | 'maxExclusive' | 'maxInclusive' |
    'minLength' | 'maxLength' | 'pattern' | 'flags' | 'languageIn' | 'uniqueLang' |
    'equals' | 'disjoint' | 'lessThan' | 'lessThanOrEquals' |
    'qualifiedValueShape' | 'qualifiedMinCount' | 'qualifiedMaxCount' | 'qualifiedValueShapesDisjoint' |
    'closed' | 'ignoredProperties' | 'hasValue' | 'in' ;

// Keywords
KW_BASE : 'BASE' ;
KW_IMPORTS : 'IMPORTS' ;
KW_PREFIX : 'PREFIX' ;
KW_SHAPE_CLASS : 'shapeClass' ;
KW_SHAPE : 'shape' ;
KW_TRUE : 'true' ;
KW_FALSE : 'false' ;

// Terminals
PASS : [ \t\r\n]+ -> skip;
COMMENT : '#' ~[\r\n]* -> skip;

IRIREF : '<' (~[\u0000-\u0020=<>\"{}|^`\\] | UCHAR)* '>' ;
PNAME_NS : PN_PREFIX? ':' ;
PNAME_LN : PNAME_NS PN_LOCAL ;
ATPNAME_NS : '@' PN_PREFIX? ':' ;
ATPNAME_LN : '@' PNAME_NS PN_LOCAL ;
LANGTAG : '@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)* ;
INTEGER : [+-]? [0-9]+ ;
DECIMAL : [+-]? [0-9]* '.' [0-9]+ ;
DOUBLE : [+-]? ([0-9]+ '.' [0-9]* EXPONENT | '.'? [0-9]+ EXPONENT) ;
fragment EXPONENT : [eE] [+-]? [0-9]+ ;
STRING_LITERAL1 : '\'' (~[\u0027\u005C\u000A\u000D] | ECHAR | UCHAR)* '\'' ;
STRING_LITERAL2 : '"' (~[\u0022\u005C\u000A\u000D] | ECHAR | UCHAR)* '"' ;
STRING_LITERAL_LONG1: '\'\'\'' (('\'' | '\'\'')? (~[\'\\] | ECHAR | UCHAR))* '\'\'\'' ;
STRING_LITERAL_LONG2: '"""' (('"' | '""')? (~[\"\\] | ECHAR | UCHAR))* '"""' ;
fragment UCHAR : '\\u' HEX HEX HEX HEX | '\\U' HEX HEX HEX HEX HEX HEX HEX HEX ;
fragment ECHAR : '\\' [tbnrf\\\"\'] ;
fragment WS : [\u0020\u0009\u000D\u000A] ;
fragment PN_CHARS_BASE: [A-Z] | [a-z] | [\u00C0-\u00D6] | [\u00D8-\u00F6] | [\u00F8-\u02FF] | [\u0370-\u037D] | [\u037F-\u1FFF] | [\u200C-\u200D] | [\u2070-\u218F] | [\u2C00-\u2FEF] | [\u3001-\uD7FF] | [\uF900-\uFDCF] | [\uFDF0-\uFFFD] ;
fragment PN_CHARS_U : PN_CHARS_BASE | '_' ;
fragment PN_CHARS : PN_CHARS_U | '-' | [0-9] | [\u00B7] | [\u0300-\u036F] | [\u203F-\u2040] ;
fragment PN_PREFIX : PN_CHARS_BASE ((PN_CHARS | '.')* PN_CHARS)? ;
fragment PN_LOCAL : (PN_CHARS_U | ':' | [0-9] | PLX) ((PN_CHARS | '.' | ':' | PLX)* (PN_CHARS | ':' | PLX))? ;
fragment PLX : PERCENT | PN_LOCAL_ESC ;
fragment PERCENT : '%' HEX HEX ;
fragment HEX : [0-9] | [A-F] | [a-f] ;
fragment PN_LOCAL_ESC: '\\' ('_' | '~' | '.' | '-' | '!' | '$' | '&' | '\'' | '(' | ')' | '*' | '+' | ',' | ';' | '=' | '/' | '?' | '#' | '@' | '%') ;
A parser for the SHACL Compact Syntax receives as input a (text) document plus an optional base URI, which is used as the initial value for the variable ?baseURI.
The parser uses a prefix mapping which has initial mappings for the following namespaces:
Prefix | Namespace |
---|---|
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
sh: | http://www.w3.org/ns/shacl# |
xsd: | http://www.w3.org/2001/XMLSchema# |
It then produces a new RDF graph with the triples produced by the following rules.
baseDecl: Set ?baseURI to the IRI specified by IRIREF.
importsDecl: Add the IRI specified by IRIREF to a set ?imports.
prefixDecl: Add to the prefix mapping a mapping from the prefix PNAME_NS (without the ':') to the namespace specified by IRIREF.
Once the whole document has been processed, produce a triple ?baseURI rdf:type owl:Ontology using the final value of ?baseURI.
Report an error if ?baseURI has no value but ?imports is not empty.
For each ?iri in ?imports, produce a triple ?baseURI owl:imports ?iri.
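The end-of-document step for the directives can be sketched in Python as follows. This is a minimal illustration, not part of the specification: triples are modelled as plain tuples of strings, the function name ontology_triples is hypothetical, and returning no triples when there is neither a base URI nor any import is an assumption, since the rules above only define the error case.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL = "http://www.w3.org/2002/07/owl#"


def ontology_triples(base_uri, imports):
    """Return the ontology triples produced after all directives were handled.

    base_uri: final value of ?baseURI (a string) or None if it has no value.
    imports:  the IRIs collected in ?imports by the importsDecl rule.
    """
    imports = list(imports)
    if base_uri is None:
        if imports:
            # The rules require an error in this situation.
            raise ValueError("IMPORTS was used but ?baseURI has no value")
        # Assumption: without a base URI there is no subject to describe.
        return []
    triples = [(base_uri, RDF_TYPE, OWL + "Ontology")]
    for iri in imports:
        triples.append((base_uri, OWL + "imports", iri))
    return triples
```

Called with the base URI and the single import of the Person example, the function returns the rdf:type triple followed by one owl:imports triple.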
nodeShape: Produce a triple ?shape rdf:type sh:NodeShape where ?shape is derived from the iri using the iri rule. Use ?shape as context shape for the targetClass and nodeShapeBody.
shapeClass: Produce the triples ?shape rdf:type sh:NodeShape and ?shape rdf:type rdfs:Class where ?shape is derived from the iri using the iri rule. Use ?shape as context shape for the nodeShapeBody.
targetClass: For each iri, produce a triple ?shape sh:targetClass ?iri where ?iri is derived from iri.
nodeShapeBody: Handle each constraint using the context shape ?shape.
constraint: Handle each nodeOr or propertyShape using the context shape ?shape.
nodeOr: If there is more than one nodeNot, produce an RDF list ?or where for each nodeNot there is a new blank node, and that blank node is used as context shape for the nodeNot. Then produce a triple ?shape sh:or ?or. If there is only one nodeNot, handle the nodeNot using the context shape ?shape.
nodeNot: If there is a negation, produce a new blank node ?not and produce a triple ?shape sh:not ?not. Then handle the nodeValue using ?not as context shape. If there is no negation, handle the nodeValue using the context shape ?shape.
nodeValue: Produce a triple ?shape ?predicate ?object where ?predicate is the IRI produced by concatenating the sh namespace with the string value of nodeParam (for example, "minLength" becomes sh:minLength), and ?object is derived from the iriOrLiteralOrArray.
propertyShape: Using a new blank node ?property, produce a triple ?shape sh:property ?property. Produce a triple ?property sh:path ?path where ?path is the result of path. Use ?property as context shape for propertyCount and propertyOr.
propertyCount: If propertyMinCount is not "0", produce a triple ?property sh:minCount ?minCount using the xsd:integer derived from propertyMinCount as ?minCount. If propertyMaxCount is not "*", produce a triple ?property sh:maxCount ?maxCount using the xsd:integer derived from propertyMaxCount as ?maxCount.
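The propertyCount rule can be sketched in Python as shown below. This is an illustration under stated assumptions: the helper name count_triples is hypothetical, the regular expression accepts only unsigned digits (a simplification of the INTEGER terminal, which also allows a sign), and counts are represented as Python integers rather than xsd:integer literals.

```python
import re

SH = "http://www.w3.org/ns/shacl#"

# Simplified form of: '[' propertyMinCount '..' propertyMaxCount ']'
COUNT_RE = re.compile(r"\[\s*(\d+)\s*\.\.\s*(\d+|\*)\s*\]")


def count_triples(prop, text):
    """Translate a count such as '[0..1]' into sh:minCount / sh:maxCount triples."""
    match = COUNT_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a property count: {text!r}")
    min_count, max_count = match.groups()
    triples = []
    if min_count != "0":  # a minimum of 0 is the default and yields no triple
        triples.append((prop, SH + "minCount", int(min_count)))
    if max_count != "*":  # '*' means unbounded and yields no triple
        triples.append((prop, SH + "maxCount", int(max_count)))
    return triples
```

This matches the Person example: [0..1] on ex:ssn yields only sh:maxCount 1, and [0..*] on ex:worksFor yields no count triples at all.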
propertyOr: If there is more than one propertyNot, produce an RDF list ?or where for each propertyNot there is a new blank node, and that blank node is used as context shape for the propertyNot. Then produce a triple ?property sh:or ?or. If there is only one propertyNot, handle the propertyNot using the context shape ?property.
propertyNot: If there is a negation, produce a new blank node ?not and produce a triple ?property sh:not ?not. Then handle the propertyAtom using ?not as context shape. If there is no negation, handle the propertyAtom using the context shape ?property.
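The way the context shape is passed down through propertyOr and propertyNot can be sketched in Python as follows. All names are hypothetical, and several simplifications are assumed: each propertyAtom is reduced to a (parameter name, value) pair, blank nodes are generated as labelled strings, and the RDF list ?or is represented as a Python tuple of its members instead of rdf:first/rdf:rest triples.

```python
import itertools

SH = "http://www.w3.org/ns/shacl#"


def make_bnode_factory():
    """Return a function that creates fresh blank node labels."""
    counter = itertools.count()
    return lambda: f"_:b{next(counter)}"


def property_not(context, negated, atom, triples, new_bnode):
    """Handle one propertyNot; 'atom' is a simplified (param, value) pair."""
    if negated:
        not_node = new_bnode()
        triples.append((context, SH + "not", not_node))
        context = not_node  # the atom now applies to the negated shape
    param, value = atom
    triples.append((context, SH + param, value))


def property_or(prop, alternatives, triples, new_bnode):
    """Handle one propertyOr; 'alternatives' is a list of (negated, atom)."""
    if len(alternatives) == 1:
        negated, atom = alternatives[0]
        property_not(prop, negated, atom, triples, new_bnode)
        return
    members = []
    for negated, atom in alternatives:
        member = new_bnode()  # each alternative gets its own context shape
        members.append(member)
        property_not(member, negated, atom, triples, new_bnode)
    # Simplification: a tuple stands in for the RDF list ?or.
    triples.append((prop, SH + "or", tuple(members)))
```

For xsd:integer|xsd:string from the Person example, this produces one blank node per alternative, each carrying its own sh:datatype triple, joined by a single sh:or triple on the property shape.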
propertyAtom: Use ?property as context shape for any of the child elements. For a nested nodeShapeBody, produce a new blank node ?node and use that as the context shape ?shape. Then produce a triple ?property sh:node ?node.
propertyType: Let ?iri be the IRI derived from the propertyType using the iri rule. If ?iri is one of the RDF datatypes supported by SPARQL 1.1 (such as xsd:string), produce a triple ?property sh:datatype ?iri; otherwise produce a triple ?property sh:class ?iri.
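The choice between sh:datatype and sh:class in the propertyType rule can be sketched in Python as follows. The set of datatypes below is an assumption made for illustration only: it lists a few common XSD datatypes and is not claimed to be the complete set of RDF datatypes supported by SPARQL 1.1. The name property_type_triple is hypothetical.

```python
SH = "http://www.w3.org/ns/shacl#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: an illustrative subset, not the authoritative SPARQL 1.1 list.
ASSUMED_DATATYPES = frozenset(
    XSD + name
    for name in ("string", "boolean", "integer", "decimal", "float", "double", "dateTime")
)


def property_type_triple(prop, iri, datatypes=ASSUMED_DATATYPES):
    """Return the triple for a propertyType: sh:datatype for a known datatype, else sh:class."""
    predicate = "datatype" if iri in datatypes else "class"
    return (prop, SH + predicate, iri)
```

With this set, xsd:string maps to sh:datatype while ex:Company maps to sh:class, as in the Person example. A real implementation would need to settle the exact datatype set, since that decision changes the generated graph.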
nodeKind: Produce a triple ?property sh:nodeKind ?nodeKind where ?nodeKind is the IRI produced by concatenating the sh namespace with the text value of nodeKind (e.g., sh:Literal).
shapeRef: Produce a triple ?property sh:node ?node where ?node is the IRI derived from the substring of shapeRef after the '@' character using the iri rule.
propertyValue: Produce a triple ?property ?predicate ?object where ?predicate is the IRI produced by concatenating the sh namespace with the string value of propertyParam, and ?object is derived from the iriOrLiteralOrArray.
path: If there is more than one pathSequence, produce a new RDF list ?list where there is one list member for the path derived from each pathSequence. Then produce a triple ?alt sh:alternativePath ?list where ?alt is a new blank node, and return ?alt. If there is only one pathSequence, return the path derived from pathSequence.
pathSequence: If there is more than one pathEltOrInverse, produce a new blank node RDF list ?list where there is one list member for the path derived from each pathEltOrInverse. Return ?list. If there is only one pathEltOrInverse, return the path derived from pathEltOrInverse.
pathEltOrInverse: If there is a pathInverse, produce a triple ?path sh:inversePath ?inverse where ?path is a new blank node and ?inverse is the path derived from pathElt. Return ?path. If there is no pathInverse, return the path derived from pathElt.
pathElt: Let ?primary be the path derived from pathPrimary. If pathMod does not exist, return ?primary. Otherwise, produce and return a new blank node ?path with one of the following triples:
If pathMod equals "?", produce a triple ?path sh:zeroOrOnePath ?primary.
If pathMod equals "+", produce a triple ?path sh:oneOrMorePath ?primary.
If pathMod equals "*", produce a triple ?path sh:zeroOrMorePath ?primary.
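The pathElt and pathEltOrInverse rules above can be sketched in Python as follows. The function names are hypothetical, blank nodes are modelled as labelled strings, and the already-derived pathPrimary is passed in as a plain value; alternative and sequence paths are outside the scope of this sketch.

```python
import itertools

SH = "http://www.w3.org/ns/shacl#"

# Mapping from pathMod to the local name of the SHACL path predicate.
PATH_MOD = {"?": "zeroOrOnePath", "+": "oneOrMorePath", "*": "zeroOrMorePath"}

_counter = itertools.count()


def new_bnode():
    """Create a fresh blank node label."""
    return f"_:path{next(_counter)}"


def path_elt(primary, path_mod, triples):
    """Return the path for a pathElt, wrapping 'primary' if a pathMod is present."""
    if path_mod is None:
        return primary
    node = new_bnode()
    triples.append((node, SH + PATH_MOD[path_mod], primary))
    return node


def path_elt_or_inverse(primary, path_mod, inverse, triples):
    """Return the path for a pathEltOrInverse; 'inverse' is True if '^' was given."""
    elt = path_elt(primary, path_mod, triples)
    if not inverse:
        return elt
    node = new_bnode()
    triples.append((node, SH + "inversePath", elt))
    return node
```

For an input such as ^ex:knows*, the modifier is applied first and the inverse wraps the result, so two blank nodes and two triples are produced.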
pathPrimary: If iri exists, return the predicate derived from that IRI. Otherwise, return the path derived from path.
iriOrLiteralOrArray: If there is an array, produce and return an RDF list where each iriOrLiteral is a member. Otherwise, return the node derived from the iriOrLiteral.
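Several rules above produce RDF lists. A minimal Python sketch of building such a list from already-derived members is shown below; the names rdf_list and make_bnode_factory are hypothetical, triples are plain tuples, and blank nodes are labelled strings. Mapping an empty array to rdf:nil follows the usual RDF collection convention and is assumed here rather than stated by the rules.

```python
import itertools

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def make_bnode_factory():
    """Return a function that creates fresh blank node labels."""
    counter = itertools.count()
    return lambda: f"_:l{next(counter)}"


def rdf_list(members, triples, new_bnode):
    """Append rdf:first/rdf:rest triples for 'members' and return the list head."""
    members = list(members)
    if not members:
        return RDF + "nil"  # assumption: empty array becomes rdf:nil
    nodes = [new_bnode() for _ in members]
    for index, (node, member) in enumerate(zip(nodes, members)):
        triples.append((node, RDF + "first", member))
        is_last = index + 1 == len(nodes)
        rest = RDF + "nil" if is_last else nodes[index + 1]
        triples.append((node, RDF + "rest", rest))
    return nodes[0]
```

For ignoredProperties=[rdf:type] this yields a one-element list whose head is then used as the object of the sh:ignoredProperties triple.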
iriOrLiteral: If there is an iri, return the node derived from iri. Otherwise, apply Turtle's parsing rules to turn the literal into an RDF literal.
iri: If there is an IRIREF, return the result of IRIREF. Otherwise, return an IRI by applying the current prefix mapping to prefixedName. Report an error if there is no matching prefix.
IRIREF: Return the IRI consisting of the substring of IRIREF between the leading < and the trailing >, turning relative IRIs into absolute IRIs using the current ?baseURI.
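The iri and IRIREF rules can be sketched in Python using the standard library's urllib.parse.urljoin for relative IRI resolution. The function names are hypothetical, and the sketch makes simplifying assumptions: UCHAR escapes inside IRIREF and PN_LOCAL_ESC escapes inside prefixed names are not processed, and an IRIREF is returned unchanged when no base URI is available.

```python
from urllib.parse import urljoin

# Initial prefix mapping listed in this document.
DEFAULT_PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sh": "http://www.w3.org/ns/shacl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def resolve_iriref(token, base_uri=None):
    """Strip '<' and '>' from an IRIREF token and resolve it against base_uri."""
    if not (token.startswith("<") and token.endswith(">")):
        raise ValueError(f"not an IRIREF: {token!r}")
    inner = token[1:-1]
    # Assumption: without a base URI the IRI is returned as written.
    return urljoin(base_uri, inner) if base_uri else inner


def expand_prefixed_name(name, prefixes):
    """Expand a prefixed name such as 'ex:Person' using the prefix mapping."""
    prefix, colon, local = name.partition(":")
    if not colon or prefix not in prefixes:
        raise KeyError(f"no matching prefix for {name!r}")
    return prefixes[prefix] + local


def resolve_iri(token, prefixes, base_uri=None):
    """Dispatch between the IRIREF and prefixedName alternatives of the iri rule."""
    if token.startswith("<"):
        return resolve_iriref(token, base_uri)
    return expand_prefixed_name(token, prefixes)
```

A caller would typically start from a copy of DEFAULT_PREFIXES, add the mappings from each prefixDecl, and pass the current ?baseURI on every call so that a later BASE directive takes effect.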
Developers may find the Test Cases useful.
Each test consists of a .shaclc file and an associated .ttl file.
Parsing the .shaclc file must produce a graph that is isomorphic to the .ttl file.
The test cases are not normative.
This section has been reviewed, approved, and registered with IANA by the Internet Engineering Steering Group (IESG); see https://www.iana.org/assignments/media-types/text/shaclc.
Revealing the structure of an RDF graph can reveal information about the content of conformant data. For instance, a schema with a predicate to describe cancer stage indicates that conforming graphs describe patients with cancer.
The process of testing a graph's conformance to a schema could draw significant system resources and be a vector for Denial of Service attacks.
RDF Turtle security considerations about IRI spoofing may also apply here.
SHACL Compact Syntax documents may contain the strings PREFIX or BASE (case sensitive) near the beginning of the document.
However, the same words may appear in Turtle and SPARQL documents.
This document is heavily inspired by the ShEx Compact Syntax, a version of which was provided as input to the RDF Data Shapes Working Group. The IANA Considerations section has been adapted from the same document.
The ShEx Compact Syntax was primarily developed by the following people:
Eric Prud'hommeaux, Iovka Boneva, Jose Labra, Harold Solbrig