Copyright © 2014 W3C. This document is available under the W3C Document License. See the W3C Intellectual Rights Notice and Legal Disclaimers for additional information.
Shape Expressions associate RDF graphs with labeled patterns called "shapes". Shapes can be used for validation, documentation and transformation of RDF data.
Shape Expressions express formal constraints on the content of RDF graphs and is intended to be used to validate RDF documents, communicate expected graph patterns for interfaces and to generate forms and validation code. This document describes the formal semantics of the Shape Expressions language through the use of the Z Specification Language. The accompanying Shape Expressions Primer provides an informal introduction to the language.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document describes the Shape Expressions langauge developed as a community effort. It is being submitted to W3C so that it can inform the development of a future RDF Data Shape specification.
By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.
This version has no official status. It corrects multiple Z Notation rendering errors which were caused by manually translating LaTeX to HTML. The Z Notation rendering in this version was generated by automatically translating LaTeX to HTML using XSLT. These XSLT scripts are available in the w3c/ShEx project hosted on GitHub: https://github.com/w3c/ShEx.
This version only corrects the Z Notation rendering errors present in the original W3C Member Submission. Stylistic and semantic errors remain.
The Shape Expressions Language (ShEx) is used to specify formal constraints on the content of RDF graphs and are intended to be used to validate RDF documents, communicate expected graph patterns for interfaces and to generate forms and validation code. ShEx can be used to:
Information about the use, grammar and syntax of ShEx can be found at http://www.w3.org/2013/ShEx/. This document describes the formal semantics of the ShEx language using the Z specification language, beginning with a Z specification of the characteristics of an RDF Graph that are referenced by ShEx.
The output from ExtractZed.xsl produces text parsable with fuzz -d
.
The formal definition for an RDF graph in RDF 1.1 Concepts and Abstract Syntax[RDF11-CONCEPTS]:
“An RDF graph is a set of RDF Triples”
“An RDF triple consists of three components:
- the subject, which is an IRI or a blank node
- the predicate, which is an IRI
- the object, which is an IRI, a literal or a blank node”
“... IRIs, literals and blank nodes are distinct and distinguishable.”
provides a definition in Z:
Graph ==ℙTriple |
\begin{zed} Graph == \power Triple \end{zed}
The ShEx language treats IRIs and blank nodes as primitive types, which are defined as Z free types:
[IRI, BlankNode] |
\begin{zed} [IRI, BlankNode] \end{zed}
The ShEx language can express constraints on both the type and content of literals, which are modeled separately:
“A literal in an RDF graph consists of two or three elements:
This is modelled by String and LanguageTag as free types:
[String, LanguageTag] |
\begin{zed} [String, LanguageTag] \end{zed}and using them in the definition the two flavors of RDFLiteral, plain literal and typed literal:
TypedLiteral≙ |
[lexicalForm:String ; dataType:IRI | |
dataType≠ XSD_String] |
PlainLiteral≙ |
[lexicalForm:String ; dataType:IRI; langTag:LanguageTag | |
dataType = XSD_String] |
RDFLiteral ::= pl《PlainLiteral》| tl《TypedLiteral》 |
\begin{zed} TypedLiteral \defs \\ \t1 [lexicalForm:String ; dataType:IRI | \\ \t2 dataType \neq XSD\_String] \also PlainLiteral \defs \\ \t1 [lexicalForm:String ; dataType:IRI; langTag:LanguageTag | \\ \t2 dataType = XSD\_String] \also RDFLiteral ::= pl \ldata PlainLiteral \rdata | tl \ldata TypedLiteral \rdata \end{zed}RDFTerm is defined as:
“IRIs, literals and blank nodes are collectively known as RDF terms
RDFTerm ::= |
iri《IRI》| |
literal《RDFLiteral》| |
bnode《BlankNode》 |
\begin{zed} RDFTerm ::= \\ \t1 iri \ldata IRI \rdata | \\ \t1 literal \ldata RDFLiteral \rdata | \\ \t1 bnode \ldata BlankNode \rdata \end{zed}The definition of RDF Triple is modelled as a tuple consisting of three constrained RDFTerms:
Triple | ||
s,p,o:RDFTerm | ||
iri⋅s∈ IRI∨ bnode⋅s∈ BlankNode | ||
iri⋅p∈ IRI | ||
iri⋅o∈ IRI∨ bnode⋅o∈ BlankNode∨ literal⋅o∈ RDFLiteral | ||
\begin{schema}{Triple} s,p,o:RDFTerm \where iri \entryFor s \in IRI \lor bnode \entryFor s \in BlankNode \\ iri \entryFor p \in IRI \\ iri \entryFor o \in IRI \lor bnode \entryFor o \in BlankNode \lor literal \entryFor o \in RDFLiteral \end{schema}
The ShEx language uses the following functions:
– return set of triples in a graph triples whose subject is a given RDFTerm
triplesForSubject:RDFTerm→ Graph→ Graph | ||
∀ subj:RDFTerm; g:Graph • triplesForSubject subj g = {t:g | t.s = subj } | ||
\begin{gendef} triplesForSubject:RDFTerm \fun Graph \fun Graph \where \forall subj:RDFTerm; g:Graph @ triplesForSubject~subj~g = \{t:g | t.s = subj \} \end{gendef}
– return set of triples in a graph triples whose object is a given RDFTerm
triplesForObject:RDFTerm→ Graph→ Graph | ||
∀ obj:RDFTerm; g:Graph • triplesForObject obj g = {t:g | t.o = obj } | ||
\begin{gendef} triplesForObject:RDFTerm \fun Graph \fun Graph \where \forall obj:RDFTerm; g:Graph @ triplesForObject~obj~g = \{t:g | t.o = obj \} \end{gendef}
The following URIs are referenced explicitly in the ShEx implementation:
XSD_String:IRI | |
SHEX_IRI:IRI | |
SHEX_BNODE:IRI | |
SHEX_LITERAL:IRI | |
SHEX_NONLITERAL:IRI |
\begin{axdef} XSD\_String:IRI \\ SHEX\_IRI:IRI \\ SHEX\_BNODE:IRI \\ SHEX\_LITERAL:IRI \\ SHEX\_NONLITERAL:IRI \end{axdef}
The above definitions of Graph, Triple, RDFTerm and their components will now be used to describe the relationship between a ShEx Schema and an RDF graph.
A Shape Expression Schema is a collection of labeled rules where exactly one rule in the collection is identified as the outermost or “starting” rule. In addition, any rule that is referenced within the Schema is also itself a member of the Schema Formally:
Schema | ||
rules:Label⇸Rule | ||
start:Label | ||
start∈ domrules | ||
∀ r:ranrules • | ||
(r∈ rangroup⇒ (group⋅r).rule∈ domrules)∧ | ||
(r∈ ranand⇒ ran(and⋅r)⊆ domrules)∧ | ||
(r∈ ranxor⇒ ran(xor⋅r)⊆ domrules)∧ | ||
(r∈ ranarc∧ (arc⋅r).valueSpec∈ ranvalueRef⇒ | ||
(valueRef⋅(arc⋅r).valueSpec)∈ domrules) | ||
\begin{schema}{Schema} rules:Label \pfun Rule \\ start:Label \where start \in \dom rules \\ \forall r:\ran rules @ \\ \t1 (r \in \ran group \implies (group \entryFor r).rule \in \dom rules) \land \\ \t1 (r \in \ran and \implies \ran (and \entryFor r) \subseteq \dom rules) \land \\ \t1 (r \in \ran xor \implies \ran (xor \entryFor r) \subseteq \dom rules) \land \\ \t1 (r \in \ran arc \land (arc \entryFor r).valueSpec \in \ran valueRef \implies \\ \t2 (valueRef \entryFor (arc \entryFor r).valueSpec) \in \dom rules) \end{schema}
While existing ShEx implementations define a rule Label as being either an IRI or a BlankNode, the type of Label does not impact the evaluation semantics. For our purposes, we can simply define it as a separate free type:
[Label] |
\begin{zed} [Label] \end{zed}
The validity of a given RDF Graph is determined by taking the start Rule of a ShEx Schema and a reference IRI and evaluating the validity of the Rule against the supplied graph.
Formally, the evaluate function takes a Schema, a Graph and a reference IRI and, if the start Rule in the Schema, in the context of the starting Schema and graph, returns either nomatch (z) or pass (p) then the function returns pass. In all other cases, the function returns fail (f ).
evaluate:Schema→ Graph→ IRI→ OptValidity | ||
∀ s:Schema; g:Graph; i:IRI; v:OptValidity; ec:EvalContext | | ||
ec.graph = g∧ ec.schema = s • | ||
evaluate s g i = | ||
ifevalRule ec (iri i) (s.rules s.start)∈ {z,p} | ||
thenpelsef | ||
\begin{gendef} evaluate:Schema \fun Graph \fun IRI \fun OptValidity \where \forall s:Schema; g:Graph; i:IRI; v:OptValidity; ec:EvalContext | \\ \t1 ec.graph = g \land ec.schema = s @ \\ \t2 evaluate ~s ~g ~i = \\ \t3 \IF evalRule ~ec ~ (iri~ i)~ (s.rules~ s.start) \in \{\nomatch, \pass\} \\ \t3 \THEN \pass \ELSE \fail \end{gendef}
A ShEx Rule is a set of constraints that can be evaluated against a reference RDFTerm in the context of a given Schema and RDF Graph:
EvalContext≙[schema:Schema; graph:Graph] |
\begin{zed} EvalContext \defs [schema:Schema; graph:Graph] \end{zed}
Formally, the evalRule function takes an EvalContext, a reference RDFTerm and a Rule and returns one of the following:
OptValidity ::=p|f|z|∅|ε |
\begin{zed} OptValidity ::= \pass | \fail | \nomatch | \none | \error \\ \end{zed}A Rule can take one of five forms. Each will be formally described later in this document, but informally they are:
Rule ::= |
arc《ArcRule》| |
rarc《RevArcRule》| |
group《GroupRule》| |
and《AndRule》| |
xor《XorRule》 |
\begin{zed} Rule ::= \\ \t1 arc \ldata ArcRule \rdata | \\ \t1 rarc \ldata RevArcRule \rdata | \\ \t1 group \ldata GroupRule \rdata | \\ \t1 and \ldata AndRule \rdata | \\ \t1 xor \ldata XorRule \rdata \\ \end{zed}
evalRule:EvalContext→ RDFTerm→ Rule→ OptValidity | ||
∀ ec:EvalContext; i:RDFTerm; r:Rule • evalRule ec i r = | ||
ifr∈ ranarc | ||
thenevalArcRule ec i (arc⋅r) | ||
elseifr∈ ranrarc | ||
thenevalRevArcRule ec i (rarc⋅r) | ||
elseifr∈ rangroup | ||
thenevalGroupRule ec i (group⋅r) | ||
elseifr∈ ranand | ||
thenevalAndRule ec i (and⋅r) | ||
else | ||
evalXorRule ec i (xor⋅r) | ||
\begin{gendef} evalRule:EvalContext \fun RDFTerm \fun Rule \fun OptValidity \where \forall ec:EvalContext; i:RDFTerm; r:Rule @ evalRule~ec~i~r = \\ \t1 \IF r \in \ran arc \\ \t2 \THEN evalArcRule~ec~i~(arc \entryFor r) \\ \t1 \ELSE \IF r \in \ran rarc \\ \t2 \THEN evalRevArcRule~ec~i~(rarc \entryFor r) \\ \t1 \ELSE \IF r \in \ran group \\ \t2 \THEN evalGroupRule~ec~i~(group \entryFor r) \\ \t1 \ELSE \IF r \in \ran and \\ \t2 \THEN evalAndRule~ec~i~(and \entryFor r) \\ \t1 \ELSE \\ \t2 evalXorRule~ec~i~(xor \entryFor r) \end{gendef}The evalRule′ function de-references the supplied Label and invokes evalRule with the result. This is not explicitly represented because the Z specification language does not allow cyclic dependencies. This function is undefined if Label is not in EvalContext
evalRule':EvalContext→ RDFTerm→ Label→ OptValidity | ||
∀ ec:EvalContext; l:Label • l∈ domec.schema.rules | ||
\begin{gendef} evalRule':EvalContext \fun RDFTerm \fun Label \fun OptValidity \where \forall ec:EvalContext; l:Label @ l \in \dom ec.schema.rules \end{gendef}
The ArcRule is used to select the subset of the graph having a given predicate or predicates and to determine whether the cardinality and/or “type” of this subset matches a supplied criteria. The rule itself consists of a PredicateFilter to select the triples, an ObjectSpecification to evaluate the result, an optional min and max cardinality and a (possibly empty) set of Actions:
ArcRule | ||
filter:PredicateFilter | ||
valueSpec:ObjectSpecification | ||
min, max:optional[ℕ] | ||
actions:ℙAction | ||
(# min = 1∧ # max = 1)⇒ value min≤value max | ||
\begin{schema}{ArcRule} filter:PredicateFilter \\ valueSpec:ObjectSpecification \\ min, max:\optional[\nat] \\ actions:\power Action \where (\# min = 1 \land \# max = 1) \implies value~ min \leq value~ max \end{schema}ArcRule evaluation consists of:
evalArcRule:EvalContext→ RDFTerm→ ArcRule→ OptValidity | ||
∀ ec:EvalContext; s:RDFTerm; ar:ArcRule; sg:Graph | | ||
sg = evalPredicateFilter ar.filter (triplesForSubject s ec.graph) • | ||
evalArcRule ec s ar = | ||
ifevalCardinality sg ar.min ar.max≠ p | ||
thenevalCardinality sg ar.min ar.max | ||
elseifevalObjectSpecification ec ar.valueSpec sg≠ p | ||
thenevalObjectSpecification ec ar.valueSpec sg | ||
else | ||
dispatch ar.actions sg ec | ||
\begin{gendef} evalArcRule:EvalContext \fun RDFTerm \fun ArcRule \fun OptValidity \where \forall ec:EvalContext; s:RDFTerm; ar:ArcRule; sg:Graph | \\ \t1 sg = evalPredicateFilter~ar.filter~(triplesForSubject~ s~ ec.graph) @ \\ \t2 evalArcRule~ec~s~ar = \\ \t3 \IF evalCardinality~sg~ar.min~ar.max \neq \pass \\ \t4 \THEN evalCardinality~sg~ar.min~ar.max \\ \t3 \ELSE \IF evalObjectSpecification~ec~ar.valueSpec~sg \neq \pass \\ \t4 \THEN evalObjectSpecification~ec~ar.valueSpec~sg \\ \t3 \ELSE \\ \t4 dispatch~ ar.actions~ sg~ ec \end{gendef}
A PredicateFilter can be one of:
IRIorStem ::= iosi《IRI》| ioss《IRIstem》 |
PredicateFilter ::= pfIRI《IRIorStem》| pfWild《ℙIRIorStem》 |
\begin{zed} IRIorStem ::= iosi \ldata IRI \rdata | ioss \ldata IRIstem \rdata \\ PredicateFilter ::= pfIRI \ldata IRIorStem \rdata | pfWild \ldata \power IRIorStem \rdata \end{zed}An IRIstem matches any IRI whose stringified representation begins with the stringified representation of IRIstem according to standard IRI matching rules [RFC3987]. This is represented by the function:
[IRIstem] |
\begin{zed} [IRIstem] \end{zed}
IRIstemRange:IRIstem⇸ℙIRI | ||
\begin{gendef} IRIstemRange:IRIstem \pfun \power IRI \end{gendef}evalPredicateFilter returns all of the triples in a Graph whose predicate matches the supplied PredicateFilter:
evalPredicateFilter:PredicateFilter⇸Graph⇸Graph | ||
∀ f:PredicateFilter; g:Graph • evalPredicateFilter f g = | ||
iff∈ ranpfIRIthenevalIRIorStem (pfIRI⋅f) g | ||
elseevalWild (pfWild⋅f) g | ||
\begin{gendef} evalPredicateFilter:PredicateFilter \pfun Graph \pfun Graph \where \forall f:PredicateFilter; g:Graph @ evalPredicateFilter~ f ~g = \\ \t1 \IF f \in \ran pfIRI \THEN evalIRIorStem~ (pfIRI \entryFor f)~ g \\ \t1 \ELSE evalWild~ (pfWild \entryFor f)~ g \end{gendef}evalIRIorStem returns all of the triples in a Graph matching the supplied IRIorStem
evalIRIorStem:IRIorStem⇸Graph⇸Graph | ||
∀ e:IRIorStem; g:Graph • evalIRIorStem e g = | ||
ife∈ raniosithen{t:g | iri⋅t.p = iosi⋅e } | ||
else{t:g | iri⋅t.p∈ IRIstemRange (ioss⋅e) } | ||
\begin{gendef} evalIRIorStem:IRIorStem \pfun Graph \pfun Graph \where \forall e:IRIorStem; g:Graph @ evalIRIorStem~ e ~g = \\ \t1 \IF e \in \ran iosi \THEN \{t:g | iri \entryFor t.p = iosi \entryFor e \} \\ \t1 \ELSE \{t:g | iri \entryFor t.p \in IRIstemRange~ (ioss \entryFor e) \} \end{gendef}evalWild returns all of the triples in a Graph that do not match an entry in the set of IRIorStems.
evalWild:ℙIRIorStem⇸Graph⇸Graph | ||
∀ es:ℙIRIorStem; g:Graph • evalWild es g = | ||
{t:g | t∉ ⋃{e:es • evalIRIorStem e g} } | ||
\begin{gendef} evalWild:\power IRIorStem \pfun Graph \pfun Graph \where \forall es:\power IRIorStem; g:Graph @ evalWild ~es ~g = \\ \t1 \{t:g | t \notin \bigcup \{e:es @ evalIRIorStem~e~g\} \} \end{gendef}
ObjectSpecification specifies a set of possible values for an RDFTerm and takes one of the following forms:
MatchValue ::= |
mviri《IRI》| |
mviris《IRIstem》| |
mvlit《RDFLiteral》 |
ObjectSpecification ::= |
valueType《IRI》| |
valueSet《ℙMatchValue》| |
osWild《ℙMatchValue》| |
valueRef《Label》 |
\begin{zed} MatchValue ::= \\ \t1 mviri \ldata IRI \rdata | \\ \t1 mviris \ldata IRIstem \rdata | \\ \t1 mvlit \ldata RDFLiteral \rdata \also ObjectSpecification ::= \\ \t1 valueType \ldata IRI \rdata | \\ \t1 valueSet \ldata \power MatchValue \rdata | \\ \t1 osWild \ldata \power MatchValue \rdata | \\ \t1 valueRef \ldata Label \rdata \end{zed}
– evaluates the cardinality the supplied graph.
evalCardinality:Graph→ optional[ℕ]→ optional[ℕ]→ OptValidity | ||
∀ g:Graph; min, max:optional[ℕ] • evalCardinality g min max = | ||
if# min = 1∧ # g = 0∧ value min = 0 | ||
thenz | ||
elseif# g = 0 | ||
then∅ | ||
elseif(# min = 1∧ # g < value min)∨ | ||
(# max = 1∧ # g > value max) | ||
thenf | ||
elsep | ||
\begin{gendef} evalCardinality:Graph \fun \optional[\nat] \fun \optional[\nat] \fun OptValidity \where \forall g:Graph; min, max:\optional[\nat] @ evalCardinality~g~min~max = \\ \t1 \IF \# min = 1 \land \# g = 0 \land value~ min = 0 \\ \t2 \THEN \nomatch \\ \t1 \ELSE \IF \# g = 0 \\ \t2 \THEN \none \\ \t1 \ELSE \IF (\# min = 1 \land \# g < value~ min) \lor \\ \t3 (\# max = 1 \land \# g > value~ max) \\ \t2 \THEN \fail \\ \t1 \ELSE \pass \end{gendef}
evalObjectSpecification – returns pass (p) if all of the triples in a Graph match the supplied ObjectSpecification, otherwise fail (f )
evalObjectSpecification:EvalContext→ ObjectSpecification→ Graph→ | ||
OptValidity | ||
∀ ec:EvalContext; os:ObjectSpecification; g:Graph • | ||
evalObjectSpecification ec os g = | ||
if∀ t:g • evalObjectSpecificationTriple ec os t.o =p | ||
thenp | ||
elsef | ||
\begin{gendef} evalObjectSpecification:EvalContext \fun ObjectSpecification \fun Graph \fun \\ \t1 OptValidity \where \forall ec:EvalContext; os:ObjectSpecification; g:Graph @ \\ evalObjectSpecification~ec~os~g = \\ \t1 \IF \forall t:g @ evalObjectSpecificationTriple~ec~os~t.o = \pass \\ \t2 \THEN \pass \\ \t2 \ELSE \fail \end{gendef}
evalObjectSpecificationTriple:EvalContext→ ObjectSpecification→ | ||
RDFTerm→ OptValidity | ||
∀ ec:EvalContext; os:ObjectSpecification; n:RDFTerm • | ||
evalObjectSpecificationTriple ec os n = | ||
ifos∈ ranvalueTypethen | ||
evalValueType (valueType⋅os) n | ||
elseifos∈ ranvalueSetthen | ||
evalTermSet (valueSet⋅os) n | ||
elseifos∈ ranosWildthen | ||
evalTermWild (osWild⋅os) n | ||
else | ||
evalTermReference ec (valueRef⋅os) n | ||
\begin{gendef} evalObjectSpecificationTriple:EvalContext \fun ObjectSpecification \fun \\ \t1 RDFTerm \fun OptValidity \where \forall ec:EvalContext; os:ObjectSpecification; n:RDFTerm @ \\ \t1 evalObjectSpecificationTriple~ec~os~n = \\ \t2 \IF os \in \ran valueType \THEN \\ \t3 evalValueType~(valueType \entryFor os)~n \\ \t2 \ELSE \IF os \in \ran valueSet \THEN \\ \t3 evalTermSet~(valueSet \entryFor os)~n \\ \t2 \ELSE \IF os \in \ran osWild \THEN \\ \t3 evalTermWild~(osWild \entryFor os)~n \\ \t2 \ELSE \\ \t3 evalTermReference~ec~(valueRef \entryFor os)~n \end{gendef}
evalValueType – returns pass if the supplied RDFTerm is:
evalValueType:IRI⇸RDFTerm⇸OptValidity | ||
∀ vt:IRI; n:RDFTerm; l:RDFLiteral • evalValueType vt n = | ||
ifvt = SHEX_IRI∧ n∈ ranirithenp | ||
elseifvt = SHEX_BNODE∧ n∈ ranbnodethenp | ||
elseifvt = SHEX_NONLITERAL∧ (n∈ raniri∨ n∈ ranbnode)thenp | ||
elseifvt = SHEX_LITERAL∧ n∈ ranliteralthenp | ||
elseifn∈ ranliteral∧ l = (literal⋅n)∧ | ||
((l∈ ranpl∧ (pl⋅l).dataType = vt)∨ | ||
(l∈ rantl∧ (tl⋅l).dataType = vt))thenp | ||
elsef | ||
\begin{gendef} evalValueType:IRI \pfun RDFTerm \pfun OptValidity \where \forall vt:IRI; n:RDFTerm; l:RDFLiteral @ evalValueType~vt~ n = \\ \t1 \IF vt = SHEX\_IRI \land n \in \ran iri \THEN \pass \\ \t1 \ELSE \IF vt = SHEX\_BNODE \land n \in \ran bnode \THEN \pass \\ \t1 \ELSE \IF vt = SHEX\_NONLITERAL \land (n \in \ran iri \lor n \in \ran bnode) \THEN \pass \\ \t1 \ELSE \IF vt = SHEX\_LITERAL \land n \in \ran literal \THEN \pass \\ \t1 \ELSE \IF n \in \ran literal \land l = (literal \entryFor n) \land \\ \t2 ((l \in \ran pl \land (pl \entryFor l).dataType = vt) \lor \\ \t2 (l \in \ran tl \land (tl \entryFor l).dataType = vt)) \THEN \pass \\ \t1 \ELSE \fail \end{gendef}
evalTermSet – return p if the supplied RDFTerm is a member of MatchValue
evalTermSet:ℙMatchValue⇸RDFTerm⇸OptValidity | ||
∀ mvs:ℙMatchValue; n:RDFTerm • evalTermSet mvs n = | ||
if∃ mv:mvs • | ||
((mv∈ ranmviri∧ n∈ raniri∧ (iri⋅n) = mviri⋅mv)∨ | ||
(mv∈ ranmviris∧ n∈ raniri∧ | ||
(iri⋅n)∈ IRIstemRange (mviris⋅mv) )∨ | ||
(n∈ ranliteral∧ mvlit⋅mv = literal⋅n)) | ||
thenp | ||
elsef | ||
\begin{gendef} evalTermSet:\power MatchValue \pfun RDFTerm \pfun OptValidity \where \forall mvs:\power MatchValue; n:RDFTerm @ evalTermSet~mvs~n = \\ \t1 \IF \exists mv:mvs @ \\ \t2 ((mv \in \ran mviri \land n \in \ran iri \land (iri \entryFor n) = mviri \entryFor mv) \lor \\ \t2 (mv \in \ran mviris \land n \in \ran iri \land \\ \t3 (iri \entryFor n) \in IRIstemRange ~ (mviris \entryFor mv) ) \lor \\ \t2 (n \in \ran literal \land mvlit \entryFor mv = literal \entryFor n)) \\ \t1 \THEN \pass \\ \t1 \ELSE \fail \end{gendef}
evalTermWild – return pass (p) if the supplied RDFTerm is not a member of MatchValue.
evalTermWild:ℙMatchValue→ RDFTerm→ OptValidity | ||
∀ mvs:ℙMatchValue; n:RDFTerm • evalTermWild mvs n = | ||
ifevalTermSet mvs n =pthenfelsep | ||
\begin{gendef} evalTermWild:\power MatchValue \fun RDFTerm \fun OptValidity \where \forall mvs:\power MatchValue; n:RDFTerm @ evalTermWild~mvs~n = \\ \t1 \IF evalTermSet~ mvs~ n = \pass \THEN \fail \ELSE \pass \end{gendef}
evalTermReference – return p if the subgraph of the EvalContext graph whose subjects match the supplied RDFTerm satisfies the ValueReference rule.
evalTermReference:EvalContext⇸Label⇸RDFTerm⇸OptValidity | ||
∀ ec:EvalContext; vr:Label; n:RDFTerm • | ||
evalTermReference ec vr n = | ||
ifn∉ ranliteralthenevalRule' ec n vr | ||
elsef | ||
\begin{gendef} evalTermReference:EvalContext \pfun Label \pfun RDFTerm \pfun OptValidity \where \forall ec:EvalContext; vr:Label; n:RDFTerm @ \\ \t1 evalTermReference~ec~vr~n = \\ \t2 \IF n \notin \ran literal \THEN evalRule'~ec~n~vr \\ \t2 \ELSE \fail \end{gendef}
The RevArcRule is used to select the subset of the graph having a given predicate or predicates and to determine whether the cardinality and/or “type” of this subset matches a supplied criteria. The rule itself consists of a PredicateFilter to select the triples, an SubjectSpecification to evaluate the result, a optional min and max cardinality and a (possibly empty) set of Actions:
RevArcRule | ||
filter:PredicateFilter | ||
valueSpec:SubjectSpecification | ||
min, max:optional[ℕ] | ||
actions:ℙAction | ||
(# min = 1∧ # max = 1)⇒ value min≤value max | ||
\begin{schema}{RevArcRule} filter:PredicateFilter \\ valueSpec:SubjectSpecification \\ min, max:\optional[\nat] \\ actions:\power Action \where (\# min = 1 \land \# max = 1) \implies value~ min \leq value~ max \end{schema}RevArcRule evaluation consists of:
evalRevArcRule:EvalContext→ RDFTerm→ RevArcRule→ OptValidity | ||
∀ ec:EvalContext; o:RDFTerm; rar:RevArcRule; og:Graph | | ||
og = evalPredicateFilter rar.filter (triplesForObject o ec.graph) • | ||
evalRevArcRule ec o rar = | ||
ifevalCardinality og rar.min rar.max≠ p | ||
thenevalCardinality og rar.min rar.max | ||
elseifevalSubjectSpecification ec rar.valueSpec og≠ p | ||
thenevalSubjectSpecification ec rar.valueSpec og | ||
else | ||
dispatch rar.actions og ec | ||
\begin{gendef} evalRevArcRule:EvalContext \fun RDFTerm \fun RevArcRule \fun OptValidity \where \forall ec:EvalContext; o:RDFTerm; rar:RevArcRule; og:Graph | \\ \t1 og = evalPredicateFilter~rar.filter~(triplesForObject~ o~ ec.graph) @ \\ \t2 evalRevArcRule~ec~o~rar = \\ \t3 \IF evalCardinality~og~rar.min~rar.max \neq \pass \\ \t4 \THEN evalCardinality~og~rar.min~rar.max \\ \t3 \ELSE \IF evalSubjectSpecification~ec~rar.valueSpec~og \neq \pass \\ \t4 \THEN evalSubjectSpecification~ec~rar.valueSpec~og \\ \t3 \ELSE \\ \t4 dispatch~ rar.actions~ og~ ec \end{gendef}
SubjectSpecification specifies a set of possible values for an RDFTerm and takes one of the following forms:
SubjectSpecification ::= |
subjectSet《ℙMatchValue》| |
ssWild《ℙMatchValue》| |
subjectRef《Label》 |
\begin{zed} SubjectSpecification ::= \\ \t1 subjectSet \ldata \power MatchValue \rdata | \\ \t1 ssWild \ldata \power MatchValue \rdata | \\ \t1 subjectRef \ldata Label \rdata \end{zed}
paragraphevalSubjectSpecification – returns pass (p) if all of the triples in a Graph match the supplied SubjectSpecification, otherwise fail (f )
evalSubjectSpecification:EvalContext→ SubjectSpecification→ Graph→ | ||
OptValidity | ||
∀ ec:EvalContext; ss:SubjectSpecification; g:Graph • | ||
evalSubjectSpecification ec ss g = | ||
if∀ t:g • evalSubjectSpecificationTriple ec ss t.o =p | ||
thenp | ||
elsef | ||
\begin{gendef} evalSubjectSpecification:EvalContext \fun SubjectSpecification \fun Graph \fun \\ \t1 OptValidity \where \forall ec:EvalContext; ss:SubjectSpecification; g:Graph @ \\ \t1 evalSubjectSpecification~ec~ss~g = \\ \t2 \IF \forall t:g @ evalSubjectSpecificationTriple~ec~ss~t.o = \pass \\ \t2 \THEN \pass \\ \t2 \ELSE \fail \end{gendef}
evalSubjectSpecificationTriple:EvalContext→ SubjectSpecification→ | ||
RDFTerm→ OptValidity | ||
∀ ec:EvalContext; ss:SubjectSpecification; n:RDFTerm • | ||
evalSubjectSpecificationTriple ec ss n = | ||
ifss∈ ransubjectSetthen | ||
evalTermSet (subjectSet⋅ss) n | ||
elseifss∈ ranssWildthen | ||
evalTermWild (ssWild⋅ss) n | ||
else | ||
evalTermReference ec (subjectRef⋅ss) n | ||
\begin{gendef} evalSubjectSpecificationTriple:EvalContext \fun SubjectSpecification \fun \\ \t1 RDFTerm \fun OptValidity \where \forall ec:EvalContext; ss:SubjectSpecification; n:RDFTerm @ \\ \t1 evalSubjectSpecificationTriple~ec~ss~n = \\ \t2 \IF ss \in \ran subjectSet \THEN \\ \t3 evalTermSet~(subjectSet \entryFor ss)~n \\ \t2 \ELSE \IF ss \in \ran ssWild \THEN \\ \t3 evalTermWild~(ssWild \entryFor ss)~n \\ \t2 \ELSE \\ \t3 evalTermReference~ec~(subjectRef \entryFor ss)~n \end{gendef}
A GroupRule serves two purposes. The first is to declare that a referenced rule is to be treated as “optional”, which, in this case means that if (a) the referenced rule returned none (∅), (meaning an ArcRule was encountered that had no matching predicates and a non-zero minimum cardinality) the group rule returns nomatch (z). An optional GroupRule also treats an error situation as a fail (f ).
The second purpose of a group rule is to allow a set of external actions to be evaluated whenever the referenced rule returns pass (p).
OPT ::= OPTIONAL | REQUIRED |
GroupRule≙[rule:Label; opt:OPT; actions:ℙAction ] |
\begin{zed} OPT ::= OPTIONAL | REQUIRED \also GroupRule \defs [rule:Label; opt:OPT; actions:\power Action ] \\ \end{zed}evalGroupRule evaluates Rule, applies opt and, if the result is pass (p), evaluates the actions with respect to the passing Graph
evalGroupRule:EvalContext→ RDFTerm→ GroupRule→ OptValidity | ||
∀ ec:EvalContext; i:RDFTerm; gr:GroupRule • | ||
evalGroupRule ec i gr = | ||
ifevalRule' ec i gr.rule =∅∧ gr.opt = OPTIONAL | ||
thenz | ||
elseifevalRule' ec i gr.rule =ε∧ gr.opt = OPTIONAL | ||
thenf | ||
elseifevalRule' ec i gr.rule =p | ||
thendispatch gr.actions ∅ ec | ||
else | ||
evalRule' ec i gr.rule | ||
\begin{gendef} evalGroupRule:EvalContext \fun RDFTerm \fun GroupRule \fun OptValidity \where \forall ec:EvalContext; i:RDFTerm; gr:GroupRule @ \\ \t1 evalGroupRule~ ec~ i ~ gr = \\ \t2 \IF evalRule'~ec~i~gr.rule = \none \land gr.opt = OPTIONAL \\ \t3 \THEN \nomatch \\ \t2 \ELSE \IF evalRule'~ec~i~gr.rule = \error \land gr.opt = OPTIONAL \\ \t3 \THEN \fail \\ \t2 \ELSE \IF evalRule'~ec~i~gr.rule = \pass \\ \t3 \THEN dispatch~ gr.actions~ \emptyset ~ ec \\ \t2 \ELSE \\ \t3 evalRule'~ec~i~gr.rule \end{gendef}
An AndRule consists of a set of one or more Rules, whose evaluation is determined by
the following table:
And | ∅ | z | f | p | ε |
---|---|---|---|---|---|
∅ | ∅1 | ∅1 | f1 | f1 | ε1 |
z | ∅1 | z1 | f1 | p1 | ε1 |
f | f1 | f1 | f1 | f1 | ε1 |
p | f1 | p1 | f1 | p1 | ε1 |
ε | ε1 | ε1 | ε1 | ε1 | ε1 |
The formal implementation of which will be realized in the corresponding function:
And:OptValidity→ OptValidity→ OptValidity | ||
\begin{gendef} And:OptValidity \fun OptValidity \fun OptValidity \end{gendef}
Observing that the above table is a monoid with nomatch (z) as an identity element, evalAndRule can be applied using the standard functional pattern:
AndRule ==seq1Label |
\begin{zed} AndRule == \seq_1 Label \end{zed}
evalAndRule:EvalContext→ RDFTerm→ AndRule→ OptValidity | ||
∀ ec:EvalContext; i:RDFTerm; r:AndRule • | ||
evalAndRule ec i r = | ||
foldr And z (map (evalRule' ec i) r) | ||
\begin{gendef} evalAndRule:EvalContext \fun RDFTerm \fun AndRule \fun OptValidity \where \forall ec:EvalContext; i:RDFTerm; r:AndRule @ \\ \t1 evalAndRule~ec~i~r = \\ \t2 foldr~ And~ \nomatch ~ (map~(evalRule'~ec~i)~r) \end{gendef}
An XorRule consists of a set of one or more Rules, whose evaluation is determined by
the following table:
Or | ∅ | z | f | p | ε |
---|---|---|---|---|---|
∅ | ∅1 | z1 | ∅1 | p1 | ε1 |
z | z1 | z1 | z1 | p1 | ε1 |
f | ∅1 | z1 | f1 | p1 | ε1 |
p | p1 | p1 | p1 | ε23 | ε1 |
ε | ε1 | ε1 | ε1 | ε1 | ε1 |
Xor:OptValidity→ OptValidity→ OptValidity | ||
\begin{gendef} Xor:OptValidity \fun OptValidity \fun OptValidity \end{gendef}As with the And function above, Xor is a monoid whose identity is fail (f ) resulting in the following definition for evalXorRule
XorRule ==seq1Label |
\begin{zed} XorRule == \seq_1 Label \end{zed}
evalXorRule:EvalContext→ RDFTerm→ XorRule→ OptValidity | ||
∀ ec:EvalContext; i:RDFTerm; r:XorRule • | ||
evalXorRule ec i r = | ||
foldr Xor f (map (evalRule' ec i) r) | ||
\begin{gendef} evalXorRule:EvalContext \fun RDFTerm \fun XorRule \fun OptValidity \where \forall ec:EvalContext; i:RDFTerm; r:XorRule @ \\ \t1 evalXorRule~ec~i~r = \\ \t2 foldr~ Xor~ \fail ~ (map~(evalRule'~ ec ~i)~ r) \end{gendef}
The dispatch function allows the evaluation / execution of arbitrary external “Actions”. While the evaluation of an Action can (obviously) have side effects outside the context of the ShEx environment, it must be side effect free within the execution context. In particular, an Action may not change anything in the EvalContext The action dispatcher exists to allow external events to happen. Parameters:
The dispatch function usually returns pass (p) or fail (f ), although there may also be cases for other OptValidity values in certain circumstances. The dispatch function always returns pass (p) if the set of actions is empty.
[Action] |
\begin{zed} [Action] \end{zed}
dispatch:ℙAction→ Graph→ EvalContext→ OptValidity | ||
∀ as:ℙAction; g:Graph; ec:EvalContext • | ||
as =∅⇒ dispatch as g ec =p | ||
\begin{gendef} dispatch:\power Action \fun Graph \fun EvalContext \fun OptValidity \where \forall as:\power Action; g:Graph; ec:EvalContext @ \\ \t1 as = \emptyset \implies dispatch~ as~ g ~ec = \pass \end{gendef}
The semantics of the productions and terminals from the Turtle specification are defined in Turtle Terse RDF Triple Language section 7 Parsing.
The following grammar rules are mapped to the abstract syntax:
Productions: | |||
[1] | ShExDoc |
::= | statement* |
[2] | statement |
::= | directive
| shape |
[3] | directive |
::= | sparqlPrefix
| sparqlBase
| "start" "=" ( label | typeSpec CODE* ) let b = a fresh blank node typeSpec.setLabel(b) new ArcRule() |
[4] | sparqlPrefix |
::= | "PREFIX" PNAME_NS IRIREF |
[5] | sparqlBase |
::= | "BASE" IRIREF |
[6] | shape |
::= | label typeSpec CODE* if (CODE.SIZE > 0) let r = new GroupRule(false, typeSpec) r.label = label else typeSpec.label = label |
[7] | typeSpec |
::= | "{" OrExpression "}" |
[8] | OrExpression |
::= | l:AndExpression ( "|" r:AndExpression )* if (r) new XorRule(l, r) else l |
[9] | AndExpression |
::= | l:UnaryExpression ( "," r:UnaryExpression )* if (r) new AndRule(l, r) else l |
[10] | UnaryExpression |
::= | arc |
| "(" OrExpression ")" opt:"?"? CODE* new GroupRule(OrExpression, opt, CODE) |
|||
[11] | label |
::= | iri
| BlankNode |
[12] | arc |
::= | nameClass valueSpec default? repeatCount? properties? CODE* new ArcRule(nameClass, valueSpec, default, repeatCount, properties, CODE) |
[13] | nameClass |
::= | iriStem pfIRI(iriStem) |
| "." exclusions pfWild(exclusions) |
|||
| 'a' pfIRI(rdf:type) |
|||
[14] | valueSpec |
::= | "@" label ValueReference(label) |
| typeSpec let b = a fresh blank node typeSpec.setLabel(b) |
|||
| valueSet new ValueSet(valueSet) |
|||
| object new ValueType(object) |
|||
| exclusions new ValueWild(exclusions) |
|||
[15] | iriStem |
::= | iri ("~")? |
[16] | default |
::= | "=" object |
[17] | properties |
::= | "[" iri object ( ";" ( iri object )? )* "]" |
[18] | exclusions |
::= | ( "-" iriStem )* |
[19] | repeatCount |
::= | "*"
| "+"
| "?"
| "{" INTEGER ( "," (INTEGER)? )? "}" |
[20] | valueSet |
::= | "(" (object)+ ")" |
[21] | object |
::= | iriStem
| BlankNode
| literal |
[22] | literal |
::= | RDFLiteral
| NumericLiteral
| BooleanLiteral |
[23] | NumericLiteral |
::= | INTEGER
| DECIMAL
| DOUBLE |
[24] | RDFLiteral |
::= | String ( LANGTAG | "^^" iri )? |
[25] | BooleanLiteral |
::= | "true"
| "false" |
[26] | String |
::= | STRING_LITERAL1
| STRING_LITERAL2
| STRING_LITERAL_LONG1
| STRING_LITERAL_LONG2 |
[27] | iri |
::= | IRIREF
| PrefixedName |
[28] | PrefixedName |
::= | PNAME_LN
| PNAME_NS |
[29] | BlankNode |
::= | BLANK_NODE_LABEL
| ANON |
Terminals Not in SPARQL: | |||
[30] | <CODE > |
::= | "%" ( [a-zA-Z+#_] ([a-zA-Z0-9+#_])* )? "{" ( [^%\\] | "\\" "%" )* "%" "}" |
A Shape Expressions schema with no semantic actions can be treated as unordered. In practice, error messages can be much clearer, and extension functions much simpler to write if the engine follows the lexical order when validating the members of a conjunction. Likewise, form and interface code generation will be more predictable and controllable if the Shape Expressions are processed in a specific order.
The foldr function is the standard functional pattern, which takes a binary function of type T, an identity function for type T, a sequence of type T and returns the result of applying the function to the right to left pairs of the sequence.
[T] | ||
foldr:(T→ T→ T)→ T→ seqT→ T | ||
∀ f:T→ T→ T; id:T; xs:seqT • | ||
foldr f id xs = | ||
ifxs =〈〉thenid | ||
elsef (head xs) (foldr f id (tail xs)) | ||
\begin{gendef}[T] foldr:(T \fun T \fun T) \fun T \fun \seq T \fun T \where \forall f:T \fun T \fun T; id:T; xs:\seq T @ \\ \t1 foldr~f~id~xs = \\ \t2 \IF xs = \langle \rangle \THEN id \\ \t2 \ELSE f~(head~xs)~(foldr~f~id~(tail~xs)) \end{gendef}
The map function takes a function from type A to type B and applies it to all members in the supplied sequence
[A,B] | ||
map:(A→ B)→ seqA→ seqB | ||
∀ f:A→ B; xs:seqA • | ||
map f xs = | ||
ifxs =〈〉then〈〉 | ||
else〈f (head xs)〉⌒map f (tail xs) | ||
\begin{gendef}[A,B] map:(A \fun B) \fun \seq A \fun \seq B \where \forall f:A \fun B; xs:\seq A @ \\ \t1 map~f~xs = \\ \t2 \IF xs = \langle \rangle \THEN \langle \rangle \\ \t2 \ELSE \langle f~(head~xs) \rangle \cat map~f~(tail~xs) \end{gendef}
Z uses the notion of free type definitions in the form:
FreeType ::= constructor《source》 |
\begin{syntax} FreeType ::= constructor \ldata source \rdata \end{syntax}which introduces a collection of constants of type FreeType, one for each element of the set source. constructor is an injective function from source to FreeType:
constructor:source↣FreeType |
\begin{axdef} constructor:source \inj FreeType \end{axdef}In the models that follow, there is a need to reverse this – to find the source for a given FreeType instance. The ⋅ function exists for this purpose. As an example, if one were to define:
TravelDirections ::= |
bus《BusDirections》| |
walking《WalkingDirections》 |
\begin{syntax} TravelDirections ::= \\ \t1 bus \ldata BusDirections \rdata | \\ \t1 walking \ldata WalkingDirections \rdata \end{syntax}If one is supplied with an instance of TravelDirections, one can convert it to the appropriate type by:
x∈ TravelDirections |
ifx∈ ranbusthenbus⋅xelsewalking⋅x |
\begin{syntax} x \in TravelDirections \where \IF x \in \ran bus \THEN bus \entryFor x \ELSE walking \entryFor x \end{syntax}
[X, Y] | ||
_⋅_:(X↣Y)× Y⇸X | ||
∀ f:X↣Y; y:Y • | ||
f⋅y = (μ x:domf | f x = y) | ||
\begin{gendef}[X, Y] \_ \entryFor \_ :(X \inj Y) \cross Y \pfun X \where \forall f:X \inj Y; y:Y @ \\ \t1 f \entryFor y = (\mu x:\dom f | f~x = y) \end{gendef}
One way to represent optional values is a set with one member. We take that route here and introduce a bit of syntactic sugar to show our intent:
optional[T] == { s:ℙT | # s≤1 } |
\begin{zed} \optional[T] == \{ s:\power T \mid \# s \leq 1 \} \\ \end{zed}
And a shorthand for addressing the content:
[T] | ||
value:ℙT⇸T | ||
∀ s:ℙT • value s = (μ e:T | e∈ s) | ||
\begin{gendef}[T] value:\power T \pfun T \where \forall s:\power T @ value~ s = (\mu e:T | e \in s) \end{gendef}