W3C W3C Member Submission

Shape Expressions 1.0 Definition

W3C Member Submission 2 June 2014 V1.0.1

This version:
http://w3c.github.io/ShEx/V1.0.1/Definition.html
Latest published version:
http://www.w3.org/Submission/shex-defn/
Previous version:
http://www.w3.org/Submission/2014/SUBM-shex-defn-20140602/
Authors:
Harold Solbrig,
Eric Prud'hommeaux,
Editor:
Arthur Ryman,

Abstract

Shape Expressions associate RDF graphs with labeled patterns called "shapes". Shapes can be used for validation, documentation and transformation of RDF data.

Shape Expressions express formal constraints on the content of RDF graphs and is intended to be used to validate RDF documents, communicate expected graph patterns for interfaces and to generate forms and validation code. This document describes the formal semantics of the Shape Expressions language through the use of the Z Specification Language. The accompanying Shape Expressions Primer provides an informal introduction to the language.


Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document describes the Shape Expressions langauge developed as a community effort. It is being submitted to W3C so that it can inform the development of a future RDF Data Shape specification.

By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.

V1.0.1

This version has no official status. It corrects multiple Z Notation rendering errors which were caused by manually translating LaTeX to HTML. The Z Notation rendering in this version was generated by automatically translating LaTeX to HTML using XSLT. These XSLT scripts are available in the w3c/ShEx project hosted on GitHub: https://github.com/w3c/ShEx.

This version only corrects the Z Notation rendering errors present in the original W3C Member Submission. Stylistic and semantic errors remain.

1 Introduction
2 The RDF Data Model in Z
3 Shape Expression Evaluation
4 Rule Evaluation
5 Action evaluation
6 Parsing Rules
7 Appendix
References

1. Introduction

The Shape Expressions Language (ShEx) is used to specify formal constraints on the content of RDF graphs and are intended to be used to validate RDF documents, communicate expected graph patterns for interfaces and to generate forms and validation code. ShEx can be used to:

Information about the use, grammar and syntax of ShEx can be found at http://www.w3.org/2013/ShEx/. This document describes the formal semantics of the ShEx language using the Z specification language, beginning with a Z specification of the characteristics of an RDF Graph that are referenced by ShEx.

The output from ExtractZed.xsl produces text parsable with fuzz -d .

2 The RDF Data Model in Z

The formal definition for an RDF graph in RDF 1.1 Concepts and Abstract Syntax[RDF11-CONCEPTS]:

“An RDF graph is a set of RDF Triples

“An RDF triple consists of three components:

  • the subject, which is an IRI or a blank node
  • the predicate, which is an IRI
  • the object, which is an IRI, a literal or a blank node

“... IRIs, literals and blank nodes are distinct and distinguishable.”

provides a definition in Z:

Graph ==Triple
\begin{zed}
Graph == \power Triple
\end{zed}
        

The ShEx language treats IRIs and blank nodes as primitive types, which are defined as Z free types:

[IRI, BlankNode]
\begin{zed}
[IRI, BlankNode]
\end{zed}
        

The ShEx language can express constraints on both the type and content of literals, which are modeled separately:

“A literal in an RDF graph consists of two or three elements:

  • a lexical form, being a Unicode string...
  • a datatype IRI, being an IRI
  • if and only if the datatype IRI is
    http://www.w3.org/1999/02/22-rdf-syntax-ns#langString, a non-empty language tag as defined in [BCP47]. The language tag MUST be well-formed according to section 2.2.9 of [BCP47].”

This is modelled by String and LanguageTag as free types:

[String, LanguageTag]
\begin{zed}
[String, LanguageTag]
\end{zed}
        
and using them in the definition the two flavors of RDFLiteral, plain literal and typed literal:
TypedLiteral
      [lexicalForm:String ; dataType:IRI |
            dataTypeXSD_String]
PlainLiteral
      [lexicalForm:String ; dataType:IRI; langTag:LanguageTag |
            dataType = XSD_String]
RDFLiteral ::= plPlainLiteral| tlTypedLiteral
\begin{zed}
TypedLiteral \defs \\
\t1	[lexicalForm:String ; dataType:IRI | \\
\t2		dataType \neq XSD\_String]
\also
PlainLiteral \defs \\
\t1	[lexicalForm:String ; dataType:IRI; langTag:LanguageTag | \\
\t2		dataType = XSD\_String]
\also
RDFLiteral ::= pl \ldata PlainLiteral \rdata | tl \ldata TypedLiteral \rdata
\end{zed}
        
RDFTerm is defined as:

IRIs, literals and blank nodes are collectively known as RDF terms

RDFTerm ::=
      iriIRI|
      literalRDFLiteral|
      bnodeBlankNode

\begin{zed}
RDFTerm ::= \\
\t1	iri \ldata IRI \rdata | \\
\t1	literal \ldata RDFLiteral \rdata | \\
\t1	bnode \ldata BlankNode \rdata
\end{zed}
        
The definition of RDF Triple is modelled as a tuple consisting of three constrained RDFTerms:
Triple
s,p,o:RDFTerm
irisIRIbnodesBlankNode
iripIRI
irioIRIbnodeoBlankNodeliteraloRDFLiteral
\begin{schema}{Triple}
   s,p,o:RDFTerm
\where
   iri \entryFor s \in IRI \lor bnode \entryFor s \in BlankNode \\
   iri \entryFor p \in IRI \\
   iri \entryFor o \in IRI \lor bnode \entryFor o \in BlankNode \lor literal \entryFor o \in RDFLiteral
\end{schema}
        

2.1 RDF Access Functions

The ShEx language uses the following functions:

2.1.1 triplesForSubject

– return set of triples in a graph triples whose subject is a given RDFTerm

triplesForSubject:RDFTermGraphGraph
subj:RDFTerm; g:Graph triplesForSubject subj g = {t:g | t.s = subj }
\begin{gendef}
   triplesForSubject:RDFTerm \fun Graph \fun Graph
\where
   \forall subj:RDFTerm; g:Graph @ triplesForSubject~subj~g = \{t:g | t.s = subj \}
\end{gendef}
		

2.1.2 triplesForObject

– return set of triples in a graph triples whose object is a given RDFTerm

triplesForObject:RDFTermGraphGraph
obj:RDFTerm; g:Graph triplesForObject obj g = {t:g | t.o = obj }
\begin{gendef}
   triplesForObject:RDFTerm \fun Graph \fun Graph
\where
   \forall obj:RDFTerm; g:Graph @ triplesForObject~obj~g = \{t:g | t.o = obj \}
\end{gendef}
		

2.2 Well Known URIs

The following URIs are referenced explicitly in the ShEx implementation:

XSD_String:IRI
SHEX_IRI:IRI
SHEX_BNODE:IRI
SHEX_LITERAL:IRI
SHEX_NONLITERAL:IRI

\begin{axdef}
XSD\_String:IRI \\
SHEX\_IRI:IRI \\
SHEX\_BNODE:IRI \\
SHEX\_LITERAL:IRI \\
SHEX\_NONLITERAL:IRI
\end{axdef}
        

The above definitions of Graph, Triple, RDFTerm and their components will now be used to describe the relationship between a ShEx Schema and an RDF graph.

3 Shape Expression Evaluation

A Shape Expression Schema is a collection of labeled rules where exactly one rule in the collection is identified as the outermost or “starting” rule. In addition, any rule that is referenced within the Schema is also itself a member of the Schema Formally:

Schema
rules:LabelRule
start:Label
startdomrules
r:ranrules
      (rrangroup(groupr).ruledomrules)
      (rranandran(andr)domrules)
      (rranxorran(xorr)domrules)
      (rranarc(arcr).valueSpecranvalueRef
            (valueRef(arcr).valueSpec)domrules)

\begin{schema}{Schema}
   rules:Label \pfun Rule \\
   start:Label
\where
   start \in \dom rules \\
\forall r:\ran rules @ \\
\t1 (r \in \ran group \implies (group \entryFor r).rule \in \dom rules) \land \\
\t1 (r \in \ran and \implies \ran (and \entryFor r) \subseteq \dom rules) \land \\
\t1 (r \in \ran xor \implies \ran (xor \entryFor r) \subseteq \dom rules) \land \\ 
\t1 (r \in \ran arc \land (arc \entryFor r).valueSpec \in \ran valueRef \implies \\
\t2 (valueRef \entryFor (arc \entryFor r).valueSpec) \in \dom rules)
\end{schema}
		

While existing ShEx implementations define a rule Label as being either an IRI or a BlankNode, the type of Label does not impact the evaluation semantics. For our purposes, we can simply define it as a separate free type:

[Label]
\begin{zed}
[Label]
\end{zed}
		

The validity of a given RDF Graph is determined by taking the start Rule of a ShEx Schema and a reference IRI and evaluating the validity of the Rule against the supplied graph.

Formally, the evaluate function takes a Schema, a Graph and a reference IRI and, if the start Rule in the Schema, in the context of the starting Schema and graph, returns either nomatch (z) or pass (p) then the function returns pass. In all other cases, the function returns fail (f ).

evaluate:SchemaGraphIRIOptValidity
s:Schema; g:Graph; i:IRI; v:OptValidity; ec:EvalContext |
      ec.graph = gec.schema = s
            evaluate  s  g  i =
                  ifevalRule  ec   (iri  i)  (s.rules  s.start){z,p}
                  thenpelsef
\begin{gendef}
   evaluate:Schema \fun Graph \fun IRI \fun OptValidity
\where
\forall s:Schema; g:Graph; i:IRI; v:OptValidity; ec:EvalContext | \\
\t1 ec.graph = g \land ec.schema = s @ \\
\t2		evaluate ~s ~g ~i = \\
\t3			\IF evalRule ~ec ~ (iri~ i)~ (s.rules~ s.start) \in \{\nomatch, \pass\} \\
\t3			\THEN \pass \ELSE \fail
\end{gendef}
        

4 Rule Evaluation

A ShEx Rule is a set of constraints that can be evaluated against a reference RDFTerm in the context of a given Schema and RDF Graph:

EvalContext[schema:Schema; graph:Graph]
\begin{zed}
EvalContext \defs [schema:Schema; graph:Graph]
\end{zed}
		

Formally, the evalRule function takes an EvalContext, a reference RDFTerm and a Rule and returns one of the following:

OptValidity ::=p|f|z||ε
\begin{zed}
OptValidity ::= \pass | \fail | \nomatch | \none | \error \\
\end{zed}
        
A Rule can take one of five forms. Each will be formally described later in this document, but informally they are:
Rule ::=
      arcArcRule|
      rarcRevArcRule|
      groupGroupRule|
      andAndRule|
      xorXorRule
\begin{zed}
Rule ::= \\
\t1	arc \ldata ArcRule \rdata | \\
\t1 rarc \ldata RevArcRule \rdata | \\
\t1 group \ldata GroupRule \rdata | \\
\t1 and \ldata AndRule \rdata | \\
\t1 xor \ldata XorRule \rdata \\
\end{zed}
        
evalRule:EvalContextRDFTermRuleOptValidity
ec:EvalContext; i:RDFTerm; r:Rule evalRule ec i r =
      ifrranarc
            thenevalArcRule ec i (arcr)
      elseifrranrarc
            thenevalRevArcRule ec i (rarcr)
      elseifrrangroup
            thenevalGroupRule ec i (groupr)
      elseifrranand
            thenevalAndRule ec i (andr)
      else
            evalXorRule ec i (xorr)
\begin{gendef}
   evalRule:EvalContext \fun RDFTerm \fun Rule \fun OptValidity
\where
   \forall ec:EvalContext; i:RDFTerm; r:Rule @ evalRule~ec~i~r = \\
\t1 \IF r \in \ran arc \\
\t2 \THEN evalArcRule~ec~i~(arc \entryFor r) \\
\t1 \ELSE \IF r \in \ran rarc \\
\t2 \THEN evalRevArcRule~ec~i~(rarc \entryFor r) \\
\t1 \ELSE \IF r \in \ran group \\
\t2 \THEN evalGroupRule~ec~i~(group \entryFor r) \\
\t1 \ELSE \IF r \in \ran and \\
\t2 \THEN evalAndRule~ec~i~(and \entryFor r) \\
\t1 \ELSE \\
\t2 evalXorRule~ec~i~(xor \entryFor r)
\end{gendef}
        
The evalRulefunction de-references the supplied Label and invokes evalRule with the result. This is not explicitly represented because the Z specification language does not allow cyclic dependencies. This function is undefined if Label is not in EvalContext

evalRule':EvalContextRDFTermLabelOptValidity
ec:EvalContext; l:Label ldomec.schema.rules

\begin{gendef}
   evalRule':EvalContext \fun RDFTerm \fun Label \fun OptValidity
\where
   \forall ec:EvalContext; l:Label @ l \in \dom ec.schema.rules
\end{gendef}
		

4.1 ArcRule Evaluation

The ArcRule is used to select the subset of the graph having a given predicate or predicates and to determine whether the cardinality and/or “type” of this subset matches a supplied criteria. The rule itself consists of a PredicateFilter to select the triples, an ObjectSpecification to evaluate the result, an optional min and max cardinality and a (possibly empty) set of Actions:

ArcRule
filter:PredicateFilter
valueSpec:ObjectSpecification
min, max:optional[ℕ]
actions:Action
(# min = 1# max = 1)value  minvalue  max
\begin{schema}{ArcRule}
   filter:PredicateFilter \\
   valueSpec:ObjectSpecification \\
   min, max:\optional[\nat] \\
   actions:\power Action
\where
   (\# min = 1 \land \# max = 1) \implies value~ min \leq value~ max
\end{schema}
		
ArcRule evaluation consists of:
  1. Select the subset of the EvalContext Graph with the supplied subject and predicates matching PredicateFilter
  2. Evaluate the cardinality and return the result if it doesn’t pass
  3. Evaluate the object of each of the triples in the set against ObjectSpecification. If any of the evaluations fail, return fail (f ).
  4. Return the result of evaluating actions against the matching triples.
evalArcRule:EvalContextRDFTermArcRuleOptValidity
ec:EvalContext; s:RDFTerm; ar:ArcRule; sg:Graph |
      sg = evalPredicateFilter ar.filter (triplesForSubject  s  ec.graph)
            evalArcRule ec s ar =
                  ifevalCardinality sg ar.min ar.maxp
                        thenevalCardinality sg ar.min ar.max
                  elseifevalObjectSpecification ec ar.valueSpec sgp
                        thenevalObjectSpecification ec ar.valueSpec sg
                  else
                        dispatch  ar.actions  sg  ec
\begin{gendef}
   evalArcRule:EvalContext \fun RDFTerm \fun ArcRule \fun OptValidity
\where
   \forall ec:EvalContext; s:RDFTerm; ar:ArcRule; sg:Graph | \\ 
\t1 sg = evalPredicateFilter~ar.filter~(triplesForSubject~ s~ ec.graph) @ \\
\t2		evalArcRule~ec~s~ar = \\
\t3 \IF evalCardinality~sg~ar.min~ar.max \neq \pass \\
\t4 \THEN evalCardinality~sg~ar.min~ar.max \\
\t3 \ELSE \IF evalObjectSpecification~ec~ar.valueSpec~sg \neq \pass \\
\t4 \THEN evalObjectSpecification~ec~ar.valueSpec~sg \\
\t3 \ELSE \\
\t4  dispatch~ ar.actions~ sg~ ec 
\end{gendef}
		

4.1.1 PredicateFilter Validation

A PredicateFilter can be one of:

  • pfIRI - the IRI of a specific predicate or the IRIstem that defines a set of predicates
  • pfWild - an expression that matches any predicate except those matching the (possibly empty) set of IRIorStems
IRIorStem ::= iosiIRI| iossIRIstem
PredicateFilter ::= pfIRIIRIorStem| pfWildIRIorStem
\begin{zed}
IRIorStem ::= iosi \ldata IRI \rdata | ioss \ldata IRIstem \rdata \\
PredicateFilter ::= pfIRI \ldata IRIorStem \rdata | pfWild \ldata \power IRIorStem \rdata
\end{zed}
		
An IRIstem matches any IRI whose stringified representation begins with the stringified representation of IRIstem according to standard IRI matching rules [RFC3987]. This is represented by the function:
[IRIstem]
\begin{zed}
[IRIstem]
\end{zed}
		
IRIstemRange:IRIstemIRI
\begin{gendef}
	IRIstemRange:IRIstem \pfun \power IRI
\end{gendef}
		
evalPredicateFilter returns all of the triples in a Graph whose predicate matches the supplied PredicateFilter:
evalPredicateFilter:PredicateFilterGraphGraph
f:PredicateFilter; g:Graph evalPredicateFilter  f  g =
      iffranpfIRIthenevalIRIorStem  (pfIRIfg
      elseevalWild  (pfWildfg
\begin{gendef}
   evalPredicateFilter:PredicateFilter \pfun Graph \pfun Graph
\where
   \forall f:PredicateFilter; g:Graph @ evalPredicateFilter~ f ~g = \\
\t1 \IF f \in \ran pfIRI \THEN evalIRIorStem~ (pfIRI \entryFor f)~ g \\
\t1 \ELSE evalWild~ (pfWild \entryFor f)~ g
\end{gendef}
		
evalIRIorStem returns all of the triples in a Graph matching the supplied IRIorStem
evalIRIorStem:IRIorStemGraphGraph
e:IRIorStem; g:Graph evalIRIorStem  e  g =
      iferaniosithen{t:g | irit.p = iosie }
      else{t:g | irit.pIRIstemRange  (iosse) }
\begin{gendef}
   evalIRIorStem:IRIorStem \pfun Graph \pfun Graph
\where
   \forall e:IRIorStem; g:Graph @ evalIRIorStem~ e ~g = \\
\t1 \IF e \in \ran iosi \THEN \{t:g | iri \entryFor t.p = iosi \entryFor e \} \\
\t1 \ELSE \{t:g | iri \entryFor t.p \in IRIstemRange~ (ioss \entryFor e) \}
\end{gendef}
		
evalWild returns all of the triples in a Graph that do not match an entry in the set of IRIorStems.
evalWild:IRIorStemGraphGraph
es:IRIorStem; g:Graph evalWild  es  g =
      {t:g | t{e:es evalIRIorStem e g} }
\begin{gendef}
   evalWild:\power IRIorStem \pfun Graph \pfun Graph
\where
   \forall es:\power IRIorStem; g:Graph @ evalWild ~es ~g = \\
\t1 \{t:g | t \notin \bigcup \{e:es @ evalIRIorStem~e~g\} \}
\end{gendef}
		

4.1.2 ObjectSpecification Evaluation

ObjectSpecification specifies a set of possible values for an RDFTerm and takes one of the following forms:

  • ValueType - matches Literals having a specified data type
  • ValueSet - matches IRIs or Literals that match one or more of the expressions in the specified set
  • ValueWild - matches any target except those matching the (possibly empty) set of IRIstems
  • ValueReference - matches any target that is considered valid according the Rule identified by Label.
MatchValue ::=
      mviriIRI|
      mvirisIRIstem|
      mvlitRDFLiteral
ObjectSpecification ::=
      valueTypeIRI|
      valueSetMatchValue|
      osWildMatchValue|
      valueRefLabel
\begin{zed}
MatchValue ::= \\
\t1	mviri \ldata IRI \rdata | \\
\t1	mviris \ldata IRIstem \rdata | \\
\t1 mvlit \ldata RDFLiteral \rdata
\also
ObjectSpecification ::= \\
\t1	valueType \ldata IRI \rdata | \\
\t1 valueSet \ldata \power MatchValue \rdata | \\
\t1 osWild \ldata \power MatchValue \rdata | \\
\t1 valueRef \ldata Label \rdata
\end{zed}
		

4.1.3 evalCardinality

– evaluates the cardinality the supplied graph.

  • If the graph has no elements and:
    • min value is 0 – nomatch (z)
    • min value isn’t specified or is > 0 – none ()
  • Otherwise:
    • If number of elements in graph < min or > max fail (f )
    • Otherwise – pass (p)

evalCardinality:Graphoptional[ℕ]optional[ℕ]OptValidity
g:Graph; min, max:optional[ℕ] evalCardinality g min max =
      if# min = 1# g = 0value  min = 0
            thenz
      elseif# g = 0
            then
      elseif(# min = 1# g < value  min)
                  (# max = 1# g > value  max)
            thenf
      elsep

\begin{gendef}
   evalCardinality:Graph \fun \optional[\nat] \fun \optional[\nat] \fun OptValidity
\where
   \forall g:Graph; min, max:\optional[\nat] @ evalCardinality~g~min~max = \\
   \t1 \IF \# min = 1 \land \# g = 0 \land  value~ min = 0 \\
\t2 \THEN \nomatch \\
\t1 \ELSE \IF \# g = 0 \\
\t2 \THEN \none \\
\t1 \ELSE \IF (\# min = 1 \land \# g < value~ min) \lor \\
\t3 (\# max = 1 \land \# g > value~ max) \\
\t2 \THEN \fail \\
\t1 \ELSE \pass
\end{gendef}
		

evalObjectSpecification – returns pass (p) if all of the triples in a Graph match the supplied ObjectSpecification, otherwise fail (f )

evalObjectSpecification:EvalContextObjectSpecificationGraph
      OptValidity
ec:EvalContext; os:ObjectSpecification; g:Graph
evalObjectSpecification ec os g =
      ift:g evalObjectSpecificationTriple ec os t.o =p
            thenp
            elsef
\begin{gendef}
   evalObjectSpecification:EvalContext \fun ObjectSpecification \fun Graph \fun \\
\t1 OptValidity
\where
   \forall ec:EvalContext; os:ObjectSpecification; g:Graph @ \\ evalObjectSpecification~ec~os~g = \\
\t1 \IF \forall t:g @ evalObjectSpecificationTriple~ec~os~t.o = \pass \\
\t2		\THEN \pass \\
\t2		\ELSE \fail
\end{gendef}
		
evalObjectSpecificationTriple:EvalContextObjectSpecification
      RDFTermOptValidity
ec:EvalContext; os:ObjectSpecification; n:RDFTerm
      evalObjectSpecificationTriple ec os n =
            ifosranvalueTypethen
                  evalValueType (valueTypeosn
            elseifosranvalueSetthen
                  evalTermSet (valueSetosn
            elseifosranosWildthen
                  evalTermWild (osWildosn
            else
                  evalTermReference ec (valueRefosn
\begin{gendef}
   evalObjectSpecificationTriple:EvalContext \fun ObjectSpecification \fun \\
\t1 RDFTerm \fun OptValidity
\where
   \forall ec:EvalContext; os:ObjectSpecification; n:RDFTerm @ \\
\t1	evalObjectSpecificationTriple~ec~os~n = \\
\t2 \IF os \in \ran valueType \THEN \\
\t3 evalValueType~(valueType \entryFor os)~n \\
\t2 \ELSE \IF os \in \ran valueSet \THEN \\
\t3 evalTermSet~(valueSet \entryFor os)~n \\
\t2 \ELSE \IF os \in \ran osWild \THEN \\
\t3 evalTermWild~(osWild \entryFor os)~n \\
\t2 \ELSE \\
\t3 evalTermReference~ec~(valueRef \entryFor os)~n
\end{gendef}
		

evalValueType – returns pass if the supplied RDFTerm is:

  • type literal and whose dataType matches ValueType
  • type IRI and ValueType is type RDF_Literal
evalValueType:IRIRDFTermOptValidity
vt:IRI; n:RDFTerm; l:RDFLiteral evalValueType vt  n =
      ifvt = SHEX_IRInranirithenp
      elseifvt = SHEX_BNODEnranbnodethenp
      elseifvt = SHEX_NONLITERAL(nranirinranbnode)thenp
      elseifvt = SHEX_LITERALnranliteralthenp
      elseifnranliterall = (literaln)
            ((lranpl(pll).dataType = vt)
            (lrantl(tll).dataType = vt))thenp
      elsef
\begin{gendef}
   evalValueType:IRI \pfun RDFTerm \pfun OptValidity
\where
   \forall vt:IRI; n:RDFTerm; l:RDFLiteral @ evalValueType~vt~ n = \\
\t1 \IF vt = SHEX\_IRI \land  n \in \ran iri \THEN \pass \\
\t1 \ELSE \IF vt = SHEX\_BNODE \land  n \in \ran bnode \THEN \pass \\
\t1 \ELSE \IF vt = SHEX\_NONLITERAL \land  (n \in \ran iri \lor n \in \ran bnode) \THEN \pass \\
\t1 \ELSE \IF vt = SHEX\_LITERAL \land  n \in \ran literal \THEN \pass \\
\t1 \ELSE \IF n \in \ran literal \land l = (literal \entryFor n) \land \\
\t2 ((l \in \ran pl \land (pl \entryFor l).dataType = vt) \lor \\
\t2  (l \in \ran tl \land (tl \entryFor l).dataType = vt)) \THEN \pass \\
\t1 \ELSE \fail
\end{gendef}
		

evalTermSet – return p if the supplied RDFTerm is a member of MatchValue

evalTermSet:MatchValueRDFTermOptValidity
mvs:MatchValue; n:RDFTerm evalTermSet mvs n =
      ifmv:mvs
            ((mvranmvirinraniri(irin) = mvirimv)
            (mvranmvirisnraniri
                  (irin)IRIstemRange   (mvirismv) )
            (nranliteralmvlitmv = literaln))
      thenp
      elsef
\begin{gendef}
   evalTermSet:\power MatchValue \pfun RDFTerm \pfun OptValidity
\where
   \forall mvs:\power MatchValue; n:RDFTerm @ evalTermSet~mvs~n = \\
\t1	\IF \exists mv:mvs @ \\
\t2 ((mv \in \ran mviri \land n \in \ran iri \land (iri \entryFor n) = mviri \entryFor mv) \lor \\
\t2  (mv \in \ran mviris \land n \in \ran iri \land  \\
\t3 (iri \entryFor n) \in IRIstemRange ~ (mviris \entryFor mv) ) \lor \\
\t2  (n \in \ran literal \land mvlit \entryFor mv = literal \entryFor n)) \\
\t1 \THEN \pass \\
\t1 \ELSE \fail
\end{gendef}
		

evalTermWild – return pass (p) if the supplied RDFTerm is not a member of MatchValue.

evalTermWild:MatchValueRDFTermOptValidity
mvs:MatchValue; n:RDFTerm evalTermWild mvs n =
      ifevalTermSet  mvs  n =pthenfelsep

\begin{gendef}
   evalTermWild:\power MatchValue \fun RDFTerm \fun OptValidity
\where
   \forall mvs:\power MatchValue; n:RDFTerm @ evalTermWild~mvs~n = \\
\t1 \IF evalTermSet~ mvs~ n = \pass \THEN \fail \ELSE \pass
\end{gendef}
		

evalTermReference – return p if the subgraph of the EvalContext graph whose subjects match the supplied RDFTerm satisfies the ValueReference rule.

evalTermReference:EvalContextLabelRDFTermOptValidity
ec:EvalContext; vr:Label; n:RDFTerm
      evalTermReference ec vr n =
            ifnranliteralthenevalRuleec n vr
            elsef

\begin{gendef}
   evalTermReference:EvalContext \pfun Label \pfun RDFTerm \pfun OptValidity
\where
\forall ec:EvalContext; vr:Label; n:RDFTerm @ \\ 
\t1	evalTermReference~ec~vr~n = \\
\t2		\IF n \notin \ran literal \THEN evalRule'~ec~n~vr \\
\t2		\ELSE \fail
\end{gendef}
		

4.2 RevArcRule Evaluation

The RevArcRule is used to select the subset of the graph having a given predicate or predicates and to determine whether the cardinality and/or “type” of this subset matches a supplied criteria. The rule itself consists of a PredicateFilter to select the triples, an SubjectSpecification to evaluate the result, a optional min and max cardinality and a (possibly empty) set of Actions:

RevArcRule
filter:PredicateFilter
valueSpec:SubjectSpecification
min, max:optional[ℕ]
actions:Action
(# min = 1# max = 1)value  minvalue  max
\begin{schema}{RevArcRule}
   filter:PredicateFilter \\
   valueSpec:SubjectSpecification \\
   min, max:\optional[\nat] \\
   actions:\power Action
\where
   (\# min = 1 \land \# max = 1) \implies value~ min \leq value~ max
\end{schema}
		
RevArcRule evaluation consists of:
  1. Select the subset of the EvalContext Graph with the supplied object and predicates matching PredicateFilter
  2. Evaluate the cardinality and return the result if it doesn’t pass
  3. Evaluate the object of each of the triples in the set against SubjectSpecification. If any of the evaluations fail, return fail (f ).
  4. Return the result of evaluating actions against the matching triples.
evalRevArcRule:EvalContextRDFTermRevArcRuleOptValidity
ec:EvalContext; o:RDFTerm; rar:RevArcRule; og:Graph |
      og = evalPredicateFilter rar.filter (triplesForObject  o  ec.graph)
            evalRevArcRule ec o rar =
                  ifevalCardinality og rar.min rar.maxp
                        thenevalCardinality og rar.min rar.max
                  elseifevalSubjectSpecification ec rar.valueSpec ogp
                        thenevalSubjectSpecification ec rar.valueSpec og
                  else
                        dispatch  rar.actions  og  ec
\begin{gendef}
   evalRevArcRule:EvalContext \fun RDFTerm \fun RevArcRule \fun OptValidity
\where
  \forall ec:EvalContext; o:RDFTerm; rar:RevArcRule; og:Graph  | \\ 
\t1 og = evalPredicateFilter~rar.filter~(triplesForObject~ o~ ec.graph) @ \\
\t2		evalRevArcRule~ec~o~rar = \\
\t3 		\IF evalCardinality~og~rar.min~rar.max \neq \pass \\
\t4 			\THEN evalCardinality~og~rar.min~rar.max \\
\t3 		\ELSE \IF evalSubjectSpecification~ec~rar.valueSpec~og \neq \pass \\
\t4 			\THEN evalSubjectSpecification~ec~rar.valueSpec~og \\
\t3 		\ELSE \\
\t4 			 dispatch~ rar.actions~ og~ ec 
\end{gendef}
		

4.2.1 SubjectSpecification Evaluation

SubjectSpecification specifies a set of possible values for an RDFTerm and takes one of the following forms:

  • SubjectSet - matches IRIs or IRIstems that match one or more of the expressions in the specified set
  • SubjectWild - matches any target except those matching the (possibly empty) set of IRIstems
  • subjectRef - matches any target that is considered valid according the Rule identified by Label.
SubjectSpecification ::=
      subjectSetMatchValue|
      ssWildMatchValue|
      subjectRefLabel
\begin{zed}
SubjectSpecification ::=  \\
\t1	subjectSet \ldata \power MatchValue \rdata | \\
\t1 ssWild \ldata \power MatchValue \rdata | \\
\t1 subjectRef \ldata Label \rdata
\end{zed}
			

paragraphevalSubjectSpecification – returns pass (p) if all of the triples in a Graph match the supplied SubjectSpecification, otherwise fail (f )

evalSubjectSpecification:EvalContextSubjectSpecificationGraph
      OptValidity
ec:EvalContext; ss:SubjectSpecification; g:Graph
      evalSubjectSpecification ec ss g =
            ift:g evalSubjectSpecificationTriple ec ss t.o =p
            thenp
            elsef
\begin{gendef}
   evalSubjectSpecification:EvalContext \fun SubjectSpecification \fun Graph \fun \\
\t1 OptValidity
\where
   \forall ec:EvalContext; ss:SubjectSpecification; g:Graph @ \\ 
\t1	evalSubjectSpecification~ec~ss~g = \\
\t2 	\IF \forall t:g @ evalSubjectSpecificationTriple~ec~ss~t.o = \pass \\
\t2		\THEN \pass \\
\t2 	\ELSE \fail
\end{gendef}
			
evalSubjectSpecificationTriple:EvalContextSubjectSpecification
      RDFTermOptValidity
ec:EvalContext; ss:SubjectSpecification; n:RDFTerm
      evalSubjectSpecificationTriple ec ss n =
            ifssransubjectSetthen
                  evalTermSet (subjectSetssn
            elseifssranssWildthen
                  evalTermWild (ssWildssn
            else
                  evalTermReference ec (subjectRefssn
\begin{gendef}
   evalSubjectSpecificationTriple:EvalContext \fun SubjectSpecification \fun \\
\t1 RDFTerm \fun OptValidity
\where
   \forall ec:EvalContext; ss:SubjectSpecification; n:RDFTerm @ \\
\t1	evalSubjectSpecificationTriple~ec~ss~n = \\
\t2 	\IF ss \in \ran subjectSet \THEN \\
\t3 		evalTermSet~(subjectSet \entryFor ss)~n \\
\t2 	\ELSE \IF ss \in \ran ssWild \THEN \\
\t3 		evalTermWild~(ssWild \entryFor ss)~n \\
\t2 	\ELSE \\
\t3 		evalTermReference~ec~(subjectRef \entryFor ss)~n
\end{gendef}
			

4.3 GroupRule Evaluation

A GroupRule serves two purposes. The first is to declare that a referenced rule is to be treated as “optional”, which, in this case means that if (a) the referenced rule returned none (), (meaning an ArcRule was encountered that had no matching predicates and a non-zero minimum cardinality) the group rule returns nomatch (z). An optional GroupRule also treats an error situation as a fail (f ).

The second purpose of a group rule is to allow a set of external actions to be evaluated whenever the referenced rule returns pass (p).

OPT ::= OPTIONAL | REQUIRED
GroupRule[rule:Label; opt:OPT; actions:Action ]

\begin{zed}
OPT ::= OPTIONAL | REQUIRED
\also
GroupRule \defs [rule:Label; opt:OPT; actions:\power Action ] \\
\end{zed}
		
evalGroupRule evaluates Rule, applies opt and, if the result is pass (p), evaluates the actions with respect to the passing Graph
evalGroupRule:EvalContextRDFTermGroupRuleOptValidity
ec:EvalContext; i:RDFTerm; gr:GroupRule
      evalGroupRule  ec  i   gr =
            ifevalRuleec i gr.rule =gr.opt = OPTIONAL
                  thenz
            elseifevalRuleec i gr.rule =εgr.opt = OPTIONAL
                  thenf
            elseifevalRuleec i gr.rule =p
                  thendispatch  gr.actions ∅  ec
            else
                  evalRuleec i gr.rule
\begin{gendef}
   evalGroupRule:EvalContext \fun RDFTerm \fun GroupRule \fun OptValidity
\where
\forall ec:EvalContext; i:RDFTerm; gr:GroupRule @ \\
\t1	evalGroupRule~ ec~ i ~ gr = \\
\t2 	\IF evalRule'~ec~i~gr.rule = \none \land gr.opt = OPTIONAL \\
\t3 	\THEN \nomatch \\
\t2 	\ELSE \IF evalRule'~ec~i~gr.rule = \error \land gr.opt = OPTIONAL \\
\t3 	\THEN \fail \\
\t2 	\ELSE \IF evalRule'~ec~i~gr.rule = \pass \\
\t3 	\THEN dispatch~ gr.actions~ \emptyset ~ ec \\
\t2 	\ELSE \\
\t3			evalRule'~ec~i~gr.rule
\end{gendef}
		

4.4 AndRule evaluation

An AndRule consists of a set of one or more Rules, whose evaluation is determined by the following table:

And z f p ε
1 1 f1 f1 ε1
z 1 z1 f1 p1 ε1
f f1 f1 f1 f1 ε1
p f1 p1 f1 p1 ε1
ε ε1 ε1 ε1 ε1 ε1

The formal implementation of which will be realized in the corresponding function:

And:OptValidityOptValidityOptValidity
\begin{gendef}
   And:OptValidity \fun OptValidity \fun OptValidity
\end{gendef}
		

Observing that the above table is a monoid with nomatch (z) as an identity element, evalAndRule can be applied using the standard functional pattern:

AndRule ==seq1Label
\begin{zed}
AndRule == \seq_1 Label
\end{zed}
		
evalAndRule:EvalContextRDFTermAndRuleOptValidity
ec:EvalContext; i:RDFTerm; r:AndRule
      evalAndRule ec i r =
            foldr  And z  (map (evalRuleec ir)
\begin{gendef}
   evalAndRule:EvalContext \fun RDFTerm \fun AndRule \fun OptValidity
\where
\forall ec:EvalContext; i:RDFTerm; r:AndRule  @ \\
\t1	evalAndRule~ec~i~r = \\
\t2		foldr~ And~ \nomatch ~ (map~(evalRule'~ec~i)~r)
\end{gendef}
		

4.5 XorRule Evaluation

An XorRule consists of a set of one or more Rules, whose evaluation is determined by the following table:

Or z f p ε
1 z1 1 p1 ε1
z z1 z1 z1 p1 ε1
f 1 z1 f1 p1 ε1
p p1 p1 p1 ε23 ε1
ε ε1 ε1 ε1 ε1 ε1
The formal implementation of which will be realized in the corresponding function:
Xor:OptValidityOptValidityOptValidity
\begin{gendef}
   Xor:OptValidity \fun OptValidity \fun OptValidity
\end{gendef}
		
As with the And function above, Xor is a monoid whose identity is fail (f ) resulting in the following definition for evalXorRule
XorRule ==seq1Label
\begin{zed}
XorRule == \seq_1 Label
\end{zed}
		
evalXorRule:EvalContextRDFTermXorRuleOptValidity
ec:EvalContext; i:RDFTerm; r:XorRule
      evalXorRule ec i r =
            foldr  Xor f  (map (evalRuleec  ir)
\begin{gendef}
   evalXorRule:EvalContext \fun RDFTerm \fun XorRule \fun OptValidity
\where
   \forall ec:EvalContext; i:RDFTerm; r:XorRule  @ \\
\t1	evalXorRule~ec~i~r = \\
\t2		foldr~ Xor~ \fail ~ (map~(evalRule'~ ec ~i)~ r)
\end{gendef}
		

5 Action evaluation

The dispatch function allows the evaluation / execution of arbitrary external “Actions”. While the evaluation of an Action can (obviously) have side effects outside the context of the ShEx environment, it must be side effect free within the execution context. In particular, an Action may not change anything in the EvalContext The action dispatcher exists to allow external events to happen. Parameters:

The dispatch function usually returns pass (p) or fail (f ), although there may also be cases for other OptValidity values in certain circumstances. The dispatch function always returns pass (p) if the set of actions is empty.

[Action]
\begin{zed}
[Action]
\end{zed}
		
dispatch:ActionGraphEvalContextOptValidity
as:Action; g:Graph; ec:EvalContext
      as =dispatch  as  g  ec =p
\begin{gendef}
   dispatch:\power Action \fun Graph \fun EvalContext \fun OptValidity
\where
   \forall as:\power Action; g:Graph; ec:EvalContext @ \\
\t1 as = \emptyset \implies dispatch~ as~ g ~ec = \pass
\end{gendef}
		

6 Parsing Rules

The semantics of the productions and terminals from the Turtle specification are defined in Turtle Terse RDF Triple Language section 7 Parsing.

The following grammar rules are mapped to the abstract syntax:

Productions:

[1]    ShExDoc    ::=    statement*
[2]    statement    ::=    directive | shape
[3]    directive    ::=    sparqlPrefix | sparqlBase | "start" "=" ( label | typeSpec CODE* )
let b = a fresh blank node
typeSpec.setLabel(b)
new ArcRule()
[4]    sparqlPrefix    ::=    "PREFIX" PNAME_NS IRIREF
[5]    sparqlBase    ::=    "BASE" IRIREF
[6]    shape    ::=    label typeSpec CODE*
if (CODE.SIZE > 0)
  let r = new GroupRule(false, typeSpec)
  r.label = label
else
  typeSpec.label = label
[7]    typeSpec    ::=    "{" OrExpression "}"
[8]    OrExpression    ::=    l:AndExpression ( "|" r:AndExpression )*
if (r)
  new XorRule(l, r)
else
  l
[9]    AndExpression    ::=    l:UnaryExpression ( "," r:UnaryExpression )*
if (r)
  new AndRule(l, r)
else
  l
[10]    UnaryExpression    ::=    arc
| "(" OrExpression ")" opt:"?"? CODE*
new GroupRule(OrExpression, opt, CODE)
[11]    label    ::=    iri | BlankNode
[12]    arc    ::=    nameClass valueSpec default? repeatCount? properties? CODE*
new ArcRule(nameClass, valueSpec, default, repeatCount, properties, CODE)
[13]    nameClass    ::=    iriStem
pfIRI(iriStem)
| "." exclusions
pfWild(exclusions)
| 'a'
pfIRI(rdf:type)
[14]    valueSpec    ::=    "@" label
ValueReference(label)
| typeSpec
let b = a fresh blank node
typeSpec.setLabel(b)
| valueSet
new ValueSet(valueSet)
| object
new ValueType(object)
| exclusions
new ValueWild(exclusions)
[15]    iriStem    ::=    iri ("~")?
[16]    default    ::=    "=" object
[17]    properties    ::=    "[" iri object ( ";" ( iri object )? )* "]"
[18]    exclusions    ::=    ( "-" iriStem )*
[19]    repeatCount    ::=    "*" | "+" | "?" | "{" INTEGER ( "," (INTEGER)? )? "}"
[20]    valueSet    ::=    "(" (object)+ ")"
[21]    object    ::=    iriStem | BlankNode | literal
[22]    literal    ::=    RDFLiteral | NumericLiteral | BooleanLiteral
[23]    NumericLiteral    ::=    INTEGER | DECIMAL | DOUBLE
[24]    RDFLiteral    ::=    String ( LANGTAG | "^^" iri )?
[25]    BooleanLiteral    ::=    "true" | "false"
[26]    String    ::=    STRING_LITERAL1 | STRING_LITERAL2 | STRING_LITERAL_LONG1 | STRING_LITERAL_LONG2
[27]    iri    ::=    IRIREF | PrefixedName
[28]    PrefixedName    ::=    PNAME_LN | PNAME_NS
[29]    BlankNode    ::=    BLANK_NODE_LABEL | ANON

Terminals Not in SPARQL:

[30]    <CODE>    ::=    "%" ( [a-zA-Z+#_] ([a-zA-Z0-9+#_])* )? "{" ( [^%\\] | "\\" "%" )* "%" "}"

Ordering (informative)

A Shape Expressions schema with no semantic actions can be treated as unordered. In practice, error messages can be much clearer, and extension functions much simpler to write if the engine follows the lexical order when validating the members of a conjunction. Likewise, form and interface code generation will be more predictable and controllable if the Shape Expressions are processed in a specific order.

7 Appendix

7.1 Foldr

The foldr function is the standard functional pattern, which takes a binary function of type T, an identity function for type T, a sequence of type T and returns the result of applying the function to the right to left pairs of the sequence.

[T]
foldr:(TTT)TseqTT
f:TTT; id:T; xs:seqT
      foldr f id xs =
            ifxs =〈〉thenid
            elsef (head xs) (foldr f id (tail xs))
\begin{gendef}[T]
   foldr:(T \fun T \fun T) \fun T \fun \seq T \fun T
\where
   \forall f:T \fun T \fun T; id:T; xs:\seq T @ \\
\t1	foldr~f~id~xs = \\
\t2 	\IF xs = \langle \rangle \THEN id \\
\t2 	\ELSE f~(head~xs)~(foldr~f~id~(tail~xs))
\end{gendef}
		

7.2 Map

The map function takes a function from type A to type B and applies it to all members in the supplied sequence

[A,B]
map:(AB)seqAseqB
f:AB; xs:seqA
      map f xs =
            ifxs =〈〉then〈〉
            elsef (head xs)〉map f (tail xs)
\begin{gendef}[A,B]
   map:(A \fun B) \fun \seq A \fun \seq B
\where
   \forall f:A \fun B; xs:\seq A @ \\
\t1	map~f~xs = \\
\t2 	\IF xs = \langle \rangle \THEN \langle \rangle \\
\t2 	\ELSE \langle f~(head~xs) \rangle \cat map~f~(tail~xs)
\end{gendef}
		

6.3 Helper Functions

Z uses the notion of free type definitions in the form:

FreeType ::= constructorsource
\begin{syntax}
FreeType ::= constructor \ldata source \rdata
\end{syntax}
		
which introduces a collection of constants of type FreeType, one for each element of the set source. constructor is an injective function from source to FreeType:
constructor:sourceFreeType
\begin{axdef}
constructor:source \inj FreeType
\end{axdef}
		
In the models that follow, there is a need to reverse this – to find the source for a given FreeType instance. The function exists for this purpose. As an example, if one were to define:
TravelDirections ::=
      busBusDirections|
      walkingWalkingDirections
\begin{syntax}
TravelDirections ::= \\
\t1	bus \ldata BusDirections \rdata | \\
\t1	walking \ldata WalkingDirections \rdata
\end{syntax}
		
If one is supplied with an instance of TravelDirections, one can convert it to the appropriate type by:
xTravelDirections
ifxranbusthenbusxelsewalkingx
\begin{syntax}
x \in TravelDirections
\where
\IF x \in \ran bus \THEN bus \entryFor x \ELSE walking \entryFor x
\end{syntax}
		

[X, Y]
__:(XY)× YX
f:XY; y:Y
      fy = (μ x:domf | f x = y)

\begin{gendef}[X, Y]
\_  \entryFor \_ :(X \inj Y) \cross Y \pfun X
\where
\forall f:X \inj Y; y:Y @  \\
\t1	f \entryFor y = (\mu x:\dom f | f~x = y)
\end{gendef}
		

One way to represent optional values is a set with one member. We take that route here and introduce a bit of syntactic sugar to show our intent:

optional[T] == { s:T | # s1 }
\begin{zed} 
  \optional[T] == \{ s:\power T \mid \# s \leq 1 \} \\
\end{zed}
		

And a shorthand for addressing the content:

[T]
value:TT
s:T value  s = (μ e:T | es)
\begin{gendef}[T]
   value:\power T \pfun T
\where
   \forall s:\power T @ value~ s = (\mu e:T | e \in s) 
\end{gendef}
		

A. References

A.1 Normative References

[RDF11-CONCEPTS]
Richard Cyganiak, David Wood, Markus Lanthaler. RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation, 25 February 2014. URL: http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/. The latest edition is available at http://www.w3.org/TR/rdf11-concepts/
[RFC3987]
M. Dürst and M. Suignard. Internationalized resource identifiers (IRIs). IETF Request For Comment January 2005. URL: http://tools.ietf.org/html/rfc3987.
[BCP47]
RA Phillips and M. Davis. Tags for identifying language. IETF Best CurrentPractices, September 2009. URL: http://tools.ietf.org/html/bcp47.

A.2 Informative References

[Z Notation Reference Manual]
The Z Notation: A Reference Manual, Second Edition, J. M. Spivey, Prentice Hall, 1992.
[Fuzz 2000]
Release Notes For Fuzz 2000, J. M. Spivey.
Grp
z1
z z1
f f1
p p1
ε ε1
All
f6
z p7
f f11
p p10
ε f21