This specification defines elements of the SHACL Shapes Constraint Language created to allow for profiles of SHACL and profiling with SHACL.

SHACL is a language for validating RDF graphs against a set of conditions, so this document's scope is limited to profiling of RDF graphs, including graphs containing SHACL Shapes.

The namespace for SHACL Profiling terms is http://www.w3.org/ns/shpr#

The suggested prefix for the SHACL Profiling namespace is shpr

Document Outline

The introduction provides background concepts of profiling and states this specification's scope.

Sections 2 & 3 cover the two main elements within the stated scope.

Introduction

What is profiling?

Profiling is the act of creating a "profile" of something.

Generically, in English, a "profile" of something is as follows:

The outline of a physical object or feature, or a representation of this

- Oxford English dictionary, use of the word "profile" since the 17th century

Within the world of data, a derived definition of "profile" consistent with the above is:

A summary or an extraction

In this definition, the essence of the English word is retained, since a summary or extraction of or from a data object may be an outline of it; for example, a 2D representation of a 3D spatial object, or a statistical summary of a dataset that has many parts.

By definition, SHACL constrains (RDF) data. Therefore, any data that is valid according to a shapes graph will be a profile of the data graph that was validated. If a shapes graph validates all elements of a data graph, the resulting valid data will be a "null" profile of the data graph, meaning it is identical to the original data graph.
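As a minimal sketch of this idea, assume a hypothetical shape, ex:PersonShape, that requires every ex:Person to have at least one ex:name. The shape and the ex:name property are illustrative assumptions and are not defined by this specification. The shapes graph and the data graph are shown in one box only for brevity.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ns#> .

# Shapes graph (hypothetical): every ex:Person needs at least one ex:name
ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:name ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] .

# Data graph: ex:Bob conforms to the shape
ex:Bob a ex:Person ;
    ex:name "Bob" .

# Data graph: ex:Alice has no ex:name, so she does not conform
ex:Alice a ex:Person .
```

Under these assumptions, the triples about ex:Bob form the valid part of the data graph and so can be read as a profile of it. Had ex:Alice also carried an ex:name, the valid data would equal the whole data graph, which is the "null" profile case.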

The W3C's Profiles Vocabulary [[dx-prof]] has defined "data profiling" in the context of specifications or data specifications:

A data specification that constrains, extends, combines, or provides guidance or explanation about the use of other data specifications.

If a shapes graph is taken to be a "data specification," then not only is the data that is valid according to the shapes graph a profile of the validated data graph, but the shapes graph itself also serves as a profile of the data model used for the data graph.

Scope

With the above section's concepts in mind, this specification defines the following:

  1. profiles of SHACL
  2. profiling with SHACL

Terminology

Terminology used throughout this specification is taken from several sources:

SHACL 1.2 Core specification: technical terms for SHACL and RDF, the latter from [[rdf12-concepts]]
Profiles Vocabulary [[dx-prof]]: general terms to do with profiling, including the terms "profiling" and "profile"

The SHACL & RDF terms include: binding, blank node, conformance, constraint, constraint component, data graph, datatype, failure, focus node, RDF graph, ill-formed, IRI, literal, local name, member, node, node shape, object, parameter, pre-binding, predicate, property path, property shape, RDF term, SHACL instance, SHACL list, SHACL subclass, shape, shapes graph, solution, subject, target, triple, validation, validation report, validation result, validator, value, value node.

The general profiling terms include: specification, [data] profile, metadata.

Document Conventions

Within this specification, the following namespace prefix definitions are used:

Prefix Namespace
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
sh: http://www.w3.org/ns/shacl#
xsd: http://www.w3.org/2001/XMLSchema#
ex: http://example.com/ns#

Within this specification, the following JSON-LD context is used:

{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "sh": "http://www.w3.org/ns/shacl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.com/ns#"
  }
}

Note that the URI of the graph defining the SHACL vocabulary itself is equivalent to the namespace above, i.e., it includes the #. References to the SHACL vocabulary, e.g., via owl:imports, should include the #.
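A minimal sketch of such a reference follows. The IRI of the importing graph is a hypothetical placeholder, and the owl: prefix, which is not in the table above, is declared here for illustration.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Hypothetical shapes graph IRI; note the trailing # on the imported SHACL IRI
<http://example.com/ns/shapes>
    a owl:Ontology ;
    owl:imports <http://www.w3.org/ns/shacl#> .
```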

Throughout the specification, color-coded boxes containing RDF graphs in Turtle and JSON-LD will appear. The color and title of a box indicate whether it is a Shapes graph, a Data graph, or something else. The Turtle specification fragments use the prefix bindings given above. The JSON-LD specification fragments use the context given above. Only the Turtle specifications will have parts highlighted.

# This box represents an input shapes graph
<s> <p> <o> .
// This box represents an input shapes graph
{
  "@id": "ex:s",
  "ex:p": {
    "@id": "ex:o"
  }
}
# This box represents an input data graph.
# When highlighting is used in the examples:
# Elements highlighted in blue are focus nodes
ex:Bob a ex:Person .
# Elements highlighted in red are focus nodes that fail validation
ex:Alice a ex:Person .
// This box represents an input data graph
{
	"@graph": [
		{
			"@id": "ex:Alice",
			"@type": "ex:Person"
		},
		{
			"@id": "ex:Bob",
			"@type": "ex:Person"
		}
	]
}
# This box represents an output results graph
// This box represents an output results graph

Grey boxes such as this include syntax rules that apply to the shapes graph.

true denotes the RDF term "true"^^xsd:boolean. false denotes the RDF term "false"^^xsd:boolean.
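For illustration, the two hypothetical shapes below carry the same value for sh:closed, once using the Turtle shorthand and once using the full literal form. The shape names are assumptions made only for this sketch.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ns#> .

# Turtle shorthand for the boolean literal
ex:ShapeA a sh:NodeShape ;
    sh:closed true .

# The same RDF term written out in full
ex:ShapeB a sh:NodeShape ;
    sh:closed "true"^^xsd:boolean .
```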

TODO

Profiles of SHACL

This section describes profiles of the total SHACL 1.2 specification and how to create other profiles of SHACL.

Base Profiles

Base profiles of SHACL are profiles that correspond to the individual SHACL 1.2 specifications listed in the Specifications section, above.

Together, these Base Profiles cover all of SHACL and, when used together in the Union Profile, they can validate whether any graph is valid SHACL 1.2 data.

Non-base Profiles

Non-base profiles of SHACL are profiles that do not correspond directly to a SHACL 1.2 specification. These profiles include supersets of the specification-derived base profiles, such as the Union Profile defined below, as well as other profiles of parts of SHACL, such as Profile X that contains SHACL elements that, when used for validation, require no more than XXX computational complexity.

Union Profile

Profile X

Creating other Profiles

Profiling with SHACL

This section describes how to profile specifications with SHACL.

Profile Part Roles

SHACL is a language that can implement constraints and inference rules for RDF data. These two things are not the complete set of things that a profile designer may wish to do; for example, they may want to include documentation about why and how the profile was created, or provide new model elements (classes, predicates, etc.) within an extended schema.

Since SHACL cannot be used for all possible profile parts, profile designers need to look to specifications outside the SHACL family of specifications for guidance on a profile's total set of parts and how to relate them to one another. [[[dx-prof]]] is expected to be used for this.

Referencing [[[dx-prof]]]'s vocabulary of Resource Role Instances, it seems clear that SHACL graphs can be applied to the role of Validation within a profile and perhaps also to the role of Specification, since SHACL can be used to declare Node and Property Shapes just as OWL can be used to declare Classes and Properties.

SHACL resources can also document the constraints for which they implement validation shapes; they can therefore also be applied to the Constraints role.

They could also be applied to the Mapping role, if SHACL Rules are implemented to transform data from one model to another.

SHACL lists of values required for use with a model could be applied to the Vocabulary role.
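The role assignments discussed above might be recorded as in the following sketch, which uses terms from the Profiles Vocabulary. The profile, the base specification, the resource descriptors and the artifact locations are all hypothetical, and the choice of roles shown is only one possible reading of the discussion above.

```turtle
@prefix prof: <http://www.w3.org/ns/dx/prof/> .
@prefix role: <http://www.w3.org/ns/dx/prof/role/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical profile of a hypothetical base specification
ex:MyProfile
    a prof:Profile ;
    prof:isProfileOf ex:BaseSpecification ;
    prof:hasResource ex:validator , ex:mapping .

# A SHACL shapes graph playing the Validation role
ex:validator
    a prof:ResourceDescriptor ;
    dcterms:conformsTo <http://www.w3.org/ns/shacl#> ;
    prof:hasRole role:validation ;
    prof:hasArtifact <http://example.com/profile/shapes.ttl> .

# A graph of SHACL Rules playing the Mapping role
ex:mapping
    a prof:ResourceDescriptor ;
    dcterms:conformsTo <http://www.w3.org/ns/shacl#> ;
    prof:hasRole role:mapping ;
    prof:hasArtifact <http://example.com/profile/rules.ttl> .
```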

Profile Hierarchies

Creating Null Profiles

A "null" profile of a specification, created using SHACL, is a profile of that specification in which SHACL is used to implement some, and potentially all, of the specification's rules, but no other rules.

The main purpose of creating a null profile of a specification using SHACL is to enable the testing of conformance to that specification by SHACL validation. This purpose exists because many RDF data models exist that have a model specification, perhaps an OWL model or only a natural-language document of a model, but do not provide a mechanism for data validation, such as a SHACL validator.
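As a sketch, suppose a hypothetical natural-language specification states only that every ex:Dataset has exactly one ex:title, which is a string. The class, the property and the shape below are assumptions made for illustration. A null profile of that specification could then consist of a shape that implements this rule and nothing more.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ns#> .

# Implements only the rule stated in the (hypothetical) specification:
# "every ex:Dataset has exactly one ex:title, which is a string"
ex:DatasetShape
    a sh:NodeShape ;
    sh:targetClass ex:Dataset ;
    sh:property [
        sh:path ex:title ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
        sh:datatype xsd:string ;
    ] .
```

Adding any further constraint, such as a maximum title length that the specification does not mention, would make the result a non-null profile.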

While null profiles of specifications can be created using mechanisms other than SHACL, here we focus only on the use of SHACL.

An example of a non-SHACL validator for RDF data that acts as a null profile is the W3C's [[[prov-constraints]]], which provides a list of constraints that apply to provenance data formulated according to [[[prov-o]]]. [[[prov-constraints]]] implements no constraints beyond those stated or implied in [[[prov-o]]] and the conceptual [[[prov-dm]]]; however, it does include tests for things that the models do not explicitly express but that their proper use requires, e.g., the ordering of temporal entities. Those constraints have been implemented as Python scripts that execute SPARQL queries, allowing for RDF data validation.

Summary of Syntax Rules from this Specification

Security and Privacy Considerations

Like most RDF-based technologies, SHACL processors may operate on graphs that are assembled from various sources. Some applications may have an open "linked data" ("LD") architecture and dynamically assemble RDF triples from sources that are outside an organization's network of trust. Since RDF allows anyone to add statements about any resource, triples may modify the originally intended semantics of shape definitions or nodes in a data graph and thus feed into misleading results. Protection against this (and the following) scenario is achievable by using only trusted and verified RDF sources and eliminating the possibility that graphs are dynamically added via owl:imports and sh:shapesGraph.
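For illustration only, triples of the following kind are the ones that can cause graphs to be added dynamically. All graph IRIs here are hypothetical.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .

# A data graph suggesting a shapes graph from an external source
<http://example.com/data>
    sh:shapesGraph <http://untrusted.example.org/shapes> .

# A shapes graph pulling in further triples from an external source
<http://example.com/shapes>
    owl:imports <http://untrusted.example.org/more-shapes> .
```

An application that wishes to avoid this scenario could decline to dereference such IRIs unless they appear on a list of trusted sources.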

When creating profiles of other specifications, profile creators need to ensure that their constraints do not violate those specifications' rules. If any did so, and if only the profile's rules, but not the specification's rules, were used to check for data validity, by accident or by design, data could be wrongly calculated to be valid. This could lead to accidental data release or use, potentially introducing security issues.

Acknowledgements

Many people contributed to this specification, including members of the RDF Data Shapes Working Group.

Internationalization Considerations

TODO