DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use.

By using DCAT to describe datasets in data catalogs, publishers are using a standard model and vocabulary that facilitates the consumption and aggregation of metadata from multiple catalogs, and in doing so can increase the discoverability of datasets. It also makes it possible to have a decentralized approach to publishing data catalogs and makes federated search for datasets across catalogs in multiple sites possible using the same query mechanism and structure. Aggregated DCAT metadata can serve as a manifest file as part of the digital preservation process.

The namespace for DCAT terms is http://www.w3.org/ns/dcat#

The suggested prefix for the DCAT namespace is dcat

The (revised) DCAT vocabulary is available here.

The original DCAT vocabulary (originally hosted at http://vocab.deri.ie/dcat) was developed at the Digital Enterprise Research Institute (DERI), refined by the eGov Interest Group, and then finally standardized in 2014 [[VOCAB-DCAT-20140116]] by the Government Linked Data (GLD) Working Group.

This revised version of DCAT was developed by the Dataset Exchange Working Group in response to a new set of Use Cases and Requirements [[DCAT-UCR]] submitted on the basis of experience with the DCAT vocabulary from the time of the original version, and new applications not originally considered.

DCAT incorporates terms from pre-existing vocabularies where stable terms with appropriate meanings could be found, such as foaf:homepage and dct:title. Informal summary definitions of the externally-defined terms are included here for convenience, while authoritative definitions are available in the normative references. Changes to definitions in the references, if any, supersede the summaries given in this specification. Note that conformance to DCAT (Section 4) concerns usage of only the terms in the DCAT namespace itself, so possible changes to the external definitions will not affect the conformance of DCAT implementations.

Introduction

From DCAT 2014 [[!VOCAB-DCAT-20140116]]

Data can come in many formats, ranging from spreadsheets, through XML and RDF, to various speciality formats. DCAT does not make any assumptions about the serialisation format of the datasets described in a catalog. Other, complementary vocabularies MAY be used together with DCAT to provide more detailed format-specific information. For example, properties from the VoID vocabulary [[VOID]] can be used to express various statistics about a DCAT-described dataset if that dataset is in RDF format.
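
As a minimal sketch of this combination, the following description types one resource as both a dcat:Dataset and a void:Dataset and attaches VoID statistics to it. The resource name :dataset-rdf, the example.org base namespace and the counts are illustrative assumptions, not part of this specification.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix void: <http://rdfs.org/ns/void#> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :dataset-rdf
       a dcat:Dataset , void:Dataset ;
       dct:title "Imaginary RDF dataset" ;
       void:triples 1000 ;                  # illustrative count
       void:distinctSubjects 200 ;          # illustrative count
       .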

This document does not prescribe any particular method of deploying data expressed in DCAT. DCAT is applicable in many contexts including RDF accessible via SPARQL endpoints, embedded in HTML pages as RDFa, or serialized as e.g. RDF/XML or Turtle. The examples in this document use Turtle simply because of Turtle's readability.

Motivation for Change

The original Recommendation [[VOCAB-DCAT-20140116]], published in January 2014, provided the basic framework for describing datasets. Importantly, it made the distinction between a dataset as an abstract idea and a distribution as a manifestation of the dataset. Although DCAT has been widely adopted, it has become clear that the original specification lacked a number of essential features that were added either through application profiles, such as the European Commission's DCAT-AP [[DCAT-AP]], or the development of larger vocabularies that, to a greater or lesser extent, built upon the base standard, such as the Healthcare and Life Sciences Community Profile [[HCLS-Dataset]], the Data Tag Suite [[DATS]] and more. This version of DCAT has been developed to address the specific shortcomings that have come to light through the experiences of different communities, the aim being, of course, to improve interoperability between the outputs of these larger vocabularies.

Namespaces

The namespace for DCAT is http://www.w3.org/ns/dcat#. However, it can be noted that DCAT makes extensive use of terms from other vocabularies, in particular Dublin Core [[DCTERMS]]. DCAT itself defines a minimal set of classes and properties of its own. A full set of namespaces and prefixes used in this document is shown in the table below.

Prefix   Namespace
dcat     http://www.w3.org/ns/dcat#
dct      http://purl.org/dc/terms/
dctype   http://purl.org/dc/dcmitype/
dqv      http://www.w3.org/ns/dqv#
foaf     http://xmlns.com/foaf/0.1/
owl      http://www.w3.org/2002/07/owl#
prov     http://www.w3.org/ns/prov#
rdf      http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs     http://www.w3.org/2000/01/rdf-schema#
skos     http://www.w3.org/2004/02/skos/core#
vcard    http://www.w3.org/2006/vcard/ns#
xsd      http://www.w3.org/2001/XMLSchema#
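
For readers who want to load the examples in this document into a Turtle parser, the same bindings can be written as prefix declarations. The last line binds the empty prefix used by the examples; its namespace IRI is an assumption made for illustration, since the examples do not state a base.

   @prefix dcat:   <http://www.w3.org/ns/dcat#> .
   @prefix dct:    <http://purl.org/dc/terms/> .
   @prefix dctype: <http://purl.org/dc/dcmitype/> .
   @prefix dqv:    <http://www.w3.org/ns/dqv#> .
   @prefix foaf:   <http://xmlns.com/foaf/0.1/> .
   @prefix owl:    <http://www.w3.org/2002/07/owl#> .
   @prefix prov:   <http://www.w3.org/ns/prov#> .
   @prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
   @prefix vcard:  <http://www.w3.org/2006/vcard/ns#> .
   @prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
   @prefix :       <http://example.org/> .    # assumed namespace for the example resources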

Modified from DCAT 2014 [[VOCAB-DCAT-20140116]]

A data catalog conforms to DCAT if:

A DCAT profile is a specification for data catalogs that adds additional constraints to DCAT. A data catalog that conforms to the profile also conforms to DCAT. Additional constraints in a profile MAY include:

Vocabulary Overview

From DCAT 2014 [[VOCAB-DCAT-20140116]] except as noted

DCAT is an RDF vocabulary well-suited to representing data catalogs such as data.gov and data.gov.uk. DCAT defines eight main classes:

A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset is a conceptual entity, and can be represented by one or more distributions that serialize the dataset for transfer. Distributions of a dataset can be provided via data distribution services. Detailed properties for a data distribution service API are out of the scope of this version of DCAT.

Datasets and data services, and potentially other types of thing, MAY be included in a catalog. Types of data service that might be found in a catalog include data distribution services, discovery services such as portals and catalog services, data transformation services such as coordinate transformation services, re-sampling and interpolation services, and various data processing services.

The scope of DCAT 2014 [[VOCAB-DCAT-20140116]] was limited to catalogs of datasets. A number of use cases for the revision involve also having data distribution services as members of a catalog - see DCAT Distribution to describe web services - ID6 and Modeling service-based data access - ID18. It has been decided to add an explicit class for Data Distribution Services in this revision of DCAT, and to enable these to be part of a Catalog. Provision for other data services to also be part of a Catalog will also be made, as well as for Catalogs to be composed of other Catalogs. See Issue #172.

Milestone 5 collects a number of issues related to this requirement.

A CatalogRecord describes an entry in the catalog. Notice that while dcat:Resource represents the dataset or service itself, dcat:CatalogRecord is the record that describes the registration of an item in the catalog. The use of dcat:CatalogRecord is considered optional. It is used to capture provenance information about entries in a catalog. If this distinction is not necessary then dcat:CatalogRecord can be safely ignored.

In DCAT 2014 [[VOCAB-DCAT-20140116]] dcat:Dataset was a sub-class of dctype:Dataset, which is a term of the DCMI Types vocabulary [[DCTERMS]]. This relationship has been removed in the revised DCAT vocabulary - see Issue #98.

[Figure: UML model of DCAT classes and properties. Overview of the DCAT model, showing the classes of resources that can be members of a Catalog and the relationships between them.]

All RDF examples in this document are written in Turtle syntax [[Turtle]].

Basic Example

This example provides a quick overview of how DCAT might be used to represent a government catalog and its datasets.

First, the catalog description:

   :catalog
       a dcat:Catalog ;
       dct:title "Imaginary Catalog" ;
       rdfs:label "Imaginary Catalog" ;
       foaf:homepage <http://example.org/catalog> ;
       dct:publisher :transparency-office ;
       dct:language <http://id.loc.gov/vocabulary/iso639-1/en>  ;
       dcat:dataset :dataset-001  , :dataset-002 , :dataset-003 ;
       .

The publisher of the catalog has the relative URI :transparency-office. Further description of the publisher can be provided as in the following example:

   :transparency-office
       a foaf:Organization ;
       rdfs:label "Transparency Office" ;
       .

The catalog lists each of its datasets via the dcat:dataset property. In the example above, an example dataset was mentioned with the relative URI :dataset-001. A possible description of it using DCAT is shown below:

   :dataset-001
       a dcat:Dataset ;
       dct:title "Imaginary dataset" ;
       dcat:keyword "accountability" , "transparency" , "payments" ;
       dct:issued "2011-12-05"^^xsd:date ;
       dct:modified "2011-12-05"^^xsd:date ;
       dcat:contactPoint <http://example.org/transparency-office/contact> ;
       dct:temporal <http://reference.data.gov.uk/id/quarter/2006-Q1> ;
       dct:spatial <http://www.geonames.org/6695072> ;
       dct:publisher :finance-ministry ;
       dct:language <http://id.loc.gov/vocabulary/iso639-1/en>  ;
       dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-W>  ;
       dcat:distribution :dataset-001-csv ;
       .

In order to express the frequency of update in the example above, we chose to use an instance from the Content-Oriented Guidelines developed as part of the W3C Data Cube Vocabulary [[VOCAB-DATA-CUBE]] efforts. Additionally, we chose to describe the spatial and temporal coverage of the example dataset using URIs from Geonames and the Interval dataset (originally available from http://reference.data.gov.uk/id/interval) from data.gov.uk, respectively. A contact point is also provided where comments and feedback about the dataset can be sent. Further details about the contact point, such as email address or telephone number, can be provided using vCard [[VCARD-RDF]].
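
One possible vCard description of the contact point is sketched below. The name, email address and telephone number are placeholders invented for illustration; only the contact point IRI comes from the dataset description above.

   @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .

   <http://example.org/transparency-office/contact>
       a vcard:Organization ;
       vcard:fn "Transparency Office contact point" ;     # placeholder name
       vcard:hasEmail <mailto:contact@example.org> ;      # placeholder address
       vcard:hasTelephone <tel:+1-555-0100> ;             # placeholder number
       .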

The dataset distribution :dataset-001-csv can be downloaded as a 5 KB (5120-byte) CSV file. This information is represented via an RDF resource of type dcat:Distribution.

   :dataset-001-csv
       a dcat:Distribution ;
       dcat:downloadURL <http://www.example.org/files/001.csv> ;
       dct:title "CSV distribution of imaginary dataset 001" ;
       dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
       dcat:byteSize "5120"^^xsd:decimal ;
       .

Classifying datasets thematically

The catalog classifies its datasets according to a set of domains represented by the relative URI :themes. SKOS can be used to describe the domains used:

   :catalog dcat:themeTaxonomy :themes .
   :themes
       a skos:ConceptScheme ;
       skos:prefLabel "A set of domains to classify documents" ;
       .
   :dataset-001 dcat:theme :accountability  .

Notice that this dataset is classified under the domain represented by the relative URI :accountability. It is recommended to define the concept as part of the concept scheme identified by the URI :themes that was used to describe the catalog domains. An example SKOS description:

   :accountability
       a skos:Concept ;
       skos:inScheme :themes ;
       skos:prefLabel "Accountability" ;
       .

Classifying dataset types

The scope of DCAT Datasets is much broader than tabular data, so a type/genre indication is required. A number of controlled vocabularies are available for classification of datasets according to type. In some cases, each individual classifier is denoted by a URI so it can be linked directly. However, in other cases there is only a text list, or a vocabulary is embedded in a document or code, so the way to indicate an individual item is less direct. Guidance about how to use items from various styles of controlled vocabulary is needed alongside the simple case.

The type or genre of a dataset MAY be indicated using the dct:type property:

   :dataset-001 dct:type dctype:Text .

Describing catalog records metadata

If the catalog publisher decides to keep metadata describing its records (i.e. the records containing metadata describing the datasets), dcat:CatalogRecord can be used. For example, while :dataset-001 was issued on 2011-12-05, its description on Imaginary Catalog was added on 2011-12-11. This can be represented in DCAT as follows:

   :catalog  dcat:record :record-001  .
   :record-001
       a dcat:CatalogRecord ;
       foaf:primaryTopic :dataset-001 ;
       dct:issued "2011-12-11"^^xsd:date ;
       .

Dataset available only behind some Web page

:dataset-002 is available as a CSV file. However, :dataset-002 can only be obtained through a Web page where the user needs to follow some links, provide some information and check some boxes before accessing the data.

   :dataset-002
       a dcat:Dataset ;
       dcat:landingPage <http://example.org/dataset-002.html> ;
       dcat:distribution :dataset-002-csv ;
       .
   :dataset-002-csv
       a dcat:Distribution ;
       dcat:accessURL <http://example.org/dataset-002.html> ;
       dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
       .

Notice the use of a dcat:landingPage and the definition of the dcat:Distribution instance.

A dataset available as a download and behind some Web page

On the other hand, :dataset-003 can be obtained through some landing page but also can be downloaded from a known URL.

   :dataset-003
       a dcat:Dataset ;
       dcat:landingPage <http://example.org/dataset-003.html> ;
       dcat:distribution :dataset-003-csv ;
       .
   :dataset-003-csv
       a dcat:Distribution ;
       dcat:downloadURL <http://example.org/dataset-003.csv> ;
       dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
       .

Notice that we used dcat:downloadURL with the downloadable distribution and that the other distribution accessible through the landing page does not have to be defined as a separate dcat:Distribution instance.

More examples needed

Need to add examples showing data distribution services and their relationship to datasets and distributions.

Need to add examples showing other data services.

Vocabulary specification

RDF representation

The DCAT RDF representation is modularized into several files or graphs to help users access a version of DCAT with just the alignments that they need. This mechanism can also be used to capture different levels of axiomatization, though the status of such proposals has not been finalized. See Issue #134 and the issues enumerated below.

Guidance on the use of DCAT in a weakly-axiomatized environment, such as schema.org, has been identified as a requirement to be satisfied in this revision of DCAT.

An RDF graph containing a proposed alignment of DCAT with schema.org is available. Comments on this alignment are invited.

The use of guarded constraints (existence, cardinality, range-type) to control the use of the recommended properties in the context of a class is being considered as part of the revision of DCAT.

The axiomatization of DCAT 2014 used global domain and range constraints for many of the properties defined in the DCAT namespace [[VOCAB-DCAT-20140116]]. This makes quite strong ontological commitments, some of which are now being reconsidered - see individual issues noted inline below.

The (revised) DCAT vocabulary is available in RDF. The primary artefact dcat.ttl is a serialization of the core DCAT vocabulary. Alongside it is a set of other RDF files that provide additional information, including:

  1. alignments to other vocabularies
  2. additional axioms, which can be useful in some contexts
  3. some profiles of DCAT, including a profile that corresponds to the 2014 version of DCAT [[VOCAB-DCAT-20140116]]

The implementation of a DCAT 2014 profile of the revised DCAT is being considered.

Dependencies

The definitions (including domain and range) of terms outside the DCAT namespace are provided here only for convenience and MUST NOT be considered normative. The authoritative definitions of these terms are in the corresponding specifications: [[!DC11]], [[!DCTERMS]], [[!FOAF]], [[!RDF-SCHEMA]], [[!SKOS-REFERENCE]], [[!XMLSCHEMA11-2]] and [[!VCARD-RDF]].

Description of DCAT vocabulary elements from DCAT 2014 [[VOCAB-DCAT-20140116]] except where indicated.

Class: Catalog

The following properties are recommended for use on this class: catalog record, hasPart, dataset, service, catalog, description, homepage, language, license, publisher, release date, rights, spatial, themes, title, update date

The scope of DCAT 2014 was limited to catalogs of datasets [[VOCAB-DCAT-20140116]]. A number of use cases for the revision involve also having data distribution services as members of a catalog - see DCAT Distribution to describe web services - ID6 and Modeling service-based data access - ID18. It has been decided to add an explicit class for Data Distribution Services in this revision of DCAT, and to enable these to be part of a Catalog. Provision for other services to also be part of a Catalog will also be made, as well as for Catalogs to be composed of other Catalogs. See Issue #172 and Issue #116.

RDF Class:dcat:Catalog
Sub-class of:Dataset
Definition:A curated collection of metadata about datasets and data services
Usage note:A web-based data catalog is typically represented as a single instance of this class.
See also: Catalog record, Dataset

Property: title

RDF Property:dct:title
Definition:A name given to the catalog.
Range:rdfs:Literal

Property: description

RDF Property:dct:description
Definition:A free-text account of the catalog.
Range:rdfs:Literal

Property: release date

RDF Property:dct:issued
Definition:Date of formal issuance (e.g., publication) of the catalog.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
See also: dataset release date, catalog record listing date and distribution release date
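
The choice of XML Schema datatype depends on the precision that is known. The sketch below shows three alternatives on three hypothetical catalogs (:catalog-a, :catalog-b and :catalog-c, in an assumed example.org namespace); the dates themselves are illustrative.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :catalog-a
       a dcat:Catalog ;
       dct:issued "2011-12-05"^^xsd:date ;                # day precision
       .
   :catalog-b
       a dcat:Catalog ;
       dct:issued "2011-12-05T09:30:00Z"^^xsd:dateTime ;  # full timestamp
       .
   :catalog-c
       a dcat:Catalog ;
       dct:issued "2011"^^xsd:gYear ;                     # only the year is known
       .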

Property: update/modification date

RDF Property:dct:modified
Definition:Most recent date on which the catalog was changed, updated or modified.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
See also: dataset modification date, catalog record modification date and distribution modification date

Property: language

RDF Property:dct:language
Definition:The language of the catalog. This refers to the language used in the textual metadata describing titles, descriptions, etc. of the datasets in the catalog.
Range: dct:LinguisticSystem
Resources defined by the Library of Congress (1, 2) SHOULD be used.
If an ISO 639-1 (two-letter) code is defined for the language, then its corresponding IRI SHOULD be used; if no ISO 639-1 code is defined, then the IRI corresponding to the ISO 639-2 (three-letter) code SHOULD be used.
Usage note:Multiple values can be used. The publisher might also choose to describe the language on the dataset level (see dataset language).
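
A sketch of a catalog with metadata in two languages follows. English has a two-letter code, so the ISO 639-1 IRI is used; Hawaiian has no two-letter code, so the example falls back to the three-letter code. The second IRI assumes that the Library of Congress ISO 639-2 vocabulary follows the same IRI pattern as the ISO 639-1 IRI used elsewhere in this document.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :catalog
       a dcat:Catalog ;
       dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,     # English, two-letter code
                    <http://id.loc.gov/vocabulary/iso639-2/haw> ;    # Hawaiian, three-letter code
       .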

Property: homepage

RDF Property:foaf:homepage
Definition:The homepage of the catalog.
Range:foaf:Document
Usage note:foaf:homepage is an inverse functional property (IFP) which means that it SHOULD be unique and precisely identify the catalog. This allows smushing various descriptions of the catalog when different URIs are used.

Property: publisher

RDF Property:dct:publisher
Definition:The entity responsible for making the catalog available online.
Usage note:Resources of type foaf:Agent are recommended as values for this property.
See also:Class: Organization/Person

Property: spatial/geographic

RDF Property:dct:spatial
Definition:The geographical area covered by the catalog.
Range:dct:Location

Property: themes

RDF Property:dcat:themeTaxonomy
Definition:The knowledge organization system (KOS) used to classify the catalog's datasets.
Domain:dcat:Catalog
Range:skos:ConceptScheme

Property: license

Models for the kind of license or rights representation indicated by the dct:license property are being considered as part of the revision of DCAT.

RDF Property:dct:license
Definition:This links to the license document under which the catalog is made available and not the datasets. Even if the license of the catalog applies to all of its datasets and distributions, it SHOULD be replicated on each distribution.
Range:dct:LicenseDocument
See also:catalog rights, distribution license
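
The replication recommended in the definition can be sketched as follows: the license stated on the catalog covers the catalog metadata, and the license that applies to the data is repeated on the distribution. The choice of the Creative Commons Attribution 4.0 license IRI is purely illustrative.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :catalog
       a dcat:Catalog ;
       dct:license <https://creativecommons.org/licenses/by/4.0/> ;   # license of the catalog itself
       dcat:dataset :dataset-001 ;
       .
   :dataset-001
       a dcat:Dataset ;
       dcat:distribution :dataset-001-csv ;
       .
   :dataset-001-csv
       a dcat:Distribution ;
       dct:license <https://creativecommons.org/licenses/by/4.0/> ;   # replicated on the distribution
       .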

Property: rights

RDF Property:dct:rights
Definition:This describes the rights under which the catalog can be used/reused and not the datasets. Even if these rights apply to all the catalog datasets and distributions, they SHOULD be replicated on each distribution.
Range:dct:RightsStatement
See also:catalog license, distribution rights

Property: hasPart

Explicit use of this property added in this revision of DCAT.

RDF Property:dct:hasPart
Definition:An item that is part of the catalog.
Domain:dcat:Catalog
Range:dcat:Resource
Usage note:This is the most general predicate for membership of a catalog. Use of a more specific sub-property is recommended when available.

Property: dataset

RDF Property:dcat:dataset
Definition:A dataset that is part of the catalog.
Sub property of:dct:hasPart
Domain:dcat:Catalog
Range:dcat:Dataset

Property: service

Property added in this revision of DCAT.

RDF Property:dcat:service
Definition:A service that is part of the catalog.
Sub property of:dct:hasPart
Domain:dcat:Catalog
Range:dcat:DataService

Property: catalog

Property added in this revision of DCAT.

RDF Property:dcat:catalog
Definition:Link from a catalog to another catalog whose contents are of interest in the context of this catalog
Sub property of:dcat:dataset
Domain:dcat:Catalog
Range:dcat:Catalog
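
A minimal sketch of one catalog pointing at another is shown below. The resource :regional-catalog and its title are hypothetical, and because dcat:catalog is new in this revision its definition might still change.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :catalog
       a dcat:Catalog ;
       dct:title "Imaginary Catalog" ;
       dcat:dataset :dataset-001 ;          # a dataset listed directly
       dcat:catalog :regional-catalog ;     # another catalog of interest
       .
   :regional-catalog
       a dcat:Catalog ;
       dct:title "Imaginary Regional Catalog" ;
       .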

Property: catalog record

RDF Property:dcat:record
Definition:A catalog record that is part of the catalog.
Domain:dcat:Catalog
Range:dcat:CatalogRecord

Class: Catalog record

The following properties are recommended for use on this class: description, listing date, primary topic, title, update date

The need to be able to express rights relating to the re-use of DCAT metadata has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to link a metadata record to its original source has been identified as a requirement to be satisfied in the revision of DCAT.

RDF Class:dcat:CatalogRecord
Definition:A record in a data catalog, describing the registration of a single dataset or data service.
Usage note:This class is optional and not all catalogs will use it. It exists for catalogs where a distinction is made between metadata about a dataset and metadata about the dataset's entry in the catalog. For example, the publication date property of the dataset reflects the date when the information was originally made available by the publishing agency, while the publication date of the catalog record is the date when the dataset was added to the catalog. In cases where both dates differ, or where only the latter is known, the publication date SHOULD only be specified for the catalog record. Notice that the W3C PROV Ontology [[PROV-O]] allows describing further provenance information such as the details of the process and the agent involved in a particular change to a dataset.
See also:Dataset

If a catalog is represented as an RDF Dataset with named graphs (as defined in [[SPARQL11-QUERY]]), then it is appropriate to place the description of each dataset (consisting of all RDF triples that mention the dcat:Dataset, dcat:CatalogRecord, and any of its dcat:Distributions) into a separate named graph. The name of that graph SHOULD be the IRI of the catalog record.

Property: title

RDF Property:dct:title
Definition:A name given to the record.
Range:rdfs:Literal

Property: description

RDF Property:dct:description
Definition:A free-text account of the record.
Range:rdfs:Literal

Property: listing date

RDF Property:dct:issued
Definition:The date of listing the corresponding dataset in the catalog.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
Usage note:This indicates the date of listing the dataset in the catalog and not the publication date of the dataset itself.
See also: dataset release date

Property: update/modification date

RDF Property:dct:modified
Definition:Most recent date on which the catalog entry was changed, updated or modified.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
Usage note:This indicates the date of last change of a catalog entry, i.e. the catalog metadata description of the dataset, and not the date of the dataset itself.
See also: dataset modification date

Property: primary topic

RDF Property:foaf:primaryTopic
Definition:Links the catalog record to the dcat:Dataset resource described in the record.
Usage note:The foaf:primaryTopic property is functional: each catalog record can have at most one primary topic, i.e. it describes one dataset.

Class: Data distribution service

In addition to the properties inherited from the super-class dcat:DataService, the following properties are recommended for use on this class: serves dataset

RDF Class:dcat:DataDistributionService
Definition:A site or end-point that provides access to datasets through distributions of the datasets
Sub class of:dcat:DataService
Usage note:
See also:

The class dcat:DataDistributionService has been added to the DCAT vocabulary in this revision.

Property: serves dataset

RDF Property:dcat:servesDataset
Definition:Link to description of a dataset that this DataDistributionService can distribute
Range:dcat:Dataset
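
A sketch of how a data distribution service might be linked to the datasets it can distribute is given below. The service :download-service and its title are hypothetical, and the class and property are new in this revision, so their definitions might still change.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :download-service
       a dcat:DataDistributionService ;
       dct:title "Imaginary download service" ;
       dcat:servesDataset :dataset-001 , :dataset-003 ;   # one service can serve several datasets
       .
   :catalog
       a dcat:Catalog ;
       dcat:service :download-service ;                   # the service is itself a catalog member
       .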

Class: Data Service

In addition to the properties inherited from the super-class dcat:Resource, the following properties are recommended for use on this class: endpointDescription, endpointURL, license, accessRights

RDF Class:dcat:DataService
Definition:A service for discovering, accessing or processing data or related resources.
Sub class of:dcat:Resource
Sub class of:dctype:Service
Usage note:
See also:

New class in this revision of DCAT.

Property: endpoint address

RDF Property:dcat:endpointURL
Definition:Link to service end-point.
Domain:dcat:DataService
Range:xsd:anyURI

Property: endpoint description

RDF Property:dcat:endpointDescription
Definition:Link to a description of the service end-point, for example an OpenAPI (Swagger) description, an OGC getCapabilities response, an SD Service, an OpenSearch or WSDL document.
Domain:dcat:DataService
Range:rdfs:Resource

Property: license

RDF Property:dct:license
Definition:This links to the license document under which the service is made available.
Range:dct:LicenseDocument
See also: distribution rights, catalog license

Property: access rights

RDF Property:dct:accessRights
Definition:Access Rights MAY include information regarding access or restrictions based on privacy, security, or other policies.
Range:dct:RightsStatement
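
Taken together, the properties recommended for a data service could be used as in the sketch below. All resource names and IRIs are hypothetical. The example assumes that the endpoint address property is dcat:endpointURL, in the same namespace as dcat:endpointDescription, and follows the stated range by giving the address as an xsd:anyURI literal.

   @prefix dcat: <http://www.w3.org/ns/dcat#> .
   @prefix dct:  <http://purl.org/dc/terms/> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
   @prefix :     <http://example.org/> .    # hypothetical base namespace

   :transform-service
       a dcat:DataService ;
       dct:title "Imaginary coordinate transformation service" ;
       dcat:endpointURL "http://example.org/transform"^^xsd:anyURI ;             # literal, per the stated range
       dcat:endpointDescription <http://example.org/transform/openapi.json> ;    # e.g. an OpenAPI description
       dct:license <https://creativecommons.org/licenses/by/4.0/> ;              # illustrative license
       dct:accessRights :registered-users ;
       .
   :registered-users
       a dct:RightsStatement ;
       rdfs:label "Available to registered users only" ;
       .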

Class: Dataset

In addition to the properties inherited from the super-class dcat:Resource, the following properties are recommended for use on this class: distribution, frequency, spatial coverage, temporal coverage

Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.

The need to more formally encode access restrictions for both datasets and distributions has been identified as a requirement to be satisfied in the revision of DCAT.

The need to provide richer descriptions of dataset aspects (e.g. instrument/sensor used, spatial feature, observable property, quantity kind) has been identified as a requirement to be satisfied in the revision of DCAT.

The need to provide better guidance and vocabulary elements for dataset citation has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to link a dataset with publications arising from it has been identified as a requirement to be satisfied in the revision of DCAT.

The need to provide a more comprehensive method for describing dataset provenance has been identified as a requirement to be satisfied in the revision of DCAT. A preliminary alignment of DCAT with PROV-O is available.

The need to be able to provide summary statistics about a dataset has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to provide usage notes for a dataset or distribution has been identified as a requirement to be satisfied in the revision of DCAT.

RDF Class:dcat:Dataset
Definition:A collection of data, published or curated by a single agent, and available for access or download in one or more formats.
Sub class of:dcat:Resource
Usage note:This class represents the actual dataset as published by the dataset publisher. In cases where a distinction between the actual dataset and its entry in the catalog is necessary (because metadata such as modification date and maintainer might differ), the catalog record class can be used for the latter.

In DCAT 2014 [[!VOCAB-DCAT-20140116]] dcat:Dataset was a sub-class of dctype:Dataset, which is a member of the DCMI Types vocabulary [[DCTERMS]]. This relationship has been removed in the revised DCAT vocabulary - see Issue #98.

Property: dataset distribution

RDF Property:dcat:distribution
Definition:Connects a dataset to its available distributions.
Domain:dcat:Dataset
Range:dcat:Distribution

Property: frequency

RDF Property:dct:accrualPeriodicity
Definition:The frequency at which the dataset is published.
Range:dct:Frequency (A rate at which something recurs)

Property: spatial/geographical coverage

The need to indicate the spatial reference system used in the spatial description of a dataset has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to describe the spatial coverage of a dataset as a geometry has been identified as a requirement to be satisfied in the revision of DCAT.

RDF Property:dct:spatial
Definition:Spatial coverage of the dataset.
Range:dct:Location (A spatial region or named place)

Property: temporal coverage

The need to be able to describe the temporal coverage of a dataset in a structured way has been identified as a requirement to be satisfied in the revision of DCAT.

RDF Property:dct:temporal
Definition:The temporal period that the dataset covers.
Range:dct:PeriodOfTime (An interval of time that is named or defined by its start and end dates)

Class: Discovery service

In addition to the properties inherited from the super-class dcat:DataDistributionService, the following properties are recommended for use on this class: none yet

RDF Class:dcat:DiscoveryService
Definition:A site or end-point that supports data discovery, usually by providing access to catalogs
Sub class of:dcat:DataDistributionService
Usage note:
See also:

The class dcat:DiscoveryService has been added to the DCAT vocabulary in this revision.

Class: Distribution

The following properties are recommended for use on this class: access URL, access service, byte size, conforms to, description, download URL, format, license, media type, release date, rights, title, update date

The definition of dcat:Distribution is being re-evaluated as part of the revision of DCAT, in particular to clarify that distributions are primarily representations of datasets. See also Issue #106.

The packaging of files in a dcat:Distribution is being considered as part of the revision of DCAT.

The need to provide a way to indicate the structure or schema of data in a dcat:Distribution (in addition to the media-type or serialization) has been identified as a requirement to be satisfied in the revision of DCAT.

The need to more formally encode access restrictions for both datasets and distributions has been identified as a requirement to be satisfied in the revision of DCAT.

The possible need to provide guidance regarding the extension of the notion "Distribution" is being considered for the revision of DCAT.

RDF class:dcat:Distribution
Definition:A specific representation of a dataset. A dataset might be available in several different forms, and these forms might be different serializations or different schematic arrangements of the same data. Examples of distributions include a CSV file, a netCDF file, or a data cube.
Usage note:This represents a general availability of a dataset; it implies no information about the actual access method of the data, i.e. whether by direct download, API, or through a Web page. The use of the dcat:downloadURL property indicates directly downloadable distributions.
See also:Data distribution service

The scope of dcat:Distribution here is narrower than in DCAT-2014 [[VOCAB-DCAT-20140116]], where it also included APIs and feeds. Data catalogues designed using DCAT-2014 therefore used individuals of type dcat:Distribution to describe data distribution services. Applications consuming DCAT should be aware that catalogues designed using DCAT-2014 might use dcat:Distribution to represent both services and representations.

Under the revised scope, individuals of type dcat:Distribution SHOULD be limited to representations of datasets which might be transported as files, and SHOULD NOT be used for data services such as APIs or feeds. Data services including APIs and feeds SHOULD be described using individuals of type dcat:DataService or dcat:DataDistributionService.

Links between a dcat:Distribution and services or web addresses where it can be accessed are expressed using dcat:accessURL, dcat:accessService, dcat:downloadURL, as shown in Figure 1 and described in the definitions below.

Property: title

RDF Property:dct:title
Definition:A name given to the distribution.
Range:rdfs:Literal

Property: description

RDF Property:dct:description
Definition:Free-text account of the distribution.
Range:rdfs:Literal

Property: release date

RDF Property:dct:issued
Definition:Date of formal issuance (e.g., publication) of the distribution.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
Usage note:This property SHOULD be set using the first known date of issuance.
See also: dataset release date

Property: update/modification date

RDF Property:dct:modified
Definition:Most recent date on which the distribution was changed, updated or modified.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
See also: dataset modification date

Property: license

RDF Property:dct:license
Definition:This links to the license document under which the distribution is made available.
Range:dct:LicenseDocument
Usage note:Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
See also: distribution rights, catalog license

Property: rights

RDF Property:dct:rights
Definition:Information about rights held in and over the distribution.
Range:dct:RightsStatement
Usage note:dct:license, which is a sub-property of dct:rights, can be used to link a distribution to a license document. However, dct:rights allows linking to a rights statement that can include licensing information as well as other information that supplements the licence such as attribution.
Information about licences and rights SHOULD be provided on the level of Distribution. Information about licences and rights MAY be provided for a Dataset in addition to but not instead of the information provided for the Distributions of that Dataset. Providing licence or rights information for a Dataset that is different from information provided for a Distribution of that Dataset SHOULD be avoided as this can create legal conflicts.
See also: distribution license, catalog rights
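The following Turtle is a minimal sketch of the usage notes for license and rights: both are attached at the level of the Distribution. The `:` namespace, the resource names and the wording of the rights statement are hypothetical, and the Creative Commons licence is only an example value.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :     <https://example.org/> .

:dataset-001 a dcat:Dataset ;
    dcat:distribution :dataset-001-csv .

# Licence and rights are stated on the distribution, not only on the dataset.
:dataset-001-csv a dcat:Distribution ;
    dct:license <https://creativecommons.org/licenses/by/4.0/> ;
    dct:rights :dataset-001-rights .

# A rights statement can carry information that supplements the licence,
# such as an attribution request (illustrative text).
:dataset-001-rights a dct:RightsStatement ;
    rdfs:label "Please attribute the (hypothetical) Example Transparency Office."@en .
```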

Property: access address

The granularity of dcat:accessURL is being re-considered to provide different usages for list and item endpoints as well as supporting the declaration of different profiles (for list results and data payload).

RDF Property:dcat:accessURL
Definition:Link to a resource that gives access to a distribution of the dataset. E.g. landing page, feed, SPARQL endpoint.
Domain:dcat:Distribution
Range:rdfs:Resource
Usage note:dcat:accessURL SHOULD be used for the address of a service or location that can provide access to this distribution, typically through a web form, query or API call.
dcat:downloadURL is preferred for direct links to downloadable resources.
If the distribution(s) are accessible only through a landing page (i.e. direct download URLs are not known), then the landingPage address associated with the dcat:Dataset SHOULD be duplicated as accessURL on a distribution. (see example 4.4)
See also:download address, access service

dcat:accessURL generally matches the property-chain dcat:accessService/dcat:endpointURL. In the RDF representation of DCAT this is axiomatized as an OWL property-chain axiom.
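As a sketch only, a property-chain axiom of this kind can be written in Turtle as shown below. The exact form used in the published RDF representation of DCAT might differ, and the `:` resources are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <https://example.org/> .

# dcat:accessURL is implied by following dcat:accessService, then dcat:endpointURL.
dcat:accessURL owl:propertyChainAxiom ( dcat:accessService dcat:endpointURL ) .

# Illustrative data: a distribution served by a service with a known endpoint.
:dataset-001-dist dcat:accessService :query-service .
:query-service dcat:endpointURL <https://example.org/sparql> .
```

Under this axiom, an OWL 2 reasoner could infer `:dataset-001-dist dcat:accessURL <https://example.org/sparql>` from the two illustrative triples.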

Property: access service

New property in this revision of DCAT.

RDF Property:dcat:accessService
Definition:A service that gives access to the distribution of the dataset
Sub-property of:dcat:accessURL
Range:dcat:DataDistributionService
Usage note:dcat:accessService SHOULD be used to link to a description of a dcat:DataDistributionService that can provide access to this distribution.
See also:download address, access address

Property: download address

RDF Property:dcat:downloadURL
Definition:Direct link to a downloadable file in a given format. E.g. CSV file or RDF file. The format is indicated by the distribution's dct:format and/or dcat:mediaType
Domain:dcat:Distribution
Range:rdfs:Resource
Usage note:dcat:downloadURL SHOULD be used for the address at which this distribution is available directly, typically through an HTTP GET request.
See also:access address, access service
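The difference between the access address, access service and download address properties can be sketched as follows. All resource names and URLs are hypothetical, chosen only to contrast a directly downloadable file with a distribution reached through a service.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix :     <https://example.org/> .

# Case 1: the file itself can be fetched directly.
:dataset-001-csv a dcat:Distribution ;
    dcat:downloadURL <https://example.org/files/dataset-001.csv> .

# Case 2: the distribution is obtained through a service,
# for example by submitting a query.
:dataset-001-rdf a dcat:Distribution ;
    dcat:accessURL <https://example.org/sparql> ;
    dcat:accessService :query-service .

# The service is described separately from the distribution.
:query-service a dcat:DataDistributionService ;
    dcat:endpointURL <https://example.org/sparql> .
```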

Property: byteSize

The axiomatization of dcat:byteSize is being re-evaluated as part of the revision of DCAT.

RDF Property:dcat:byteSize
Definition:The size of a distribution in bytes.
Domain:dcat:Distribution
Range:rdfs:Literal typed as xsd:decimal.
Usage note:The size in bytes can be approximated when the precise size is not known.

Property: conforms to

New property in this context in this revision of DCAT.

RDF Property:dct:conformsTo
Definition:An established standard to which the described resource conforms.
Range:dct:Standard (A basis for comparison; a reference point against which other things can be evaluated.)
Usage note:This property SHOULD be used to indicate the model, schema, ontology, view or profile that this representation conforms to. This is (generally) a complementary concern to the media-type or format.
See also:format, media type

dct:Standard is defined as "A basis for comparison; a reference point against which other things can be evaluated." It is not restricted to formal standards issued by bodies like ISO and W3C. In this context it will usually be used for a schema, ontology, data model or profile which specifies the structure of a dataset. This is not necessarily tied to a single encoding or serialization.

Property: media type

The axiomatization of dcat:mediaType is being re-evaluated as part of the revision of DCAT.

RDF Property:dcat:mediaType
Definition:The media type of the distribution as defined by IANA [[!IANA-MEDIA-TYPES]].
Sub property of:dct:format
Domain:dcat:Distribution
Range:dct:MediaTypeOrExtent
Usage note:This property SHOULD be used when the media type of the distribution is defined in IANA [[!IANA-MEDIA-TYPES]], otherwise dct:format MAY be used with different values.
See also:format, conforms to

Property: format

RDF Property:dct:format
Definition:The file format of the distribution.
Range:dct:MediaTypeOrExtent
Usage note: dcat:mediaType SHOULD be used if the type of the distribution is defined by IANA [[!IANA-MEDIA-TYPES]].
See also:media type, conforms to
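A minimal sketch of how byte size, conforms to, media type and format might be combined on distributions is given below. The resource names, the size and the schema IRI are hypothetical. Identifying the media type through an IANA-based IRI is one possible choice, assumed here because the range is dct:MediaTypeOrExtent.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <https://example.org/> .

# A distribution whose media type is registered with IANA.
:dataset-001-csv a dcat:Distribution ;
    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
    # The size can be an approximation when the exact value is not known.
    dcat:byteSize "5120"^^xsd:decimal ;
    # The schema or profile the data follows, independent of its serialization.
    dct:conformsTo :budget-table-schema .

# A distribution in a format with no IANA media type (hypothetical format).
:dataset-001-bin a dcat:Distribution ;
    dct:format :in-house-binary-format .
```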

Class: Catalogued resource

The following properties are recommended for use on this class: conformsTo, contact point, description, identifier, keyword, landing page, language, publisher, release date, theme, title, type, update date

The possible association of items with zero or multiple catalogs has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to link a catalogued resource with the source of funding that supported its production has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to describe the business or project context related to production of a catalogued resource has been identified as a requirement to be satisfied in the revision of DCAT.

The need to be able to link a catalogued resource with the business or project context of its production has been identified as a requirement to be satisfied in the revision of DCAT.

The need to describe relationships between catalogued resources has been identified as a requirement to be satisfied in the revision of DCAT.

RDF Class:dcat:Resource
Definition:Resource published or curated by a single agent.
Usage note:The class of all catalogued resources, the superclass of dcat:Dataset, dcat:DataService, dcat:Catalog and any other member of a dcat:Catalog. This class carries properties common to all catalogued resources, including datasets and data services. It is strongly recommended to use a more specific sub-class when available.
See also:Catalog record

The class dcat:Resource has been added to the DCAT vocabulary in this revision.

Property: title

RDF Property:dct:title
Definition:A name given to the item.
Range:rdfs:Literal

Property: description

RDF Property:dct:description
Definition:Free-text account of the item.
Range:rdfs:Literal

Property: release date

RDF Property:dct:issued
Definition:Date of formal issuance (e.g., publication) of the item.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
Usage note:This property SHOULD be set using the first known date of issuance.

Property: update/modification date

RDF Property:dct:modified
Definition:Most recent date on which the item was changed, updated or modified.
Range:rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [[!DATETIME]] and typed using the appropriate XML Schema datatype [[XMLSCHEMA11-2]]
Usage note:The value of this property indicates a change to the actual item, not a change to the catalog record. An absent value MAY indicate that the item has never changed after its initial publication, or that the date of last modification is not known, or that the item is continuously updated.
See also:frequency
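As an illustration of the title, description, release date and modification date properties, the sketch below describes a hypothetical dataset. The names, text and dates are invented. The dates are ISO 8601 strings typed with XML Schema datatypes, as the range definitions require.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <https://example.org/> .

:dataset-001 a dcat:Dataset ;
    dct:title "Imaginary dataset"@en ;
    dct:description "A free-text account of an imaginary dataset."@en ;
    # First known date of issuance, typed as a date.
    dct:issued "2011-12-05"^^xsd:date ;
    # Most recent change to the item itself, not to its catalog record.
    dct:modified "2012-03-14T09:30:00Z"^^xsd:dateTime .
```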

Property: language

RDF Property:dct:language
Definition:The language of the item.
Range:dct:LinguisticSystem
Resources defined by the Library of Congress (1, 2) SHOULD be used.
If an ISO 639-1 (two-letter) code is defined for the language, then its corresponding IRI SHOULD be used; if no ISO 639-1 code is defined, then the IRI corresponding to the ISO 639-2 (three-letter) code SHOULD be used.
Usage note:
  • This overrides the value of the catalog language in case of conflict.
  • If the item is available in multiple languages, use multiple values for this property. If each language is available separately for a dataset, define an instance of dcat:Distribution for each language and describe the specific language of each distribution using dct:language (i.e. the dataset will have multiple dct:language values and each distribution will have one of these languages as value of its dct:language property).
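The multi-language pattern in the usage note can be sketched as follows: the dataset lists every language, and each distribution lists only its own. The dataset and distribution names are hypothetical, and the language IRIs follow the Library of Congress ISO 639-1 pattern.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix :     <https://example.org/> .

# The dataset carries one dct:language value per available language.
:dataset-001 a dcat:Dataset ;
    dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
                 <http://id.loc.gov/vocabulary/iso639-1/it> ;
    dcat:distribution :dataset-001-en , :dataset-001-it .

# Each distribution states the single language it is available in.
:dataset-001-en a dcat:Distribution ;
    dct:language <http://id.loc.gov/vocabulary/iso639-1/en> .

:dataset-001-it a dcat:Distribution ;
    dct:language <http://id.loc.gov/vocabulary/iso639-1/it> .
```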

Property: publisher

The desire to have qualified forms of properties (as done in [[PROV-O]]) has been raised. If dcat:Dataset is a prov:Entity (not decided yet) then `publisher` could be a qualified prov:Entity -> prov:Agent relationship.

RDF Property:dct:publisher
Definition:An entity responsible for making the item available.
Usage note:Resources of type foaf:Agent are recommended as values for this property.
See also:Class: Organization/Person

Property: identifier

The desirability of dereferenceable identifiers has been raised as an item for consideration in the revision of DCAT. dct:identifier has limited expressivity for this. It has been suggested that ADMS identifier, or a property from another ontology might be recruited to help DCAT in this area.

The need to clearly distinguish between primary and legacy identifiers for a dataset has been identified as a requirement to be satisfied in the revision of DCAT.

The need to indicate the scheme or authority for identifiers for a dataset has been identified as a requirement to be satisfied in the revision of DCAT.

RDF Property:dct:identifier
Definition:A unique identifier of the item.
Range:rdfs:Literal
Usage note:The identifier might be used as part of the URI of the item, but still having it represented explicitly is useful.

Property: theme/category

In DCAT 2014 [[VOCAB-DCAT-20140116]] the domain of dcat:theme was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #123.

RDF Property:dcat:theme
Definition:The main category of the resource. A resource can have multiple themes.
Sub property of:dct:subject
Range:skos:Concept
Usage note:The set of skos:Concepts used to categorize the resources are organized in a skos:ConceptScheme describing all the categories and their relations in the catalog.
See also:catalog themes taxonomy

Property: type/genre

Added in DCAT revision - see Issue #64.

RDF Property:dct:type
Definition:The nature or genre of the resource.
Sub property of:dc:type
Range:rdfs:Class
Usage note:Recommended best practice is to use a value from a controlled vocabulary, such as:
  1. DCMI Type vocabulary [[DCTERMS]]
  2. ISO 19115 scope codes [[ISO-19115-1]]
  3. Datacite resource types [[DataCite]]
  4. PARSE.Insight content-types used by re3data.org [[RE3DATA-SCHEMA]] (see item 15 contentType)
  5. MARC intellectual resource types
To describe the file format, physical medium, or dimensions of the resource, use the dct:format element.

Property: conforms to

RDF Property:dct:conformsTo
Definition:An established standard to which the described resource conforms.
Range:dct:Standard (A basis for comparison; a reference point against which other things can be evaluated.)

Property: keyword/tag

In DCAT 2014 [[VOCAB-DCAT-20140116]] the domain of dcat:keyword was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #121.

RDF Property:dcat:keyword
Definition:A keyword or tag describing the resource.
Range:rdfs:Literal

Property: contact point

In DCAT 2014 [[VOCAB-DCAT-20140116]] the domain of dcat:contactPoint was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #95.

The axiomatization of dcat:contactPoint is being re-evaluated as part of the revision of DCAT.

RDF Property:dcat:contactPoint
Definition:Link a dataset to relevant contact information which is provided using vCard [[VCARD-RDF]].
Range:vcard:Kind
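A minimal sketch of a contact point expressed with vCard is shown below. The office name and e-mail address are hypothetical. vcard:Organization is used here as one of the sub-classes of vcard:Kind.

```turtle
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
@prefix :      <https://example.org/> .

:dataset-001 a dcat:Dataset ;
    dcat:contactPoint :transparency-office-contact .

# Contact details are given on a separate vCard resource.
:transparency-office-contact a vcard:Organization ;
    vcard:fn "Transparency Office (hypothetical)" ;
    vcard:hasEmail <mailto:contact@example.org> .
```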

Property: landing page

In DCAT 2014 [[!VOCAB-DCAT-20140116]] the domain of dcat:landingPage was dcat:Dataset, which limited use of this property in other contexts. The domain has been relaxed in this revision - see Issue #122.

RDF Property:dcat:landingPage
Definition:A Web page that can be navigated to in a Web browser to gain access to the dataset, its distributions and/or additional information.
Sub property of:foaf:page
Range:foaf:Document
Usage note: If the distribution(s) are accessible only through a landing page (i.e. direct download URLs are not known), then the landing page link SHOULD be duplicated as dcat:accessURL on a distribution. (see example 4.4)
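The landing page usage note can be sketched as follows, assuming a hypothetical dataset whose files are reachable only through a Web page. The same address appears as the landing page of the dataset and as the access address of its distribution.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix :     <https://example.org/> .

:dataset-002 a dcat:Dataset ;
    dcat:landingPage <https://example.org/dataset-002.html> ;
    dcat:distribution :dataset-002-csv .

# No direct download URL is known, so the landing page is repeated
# as the access address of the distribution.
:dataset-002-csv a dcat:Distribution ;
    dcat:accessURL <https://example.org/dataset-002.html> .
```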

Class: Concept scheme

RDF Class:skos:ConceptScheme
Definition:The knowledge organization system (KOS) used to represent themes/categories of datasets in the catalog.
See also:catalog themes, dataset theme

Class: Concept

RDF Class:skos:Concept
Definition:A category or a theme used to describe datasets in the catalog.
Usage note:It is recommended to use either skos:inScheme or skos:topConceptOf on every skos:Concept used to classify datasets to link it to the concept scheme it belongs to. This concept scheme is typically associated with the catalog using dcat:themeTaxonomy
See also:catalog themes, dataset theme
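The relationship between a catalog, its concept scheme and the concepts used as themes can be sketched as below. All names and labels are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix :     <https://example.org/> .

# The catalog declares the concept scheme it uses for themes.
:catalog a dcat:Catalog ;
    dcat:themeTaxonomy :themes ;
    dcat:dataset :dataset-001 .

:themes a skos:ConceptScheme ;
    skos:prefLabel "A set of domains to classify datasets"@en .

# Each concept is linked back to its scheme with skos:inScheme.
:accountability a skos:Concept ;
    skos:inScheme :themes ;
    skos:prefLabel "Accountability"@en .

:dataset-001 a dcat:Dataset ;
    dcat:theme :accountability .
```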

Class: Organization/Person

RDF Classes:foaf:Person for people and foaf:Organization for government agencies or other entities.
Usage note:[[!FOAF]] provides sufficient properties to describe these entities.
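As an illustration, a publisher described with FOAF might look like the sketch below. The organization, its name and its homepage are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix :     <https://example.org/> .

:dataset-001 a dcat:Dataset ;
    dct:publisher :finance-ministry .

# foaf:Organization is a sub-class of foaf:Agent.
:finance-ministry a foaf:Organization ;
    foaf:name "Finance Ministry (hypothetical)" ;
    foaf:homepage <https://example.org/finance-ministry> .
```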

Quality information

This section is non-normative as it provides guidance on how to document the quality of DCAT first-class entities (e.g., datasets, distributions) and it does not define new DCAT terms. The guidance relies on the Data Quality Vocabulary (DQV) [[vocab-dqv]], which is a W3C Group Note.

The need to choose or define a data quality model has been identified as a requirement to be satisfied in the revision of DCAT.

The Data Quality Vocabulary (DQV) offers common modelling patterns for different aspects of data quality. It can relate DCAT datasets and distributions with different types of quality information. Each type of quality information can pertain to one or more quality dimensions, namely, quality characteristics relevant to the consumer. Viewing quality as a multi-dimensional space is an established practice in the field of quality management, because it splits quality management into addressable chunks.

DQV does not define a normative list of quality dimensions. It offers the quality dimensions proposed in ISO/IEC 25012 [[ISOIEC25012]] and Zaveri et al. [[ZaveriEtAl]] as two possible starting points. It also provides an RDF representation for the quality dimensions and categories defined in the latter. Ultimately, implementers will need to choose for themselves the collection of quality dimensions that best fits their needs. The following section shows how DCAT and DQV can be coupled to describe the quality of datasets and distributions. For a comprehensive introduction and further examples of use, please refer to the Data Quality Vocabulary (DQV) group note [[vocab-dqv]].

The following examples make no comments on where the quality information would reside and how it is managed. That is out of scope for the DCAT vocabulary. The assumption made is that the quality individuals are available using the URIs indicated. In addition, the examples, and DQV more generally, are neutral with respect to data portal design choices about how quality information is collected. For example, data portals can collect DQV instances by implementing a specific UI for annotating data or by taking input from third-party services.

We might want to include examples of quality documentation related to services.

Providing quality information

A data consumer (:consumer1) describes the quality of the dataset :genoaBusStopsDataset, which includes a georeferenced list of bus stops in Genoa. The consumer annotates the dataset with a DQV quality note (:genoaBusStopsDatasetCompletenessNote) about data completeness (ldqd:completeness) to warn that the dataset includes only 20500 out of the 30000 stops.

:genoaBusStopsDataset a dcat:Dataset ;
    dqv:hasQualityAnnotation :genoaBusStopsDatasetCompletenessNote .

:genoaBusStopsDatasetCompletenessNote
    a dqv:UserQualityFeedback ;
    oa:hasTarget :genoaBusStopsDataset ;
    oa:hasBody :textBody ;
    oa:motivatedBy dqv:qualityAssessment ;
    prov:wasAttributedTo :consumer1 ;
    prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
    dqv:inDimension ldqd:completeness
    .

:textBody a oa:TextualBody ;
    rdf:value "Incomplete dataset: it contains only 20500 out of 30000 existing bus stops" ;
    dc:language "en" ;
    dc:format "text/plain"
    .
    

The activity :myQualityChecking employs the service :myQualityChecker to check the quality of the :genoaBusStopsDataset dataset. The metric :completenessWRTExpectedNumberOfEntities is applied to measure the dataset completeness (ldqd:completeness) and it results in the quality measurement :genoaBusStopsDatasetCompletenessMeasurement.

:genoaBusStopsDataset
    dqv:hasQualityMeasurement :genoaBusStopsDatasetCompletenessMeasurement .

:genoaBusStopsDatasetCompletenessMeasurement
    a dqv:QualityMeasurement ;
    dqv:computedOn :genoaBusStopsDataset ;
    dqv:isMeasurementOf :completenessWRTExpectedNumberOfEntities ;
    dqv:value "0.6833333"^^xsd:decimal  ;
    prov:wasAttributedTo :myQualityChecker ;
    prov:generatedAtTime "2018-05-27T02:52:02Z"^^xsd:dateTime ;
    prov:wasGeneratedBy :myQualityChecking
    .

:completenessWRTExpectedNumberOfEntities
    a dqv:Metric ;
    skos:definition "it returns the degree of completeness as ratio between the actual number of entities included in the dataset and the declared expected number of entities."@en ;
    dqv:expectedDataType xsd:decimal ;
    dqv:inDimension ldqd:completeness .

# :myQualityChecker is a service computing some quality metrics
:myQualityChecker
    a prov:SoftwareAgent ;
    rdfs:label "A quality assessment service"^^xsd:string .
    # Further details about quality service/software can be provided, for example,
    # deploying  vocabularies such as Dataset Usage Vocabulary (DUV), Dublin Core or ADMS.SW

# :myQualityChecking is the activity that has generated :genoaBusStopsDatasetCompletenessMeasurement from :genoaBusStopsDataset
:myQualityChecking
    a prov:Activity;
    rdfs:label "The checking of genoaBusStopsDataset's quality"^^xsd:string;
    prov:wasAssociatedWith :myQualityChecker;
    prov:used              :genoaBusStopsDataset;
    prov:generated         :genoaBusStopsDatasetCompletenessMeasurement;
    prov:endedAtTime      "2018-05-27T02:52:02Z"^^xsd:dateTime;
    prov:startedAtTime     "2018-05-27T00:52:02Z"^^xsd:dateTime .
    

Provenance patterns

A number of requirements identify the need to provide better support for Dataset and Record provenance - see Issue #78, Issue #77, Issue #76, Issue #71, Issue #66, Issue #63, Issue #57. It has been suggested that many of the requirements can be satisfied by using capabilities from the [[PROV-O]] ontology, in particular by treating dcat:Dataset and/or dcat:CatalogRecord as a sub-class of prov:Entity. A preliminary alignment of DCAT with PROV-O is available.

In this chapter it is planned to describe patterns for the use of the [[PROV-O]] vocabulary to support the various provenance-related requirements. See Milestone 2 for more discussion.

License and rights statements

The DCAT 2014 handling of license and rights does not appear to satisfy all requirements [[VOCAB-DCAT-20140116]]. The recently completed W3C ODRL vocabulary [[ODRL-VOCAB]] provides a rich language for describing many kinds of rights and obligations. In this chapter it is planned to describe some patterns for linking DCAT Datasets and/or Distributions to suitable rights expressions. See Milestone 1 for more discussion.

Dataset versions

The need to be able to describe version relationships of datasets has been identified as a requirement to be satisfied in the revision of DCAT. Also see detailed requirements in Issue #89, Issue #91, Issue #92, Issue #93.

In this chapter it is planned to describe some patterns for describing Dataset and/or Distribution versions. See Milestone 6 for more discussion.

Alignment with other metadata vocabularies

An alignment of DCAT with PROV-O [[PROV-O]] is being prepared. A provisional version is available.

An alignment of DCAT with DATS [[DATS]] is being considered.

An alignment of DCAT with HCLS [[HCLS-Dataset]] is being prepared.

An alignment of DCAT with ISO 19115 [[ISO-19115-1]] is being prepared.

An alignment of DCAT with Datacite metadata schema [[DataCite]] is being prepared.

An alignment of DCAT with DDI [[DDI]] is being prepared.

An alignment of DCAT with schema.org is being prepared. A provisional version is available.

See Milestone 4 for more discussion.

DCAT Profiles

DCAT provides a generic metadata vocabulary for cataloguing datasets. Profiles of DCAT are required for specific applications and disciplines. Providing a model and formalization for DCAT profiles is planned to be an important part of this revision. Also see Issue #73, Issue #74, Issue #75.

See Milestone 3 for more discussion.

Relation to other W3C Recommendations

DCAT should be aligned with other recent Linked Data based Recommendations.

Linked Data Platform (LDP)

DCAT provides a data model for representation of metadata about datasets in the form of Linked Data, but it does not specify how this metadata can be accessed or modified. DCAT-compatible metadata can be viewed as collections of Catalog Records, Datasets and Data Services contained in a Catalog, and a collection of Distributions contained in a Dataset. The Linked Data Platform [[ldp]] specification deals with access to and modification of Linked Data Platform Containers (LDPCs). This section provides guidance on how to represent DCAT metadata as LDP Containers, which supports, for example, the implementation of Solid-based DCAT catalogs.

First, we present an example of an LDPC for datasets in a catalog. There is one catalog with one dataset. The dataset is contained in the </datasets/> LDP Direct Container. To ensure that the LDPC can be discovered, we connect it to the Catalog using the dcat:datasets predicate.

Example 1
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .

@base <https://example.org/resource/catalog> .

<> a dcat:Catalog ;
	dcat:datasets </datasets/> ;
	dcat:dataset </datasets/001> .

</datasets/> a ldp:Container, ldp:DirectContainer ;
	ldp:membershipResource <> ;
	ldp:hasMemberRelation dcat:dataset ;
	ldp:contains </datasets/001> .

</datasets/001> a dcat:Dataset .

In the second example, we add LDPCs </records/> for Catalog Records and </services/> for Data Services, discoverable using dcat:records and dcat:services predicates from the Catalog:

Example 2
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

@base <https://example.org/resource/catalog> .

<> a dcat:Catalog ;
	dcat:records </records/> ;
	dcat:datasets </datasets/> ;
	dcat:services </services/> ;
	dcat:dataset </datasets/001> .

</records/> a ldp:Container, ldp:DirectContainer ;
	ldp:membershipResource <> ;
	ldp:hasMemberRelation dcat:record ;
	ldp:contains </records/001> .

</datasets/> a ldp:Container, ldp:DirectContainer ;
	ldp:membershipResource <> ;
	ldp:hasMemberRelation dcat:dataset ;
	ldp:contains </datasets/001> .

</services/> a ldp:Container, ldp:DirectContainer ;
	ldp:membershipResource <> ;
	ldp:hasMemberRelation dcat:service ;
	ldp:contains </services/001> .

</records/001> a dcat:CatalogRecord ;
	foaf:primaryTopic </datasets/001> .

</datasets/001> a dcat:Dataset .

</services/001> a dcat:DataService .

Each dataset has its own LDPC for its distributions. In the third example, we show the LDPC </datasets/001/distributions/> for distributions of a single dataset, </datasets/001>, discoverable through the dcat:distributions predicate.

Example 3
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .

@base <https://example.org/resource/catalog> .

</datasets/001> a dcat:Dataset ;
	dcat:distributions </datasets/001/distributions/> ;
	dcat:distribution </datasets/001/distributions/001> .

</datasets/001/distributions/> a ldp:Container, ldp:DirectContainer ;
	ldp:membershipResource </datasets/001> ;
	ldp:hasMemberRelation dcat:distribution ;
	ldp:contains </datasets/001/distributions/001> .

</datasets/001/distributions/001> a dcat:Distribution .

For catalogs with many datasets, catalog records, data services or distributions, the Linked Data Platform Paging mechanism [[ldp-paging]] SHOULD be used to provide access to them.

In the next sections we formally define the additional properties used for discovery of LDP containers.

Property: datasets

RDF Property:dcat:datasets
Definition:Connects a catalog to the LDP container of its datasets.
Domain:dcat:Catalog
Range:ldp:DirectContainer

Property: catalog records

RDF Property:dcat:records
Definition:Connects a catalog to the LDP container of its catalog records.
Domain:dcat:Catalog
Range:ldp:DirectContainer

Property: data services

RDF Property:dcat:services
Definition:Connects a catalog to the LDP container of its data services.
Domain:dcat:Catalog
Range:ldp:DirectContainer

Property: distributions

RDF Property:dcat:distributions
Definition:Connects a dataset to the LDP container of its distributions.
Domain:dcat:Dataset
Range:ldp:DirectContainer

Linked Data Notifications (LDN)

Linked Data Notifications (LDN) [[ldn]] can be used with DCAT e.g. for feedback collection. Any resource can have an LDN Inbox. In the following example we show a dataset </datasets/001> as an LDN Target with an LDN Inbox.

Example 4
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ldp:  <http://www.w3.org/ns/ldp#> .

@base <https://example.org/resource/catalog> .

</datasets/001> a dcat:Dataset ;
	ldp:inbox </datasets/001/inbox/> .

</datasets/001/inbox/> ldp:contains </datasets/001/inbox/001> .

Acknowledgments

The editors gratefully acknowledge the contributions made to this document by all members of the working group.

The editors also gratefully acknowledge the chairs of this Working Group: Karen Coyle, Caroline Burle and Peter Winstanley — and staff contacts Phil Archer and Dave Raggett.

Change history

A full change-log is available on GitHub.

Changes since the W3C Recommendation of 16 January 2014

The document has undergone the following changes since the W3C Recommendation of 16 January 2014 [[VOCAB-DCAT-20140116]]: