Dataset Exchange Working Group Charter (Draft)
The mission of the Dataset Exchange WG is to:
- Revise the Data Catalog Vocabulary, DCAT, taking account of related vocabularies and the extensive work done in developing a number of its application profiles.
- Facilitate and encourage the use of application profiles when requesting and serving data on the Web.
|Start date||May/June 2017|
|End date||30 June 2019|
|Chairs||Caroline Burle, NIC.br; Karen Coyle, Dublin Core Metadata Initiative|
|Team Contacts||Phil Archer (0.2 FTE), supported by the VRE4EIC project|
|Meeting Schedule||Teleconferences: 1-hour calls will be held weekly. Face-to-face: twice per year, expected to include the W3C's annual Technical Plenary week.|
Sharing data between researchers, governments and citizens, whether openly or not, requires the provision of metadata. Different communities use different metadata standards to describe their datasets, some of which are highly specialized. At a general level, W3C’s Data Catalog Vocabulary, DCAT, is in widespread use, but so too are CKAN’s native schema, schema.org's dataset description vocabulary, ISO 19115, DDI, SDMX, CERIF, VoID, INSPIRE and, in the healthcare and life sciences domain, the Dataset Description vocabulary and DATS (ref) among others.
This variety is a clear indication that no single vocabulary offers a complete and universally accepted solution. For example, catalogs increasingly provide APIs to the datasets they contain, yet the current version of DCAT lacks a way to describe these APIs (it only fully supports the discovery of static datasets). A description of a data API that is rich enough to allow programmatic conversion of the data into other forms makes that data more available and interoperable, and therefore more reusable.
Maximizing interoperability between services such as data catalogs, e-Infrastructures and virtual research environments requires not just the use of standard vocabularies but of application profiles. These provide cardinality constraints and/or enumerated lists of allowed values such that data can be validated. The development of several application profiles based on DCAT is particularly noteworthy in this regard.
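As a rough illustration of what validation against an application profile involves, the following Python sketch checks a metadata record against cardinality constraints and an enumerated list of allowed values. The profile structure, the property names and the allowed values are hypothetical and chosen only for this example; they are not taken from DCAT or from any published profile.

```python
# Hypothetical profile: each property has a (min, max) cardinality and,
# optionally, an enumerated set of allowed values. None means "unbounded".
PROFILE = {
    "title": {"min": 1, "max": 1},
    "keyword": {"min": 0, "max": None},
    "accrualPeriodicity": {
        "min": 0,
        "max": 1,
        "allowed": {"daily", "weekly", "monthly"},
    },
}


def validate(record, profile):
    """Return a list of violations; an empty list means the record conforms.

    `record` maps property names to lists of values.
    """
    errors = []
    for prop, rule in profile.items():
        values = record.get(prop, [])
        count = len(values)
        if count < rule["min"]:
            errors.append(
                f"{prop}: expected at least {rule['min']} value(s), found {count}"
            )
        if rule["max"] is not None and count > rule["max"]:
            errors.append(
                f"{prop}: expected at most {rule['max']} value(s), found {count}"
            )
        allowed = rule.get("allowed")
        if allowed is not None:
            for value in values:
                if value not in allowed:
                    errors.append(f"{prop}: value {value!r} is not in the allowed list")
    return errors
```

In practice a profile would more likely be expressed in a dedicated constraint language than in an ad hoc dictionary; the sketch only shows the kind of check that such a profile makes possible.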
Rather than limit the number of metadata standards and application profiles in use, systems should be able to expose and ingest (meta)data according to multiple standards through a transparent and sustainable interface. We thus need a mechanism for servers to indicate the available application profiles, and for clients to choose an appropriate one. This leads to the concept of content negotiation by application profile, which is orthogonal to content negotiation by data format and language. It is expected that a new RFC on this topic will be developed and published in parallel with the work of the Dataset Exchange Working Group.
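The selection step can be sketched as follows, assuming the client expresses its preferences as a comma-separated list of profile identifiers with optional q-values, by analogy with existing Accept headers. That syntax, the function names and the example.org profile URIs are all assumptions made for illustration; the charter leaves the actual mechanism to the expected RFC.

```python
def parse_preferences(header_value):
    """Parse 'uri;q=0.8, uri2' into [(uri, q)], highest q first.

    A missing q defaults to 1.0; an unparsable q is treated as 0.0.
    """
    prefs = []
    for item in header_value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        uri, q = parts[0], 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((uri, q))
    # sorted() is stable, so equal q-values keep the client's original order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_profile(header_value, available, default=None):
    """Return the client's most preferred profile that the server offers."""
    for uri, q in parse_preferences(header_value):
        if q > 0 and uri in available:
            return uri
    return default


# Hypothetical profiles offered by a server.
AVAILABLE = {"http://example.org/profile/a", "http://example.org/profile/b"}
```

Because the choice of profile is independent of the choice of format and language, a server would run this step alongside, not instead of, its ordinary content negotiation.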
The goal of the working group is to extend the existing DCAT standard in line with wider practice but also to recognize and support diverse approaches to data description and Dataset Exchange more generally.
DCAT is formulated as an RDF vocabulary and is expected to remain so; however, the Working Group is agnostic about data formats. Methods for expressing DCAT in other formats are in scope.
Government data, scientific research data, enterprise and cultural heritage data, whether shared openly or not, are all explicitly in scope.
The following documents SHOULD be considered by the Working Group as direct inputs to the specifications to be developed.
- DCAT and the HCLS Community profile
- The Data Quality and Dataset Usage vocabularies
- The Smart Data & Smarter Descriptions (SDSVoc) workshop report, in particular the section on content negotiation by application profile.
- Data on the Web Best Practices
Out of Scope
The Dataset Exchange Working Group will not create application profiles or metadata standards that only apply to very specific domains (such as particle physics, accountancy or oncology).
In order to advance to Proposed Recommendation, the revised version of DCAT should show use of each term in multiple catalogs and related systems. As a minimum, each term should be used at least twice although a higher number is expected for the majority of terms.
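One way to check the minimum-usage criterion is to count, for each term, the number of distinct catalogs in which it appears. The Python sketch below assumes that the terms used by each catalog have already been extracted into simple lists; the function name and the catalog labels are hypothetical.

```python
from collections import Counter


def under_used_terms(terms, catalogs, minimum=2):
    """Return, sorted, the terms used in fewer than `minimum` catalogs.

    `catalogs` maps a catalog label to the terms found in it. Repeated use
    of a term within one catalog counts once.
    """
    usage = Counter()
    for used in catalogs.values():
        usage.update(set(used) & set(terms))
    return sorted(term for term in terms if usage[term] < minimum)


# Hypothetical evidence gathered from three catalogs.
TERMS = {"dcat:Dataset", "dcat:keyword", "dcat:theme"}
CATALOGS = {
    "catalog-a": ["dcat:Dataset", "dcat:keyword"],
    "catalog-b": ["dcat:Dataset", "dcat:Dataset"],
    "catalog-c": ["dcat:theme"],
}
```

With this sample data, `under_used_terms(TERMS, CATALOGS)` reports `dcat:keyword` and `dcat:theme`, each of which appears in only one catalog.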
For the content negotiation by application profile specification, each feature is expected to have two independent implementations. This includes the RFC and any fallback mechanism defined by the Working Group.
The Working Group will deliver the following W3C normative specifications (titles of the documents are provisional; some documents listed below may be grouped into one document or split into several, constituent documents):
- DCAT 1.1
An update and expansion of the current DCAT Recommendation. The new version may deprecate, but will not delete, any existing terms.
- Guidance on publishing application profiles of vocabularies.
A definition of what is meant by an application profile and an explanation of one or more methods for publishing and sharing them.
- Content Negotiation by Application Profile
An explanation of how to implement the expected RFC and suitable fallback mechanisms.
Other non-normative documents may be created such as:
- A use case and requirement document
- A test suite for content negotiation by application profile
- A primer (subject to the WG’s capacity)
- UCR FPWD Q3-4 2017
- FPWD for DCAT 1.1 Q2 2017
- FPWD for Conneg by application profile Q1 2018
- CR for both Rec Track documents Q1 2019
- CR for both Rec Track documents Q2 2019
For all specifications, this Working Group will seek horizontal review for accessibility, internationalization, performance, privacy, and security with the relevant Working and Interest Groups, and with the TAG. Invitation for review must be issued during each major standards-track document transition, including FPWD and CR, and should be issued when major changes occur in a specification.
Additional technical coordination with the following Groups will be made, per the W3C Process Document:
- Web Platform Working Group
In particular over the use of the term profile and expectations for what content negotiation by application profile might mean in other contexts.
- Internationalization Activity
Ensure that multilinguality concerns are properly reflected in the DCAT revision.
- Privacy Interest Group
Ensure that privacy concerns are addressed, for example, if a dataset includes personally identifiable information.
- Web Application Security Working Group
In particular concerning the conneg by application profile spec, ensuring that no security vulnerabilities are introduced.
- Shape Expressions Community Group
The work of this CG is of direct relevance to the concept of application profiles.
- The RDF Data Shapes Working Group
This WG is expected to have completed its work shortly after the DXWG is formed; however, efforts should be made to liaise with its community.
- schema.org for datasets Community Group
This CG is clearly of high relevance to the DXWG.
- European Commission's ISA Programme
This is the body responsible for interoperability across the EU and whose outputs include various application profiles of DCAT.
To be successful, this Working Group is expected to have 6 or more active participants for its duration, including representatives from key implementors and users (e.g., governments and research data managers) of this specification, and active Editors. The Chairs, specification Editors, and Test Leads are expected to contribute half a day per week towards the Working Group. There is no minimum requirement for other Participants.
The group encourages questions, comments and issues on its public mailing lists and document repositories, as described in Communication.
The group also welcomes non-Members to contribute technical submissions for consideration upon their agreement to the terms of the W3C Patent Policy.
Technical discussions for this Working Group are conducted in public: the meeting minutes from teleconference and face-to-face meetings will be archived for public review, and technical discussions and issue tracking will be conducted in a manner that can be both read and written to by the general public. Working Drafts and Editor's Drafts of specifications will be developed on a public repository, and may permit direct public contribution requests. The meetings themselves are not open to public participation, however.
Information about the group (including details about deliverables, issues, actions, status, participants, and meetings) will be available from the @@@Dataset Exchange Working Group home page.@@@
This group primarily conducts its technical work on the public mailing list @@@ (@@@archive@@@). The public is invited to review, discuss and contribute to this work.
The group may use a Member-confidential mailing list for administrative purposes and, at the discretion of the Chairs and members of the group, for member-only discussions in special cases when a participant requests such a discussion.
This group will seek to make decisions through consensus and due process, per the W3C Process Document (section 3.3). Typically, an editor or other participant makes an initial proposal, which is then refined in discussion with members of the group and other reviewers, and consensus emerges with little formal voting being required.
However, if a decision is necessary for timely progress, but consensus is not achieved after careful consideration of the range of views presented, the Chairs may call for a group vote, and record a decision along with any objections.
To afford asynchronous decisions and organizational deliberation, any resolution (including publication decisions) taken in a face-to-face meeting or teleconference will be considered provisional. A call for consensus (CfC) will be issued for all resolutions (for example, via email and/or web-based survey), with a response period from one week to 10 working days, depending on the Chairs' evaluation of the group consensus on the issue. If no objections are raised on the mailing list by the end of the response period, the resolution will be considered to have consensus as a resolution of the Working Group.
All decisions made by the group should be considered resolved unless and until new information becomes available, or unless reopened at the discretion of the Chairs or the Director.
This charter is written in accordance with the W3C Process Document (Section 3.4, Votes), and includes no voting procedures beyond what the Process Document requires.
This Working Group operates under the W3C Patent Policy (5 February 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis. For more information about disclosure obligations for this group, please see the W3C Patent Policy Implementation.
This Working Group will use the W3C Document license for all its deliverables.
About this Charter
This charter has been created according to section 5.2 of the Process Document. In the event of a conflict between this document or the provisions of any charter and the W3C Process, the W3C Process shall take precedence.
The following table lists details of all changes from the initial charter, per the W3C Process Document (section 5.2.3):
|Charter Period||Start Date||End Date||Changes|
|Initial Charter||[dd monthname yyyy] Expected||[dd monthname yyyy] Expected||–|