This document describes the CSVW Namespace Vocabulary Terms and Term definitions used for creating Metadata descriptions for Tabular Data. This document provides the RDFS [[RDF-SCHEMA]] vocabulary definition for terms defined in [[tabular-metadata]] and a description of the JSON-LD context definition for use with defining metadata documents.
Alternate versions of the vocabulary definition exist in Turtle and JSON-LD, which also includes the @context required for metadata descriptions. These versions may also be retrieved from http://www.w3.org/ns/csvw using an appropriate HTTP Accept header.
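As an illustration of the content negotiation mentioned above, the following minimal sketch builds HTTP requests that ask for a particular serialization. It assumes the server honours the registered media types `text/turtle` and `application/ld+json`; the helper name `vocabulary_request` is hypothetical, and no network access is performed until a request is passed to `urllib.request.urlopen()`.

```python
from urllib.request import Request

# Namespace URL given in this document.
VOCAB_URL = "http://www.w3.org/ns/csvw"


def vocabulary_request(media_type):
    """Build a request for the vocabulary in the given media type.

    The media type is sent in the Accept header so that the server can
    select the matching representation.
    """
    return Request(VOCAB_URL, headers={"Accept": media_type})


# Assumed media types for the two alternate versions.
turtle_request = vocabulary_request("text/turtle")
jsonld_request = vocabulary_request("application/ld+json")
```

Keeping request construction separate from the fetch makes it possible to inspect the headers without contacting the server.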
The CSV on the Web Working Group was chartered to produce a recommendation "Access methods for CSV Metadata" as well as recommendations for "Metadata vocabulary for CSV data" and "Mapping mechanism to transforming CSV into various formats (e.g., RDF, JSON, or XML)". This document attempts to partially satisfy the "Metadata vocabulary for CSV data" recommendation by defining all terms used in [[tabular-metadata]].
This document describes the RDFS vocabulary description used in the Metadata Vocabulary for Tabular Data [[tabular-metadata]] along with the default JSON-LD Context.
This specification makes use of the following namespaces:
csvw: http://www.w3.org/ns/csvw#
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
xsd: http://www.w3.org/2001/XMLSchema#
dc: http://purl.org/dc/terms/
dcat: http://www.w3.org/ns/dcat#
prov: http://www.w3.org/ns/prov#
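The prefixes above abbreviate full IRIs. A minimal sketch of how a prefixed name such as `csvw:Table` could be expanded is shown below; the function name `expand` is hypothetical and the mapping simply restates the namespaces listed in this section.

```python
# Prefix-to-namespace mapping as listed in this section.
NAMESPACES = {
    "csvw": "http://www.w3.org/ns/csvw#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dc": "http://purl.org/dc/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "prov": "http://www.w3.org/ns/prov#",
}


def expand(prefixed_name):
    """Expand a name of the form prefix:local into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in NAMESPACES or not local:
        raise ValueError(f"cannot expand {prefixed_name!r}")
    # Concatenate the namespace IRI and the local part.
    return NAMESPACES[prefix] + local
```

For example, `expand("csvw:Table")` yields the IRI of the Table class in the csvw namespace.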
The following are class definitions in the csvw namespace:
Cell |
Cell
A Cell represents a cell at the intersection of a Row and a Column within a Table. |
Column |
Column Description
A Column represents a vertical arrangement of Cells within a Table. |
Datatype |
Datatype
Describes facets of a datatype. |
Dialect |
Dialect Description
A Dialect Description provides hints to parsers about how to parse a linked file. |
Direction |
Direction
The class of table/text directions. |
ForeignKey |
Foreign Key Definition
Describes relationships between Columns in one or more Tables. |
NumericFormat |
Numeric Format
If the datatype is a numeric type, the format property indicates the expected format for that number. Its value must be either a single string or an object with one or more properties. |
Row |
Row
A Row represents a horizontal arrangement of cells within a Table. |
Schema |
Schema
A Schema is a definition of a tabular format that may be common to multiple tables. |
Table |
Annotated Table
An annotated table is a table that is annotated with additional metadata. |
TableGroup |
Group of Tables
A Group of Tables comprises a set of Annotated Tables and a set of annotations that relate to those Tables. |
TableReference |
Table Reference
An object property that identifies a referenced table and a set of referenced columns within that table. |
Transformation |
Transformation Definition
A Transformation Definition is a definition of how tabular data can be transformed into another format. |
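The relationships between some of these classes can be pictured with a small data structure. The sketch below is only an illustration of the descriptions above (a Cell at the intersection of a Row and a Column, a Group of Tables comprising Tables); the field names and the `cell` lookup method are hypothetical and are not defined by the vocabulary.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str  # hypothetical field: a name identifying the column


@dataclass(frozen=True)
class Row:
    number: int  # position of the row within the table, starting from 1


@dataclass
class Table:
    columns: list
    rows: list
    # A Cell sits at the intersection of a Row and a Column, so cell
    # values are keyed here by (row number, column name).
    cells: dict = field(default_factory=dict)

    def cell(self, row, column):
        """Return the cell value at the given row and column."""
        return self.cells[(row.number, column.name)]


@dataclass
class TableGroup:
    # A Group of Tables comprises a set of tables.
    tables: list = field(default_factory=list)


# Usage: one table with a single column, row and cell.
country = Column("country")
first = Row(1)
table = Table(columns=[country], rows=[first], cells={(1, "country"): "AD"})
group = TableGroup(tables=[table])
```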
The following are property definitions in the csvw namespace:
aboutUrl |
about URL
A URI template property that MAY be used to indicate what a cell contains information about.
|
base |
base
An atomic property that contains a single string: a term defined in the default context representing a built-in datatype URL, as listed above.
|
columnReference |
column reference
A column reference property that holds either a single reference to a column description object within this schema, or an array of references. These form the referencing columns for the foreign key definition.
|
column |
column
An array property of column descriptions as described in section 5.6 Columns.
|
commentPrefix |
comment prefix
An atomic property that sets the comment prefix flag to the single provided value, which MUST be a string.
|
datatype |
datatype
An object property that contains either a single string that is the main datatype of the values of the cell or a datatype description object. If the value of this property is a string, it MUST be one of the built-in datatypes defined in section 5.11.1 Built-in Datatypes or an absolute URL; if it is an object then it describes a more specialised datatype.
|
decimalChar |
decimal character
A string whose value is used to represent a decimal point within the number.
|
default |
default
An atomic property holding a single string that is used to create a default value for the cell in cases where the original string value is an empty string.
|
describes |
describes
From IANA describes: The relationship A 'describes' B asserts that resource A provides a description of resource B. There are no constraints on the format or representation of either A or B, neither are there any further constraints on either resource.
|
delimiter |
delimiter
An atomic property that sets the delimiter flag to the single provided value, which MUST be a string.
|
dialect |
dialect
An object property that provides a single dialect description. If provided, dialect provides hints to processors about how to parse the referenced files to create tabular data models for the tables in the group.
|
doubleQuote |
double quote
A boolean atomic property that, if `true`, sets the escape character flag to `"`.
|
encoding |
encoding
An atomic property that sets the encoding flag to the single provided string value, which MUST be defined in [[encoding]]. The default is "utf-8".
|
foreignKey |
foreign key
For a Table: a list of foreign keys on the table. For a Schema: an array property of foreign key definitions that define how the values from specified columns within this table link to rows within this table or other tables.
|
format |
format
An atomic property that contains either a single string or an object that defines the format of a value of this type, used when parsing a string value as described in Parsing Cells in [[tabular-data-model]].
|
groupChar |
group character
A string whose value is used to group digits within the number.
|
header |
header
A boolean atomic property that, if `true`, sets the header row count flag to `1`, and if `false` to `0`, unless headerRowCount is provided, in which case the value provided for the header property is ignored.
|
headerRowCount |
header row count
A numeric atomic property that sets the header row count flag to the single provided value, which must be a non-negative integer.
|
lang |
language
An atomic property giving a single string language code as defined by [[BCP47]].
|
length |
length
The exact length of the value of the cell.
|
lineTerminators |
line terminators
An atomic property that sets the line terminators flag to either an array containing the single provided string value, or the provided array.
|
maxExclusive |
max exclusive
An atomic property that contains a single number that is the maximum valid value (exclusive).
|
maxInclusive |
max inclusive
An atomic property that contains a single number that is the maximum valid value (inclusive).
|
maxLength |
max length
A numeric atomic property that contains a single integer that is the maximum length of the value.
|
minExclusive |
min exclusive
An atomic property that contains a single number that is the minimum valid value (exclusive).
|
minInclusive |
min inclusive
An atomic property that contains a single number that is the minimum valid value (inclusive).
|
minLength |
min length
An atomic property that contains a single integer that is the minimum length of the value.
|
name |
name
An atomic property that gives a single canonical name for the column. The value of this property becomes the name annotation for the described column.
|
note |
note
An array property that provides an array of objects representing arbitrary annotations on the annotated tabular data model.
|
null |
null
An atomic property giving the string or strings used for null values within the data. If the string value of the cell is equal to any one of these values, the cell value is `null`.
|
ordered |
ordered
A boolean atomic property taking a single value which indicates whether a list that is the value of the cell is ordered (if `true`) or unordered (if `false`).
|
pattern |
pattern
A regular expression string, in the syntax and interpreted as defined by [[ECMASCRIPT]].
|
primaryKey |
primary key
For Schema: A column reference property that holds either a single reference to a column description object or an array of references. For Row: a possibly empty list of cells whose values together provide a unique identifier for this row. This is similar to the name of a column.
|
propertyUrl |
property URL
A URI template property that MAY be used to create a URI for a property if the table is mapped to another format.
|
quoteChar |
quote char
An atomic property that sets the quote character flag to the single provided value, which must be a string or `null`.
|
reference |
reference
An object property that identifies a **referenced table** and a set of **referenced columns** within that table.
|
referencedRow |
referenced rows
A possibly empty list of pairs of a foreign key and a row in a table within the same group of tables.
|
required |
required
A boolean atomic property taking a single value which indicates whether the cell must have a non-null value. The default is `false`.
|
resource |
resource
A link property holding a URL that is the identifier for a specific table that is being referenced.
|
row |
row
Relates a Table to each Row output.
|
rowTitle |
row titles
A column reference property that holds either a single reference to a column description object or an array of references.
|
rownum |
row number
The position of the row amongst the rows of the Annotated Table, starting from 1.
|
scriptFormat |
script format
A link property giving the single URL for the format that is used by the script or template.
|
schemaReference |
schema reference
A link property holding a URL that is the identifier for a schema that is being referenced.
|
separator |
separator
An atomic property that MUST have a single string value that is the character used to separate items in the string value of the cell.
|
skipBlankRows |
skip blank rows
A boolean atomic property that sets the `skip blank rows` flag to the single provided boolean value.
|
skipColumns |
skip columns
A numeric atomic property that sets the `skip columns` flag to the single provided numeric value, which MUST be a non-negative integer.
|
skipInitialSpace |
skip initial space
A boolean atomic property that, if `true`, sets the trim flag to "start" and, if `false`, to `false`.
|
skipRows |
skip rows
A numeric atomic property that sets the `skip rows` flag to the single provided numeric value, which MUST be a non-negative integer.
|
source |
source
A single string atomic property that provides, if specified, the format to which the tabular data should be transformed prior to the transformation using the script or template.
|
suppressOutput |
suppress output
A boolean atomic property. If `true`, suppresses any output that would be generated when converting a table or cells within a column.
|
table |
table
Relates a Table group to annotated tables.
|
tableDirection |
table direction
One of `rtl`, `ltr` or `auto`. Indicates whether the tables in the group should be displayed with the first column on the right, on the left, or based on the first character in the table that has a specific direction.
|
tableSchema |
table schema
An object property that provides a single schema description as described in section 5.5 Schemas, used as the default for all the tables in the group.
|
targetFormat |
target format
A link property giving the single URL for the format that will be created through the transformation.
|
transformations |
transformations
An array property of transformation definitions that provide mechanisms to transform the tabular data into other formats.
|
textDirection |
text direction
An atomic property that must have a single value that is one of `rtl` or `ltr` (the default).
|
title |
title
For a Transformation: A natural language property that describes the format that will be generated from the transformation. For a Column: A natural language property that provides possible alternative names for the column.
|
trim |
trim
An atomic property that, if the boolean `true`, sets the trim flag to `true` and if the boolean `false` to `false`. If the value provided is a string, sets the trim flag to the provided value, which must be one of "true", "false", "start" or "end".
|
url |
url
For a Table: This link property gives the single URL of the CSV file that the table is held in, relative to the location of the metadata document. For a Transformation: A link property giving the single URL of the file that the script or template is held in, relative to the location of the metadata document.
|
valueUrl |
valueUrl
A URI template property that is used to map the values of cells into URLs.
|
virtual |
virtual
A boolean atomic property taking a single value which indicates whether the column is a virtual column not present in the original source.
|
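Several of the dialect properties above are defined in terms of the parser flags they set. The following minimal sketch shows how `header`, `headerRowCount`, `lineTerminators` and `skipInitialSpace` could be turned into flags, following only the descriptions given in this section. The function name `resolve_dialect_flags` and the dictionary representation are hypothetical, and the interaction between `trim` and `skipInitialSpace` is deliberately left out because it is not described here.

```python
def resolve_dialect_flags(dialect):
    """Derive parser flags from a dialect description given as a dict."""
    flags = {}

    # headerRowCount, when provided, takes precedence over header.
    if "headerRowCount" in dialect:
        count = dialect["headerRowCount"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(count, bool) or not isinstance(count, int) or count < 0:
            raise ValueError("headerRowCount must be a non-negative integer")
        flags["header row count"] = count
    elif "header" in dialect:
        # header: true sets the flag to 1, false sets it to 0.
        flags["header row count"] = 1 if dialect["header"] else 0

    # lineTerminators: a single string is wrapped in an array,
    # an array is used as provided.
    if "lineTerminators" in dialect:
        value = dialect["lineTerminators"]
        flags["line terminators"] = [value] if isinstance(value, str) else list(value)

    # skipInitialSpace: true sets the trim flag to "start", false to False.
    if "skipInitialSpace" in dialect:
        flags["trim"] = "start" if dialect["skipInitialSpace"] else False

    return flags
```

With this sketch, a dialect that provides both `header` and `headerRowCount` resolves to the `headerRowCount` value, matching the precedence stated in the `header` description.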
The following are datatype definitions in the csvw namespace:
JSON |
JSON
A literal containing JSON.
|
uriTemplate |
uri template
|
The following are instance definitions in the csvw namespace:
auto |
auto
Indicates whether the tables in the group should be displayed based on the first character in the table that has a specific direction. |
inherit |
inherit
For `textDirection`, indicates that the direction is inherited from the `tableDirection` annotation of the `table`. |
ltr |
left to right
Indicates that the tables in the group should be displayed with the first column on the left. |
rtl |
right to left
Indicates that the tables in the group should be displayed with the first column on the right. |
csvEncodedTabularData |
CSV Encoded Tabular Data
Describes the role of a CSV file in the tabular data mapping. |
tabularMetadata |
Tabular Metadata
Describes the role of a Metadata file in the tabular data mapping. |
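The `auto` direction depends on the first character in the table that has a specific direction. The sketch below is one possible reading of that rule, assuming that "a specific direction" means a strong Unicode bidirectional class (`L` for left to right, `R` or `AL` for right to left) and that `ltr` is used when no such character is found. Both assumptions, and the function names, are illustrative only. The mapping of directions to the side of the first column follows the `tableDirection` property description.

```python
import unicodedata


def resolve_table_direction(table_direction, cell_texts):
    """Resolve "auto" to "ltr" or "rtl" from the table content."""
    if table_direction in ("ltr", "rtl"):
        return table_direction
    if table_direction != "auto":
        raise ValueError(f"unknown direction {table_direction!r}")
    for text in cell_texts:
        for char in text:
            bidi = unicodedata.bidirectional(char)
            if bidi == "L":
                return "ltr"
            if bidi in ("R", "AL"):
                return "rtl"
    # Assumption: fall back to ltr when no strongly directional
    # character is present.
    return "ltr"


def first_column_side(direction):
    """Side on which the first column is displayed."""
    return {"ltr": "left", "rtl": "right"}[direction]
```

Digits and punctuation carry no strong direction, so a table whose first cells hold only numbers is resolved by the first later cell that contains letters.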