N-Triples is a line-based, plain text format for encoding an RDF graph.
RDF 1.2 N-Triples introduces quoted triples as a fourth kind of RDF term which can be used as the subject or object of another triple, making it possible to make statements about other statements.
This document is part of the RDF 1.2 document suite. The N-Triples format is a line-based RDF syntax based on a subset of Turtle [[RDF12-TURTLE]].
This document defines N-Triples, a concrete syntax for RDF [[RDF12-CONCEPTS]]. N-Triples is an easy-to-parse, line-based subset of Turtle [[RDF12-TURTLE]].
The syntax is a revised version of N-Triples as originally defined in the RDF Test Cases [[RDF-TESTCASES]] document. Its original intent was for writing test cases, but it has proven to be popular as an exchange format for RDF data.
An N-Triples document contains no parsing directives.
N-Triples triples are a sequence of RDF terms representing the subject, predicate, and object of an RDF triple.
These may be separated by white space (spaces U+0020 or tabs U+0009).
This sequence is terminated by a '.' (optionally followed by white space and/or a comment), and a new line (optional at the end of a document).
N-Triples triples are also Turtle simple triples, but Turtle includes other representations of RDF terms and abbreviations of RDF triples. When parsed by a Turtle parser, data in the N-Triples format will produce exactly the same triples as a parser for the N-Triples language.
The RDF graph represented by an N-Triples document contains exactly each triple matching the N-Triples triple production.
An N-Triples document allows writing down an RDF graph in a textual form.
An RDF graph is made up of simple triples, each consisting of a subject, predicate, and object.
A document may also contain optional blank lines.
Comments may be given after a '#' that is not part of another lexical token and continue to the end of the line.
The simplest triple statement is a sequence of subject, predicate, and object terms, separated by white space and terminated by '.'.
White space (spaces
U+0020 or tabs
U+0009) may surround terms,
except where significant as noted in the grammar.
Comments are treated as white space, and may be given after a '#' that is not part of another lexical token and continue to the end of the line.
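The layout rules above can be illustrated with a minimal sketch in Python. The patterns below are deliberately simplified assumptions (ASCII-only blank node labels, no escape sequences inside IRIs, no quoted triples) and are not the normative grammar; the names `TRIPLE_LINE` and `parse_simple_triple` are hypothetical.

```python
import re

# Simplified token patterns; the normative grammar is stricter and broader.
IRIREF = r'<[^\x00-\x20<>"{}|^`\\]*>'
BNODE = r'_:[A-Za-z0-9_](?:[A-Za-z0-9_.\-]*[A-Za-z0-9_\-])?'
STRING = r'"(?:[^"\\\n\r]|\\.)*"'
LITERAL = STRING + r'(?:\^\^' + IRIREF + r'|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?'

# subject predicate object '.', with optional horizontal white space
# around the terms and an optional trailing comment.
TRIPLE_LINE = re.compile(
    r'^[ \t]*(?P<s>' + IRIREF + '|' + BNODE + r')'
    r'[ \t]*(?P<p>' + IRIREF + r')'
    r'[ \t]*(?P<o>' + IRIREF + '|' + BNODE + '|' + LITERAL + r')'
    r'[ \t]*\.[ \t]*(?:#.*)?$'
)

def parse_simple_triple(line):
    """Return (subject, predicate, object) tokens, or None if no match."""
    m = TRIPLE_LINE.match(line)
    return (m.group('s'), m.group('p'), m.group('o')) if m else None
```

A '#' inside a string literal is part of the literal, so only the trailing '#' in a line such as `<http://example.org/s> <http://example.org/p> "a # b" . # note` starts a comment.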
A quoted triple may be the subject or object of an RDF triple.
A quoted triple is represented as a subject, predicate, and object preceded by '<<' and followed by '>>'.
Note that quoted triples may be recursive.
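Because quoted triples may nest, a serializer is naturally recursive. The following Python sketch assumes a hypothetical in-memory model in which a term is either an already serialized IRI, blank node, or literal (a `str`), or a quoted triple given as a 3-tuple of terms. The single-space layout is a choice made for the example.

```python
def serialize_term(term):
    """Serialize a term; a 3-tuple is treated as a quoted triple."""
    if isinstance(term, tuple):
        subject, predicate, obj = term
        inner = " ".join(serialize_term(t) for t in (subject, predicate, obj))
        return "<< " + inner + " >>"
    return term  # already a serialized IRI, blank node, or literal

def serialize_triple(subject, predicate, obj):
    """Serialize one asserted triple, terminated by '.'."""
    return " ".join(serialize_term(t) for t in (subject, predicate, obj)) + " ."
```

For instance, passing a quoted triple as the subject of another quoted triple yields two levels of '<<' and '>>' in the output line.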
IRIs may be written only as absolute IRIs.
IRIs are enclosed in '<' and '>' and may contain numeric escape sequences (described below).
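As a rough illustration, the Python sketch below extracts the IRI from an IRIREF token, decodes numeric escape sequences of the forms `\uXXXX` and `\UXXXXXXXX`, and rejects strings without a scheme. The function names are hypothetical, and the scheme test is a simplified stand-in for a full absolute-IRI check.

```python
import re

# \u followed by 4 hex digits, or \U followed by 8 hex digits.
UCHAR = re.compile(r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})')
# Simplified scheme test used to reject relative IRIs.
SCHEME = re.compile(r'[A-Za-z][A-Za-z0-9+.\-]*:')

def unescape_uchar(text):
    """Replace each numeric escape sequence with the code point it names."""
    return UCHAR.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)

def iri_from_iriref(token):
    """Return the IRI string denoted by a token such as '<http://...>'."""
    if len(token) < 2 or token[0] != '<' or token[-1] != '>':
        raise ValueError("IRIREF must be enclosed in '<' and '>'")
    iri = unescape_uchar(token[1:-1])
    if not SCHEME.match(iri):
        raise ValueError("relative IRIs are not allowed in N-Triples")
    return iri
```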
Literals are used to identify values such as strings, numbers, and dates.
Literals (Grammar production Literal) have a lexical form followed by a language tag, a datatype IRI, or neither.
The representation of the lexical form consists of an initial delimiter " (U+0022), a sequence of permitted characters, numeric escape sequences, or string escape sequences, and a final delimiter.
Literals may not contain the characters " (U+0022), LF (U+000A), or CR (U+000D) except in their escaped forms.
In addition, '\' (U+005C) may not appear in any quoted literal except as part of an escape sequence.
The " (U+0022) character can only be included in a quoted literal using an escape sequence.
The corresponding RDF lexical form
is the characters between the delimiters, after processing any escape sequences.
If present, the language tag is preceded by a '@' (U+0040).
If there is no language tag, there may be a datatype IRI, preceded by '^^' (U+005E U+005E).
If there is no datatype IRI and no language tag, it is a simple literal and the datatype is http://www.w3.org/2001/XMLSchema#string.
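The literal rules above can be sketched in Python as follows. The function names are hypothetical. The sketch escapes only the four characters that cannot appear unescaped, and the use of rdf:langString as the datatype of language-tagged literals is taken from RDF Concepts rather than from this section.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANG_STRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"

# Characters that may appear in a literal only in escaped form.
_ESCAPES = {'\\': '\\\\', '"': '\\"', '\n': '\\n', '\r': '\\r'}

def serialize_literal(lexical_form, language=None, datatype=None):
    """Write a literal as a quoted string plus optional '@tag' or '^^<iri>'."""
    if language and datatype:
        raise ValueError("a literal has a language tag or a datatype IRI, not both")
    quoted = '"' + ''.join(_ESCAPES.get(ch, ch) for ch in lexical_form) + '"'
    if language:
        return quoted + '@' + language
    if datatype:
        return quoted + '^^<' + datatype + '>'
    return quoted

def effective_datatype(language=None, datatype=None):
    """Datatype IRI implied by the optional suffix of a literal."""
    if language:
        return RDF_LANG_STRING  # assumption from RDF Concepts
    return datatype or XSD_STRING  # simple literal: xsd:string
```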
RDF blank nodes in N-Triples are expressed as
_: followed by a blank node label which is a series of name characters.
The characters in the label are built upon PN_CHARS_BASE, liberalized as follows:
The characters _ and [0-9] may appear anywhere in a blank node label.
The character . may appear anywhere except the first or last character.
The characters -, U+00B7, U+0300 to U+036F, and U+203F to U+2040 are permitted anywhere except the first character.
A fresh RDF blank node is allocated for each unique blank node label in a document. Repeated use of the same blank node label identifies the same RDF blank node.
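The label rules can be checked with a regular expression. In the Python sketch below, the PN_CHARS_BASE ranges are transcribed from the Turtle grammar as an assumption, since this section does not list them; the names `BLANK_NODE_LABEL` and `is_blank_node_token` are hypothetical.

```python
import re

# Ranges assumed from the PN_CHARS_BASE production of the Turtle grammar.
PN_CHARS_BASE = (
    r'A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D'
    r'\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF'
    r'\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF'
)
PN_CHARS_U = PN_CHARS_BASE + '_'
# '-', digits, U+00B7, and the combining ranges: not allowed as first character.
PN_CHARS = PN_CHARS_U + r'\-0-9\u00B7\u0300-\u036F\u203F-\u2040'

# First character: base, '_', or digit. Middle: any name character or '.'.
# Last character: any name character (so never '.').
BLANK_NODE_LABEL = re.compile(
    '_:[' + PN_CHARS_U + '0-9](?:[' + PN_CHARS + '.]*[' + PN_CHARS + '])?'
)

def is_blank_node_token(token):
    """True if the whole token is '_:' followed by a valid label."""
    return BLANK_NODE_LABEL.fullmatch(token) is not None
```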
This section defines a canonical form of N-Triples which has a completely specified layout. The grammar for the language is unchanged.
While the N-Triples syntax allows choices for the representation and layout of RDF data,
the canonical form of N-Triples provides a unique syntactic representation of any triple.
Each code point can be represented by only one of UCHAR, ECHAR, or unencoded character, where the relevant production allows for a choice in representation.
Each triple is represented entirely on a single line with specified white space.
Canonical N-Triples has the following additional constraints on layout:
White space MUST NOT be used except after subject, predicate, and object, any of which MUST be a single space (U+0020).
Literals with the datatype http://www.w3.org/2001/XMLSchema#string MUST NOT use the datatype IRI part of the literal, and are represented using only STRING_LITERAL_QUOTE.
HEX MUST use only uppercase letters ([A-F]).
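Within STRING_LITERAL_QUOTE, the characters U+0008 (BS), U+0009 (HT), U+000A (LF), U+000C (FF), U+000D (CR), U+0022 ("), and U+005C (\) MUST be encoded using ECHAR. Characters in the range from U+0000 to U+001F and U+007F (DEL) that are not represented using ECHAR MUST be represented by UCHAR. All other characters MUST be represented by their native [[UNICODE]] representation.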
The token EOL MUST be a single U+000A.
The final EOL MUST be provided.
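A minimal Python sketch of these layout constraints follows. It assumes that the ECHAR forms used in canonical output are those for BS, HT, LF, FF, CR, '"', and '\', and that the remaining control characters and DEL are written as `\u` with four uppercase hexadecimal digits. Both function names are hypothetical.

```python
# Assumed set of characters written with ECHAR in canonical output.
_ECHAR = {
    '\b': '\\b', '\t': '\\t', '\n': '\\n', '\f': '\\f',
    '\r': '\\r', '"': '\\"', '\\': '\\\\',
}

def canonical_string(lexical_form):
    """Quote a lexical form using one fixed representation per code point."""
    out = []
    for ch in lexical_form:
        cp = ord(ch)
        if ch in _ECHAR:
            out.append(_ECHAR[ch])
        elif cp <= 0x1F or cp == 0x7F:
            out.append('\\u%04X' % cp)  # UCHAR with uppercase HEX
        else:
            out.append(ch)  # native Unicode representation
    return '"' + ''.join(out) + '"'

def canonical_triple(subject, predicate, obj):
    """Join serialized terms with single spaces and end with ' .' and one LF."""
    return subject + ' ' + predicate + ' ' + obj + ' .\n'
```

Under these assumptions every triple has exactly one textual form, so canonical documents can be compared or sorted line by line.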
This specification defines conformance criteria for:
A conforming N-Triples document is a Unicode string that conforms to the grammar and additional constraints defined in the Grammar section, starting with the ntriplesDoc production.
An N-Triples document serializes an RDF graph.
A conforming Canonical N-Triples document is an N-Triples document that follows the additional constraints of Canonical N-Triples.
A conforming N-Triples parser is a system capable of reading N-Triples documents on behalf of an application. It makes the serialized RDF graph, as defined in the Parsing section, available to the application, usually through some form of API.
The IRI that identifies the N-Triples language is:
The media type of N-Triples is application/n-triples.
The content encoding of N-Triples is always UTF-8.
See N-Triples Media Type for the media type registration.
N-Triples has been historically provided with other media types.
N-Triples may also be provided as text/plain.
When used in this way N-Triples MUST use the escaped form of any character outside US-ASCII.
As N-Triples is a subset of Turtle, an N-Triples document MAY also be provided as text/turtle.
In both of these cases the document is not an N-Triples document, as an N-Triples document is only provided as application/n-triples.
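Restricting output to US-ASCII can be sketched in Python as below. The helper name `escape_non_ascii` is hypothetical. It should be applied only to the content of IRIs and string literals, where numeric escape sequences are allowed, and not to blank node labels.

```python
def escape_non_ascii(text):
    """Replace every code point above U+007F with a numeric escape sequence."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append('\\u%04X' % cp)   # four hexadecimal digits
        else:
            out.append('\\U%08X' % cp)   # eight hexadecimal digits
    return ''.join(out)
```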
An N-Triples document is a Unicode [[UNICODE]] character string encoded in UTF-8.
White space (tab U+0009 or space U+0020) is allowed outside of terminals. Rule names below in capitals indicate where white space is significant.
White space is significant in the production STRING_LITERAL_QUOTE.
A blank line, consisting of only white space and/or a comment, may appear wherever a triple production is allowed, and is treated as white space.
N-Triples allows only horizontal white space (tab U+0009 or space U+0020), as compared to Turtle [[RDF12-TURTLE]], which also treats new lines (CR and LF) as white space.
Comments in N-Triples start at '#' outside an IRIREF or STRING_LITERAL_QUOTE, and continue to the end of line (marked by characters CR (U+000D) or LF (U+000A)), or to the end of file if there is no end of line after the comment marker.
Comments are treated as white space.
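The comment rule can be illustrated with a small scanner. The Python sketch below takes one line with its end-of-line characters already removed and returns the part before the comment. The function name is hypothetical, and the scanner does not validate the tokens it skips over.

```python
def strip_comment(line):
    """Return `line` without a trailing comment, if it has one."""
    in_iri = in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string:
            if ch == '\\':
                i += 1            # skip the escaped character
            elif ch == '"':
                in_string = False
        elif in_iri:
            if ch == '>':
                in_iri = False
        elif line.startswith('<<', i):
            i += 1                # quoted-triple delimiter, not an IRIREF
        elif ch == '"':
            in_string = True
        elif ch == '<':
            in_iri = True
        elif ch == '#':
            return line[:i]       # '#' outside IRIREF and string
        i += 1
    return line
```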
The EBNF used here is defined in XML 1.0 [[EBNF-NOTATION]].
Escape sequence rules are the same as Turtle [[RDF12-TURTLE]].
However, as only the STRING_LITERAL_QUOTE production is allowed, new lines in literals MUST be escaped.
Parsing N-Triples requires a state of one item:
bnodeLabels: a mapping from string to blank node.
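The parser state can be sketched in Python as a dictionary scoped to one document. The class and method names here are hypothetical. A fresh node is allocated the first time a label is seen, and the same node is returned for every later use of that label.

```python
class BlankNode:
    """An RDF blank node; identity is object identity, not the label."""

class ParserState:
    def __init__(self):
        # bnodeLabels: maps a blank node label (str) to a BlankNode.
        self.bnode_labels = {}

    def blank_node(self, label):
        """Return the blank node for `label`, allocating it on first use."""
        if label not in self.bnode_labels:
            self.bnode_labels[label] = BlankNode()
        return self.bnode_labels[label]
```

Creating one `ParserState` per document keeps labels from different documents apart, so `_:a` in one document and `_:a` in another denote different blank nodes.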
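This table maps productions and lexical tokens to RDF terms or components of RDF terms:
|BLANK_NODE_LABEL||blank node||The string after '_:' is a key in bnodeLabels.|
|IRIREF||IRI||The characters between "<" and ">" are taken, with escape sequences unescaped, to form the unicode string of the IRI.|
|LANGTAG||language tag||The characters following the '@' form the unicode string of the language tag.|
|STRING_LITERAL_QUOTE||lexical form||The characters between the outermost quotation marks (") are taken, with escape sequences unescaped, to form the unicode string of a lexical form.|
|literal||literal||The literal has a lexical form of the first rule argument, STRING_LITERAL_QUOTE, and either a language tag of LANGTAG or a datatype IRI of IRIREF, depending on which rule matched the input. If neither a language tag nor a datatype IRI is provided, the literal has a datatype of http://www.w3.org/2001/XMLSchema#string.|
|quotedTriple||quoted triple||The quoted triple is composed of the terms constructed from the subject, predicate, and object productions.|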
An N-Triples document defines an RDF graph composed of a set of RDF triples.
The triple production produces a triple defined by the terms constructed for subject, predicate, and object.
The STRING_LITERAL_QUOTE production allows the use of unescaped control characters. Although this specification does not directly expose this content to an end user, it might be presented through a user agent, which may cause the presented text to be obfuscated due to presentation of such characters.
N-Triples is a general-purpose assertion language; applications may evaluate given data to infer more assertions or to dereference IRIs, invoking the security considerations of the scheme for that IRI. Note in particular, the privacy issues in [[RFC3023]] section 10 for HTTP IRIs. Data obtained from an inaccurate or malicious data source may lead to inaccurate or misleading conclusions, as well as the dereferencing of unintended IRIs. Care must be taken to align the trust in consulted resources with the sensitivity of the intended use of the data; inferences of potential medical treatments would likely require different trust than inferences for trip planning.
The N-Triples language is used to express arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (for example, PGP encryption, checksum validation, password-protected compression) may also be used on N-Triples documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.
N-Triples can express data which is presented to the user, such as RDF Schema labels. Applications rendering strings retrieved from untrusted N-Triples documents, or using unescaped characters, SHOULD use warnings and other appropriate means to limit the possibility that malignant strings might be used to mislead the reader. The security considerations in the media type registration for XML ([[RFC3023]] section 10) provide additional guidance around the expression of arbitrary data and markup.
N-Triples uses IRIs as term identifiers. Applications interpreting data expressed in N-Triples SHOULD address the security issues of Internationalized Resource Identifiers (IRIs) [[RFC3987]] Section 8, as well as Uniform Resource Identifier (URI): Generic Syntax [[RFC3986]] Section 7.
Multiple IRIs may have the same appearance. Characters in different scripts may look similar (for instance, a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (for example, LATIN SMALL LETTER "E" followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER "E" WITH ACUTE). Any person or application that is writing or interpreting data in N-Triples must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching visually similar characters can be found in Unicode Security Considerations [[UNICODE-SECURITY]] and Internationalized Resource Identifiers (IRIs) [[RFC3987]] Section 8.
The Internet Media Type (formerly known as MIME Type) for N-Triples is "application/n-triples".
It is recommended that N-Triples files have the extension ".nt" (all lowercase) on all platforms.
It is recommended that N-Triples files stored on Macintosh HFS file systems be given a file type of "TEXT".
The information that follows will be submitted to the IESG for review, approval, and registration with IANA.
The editor of the RDF 1.1 edition acknowledges valuable contributions from Gregg Kellogg, Eric Prud'hommeaux, Dave Beckett, David Robillard, Gregory Williams, Pat Hayes, Richard Cyganiak, Henry S. Thompson, Peter Ansell, Evan Patton and David Booth.
This specification is a product of extended deliberations by the members of the RDF Working Group. It draws upon the earlier specification in [[RDF-TESTCASES]], edited by Dave Beckett.
The editors of the RDF 1.2 edition acknowledge valuable contributions from Andy Seaborne.
In addition to the editors, the following people have contributed to this specification:
Recognize members of the Task Force? Not an easy to find list of contributors.