The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web.
This document defines a textual syntax for RDF called Turtle that allows an RDF graph to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes. Turtle provides levels of compatibility with the N-Triples [[RDF12-N-TRIPLES]] format as well as the triple pattern syntax of [[[SPARQL12-QUERY]]] W3C Recommendation.
This document is a part of the RDF 1.2 document suite. The document defines Turtle, the Terse RDF Triple Language, a concrete syntax for RDF [[RDF12-CONCEPTS]].
A Turtle document is a textual representation of an RDF graph. The following Turtle document describes the relationship between Green Goblin and Spiderman.
This example introduces many of the features of the Turtle language:
@base and Relative IRI references,
@prefix and prefixed names,
predicate lists separated by ';',
object lists separated by ','.
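As an illustrative sketch of such a document, the following Turtle exercises each feature listed above. The base IRI, prefixes and names are assumptions chosen for illustration, not normative text.

```turtle
@base <http://example.org/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

# Relative IRI references are resolved against the @base IRI
<#green-goblin>
    rel:enemyOf <#spiderman> ;      # ';' repeats the subject
    a foaf:Person ;
    foaf:name "Green Goblin" .

<#spiderman>
    rel:enemyOf <#green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .   # ',' repeats subject and predicate
```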
The Turtle grammar for triples is a subset of the [[[?SPARQL12-QUERY]]] [[?SPARQL12-QUERY]] grammar for TriplesBlock.
The two grammars share production and terminal names where possible.
The construction of an RDF graph from a Turtle document is defined in Turtle Grammar and Parsing.
A Turtle document allows writing down an RDF graph in a compact textual form. An RDF graph is made up of triples consisting of a subject, predicate and object.
Comments may be given after a '#' that is not part of another lexical token and continue to the end of the line.
The simplest triple statement is a sequence of (subject, predicate, object) terms, separated by whitespace and terminated by '.' after each triple.
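A minimal sketch of a simple triple statement, with a comment, might look as follows. The IRIs are illustrative assumptions.

```turtle
# One triple: subject, predicate, object, then a terminating '.'
<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin> .
```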
Often the same subject will be referenced by a number of predicates.
The predicateObjectList production matches a series of predicates and objects, separated by ';', following a subject.
This expresses a series of RDF Triples with that subject and each predicate and object allocated to one triple.
Thus, the ';' symbol is used to repeat the subject of triples that vary only in predicate and object RDF terms.
These two examples are equivalent ways of writing the triples about Spiderman.
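The equivalence can be sketched as below, assuming illustrative prefixes and IRIs. Both forms yield the same two triples.

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

# Abbreviated form using ';'
<http://example.org/#spiderman>
    rel:enemyOf <http://example.org/#green-goblin> ;
    foaf:name "Spiderman" .

# The same two triples written out in full
<http://example.org/#spiderman> rel:enemyOf <http://example.org/#green-goblin> .
<http://example.org/#spiderman> foaf:name "Spiderman" .
```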
As with predicates, objects are often repeated with the same subject and predicate.
The objectList production matches a series of objects separated by ',' following a predicate.
This expresses a series of RDF Triples with the corresponding subject and predicate and each object allocated to one triple.
Thus, the ',' symbol is used to repeat the subject and predicate of triples that only differ in the object RDF term.
These two examples are equivalent ways of writing Spiderman's name in two languages.
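A sketch of the two equivalent forms, again with illustrative IRIs:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Abbreviated form using ','
<http://example.org/#spiderman> foaf:name "Spiderman", "Человек-паук"@ru .

# The same two triples written out in full
<http://example.org/#spiderman> foaf:name "Spiderman" .
<http://example.org/#spiderman> foaf:name "Человек-паук"@ru .
```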
There are three types of RDF Term defined in RDF Concepts: IRIs (Internationalized Resource Identifiers), literals and blank nodes. Turtle provides a number of ways of writing each.
IRIs may be written as relative or absolute IRIs or prefixed names.
Relative and absolute IRIs are enclosed in '<' and '>' and
may contain numeric escape sequences (described below).
Relative IRI references are resolved relative to the current base IRI.
A new base IRI can be defined using the '@base' or 'BASE' directive. Specifics of this operation are defined in the rules for resolving IRI references later in this document.
The token 'a' in the predicate position of a Turtle triple represents the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type.
A prefixed name is a prefix label and a local part, separated by a colon ":".
A prefixed name is turned into an IRI by concatenating the IRI associated with the prefix and the local part.
The '@prefix' or 'PREFIX' directive associates a prefix label with an IRI.
Subsequent '@prefix' or 'PREFIX' directives may re-map the same prefix label.
The Turtle language originally permitted only the syntax including the '@' character for writing prefix and base directives.
The case-insensitive 'PREFIX' and 'BASE' forms were added to align Turtle's syntax with that of SPARQL.
It is advisable to serialize RDF using the '@prefix' and '@base' forms until RDF 1.1 Turtle parsers are widely deployed.
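To write an IRI using a prefixed name, first define a prefix label (here somePrefix) for the vocabulary IRI, then write somePrefix:enemyOf, which is equivalent to writing the full IRI enclosed in '<' and '>'.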
This can be written using either the original Turtle syntax for prefix declarations:
or SPARQL's syntax for prefix declarations:
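Both declaration styles can be sketched as follows. The namespace IRI bound to somePrefix and the subject and object IRIs are assumptions for illustration.

```turtle
# Original Turtle syntax: '@' keyword and a trailing '.'
@prefix somePrefix: <http://www.perceive.net/schemas/relationship/> .

<http://example.org/#green-goblin> somePrefix:enemyOf <http://example.org/#spiderman> .

# SPARQL-style syntax: no '@' and no trailing '.'
PREFIX somePrefix: <http://www.perceive.net/schemas/relationship/>

<http://example.org/#spiderman> somePrefix:enemyOf <http://example.org/#green-goblin> .
```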
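Prefixed names are a superset of XML QNames. They differ in that the local part of prefixed names may include:
leading digits, e.g. leg:3032571 or isbn13:9780136019701,
non-leading colons, e.g. og:video:height,
reserved character escape sequences, e.g. wgs:lat\-long.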
The following Turtle document contains examples of all the different ways of writing IRIs in Turtle.
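A sketch of such a document is given here. All IRIs and names are illustrative assumptions.

```turtle
# Absolute IRIs
<http://one.example/subject1> <http://one.example/predicate1> <http://one.example/object1> .

@base <http://one.example/> .
<subject2> <predicate2> <object2> .     # relative IRI references, e.g. http://one.example/subject2

BASE <http://one.example/>
<subject2> <predicate2> <object2> .     # same triple, SPARQL-style base directive

@prefix p: <http://two.example/> .
p:subject3 p:predicate3 p:object3 .     # prefixed names, e.g. http://two.example/subject3

PREFIX p: <http://two.example/>
p:subject3 p:predicate3 p:object3 .     # same triple, SPARQL-style prefix directive

@prefix p: <path/> .                    # prefix IRI resolved against the base: http://one.example/path/
p:subject4 p:predicate4 p:object4 .

@prefix : <http://another.example/> .   # empty prefix label
:subject5 :predicate5 :object5 .

:subject6 a :subject7 .                 # 'a' stands for rdf:type
```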
The '@prefix' and '@base' directives require a trailing '.' after the IRI; the equivalent 'PREFIX' and 'BASE' must not have a trailing '.' after the IRI part of the directive.
Literals are used to identify values such as strings, numbers, dates.
Quoted Literals (Grammar production RDFLiteral) have a lexical form followed by a language tag, a datatype IRI, or neither.
The representation of the lexical form consists of an initial delimiter, a sequence of permitted characters or numeric escape sequence or string escape sequence, and a final delimiter.
The corresponding RDF lexical form is the characters between the delimiters, after processing any escape sequences.
If present, the language tag is preceded by a '@' (U+0040).
If there is no language tag, there may be a datatype IRI, preceded by '^^' (U+005E U+005E).
The datatype IRI in Turtle may be written using either an absolute IRI, a relative IRI reference, or prefixed name.
If there is no datatype IRI and no language tag, the datatype is xsd:string.
'\' (U+005C) may not appear in any quoted literal except as part of an escape sequence.
Other restrictions depend on the delimiter:
Literals delimited by ' (U+0027) may not contain unescaped ', LF (U+000A), or CR (U+000D).
Literals delimited by " (U+0022) may not contain unescaped ", LF (U+000A), or CR (U+000D).
Literals delimited by ''' may not contain the sequence of characters '''.
Literals delimited by """ may not contain the sequence of characters """.
Numbers can be written like other literals with lexical form and datatype (for example, "-5.0"^^xsd:decimal).
Turtle has a shorthand syntax for writing integer values, arbitrary precision decimal values, and double precision floating point values.
|Data type||Description|
|xsd:integer||Integer values may be written as an optional sign and a series of digits. Integers match the regular expression "[+-]?[0-9]+".|
|xsd:decimal||Arbitrary-precision decimals may be written as an optional sign, zero or more digits, a decimal point and one or more digits. Decimals match the regular expression "[+-]?[0-9]*\.[0-9]+".|
|xsd:double||Double-precision floating point values may be written as an optionally signed mantissa with an optional decimal point, the letter "e" or "E", and an optionally signed integer exponent. The exponent matches the regular expression "[+-]?[0-9]+".|
Boolean values may be written as either 'true' or 'false' (case-sensitive) and represent RDF literals with the datatype xsd:boolean.
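The literal forms described in this section (quoted, numeric and boolean) can be sketched together as follows. The prefixes, property names and values are illustrative assumptions.

```turtle
@prefix : <http://example.org/elements/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

:show rdfs:label "That Seventies Show"^^xsd:string ;    # explicit datatype IRI
      rdfs:label "That Seventies Show" ;                # no tag, no datatype: xsd:string
      :localName "That Seventies Show"@en ;             # language tag
      :localName 'Cette Série des Années Septante'@fr-be ;   # single-quote delimiters
      :blurb """A long literal may span
several lines and contain "unescaped" quotes.""" .

:helium :atomicNumber 2 ;                  # xsd:integer
        :atomicMass 4.002602 ;             # xsd:decimal
        :specificGravity 1.663E-4 ;        # xsd:double
        :massInFull "4.002602"^^xsd:decimal ;   # full form of the decimal shorthand
        :isNobleGas true .                 # xsd:boolean
```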
RDF blank nodes in Turtle are expressed as _: followed by a blank node identifier which is a series of characters.
The characters in the identifier are built upon PN_CHARS_BASE, liberalized as follows:
The characters _ and digits may appear anywhere in a blank node identifier.
The character . may appear anywhere except the first or last character.
The characters -, U+00B7, U+0300 to U+036F and U+203F to U+2040 are permitted anywhere except the first character.
A fresh RDF blank node is allocated for each unique blank node identifier in a document. Repeated use of the same blank node identifier identifies the same blank node.
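As a small sketch (labels and prefix are illustrative), the two uses of each label below refer to the same blank node:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

_:alice foaf:knows _:bob .
_:bob foaf:knows _:alice .    # same two blank nodes as in the line above
```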
In Turtle, fresh RDF blank nodes are also allocated when matching the production blankNodePropertyList and the terminal ANON. Both of these may appear in the subject or object position of a triple (see the Turtle Grammar). That subject or object is a fresh RDF blank node. This blank node also serves as the subject of the triples produced by matching the predicateObjectList production embedded in a blankNodePropertyList. The generation of these triples is described in Predicate Lists. Blank nodes are also allocated for collections described below.
The Turtle grammar allows blankNodePropertyLists to be nested.
In this case, each inner [ establishes a new subject blank node which reverts to the outer node at the ], and serves as the current subject for predicate object lists.
The use of predicateObjectList within a blankNodePropertyList is a common idiom for representing a series of properties of a node.
|Abbreviated:||Corresponding simple triples:|
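A sketch of the idiom, with illustrative names, shows nested blank node property lists followed by a set of simple triples with the same shape:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Abbreviated: nested blank node property lists
[ foaf:name "Alice" ] foaf:knows [
    foaf:name "Bob" ;
    foaf:knows [ foaf:name "Eve" ] ;
    foaf:mbox <mailto:bob@example.com>
] .

# Corresponding simple triples (labels are arbitrary; in one document these
# are different blank nodes from the ones generated for the form above)
_:a foaf:name "Alice" .
_:a foaf:knows _:b .
_:b foaf:name "Bob" .
_:b foaf:knows _:c .
_:c foaf:name "Eve" .
_:b foaf:mbox <mailto:bob@example.com> .
```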
RDF provides a Collection [[RDF12-SEMANTICS]] structure for lists of RDF nodes.
The Turtle syntax for Collections is a possibly empty list of RDF terms enclosed by ().
This collection represents an rdf:first/rdf:rest list structure with the sequence of objects of the rdf:first statements being the order of the terms enclosed by ().
The (…) syntax MUST appear in the subject or object position of a triple (see the Turtle Grammar).
The blank node at the head of the list is the subject or object of the containing triple.
This example is a Turtle translation of example 7 in [[[RDF12-XML]]] (example1.ttl):
An example of an RDF collection of two literals.
which is short for (example2.ttl):
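A sketch of a two-literal collection and its expansion, using an illustrative prefix and names:

```turtle
@prefix : <http://example.org/stuff/1.0/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Collection shorthand
:a :b ( "apple" "banana" ) .

# Expanded form of the same list structure
:a :b
    [ rdf:first "apple" ;
      rdf:rest [ rdf:first "banana" ;
                 rdf:rest rdf:nil ] ] .
```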
An example of two identical triples containing literal objects containing newlines, written in plain and long literal forms. The line breaks in this example are LINE FEED characters (U+000A). (example3.ttl):
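A sketch of the plain and long literal forms, with an illustrative prefix. Both statements denote the same triple.

```turtle
@prefix : <http://example.org/stuff/1.0/> .

# Plain form: line breaks written as \n escapes
:a :b "The first line\nThe second line\n  more" .

# Long form: literal line breaks inside triple quotes
:a :b """The first line
The second line
  more""" .
```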
As indicated by the grammar, a collection can be either a subject or an object.
This subject or object will be the novel blank node for the first object, if the collection has one or more objects, or rdf:nil if the collection is empty.
A collection is syntactic sugar for a chain of rdf:first and rdf:rest triples over novel blank nodes, which do not occur anywhere else in the RDF graph.
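The expansion of a collection used as a subject can be sketched as follows. The prefix, the property :p and the blank node labels are illustrative assumptions.

```turtle
@prefix : <http://example.org/stuff/1.0/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Collection as subject
(1 2.0 3E1) :p "w" .

# Equivalent expansion
_:b0 rdf:first 1 ;
     rdf:rest _:b1 .
_:b1 rdf:first 2.0 ;
     rdf:rest _:b2 .
_:b2 rdf:first 3E1 ;
     rdf:rest rdf:nil .
_:b0 :p "w" .
```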
RDF collections can be nested and can involve other syntactic forms. A nested collection is likewise syntactic sugar for rdf:first and rdf:rest triples.
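A sketch of a nested collection that also contains a blank node property list, followed by its expansion. Names and labels are illustrative.

```turtle
@prefix : <http://example.org/stuff/1.0/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Nested collection with an embedded blank node property list
(1 [:p :q] ( 2 ) ) :p2 :q2 .

# Equivalent expansion
_:b0 rdf:first 1 ;
     rdf:rest _:b1 .
_:b1 rdf:first _:b2 .
_:b2 :p :q .
_:b1 rdf:rest _:b3 .
_:b3 rdf:first _:b4 .
_:b4 rdf:first 2 ;
     rdf:rest rdf:nil .
_:b3 rdf:rest rdf:nil .
_:b0 :p2 :q2 .
```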
The [[[SPARQL12-QUERY]]] (SPARQL) [[SPARQL12-QUERY]] uses a Turtle style syntax for its TriplesBlock production. This production differs from the Turtle language in that:
SPARQL permits variables (?name or $name) in any part of the triple.
Turtle's @prefix and @base declarations are case sensitive; the SPARQL-derived PREFIX and BASE are case insensitive.
'true' and 'false' are case insensitive in SPARQL and case sensitive in Turtle. TrUe is not a valid boolean value in Turtle.
For further information see the Syntax for IRIs and SPARQL Grammar sections of the SPARQL query document [[SPARQL12-QUERY]].
This specification defines conformance criteria for Turtle documents and Turtle parsers.
A conforming Turtle document is a Unicode string that conforms to the grammar and additional constraints defined in the Turtle Grammar, starting with the turtleDoc production. A Turtle document serializes an RDF graph.
A conforming Turtle parser is a system capable of reading Turtle documents on behalf of an application. It makes the serialized RDF graph, as defined in Parsing, available to the application, usually through some form of API.
The IRI that identifies the Turtle language is:
This specification does not define how Turtle parsers handle non-conforming input documents.
The media type of Turtle is text/turtle.
The content encoding of Turtle content is always UTF-8. Charset parameters on the media type are required until such time as the text/ media type tree permits UTF-8 to be sent without a charset parameter. See the Internet Media Type registration later in this document for the media type registration form.
A Turtle document is a Unicode [[!UNICODE]] character string encoded in UTF-8. Only Unicode characters in the range U+0000 to U+10FFFF inclusive are allowed.
White space (production WS) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. Rule names below in capitals indicate where white space is significant; these form a possible choice of terminals for constructing a Turtle parser.
White space is significant in the production String.
Comments in Turtle take the form of '#', outside an IRIREF or String, and continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.
Relative IRI references are resolved with base IRIs as per [[[RFC3986]]] [[RFC3986]] using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) are performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of [[[RFC3987]]] [[RFC3987]].
The @base or BASE directive defines the Base IRI used to resolve relative IRI references per [[RFC3986]] section 5.1.1, "Base URI Embedded in Content".
Section 5.1.2, "Base URI from the Encapsulating Entity", defines how the In-Scope Base IRI may come from an encapsulating document, such as a SOAP envelope with an `xml:base` directive or a MIME multipart document with a Content-Location header.
The "Retrieval URI" identified in 5.1.3, "Base URI from the Retrieval URI", is the URL from which a particular Turtle document was retrieved.
If none of the above specifies the Base URI, the default Base URI (section 5.1.4, "Default Base URI") is used.
Each @base or BASE directive sets a new In-Scope Base URI, relative to the previous one.
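The effect of successive base directives can be sketched as follows. The IRIs are illustrative, and the resolved values in the comments follow the basic algorithm of RFC3986 section 5.2.

```turtle
@base <http://example.org/ns/> .
# In-scope base IRI is now http://example.org/ns/
<a1> <b1> <c1> .          # <a1> resolves to http://example.org/ns/a1

@base <foo/> .
# The new base is itself resolved against the previous one:
# in-scope base IRI is now http://example.org/ns/foo/
<a2> <http://example.org/ns/b2> <c2> .   # <a2> resolves to http://example.org/ns/foo/a2

@prefix : <bar#> .
# The prefix IRI is resolved too: ':' maps to http://example.org/ns/foo/bar#
:a3 :b3 :c3 .             # :a3 is http://example.org/ns/foo/bar#a3
```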
There are three forms of escapes used in Turtle documents:
numeric escape sequences represent Unicode code points:
|Escape sequence||Unicode code point|
|'\u' HEX HEX HEX HEX||A Unicode character in the range U+0000 to U+FFFF inclusive corresponding to the value encoded by the four hexadecimal digits interpreted from most significant to least significant digit.|
|'\U' HEX HEX HEX HEX HEX HEX HEX HEX||A Unicode character in the range U+0000 to U+10FFFF inclusive corresponding to the value encoded by the eight hexadecimal digits interpreted from most significant to least significant digit.|
where HEX is a hexadecimal character
HEX ::= [0-9] | [A-F] | [a-f]
string escape sequences represent the characters traditionally escaped in string literals:
|Escape sequence||Unicode code point|
|'\t'||U+0009|
|'\b'||U+0008|
|'\n'||U+000A|
|'\r'||U+000D|
|'\f'||U+000C|
|'\"'||U+0022|
|'\''||U+0027|
|'\\'||U+005C|
reserved character escape sequences consist of a '\' followed by one of ~.-!$&'()*+,;=/?#@%_ and represent the character to the right of the '\'.
|Context||numeric escapes||string escapes||reserved character escapes|
|IRIs, used as RDF terms or as in @prefix, PREFIX, @base, or BASE declarations||yes||no||no|
|local names||no||no||yes|
|Strings||yes||yes||no|
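The three escape forms and the contexts that admit them can be sketched as follows. The prefix and names are illustrative assumptions.

```turtle
@prefix ex: <http://example.org/ns#> .

# Numeric escapes are allowed in IRIs and in strings
<http://example.org/caf\u00E9> ex:label "caf\u00E9" .   # \u00E9 is U+00E9 in both places
ex:s ex:symbol "\U0001F422" .                           # eight-digit form, here U+1F422

# String escapes are allowed only in strings
ex:s ex:note "tab:\t quote:\" backslash:\\" .

# Reserved character escapes are allowed only in local names
ex:s ex:rel ex:lat\-long .                              # the local name is lat-long
```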
%-encoded sequences are in the character range for IRIs and are explicitly allowed in local names.
These appear as a '%' followed by two hex characters and represent that same sequence of three characters. These sequences are not decoded during processing.
A term written as <http://a.example/%66oo-bar> in Turtle designates the IRI http://a.example/%66oo-bar and not the IRI http://a.example/foo-bar.
A term written as ex:%66oo-bar with a prefix @prefix ex: <http://a.example/> also designates the IRI http://a.example/%66oo-bar.
The EBNF used here is defined in XML 1.0 [[!EBNF-NOTATION]]. Production labels consisting of a number and a final 's', e.g. [60s], reference the production with that number in the [[[SPARQL12-QUERY]]] [[SPARQL12-QUERY]].
Keywords in single quotes ('@base', '@prefix', 'a', 'true', 'false') are case-sensitive. Keywords in double quotes ("BASE", "PREFIX") are case-insensitive.
Escape sequences UCHAR and ECHAR are case sensitive.
The ANON token ('[' WS* ']') allows any amount of white space and comments between the []s. The single space version is used in the grammar for clarity.
The strings '@prefix' and '@base' match the pattern for LANGTAG, though neither "prefix" nor "base" is a registered language subtag. This specification does not define whether a quoted literal followed by either of these tokens (e.g. "A"@base) is in the Turtle language.
The RDF 1.2 Concepts and Abstract Syntax specification [[!RDF12-CONCEPTS]] defines three types of RDF Term: IRIs, literals and blank nodes.
Literals are composed of a lexical form and an optional language tag [[!BCP47]] or datatype IRI.
An extra type, prefix, is used during parsing to map string identifiers to namespace IRIs.
This section maps a string conforming to the grammar in the Turtle Grammar to a set of triples by mapping strings matching productions and lexical tokens to RDF terms or their components (e.g. language tags, lexical forms of literals).
Grammar productions change the parser state and emit triples.
Parsing Turtle requires a state of five items:
baseURI: When the base production is reached, the second rule argument, IRIREF, is the base URI used for relative IRI resolution.
namespaces: The second and third rule arguments (PNAME_NS and IRIREF) in the prefixID production assign a namespace name (IRIREF) for the prefix (PNAME_NS). Outside of a prefixID production, any PNAME_NS is substituted with the namespace. Note that the prefix may be an empty string, per the PNAME_NS production.
bnodeLabels: A mapping from string to blank node.
curSubject: The curSubject is bound to the subject production.
curPredicate: The curPredicate is bound to the verb production. If the token matched was "a", curPredicate is bound to the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type.
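This table maps productions and lexical tokens to RDF terms or components of RDF terms listed in Parsing:
|production||type||procedure|
|IRIREF||IRI||The characters between "<" and ">" are taken, with the numeric escape sequences unescaped, to form the unicode string of the IRI. Relative IRI reference resolution is performed per Section 6.3.|
|PNAME_NS||prefix||When used in a prefixID or sparqlPrefix production, the prefix is the potentially empty unicode string matching the first argument of the rule, and is a key into the namespaces map.|
|PNAME_NS||IRI||When used in a PrefixedName production, the IRI is the value in the namespaces map corresponding to the first argument of the rule.|
|PNAME_LN||IRI||A potentially empty prefix is identified by the first sequence, PNAME_NS. The namespaces map MUST have a corresponding namespace. The unicode string of the IRI is formed by unescaping the reserved characters in the second argument, PN_LOCAL, and concatenating this onto the namespace.|
|STRING_LITERAL_SINGLE_QUOTE||lexical form||The characters between the outermost "'"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.|
|STRING_LITERAL_QUOTE||lexical form||The characters between the outermost '"'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.|
|STRING_LITERAL_LONG_SINGLE_QUOTE||lexical form||The characters between the outermost "'''"s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.|
|STRING_LITERAL_LONG_QUOTE||lexical form||The characters between the outermost '"""'s are taken, with numeric and string escape sequences unescaped, to form the unicode string of a lexical form.|
|LANGTAG||language tag||The characters following the @ form the unicode string of the language tag.|
|RDFLiteral||literal||The literal has a lexical form of the first rule argument, String. If the '^^' iri rule matched, the datatype is iri and the literal has no language tag. If the LANGTAG rule matched, the datatype is rdf:langString and the language tag is LANGTAG. If neither matched, the datatype is xsd:string and the literal has no language tag.|
|INTEGER||literal||The literal has a lexical form of the input string, and a datatype of xsd:integer.|
|DECIMAL||literal||The literal has a lexical form of the input string, and a datatype of xsd:decimal.|
|DOUBLE||literal||The literal has a lexical form of the input string, and a datatype of xsd:double.|
|BooleanLiteral||literal||The literal has a lexical form of true or false, depending on which matched the input, and a datatype of xsd:boolean.|
|BLANK_NODE_LABEL||blank node||The string matching the second argument, PN_LOCAL, is a key in bnodeLabels. If there is no corresponding blank node in the map, one is allocated.|
|ANON||blank node||A blank node is generated.|
|blankNodePropertyList||blank node||A blank node is generated. Note the rules for blankNodePropertyList in the next section.|
|collection||blank node||For non-empty lists, a blank node is generated. Note the rules for collection in the next section.|
|collection||IRI||For empty lists, the resulting IRI is rdf:nil. Note the rules for collection in the next section.|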
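A Turtle document defines an RDF graph composed of a set of RDF triples.
The subject production sets the curSubject.
The verb production sets the curPredicate.
Each object N in the document produces an RDF triple: curSubject curPredicate N .
Beginning the blankNodePropertyList production records the curSubject and curPredicate, and sets curSubject to a novel blank node B.
Finishing the blankNodePropertyList production restores curSubject and curPredicate.
The node produced by matching blankNodePropertyList is the blank node B.
Beginning the collection production records the curSubject and curPredicate.
Each object in the collection production has a curSubject set to a novel blank node B and a curPredicate set to rdf:first.
Each object objectn after the first produces a triple: objectn-1 rdf:rest objectn .
Finishing the collection production creates an additional triple curSubject rdf:rest rdf:nil . and restores curSubject and curPredicate.
The node produced by matching collection is the first blank node B for non-empty lists and rdf:nil for empty lists.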
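The following informative example shows the semantic actions performed when parsing a Turtle document with an LALR(1) parser. The actions include mapping the prefix ericFoaf to an IRI; saving curSubject and reassigning it to a novel blank node when a blank node property list begins; and restoring curSubject and curPredicate to their saved values when it ends.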
HTML script tags can be used to embed data blocks in documents. Turtle can be easily embedded in HTML this way.
Turtle content should be placed in a script tag with the type attribute set to text/turtle. The < and > symbols do not need to be escaped inside of script tags. The character encoding of the embedded Turtle will match the HTML document's encoding.
Like JavaScript, Turtle authored for HTML (text/html) can break when used in XHTML (application/xhtml+xml).
When embedded in XHTML, Turtle data blocks must be enclosed in CDATA sections. Those CDATA markers must be in Turtle comments. If the character sequence "]]>" occurs in the document it must be escaped using string escapes (\u005d\u005d\u003e). This will also make Turtle safe in polyglot documents served as both text/html and application/xhtml+xml. Failing to use CDATA sections or escape "]]>" may result in a non well-formed XML document.
There are no syntactic or grammar differences between parsing Turtle
that has been embedded and normal Turtle documents.
A Turtle document parsed from an HTML DOM will be a stream
of character data rather than a stream of UTF-8 encoded bytes.
No decoding is necessary if the HTML document has already been parsed into DOM.
Each script data block is considered to be its own Turtle document.
@prefix and @base declarations in a Turtle data block are scoped to that data block and do not affect other data blocks.
The HTML lang attribute or XHTML xml:lang attribute have no effect on the parsing of the data blocks.
The base URI of the encapsulating HTML document provides a
"Base URI Embedded in Content" per RFC3986 section 5.1.1.
The STRING_LITERAL_SINGLE_QUOTE, STRING_LITERAL_QUOTE, STRING_LITERAL_LONG_SINGLE_QUOTE, and STRING_LITERAL_LONG_QUOTE productions allow the use of unescaped control characters. Although this specification does not directly expose this content to an end user, it might be presented through a user agent, which may cause the presented text to be obfuscated due to presentation of such characters.
Turtle is a general-purpose assertion language; applications may evaluate given data to infer more assertions or to dereference IRIs, invoking the security considerations of the scheme for that IRI. Note in particular, the privacy issues in [[RFC3023]] section 10 for HTTP IRIs. Data obtained from an inaccurate or malicious data source may lead to inaccurate or misleading conclusions, as well as the dereferencing of unintended IRIs. Care must be taken to align the trust in consulted resources with the sensitivity of the intended use of the data; inferences of potential medical treatments would likely require different trust than inferences for trip planning.
The Turtle language is used to express arbitrary application data; security considerations will vary by domain of use. Security tools and protocols applicable to text (for example, PGP encryption, checksum validation, password-protected compression) may also be used on Turtle documents. Security/privacy protocols must be imposed which reflect the sensitivity of the embedded information.
Turtle can express data which is presented to the user, such as RDF Schema labels. Applications rendering strings retrieved from untrusted Turtle documents, or using unescaped characters, SHOULD use warnings and other appropriate means to limit the possibility that malignant strings might be used to mislead the reader. The security considerations in the media type registration for XML ([[RFC3023]] section 10) provide additional guidance around the expression of arbitrary data and markup.
Turtle uses IRIs as term identifiers. Applications interpreting data expressed in Turtle SHOULD address the security issues of [[[RFC3987]]] [[RFC3987]] Section 8, as well as [[[RFC3986]]] [[RFC3986]] Section 7.
Multiple IRIs may have the same appearance. Characters in different scripts may look similar (for instance, a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (for example, LATIN SMALL LETTER "E" followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER "E" WITH ACUTE). Any person or application that is writing or interpreting data in Turtle must take care to use the IRI that matches the intended semantics, and avoid IRIs that may look similar. Further information about matching visually similar characters can be found in [[[UNICODE-SECURITY]]] [[UNICODE-SECURITY]] and [[[RFC3987]]] [[RFC3987]] Section 8.
The Internet Media Type (formerly known as MIME Type) for Turtle is "text/turtle".
It is recommended that Turtle files have the extension ".ttl" (all lowercase) on all platforms.
It is recommended that Turtle files stored on Macintosh HFS file systems be given a file type of "TEXT".
The information that follows has been submitted to the IESG for review, approval, and registration with IANA.
charset: this parameter is required when transferring non-ASCII data. If present, the value of charset is always UTF-8.
This work was described in the paper New Syntaxes for RDF, which discusses other RDF syntaxes and the background to Turtle (submitted to WWW2004, referred to as N-Triples Plus there).
This work was started during the Semantic Web Advanced Development Europe (SWAD-Europe) project funded by the EU IST-7 programme IST-2001-34732 (2002-2004) and further development supported by the Institute for Learning and Research Technology at the University of Bristol, UK (2002-Sep 2005).
Valuable contributions to this version were made by Gregg Kellogg, Andy Seaborn, Sandro Hawke and the members of the RDF Working Group.
The document was improved through the review process by the wider community.
In addition to the editors, the following people have contributed to this specification: