Copyright © 2024 World Wide Web Consortium. W3C® liability, trademark and permissive document license rules apply.
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Web Applications Working Group as an Editor's Draft.
Publication as an Editor's Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 03 November 2023 W3C Process Document.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and terminate these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
When a method or an attribute is said to call another method or attribute, the user agent must invoke its internal API for that attribute or method so that e.g. the author can't change the behavior by overriding attributes or methods with custom properties or functions in ECMAScript. [ECMA-262]
Unless otherwise stated, string comparisons are done in a case-sensitive manner.
If an algorithm calls into another algorithm, any exception that is thrown by the latter (unless it is explicitly caught), must cause the former to terminate, and the exception to be propagated up to its caller.
Vendor-specific proprietary extensions to this specification are strongly discouraged. Authors must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
If vendor-specific extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. Such an extension specification becomes an applicable specification for the purposes of conformance requirements in this specification.
A document object model (DOM) is an in-memory representation of various types of Nodes where each Node is connected in a tree. The [HTML5] and [DOM4] specifications describe DOM and its Nodes in greater detail.
Parsing is the term used for converting a string representation of a DOM into an actual DOM, and Serializing is the term used to transform a DOM back into a string. This specification concerns itself with defining various APIs for both parsing and serializing a DOM.
HTMLDivElement (nodeName: "div")
┃
┣━ HTMLSpanElement (nodeName: "span")
┃  ┃
┃  ┗━ Text (data: "some ")
┃
┗━ HTMLElement (nodeName: "em")
   ┃
   ┗━ Text (data: "text!")
If the HTMLDivElement node above is stored in a variable myDiv, then to serialize myDiv's children, simply get (read) the Element's innerHTML property (this triggers the serialization):
var serializedChildren = myDiv.innerHTML;
// serializedChildren has the value:
// "<span>some </span><em>text!</em>"
To parse new children for myDiv from a string (replacing its existing children), simply set the innerHTML property (this triggers parsing of the assigned string):
myDiv.innerHTML = "<span>new</span><em>children!</em>";
This specification describes two flavors of parsing and serializing: HTML and XML (with XHTML being a type of XML). Each follows the rules of its respective markup language. The above example shows HTML parsing and serialization. The specific algorithms for HTML parsing and serializing are defined in the [HTML5] specification. This specification contains the algorithm for XML serializing. The grammar for XML parsing is described in the [XML10] specification.
Round-tripping a DOM means to serialize and then immediately parse the serialized string back into a DOM. Ideally, this process does not result in any data loss with respect to the identity and attributes of the Nodes in the DOM. Round-tripping is especially tricky for an XML serialization, which must be concerned with preserving the Node's namespace identity in the serialization (whereas namespaces are ignored in HTML).
Element (nodeName: "root")
┃
┗━ HTMLScriptElement (nodeName: "script")
   ┃
   ┗━ Text (data: "alert('hello world')")
An XML serialization must include the HTMLScriptElement Node's namespace in order to preserve the identity of the script element, and to allow the serialized string to round-trip through an XML parser. Assuming that root is in a variable named root:
var xmlSerialization = new XMLSerializer().serializeToString(root);
// xmlSerialization has the value:
// "<root><script xmlns="http://www.w3.org/1999/xhtml">alert('hello world')</script></root>"
The term context object means the object on which the API being discussed was called.
The following terms are understood to represent their respective namespaces in this specification (and make it easier to read):
The HTML namespace is http://www.w3.org/1999/xhtml
The XML namespace is http://www.w3.org/XML/1998/namespace
The XMLNS namespace is http://www.w3.org/2000/xmlns/
The definition of DOMParser has moved to the HTML Standard.
[Exposed=Window]
interface XMLSerializer {
  constructor();
  DOMString serializeToString(Node root);
};
The XMLSerializer() constructor must return a new XMLSerializer object.
The serializeToString(root) method must produce an XML serialization of root passing a value of false for the require well-formed parameter, and return the result.
The definition of InnerHTML has moved to the HTML Standard.
Element interface
The definition of outerHTML has moved to the HTML Standard.
The definition of insertAdjacentHTML has moved to the HTML Standard.
Range interface
The definition of createContextualFragment has moved to the HTML Standard.
The definition of fragment parsing algorithm has moved to the HTML Standard.
The definition of fragment serializing algorithm has moved to the HTML Standard.
An XML serialization differs from an HTML serialization in the following ways:
The namespaceURI of each element and attribute is preserved. In some cases this means that an existing prefix, prefix declaration attribute or default namespace declaration attribute might be dropped, substituted or changed. An HTML serialization does not attempt to preserve the namespaceURI.
Otherwise, the algorithm for producing an XML serialization is designed to produce a serialization that is compatible with the HTML parser. For example, elements in the HTML namespace that contain no child nodes are serialized with an explicit begin and end tag rather than using the empty-element tag syntax.
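The tag-closing behavior described here can be illustrated with a small sketch. It is not the normative algorithm: closeStartTag and its plain arguments are hypothetical names, and the conditions reflect one reading of the element serialization steps later in this section (void HTML elements get " />", childless elements outside the HTML namespace get "/>", everything else gets ">" and a separate end tag).

```javascript
// Illustrative sketch only; closeStartTag is a hypothetical helper, not a DOM API.
const HTML_NS = "http://www.w3.org/1999/xhtml";

// The void element names listed in the element serialization steps.
const VOID_ELEMENTS = new Set([
  "area", "base", "basefont", "bgsound", "br", "col", "embed", "frame",
  "hr", "img", "input", "keygen", "link", "menuitem", "meta", "param",
  "source", "track", "wbr",
]);

// Returns the characters that finish a start tag, plus a flag saying
// whether the end tag should be skipped.
function closeStartTag(ns, localName, hasChildren) {
  let markup = "";
  let skipEndTag = false;
  if (ns === HTML_NS && !hasChildren && VOID_ELEMENTS.has(localName)) {
    // Void HTML element: " /" keeps the output friendly to HTML parsers.
    markup += " /";
    skipEndTag = true;
  } else if (ns !== HTML_NS && !hasChildren) {
    // Non-HTML element without children: empty-element tag syntax.
    markup += "/";
    skipEndTag = true;
  }
  markup += ">";
  return { markup, skipEndTag };
}
```

Under these assumptions an empty HTML div still produces an explicit start and end tag, while an empty element in another namespace collapses to an empty-element tag.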
Per [DOM4], Attr objects do not inherit from Node, and thus cannot be serialized by the XML serialization algorithm. An attempt to serialize an Attr object will result in an empty string.
To produce an XML serialization of a Node node given a flag require well-formed, run the following steps:
null.
The context namespace tracks the XML serialization algorithm's current default namespace. The context namespace is changed when either an Element Node has a default namespace declaration, or the algorithm generates a default namespace declaration for the Element Node to match its own namespace. The algorithm assumes no namespace (null) to start.
"xml" to prefix map.
1. The generated namespace prefix index is used to generate a new unique prefix value when no suitable existing namespace prefix is available to serialize a node's namespaceURI (or the namespaceURI of one of node's attributes). See the generate a prefix algorithm.
"InvalidStateError" DOMException.
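The starting state set up by these steps can be sketched as follows. This is a minimal illustration: createInitialSerializationState is a hypothetical name, and modelling the namespace prefix map as a JavaScript Map from namespace URI to an ordered list of prefixes is an assumption based on the definition of the namespace prefix map later in this section.

```javascript
// Illustrative sketch only; the function and field names are hypothetical.
const XML_NAMESPACE = "http://www.w3.org/XML/1998/namespace";

function createInitialSerializationState() {
  // Namespace prefix map: namespace URI -> ordered list of prefixes.
  const prefixMap = new Map();
  // The XML namespace is pre-registered with the reserved "xml" prefix.
  prefixMap.set(XML_NAMESPACE, ["xml"]);
  return {
    contextNamespace: null, // no default namespace to start
    prefixMap,
    prefixIndex: 1, // feeds the generate a prefix algorithm
  };
}
```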
Each of the following algorithms for producing an XML serialization of a DOM node takes as input a node to serialize and the following arguments:
The XML serialization algorithm produces an XML serialization of an arbitrary DOM node node based on the node's interface type. Each referenced algorithm is to be passed the arguments as they were received by the caller and return their result to the caller. Re-throw any exceptions. If node's interface is:
Element
Document
Comment
Text
DocumentFragment
DocumentType
ProcessingInstruction
Attr object
If the require well-formed flag is set (its value is true), and this node's localName attribute contains the character ":" (U+003A COLON) or does not match the XML Name production, then throw an exception; the serialization of this node would not be a well-formed element.
"<" (U+003C LESS-THAN SIGN).
false.
false.
prefix strings as its keys, with corresponding namespaceURI Node values as the map's key values (in this map, the null namespace is represented by the empty string).
This map is local to each element. It is used to ensure there are no conflicting prefixes should a new namespace prefix attribute need to be generated. It is also used to enable skipping of duplicate prefix definitions when writing an element's attributes: the map allows the algorithm to distinguish between a prefix in the namespace prefix map that might be locally-defined (to the current Element) and one that is not.
The above step will update map with any found namespace prefix definitions, add the found prefix definitions to the local prefixes map and return a local default namespace value defined by a default namespace attribute if one exists. Otherwise it returns null.
namespaceURI attribute.
null, then set ignore namespace definition attribute to true.
"xml:" and the value of node's localName.
localName. The node's prefix, if it exists, is dropped.
prefix attribute.
The above may return null if no namespace key ns exists in map.
"xmlns", then run the following steps:
prefix "xmlns" will not legally round-trip in a conforming XML parser.
null (a namespace prefix is defined which maps to ns), then:
The following may serialize a different prefix than the Element's existing prefix if it already had one. However, the retrieving a preferred prefix string algorithm already tried to match the existing prefix if possible.
":" (U+003A COLON), and node's localName. There exists on this node or the node's ancestry a namespace prefix definition that defines the node's namespace.
null (there exists a locally-defined default namespace declaration attribute) and its value is not the XML namespace, then let inherited ns get the value of local default namespace unless the local default namespace is the empty string in which case let it get null (the context namespace is changed to the declared default, rather than this node's own namespace).
Any default namespace definitions or namespace prefixes that define the XML namespace are omitted when serializing this node's attributes.
null, then:
By this step, there is no namespace or prefix mapping declaration in this node (or any parent node visited by this algorithm) that defines prefix, otherwise the step labelled Found a suitable namespace prefix would have been followed. The sub-steps that follow will create a new namespace prefix declaration for prefix and ensure that prefix does not conflict with an existing namespace prefix declaration of the same localName in node's attribute list.
":" (U+003A COLON), and node's localName.
The following serializes a namespace prefix declaration for prefix which was just added to the map.
" " (U+0020 SPACE);
"xmlns:";
"="" (U+003D EQUALS SIGN, U+0022 QUOTATION MARK);
""" (U+0022 QUOTATION MARK).
null (there exists a locally-defined default namespace declaration attribute), then let inherited ns get the value of local default namespace unless the local default namespace is the empty string in which case let it get null.
null, or local default namespace is not null and its value is not equal to ns, then:
At this point, the namespace for this node still needs to be serialized, but there's no prefix (or candidate prefix) available; the following uses the default namespace declaration to define the namespace, optionally replacing an existing default declaration if present.
true.
localName.
The new default namespace will be used in the serialization to define this node's namespace and act as the context namespace for its children.
The following serializes the new (or replacement) default namespace definition.
" " (U+0020 SPACE);
"xmlns";
"="" (U+003D EQUALS SIGN, U+0022 QUOTATION MARK);
""" (U+0022 QUOTATION MARK).
localName, let the value of inherited ns be ns, and append the value of qualified name to markup.
All of the combinations where ns is not equal to inherited ns are handled above such that node will be serialized preserving its original namespaceURI.
localName matches any one of the following void elements: "area", "base", "basefont", "bgsound", "br", "col", "embed", "frame", "hr", "img", "input", "keygen", "link", "menuitem", "meta", "param", "source", "track", "wbr"; then append the following to markup, in the order listed:
" " (U+0020 SPACE);
"/" (U+002F SOLIDUS).
true.
"/" (U+002F SOLIDUS) to markup and set the skip end tag flag to true.
">" (U+003E GREATER-THAN SIGN) to markup.
true, then return the value of markup and skip the remaining steps. The node is a leaf-node.
localName matches the string "template", then this is a template element. Append to markup the result of XML serializing a DocumentFragment node given the template element's template contents (a DocumentFragment), providing inherited ns, map, prefix index, and the require well-formed flag.
This allows template content to round-trip, given the rules for parsing XHTML documents.
"</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS);
">" (U+003E GREATER-THAN SIGN).
The following algorithm will update the namespace prefix map with any found namespace prefix definitions, add the found prefix definitions to the local prefixes map, and return a local default namespace value defined by a default namespace attribute if one exists. Otherwise it returns null.
When recording the namespace information for an Element element, given a namespace prefix map map and a local prefixes map (initially empty), the user agent must run the following steps:
null.
attributes, in the order they are specified in the element's attribute list:
The following conditional steps find namespace prefixes. Only attributes in the XMLNS namespace are considered (e.g., attributes made to look like namespace declarations via setAttribute("xmlns:pretend-prefix", "pretend-namespace") are not included).
namespaceURI value.
prefix.
null, then attr is a default namespace declaration. Set the default namespace attr value to attr's value and stop running these steps, returning to Main to visit the next attribute.
null and attr is a namespace prefix definition. Run the following steps:
localName.
value.
XML namespace definitions in prefixes are completely ignored (in order to avoid unnecessary work when there might be prefix conflicts). XML namespaced elements are always handled uniformly by prefixing (and overriding if necessary) the element's localname with the reserved "xml" prefix.
null instead.
This step avoids adding duplicate prefix definitions for the same namespace in the map. This has the side-effect of avoiding later serialization of duplicate namespace prefix declarations in any descendant nodes.
null with the empty string if applicable.
The empty string is a legitimate return value and is not converted to null.
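The recording steps above can be approximated with the sketch below. It is a reading of the partially shown steps, not the normative text: recordNamespaceInformation is a hypothetical function, attributes are modelled as plain objects with namespaceURI, prefix, localName and value fields, and both maps are JavaScript Map instances.

```javascript
// Illustrative sketch only; names and data shapes are assumptions.
const XML_NS_URI = "http://www.w3.org/XML/1998/namespace";
const XMLNS_NS_URI = "http://www.w3.org/2000/xmlns/";

function recordNamespaceInformation(attributes, map, localPrefixesMap) {
  let defaultNamespaceAttrValue = null;
  for (const attr of attributes) {
    // Only attributes in the XMLNS namespace declare namespaces.
    if (attr.namespaceURI !== XMLNS_NS_URI) continue;
    if (attr.prefix === null) {
      // xmlns="..." is a default namespace declaration.
      defaultNamespaceAttrValue = attr.value;
      continue;
    }
    // xmlns:foo="..." is a namespace prefix definition.
    const prefixDefinition = attr.localName;
    let namespaceDefinition = attr.value;
    // Definitions of the XML namespace are ignored entirely.
    if (namespaceDefinition === XML_NS_URI) continue;
    // An empty value stands for "no namespace" (null) in the prefix map.
    if (namespaceDefinition === "") namespaceDefinition = null;
    const known = map.get(namespaceDefinition);
    // Skip a prefix that is already recorded for this namespace.
    if (known !== undefined && known.includes(prefixDefinition)) continue;
    if (known !== undefined) known.push(prefixDefinition);
    else map.set(namespaceDefinition, [prefixDefinition]);
    // In the local prefixes map, null is stored as the empty string.
    localPrefixesMap.set(
      prefixDefinition,
      namespaceDefinition === null ? "" : namespaceDefinition
    );
  }
  return defaultNamespaceAttrValue;
}
```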
A namespace prefix map is a map that associates namespaceURI and namespace prefix lists, where namespaceURI values are the map's unique keys (which can include the null value representing no namespace), and ordered lists of associated prefix values are the map's key values. The namespace prefix map will be populated by previously seen namespaceURIs and all their previously encountered prefix associations for a given node and its ancestors.
Note: the last seen prefix for a given namespaceURI is at the end of its respective list. The list is searched to find potentially matching prefixes, and if no matches are found for the given namespaceURI, then the last prefix in the list is used. See copy a namespace prefix map and retrieve a preferred prefix string for additional details.
To copy a namespace prefix map map means to copy the map's keys into a new empty namespace prefix map, and to copy each of the values in the namespace prefix list associated with each key's value into a new list which should be associated with the respective key in the new map.
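A minimal sketch of this copy, assuming the namespace prefix map is modelled as a JavaScript Map whose values are arrays of prefix strings (copyNamespacePrefixMap is a hypothetical name):

```javascript
// Illustrative sketch only; assumes Map<namespace, string[]>.
function copyNamespacePrefixMap(map) {
  const copy = new Map();
  for (const [ns, prefixes] of map) {
    // Each list is copied so later changes do not leak into the original.
    copy.set(ns, [...prefixes]);
  }
  return copy;
}
```

Copying the lists as well as the keys matters because prefixes added while serializing one element's subtree should not become visible to its siblings.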
To retrieve a preferred prefix string preferred prefix from the namespace prefix map map given a namespace ns, the user agent should:
null value.
There will always be at least one prefix value in the list.
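The lookup can be sketched as below, following the note on the namespace prefix map: a matching prefix is preferred, and otherwise the last prefix in the list is used. retrievePreferredPrefix is a hypothetical name and the Map-of-arrays model is an assumption.

```javascript
// Illustrative sketch only; assumes Map<namespace, string[]>.
function retrievePreferredPrefix(preferredPrefix, map, ns) {
  const candidates = map.get(ns);
  // No key for this namespace: there is no prefix to offer.
  if (candidates === undefined) return null;
  // Prefer an exact match with the requested prefix.
  if (candidates.includes(preferredPrefix)) return preferredPrefix;
  // Otherwise fall back to the most recently added prefix.
  return candidates[candidates.length - 1];
}
```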
To check if a prefix string prefix is found in a namespace prefix map map given a namespace ns, the user agent should:
false.
true, otherwise return false.
To add a prefix string prefix to the namespace prefix map map given a namespace ns, the user agent should:
null.
null, then create a new list with prefix as the only item in the list, and associate that list with a new key ns in map.
The steps in retrieve a preferred prefix string use the list to track the most recently used (MRU) prefix associated with a given namespace, which will be the prefix at the end of the list. This list may contain duplicates of the same prefix value seen earlier (and that's OK).
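The two helpers above (checking for a prefix and adding one) might look like this under the same Map-of-arrays assumption. isPrefixFound and addPrefix are hypothetical names, and the append behavior for an existing list is inferred from the note that the most recently used prefix sits at the end of the list.

```javascript
// Illustrative sketch only; assumes Map<namespace, string[]>.
function isPrefixFound(prefix, map, ns) {
  const candidates = map.get(ns);
  if (candidates === undefined) return false;
  return candidates.includes(prefix);
}

function addPrefix(prefix, map, ns) {
  const candidates = map.get(ns);
  if (candidates === undefined) {
    // First prefix seen for this namespace: start a new list.
    map.set(ns, [prefix]);
  } else {
    // Appending keeps the most recently used prefix at the end;
    // duplicates are allowed.
    candidates.push(prefix);
  }
}
```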
The XML serialization of the attributes of an Element element together with a namespace prefix map map, a generated namespace prefix index prefix index reference, a local prefixes map, an ignore namespace definition attribute flag, and a require well-formed flag, is the result of the following algorithm:
namespaceURI and localName pairs, and is populated as each attr is processed. This set is used to [optionally] enforce the well-formed constraint that an element cannot have two attributes with the same namespaceURI and localName. This can occur when two otherwise identical attributes on the same element differ only by their prefix values.
attributes, in the order they are specified in the element's attribute list:
If the require well-formed flag is set (its value is true), and the localname set contains a tuple whose values match those of a new tuple consisting of attr's namespaceURI attribute and localName attribute, then throw an exception; the serialization of this attr would fail to produce a well-formed element serialization.
namespaceURI attribute and localName attribute, and add it to the localname set.
namespaceURI value.
null.
null, then run these sub-steps:
prefix value.
value is the XML namespace;
The XML namespace cannot be redeclared and survive round-tripping (unless it defines the prefix "xml"). To avoid this problem, this algorithm always prefixes elements in the XML namespace with "xml" and drops any related definitions as seen in the above condition.
prefix is null and the ignore namespace definition attribute flag is true (the Element's default namespace attribute should be skipped);
prefix is not null and either
localName is not a key contained in the local prefixes map, or
localName is present in the local prefixes map but the value of the key does not match attr's value
localName (as the prefix to find) is found in the namespace prefix map given the namespace consisting of the attr's value (the current namespace prefix definition was exactly defined previously, on an ancestor element not the current element whose attributes are being processed).
If the require well-formed flag is set (its value is true), and the value of attr's value attribute matches the XMLNS namespace, then throw an exception; the serialization of this attribute would produce invalid XML because the XMLNS namespace is reserved and cannot be applied as an element's namespace via XML parsing.
DOM APIs do allow creation of elements in the XMLNS namespace but with strict qualifications.
If the require well-formed flag is set (its value is true), and the value of attr's value attribute is the empty string, then throw an exception; namespace prefix declarations cannot be used to undeclare a namespace (use a default namespace declaration instead).
prefix matches the string "xmlns", then let candidate prefix be the string "xmlns".
" " (U+0020 SPACE);
"xmlns:";
"="" (U+003D EQUALS SIGN, U+0022 QUOTATION MARK);
""" (U+0022 QUOTATION MARK).
" " (U+0020 SPACE) to result.
null, then append to result the concatenation of candidate prefix with ":" (U+003A COLON).
If the require well-formed flag is set (its value is true), and this attr's localName attribute contains the character ":" (U+003A COLON) or does not match the XML Name production or equals "xmlns" and attribute namespace is null, then throw an exception; the serialization of this attr would not be a well-formed attribute.
localName;
"="" (U+003D EQUALS SIGN, U+0022 QUOTATION MARK);
value attribute and the require well-formed flag as input;
""" (U+0022 QUOTATION MARK).
When serializing an attribute value given an attribute value and require well-formed flag, the user agent must run the following steps:
If the require well-formed flag is set (its value is true), and attribute value contains characters that are not matched by the XML Char production, then throw an exception; the serialization of this attribute value would fail to produce a well-formed element serialization.
null, then return the empty string.
"&" with "&amp;"
""" with "&quot;"
"<" with "&lt;"
">" with "&gt;"
This matches behavior present in browsers, and goes above and beyond the grammar requirement in the XML specification's AttValue production by also replacing ">" characters.
To generate a prefix given a namespace prefix map map, a string new namespace, and a reference to a generated namespace prefix index prefix index, the user agent must run the following steps:
"ns" and the current numerical value of prefix index.
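A possible shape for this algorithm is sketched below. Only the "ns" plus index concatenation is taken from the text above; incrementing the index and registering the new prefix in the map are assumptions based on how the prefix index and the map are described elsewhere in this section. generatePrefix is a hypothetical name, and the reference to the prefix index is modelled as an object with a value field.

```javascript
// Illustrative sketch only; prefixIndexRef is a hypothetical { value: number }.
function generatePrefix(map, newNamespace, prefixIndexRef) {
  // For example "ns1", "ns2", ...
  const generatedPrefix = "ns" + prefixIndexRef.value;
  // Assumption: the shared index advances so the next prefix is unique.
  prefixIndexRef.value += 1;
  // Assumption: the generated prefix is recorded for the namespace.
  const prefixes = map.get(newNamespace);
  if (prefixes === undefined) map.set(newNamespace, [generatedPrefix]);
  else prefixes.push(generatedPrefix);
  return generatedPrefix;
}
```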
If the require well-formed flag is set (its value is true), and this node has no documentElement (the documentElement attribute's value is null), then throw an exception; the serialization of this node would not be a well-formed document.
Otherwise, run the following steps:
This will serialize any number of ProcessingInstruction and Comment nodes both before and after the Document's documentElement node, including at most one DocumentType node. (Text nodes are not allowed as children of the Document.)
If the require well-formed flag is set (its value is true), and node's data contains characters that are not matched by the XML Char production or contains "--" (two adjacent U+002D HYPHEN-MINUS characters) or that ends with a "-" (U+002D HYPHEN-MINUS) character, then throw an exception; the serialization of this node's data would not be well-formed.
Otherwise, return the concatenation of "<!--", node's data, and "-->".
If the require well-formed flag is set (its value is true), and node's data contains characters that are not matched by the XML Char production, then throw an exception; the serialization of this node's data would not be well-formed.
data.
"&" in markup by "&amp;".
"<" in markup by "&lt;".
">" in markup by "&gt;".
If the require well-formed flag is true and the node's publicId attribute contains characters that are not matched by the XML PubidChar production, then throw an exception; the serialization of this node would not be a well-formed document type declaration.
If the require well-formed flag is true and the node's systemId attribute contains characters that are not matched by the XML Char production or that contains both a """ (U+0022 QUOTATION MARK) and a "'" (U+0027 APOSTROPHE), then throw an exception; the serialization of this node would not be a well-formed document type declaration.
"<!DOCTYPE" to markup.
" " (U+0020 SPACE) to markup.
name attribute to markup. For a node belonging to an HTML document, the value will be all lowercase.
publicId is not the empty string then append the following, in the order listed, to markup:
" " (U+0020 SPACE);
"PUBLIC";
" " (U+0020 SPACE);
""" (U+0022 QUOTATION MARK);
publicId attribute;
""" (U+0022 QUOTATION MARK).
systemId is not the empty string and the node's publicId is set to the empty string, then append the following, in the order listed, to markup:
" " (U+0020 SPACE);
"SYSTEM".
systemId is not the empty string then append the following, in the order listed, to markup:
" " (U+0020 SPACE);
""" (U+0022 QUOTATION MARK);
systemId attribute;
""" (U+0022 QUOTATION MARK).
">" (U+003E GREATER-THAN SIGN) to markup.
If the require well-formed flag is set (its value is true), and node's target contains a ":" (U+003A COLON) character or is an ASCII case-insensitive match for the string "xml", then throw an exception; the serialization of this node's target would not be well-formed.
If the require well-formed flag is set (its value is true), and node's data contains characters that are not matched by the XML Char production or contains the string "?>" (U+003F QUESTION MARK, U+003E GREATER-THAN SIGN), then throw an exception; the serialization of this node's data would not be well-formed.
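The two checks above can be sketched as follows. The output format ("<?", target, a space, data, "?>") is an assumption based on standard XML processing instruction syntax, since the corresponding steps are not shown here. serializeProcessingInstruction and PI_XML_CHARS are hypothetical names, and a plain Error stands in for the thrown exception.

```javascript
// Illustrative sketch only. Matches strings made solely of XML Char characters.
const PI_XML_CHARS =
  /^[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\u{10000}-\u{10FFFF}]*$/u;

function serializeProcessingInstruction(target, data, requireWellFormed) {
  if (requireWellFormed &&
      (target.includes(":") || target.toLowerCase() === "xml")) {
    throw new Error("processing instruction target would not be well-formed");
  }
  if (requireWellFormed &&
      (!PI_XML_CHARS.test(data) || data.includes("?>"))) {
    throw new Error("processing instruction data would not be well-formed");
  }
  // Assumed output shape: <?target data?>
  return "<?" + target + " " + data + "?>";
}
```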
AttValue, Char, EmptyElemTag, Name and PubidChar productions
We acknowledge with gratitude the original work of Ms2ger and others at the WHATWG, who created and maintained the original DOM Parsing and Serialization Living Standard upon which this specification is based.
Thanks to C. Scott Ananian, Victor Costan, Aryeh Gregor, Anne van Kesteren, Arkadiusz Michalski, Simon Pieters, Henri Sivonen, Josh Soref and Boris Zbarsky, for their useful comments.
Special thanks to Ian Hickson for first defining the innerHTML and outerHTML attributes, and the insertAdjacentHTML method in [HTML5] and his useful comments.