This document informatively describes and elaborates the constraints and extensions to TTML defined by the TTML Live Extensions Module v1. The functionality described is intended to support the contribution of real time streams of content, primarily subtitles and captions, from an authoring or playout system via a controlled network to an encoder whose output is intended for wider distribution. It makes use of a system model including nodes, sequences and streams. The primary use case is the transfer of live subtitles and captions within a broadcast or media preparation environment.

This document is an abridged version of [[EBU-TT-Live-1-0]] where normative information has been removed.

These extensions are based on [[EBU-TT-Live-1-0]] developed by EBU and benefit from technical consensus and implementation experience gathered there.

Scope

This document elaborates the extensions to TTML defined by the TTML Live Extensions Module, which support the contribution of real time streams of content, primarily subtitles and captions, from an authoring or playout system via a controlled network to an encoder whose output is intended for wider distribution. The primary use case is the transfer of live subtitles and captions within a broadcast or media preparation environment.

These extensions are designed to support direct contribution of streams of TTML in the absence of any other data structure that might provide additional semantics, or to work within such other structures.

This document describes how [[TTML1]] can be used in a broadcasting environment to carry subtitles and captions that are created in real time ("live" or from a prepared file) from an authoring station to an encoder prior to distribution, via intermediate processing units. It further explains the specified system model, timing and synchronisation behaviour, and the behaviour of nodes such as Delay Nodes and the Handover Manager.

The mechanisms by which such streams of TTML are carried are out of scope of this document; however the requirements for any specification of such a mechanism are specified by the TTML Live Extensions Module.

Conformance

This document is purely informative. All normative requirements are defined in the TTML Live Extensions Module. Any language that happens to be coincident with conformance language is not intended as such.

Introduction

The topic of live subtitle or caption authoring, routing and encoding is large. Organisations such as broadcasters, access service providers and other content providers face a variety of challenges ranging from the editorial, for example word accuracy and rate, to the technical, for example how the text data is routed from the author to the encoder, what format it should be in and how it can be configured and monitored. The classical way to address such a large “problem space” is to divide it up into more easily solvable constituent parts. This approach is taken here.

This document also provides useful options for mixing the playout of prepared subtitle documents with live subtitles, a problem that arises for all broadcast channels whose output is neither 100% pre-recorded nor 100% live.

Authoring conventions, for example the use of colour to identify speakers, are not directly addressed in this document; however care has been taken to ensure that the technical solutions presented here can support a variety of conventions. System setup, configuration, management, resilience, monitoring and recovery are likewise addressed indirectly by modelling the potential architectures in the abstract and designing the data format to support those architectures.

TTML as an exchange format for live and prepared subtitles

TTML and profiles such as [[ttml-imsc1.1]] are intended for general use in exchanging prepared subtitles and captions. This workflow is extended by this document to include exchange of live subtitles and captions.

Summary of key points

The content is carried in sequences of Document Instances. In addition to the text, each Document Instance can contain styling, layout, timing and additional metadata information.


Each Document Instance indicates the sequence to which it belongs using a Sequence Identifier; its order within that sequence is set by a Sequence Number.
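The fragment below sketches how a Document Instance might carry this information as attributes on the tt:tt element. The attribute names ebuttp:sequenceIdentifier and ebuttp:sequenceNumber and the namespace bindings follow the usage in [[EBU-TT-Live-1-0]]; the identifier value is hypothetical.

    <tt:tt xmlns:tt="http://www.w3.org/ns/ttml"
           xmlns:ebuttp="urn:ebu:tt:parameters"
           ebuttp:sequenceIdentifier="ProgrammeX_Subtitles"
           ebuttp:sequenceNumber="1"
           xml:lang="en">
      <!-- head and body omitted -->
    </tt:tt>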


The concept of sequence identification is separate from service identification. Document metadata may be used to allow authors to identify the services for which the sequence is intended to carry subtitles, for example the broadcast channel.

Sequences of live documents are transferred between Nodes. Such transfers are called Streams. Nodes can consume, process and/or output Documents. Different types of Node can send or receive varying numbers of Streams to or from other Nodes. Some examples are shown below.

Processing Nodes output sequences that differ or may differ from those at the input. An authoring station and a spellchecker are examples of processing nodes.

Passive Nodes simply receive and optionally pass on sequences from input to output without modifying the content of any document in the sequence. A distributing node and an encoder are examples of passive nodes.

Documents can use different types of timing. Only one Document can be active at any given time. If different Documents overlap in time, the Document with the highest Sequence number ‘wins’.

When a document includes explicit times using the begin, end or dur attributes and is available before its begin time, it becomes active at its begin time and remains active until the next document becomes active or its end is reached.

Documents may be sent before they become active; documents may be re-evaluated later, for example to archive a corrected version of the subtitles after broadcast, or to retain precise timings from source documents.

If no begin or end attributes are set in the Document, its subtitles become active as soon as it is received and remain active until the next Document becomes active or, if set, the dur on the body element has elapsed.
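As an illustration of implicit timing, the sketch below shows a body with no begin or end attributes and a dur attribute bounding its active duration; element names are per [[TTML1]] and the values are hypothetical.

    <tt:body dur="10s">
      <!-- Active as soon as the document is received, until the next Document
           becomes active or, at the latest, 10 seconds after becoming active -->
      <tt:div>
        <tt:p>Text shown on receipt of this document.</tt:p>
      </tt:div>
    </tt:body>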

The typical use case is sending subtitles as fast as possible for live subtitling. This simple scheme may not be optimal, because it does not support all the possible use cases; for example, creating archive versions is more difficult.

Schematic of use case showing an authoring tool generating a stream of TTML Live subtitles.

The figure above illustrates a simple example use case in which a subtitler uses an authoring tool to create a stream of live subtitles. Those are then transferred either:

  1. via a direct IP connection to an Improver and then on to an encoder; or
  2. embedded into an audio/visual stream with an inserter and then de-embedded for passing to an encoder.

A potential addition to this workflow would be a further connection, for example from the Improver to an Archiver, to create an archive [[TTML1]] document for later reuse.

The Improver defined in this document is a processing node that could, for example, insert a defined compensating delay, check content for correct spelling, ensure that prohibited code points are not propagated or perform other transformations to generate output suitable for encoding.

Example scenarios

The following examples represent typical real world scenarios in which documents and nodes that conform to this specification can be used.

Handover orchestration

Each subtitler in a group authors subtitles for part of a single programme; each group member takes a turn to contribute for a period of time before handing over to the next subtitler.

Each subtitler creates a distinct sequence of subtitles for their turn. Each of those input sequences has a different sequence identifier. The authoring stations emit the sequences as streams. As part of an externally orchestrated handover process a handover manager node receives all the streams, combines them and emits a new continuous stream. This new output stream’s sequence has a different sequence identifier from each of the input sequences.

Incidentally, each subtitler may subscribe to, and view, the others’ streams to assist with the handover orchestration.

Author and correct

A pair of subtitlers authors and corrects live subtitles. The first subtitler creates a sequence using an authoring tool. The second subtitler receives a stream of that sequence and operates an Improver Node that allows the sequence to be modified; the Improver then issues a new sequence, with a different sequence identifier from the input sequence, for consumption downstream.

Timing improvement

An Improver Node receives a stream and a continuous audio track in a reference time frame. The Improver analyses the audio and subtitles and creates a new sequence whose contents are time aligned relative to the audio track’s time frame, using time stamps from a common clock source. The new sequence is issued as a stream with a new sequence identifier.

Retrospective Corrections

A subtitler authors a live subtitle stream whose sequence is archived for later reuse. On noticing an error the subtitler issues a retrospectively timed correction. The archive process uses data within the sequence to apply the correction such that the error it corrects is not apparent within the generated archive document.

Terms and Definitions

The terms and definitions are as defined in the TTML Live Extensions Module, reproduced here for convenience.

Author a person or system that creates a stream of live subtitle data based on observations of some other media, for example by listening to a programme audio track.

Captions and subtitles The term “captions” describes on-screen text for use by deaf and hard of hearing audiences; captions include indications of the speakers and relevant sound effects. The term “subtitles” describes on-screen text for translation purposes. Because the representation of captions and subtitles is the same here, only the term “subtitles” is used for easier reading, and it covers both captions and subtitles (except where noted).

Carriage Mechanism a mechanism by which physical streams may be transferred between nodes.

Document A subtitle document conformant to this specification.

Document availability time The time when a document becomes available for processing.

Document cache The set of documents retained by a node, for example for processing. See also the section Pruning ever-extending history.

Document resolved begin time The time when a document becomes active during a presentation.

This term is used in the same sense as "resolved begin time" is used in [[SMIL3]], when applied to a document and is further defined in the TTML Live Extensions Module.

Document resolved end time The time when a document becomes inactive during a presentation.

This term is used in the same sense as "resolved end time" is used in [[SMIL3]] when applied to a document and is further defined in the TTML Live Extensions Module.

Encoder a system that receives a stream of live subtitle data and encodes it into a format suitable for use downstream, for example EBU-TT-D.

Some encoders may also package the encoded output data into other types of stream e.g. MPEG DASH.

Inserter A unit that embeds subtitle data into an audio/visual stream. This is in common use in current subtitling architectures.

Live Document Any entity defined to be a Live Document by a W3C specification, including all Documents defined in this specification.

Node A unit that creates, emits, receives or processes one or more sequences.

Node identifier The unique identifier of a Node.

Presentation In this document the term 'presentation' is used in the sense in which it is used in [[SMIL3]].

Presentation Processor as defined in [[TTML1]]:

A Content Processor which purpose is to layout, format, and render, i.e., to present, Timed Text Markup Language content by applying the presentation semantics defined in this specification.

Processing Context The configuration and operating parameters of a node that processes a document.

Root Temporal Extent As defined in [[TTML1]].

Sequence A set of related live documents each of which shares the same sequence identifier, for example the documents that define the subtitles for a single programme.

Sequence Begin The start of the interval in which a sequence is presented is referred to as the sequence begin. Equivalent to the document begin [[SMIL3]] of the first document in the sequence.

Sequence End The end of the interval in which a sequence is presented is referred to as the sequence end. Equivalent to the document end [[SMIL3]] of the last document in the sequence.

Sequence Duration The difference between the sequence end and the sequence begin is referred to as the sequence duration.

Service identifier An identifier used to uniquely identify a broadcast service, for example the HD broadcast of the broadcaster’s main channel.

Stream The transfer of a sequence between two nodes.

Logical stream A stream offered or provided by a node to zero or more nodes; identified by the source node identifier and sequence identifier.

Physical stream A stream provided between two nodes, identified by the source node identifier, destination node identifier and sequence identifier.

TTML Live document A live document that is a valid TTML document instance.

Timing and synchronisation

This section defines the temporal processing of a sequence of documents within a presentation, the management of delay in a live authoring environment and the use of reference clocks.

Document resolved begin and end times

Every document in a sequence has a time period during which it is active within a presentation, defined in [[TTML1]] as the Root Temporal Extent. At any single moment in time during the presentation of a sequence either zero documents or one document shall be active. The period during which a document is active begins at the document resolved begin time and ends at the document resolved end time.

An algorithm for computing the time when a document is active during a presentation needs to take account of the time at which the document becomes available, the times computed from any begin, end and dur attributes within the document, and the document's sequence number.

The definitions of the document resolved begin time and the document resolved end time below are derived from the following rules:

  1. A document cannot become active until it is available, at the earliest;
  2. The absence of any begin time implies that the document is active immediately;
  3. The absence of any end time implies that the document remains active indefinitely;
  4. The dur attribute on the tt:body element imposes a maximum document duration relative to the point at which it became active;
  5. If two documents would otherwise overlap temporally, the document with the greater sequence number supersedes the document with the lesser sequence number.
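The sketch below illustrates rules 1 and 5 for two explicitly timed documents in the same sequence. The times, sequence numbers and availability times are hypothetical, a clock time base is assumed so that the time expressions can be compared directly with availability times, and only the body of each document is shown.

    <!-- Document with sequence number 10, which becomes available at 10:00:02 -->
    <tt:body begin="10:00:00" end="10:00:06">
      <tt:div><tt:p>First subtitle</tt:p></tt:div>
    </tt:body>
    <!-- Rule 1: the document cannot become active before it is available, so its
         resolved begin time is 10:00:02 rather than the specified 10:00:00 -->

    <!-- Document with sequence number 11, which becomes available at 10:00:03 -->
    <tt:body begin="10:00:04" end="10:00:08">
      <tt:div><tt:p>Corrected first subtitle</tt:p></tt:div>
    </tt:body>
    <!-- Rule 5: from 10:00:04 the document with the greater sequence number supersedes
         document 10, so document 10's resolved end time becomes 10:00:04 -->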

It is not necessary for all classes of processor to resolve the document begin and end times. For example a processing node that only checks text spelling can do so without reference to the timing constructs defined in this section.

In general the document time base, as specified by the ttp:timeBase attribute, can be transformed within a processing context to a value that can be compared with externally specified document activation times, for example the clock values when documents become available. Implementations that process documents that set ttp:timeBase="smpte" and ttp:markerMode="discontinuous" cannot assume that the time values form part of a monotonically increasing clock; they can treat them only as markers. This scenario is common with timecodes.

TTML Live prohibits use of ttp:timeBase="smpte". To accommodate operational scenarios in which SMPTE time code is in common use, several strategies are available, including:

  1. Use implicitly timed documents only, with the functional limitations implied;
  2. In both the producer and any consumers of documents, use a common algorithm for converting time codes to values on a commonly available reference clock;
  3. Embed values from a commonly available continuous reference clock (e.g. station clock, GPS, UTC etc) into any associated media streams and use those for synchronisation.

Definition of time values used for resolving document begin and end times

The rules for determining resolved begin and end times in this section require comparison of times that are potentially derived from different clock sources. For example the availability time of a document can be found by inspecting a local system clock whereas the earliest computed begin time in the document can be in a timebase relating to a different reference clock.

For the purpose of making these comparisons the following times shall be converted to values on the same timebase: the document availability time, the earliest computed begin time and the latest computed end time.

The earliest computed begin time is defined as the earlier of a) the earliest computed begin time of any leaf element in the document and b) the earliest computed time corresponding to a specified begin attribute value on an element that either has no end attribute or has an end attribute value that is later than the begin attribute value.

In the case that a root-to-leaf path contains elements all of which omit a begin attribute value, this evaluates to the value zero on the document’s timebase.

The latest computed end time is defined as the latest computed end time corresponding to a specified end attribute value on an element that either has no specified begin attribute or has an end attribute value that is later than the begin attribute value.

In the case that a root-to-leaf path contains elements all of which omit an end attribute, this evaluates to the [[SMIL3]] term "undefined"; that is, the latest computed end time is not determined and is effectively infinite for comparison purposes.

It is syntactically permitted for an element to have a begin attribute value that is later than or equal to its end attribute value; in this case normally the element would be considered never to be active; this is why such elements are excluded from the calculation of the earliest computed begin time and the latest computed end time.
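The fragment below sketches these definitions; element names are per [[TTML1]] and the times are hypothetical.

    <tt:body>
      <tt:div>
        <!-- end is not later than begin, so this element would never be active and
             is excluded from both calculations -->
        <tt:p begin="6s" end="6s">Excluded</tt:p>
        <!-- earliest qualifying begin attribute: the earliest computed begin time is 2s -->
        <tt:p begin="2s" end="5s">First subtitle</tt:p>
        <!-- latest qualifying end attribute: the latest computed end time is 9s -->
        <tt:p begin="4s" end="9s">Second subtitle</tt:p>
      </tt:div>
    </tt:body>

If the last tt:p omitted its end attribute, the root-to-leaf path through it would contain no end attribute at all, so the latest computed end time would instead be undefined, that is, effectively infinite for comparison purposes.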

The dur attribute is not used when computing the latest computed end time; however, it is used when computing the document resolved end time relative to the document resolved begin time; see Document resolved end time.

See [[EBU-TT-Live-1-0]] Annex B for informative worked examples.

Bring the worked examples into this document.

The TTML Live Extensions Module defines the document resolved begin time and the document resolved end time in accordance with these rules.

Bring in the worked examples from EBU-TT Live Annex C.

Document creation and issuing strategies

The above rules allow for implementations to use different strategies for creating and issuing sequences. See [[EBU-TT-Live-1-0]] § 2.3.1.4 for discussion of some of the possible strategies and their potential usage.

Bring in issuing strategies text from EBU-TT Live § 2.3.1.4.

Implementation and Operational Considerations

Pruning ever-extending history

The model presented here allows infinite rewriting of history backwards in time. A new document can be sent that supersedes an arbitrary set of previous documents, including a partially superseded document, and has a document resolved begin time that falls at any point before, within or after the previously received sequence times. This presents a potential implementation difficulty, since it is impossible in general to know how many documents to retain and when it is safe to discard documents. Typically it is useful for operational software that is designed to run continuously to be able to manage its data usage so that it does not grow indefinitely. The operational rules for managing data usage are referred to below as the retention semantics.

This problem is however specific to the type of node and the task that it needs to perform. For example some passive nodes need to keep only a transient set of documents, since they are not expected to do more than minimal buffering, say for the time it takes for a distributing node to emit each received document to all subscribers.

Processing nodes may need to retain a longer history depending on what they are doing. A delay node needs to keep documents for at least the offset period by which it is applying a delay. The retention semantics for improver nodes or synthesiser nodes need to be defined based on the functions that they are designed to perform.

Consumer nodes similarly need to have some kind of defined retention semantics. One useful strategy for an encoder is to discard everything that has already been encoded into an output format. For example an [[ttml-imsc1.1]] encoder configured to output an IMSC document every 5 seconds could be configured regularly to discard all documents whose document resolved end times are earlier than the begin time of the next required output document.

The set of documents that a node retains is defined as the document cache.

Synchronising clocks and handling non zero transit times

Consider a simple system in which a subtitler authors explicitly timed subtitles using a Producer Node, and the resulting sequence is streamed immediately to a Consumer node, for example to encode into a downstream format (see the figure below).

Simple system

If the producer node issues documents with the intent of them being presented immediately, specifying a begin time equal to its perceived time ‘now’ (perhaps with an end time 3 seconds later) before sending the document, and it takes a non-zero time Dat for the document to become available, then the effective duration for which the subtitles in the document are shown will be reduced by Dat. This is because the document resolved begin time will be the availability time, whereas the document resolved end time is the specified end time in the document.
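As a hypothetical worked example of this effect (a clock time base is assumed), consider a document whose body is authored at 10:00:00 for immediate presentation:

    <tt:body begin="10:00:00" end="10:00:03">
      <tt:div><tt:p>Live subtitle text</tt:p></tt:div>
    </tt:body>
    <!-- If the document only becomes available at the Consumer node at 10:00:01
         (Dat = 1s), the document resolved begin time is 10:00:01 while the resolved
         end time remains 10:00:03, so the subtitles are shown for 2 seconds rather
         than the intended 3 seconds -->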

This scenario can impact the effective reading speed needed to read the text, but is opaque to the subtitle author, who cannot in general know how long it will take for each document to become available. A similar effect is shown diagrammatically in the figure below, where the beginning of document 1 is truncated by late availability.

Further discussion of this scenario can be found in [[EBU-TT-Live-1-0]] §2.3.1.4.2.

Diagram showing resolved begin and end times for explicitly timed documents

The above scenario includes an implicit assumption that the two nodes have somehow reached sufficiently close agreement on the current time, that is, that they are synchronised. Another cause of such a problem arises if that assumption is invalid and the two nodes are in fact not synchronised relative to each other. For example, if a document’s duration is 1s but the consumer node’s clock runs 1s or more later than the producer node’s clock, then even if the real Dat were 0, the consumer node would consider the document to be 1s late and its content would never be consumed.

Strategies for identifying that two nodes are not synchronised closely enough include:

  • Specifying the document creation date and time using ebuttm:documentCreationDate placed as a descendant of /tt:tt/tt:head/tt:metadata (see the example after this list)
  • Specifying the document emission time by adding to the document an XML comment such as:

    <!-- ebutt3:emission_time: 12:13:14.15 -->

    XML comments are excluded from the fn:deep-equals comparison of XML elements; therefore two documents that differ only in their comments are considered to be identical by the test defined in the System Model. As a result, passive nodes can add such comments without breaking passive node conformance requirements.

    No formal syntax for this or any other XML comment is defined in this document.
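A combined sketch of both of the strategies above is shown below. The ebuttm:documentMetadata wrapper element and the date-only value are assumptions based on [[EBU-TT-M]], and all values are hypothetical.

    <tt:head>
      <tt:metadata xmlns:ebuttm="urn:ebu:tt:metadata">
        <ebuttm:documentMetadata>
          <ebuttm:documentCreationDate>2024-03-01</ebuttm:documentCreationDate>
        </ebuttm:documentMetadata>
      </tt:metadata>
    </tt:head>
    <!-- ebutt3:emission_time: 12:13:14.15 -->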

Using either or both of these methods, some potential problems can be identified. In scenarios where the relationship between creation time, emission time and availability time can be modelled consistently, variations over time can highlight issues.

These data can also reveal if document times were in the future when the document was created and in the past when it became available downstream, which can cause truncation of the beginning of subtitle presentation as described above. They can also provide some information about where the delay might have occurred.

Strategies for dealing with these practical challenges include:

  • Ensuring all nodes use clocks synchronised to the same time reference (for example an NTP server, a GPS receiver etc).
  • If using a local time, specifying the time server’s URL in the ebuttp:referenceClockIdentifier parameter in documents, and using it.
  • Explicitly adding a delay to the begin times of documents either before emitting them or downstream (perhaps using a Retiming Delay node), where the offset period is greater than or equal to Dat.

    This scenario only applies to explicitly timed documents, and a Retiming Delay node that performs this function is not expected both to adjust document times and to delay emission of the adjusted documents, since that would simply reintroduce a new availability time delay.

In this discussion, the term “clock” is used to indicate the time source that is used to convert between real time events such as documents becoming available and times in the document’s timebase. There is no requirement that this is directly related to any system clock.

Management and signalling of delay in a live authoring environment

[[EBU-TT-Live-1-0]] § 2.3.2 provides further analysis of how the delays within real world systems can be managed so that the decoded output of the overall system offers live subtitles synchronised with the audio to which they relate.

[[EBU-TT-Live-1-0]] § 2.3.3 describes how the ebuttm:authoringDelay metadata attribute defined in [[EBU-TT-M]] can be used to express the latency associated with the authoring process.

Bring in delay management text from EBU-TT Live §2.3.2 and §2.3.3.

Delay nodes

An Improver Node that applies an adjustment delay is referred to as a Delay Node. The adjustment delay applied is known as the offset period. A Delay Node could be one part of a solution for achieving resynchronisation. The value of the delay might be derived using any of a variety of techniques.

It is out of scope of this document to mandate the use of specific techniques; furthermore it is expected that the relative success of these techniques will depend on programme content, the level of variability in the chain and the quality of implementation of each technique.

Any node that receives and emits streams is likely to incur some real world processing delay; a Delay node is intended to apply a controlled relative adjustment delay.

Two types of Delay Node for applying a delay are specified: the Buffer Delay Node and the Retiming Delay Node.

If it is operationally required to use both types of delay node then a chain of nodes can be constructed in which both a Buffer Delay Node and a Retiming Delay Node are connected “in series” with each other.

Since the requirements for nodes here are logical definitions a real world processor could combine both functions.

Buffer Delay Node

The following behaviours of a Buffer Delay node are defined, in relation to the sequences that they receive and emit:

  1. A Buffer Delay node is a passive node. Therefore the output documents shall be identical to the input documents.
  2. A Buffer Delay node shall delay emission of the stream by a period not less than the offset period.
  3. The offset period shall not be negative.

In the context of a buffer delay node a negative offset period would require documents to be emitted before they had arrived. No practical device has yet been demonstrated that can achieve this in the general case.

Retiming Delay Node

The following behaviours of a Retiming Delay node are defined in relation to the sequences that they receive, process and emit:

  1. A Retiming Delay node is a processing node. Therefore the output sequence shall have a different sequence identifier from the input sequence.
  2. A Retiming Delay node shall modify each document to result in the document’s computed times being increased by the offset period.

    In general this results in implicitly timed documents being converted to explicitly timed documents, since any non-zero offset period requires at a minimum a begin attribute on an element, for example the tt:body element. This behaviour may be surprising; see the sketch after this list.

  3. The offset period shall not be negative.
  4. A Retiming Delay node should not emit an output sequence with reordered subtitles.
  5. A Retiming Delay node shall not update the value of ebuttm:authoringDelay, if present.
  6. A Retiming Delay node should add an ebuttm:appliedProcessing element to the document metadata to indicate that the delay has been added.

In the context of a retiming delay node, applying a negative offset period could result in documents having negative begin attribute values, which is not permitted in TTML.
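The sketch below shows the effect of rules 1 and 2 for a hypothetical offset period of 2 seconds. The sequence identifiers, sequence numbers and times are hypothetical, the attribute names and namespace bindings follow the usage in [[EBU-TT-Live-1-0]], and a clock time base is assumed.

    <!-- Input document on sequence "authoredSeq" -->
    <tt:tt xmlns:tt="http://www.w3.org/ns/ttml"
           xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
           xmlns:ebuttp="urn:ebu:tt:parameters"
           ttp:timeBase="clock" ttp:clockMode="local"
           ebuttp:sequenceIdentifier="authoredSeq"
           ebuttp:sequenceNumber="12" xml:lang="en">
      <tt:body begin="10:00:05" end="10:00:08">
        <tt:div><tt:p>Subtitle text</tt:p></tt:div>
      </tt:body>
    </tt:tt>

    <!-- Corresponding output document: a different sequence identifier (rule 1)
         and computed times increased by the 2 second offset period (rule 2) -->
    <tt:tt xmlns:tt="http://www.w3.org/ns/ttml"
           xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
           xmlns:ebuttp="urn:ebu:tt:parameters"
           ttp:timeBase="clock" ttp:clockMode="local"
           ebuttp:sequenceIdentifier="authoredSeq-delayed"
           ebuttp:sequenceNumber="12" xml:lang="en">
      <tt:body begin="10:00:07" end="10:00:10">
        <tt:div><tt:p>Subtitle text</tt:p></tt:div>
      </tt:body>
    </tt:tt>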

It is possible that delay functionality is combined with other processing in a single node, for example an accumulator; hence the requirement not to reorder is expressed in terms of subtitles, not documents: there is, for example, no requirement for a 1:1 relationship between input and output documents of a Retiming Delay node, though such a relationship would be expected for the simplest conceivable Retiming Delay node.

If the delay offset period is varied, care is needed to manage the other timings to avoid inadvertently changing the displayed order of subtitles. For example, one strategy could be to treat delay offset period changes as target values that are arrived at over a fixed period: instead of jumping from, say, 10s to 4s in one step, an implementation could gradually reduce the offset from 10s to 4s over, say, a 6s period. Another strategy, when the delay varies, is to allow the node (or a downstream node) to apply its own logic, which could result in documents being skipped to achieve the desired synchronisation.

Reference clocks

Some broadcast environments do not relate time expressions to a real world clock such as UTC but to some other generic reference clock such as a studio timecode generator. When ttp:timeBase="clock" is used with ttp:clockMode="local", the ebuttp:referenceClockIdentifier parameter may be specified on the tt:tt element to identify the source of this reference clock and allow for correct synchronisation.

For real time processing of TTML Live documents, correct dereferencing of the external clock is a processing requirement; therefore the referenceClockIdentifier is defined as a parameter attribute in the ebuttp parameter namespace. This is in contrast to the referenceClockIdentifier element in the ebuttm metadata namespace defined by [[EBU-TT-M]]. A TTML document instance created as an archive version of a sequence of live documents can preserve the value of the ebuttp:referenceClockIdentifier parameter attribute in an ebuttm:referenceClockIdentifier element.
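A sketch of such a document’s root element is shown below; the reference clock identifier URI is hypothetical and the namespace bindings follow the usage in [[EBU-TT-Live-1-0]].

    <tt:tt xmlns:tt="http://www.w3.org/ns/ttml"
           xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
           xmlns:ebuttp="urn:ebu:tt:parameters"
           ttp:timeBase="clock"
           ttp:clockMode="local"
           ebuttp:referenceClockIdentifier="urn:example:studio:timecode-generator-1"
           xml:lang="en">
      <!-- head and body omitted -->
    </tt:tt>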

Handover

In a live subtitle authoring environment it is common practice for multiple subtitlers to collaborate with each other in the creation of subtitles for a single programme. From an encoder perspective, it is desirable to manage only a single stream of live subtitles. A Handover Manager node is used to mediate between the streams that each subtitler creates (see the figure below).

The Handover Manager subscribes to a set of sequences and selects documents from one sequence at a time, switching between sequences dependent on parameters within the documents. It then emits a new sequence of documents representing the time interleaved combination of subtitles from each of the authors, where each output document is derived from an input document from the selected sequence. An example of handover sequences is shown in the sample sequences later in this section.

The Handover Manager node shall use a 'who claimed control most recently' algorithm for selecting the sequence, based on a control token parameter within each document.

Use case showing a Handover Manager selecting between Sequences A and B and emitting Sequence C

Authoring tools can subscribe to the output stream from the Handover Manager; this makes the control token parameter values visible to them to permit each to direct the Handover Manager to switch to their output; it also facilitates monitoring.

Other schemes for directing handover are possible, for example the control token could be derived from a separate mediation source or the clock.

Authors Group parameters

The following parameters on the tt:tt element are provided to facilitate handover:

The ebuttp:authorsGroupIdentifier is a string that identifies the group of authors from which a particular Handover Manager can choose. A Handover Manager should be configured to subscribe to all the streams whose documents have the same ebuttp:authorsGroupIdentifier, except for its own output stream. Within a single sequence, all documents that carry the ebuttp:authorsGroupIdentifier parameter shall have the same value for it.

The Handover Manager may include within each document in its output sequence the parameter attributes ebuttp:authorsGroupIdentifier and ebuttp:authorsGroupControlToken from the currently selected sequence. The Handover Manager includes within each output document the metadata attribute ebuttm:authorsGroupSelectedSequenceIdentifier [[EBU-TT-M]], set to the value of the source sequence’s sequence identifier for that document. This is so that every subscriber to the Handover Manager's output stream, including any subscribed subtitle authoring stations, can know the current status.

The Handover Manager’s normative behaviour is defined in the TTML Live Extensions Module.

When present in a document, the ebuttp:authorsGroupControlToken is a number that the Handover Manager uses to identify which sequence to select: when a document is received with a higher value of ebuttp:authorsGroupControlToken than that most recently received in the currently selected sequence, the Handover Manager switches to that document's sequence, that is, it emits without delay a document in its output sequence corresponding to and derived from the received document with the new control token.

Having selected a sequence, the Handover Manager emits further documents derived from that sequence until a new sequence is selected.

This means that the control token value can be lowered after taking control, by setting the control token value in a new document in the selected sequence to a lower number. Therefore the control token value does not need to increase forever.

Regardless of the selected sequence, the Handover Manager does not emit any documents derived from input sequence documents that do not contain both the parameters ebuttp:authorsGroupIdentifier and ebuttp:authorsGroupControlToken.
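For illustration, a document from one input sequence might carry the Authors Group parameters as sketched below; all identifier and token values are hypothetical and the attribute names and namespace bindings follow the usage in [[EBU-TT-Live-1-0]].

    <!-- Both Authors Group parameters are present, so the Handover Manager may emit
         a document derived from this one; a control token of 12, higher than the most
         recently received token on the currently selected sequence, causes the
         Handover Manager to switch to sequence "subtitlerA" -->
    <tt:tt xmlns:tt="http://www.w3.org/ns/ttml"
           xmlns:ebuttp="urn:ebu:tt:parameters"
           ebuttp:sequenceIdentifier="subtitlerA"
           ebuttp:sequenceNumber="41"
           ebuttp:authorsGroupIdentifier="news-1830"
           ebuttp:authorsGroupControlToken="12"
           xml:lang="en">
      <tt:body>
        <tt:div><tt:p>Good evening and welcome.</tt:p></tt:div>
      </tt:body>
    </tt:tt>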

Care should be taken if the carriage mechanism does not guarantee delivery of every document in the sequence in case a document intended to take control is lost. One strategy for avoiding this would be for the subtitle authoring station to observe the Handover Manager's output and verify that control has been taken before lowering the control token value. Another strategy would be to maintain the high control token value and duplicate it in each document in the sequence until the sequence switch has been verified through another mechanism.

Sample sequences demonstrating Handover

Handover Manager algorithm

The Handover Manager node uses a 'who claimed control most recently' algorithm for selecting the sequence, based on the control token parameter present within each document, as defined in the TTML Live Extensions Module.

Practical considerations for Handover Manager implementations

A Handover Manager is likely to be used to switch between multiple contributors of live subtitles for a single service or broadcast channel. In this situation each author might be able to configure within their production system some metadata to be included in the TTML Live sequence such as the ebuttm:broadcastServiceIdentifier element. In production environments a Handover Manager should check that the input streams are consistent with each other, taking into account possible “emergency” scenarios where a live subtitle author needs to step in at short notice and may not have the opportunity to make such configurations.

The Handover Manager produces a single sequence derived from multiple input sequences. However the System Model requires that every document in a sequence has the same timing model. This implies that practical arrangements need to be made to ensure that the Handover Manager’s output sequence is conformant. These could include arranging for all input sequences to have the same timing model, or including within the Handover Manager a timing model converter that can set the output documents’ timing model correctly independently of the input documents’ timing models. Such a converter might need access to external time sources.

Describing facets of the subtitle content

In a chain of processing nodes each might modify the subtitle content to suit some particular need. For example the subtitler's priority could be to issue new documents as quickly as possible without considering spelling, grammatical correction, profanities etc. Alternatively the subtitler could be issuing a sequence that is known not to be suitable without modification for all downstream platforms: one encoder could be able to emit the full range of Unicode code points [[UNICODE]]; another could be restricted to ISO/IEC 8859-x. A combination of these scenarios is possible. To ensure that a stream is suitable, the system model presented here proposes that a series of Improver nodes is used to perform the necessary processing.

However from a Consumer node's perspective it is not always desirable to rely on the node configuration being correct without further information. Additionally, for compliance monitoring, automated processes could be used to assess conformance against some rule set without necessarily enforcing it or making modifications to the text content in the sequence.

Some knowledge of the processing applied can be indicated at a document level using the ebuttm:appliedProcessing element [[EBU-TT-M]] (see Tracing and debugging below); however this is coarse grained. To indicate for a particular piece of content some aspect of its editorial or technical quality, for example whether it has been spellchecked, profanity checked, had its code point set reduced for downstream compatibility, had colours applied, been positioned etc., the ebuttm:facet element [[EBU-TT-M]] can be applied. To indicate the document level summary, the ebuttm:documentFacet element [[EBU-TT-M]] can be applied.

Tracing and debugging

The model presented here allows for multiple Processing Nodes to receive and emit streams. In real world scenarios it can be useful to log the activity that generated a document for audit or debugging purposes, for example to check that the correct configurations have been applied.

The ebuttm:appliedProcessing element [[EBU-TT-M]] permits such logging. If present, an ebuttm:appliedProcessing element describes in text the action that generated the document, in the @action attribute, and an identifier of the node that performed that action, in the @generatedBy attribute. The action can be derived from a classification scheme not specified here. The @generatedBy node identifier is a URI and is also not further defined here.

Optionally the ebuttm:appliedProcessing element can identify the node that supplied the source content for the action, using the @sourceId attribute.

The ebuttm:appliedProcessing element can contain text content providing any further logging information.
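A sketch of such an element is shown below. The attribute names are as described above; the placement within tt:metadata and all values shown are assumptions for illustration only.

    <tt:metadata xmlns:tt="http://www.w3.org/ns/ttml"
                 xmlns:ebuttm="urn:ebu:tt:metadata">
      <ebuttm:appliedProcessing
          action="retimingDelay"
          generatedBy="urn:example:nodes:delay-node-3"
          sourceId="urn:example:nodes:authoring-station-1">
        Applied a fixed offset period of 2 seconds.
      </ebuttm:appliedProcessing>
    </tt:metadata>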