This submission describes how TTML Live documents can be carried from node to node using WebSocket connections.

This document is based on [[EBU-TT-Live-WS]]; where that document depends on [[EBU-TT-Live-1-0]], the dependency has been re-based on the accompanying submission defining the TTML Live Contribution extensions, [[TTML-LIVE]].

These works were originally developed by EBU and benefit from technical consensus and implementation experience gathered there.

Scope

This submission accompanies the TTML Live Contribution extensions [[TTML-LIVE]]. It describes how TTML Live documents can be carried from node to node using WebSocket [[rfc6455]] connections, and fulfils the requirements for defining a Carriage Mechanism specified in the TTML Live Contribution extensions [[TTML-LIVE]].

Normative text describes indispensable or mandatory elements. It contains the conformance keywords ‘shall’, ‘should’ or ‘may’, defined as follows:

Shall and shall not: Indicate requirements to be followed strictly and from which no deviation is permitted in order to conform to the document.

Should and should not: Indicate that, among several possibilities, one is recommended as particularly suitable, without mentioning or excluding others; OR indicate that a certain course of action is preferred but not necessarily required; OR indicate that (in the negative form) a certain possibility or course of action is deprecated but not prohibited.

May and need not: Indicate a course of action permissible within the limits of the document.

Default identifies mandatory (in phrases containing “shall”) or recommended (in phrases containing “should”) presets that can, optionally, be overwritten by user action or supplemented with other options in advanced applications. Mandatory defaults must be supported. The support of recommended defaults is preferred, but not necessarily required.

Informative text is potentially helpful to the user, but it is not indispensable and it does not affect the normative text. Informative text does not contain any conformance keywords.

A conformant implementation is one which includes all mandatory provisions (‘shall’) and, if implemented, all recommended provisions (‘should’) as described. A conformant implementation need not implement optional provisions (‘may’) and need not implement them as described.

Introduction

The TTML Live specification [[TTML-LIVE]] defines the payload data for live subtitling applications, and the requirements for specifying carriage mechanisms for transferring live subtitle sequences between nodes.

This document defines one such carriage mechanism, based on the WebSocket protocol. This protocol provides a way to establish TCP socket connections over IP networks in a “web friendly” way that uses HTTP upgrades and has a TLS encryption mode.

Conceptual diagram illustrating system model of node to node connection using WebSocket over an IP network

Definition of terms

Terms defined in the TTML Live Contribution extensions [[TTML-LIVE]] carry the same definitions in this document. Additional terms are defined below:

Live Document

Any entity defined to be a Live Document by a W3C specification.

TTML Live Document

A Live Document that is a TTML Document Instance as defined in [[TTML-LIVE]].

TTML Live sequence

A sequence of TTML Live Documents as defined in [[TTML-LIVE]].

Carriage Specification Details

WebSocket [[rfc6455]] is a protocol that may be used to transfer a stream of TTML Live documents over an IP network from one node to another.

Core networking dependencies and connection protocols

WebSocket carriage is based on TCP socket connections initiated via an HTTP upgrade request.

WebSocket does not in itself impose any data rate constraints; such constraints are imposed by the underlying network infrastructure. Data payloads are delivered in packets of limited size, but this packetisation does not limit the maximum size of any Live Document that can be sent. The TCP protocol ensures that all data is delivered in order. Operational implementations should ensure that there is sufficient data rate capacity available to carry the volume of TTML Live data needed for the application without imposing a growing delay.

Such a growing delay would manifest as increasingly late availability times of TTML Live documents, which in turn may truncate the visibility of presented text or delay its appearance.
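
The following informative sketch illustrates how such a growing delay arises. It models the network as a single first-in, first-out link of fixed capacity, which is a deliberate simplification; the function name, document sizes and link rate are hypothetical and are not defined by this document or by [[TTML-LIVE]].

```python
def availability_delays(send_times, sizes_bytes, link_bytes_per_second):
    """Return, for each document, the time in seconds between it being sent
    and it being fully delivered over a link modelled as a FIFO of fixed
    capacity. This is an illustrative model, not a measurement method."""
    delays = []
    link_free_at = 0.0  # time at which the link finishes the previous document
    for sent, size in zip(send_times, sizes_bytes):
        # A document cannot start before earlier documents have been delivered.
        start = max(sent, link_free_at)
        link_free_at = start + size / link_bytes_per_second
        delays.append(link_free_at - sent)
    return delays


# Hypothetical figures: 2000-byte documents sent once per second.
constrained = availability_delays([0, 1, 2], [2000, 2000, 2000], 1000)
sufficient = availability_delays([0, 1, 2], [2000, 2000, 2000], 4000)
```

With these hypothetical figures the constrained link yields delays of 2 s, 3 s and 4 s, growing with every document, whereas the link with sufficient capacity yields a constant 0.5 s.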

Synchronisation impacts and/or thresholds

The latency of WebSocket and TCP socket connections is generally dependent on the network infrastructure used. The protocol guarantees that packets are all delivered, and delivered in the same order in which they were sent. This guarantee can impose transient additional latency if network conditions change. For example, failure to receive acknowledgement of delivery of a packet causes further data to be held back until that packet is successfully delivered.

Such an increase in latency would manifest as an increased availability time for TTML Live documents, which in turn may truncate the visibility of presented text or delay its appearance.

Techniques available for handling this include careful management of the network conditions, for example setting up static routes or using software defined networks. If such techniques are not practical locally, careful consideration should be given to whether WebSocket is an appropriate carriage mechanism to use.

Other network-based protocols, such as those based on UDP, trade guaranteed delivery for more predictable latency; in some such protocols it is possible for data loss to occur.

Information Security

See [[rfc6455]] §10. Security Considerations for further general details.

Encryption

Use of the wss: URI (Uniform Resource Identifier) [[rfc3986]] scheme requires the connection to be encrypted using TLS (Transport Layer Security) [[rfc8446]].
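
As an informative illustration, a node could inspect the URI scheme before connecting to determine whether TLS is required. The sketch below uses only the Python standard library; the function name and the host name `subtitles.example` are hypothetical.

```python
from urllib.parse import urlsplit


def requires_tls(uri):
    """Return True for a wss: URI and False for a ws: URI.

    Any other scheme is rejected, since it is not a WebSocket URI scheme.
    """
    scheme = urlsplit(uri).scheme  # urlsplit normalises the scheme to lower case
    if scheme == "wss":
        return True
    if scheme == "ws":
        return False
    raise ValueError(f"not a WebSocket URI scheme: {scheme!r}")


# Hypothetical endpoint used purely for illustration.
encrypted = requires_tls("wss://subtitles.example/live")
```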

Authentication

From [[rfc6455]] §10.5 WebSocket Client Authentication:

The WebSocket server can use any client authentication mechanism available to a generic HTTP server, such as cookies, HTTP authentication, or TLS authentication.

Error checking and correction

The WebSocket protocol does not offer any inherent error checking and correction. As stated in [[rfc6455]] §10.7 Handling of Invalid Data, both sending and receiving nodes shall validate documents and should close the connection if invalid data is received.

To avoid encoding issues with data sent using the Text data type, all documents shall be encoded as UTF-8.
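
The following informative sketch shows one way a node might perform a basic check on a payload before accepting it. It assumes the payload is available as raw bytes, and it checks only that the bytes are valid UTF-8, that they parse as well-formed XML, and that the root element is `tt` in the TTML namespace. It does not perform full validation against [[TTML-LIVE]]; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

TTML_NAMESPACE = "http://www.w3.org/ns/ttml"


def is_acceptable_payload(payload):
    """Return True if `payload` (bytes) is UTF-8, well-formed XML whose root
    element is tt in the TTML namespace. This is a minimal check only."""
    try:
        payload.decode("utf-8")  # strict decoding rejects non-UTF-8 data
    except UnicodeDecodeError:
        return False
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return False  # not well-formed XML
    return root.tag == f"{{{TTML_NAMESPACE}}}tt"
```

A node that receives a payload for which this check fails could then close the connection, in line with the recommendation above.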

Crossing organisational boundaries

The WebSocket protocol facilitates transition across organisational boundaries, for example through firewalls, by using ports that are commonly open for HTTP; communication can also pass through web proxy servers. Use of the secure WebSocket protocol (wss:) provides a straightforward mechanism for encrypting communication between organisations when the intermediate network is not under the direct control of either party, such as the open internet.

Endpoint cardinality

WebSocket is a point-to-point, full-duplex protocol, with each connection being defined by two single endpoints.

Many implementations offer a ‘broadcast’ mode in which multiple connections may be made to or from the same endpoint, and the same data is sent over each connection. Such an implementation could be used to provide a Distributing Node for example.

Generally the performance of such implementations is dependent on network capacity and processing power. For TTML Live applications, practical experience shows that small numbers of connections (for example, fewer than 10) do not impose any significant additional latency on current hardware. However, care should be taken to ensure that specific applications work in the applicable operational environment.

TTML Live requires only unidirectional flow of data. The reverse channel may be used for application specific purposes if and only if both nodes can validate such application specific data; in general this is a risky strategy since unexpected data causes connection closure – see below. The reverse channel shall not be used for providing a TTML Live sequence back from a Handover Node; such a sequence shall be made available using a separate (forward) connection.

Implementations should document how many simultaneous connections each node supports.

Since connections can be initiated from either the Node that sends TTML Live sequences or a node that receives them, and it is operationally important to know which connection “direction” applies, implementations shall document whether they accept incoming connections or make outgoing connections.

A WebSocket Distributing Node that supports incoming connections for both receiving and sending can be used to mediate between implementations that make outgoing connections and those that accept incoming connections. It does so by providing a ‘receiving’ incoming connection, offering ‘sending’ connection(s), and passing documents through from one to the other without modifying them. See the figure below.

Diagram showing how a Distributing Node can be used to mediate between nodes with mutually incompatible connection initiation directions

Similarly, a “connecting” node can mediate in the other direction by simultaneously making two outgoing connections, one to a “sender” and the other to a “receiver”, and passing documents from the sender to the receiver without modifying them.
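
The pass-through behaviour described above can be sketched informatively as follows. To keep the example self-contained, `asyncio.Queue` objects stand in for the WebSocket connections; the function name and the use of `None` as an end-of-input marker are assumptions made for this sketch and are not part of this carriage mechanism.

```python
import asyncio


async def distribute(incoming, outgoing):
    """Copy each document from `incoming` to every queue in `outgoing`,
    unmodified and in the order received.

    The queues stand in for WebSocket connections in this sketch. A value of
    None is a hypothetical marker meaning that the receiving connection has
    closed; it is forwarded so that consumers can stop too.
    """
    while True:
        document = await incoming.get()
        for queue in outgoing:
            await queue.put(document)  # same object, never altered
        if document is None:
            return


async def demo():
    incoming = asyncio.Queue()
    outgoing = [asyncio.Queue(), asyncio.Queue()]
    for document in ["<tt>1</tt>", "<tt>2</tt>", None]:
        await incoming.put(document)
    await distribute(incoming, outgoing)
    # Drain each outgoing queue to show what every receiver would see.
    return [[queue.get_nowait() for _ in range(3)] for queue in outgoing]


received = asyncio.run(demo())
```

Each outgoing queue receives the same documents in the same order, which mirrors the requirement that a mediating node passes documents through without modifying them.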

Connection lifecycle management

A WebSocket connection is initiated when a node dereferences a URI with a ws: or wss: URI scheme whose address resolves to a second node that supports the WebSocket Protocol, and the WebSocket opening handshake is successfully completed. A WebSocket connection is closed when either node initiates the closing handshake.

The WebSocket protocol and underlying TCP layer maintain established connections without further application support being required; no keep-alive messages are required by this carriage mechanism.

Nodes should close connections if unexpected or invalid data is received. Implementations shall support TTML Live documents. Implementations shall not close connections if valid TTML Live documents are received. Implementations should document any other data formats that they support.
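
As an informative illustration of these rules, the sketch below decides whether a received message should lead to connection closure. It assumes a node that supports no data format other than TTML Live documents, and that text messages arrive as `str` and binary messages as `bytes`. This document does not mandate particular close status codes; the choice of 1003 and 1008 from [[rfc6455]] §7.4.1 is an assumption, as are the function and constant names. Only well-formedness and the root element are checked, not full validity.

```python
import xml.etree.ElementTree as ET

TTML_ROOT = "{http://www.w3.org/ns/ttml}tt"
UNSUPPORTED_DATA = 1003  # RFC 6455: received a type of data it cannot accept
POLICY_VIOLATION = 1008  # RFC 6455: generic code for an unacceptable message


def close_code_for(message):
    """Return None to keep the connection open, or a close status code.

    Binary messages and text that is not a well-formed document with a TTML
    tt root element are treated as unexpected data in this sketch.
    """
    if isinstance(message, bytes):
        return UNSUPPORTED_DATA
    try:
        root = ET.fromstring(message)
    except ET.ParseError:
        return POLICY_VIOLATION
    if root.tag != TTML_ROOT:
        return POLICY_VIOLATION
    return None  # a TTML document: the connection is not closed
```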

Channel routing

This carriage mechanism does not specify or recommend any channel routing mechanisms.

Various schemes for maintaining and querying registries of, and brokering access to, available data sources (which could be TTML Live sequences) are feasible and in use.

It would be reasonable to consider schemes that are coincident with those intended for wider broadcast applications, such as the AMWA NMOS Discovery & Registration API [[?ANDR-API]], and that are aligned with the EBU-sponsored Joint Taskforce on Networked Media Reference Architecture [[?JT-NM]].

Alternatively, schemes that are more targeted towards a generic WebSocket infrastructure, such as the Web Application Messaging Protocol [[?WAMP]], may be more appropriate for some applications.

One practical operational consideration for implementations is the design of the WebSocket URI. See [[rfc6455]] §12 and [[rfc3986]] for further details. The following examples are illustrative only:

Sequence Identifier Encoding in URIs

The data type of ebuttp:sequenceIdentifier is xs:string with a minimum length of 1. It can therefore contain characters that are reserved in URIs, as defined in [[rfc3986]] §2.2.

Prior to insertion into a URI, the sequence identifier shall be percent-encoded exactly once. On extraction from a URI, and prior to use as a sequence identifier, the value shall be percent-decoded exactly once.
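
The following informative sketch shows the encode-once and decode-once rule using the Python standard library. The function names and the sample sequence identifiers are hypothetical, and the sketch assumes that the identifier occupies a single URI path segment, which this document does not require.

```python
from urllib.parse import quote, unquote


def sequence_id_to_segment(sequence_id):
    """Percent-encode a sequence identifier exactly once for use in a URI.

    safe="" ensures that reserved characters such as "/" are encoded too.
    """
    if not sequence_id:
        raise ValueError("a sequence identifier has a minimum length of 1")
    return quote(sequence_id, safe="")


def segment_to_sequence_id(segment):
    """Percent-decode a value extracted from a URI exactly once."""
    return unquote(segment)


segment = sequence_id_to_segment("News/Channel 1?")
round_trip = segment_to_sequence_id(segment)

# An identifier that itself contains a percent sign shows why decoding
# must happen exactly once: a second decode would corrupt it.
tricky = sequence_id_to_segment("a%20b")
decoded_once = segment_to_sequence_id(tricky)
decoded_twice = segment_to_sequence_id(decoded_once)
```

Here `segment` is `News%2FChannel%201%3F` and `round_trip` recovers the original identifier. For the identifier `a%20b`, decoding once returns `a%20b` as intended, whereas decoding twice yields `a b`, which is a different sequence identifier.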

Stability

WebSocket and TCP guarantee delivery of data in the same order in which it was sent, at the expense of increased latency where necessary.

Constant latency can only be achieved by managing the underlying network infrastructure.

Given a reliable network infrastructure, a connection remains open indefinitely until either node closes it.