Well-deployed technologies

Timeline management

The basic metronome for reacting to media as it plays is the timeupdate event fired by HTML’s <audio> and <video> elements.
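
As an illustration of using timeupdate as a metronome, the following sketch fires callbacks once playback reaches given points on the media timeline. The names MediaLike, TimedAction and scheduleOnTimeline are hypothetical; MediaLike is a minimal structural stand-in for an <audio> or <video> element so that the sketch does not depend on a browser. Since timeupdate is only fired a few times per second, this approach is suitable for coarse synchronization only.

```typescript
// Minimal structural view of a media element (hypothetical helper type).
interface MediaLike {
  currentTime: number; // current playback position, in seconds
  addEventListener(type: "timeupdate", listener: () => void): void;
}

// An action to run once playback reaches `time` (hypothetical helper type).
interface TimedAction {
  time: number; // media time in seconds
  run: () => void;
}

// Runs each action once, the first time a timeupdate event reports a
// playback position at or past the action's time.
function scheduleOnTimeline(media: MediaLike, actions: TimedAction[]): void {
  const pending = actions.slice().sort((a, b) => a.time - b.time);
  media.addEventListener("timeupdate", () => {
    let next = pending[0];
    while (next !== undefined && next.time <= media.currentTime) {
      pending.shift();
      next.run();
      next = pending[0];
    }
  });
}
```

In a page, the first argument would typically be the element returned by document.querySelector("video").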

Audio synchronization

The Web Audio API defines a full-fledged audio processing API that exposes the precise time at which audio will be played, including the latency introduced by the output pipeline. This allows very tight synchronization between different audio processing events happening in the local audio context.
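
To relate the audio clock to other events in the page, one possible approach is to convert a time on the AudioContext clock into a time on the performance.now() clock. The sketch below assumes a timestamp pair shaped like the value returned by AudioContext.getOutputTimestamp(); the names AudioTimestampLike and contextToPerformanceTime are hypothetical.

```typescript
// Shape of the pair returned by AudioContext.getOutputTimestamp()
// (hypothetical stand-in type so the sketch runs outside a browser).
interface AudioTimestampLike {
  contextTime: number;     // seconds, on the AudioContext clock
  performanceTime: number; // milliseconds, on the performance.now() clock
}

// Estimates when audio scheduled at `contextTime` reaches the output device,
// expressed in milliseconds on the performance.now() clock. Assumes both
// clocks advance at the same rate over the interval considered.
function contextToPerformanceTime(
  ts: AudioTimestampLike,
  contextTime: number
): number {
  return ts.performanceTime + (contextTime - ts.contextTime) * 1000;
}
```

In a browser, a sound started with source.start(when) could then be paired with a visual change planned for contextToPerformanceTime(ctx.getOutputTimestamp(), when), where ctx and source are the application's own AudioContext and source node.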

Closed Captioning

To associate an external text track (e.g. closed captions) with a video, a WebVTT file can be attached to a <video> element through a <track> child element.
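
For readers unfamiliar with the format, the sketch below serializes a list of captions into a minimal WebVTT file. The Caption type and the function names are hypothetical, and the sketch assumes caption text that contains neither blank lines nor the "-->" sequence; it does not cover cue settings, styling or regions.

```typescript
// A caption to serialize (hypothetical helper type); times are in seconds.
interface Caption {
  start: number;
  end: number;
  text: string;
}

// Left-pads a non-negative integer with zeros.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) s = "0" + s;
  return s;
}

// Formats seconds as a WebVTT timestamp of the form hh:mm:ss.ttt.
function formatVttTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Builds the file: the "WEBVTT" header, then one cue block per caption,
// with blocks separated by a blank line.
function toWebVtt(captions: Caption[]): string {
  const blocks = captions.map(
    (c) =>
      `${formatVttTimestamp(c.start)} --> ${formatVttTimestamp(c.end)}\n${c.text}`
  );
  return ["WEBVTT"].concat(blocks).join("\n\n") + "\n";
}
```

The resulting text could be served as a .vtt file and referenced from the src attribute of a <track> element.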

While WebVTT is the main format used to render captions in browsers, TTML 2 provides a richer language for describing timed text. It can be used as an interchange format among authoring systems, and indirectly in browsers for caption rendering via JavaScript polyfill libraries.

Timestamp accuracy in captured content

The Media Capture and Streams API, which lets Web applications control microphones and cameras, exposes the target latency of the configuration, which can be used to relate timestamps in captured content to events happening in the real world.
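
A rough sketch of how that latency might be applied: given the timestamp at which a captured sample was received and the latency setting of the track (in seconds, as reported by MediaStreamTrack.getSettings()), subtract the latency to estimate when the underlying event occurred. The names TrackSettingsLike and estimateEventTime are hypothetical, and the value is a target rather than a measurement, so the result is only an estimate.

```typescript
// Subset of MediaTrackSettings used here (hypothetical stand-in type).
interface TrackSettingsLike {
  latency?: number; // target latency in seconds; may be absent
}

// Estimates when the real-world event behind a captured sample happened,
// in the same millisecond clock as `captureTimestampMs`. Falls back to the
// capture timestamp itself when no latency is reported.
function estimateEventTime(
  captureTimestampMs: number,
  settings: TrackSettingsLike
): number {
  const latencyMs = (settings.latency ?? 0) * 1000;
  return captureTimestampMs - latencyMs;
}
```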

Synchronization to a media stream

The Media Timed Events document collects use cases and requirements for improved support for timed events related to audio or video media on the Web, such as subtitles, captions, or other web content that needs to be synchronized to a playing audio or video media stream. It also makes recommendations for new or changed Web APIs to meet these requirements.

| Feature | Specification | Group | Maturity |
| --- | --- | --- | --- |
| Timeline management | timeupdate event in HTML Standard | WHATWG | Living Standard |
| Audio synchronization | Web Audio API | Audio Working Group | Candidate Recommendation |
| Closed Captioning | WebVTT: The Web Video Text Tracks Format | Timed Text Working Group | Candidate Recommendation |
| Closed Captioning | Timed Text Markup Language 2 (TTML2) (2nd Edition) | Timed Text Working Group | Candidate Recommendation |
| Timestamp accuracy in captured content | Media Capture and Streams | WebRTC Working Group | Candidate Recommendation |
| Synchronization to a media stream | Media Timed Events | Media and Entertainment Interest Group | Group Note |

Specifications in progress

Timeline management

Beyond text tracks, Web pages may contain many other time-based animations with which synchronization can be useful; the Web Animations API offers the tools needed to set up these synchronization points. However, it does not allow synchronizing animations to playback of audio or video media.
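
One simple synchronization point of this kind is a shared start time. The sketch below, which assumes animations attached to the same timeline, gives them all the same startTime so that they progress in lockstep. AnimationLike is a hypothetical structural stand-in for the Animation interface, used so that the sketch does not depend on a browser.

```typescript
// Minimal view of an Animation (hypothetical stand-in type).
interface AnimationLike {
  startTime: number | null; // milliseconds on the animation's timeline
}

// Aligns all animations on the same start time. Animations that share a
// timeline and a start time stay in step with one another.
function startTogether(animations: AnimationLike[], startTime: number): void {
  animations.forEach((animation) => {
    animation.startTime = startTime;
  });
}
```

In a page, the animations would typically come from element.animate() calls, and the start time would be derived from document.timeline.currentTime.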

Exploratory work

Synchronization to a media stream

Following work on the Media Timed Events document, incubation of the DataCue proposal has started in the Web Platform Incubator Community Group. The proposal builds on the former DataCue API, which was part of HTML5 but was dropped for lack of implementation support. The API would allow handling of timed metadata, i.e. metadata that is synchronized to some audio or video media. This would improve support for presenting supplemental content alongside the audio or video or, more generally, for making changes to a web page or executing application code, triggered from JavaScript events, at specific points on the media timeline of an audio or video media stream.
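
The idea of timed metadata can be sketched independently of the final API shape. The code below is not the DataCue API: MetadataCue, its value field and cuesActiveAt are hypothetical names, with startTime and endTime chosen to mirror the existing text track cue attributes. It simply selects the metadata items that apply at a given point on the media timeline.

```typescript
// A piece of metadata attached to an interval of the media timeline
// (hypothetical type, not the DataCue interface).
interface MetadataCue<T> {
  startTime: number; // seconds, inclusive
  endTime: number;   // seconds, exclusive
  value: T;          // application-defined payload
}

// Returns the cues whose interval contains `mediaTime`.
function cuesActiveAt<T>(
  cues: MetadataCue<T>[],
  mediaTime: number
): MetadataCue<T>[] {
  return cues.filter(
    (cue) => cue.startTime <= mediaTime && mediaTime < cue.endTime
  );
}
```

An application could call such a function with the current playback position to decide which supplemental content to show.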

Closed Captioning

Some advanced closed captioning scenarios cannot be expressed using WebVTT. In such cases, Web applications need to render cues on their own using JavaScript. By definition, this means that the resulting captions cannot benefit from integration with the underlying platform, e.g. to apply user style sheets or take part in Picture-in-Picture scenarios. Early discussions on TextTrackCue enhancements could pave the way for a generic solution in that field.

When media resources enclose their own text tracks, having this in-band information exposed to the Web application enables creating richer interactions; the Sourcing In-band Media Resource Tracks from Media Containers into HTML document offers guidance as to how that in-band information should be exposed in browsers.

Multi-device synchronization

A number of use cases require synchronizing several tracks in the same page, for instance the sign-language transcript of an audio track with its associated video. The MediaController interface, initially defined in HTML5, was dropped from the HTML specification due to very limited implementation support. The Timing Object specification proposes a mechanism to bring shared online clocks to browsers and ease synchronization of heterogeneous content within a single device and across devices. The proposed timing model is generic, in that it could be applied to any timed media, such as audio or video playback, or the presentation or animation of web content in general.
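
The kind of timing model involved can be sketched as a state vector from which the current position is computed deterministically. The sketch below assumes a vector made of a position, a velocity, an acceleration and the timestamp at which they were valid; the names TimingVector and positionAt are hypothetical and do not come from the specification. Two devices that share the same vector and agree on the clock compute the same position without further communication.

```typescript
// State of a timeline at a given instant (hypothetical type).
interface TimingVector {
  position: number;     // timeline position, e.g. media seconds
  velocity: number;     // position units per second (0 means paused)
  acceleration: number; // position units per second squared
  timestamp: number;    // clock time, in seconds, at which the vector was set
}

// Computes the timeline position at clock time `now` (in seconds).
function positionAt(vector: TimingVector, now: number): number {
  const dt = now - vector.timestamp;
  return (
    vector.position +
    vector.velocity * dt +
    0.5 * vector.acceleration * dt * dt
  );
}
```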

| Feature | Specification | Group |
| --- | --- | --- |
| Synchronization to a media stream | DataCue | Web Platform Incubator Community Group |
| Closed Captioning | Sourcing In-band Media Resource Tracks from Media Containers into HTML | Media Resource In-band Tracks Community Group |
| Multi-device synchronization | Timing Object | Multi-Device Timing Community Group |

Discontinued features

The MediaController interface
The MediaController interface had been introduced in HTML5 to ease the playback synchronization of different media elements within a single page. The interface was dropped from the HTML5.1 specification due to very limited implementation support and concerns about the performance and technical viability of such solutions on constrained devices.

Features not covered by ongoing work

Media-synchronized Web animations
Web Animations allows Web applications to synchronize CSS Transitions, CSS Animations, and SVG with a global clock. There is no built-in mechanism to synchronize animations to playback of audio or video media, which could enable tighter synchronization in libraries such as Mozilla's Popcorn.js. Possible solutions include extending Web Animations so that animations can be associated with a media element, or integrating with the more generic Timing Object solution.
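
In the absence of a built-in mechanism, applications can approximate the behavior by hand. The sketch below is such a workaround, not a proposed API: it assumes a paused animation whose currentTime is overwritten from the media position, and that the caller invokes the function repeatedly (for instance on each animation frame). MediaClock, SeekableAnimation and syncAnimationToMedia are hypothetical names.

```typescript
// Minimal view of a media element (hypothetical stand-in type).
interface MediaClock {
  currentTime: number; // playback position in seconds
}

// Minimal view of an Animation (hypothetical stand-in type).
interface SeekableAnimation {
  currentTime: number | null; // animation position in milliseconds
}

// Seeks the animation to the position matching the media playback position.
// `offsetSeconds` is the media time at which the animation should begin;
// before that point the animation is held at its first frame.
function syncAnimationToMedia(
  media: MediaClock,
  animation: SeekableAnimation,
  offsetSeconds: number = 0
): void {
  const elapsedMs = (media.currentTime - offsetSeconds) * 1000;
  animation.currentTime = Math.max(0, elapsedMs);
}
```

The accuracy of this approach is bounded by how often the function is called, which is one reason a native association between animations and media elements could be useful.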