Well-deployed technologies

The timeupdate event of HTML's <audio> and <video> elements provides the basic metronome for reacting to media as it plays.
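As a minimal sketch of using timeupdate as a metronome, the following looks up which cues are active each time the event fires. The `Cue` shape and the `activeCues` and `watchCues` helpers are hypothetical names introduced for illustration, and `MediaLike` is a structural stand-in for the parts of an <audio> or <video> element that the sketch relies on.

```typescript
// Hypothetical cue shape: times are in seconds on the media timeline.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Structural stand-in for the parts of a media element used here (assumption).
interface MediaLike {
  currentTime: number;
  addEventListener(type: string, listener: () => void): void;
}

// Return the cues whose interval [start, end) contains time t.
function activeCues(cues: Cue[], t: number): Cue[] {
  return cues.filter((cue) => cue.start <= t && t < cue.end);
}

// Report the active cues each time the media element signals progress.
function watchCues(
  media: MediaLike,
  cues: Cue[],
  onChange: (active: Cue[]) => void
): void {
  media.addEventListener("timeupdate", () => {
    onChange(activeCues(cues, media.currentTime));
  });
}
```

Browsers typically fire timeupdate only a few times per second, so this approach suits coarse reactions such as swapping a caption or highlighting a chapter rather than frame-accurate effects.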

The Web Audio API defines a full-fledged audio processing API that exposes the precise time at which audio will be played, including the latency introduced by the output pipeline. This allows for very tight synchronization between different audio processing events happening in the local audio context.
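One way to relate the audio clock to the rest of the page is sketched below: it converts a time on the audio context's timeline into a time on the performance.now() timeline, using the pair of values returned by AudioContext.getOutputTimestamp(). The `AudioClockLike` interface and the `contextTimeToPerformanceTime` helper are hypothetical names; the interface is a structural stand-in so that the sketch can run outside a browser.

```typescript
// Shape of the value returned by AudioContext.getOutputTimestamp():
// contextTime is in seconds, performanceTime is in milliseconds.
interface AudioTimestampLike {
  contextTime: number;
  performanceTime: number;
}

// Structural stand-in for the part of AudioContext used here (assumption).
interface AudioClockLike {
  getOutputTimestamp(): AudioTimestampLike;
}

// Estimate the performance.now()-style time (in ms) at which audio
// scheduled at context time t (in seconds) reaches the output device.
function contextTimeToPerformanceTime(ctx: AudioClockLike, t: number): number {
  const ts = ctx.getOutputTimestamp();
  return ts.performanceTime + (t - ts.contextTime) * 1000;
}
```

The result can then be compared with timestamps from other page events, for instance to line up a visual cue with a scheduled sound. Treat it as an estimate: the reported values depend on what the platform exposes.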

To associate an external text track (e.g. closed captions) with a video, a WebVTT file can be plugged into a <video> element.
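To give a feel for the format, here is a minimal sketch that serializes a list of cues into WebVTT text, which could then be served and referenced from a <track> child of the <video> element. The `CaptionCue` shape and the `pad`, `formatTimestamp`, and `toWebVTT` helpers are hypothetical names, and the sketch covers only the simplest case: a header followed by cues, with no cue identifiers, settings, or styling.

```typescript
// Hypothetical cue shape: times are in seconds.
interface CaptionCue {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format seconds as a WebVTT timestamp of the form hh:mm:ss.ttt.
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Serialize cues as a WebVTT document: header, then blank-line-separated cues.
function toWebVTT(cues: CaptionCue[]): string {
  const blocks = cues.map(
    (c) => formatTimestamp(c.start) + " --> " + formatTimestamp(c.end) + "\n" + c.text
  );
  return ["WEBVTT"].concat(blocks).join("\n\n") + "\n";
}
```

Cue text is emitted as-is here; real captions may need escaping and must not contain blank lines.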

While WebVTT is the main format used to render captions in browsers, TTML2 provides a richer language for describing timed text that can be used as an interchange format among authoring systems.

The Media Capture and Streams API, which allows Web applications to control microphones and cameras, exposes the targeted latency of the configuration. This latency can be used to relate timestamps in captured content to events happening in the real world.
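The adjustment described above can be sketched as follows, assuming the track reports a latency setting in seconds through getSettings(). The `TrackLike` interface and the `estimateCaptureTime` helper are hypothetical names; the interface is a structural stand-in for the part of a media stream track that the sketch uses.

```typescript
// Structural stand-ins for the track settings used here (assumption).
interface TrackSettingsLike {
  latency?: number; // seconds, when the browser reports it
}

interface TrackLike {
  getSettings(): TrackSettingsLike;
}

// Estimate when a captured sample actually occurred in the real world,
// given the time (in ms) at which the application observed it.
// Falls back to no correction when no latency is reported.
function estimateCaptureTime(track: TrackLike, observedAtMs: number): number {
  const latency = track.getSettings().latency;
  const latencySeconds = latency === undefined ? 0 : latency;
  return observedAtMs - latencySeconds * 1000;
}
```

Because the exposed value is a target rather than a measurement, the result is only an approximation of the real-world time.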

Feature | Specification / Group | Maturity
--- | --- | ---
Timeline management | timeupdate event in HTML Standard | Living Standard
Audio synchronization | Web Audio API (Audio Working Group) | Candidate Recommendation
Close Captioning | WebVTT: The Web Video Text Tracks Format (Timed Text Working Group) | Candidate Recommendation
Close Captioning | Timed Text Markup Language 2 (TTML2) (Timed Text Working Group) |
Timestamp accuracy in captured content | Media Capture and Streams (WebRTC Working Group) | Candidate Recommendation

Specifications in progress

Beyond text tracks, Web pages may contain many other time-based animations with which synchronization can be useful; the Web Animations API offers the tools needed to set up these synchronization points. However, it does not allow synchronizing animations to playback of audio or video media.

Feature | Specification / Group | Maturity
--- | --- | ---
Timeline management | Web Animations (CSS Working Group, SVG Working Group) | Working Draft

Exploratory work

When media resources embed their own text tracks, exposing this in-band information to the Web application enables richer interactions. The Sourcing In-band Media Resource Tracks from Media Containers into HTML document offers guidance on how browsers should expose that information.

The Media Timed Events document collects use cases and requirements for better support of timed events tied to audio or video media on the Web, such as subtitles, captions, or other web content that must stay synchronized with a playing media stream. It also recommends new or changed Web APIs to meet these requirements.

Several use cases require synchronizing multiple tracks in the same page, for instance to synchronize the sign-language transcript of an audio track with its associated video. The MediaController interface, initially defined in HTML5, was dropped from the HTML specification due to very limited implementation support. The Timing Object specification proposes a mechanism to bring shared online clocks to browsers and ease synchronization of heterogeneous content within a single device and across devices. The proposed timing model is generic: it could be applied to any timed media, such as audio or video playback, or to the presentation or animation of web content in general.
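To illustrate what a generic timing model can look like, the sketch below describes a timeline by a position, velocity, and acceleration sampled at a known clock time, from which any participant that shares the clock can compute the current position. This is loosely modeled on the idea behind the Timing Object proposal and is not its API; the `StateVector` shape and the `positionAt` and `drift` helpers are hypothetical names.

```typescript
// Hypothetical snapshot of a timeline's motion, taken at clock time
// `timestamp` (in seconds on a clock shared by all participants).
interface StateVector {
  position: number;     // timeline position at the snapshot, in seconds
  velocity: number;     // timeline seconds per clock second (1 = normal playback)
  acceleration: number; // change of velocity per clock second
  timestamp: number;    // shared clock time of the snapshot, in seconds
}

// Compute the timeline position at shared clock time `now`.
function positionAt(v: StateVector, now: number): number {
  const dt = now - v.timestamp;
  return v.position + v.velocity * dt + 0.5 * v.acceleration * dt * dt;
}

// How far a local media element is ahead of (positive) or behind
// (negative) the shared timeline.
function drift(v: StateVector, now: number, mediaCurrentTime: number): number {
  return mediaCurrentTime - positionAt(v, now);
}
```

A client could compare the drift against a tolerance and seek or adjust its playback rate when it grows too large. How the clock and the snapshot are shared across devices is outside the scope of this sketch.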

Feature | Specification / Group
--- | ---
Close Captioning | Sourcing In-band Media Resource Tracks from Media Containers into HTML (Media Resource In-band Tracks Community Group)
Synchronization to a media stream | Media Timed Events (Media & Entertainment Interest Group)
Multi-device synchronization | Timing Object (Multi-Device Timing Community Group)

Discontinued features

The MediaController interface
The MediaController interface was introduced in HTML5 to ease playback synchronization of different media elements within a single page. It was dropped from the HTML5.1 specification due to very limited implementation support and concerns about the performance and technical viability of such solutions on constrained devices.

Features not covered by ongoing work

Media-synchronized Web animations
Web Animations allows Web applications to synchronize CSS Transitions, CSS Animations, and SVG with a global clock. There is no built-in mechanism to synchronize animations to playback of audio or video media, which could be useful to enable tighter synchronization in libraries such as Mozilla's Popcorn.js. Possible solutions include an extension to Web Animations that associates animations with a media element, or integration with the more generic Timing Object solution.
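In the absence of a built-in mechanism, an application can approximate the behavior by hand. The sketch below pauses an animation and sets its current time from the media element's current time whenever the media reports progress or finishes a seek. It is one possible workaround, not a proposed solution; `AnimationLike`, `MediaClockLike`, and `driveAnimationFromMedia` are hypothetical names, and the interfaces are structural stand-ins for the browser objects.

```typescript
// Structural stand-ins for the browser objects used here (assumption).
interface AnimationLike {
  currentTime: number | null; // milliseconds
  pause(): void;
}

interface MediaClockLike {
  currentTime: number; // seconds
  addEventListener(type: string, listener: () => void): void;
}

// Keep the animation's position aligned with the media's position.
// offsetSeconds is the media time at which the animation should start.
function driveAnimationFromMedia(
  animation: AnimationLike,
  media: MediaClockLike,
  offsetSeconds: number = 0
): void {
  // Stop the animation from advancing on its own clock.
  animation.pause();
  const sync = (): void => {
    animation.currentTime = (media.currentTime - offsetSeconds) * 1000;
  };
  media.addEventListener("timeupdate", sync);
  media.addEventListener("seeked", sync);
  sync(); // align immediately
}
```

Because timeupdate fires only a few times per second, the animation advances in visible steps with this approach; polling the media time on each animation frame would be smoother, at the cost of more work per frame. Depending on the DOM typings in use, passing a real animation object may require a cast.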