Well-deployed technologies

Control of one local media element

All the basic interactions one expects from a media player (play, pause, playback rate, etc.) are available in JavaScript via the HTMLMediaElement interface defined in HTML.
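As a minimal sketch of these interactions, the helpers below drive any object that behaves like an `HTMLMediaElement` through its standard `paused`, `play()`, `pause()`, `playbackRate`, `currentTime` and `duration` members. The helper names, the list of rates and the clamping behavior are illustrative assumptions, not part of HTML.

```javascript
// Toggle playback on any object that behaves like an HTMLMediaElement.
function togglePlayback(media) {
  if (media.paused) {
    // play() returns a promise in current browsers; it rejects when
    // playback is blocked (for example by an autoplay policy).
    const result = media.play();
    if (result && typeof result.catch === "function") {
      result.catch((err) => console.warn("Playback did not start:", err));
    }
  } else {
    media.pause();
  }
}

// Cycle through a list of playback rates (the list is an arbitrary choice).
function nextPlaybackRate(media, rates = [1, 1.5, 2, 0.5]) {
  const index = rates.indexOf(media.playbackRate);
  media.playbackRate = rates[(index + 1) % rates.length];
  return media.playbackRate;
}

// Jump forward or backward, clamped to the known timeline.
function seekBy(media, seconds) {
  const duration = Number.isFinite(media.duration) ? media.duration : Infinity;
  const target = media.currentTime + seconds;
  media.currentTime = Math.min(Math.max(target, 0), duration);
  return media.currentTime;
}
```

In a page, these could be called with an element obtained through `document.querySelector("video")`, typically from click handlers on custom controls.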

Audio control

The Web Audio API defines a full-fledged audio processing API that gives precise control over the playback of audio content.
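To illustrate the kind of control this gives, the sketch below routes a media element through a `GainNode` and schedules a fade on the gain parameter. It relies only on `createMediaElementSource()`, `createGain()`, `connect()` and the `AudioParam` scheduling methods; the function names and the decibel convention are assumptions made for the example.

```javascript
// Convert a level in decibels to a linear gain factor.
function dbToGain(db) {
  return Math.pow(10, db / 20);
}

// Insert a GainNode between a media element and the audio output.
// `ctx` is expected to be an AudioContext.
function routeThroughGain(ctx, mediaElement) {
  const source = ctx.createMediaElementSource(mediaElement);
  const gainNode = ctx.createGain();
  source.connect(gainNode);
  gainNode.connect(ctx.destination);
  return gainNode;
}

// Fade the gain linearly to `targetDb` over `seconds`, starting now.
function fadeTo(ctx, gainNode, targetDb, seconds) {
  const now = ctx.currentTime;
  const param = gainNode.gain;
  param.cancelScheduledValues(now);          // drop any pending automation
  param.setValueAtTime(param.value, now);    // anchor the ramp at the current value
  param.linearRampToValueAtTime(dbToGain(targetDb), now + seconds);
}
```

In a browser, `ctx` would be created with `new AudioContext()`, usually in response to a user gesture so that the context is allowed to start.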

Control position in media timeline when sharing

Users often want to control the position at which a video or audio feed starts when they share media content with friends on social networks. The Media Fragments URI specification defines a syntax for constructing media fragment URIs and explains how Web browsers can use this information to jump to the right position in the media timeline.
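The temporal dimension of that syntax can be sketched as follows: a `t=start,end` fragment, expressed here in plain seconds. The helper names and the `example.com` URLs are hypothetical, and this sketch deliberately ignores the clock (`hh:mm:ss`) formats and the other fragment dimensions that the specification also defines.

```javascript
// Build a shareable URL that starts (and optionally stops) at given seconds.
// Any fragment already present on the URL is replaced.
function withTimeFragment(url, start, end) {
  const base = url.split("#")[0];
  const range = end === undefined ? String(start) : `${start},${end}`;
  return `${base}#t=${range}`;
}

// Read back a temporal fragment given in seconds; returns null when the URL
// has no such fragment or uses a format this sketch does not handle.
function parseTimeFragment(url) {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return null;
  const fragment = url.slice(hashIndex + 1);
  const pattern = /(?:^|&)t=(?:npt:)?(\d+(?:\.\d*)?)?(?:,(\d+(?:\.\d*)?))?(?:&|$)/;
  const match = pattern.exec(fragment);
  if (!match || (match[1] === undefined && match[2] === undefined)) return null;
  return {
    start: match[1] === undefined ? 0 : Number(match[1]),
    end: match[2] === undefined ? undefined : Number(match[2]),
  };
}
```

For instance, `withTimeFragment("https://example.com/talk.mp4", 90, 120)` yields `https://example.com/talk.mp4#t=90,120`, which a supporting browser can use to play only that range.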

Feature | Specification | Group | Maturity
Control of one local media element | HTMLMediaElement interface in HTML Standard | - | Living Standard
Audio control | Web Audio API | Audio Working Group | -
Control position in media timeline when sharing | Media Fragments URI 1.0 (basic) | Media Fragments Working Group | -

Specifications in progress

Key-based control

A number of media-capable devices expose specific hardware keys and buttons to facilitate control of playing media; the media keys defined in the UI Events KeyboardEvent key Values specification (formerly part of DOM Level 3 Events) expose these interactions to Web-based players.
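A Web-based player could react to such keys along the lines of the sketch below, which maps a few of the media `key` values from that specification to actions on a media element. The handler name, the choice of keys and the "stop means pause and rewind" behavior are assumptions made for illustration.

```javascript
// Map KeyboardEvent.key values for media keys to actions on a player
// (any object that behaves like an HTMLMediaElement).
const MEDIA_KEY_ACTIONS = new Map([
  ["MediaPlayPause", (player) => (player.paused ? player.play() : player.pause())],
  ["MediaPlay", (player) => player.play()],
  ["MediaPause", (player) => player.pause()],
  // Assumption: treat "stop" as pause plus rewind to the start.
  ["MediaStop", (player) => { player.pause(); player.currentTime = 0; }],
]);

// Returns true when the event was a media key this sketch handles.
function handleMediaKey(event, player) {
  const action = MEDIA_KEY_ACTIONS.get(event.key);
  if (!action) return false;
  action(player);
  if (typeof event.preventDefault === "function") event.preventDefault();
  return true;
}
```

In a page, this would typically be wired up with `document.addEventListener("keydown", (event) => handleMediaKey(event, video))`, where `video` is the media element being controlled. Whether a given key reaches the page at all depends on the device and browser.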

OS-wide media control

In any given Web application, media playback can be in competition with media playback from other applications, and the underlying operating system is in charge of determining which of these applications should have the media focus. The Media Session specification exposes these changes of focus to Web applications.
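The sketch below shows one way an application might register handlers for such platform-level controls through `setActionHandler()`. The media session object is passed in as a parameter (in a browser it would be `navigator.mediaSession`); the function name and the default skip of 10 seconds are assumptions, and registration is wrapped in `try`/`catch` because a browser may reject actions it does not support.

```javascript
// Register basic transport actions on a MediaSession-like object.
// Returns the list of actions that were accepted.
function registerSessionActions(mediaSession, media, skipSeconds = 10) {
  const offsetOf = (details) => (details && details.seekOffset) || skipSeconds;
  const handlers = {
    play: () => media.play(),
    pause: () => media.pause(),
    seekbackward: (details) => {
      media.currentTime = Math.max(media.currentTime - offsetOf(details), 0);
    },
    seekforward: (details) => {
      media.currentTime = media.currentTime + offsetOf(details);
    },
  };
  const registered = [];
  for (const [action, handler] of Object.entries(handlers)) {
    try {
      mediaSession.setActionHandler(action, handler);
      registered.push(action);
    } catch (err) {
      // The browser does not know this action; skip it.
    }
  }
  return registered;
}
```

A page would call `registerSessionActions(navigator.mediaSession, video)` after checking that `"mediaSession" in navigator`.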


Autoplay

To preserve bandwidth, memory and battery on mobile, and to prevent possibly unwanted media playback, browsers have put autoplay policies into place and may deny automated playback of media content. The Autoplay Policy Detection specification lets applications know whether autoplay will succeed for a given media element.
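A hedged sketch of how an application might use this: query the policy through `getAutoplayPolicy()` and pick a start-up strategy from the result. The navigator object is passed in so that the absence of the method can be handled; the function name and the returned strategy labels are hypothetical, and since the specification is still a draft, the method may be missing or may change.

```javascript
// Decide how to start playback based on the reported autoplay policy.
// `nav` is expected to be the browser's navigator object.
function chooseAutoplayStrategy(nav, media) {
  if (typeof nav.getAutoplayPolicy !== "function") {
    // No detection available: the caller has to try play() and handle rejection.
    return "attempt";
  }
  const policy = nav.getAutoplayPolicy(media);
  if (policy === "allowed") return "play";
  if (policy === "allowed-muted") {
    media.muted = true; // audible autoplay would be blocked
    return "play-muted";
  }
  return "wait-for-user"; // "disallowed": show a play button instead
}
```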

User Interface

Remote controllers and gamepads are commonly used to navigate ten-foot user interfaces on televisions. They usually feature 4-way keys, which make them particularly suitable for 2D navigation, where focus moves between focusable elements based on their position on the screen. This behavior can be achieved through JavaScript to some extent, but cannot cater for all cases and negatively impacts performance due to the need to access the DOM repeatedly. The Spatial navigation specification defines extensions to CSS and an API for this navigation paradigm to be supported across browsers.
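To make the script-based approach concrete, here is a naive sketch of directional focus selection over element rectangles. It is not the algorithm or the API of the Spatial navigation specification: the function names, the candidate shape (`{ id, rect }`) and the scoring heuristic are all assumptions, and a real page would have to collect the rectangles with repeated `getBoundingClientRect()` calls, which is the DOM access cost mentioned above.

```javascript
// Center point of a rectangle with left/top/width/height members.
function centerOf(rect) {
  return { x: rect.left + rect.width / 2, y: rect.top + rect.height / 2 };
}

// Pick the candidate that lies in `direction` ("left", "right", "up", "down")
// from `currentRect`, preferring candidates that are close and well aligned.
function findNextFocus(currentRect, candidates, direction) {
  const from = centerOf(currentRect);
  const horizontal = direction === "left" || direction === "right";
  const sign = direction === "left" || direction === "up" ? -1 : 1;
  let best = null;
  let bestScore = Infinity;
  for (const candidate of candidates) {
    const to = centerOf(candidate.rect);
    const dx = to.x - from.x;
    const dy = to.y - from.y;
    const along = sign * (horizontal ? dx : dy);   // distance in the pressed direction
    const across = Math.abs(horizontal ? dy : dx); // misalignment on the other axis
    if (along <= 0) continue;                      // behind or level: not a target
    const score = along + 2 * across;              // arbitrary weighting
    if (score < bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```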

Virtual reality control

For immersive media experiences enabled by Virtual Reality headsets, media content needs to be adjusted based on the viewer’s head position, orientation and velocity. The work on the WebXR Device API (formerly WebVR) and its companion WebXR Gamepads Module specification for user inputs addresses this need.
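As a rough sketch of how a player might read that information, the function below extracts the viewer pose from an `XRFrame` through `getViewerPose()` and derives a yaw angle that could steer a 360° video projection. The function names are hypothetical, the yaw computation assumes a yaw-pitch-roll (YXZ) decomposition of the orientation quaternion, and velocity is read from `linearVelocity`, which may be `null` when the device does not report it.

```javascript
// Rotation about the vertical (Y) axis, in radians, from a unit quaternion.
function yawFromQuaternion(q) {
  return Math.atan2(2 * (q.x * q.z + q.w * q.y), 1 - 2 * (q.x * q.x + q.y * q.y));
}

// Read position, orientation and velocity of the viewer for one frame.
// Returns null when tracking is unavailable for this frame.
function readViewerPose(frame, referenceSpace) {
  const pose = frame.getViewerPose(referenceSpace);
  if (!pose) return null;
  const { position, orientation } = pose.transform;
  return {
    position: { x: position.x, y: position.y, z: position.z },
    orientation: { x: orientation.x, y: orientation.y, z: orientation.z, w: orientation.w },
    yaw: yawFromQuaternion(orientation),
    linearVelocity: pose.linearVelocity || null,
  };
}
```

In a session, `readViewerPose` would be called from the callback passed to `session.requestAnimationFrame()`, with a reference space obtained from `session.requestReferenceSpace()`.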

Feature | Specification | Group | Maturity
Key-based control | Media keys in UI Events KeyboardEvent key Values | Web Applications Working Group | Candidate Recommendation
OS-wide media control | Media Session Standard | Media Working Group | Working Draft
Autoplay | Autoplay Policy Detection | Media Working Group | Editor's Draft
User Interface | CSS Spatial Navigation Level 1 | CSS Working Group | Working Draft
Virtual reality control | WebXR Device API | Immersive Web Working Group | Candidate Recommendation
Virtual reality control | WebXR Gamepads Module - Level 1 | Immersive Web Working Group | Working Draft

Exploratory work

Input latency

Today, all DOM events need to go through the main thread, the only one that has access to DOM elements. The Input for Workers and Worklets specification proposes a scheme for delegating events to workers that assumes no DOM access. This mechanism would let latency-sensitive, event-dependent logic run without being blocked by the main thread. This would benefit cloud-based media scenarios (e.g. cloud gaming) where input events need to be forwarded to a server with minimal latency, and interactive audio scenarios in conjunction with AudioWorklet.

Tuner control

Devices which can access broadcast content need to offer ways to select which broadcast signal to play; the TV Control API is an attempt to provide such an interface for broadcast TV and radio. The API also exposes the Electronic Program Guide when it is streamed with the media content, and explores ways to expose applications embedded in media streams as is common in interactive TV platforms (ATSC, HbbTV, Hybridcast). This work has been discontinued because potential implementers showed little interest in the API as specified. Discussions on scope, use cases and requirements for these features have now moved to the Media and Entertainment Interest Group.

Multi-device media control

The Multi-Device Timing Community Group is exploring media control across devices: its Timing Object specification enables the coordination of the playback of multiple video, audio and other data streams in close synchrony, within a single device and across devices, independently of the network topology. This proposal would replace and extend the MediaController interface, which was dropped from HTML for lack of implementations.
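The idea behind such coordination can be sketched with a small model in which the timeline state is a vector of position, velocity, acceleration and timestamp, and the current position is computed from that vector and a clock. This is an illustration of the concept only, under the assumption that all participants share the vector and a common clock; the class name and method shapes are hypothetical and should not be read as the API of the Timing Object draft.

```javascript
// Minimal model: position is derived from a state vector and a clock,
// so every party holding the same vector and clock computes the same position.
class SimpleTimingModel {
  constructor(now = () => Date.now() / 1000) {
    this.now = now; // clock in seconds; injectable for testing
    this.vector = { position: 0, velocity: 0, acceleration: 0, timestamp: now() };
  }

  // Current state, extrapolated from the stored vector.
  query() {
    const t = this.now();
    const dt = t - this.vector.timestamp;
    const { position, velocity, acceleration } = this.vector;
    return {
      position: position + velocity * dt + 0.5 * acceleration * dt * dt,
      velocity: velocity + acceleration * dt,
      acceleration,
      timestamp: t,
    };
  }

  // Change part of the state (e.g. { velocity: 0 } to pause) starting now.
  update(change) {
    const current = this.query();
    this.vector = { ...current, ...change, timestamp: current.timestamp };
    return this.vector;
  }
}
```

A player following such a model would periodically compare `media.currentTime` with `query().position` and seek or adjust `playbackRate` to stay aligned.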

Feature | Specification | Group
Input latency | Input for Workers and Worklets | Web Platform Incubator Community Group
Tuner control | TV Control API | TV Control Working Group
Multi-device media control | Timing Object | Multi-Device Timing Community Group

Features not covered by ongoing work

360° video controls

To render 360° videos, applications can either rely on native support for 360° video codecs from web browsers and stick to integrated controls, or they can provide their own controls but then need to handle the underlying projection logic and adaptive streaming programmatically. Exposing controls for 360° videos on the HTMLMediaElement interface would allow applications to customize the user experience and still take advantage of native 360° video playback support when available.