Media Processing

Whether before rendering or after capture, media content often requires some processing to make it fit its expected usage.

Well-deployed technologies

Adaptive streaming

The Media Source Extensions API (MSE) allows applications to insert chunks of media (e.g. a video clip) into an existing media stream, and implement adaptive streaming algorithms at the application level.
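As an illustration of what "adaptive streaming at the application level" can mean, the sketch below picks which encoding of the content to fetch next based on a throughput measurement. The `Rendition` shape, the `pickRendition` name and the 0.8 safety factor are assumptions made for this example; they are not defined by MSE, which only provides the means to append the chosen chunk (through `SourceBuffer.appendBuffer()`).

```typescript
// Hypothetical description of one encoding of the same content.
interface Rendition {
  id: string;
  bitrateBps: number; // average encoded bitrate, in bits per second
}

// Pick the highest-bitrate rendition that fits within a fraction of the
// measured throughput. Falls back to the lowest-bitrate rendition when
// none fits. The safety factor is an arbitrary illustrative default.
function pickRendition(
  renditions: Rendition[],
  measuredThroughputBps: number,
  safetyFactor: number = 0.8
): Rendition {
  // Sort a copy in ascending bitrate order so the input is left untouched.
  const sorted = renditions.slice().sort((a, b) => a.bitrateBps - b.bitrateBps);
  const budget = measuredThroughputBps * safetyFactor;
  let choice: Rendition | undefined;
  for (const r of sorted) {
    // The first (lowest) rendition is always taken as the fallback;
    // later ones replace it only if they fit in the budget.
    if (choice === undefined || r.bitrateBps <= budget) {
      choice = r;
    }
  }
  if (choice === undefined) {
    throw new Error("at least one rendition is required");
  }
  return choice;
}

// In a browser, the application would then fetch the next segment of the
// chosen rendition and pass the bytes to sourceBuffer.appendBuffer(...).
```

Real players typically combine throughput with buffer occupancy and other signals; this sketch only shows that the decision logic lives in application code.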

Protected Media

For the distribution of media whose content needs specific protection from being copied, the Encrypted Media Extensions specification enables the decryption and rendering of encrypted media streams based on Content Decryption Modules (CDM).
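To give an idea of how an application starts an EME session, the sketch below builds the list of configurations that would be passed to `navigator.requestMediaKeySystemAccess()`. The helper name `buildKeySystemConfig` and the `...Like` interfaces are assumptions made for this example; they only mirror a subset of the configuration fields (`initDataTypes`, `videoCapabilities`, `audioCapabilities`) and the content types are illustrative.

```typescript
// Structural stand-ins for a subset of the EME configuration dictionaries.
interface MediaCapabilityLike {
  contentType: string;
}

interface KeySystemConfigLike {
  initDataTypes: string[];
  videoCapabilities: MediaCapabilityLike[];
  audioCapabilities: MediaCapabilityLike[];
}

// Build a single-entry configuration list from the content types the
// application intends to play. "cenc" is used as the default
// initialization data type for illustration.
function buildKeySystemConfig(
  videoTypes: string[],
  audioTypes: string[],
  initDataTypes: string[] = ["cenc"]
): KeySystemConfigLike[] {
  return [
    {
      initDataTypes: initDataTypes.slice(),
      videoCapabilities: videoTypes.map((contentType) => ({ contentType })),
      audioCapabilities: audioTypes.map((contentType) => ({ contentType })),
    },
  ];
}

// In a browser, the result could be used along these lines:
//   navigator.requestMediaKeySystemAccess("org.w3.clearkey", configs)
//     .then((access) => access.createMediaKeys())
//     .then((keys) => video.setMediaKeys(keys));
```

The key system string selects the CDM; which key systems are available depends on the user agent.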

Audio Processing

The Web Audio API provides a full-fledged audio processing and synthesis API with low-latency guarantees and hardware-accelerated operations when possible.
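As a small example of the kind of processing involved, the sketch below converts a level in decibels to the linear factor expected by a gain parameter. The helper names and the `GainParamLike` interface are assumptions made for this example; in a browser, the `gain` attribute of a `GainNode` (an `AudioParam`) has the same `value` property.

```typescript
// Structural stand-in for the part of AudioParam used here.
interface GainParamLike {
  value: number;
}

// Convert a level in decibels to a linear amplitude factor.
function dbToLinearGain(db: number): number {
  return Math.pow(10, db / 20);
}

// Set a gain parameter from a level expressed in decibels.
function applyGainDb(gainParam: GainParamLike, db: number): void {
  gainParam.value = dbToLinearGain(db);
}

// In a browser, this could be wired as:
//   const ctx = new AudioContext();
//   const gainNode = ctx.createGain();
//   source.connect(gainNode).connect(ctx.destination);
//   applyGainDb(gainNode.gain, -6);
```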

Image and Video Processing

The Canvas API enables pixel-level manipulation of images and, by extension, of video frames when coupled with the requestAnimationFrame method. However, requestAnimationFrame was not designed to signal when a video frame has been presented for composition, and this approach essentially restricts Web applications to processing code that runs on the CPU and in the main thread.

Within these constraints, WebAssembly, a safe, portable, and low-level code format, can be used to improve the performance of the processing code.

Beyond the canvas approach, a number of directions are being explored to enable and optimize media processing, including means to process media streams on the GPU, hardware APIs to run shape detection algorithms or neural network inferences, hooks on media streams that give the user agent the ability to further optimize processing workflows, etc.
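The canvas approach described above can be sketched as follows. The function works on raw RGBA bytes laid out like `ImageData.data` (four bytes per pixel), so it runs on the CPU exactly as the text describes. The function name and the choice of a grayscale filter with Rec. 601 luma weights are assumptions made for this example.

```typescript
// Convert RGBA pixel data to grayscale in place.
// `pixels` uses the same layout as ImageData.data: R, G, B, A per pixel.
function toGrayscaleInPlace(pixels: Uint8ClampedArray): void {
  for (let i = 0; i + 3 < pixels.length; i += 4) {
    // Weighted sum of the color channels (Rec. 601 luma weights).
    const luma =
      0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2];
    // Uint8ClampedArray rounds and clamps the value on assignment.
    pixels[i] = luma;
    pixels[i + 1] = luma;
    pixels[i + 2] = luma;
    // The alpha channel at pixels[i + 3] is left unchanged.
  }
}

// In a browser, a requestAnimationFrame callback could do:
//   ctx.drawImage(video, 0, 0, width, height);
//   const frame = ctx.getImageData(0, 0, width, height);
//   toGrayscaleInPlace(frame.data);
//   ctx.putImageData(frame, 0, 0);
```

A loop like this over every pixel of every frame is the type of hot path that could be moved to WebAssembly.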

Feature	Specification / Group	Current implementations
Adaptive streaming	Media Source Extensions™ (Media Working Group)	Shipped in Chrome (desktop, mobile), Microsoft Edge (desktop), Firefox (desktop, mobile), and Safari (desktop, mobile). Source: Can I use.
Protected Media	Encrypted Media Extensions (Media Working Group)	Shipped in Chrome (desktop, mobile), Microsoft Edge (desktop), Firefox (desktop, mobile), and Safari (desktop, mobile). Source: Can I use.
Audio Processing	Web Audio API (Audio Working Group)	Shipped in Chrome (desktop, mobile), Microsoft Edge (desktop), Firefox (desktop, mobile), and Safari (desktop, mobile). Source: Can I use.
Image and Video Processing	The 2D rendering context in HTML Standard (WHATWG)	Shipped in Chrome (desktop, mobile), Microsoft Edge (desktop), Firefox (desktop, mobile), and Safari (desktop, mobile). Source: Can I use.
	requestAnimationFrame in HTML Standard (WHATWG)	Shipped in Chrome (desktop, mobile), Microsoft Edge (desktop), Firefox (desktop, mobile), and Safari (desktop, mobile). Source: Can I use.
	WebAssembly Core Specification (WebAssembly Working Group)	Shipped in Chrome (desktop, mobile), Microsoft Edge (desktop), Firefox (desktop, mobile), and Safari (desktop, mobile). Source: Can I use.

Specifications in progress

Adaptive streaming

Some scenarios require the ability to switch to a different codec during media playback. For instance, this need may arise on program boundaries when watching linear content. Also, typical ads are of a very different nature than the media stream that they need to be inserted into. The first version of the Media Source Extensions (MSE) specification does not address such heterogeneous scenarios, forcing content providers to use complex workarounds, e.g. using two <video> elements overlaid one on top of the other with no guarantee that the transition between streams will be smooth, or transcoding and re-packaging the ad content to be consistent with the program content. The Media Working Group is integrating a codec switching feature in a second version of MSE.
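In the second version of MSE, codec switching is exposed through the `changeType()` method of `SourceBuffer`. The sketch below shows how an application might guard a switch and fall back to a workaround when it is not possible. The `CodecSwitchableBuffer` interface and the `trySwitchCodec` helper are assumptions made for this example; the support check is injected so that, in a browser, it can be `MediaSource.isTypeSupported`.

```typescript
// Structural stand-in for the parts of SourceBuffer used here.
interface CodecSwitchableBuffer {
  readonly updating: boolean;
  changeType?(type: string): void;
}

// Try to switch the buffer to a new MIME type (container and codecs).
// Returns true if changeType() was called, false if the application
// should fall back to a workaround.
function trySwitchCodec(
  buffer: CodecSwitchableBuffer,
  mimeType: string,
  isTypeSupported: (type: string) => boolean
): boolean {
  // The method is missing in user agents that lack codec switching.
  if (typeof buffer.changeType !== "function") return false;
  // Switching while an append or remove is in progress is not allowed.
  if (buffer.updating) return false;
  // Avoid switching to a type the user agent cannot play.
  if (!isTypeSupported(mimeType)) return false;
  buffer.changeType(mimeType);
  return true;
}

// In a browser:
//   trySwitchCodec(sourceBuffer, 'video/webm; codecs="vp9"',
//     (type) => MediaSource.isTypeSupported(type));
```

After a successful switch, the application appends an initialization segment and media segments in the new format to the same buffer.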

Feature	Specification / Group	Maturity	Current implementations
Adaptive streaming	Codec switching in Media Source Extensions™ (Media Working Group)		Shipped in Firefox (desktop); in development in Safari (desktop). Source: Chrome Platform Status.

Exploratory work

Video processing

A number of directions are being explored to go beyond simple video processing through the Canvas API and requestAnimationFrame:

WebGPU allows Web applications to perform operations such as rendering and computation on a Graphics Processing Unit (GPU).
The Web Neural Network API describes a dedicated low-level API for neural network inference hardware acceleration.
The Shape Detection API provides access to accelerated shape detectors (e.g. to recognize human faces and postures, or objects) on devices that embed relevant hardware such as most modern smartphones and laptops.
HTMLVideoElement.requestVideoFrameCallback() proposes a mechanism to signal when a video frame has been presented for composition and to provide metadata about that frame. This would allow drawing video frames onto a canvas at the video rate (instead of the browser's animation rate).
An early proposal suggests creating an OffscreenVideo interface, inspired by OffscreenCanvas, to allow processing of video in a worker.
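Of the directions listed above, `requestVideoFrameCallback()` is the simplest to illustrate. The callback fires once per registration, so processing every frame means registering again from within the callback. The `onEachVideoFrame` helper and the `...Like` interfaces below are assumptions made for this example; they mirror only the `mediaTime` and `presentedFrames` metadata fields.

```typescript
// Structural stand-ins for the parts of the proposal used here.
interface FrameMetadataLike {
  mediaTime: number; // presentation timestamp of the frame, in seconds
  presentedFrames: number; // count of frames submitted for composition
}

interface FrameCallbackVideo {
  requestVideoFrameCallback?(
    callback: (now: number, metadata: FrameMetadataLike) => void
  ): number;
}

// Call `handler` for each presented frame until it returns false.
// Returns false right away if the user agent lacks the feature, so the
// caller can fall back to requestAnimationFrame.
function onEachVideoFrame(
  video: FrameCallbackVideo,
  handler: (metadata: FrameMetadataLike) => boolean
): boolean {
  if (typeof video.requestVideoFrameCallback !== "function") return false;
  const step = (_now: number, metadata: FrameMetadataLike): void => {
    // Register again only while the handler asks for more frames.
    if (handler(metadata) && video.requestVideoFrameCallback) {
      video.requestVideoFrameCallback(step);
    }
  };
  video.requestVideoFrameCallback(step);
  return true;
}

// In a browser, the handler could draw the current frame onto a canvas:
//   onEachVideoFrame(video, () => { ctx.drawImage(video, 0, 0); return true; });
```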

Media Encoding/Decoding

To further help with media processing and media workflows, WebCodecs defines an API that lets Web applications access and tweak built-in (software and hardware) media encoders and decoders. The API could become a core component of media processing workflows in the future.
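As a sketch of how an application might choose among built-in encoders, the code below walks a list of candidate configurations and keeps the first one reported as supported. The `firstSupportedConfig` helper and the `...Like` interfaces are assumptions made for this example, and the codec strings in the usage comment are illustrative. The support check is injected so that, in a browser that implements WebCodecs, it can be `VideoEncoder.isConfigSupported`.

```typescript
// Structural stand-ins for a subset of the WebCodecs dictionaries.
interface EncoderConfigLike {
  codec: string;
  width: number;
  height: number;
  bitrate?: number;
  framerate?: number;
  hardwareAcceleration?: "no-preference" | "prefer-hardware" | "prefer-software";
}

interface EncoderSupportLike {
  supported?: boolean;
}

// Return the first candidate that the support check accepts, or
// undefined if none is accepted. Candidates are tried in order, so the
// caller expresses its preference through the order of the list.
async function firstSupportedConfig(
  candidates: EncoderConfigLike[],
  isConfigSupported: (config: EncoderConfigLike) => Promise<EncoderSupportLike>
): Promise<EncoderConfigLike | undefined> {
  for (const candidate of candidates) {
    const result = await isConfigSupported(candidate);
    if (result.supported) return candidate;
  }
  return undefined;
}

// In a browser:
//   const config = await firstSupportedConfig(
//     [
//       { codec: "vp09.00.10.08", width: 1280, height: 720, hardwareAcceleration: "prefer-hardware" },
//       { codec: "vp8", width: 1280, height: 720 },
//     ],
//     (c) => VideoEncoder.isConfigSupported(c)
//   );
```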

Feature	Specification / Group	Implementation intents
Video processing	WebGPU (GPU for the Web Working Group)	Shipped in Chrome (desktop) and Microsoft Edge (desktop); experimental in Firefox (desktop, mobile), behind a flag. Source: Can I use. In development in Safari (desktop). Source: Chrome Platform Status.
	Web Neural Network API (Machine Learning for the Web Community Group)
	Accelerated Shape Detection in Images (Web Platform Incubator Community Group)
	HTMLVideoElement.requestVideoFrameCallback() (Web Platform Incubator Community Group)	Under consideration in Firefox (desktop). Source: Chrome Platform Status.
Media Encoding/Decoding	WebCodecs (Web Platform Incubator Community Group)	Under consideration in Firefox (desktop). Source: Chrome Platform Status.

Features not covered by ongoing work

JavaScript-based Codecs: The algorithms used to compress and decompress bandwidth-intensive media content currently have to be provided by the browsers; a system enabling these algorithms to be written and distributed in JavaScript or WebAssembly, and integrated into the overall media flow of user agents, would provide much greater freedom to specialize and innovate in this space (see related discussion on W3C’s discourse forum).
Content Decryption Module API: The capabilities offered by the Encrypted Media Extensions rely on integration with Content Decryption Modules through interfaces that are not specified. Providing a uniform interface for that integration would simplify the addition of new CDMs to the market. Work on such an interface is out of scope for the Media Working Group, which maintains the specification.
Conditional Access System: Broadcasters use a different approach to protect the content they distribute, known as Conditional Access System (CAS). As broadcast streams are coming to Web browsers, providing integration with these systems may be needed.
Hardware-accelerated video processing: The Canvas API provides capabilities to do image and video processing, but these capabilities are limited by their reliance on the CPU for execution; modern GPUs provide hardware acceleration for a wide range of operations, but browsers don't provide hooks to these. While there is no dedicated effort to enable hardware-accelerated video processing, note that it is a use case of ongoing explorations, e.g. on WebGPU and a Web Neural Network API.

Discontinued features

Media Capture Stream with Worker: Video processing using the Canvas API is very CPU intensive, and as such, can benefit from executing separately from the rest of a Web application. The Media Capture Stream with Worker specification was an early approach to process video streams in a dedicated worker thread. That processing would still be done on the CPU though, and work on this document has been discontinued. Work on WebGPU, a Web Neural Network API and WebCodecs may create the right building blocks to create high performance video processing applications in the future.

See anything missing?

If you know of use cases that cannot be achieved today with Web technologies, please let us know!

To that end, you can start a new topic in the Media and Real-Time Communications category of W3C’s discourse forum or raise an issue on the GitHub repository of this document.