This document defines how a stream of media can be captured from a DOM element, such as a [^video^], [^audio^], or [^canvas^] element, in the form of a {{MediaStream}} [[GETUSERMEDIA]].
This document is not complete. It is subject to major changes and, although early experimentation is encouraged, it is not intended for implementation.
This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.
The captured media is formed into a {{MediaStream}} [[GETUSERMEDIA]], which can then be consumed by the various APIs that process streams of media, such as WebRTC [[WEBRTC]], or Web Audio [[WEBAUDIO]].
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [[!WEBIDL]], as this specification uses that specification and terminology.
Methods for capture are added to both {{HTMLMediaElement}} and {{HTMLCanvasElement}}. The {{HTMLMediaElement/captureStream}} method is added to HTML [[!HTML5]] media elements.
Both {{MediaStream}} and {{HTMLMediaElement}} expose the concept of a track. Since there is no common type used for {{HTMLMediaElement}}, this document uses the term track to refer to either a {{VideoTrack}} or an {{AudioTrack}}. {{MediaStreamTrack}} is used to identify the media in a {{MediaStream}}.
partial interface HTMLMediaElement {
  MediaStream captureStream();
};
The captureStream() method produces a real-time capture of the media that is rendered to the media element.
The captured {{MediaStream}} is composed of {{MediaStreamTrack}}s that render the content from the set of selected (for {{VideoTrack}}s, or other exclusively selected track types) or enabled (for {{AudioTrack}}s, or other track types that support multiple selections) tracks from the media element. If the media element does not have a selected or enabled track of a given type, then no {{MediaStreamTrack}} of that type is present in the captured stream.
A [^video^] element can therefore capture a video {{MediaStreamTrack}} and any number of audio {{MediaStreamTrack}}s. An [^audio^] element can capture any number of audio {{MediaStreamTrack}}s. In both cases, the set of captured {{MediaStreamTrack}}s could be empty.
Unless and until there is a track of given type that is selected or enabled, no {{MediaStreamTrack}} of that type is present in the captured stream. In particular, if the media element does not have a source assigned, then the captured {{MediaStream}} has no tracks. Consequently, a media element with a ready state of HAVE_NOTHING produces no captured {{MediaStreamTrack}} instances. Once metadata is available and the selected or enabled tracks are determined, new captured {{MediaStreamTrack}} instances are created and added to the {{MediaStream}}.
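The following non-normative sketch illustrates this behaviour, assuming a page that contains a [^video^] element with the id v and a source assigned to it:

// Illustrative only: capture the rendered output of a media element.
const video = document.getElementById('v');
const stream = video.captureStream();

// Before metadata is available (ready state HAVE_NOTHING),
// the captured stream has no tracks.
console.log(stream.getTracks().length);

video.addEventListener('loadedmetadata', () => {
  // Once the selected and enabled tracks are known, captured
  // MediaStreamTracks are created and added to the stream.
  console.log(stream.getVideoTracks().length, 'video track(s)');
  console.log(stream.getAudioTracks().length, 'audio track(s)');
});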
A captured {{MediaStreamTrack}} ends when playback ends (and the ended event fires) or when the track that it captures is no longer selected or enabled for playback. A track is no longer selected or enabled if the source is changed by setting the {{HTMLMediaElement/src}} or {{HTMLMediaElement/srcObject}} attributes of the media element.
The set of captured {{MediaStreamTrack}}s can change if the source of the media element changes; for example, if the source for the media element ends and a different source is selected.
If the selected {{VideoTrack}} or enabled {{AudioTrack}}s for the media element change, an addtrack event with a new {{MediaStreamTrack}} is generated for each track that was not previously selected or enabled, and a removetrack event is generated for each track that ceases to be selected or enabled. A {{MediaStreamTrack}} MUST end prior to being removed from the {{MediaStream}}.
Since a {{MediaStreamTrack}} can only end once, a track that is enabled, disabled and re-enabled will be captured as two separate tracks. Similarly, restarting playback after playback ends causes a new set of captured {{MediaStreamTrack}} instances to be created. Seeking during playback without changing track selection does not generate events or cause a captured {{MediaStreamTrack}} to end.
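Continuing the non-normative sketch above, where stream is the captured {{MediaStream}}, these changes can be observed through track events on the stream:

// Illustrative only: observe captured tracks being added and removed
// as the media element's selected or enabled tracks change.
stream.addEventListener('addtrack', (event) => {
  console.log('new captured', event.track.kind, 'track', event.track.id);
});

stream.addEventListener('removetrack', (event) => {
  // The captured track has already ended before it is removed.
  console.log('captured', event.track.kind, 'track is', event.track.readyState);
});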
The {{MediaStreamTrack}}s that comprise the captured {{MediaStream}} become muted or unmuted as the tracks they capture change state. At any time, a media element might not have active content available for capture on a given track for a variety of reasons.
Absence of content is reflected in captured tracks through the muted attribute. A captured {{MediaStreamTrack}} MUST have a muted attribute set to true if its corresponding source track does not have available and accessible content. A mute event is raised on the {{MediaStreamTrack}} when content availability changes.
The output that a muted capture produces varies with the type of media: a {{VideoTrack}} ceases to capture new frames while it is muted, causing the captured stream to show the last captured frame; a muted {{AudioTrack}} produces silence.
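Content availability can be observed on an individual captured track, as in this non-normative sketch (again reusing the stream variable from the earlier example):

// Illustrative only: react to a captured audio track losing
// or regaining content.
const [capturedAudio] = stream.getAudioTracks();
if (capturedAudio) {
  capturedAudio.onmute = () => {
    // No accessible content; the capture currently produces silence.
    console.log('captured audio muted');
  };
  capturedAudio.onunmute = () => {
    console.log('captured audio content available again');
  };
}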
Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding a media element cause captured video to stop. Similarly, the audio level or volume of the media element does not affect the volume of captured audio.
Captured audio from an element with an effective playback rate other than 1.0 MUST be time-stretched. An unplayable playback rate causes the captured audio track to become muted.
The {{HTMLCanvasElement/captureStream}} method is added to the HTML [[!HTML5]] canvas element. The resulting {{CanvasCaptureMediaStreamTrack}} provides methods that allow for controlling when frames are sampled from the canvas.
partial interface HTMLCanvasElement {
  MediaStream captureStream(optional double frameRequestRate);
};
The captureStream() method produces a real-time video capture of the surface of the canvas. The resulting media stream has a single video {{CanvasCaptureMediaStreamTrack}} that matches the dimensions of the canvas element.
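A non-normative sketch of a basic canvas capture (the canvas id and the rate of 25 frames per second are illustrative):

// Illustrative only: capture the canvas surface as video.
const canvas = document.getElementById('c');
const canvasStream = canvas.captureStream(25); // sample at most 25 times per second
const [canvasTrack] = canvasStream.getVideoTracks();

// The captured track is a CanvasCaptureMediaStreamTrack and
// exposes the canvas it captures.
console.log(canvasTrack.canvas === canvas); // true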
Content from a canvas that is not origin-clean MUST NOT be captured. This method throws a {{SecurityError}} exception if the canvas is not origin-clean.
A captured stream MUST immediately cease to capture content if the [= origin-clean =] flag of the source canvas becomes false after the stream is created by {{HTMLCanvasElement/captureStream()}}. The captured {{MediaStreamTrack}} MUST become muted, producing no new content while the canvas remains in this state.
Each track that captures a canvas has a {{track/[[frameCaptureRequested]]}} internal slot that is set to true when a new frame is requested from the canvas. The value of {{track/[[frameCaptureRequested]]}} on all new tracks is set to true when the track is created. On creation of the captured track with a specific, non-zero frameRequestRate, the user agent starts a periodic timer at an interval of 1/frameRequestRate seconds. At each activation of the timer, {{track/[[frameCaptureRequested]]}} is set to true.
In order to support manual control of frame capture with the {{CanvasCaptureMediaStreamTrack/requestFrame}}() method, browsers MUST support a value of 0 for frameRequestRate. However, a captured stream MUST request capture of a frame when created, even if frameRequestRate is zero.
This method throws a {{NotSupportedError}} if frameRequestRate is negative.
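The following non-normative sketch contrasts the permitted frameRequestRate values (the canvas lookup is illustrative):

// Illustrative only: automatic capture at up to 30 frames per second.
const canvas = document.querySelector('canvas');
const timed = canvas.captureStream(30);

// A rate of 0 selects manual capture: after the initial requested frame,
// new frames are only added when requestFrame() is called.
const manual = canvas.captureStream(0);

// A negative rate is rejected.
try {
  canvas.captureStream(-1);
} catch (e) {
  console.log(e.name); // "NotSupportedError"
}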
A new frame is requested from the canvas when {{track/[[frameCaptureRequested]]}} is true and the canvas is painted. Each time that the captured canvas is painted, the following steps are executed:
1. If {{track/[[frameCaptureRequested]]}} is true:
   1. Add a new frame to the track containing what was painted to the canvas.
   2. Set {{track/[[frameCaptureRequested]]}} to false.
When adding new frames to the track containing what was painted to the canvas, the alpha channel content of the canvas must be captured and preserved if the canvas is not fully opaque. The consumers of this track might not preserve the alpha channel.
This algorithm results in a captured track not starting until something changes in the canvas.
| Parameter | Type | Nullable | Optional | Description |
|---|---|---|---|---|
| frameRequestRate | double | ✘ | ✔ | |
The CanvasCaptureMediaStreamTrack is an extension of {{MediaStreamTrack}} that provides a single {{CanvasCaptureMediaStreamTrack/requestFrame()}} method. Applications that depend on tight control over the rendering of content to the media stream can use this method to control when frames from the canvas are captured.
[Exposed=Window]
interface CanvasCaptureMediaStreamTrack : MediaStreamTrack {
  readonly attribute HTMLCanvasElement canvas;
  undefined requestFrame();
};
canvas, of type {{HTMLCanvasElement}}, readonly: the canvas element from which this track captures frames.
The requestFrame() method allows applications to manually request that a frame from the canvas be captured and rendered into the track. In cases where applications progressively render to a canvas, this allows applications to avoid capturing a partially rendered frame.
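A non-normative sketch of this pattern, in which a hypothetical drawing routine completes a full frame before each capture:

// Illustrative only: drive capture manually so that partially
// rendered frames are never added to the track.
const canvas = document.querySelector('canvas');
const ctx = canvas.getContext('2d');
const captureTrack = canvas.captureStream(0).getVideoTracks()[0];

function renderAndCapture(now) {
  // Render a complete frame first (the drawing content is illustrative).
  ctx.fillStyle = 'black';
  ctx.fillRect(0, 0, canvas.width, canvas.height);
  ctx.fillStyle = 'white';
  ctx.fillText(`t = ${now.toFixed(0)} ms`, 10, 20);

  // Capture exactly the frame that was just completed.
  captureTrack.requestFrame();
  requestAnimationFrame(renderAndCapture);
}
requestAnimationFrame(renderAndCapture);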
As currently specified, this results in no {{SecurityError}} or other error feedback if the canvas is not origin-clean. In part, this is because we don't track where requests for frames come from. Do we want to highlight that?
Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting {{MediaStreamTrack}} MUST be protected from access by the document origin.
How this protection manifests will differ, depending on how the content is accessed. For instance, rendering inaccessible video to a [^canvas^] element [[HTML]] causes the [= origin-clean =] flag of the canvas to become false; attempting to create a Web Audio {{MediaStreamAudioSourceNode}} [[WEBAUDIO]] succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context); attempting to transfer the media using WebRTC [[WEBRTC]] results in no information being transmitted.
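For illustration only, one observable consequence of this protection, assuming a cross-origin media resource at the example URL below:

// Illustrative only: a capture of cross-origin media stays opaque, so
// rendering it to a canvas makes that canvas no longer origin-clean.
const source = document.createElement('video');
source.src = 'https://example.com/clip.webm'; // assumed cross-origin resource
source.play();

const sink = document.createElement('video');
sink.srcObject = source.captureStream(); // protected MediaStreamTrack(s)
sink.play();

sink.addEventListener('playing', () => {
  const canvas = document.createElement('canvas');
  canvas.getContext('2d').drawImage(sink, 0, 0); // [origin-clean] becomes false
  try {
    canvas.toDataURL(); // read-back is now blocked
  } catch (e) {
    console.log(e.name); // "SecurityError"
  }
});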
The origin of the media that is rendered by a media element can change at any time; this is the case even for a single media resource. User agents MUST ensure that a change in the origin of the media doesn't result in exposure of cross-origin content.
This section will be removed before publication.
This document is based on the stream processing specification [[streamproc]] originally developed by Robert O'Callahan.