Initial Author of this Specification was Ian Hickson, Google Inc., with the following copyright statement:
© Copyright 2004-2011 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA. You are granted a license to use, reproduce and create derivative works of this document.
All subsequent changes since 26 July 2011 done by the W3C WebRTC Working Group and the Device APIs Working Group are under the following Copyright:
© 2011-2016 W3C® (MIT, ERCIM, Keio, Beihang). Document use rules apply.
For the entire publication on the W3C site the liability and trademark rules apply.
This document defines a set of JavaScript APIs that allow local media, including audio and video, to be requested from a platform.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is not complete. It is subject to major changes and, while early experimentations are encouraged, it is therefore not intended for implementation. The API is based on preliminary work done in the WHATWG.
This document was published by the Web Real-Time Communication Working Group and the Device APIs Working Group as an Editor's Draft. Comments regarding this document are welcome. Please send them to public-media-capture@w3.org (subscribe, archives).
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Web Real-Time Communication Working Group) and a public list of any patent disclosures (Device APIs Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 March 2017 W3C Process Document.
This section is non-normative.
This document defines APIs for requesting access to local multimedia devices, such as microphones or video cameras.
This document also defines the MediaStream API, which provides the means to control where multimedia stream data is consumed, and provides some control over the devices that produce the media. It also exposes information about devices able to capture and render media.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, NOT REQUIRED, and SHOULD are to be interpreted as described in [RFC2119].
This specification defines conformance criteria that apply to a single product: the User Agent that implements the interfaces that it contains.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
Implementations that use ECMAScript [ECMA-262] to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL-1], as this specification uses that specification and terminology.
The EventHandler interface represents a callback used for event handlers as defined in [HTML5].
The concepts queue a task and fires a simple event are defined in [HTML5].
The terms event handlers, event handler event types and responsible document are defined in [HTML5].
The term current settings object is defined in [HTML51].
The term allowed to use is defined in WHATWG HTML.
DOMException is defined in WebIDL [WEBIDL-1].
A source is the "thing" providing the source of a media stream track. The source is the broadcaster of the media itself. A source can be a physical webcam, microphone, local video or audio file from the user's hard drive, network resource, or static image. Note that this document describes the use of microphone and camera type sources only; the use of other source types is described in other documents.
An application that has no prior authorization regarding sources is only given the number of available sources, their type and any relationship to other devices. Additional information about sources can become available when applications are authorized to use a source (see 9.2.1 Access control model).
Sources do not have constraints — tracks have constraints. When a source is connected to a track, it must produce media that conforms to the constraints present on that track. Multiple tracks can be attached to the same source. User Agent processing, such as downsampling, MAY be used to ensure that all tracks have appropriate media.
Sources have constrainable properties which have capabilities and settings. The constrainable properties are "owned" by the source and are common to any (multiple) tracks that happen to be using the same source (e.g., if two different track objects bound to the same source ask for the same capability or setting information, they will get back the same answer).
A setting refers to the immediate, current value of the source's constrainable properties. Settings are always read-only.
A source's settings can change dynamically over time due to environmental conditions, sink configurations, or constraint changes. A source's settings must always conform to the current set of basic (mandatory) constraints on all attached tracks. A source that cannot conform to these constraints causes affected tracks to become overconstrained and therefore muted. A User Agent attempts to ensure that sources adhere to advanced (optional) constraints as closely as possible; see 11. Constrainable Pattern.
Although settings are a property of the source, they are only exposed to the application through the tracks attached to the source. This is exposed via the ConstrainablePattern interface.
For each constrainable property, there is a capability that describes whether it is supported by the source and if so, the range of supported values. As with settings, capabilities are exposed to the application via the ConstrainablePattern interface.
The values of the supported capabilities must be normalized to the ranges and enumerated types defined in this specification.
A getCapabilities() call on a track returns the same underlying per-source capabilities for all tracks connected to the source.
Source capabilities are effectively constant. Applications should be able to depend on a specific source having the same capabilities for any browsing session.
This API is intentionally simplified. Capabilities are not capable of describing interactions between different values. For instance, it is not possible to accurately describe the capabilities of a camera that can produce a high resolution video stream at a low frame rate and lower resolutions at a higher frame rate. Capabilities describe the complete range of each value. Interactions between constraints are exposed by attempting to apply constraints.
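As a rough illustration of how a single capability relates to a single value, the sketch below checks one value against one capability. The helper name `valueWithinCapability` and the sample values are hypothetical; the sketch assumes a capability is either a `{min, max}` range or a list of allowed values, and it deliberately ignores interactions between properties, as described above.

```javascript
// Hypothetical helper: checks one value against one capability.
// Assumption: a capability is a {min, max} range, an array of allowed
// values, or a single fixed value.
function valueWithinCapability(capability, value) {
  if (capability === undefined) return false; // property not supported by the source
  if (Array.isArray(capability)) return capability.includes(value);
  if (typeof capability === 'object' && capability !== null) {
    const { min = -Infinity, max = Infinity } = capability;
    return value >= min && value <= max;
  }
  return capability === value;
}

// Made-up capabilities, shaped like what a getCapabilities() call might return.
const exampleCapabilities = {
  width: { min: 320, max: 1920 },
  frameRate: { min: 1, max: 60 },
  facingMode: ['user', 'environment'],
};
```

A value that passes this check for every property may still be unachievable in combination with others; only an attempt to apply the constraints reveals that.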
Constraints provide a general control surface that allows applications to both select an appropriate source for a track and, once selected, to influence how a source operates.
Constraints limit the range of operating modes that a source can use when providing media for a track. Without provided track constraints, implementations are free to select a source's settings from the full ranges of its supported capabilities. Implementations may also adjust source settings at any time within the bounds imposed by all applied constraints.
getUserMedia() uses constraints to help select an appropriate source for a track and configure it. Additionally, the ConstrainablePattern interface on tracks includes an API for dynamically changing the track's constraints at any later time.
A track will not be connected to a source using getUserMedia() if its initial constraints cannot be satisfied. However, the ability to meet the constraints on a track can change over time, and constraints can be changed. If circumstances change such that constraints cannot be met, the ConstrainablePattern interface defines an appropriate error to inform the application. 5. The model: sources, sinks, constraints, and settings explains how constraints interact in more detail.
In general, User Agents will have more flexibility to optimize the media streaming experience the fewer constraints are applied, so application authors are strongly encouraged to use mandatory constraints sparingly.
For each constrainable property, a constraint exists whose name corresponds with the relevant source setting name and capability name.
RTCPeerConnection is defined in [WEBRTC10].
The terms permission, retrieve the permission state, request permission and create a permission storage entry are defined in [permissions].
"request permission" isn't defined yet. Filed as permissions issue 62.
The two main components in the MediaStream API are the MediaStreamTrack and MediaStream interfaces. The MediaStreamTrack object represents media of a single type that originates from one media source in the User Agent, e.g. video produced by a web camera. A MediaStream is used to group several MediaStreamTrack objects into one unit that can be recorded or rendered in a media element.
Each MediaStream can contain zero or more MediaStreamTrack objects. All tracks in a MediaStream are intended to be synchronized when rendered. This is not a hard requirement, since it might not be possible to synchronize tracks from sources that have different clocks. Different MediaStream objects do not need to be synchronized.
While the intent is to synchronize tracks, it could be better in some circumstances to permit tracks to lose synchronization. In particular, when tracks are remotely sourced and real-time [WEBRTC10], it can be better to allow loss of synchronization than to accumulate delays or risk glitches and other artifacts. Implementations are expected to understand the implications of choices regarding synchronization of playback and the effect that these have on user perception.
A single MediaStreamTrack can represent multi-channel content, such as stereo or 5.1 audio or stereoscopic video, where the channels have a well defined relationship to each other. Information about channels might be exposed through other APIs, such as [WEBAUDIO], but this specification provides no direct access to channels.
A MediaStream object has an input and an output that represent the combined input and output of all the object's tracks. The output of the MediaStream controls how the object is rendered, e.g., what is saved if the object is recorded to a file or what is displayed if the object is used in a video element. A single MediaStream object can be attached to multiple different outputs at the same time.
A new MediaStream object can be created from existing media streams or tracks using the MediaStream() constructor. The constructor argument can either be an existing MediaStream object, in which case all the tracks of the given stream are added to the new MediaStream object, or an array of MediaStreamTrack objects. The latter form makes it possible to compose a stream from different source streams.
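The array form of the constructor can be sketched as follows: one audio track and one video track, taken from two different streams, are combined into a new stream. The function name `composeStream` and the `StreamCtor` parameter are hypothetical; the parameter exists only so the sketch can be exercised outside a browser and defaults to the platform's `MediaStream` when there is one.

```javascript
// Builds a new stream from the first audio track of one stream and the
// first video track of another.
// StreamCtor is an assumption of this sketch: it defaults to the platform's
// MediaStream constructor and can be replaced by a stand-in elsewhere.
function composeStream(audioStream, videoStream, StreamCtor = globalThis.MediaStream) {
  const tracks = [
    ...audioStream.getAudioTracks().slice(0, 1),
    ...videoStream.getVideoTracks().slice(0, 1),
  ];
  return new StreamCtor(tracks);
}
```

The tracks themselves are shared, not copied: the composed stream refers to the same track objects as the two input streams.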
Both MediaStream and MediaStreamTrack objects can be cloned. A cloned MediaStream contains clones of all member tracks from the original stream. A cloned MediaStreamTrack has a set of constraints that is independent of the instance it is cloned from, which allows media from the same source to have different constraints applied for different consumers. The MediaStream object is also used in contexts outside getUserMedia, such as [WEBRTC10].
The MediaStream() constructor composes a new stream out of existing tracks. It takes an optional argument of type MediaStream or an array of MediaStreamTrack objects. When the constructor is invoked, the User Agent must run the following steps:
Let stream be a newly constructed MediaStream object.
Initialize stream's id attribute to a newly generated value.
If the constructor's argument is present, construct a set of tracks, tracks, based on the type of argument:
A MediaStream object:
Let tracks be a set containing all the MediaStreamTrack objects in the MediaStream track set.
A sequence of MediaStreamTrack objects:
Let tracks be a set containing all the MediaStreamTrack objects in the provided sequence.
For each MediaStreamTrack, track, in tracks, run the following steps:
Return stream.
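The constructor steps can be modelled with plain objects, as in the sketch below. It is not a User Agent implementation: `constructStream`, the `{ id, trackSet }` shape and the counter-based id generator are all hypothetical. It also assumes that a track already present in the set is skipped, since the track set is a set.

```javascript
let nextSketchStreamId = 0; // hypothetical id generator; a real User Agent would typically use UUIDs

// Models the constructor steps. A "stream" here is { id, trackSet },
// where trackSet is a Map keyed by track id.
function constructStream(arg) {
  const stream = { id: `stream-${nextSketchStreamId++}`, trackSet: new Map() };
  let tracks = [];
  if (arg !== undefined) {
    // Either use the provided sequence, or copy the track set of an existing stream.
    tracks = Array.isArray(arg) ? arg : [...arg.trackSet.values()];
  }
  for (const track of tracks) {
    // Assumption: a track that is already in the set is skipped.
    if (!stream.trackSet.has(track.id)) stream.trackSet.set(track.id, track);
  }
  return stream;
}
```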
The tracks of a MediaStream are stored in a track set. The track set MUST contain the MediaStreamTrack objects that correspond to the tracks of the stream. The relative order of the tracks in the set is User Agent defined and the API will never put any requirements on the order. The proper way to find a specific MediaStreamTrack object in the set is to look it up by its id.
An object that reads data from the output of a MediaStream is referred to as a MediaStream consumer. The list of MediaStream consumers currently includes media elements (such as <video> and <audio>) [HTML5], Web Real-Time Communications (WebRTC; RTCPeerConnection) [WEBRTC10], media recording (MediaRecorder) [mediastream-recording], image capture (ImageCapture) [image-capture], and web audio (MediaStreamAudioSourceNode) [WEBAUDIO].
MediaStream consumers must be able to handle tracks being added and removed. This behavior is specified per consumer.
A MediaStream object is said to be active when it has at least one MediaStreamTrack that has not ended. A MediaStream that does not have any tracks or only has tracks that are ended is inactive.
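The active condition can be expressed directly in terms of a stream's tracks. The helper below is a minimal sketch with a hypothetical name, `isStreamActive`; it works on any object that offers `getTracks()` and whose tracks carry a `readyState`.

```javascript
// A stream is active when at least one of its tracks has not ended.
// A stream with no tracks, or with only ended tracks, is inactive.
function isStreamActive(stream) {
  return stream.getTracks().some((track) => track.readyState !== 'ended');
}
```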
The User Agent may update a MediaStream's track set in response to, for example, an external event. This specification does not specify any such cases, but other specifications using the MediaStream API may. One such example is the WebRTC 1.0 [WEBRTC10] specification where the track set of a MediaStream, received from another peer, can be updated as a result of changes to the media session.
When the User Agent initiates adding a track to a MediaStream, with the exception of initializing a newly created MediaStream with tracks, the User Agent MUST queue a task that runs the following steps:
Let track be the MediaStreamTrack in question and stream the MediaStream object to which track is to be added.
If track is already in stream's track set, then abort these steps.
Add track to stream's track set.
Fire a track event named addtrack with track at stream.
When the User Agent initiates removing a track from a MediaStream, the User Agent MUST queue a task that runs the following steps:
Let track be the MediaStreamTrack in question and stream the MediaStream object from which track is to be removed.
If track is not in stream's track set, then abort these steps.
Remove track from stream's track set.
Fire a track event named removetrack with track at stream.
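An application that wants to follow such User Agent initiated changes might subscribe to both events, as sketched below. The function name `watchTrackSet` and the shape of the callback are hypothetical; the sketch assumes the track event exposes the affected track as `event.track`.

```javascript
// Subscribes to track-set changes initiated by the User Agent.
// Returns a function that removes both listeners again.
function watchTrackSet(stream, onChange) {
  const added = (event) => onChange('added', event.track);
  const removed = (event) => onChange('removed', event.track);
  stream.addEventListener('addtrack', added);
  stream.addEventListener('removetrack', removed);
  return () => {
    stream.removeEventListener('addtrack', added);
    stream.removeEventListener('removetrack', removed);
  };
}
```

Note that the addTrack() and removeTrack() method steps in this document do not include firing an event, so a script that changes the track set itself should not expect these listeners to run.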
[Exposed=Window,
 Constructor,
 Constructor(MediaStream stream),
 Constructor(sequence<MediaStreamTrack> tracks)]
interface MediaStream : EventTarget {
    readonly attribute DOMString id;
    sequence<MediaStreamTrack> getAudioTracks();
    sequence<MediaStreamTrack> getVideoTracks();
    sequence<MediaStreamTrack> getTracks();
    MediaStreamTrack? getTrackById(DOMString trackId);
    void addTrack(MediaStreamTrack track);
    void removeTrack(MediaStreamTrack track);
    MediaStream clone();
    readonly attribute boolean active;
    attribute EventHandler onaddtrack;
    attribute EventHandler onremovetrack;
};
MediaStream
See the MediaStream constructor algorithm
MediaStream
See the MediaStream constructor algorithm
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |
MediaStream
See the MediaStream constructor algorithm
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
tracks | sequence<MediaStreamTrack> | ✘ | ✘ | |
id of type DOMString, readonly
When a MediaStream is created, the User Agent MUST generate an identifier string, and MUST initialize the object's id attribute to that string, unless the object is created as part of a special purpose algorithm that specifies how the stream id must be initialized. A good practice is to use a UUID [rfc4122], which is 36 characters long in its canonical form. To avoid fingerprinting, implementations SHOULD use the forms in section 4.4 or 4.5 of RFC 4122 when generating UUIDs.
The id attribute MUST return the value to which it was initialized when the object was created.
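For illustration, the 36-character canonical UUID form mentioned above can be recognized with a simple pattern. The helper name `looksLikeCanonicalUuid` is hypothetical, and since a UUID is only described as good practice, applications should treat an id as an opaque string and not depend on this check succeeding.

```javascript
// Matches the canonical textual UUID form: 8-4-4-4-12 hexadecimal digits,
// which is 36 characters including the four hyphens.
const CANONICAL_UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function looksLikeCanonicalUuid(id) {
  return typeof id === 'string' && CANONICAL_UUID.test(id);
}
```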
active of type boolean, readonly
The active attribute MUST return true if this MediaStream is active and false otherwise.
onaddtrack of type EventHandler
The event type of this event handler is addtrack.
onremovetrack of type EventHandler
The event type of this event handler is removetrack.
getAudioTracks
Returns a sequence of MediaStreamTrack objects representing the audio tracks in this stream.
The getAudioTracks method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set whose kind is equal to "audio". The conversion from the track set to the sequence is User Agent defined and the order does not have to be stable between calls.
Return type: sequence<MediaStreamTrack>
getVideoTracks
Returns a sequence of MediaStreamTrack objects representing the video tracks in this stream.
The getVideoTracks method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set whose kind is equal to "video". The conversion from the track set to the sequence is User Agent defined and the order does not have to be stable between calls.
Return type: sequence<MediaStreamTrack>
getTracks
Returns a sequence of MediaStreamTrack objects representing all the tracks in this stream.
The getTracks method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set, regardless of kind. The conversion from the track set to the sequence is User Agent defined and the order does not have to be stable between calls.
Return type: sequence<MediaStreamTrack>
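Because the returned sequence is a snapshot, a script can iterate over it without worrying about the track set changing underneath it. A minimal sketch, with the hypothetical name `stopAllTracks`, releases every track of a stream this way:

```javascript
// Stops every track of a stream. getTracks() returns a snapshot, so the
// loop is unaffected if the stream's track set changes while it runs.
// The order of the snapshot is not relied upon.
function stopAllTracks(stream) {
  const snapshot = stream.getTracks();
  for (const track of snapshot) {
    track.stop();
  }
  return snapshot.length; // number of tracks that were asked to stop
}
```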
getTrackById
The getTrackById method MUST return either a MediaStreamTrack object from this stream's track set whose id is equal to trackId, or null, if no such track exists.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
trackId | DOMString | ✘ | ✘ | |
Return type: MediaStreamTrack, nullable
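Since the method returns null rather than throwing when no track matches, callers need an explicit check. The sketch below, with the hypothetical name `describeTrack`, shows one way to handle both outcomes.

```javascript
// Looks up a track by id and returns a readable description.
// A null result from getTrackById() means no track has that id.
function describeTrack(stream, trackId) {
  const track = stream.getTrackById(trackId);
  if (track === null) return `no track with id ${trackId}`;
  return `${track.kind} track ${track.id} (${track.readyState})`;
}
```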
addTrack
Adds the given MediaStreamTrack to this MediaStream.
When the addTrack method is invoked, the User Agent MUST run the following steps:
Let track be the method's argument and stream the MediaStream object on which the method was called.
If track is already in stream's track set, then abort these steps.
Add track to stream's track set.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | |
Return type: void
removeTrack
Removes the given MediaStreamTrack object from this MediaStream.
When the removeTrack method is invoked, the User Agent MUST run the following steps:
Let track be the method's argument and stream the MediaStream object on which the method was called.
If track is not in stream's track set, then abort these steps.
Remove track from stream's track set.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | |
Return type: void
clone
Clones the given MediaStream and all its tracks.
When the clone() method is invoked, the User Agent MUST run the following steps:
Let streamClone be a newly constructed MediaStream object.
Initialize streamClone's id attribute to a newly generated value.
Clone each track in this MediaStream object and add the result to streamClone's track set.
Return type: MediaStream
A MediaStreamTrack object represents a media source in the User Agent. An example source is a device connected to the User Agent. Other specifications may define sources for MediaStreamTrack that override the behavior specified here. Several MediaStreamTrack objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to getUserMedia().
The data from a MediaStreamTrack object does not necessarily have a canonical binary form; for example, it could just be "the video currently coming from the user's video camera". This allows User Agents to manipulate media in whatever fashion is most suitable on the user's platform.
A script can indicate that a MediaStreamTrack object no longer needs its source with the stop() method. When all tracks using a source have been stopped or ended by some other means, the source is stopped. If the source is a device exposed by getUserMedia(), then when the source is stopped, the UA MUST run the following steps:
Let deviceId be the device's deviceId.
Set [[devicesLiveMap]][deviceId] to false.
If the result of retrieving the permission state of the permission associated with the device's kind and deviceId is not “granted”, then set [[devicesAccessibleMap]][deviceId] to false.
An implementation may use a per-source reference count to keep track of source usage, but the specifics are out of scope for this specification.
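One possible shape for such per-source bookkeeping is sketched below. It is purely illustrative: the class name `SourceUsage`, its methods and the `onStopped` callback are hypothetical, and the specification leaves the actual mechanism to implementations.

```javascript
// Hypothetical per-source bookkeeping: the source is stopped when the last
// track that uses it has been released (stopped or ended).
class SourceUsage {
  constructor(onStopped) {
    this.liveTracks = new Set();
    this.stopped = false;
    this.onStopped = onStopped; // called once, when no tracks remain
  }

  attach(trackId) {
    if (this.stopped) throw new Error('source already stopped');
    this.liveTracks.add(trackId);
  }

  release(trackId) {
    // Releasing an unknown or already released track is a no-op.
    if (!this.liveTracks.delete(trackId)) return;
    if (this.liveTracks.size === 0 && !this.stopped) {
      this.stopped = true;
      this.onStopped();
    }
  }
}
```

A set of track ids is used instead of a bare counter so that releasing the same track twice cannot stop the source while another track still depends on it.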
To clone a track the User Agent MUST run the following steps:
Let track be the MediaStreamTrack object to be cloned.
Let trackClone be a newly constructed MediaStreamTrack object.
Initialize trackClone's id attribute to a newly generated value.
Initialize trackClone's kind, label, readyState, and enabled attributes by copying the corresponding values from track.
Let trackClone's underlying source be the source of track.
Set trackClone's constraints to the active constraints of track.
Return trackClone.
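The cloning steps can be modelled on plain objects as shown below. The function name `cloneTrackSketch`, the object shape and the counter-based id are hypothetical. The sketch assumes constraints are plain data, so a JSON round trip is enough to give the clone its own independent copy.

```javascript
let nextSketchCloneId = 0; // hypothetical id generator for the sketch

// Models the track cloning steps. The clone shares the underlying source
// with the original but receives its own copy of the constraints.
function cloneTrackSketch(track) {
  return {
    id: `clone-${nextSketchCloneId++}`, // newly generated id
    kind: track.kind,
    label: track.label,
    readyState: track.readyState,       // an ended track yields an ended clone
    enabled: track.enabled,
    source: track.source,               // same source object, not a copy
    // Assumption: constraints are plain data, so a JSON round trip copies them.
    constraints: JSON.parse(JSON.stringify(track.constraints || {})),
  };
}
```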
A MediaStreamTrack has two states in its life-cycle: live and ended. A newly created MediaStreamTrack can be in either state depending on how it was created. For example, cloning an ended track results in a new ended track. The current state is reflected by the object's readyState attribute.
In the live state, the track is active and media is available for use by consumers (but may be replaced by zero-information-content if the MediaStreamTrack is muted or disabled, see below).
A muted or disabled MediaStreamTrack renders either silence (audio), black frames (video), or a zero-information-content equivalent. For example, a video element sourced by a muted or disabled MediaStreamTrack (contained within a MediaStream), is playing but the rendered content is the muted output.
If the source is a device exposed by getUserMedia(), then when a track becomes either muted or disabled, and this brings all tracks connected to the device to be either muted, disabled, or stopped, then the UA MAY, using the device's deviceId, deviceId, set [[devicesLiveMap]][deviceId] to false, provided the UA sets it back to true as soon as any unstopped track connected to this device becomes un-muted or enabled again.
The muted/unmuted state of a track reflects whether the source provides any media at this moment. The enabled/disabled state is under application control and determines whether the track outputs media (to its consumers). Hence, media from the source only flows when a MediaStreamTrack object is both unmuted and enabled.
A MediaStreamTrack is muted when the source is temporarily unable to provide the track with data. A track can be muted by a user. Often this action is outside the control of the application. This could be as a result of the user hitting a hardware switch or toggling a control in the operating system / browser chrome. A track can also be muted by the User Agent.
Applications are able to enable or disable a MediaStreamTrack to prevent it from rendering media from the source. A muted track will however, regardless of the enabled state, render silence and blackness. A disabled track is logically equivalent to a muted track, from a consumer point of view.
For a newly created MediaStreamTrack object, the following applies. The track is always enabled unless stated otherwise (for example when cloned) and the muted state reflects the state of the source at the time the track is created.
A MediaStreamTrack object is said to end when the source of the track is disconnected or exhausted.
If all MediaStreamTracks that are using the same source are ended, the source will be stopped.
When a MediaStreamTrack object ends for any reason (e.g., because the user rescinds the permission for the page to use the local camera, or because the application invoked the stop() method on the MediaStreamTrack object, or because the User Agent has instructed the track to end for any reason) it is said to be ended.
When a MediaStreamTrack track ends for any reason other than the stop() method being invoked, the User Agent MUST queue a task that runs the following steps:
If the track's readyState attribute has the value ended already, then abort these steps.
Set track's readyState attribute to ended.
Notify track's source that track is ended so that the source may be stopped, unless other MediaStreamTrack objects depend on it.
Fire a simple event named ended at the object.
If the end of the stream was reached due to a user request, the event source for this event is the user interaction event source.
There are two dimensions related to the media flow for a live MediaStreamTrack: muted / not muted, and enabled / disabled.
Muted refers to the input to the MediaStreamTrack. If live samples are not made available to the MediaStreamTrack it is muted.
Muted is out of control for the application, but can be observed by the application by reading the muted attribute and listening to the associated events mute and unmute. There can be several reasons for a MediaStreamTrack to be muted: the user pushing a physical mute button on the microphone, the user toggling a control in the operating system, the user clicking a mute button in the browser chrome, the User Agent (on behalf of the user) mutes, etc.
To update a track's muted state to newState, the User Agent MUST queue a task to run the following steps:
Let track be the MediaStreamTrack in question.
Set track's muted attribute to newState.
If newState is true let eventName be mute, otherwise unmute.
Fire a simple event named eventName on track.
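The steps above can be modelled on a plain object as follows. The function name `updateMutedState` and the `fire` callback, which stands in for firing a simple event, are hypothetical. The real algorithm queues a task, whereas this sketch runs the steps synchronously for simplicity.

```javascript
// Models the "update a track's muted state" steps on a plain object.
// Assumption: the steps run synchronously here; a User Agent queues a task.
function updateMutedState(track, newState, fire) {
  track.muted = newState;
  const eventName = newState ? 'mute' : 'unmute';
  fire(eventName, track); // stands in for "fire a simple event named eventName"
  return eventName;
}
```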
Enabled/disabled on the other hand is available to the application to control (and observe) via the enabled attribute.
The result for the consumer is the same in the sense that whenever MediaStreamTrack is muted or disabled (or both) the consumer gets zero-information-content, which means silence for audio and black frames for video. In other words, media from the source only flows when a MediaStreamTrack object is both unmuted and enabled. For example, a video element sourced by a muted or disabled MediaStreamTrack (contained in a MediaStream), is playing but rendering blackness.
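The two dimensions combine into a single condition, sketched below with the hypothetical helper name `isMediaFlowing`. It reads only the readyState, muted and enabled attributes described in this document.

```javascript
// Media from the source only flows when a live track is both unmuted and enabled.
// In every other case the consumer gets zero-information-content.
function isMediaFlowing(track) {
  return track.readyState === 'live' && !track.muted && track.enabled;
}
```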
For a newly created MediaStreamTrack object, the following applies: the track is always enabled unless stated otherwise (for example when cloned) and the muted state reflects the state of the source at the time the track is created.
MediaStreamTrack is a constrainable object as defined in the Constrainable Pattern section. Constraints are set on tracks and may affect sources.
Whether Constraints were provided at track initialization time or need to be established later at runtime, the APIs defined in the ConstrainablePattern Interface allow the retrieval and manipulation of the constraints currently established on a track.
If the overconstrained event is fired, the track MUST be muted until either new satisfiable constraints are applied or the existing constraints become satisfiable.
[Exposed=Window]
interface MediaStreamTrack : EventTarget {
    readonly attribute DOMString kind;
    readonly attribute DOMString id;
    readonly attribute DOMString label;
    attribute boolean enabled;
    readonly attribute boolean muted;
    attribute EventHandler onmute;
    attribute EventHandler onunmute;
    readonly attribute MediaStreamTrackState readyState;
    attribute EventHandler onended;
    MediaStreamTrack clone();
    void stop();
    MediaTrackCapabilities getCapabilities();
    MediaTrackConstraints getConstraints();
    MediaTrackSettings getSettings();
    Promise<void> applyConstraints(optional MediaTrackConstraints constraints);
    attribute EventHandler onoverconstrained;
};
kind of type DOMString, readonly
The kind attribute MUST return the string "audio" if this object represents an audio track or "video" if this object represents a video track.
id of type DOMString, readonly
When a MediaStreamTrack is created, the User Agent MUST generate an identifier string, and MUST initialize the object's id attribute to that string, unless the object is created as part of a special purpose algorithm that specifies how the track id must be initialized. See MediaStream's id attribute for guidelines on how to generate such an identifier.
An example of an algorithm that specifies how the track id must be initialized is the algorithm to represent an incoming network component with a MediaStreamTrack object. [WEBRTC10]
The id attribute MUST return the value to which it was initialized when the object was created.
label of type DOMString, readonly
User Agents MAY label audio and video sources (e.g., "Internal microphone" or "External USB Webcam"). The label attribute MUST return the label of the object's corresponding source, if any. If the corresponding source has or had no label, the attribute MUST instead return the empty string.
enabled of type boolean
The enabled attribute controls the enabled state for the object.
On getting, the attribute MUST return the value to which it was last set. On setting, it MUST be set to the new value.
Thus, after a MediaStreamTrack has ended, its enabled attribute still changes value when set; it just doesn't do anything with that new value.
muted of type boolean, readonly
The muted attribute MUST return true if the track is muted, and false otherwise.
onmute of type EventHandler
The event type of this event handler is mute.
onunmute of type EventHandler
The event type of this event handler is unmute.
readyState of type MediaStreamTrackState, readonly
The readyState attribute represents the state of the track. It MUST return the value as most recently set by the User Agent.
onended of type EventHandler
The event type of this event handler is ended.
onoverconstrained of type EventHandler
The event type of this event handler is overconstrained.
See ConstrainablePattern Interface for more information about the overconstrained event.
clone
Clones this MediaStreamTrack.
When the clone() method is invoked, the User Agent MUST return the result of cloning this track.
Return type: MediaStreamTrack
stop
When a MediaStreamTrack object's stop() method is invoked, the User Agent MUST run the following steps:
Let track be the current MediaStreamTrack object.
Notify track's source that track is ended.
A source that is notified of a track ending will be stopped, unless other MediaStreamTrack objects depend on it.
Set track's readyState attribute to ended.
The task source for the tasks queued for the stop() method is the DOM manipulation task source.
Return type: void
getCapabilities()
Returns the capabilities of the source that this MediaStreamTrack, the constrainable object, represents.
See ConstrainablePattern Interface for the definition of this method.
Since this method gives likely persistent, cross-origin information about the underlying device, it adds to the fingerprint surface of the device.
Return type: MediaTrackCapabilities
getConstraints()
See ConstrainablePattern Interface for the definition of this method.
Return type: MediaTrackConstraints
getSettings()
See ConstrainablePattern Interface for the definition of this method.
Return type: MediaTrackSettings
applyConstraints()
See ConstrainablePattern Interface for the definition of this method, where
MediaStreamTrack
on which this method was called, and
MediaTrackSettings
dictionary.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
constraints | MediaTrackConstraints | ✘ | ✔ | A new constraint structure to apply to this object. |
Return type: Promise<void>
enum MediaStreamTrackState
{
"live",
"ended"
};
MediaStreamTrackState Enumeration description |
|
---|---|
live |
The track is active (the track's underlying media source is making a best-effort attempt to provide data in real time). The output of a track in the |
ended |
The track has ended (the track's underlying media source is no longer providing data, and will never provide more data for this track). Once a track enters this state, it never exits it. For example, a video track in a
|
MediaTrackSupportedConstraints
MediaTrackSupportedConstraints represents the list of constraints recognized by a User Agent for controlling the Capabilities of a MediaStreamTrack object. This dictionary is used as a function return value, and never as an operation argument.
Future specifications can extend the MediaTrackSupportedConstraints dictionary by defining a partial dictionary with dictionary members of type boolean.
dictionary MediaTrackSupportedConstraints {
    boolean width = true;
    boolean height = true;
    boolean aspectRatio = true;
    boolean frameRate = true;
    boolean facingMode = true;
    boolean volume = true;
    boolean sampleRate = true;
    boolean sampleSize = true;
    boolean echoCancellation = true;
    boolean autoGainControl = true;
    boolean noiseSuppression = true;
    boolean latency = true;
    boolean channelCount = true;
    boolean deviceId = true;
    boolean groupId = true;
};
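Since this dictionary is only ever returned to script, a typical use is to check which constraint names the User Agent recognizes before relying on them. The following is a minimal sketch under stated assumptions: splitBySupport is a hypothetical helper, and the supported object is a hand-written stand-in for a returned MediaTrackSupportedConstraints value, in which a missing member is assumed to mean "not recognized".

```javascript
// Split the property names of a constraint set into those the User Agent
// recognizes and those it does not.
function splitBySupport(constraintSet, supported) {
  const recognized = [];
  const unrecognized = [];
  for (const name of Object.keys(constraintSet)) {
    if (supported[name] === true) {
      recognized.push(name);
    } else {
      unrecognized.push(name);
    }
  }
  return { recognized, unrecognized };
}

// Hand-written stand-in for a MediaTrackSupportedConstraints value.
const supported = { width: true, height: true, frameRate: true, deviceId: true };

// "latency" is absent from the stand-in above, so it is reported as unrecognized.
const result = splitBySupport({ width: 1280, height: 720, latency: 0.01 }, supported);
```

A script could use the unrecognized list to avoid depending on a constraint that the User Agent would not act on.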
MediaTrackSupportedConstraints Members
width of type boolean, defaulting to true
See width for details.
height of type boolean, defaulting to true
See height for details.
aspectRatio of type boolean, defaulting to true
See aspectRatio for details.
frameRate of type boolean, defaulting to true
See frameRate for details.
facingMode of type boolean, defaulting to true
See facingMode for details.
volume of type boolean, defaulting to true
See volume for details.
sampleRate of type boolean, defaulting to true
See sampleRate for details.
sampleSize of type boolean, defaulting to true
See sampleSize for details.
echoCancellation of type boolean, defaulting to true
See echoCancellation for details.
autoGainControl of type boolean, defaulting to true
See autoGainControl for details.
noiseSuppression of type boolean, defaulting to true
See noiseSuppression for details.
latency of type boolean, defaulting to true
See latency for details.
channelCount of type boolean, defaulting to true
See channelCount for details.
deviceId of type boolean, defaulting to true
See deviceId for details.
groupId of type boolean, defaulting to true
See groupId for details.
MediaTrackCapabilities
MediaTrackCapabilities represents the Capabilities of a MediaStreamTrack object.
Future specifications can extend the MediaTrackCapabilities dictionary by defining a partial dictionary with dictionary members of appropriate type.
dictionary MediaTrackCapabilities {
    LongRange width;
    LongRange height;
    DoubleRange aspectRatio;
    DoubleRange frameRate;
    sequence<DOMString> facingMode;
    DoubleRange volume;
    LongRange sampleRate;
    LongRange sampleSize;
    sequence<boolean> echoCancellation;
    sequence<boolean> autoGainControl;
    sequence<boolean> noiseSuppression;
    DoubleRange latency;
    LongRange channelCount;
    DOMString deviceId;
    DOMString groupId;
};
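The three sequence<boolean> members of this dictionary (echoCancellation, autoGainControl and noiseSuppression) report whether a feature is unavailable, always on, or controllable by script, as described for each member. A minimal sketch of how a script might classify such a list follows; describeBooleanCapability and its return strings are hypothetical, and the sample lists are invented rather than read from a real device.

```javascript
// Classify a sequence<boolean> capability:
//   [false]        -> the source cannot perform the processing
//   [true]         -> the processing cannot be turned off
//   [true, false]  -> the script can control the feature
function describeBooleanCapability(values) {
  const canBeOn = values.includes(true);
  const canBeOff = values.includes(false);
  if (canBeOn && canBeOff) return "controllable";
  if (canBeOn) return "always-on";
  if (canBeOff) return "unavailable";
  return "unknown"; // empty list: nothing was reported
}

const examples = {
  noSupport: describeBooleanCapability([false]),
  fixedOn: describeBooleanCapability([true]),
  scriptControlled: describeBooleanCapability([true, false]),
};
```

Only the "controllable" case leaves the application a real choice when it later constrains the property.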
MediaTrackCapabilities Members
width of type LongRange
See width for details.
height of type LongRange
See height for details.
aspectRatio of type DoubleRange
See aspectRatio for details.
frameRate of type DoubleRange
See frameRate for details.
facingMode of type sequence<DOMString>
A camera can report multiple facing modes. For example, in a high-end telepresence solution with several cameras facing the user, a camera to the left of the user can report both "left" and "user". See facingMode for additional details.
volume of type DoubleRange
See volume for details.
sampleRate of type LongRange
See sampleRate for details.
sampleSize of type LongRange
See sampleSize for details.
echoCancellation of type sequence<boolean>
If the source cannot do echo cancellation, a single false is reported. If echo cancellation cannot be turned off, a single true is reported. If the script can control the feature, the source reports a list with both true and false as possible values. See echoCancellation for additional details.
autoGainControl of type sequence<boolean>
If the source cannot do auto gain control, a single false is reported. If auto gain control cannot be turned off, a single true is reported. If the script can control the feature, the source reports a list with both true and false as possible values. See autoGainControl for additional details.
noiseSuppression of type sequence<boolean>
If the source cannot do noise suppression, a single false is reported. If noise suppression cannot be turned off, a single true is reported. If the script can control the feature, the source reports a list with both true and false as possible values. See noiseSuppression for additional details.
latency of type DoubleRange
See latency for details.
channelCount of type LongRange
See channelCount for details.
deviceId of type DOMString
See deviceId for details.
groupId of type DOMString
See groupId for details.
MediaTrackConstraints
dictionary MediaTrackConstraints : MediaTrackConstraintSet {
    sequence<MediaTrackConstraintSet> advanced;
};
MediaTrackConstraints Members
advanced of type sequence<MediaTrackConstraintSet>
See Constraints and ConstraintSet for the definition of this element.
Future specifications can extend the MediaTrackConstraintSet dictionary by defining a partial dictionary with dictionary members of appropriate type.
dictionary MediaTrackConstraintSet {
    ConstrainLong width;
    ConstrainLong height;
    ConstrainDouble aspectRatio;
    ConstrainDouble frameRate;
    ConstrainDOMString facingMode;
    ConstrainDouble volume;
    ConstrainLong sampleRate;
    ConstrainLong sampleSize;
    ConstrainBoolean echoCancellation;
    ConstrainBoolean autoGainControl;
    ConstrainBoolean noiseSuppression;
    ConstrainDouble latency;
    ConstrainLong channelCount;
    ConstrainDOMString deviceId;
    ConstrainDOMString groupId;
};
MediaTrackConstraintSet Members
width of type ConstrainLong
See width for details.
height of type ConstrainLong
See height for details.
aspectRatio of type ConstrainDouble
See aspectRatio for details.
frameRate of type ConstrainDouble
See frameRate for details.
facingMode of type ConstrainDOMString
See facingMode for details.
volume of type ConstrainDouble
See volume for details.
sampleRate of type ConstrainLong
See sampleRate for details.
sampleSize of type ConstrainLong
See sampleSize for details.
echoCancellation of type ConstrainBoolean
See echoCancellation for details.
autoGainControl of type ConstrainBoolean
See autoGainControl for details.
noiseSuppression of type ConstrainBoolean
See noiseSuppression for details.
latency of type ConstrainDouble
See latency for details.
channelCount of type ConstrainLong
See channelCount for details.
deviceId of type ConstrainDOMString
See deviceId for details.
groupId of type ConstrainDOMString
See groupId for details.
MediaTrackSettings
MediaTrackSettings represents the Settings of a MediaStreamTrack object.
Future specifications can extend the MediaTrackSettings dictionary by defining a partial dictionary with dictionary members of appropriate type.
dictionary MediaTrackSettings {
    long width;
    long height;
    double aspectRatio;
    double frameRate;
    DOMString facingMode;
    double volume;
    long sampleRate;
    long sampleSize;
    boolean echoCancellation;
    boolean autoGainControl;
    boolean noiseSuppression;
    double latency;
    long channelCount;
    DOMString deviceId;
    DOMString groupId;
};
MediaTrackSettings Members
width of type long
See width for details.
height of type long
See height for details.
aspectRatio of type double
See aspectRatio for details.
frameRate of type double
See frameRate for details.
facingMode of type DOMString
See facingMode for details.
volume of type double
See volume for details.
sampleRate of type long
See sampleRate for details.
sampleSize of type long
See sampleSize for details.
echoCancellation of type boolean
See echoCancellation for details.
autoGainControl of type boolean
See autoGainControl for details.
noiseSuppression of type boolean
See noiseSuppression for details.
latency of type double
See latency for details.
channelCount of type long
See channelCount for details.
deviceId of type DOMString
See deviceId for details.
groupId of type DOMString
See groupId for details.
The names of the initial set of constrainable properties for MediaStreamTrack are defined below.
The following constrainable properties are defined to apply to both video and audio MediaStreamTrack objects:
Property Name | Values | Notes |
---|---|---|
deviceId | DOMString | The origin-unique identifier for the source of the MediaStreamTrack. The same identifier MUST be valid between browsing sessions of this origin, but MUST also be different for other origins. Some sort of GUID is recommended for the identifier. Note that the setting of this property is uniquely determined by the source that is attached to the MediaStreamTrack. In particular, getCapabilities() will return only a single value for deviceId. This property can therefore be used for initial media selection with getUserMedia(). However, it is not useful for subsequent media control with applyConstraints(), since any attempt to set a different value will result in an unsatisfiable ConstraintSet. |
groupId | DOMString | The browsing session-unique group identifier for the source of the MediaStreamTrack. Two devices have the same group identifier if they belong to the same physical device; for example, the audio input and output devices representing the speaker and microphone of the same headset would have the same groupId. Note that the setting of this property is uniquely determined by the source that is attached to the MediaStreamTrack. In particular, getCapabilities() will return only a single value for groupId. Since this property is not stable between browsing sessions, its usefulness for initial media selection with getUserMedia() is limited. It is not useful for subsequent media control with applyConstraints(), since any attempt to set a different value will result in an unsatisfiable ConstraintSet. |
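The groupId property makes it possible to pair devices that belong to the same physical device, such as the microphone and speaker of a headset. The sketch below is illustrative only: groupByGroupId is a hypothetical helper, the device entries are invented plain objects, and the kind field is an assumption used for readability.

```javascript
// Group device descriptions by their groupId.
function groupByGroupId(devices) {
  const groups = new Map();
  for (const device of devices) {
    if (!groups.has(device.groupId)) {
      groups.set(device.groupId, []);
    }
    groups.get(device.groupId).push(device);
  }
  return groups;
}

// Invented sample data; real identifiers are opaque strings.
const devices = [
  { deviceId: "mic-1", groupId: "headset", kind: "audioinput" },
  { deviceId: "spk-1", groupId: "headset", kind: "audiooutput" },
  { deviceId: "cam-1", groupId: "webcam", kind: "videoinput" },
];

const groups = groupByGroupId(devices);
const headsetKinds = groups.get("headset").map((device) => device.kind);
```

Because groupId is only unique within a browsing session, a grouping like this should be rebuilt in each session rather than stored.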
The following constrainable properties are defined to apply only to video MediaStreamTrack objects:
Property Name | Values | Notes |
---|---|---|
width | ConstrainLong | The width or width range, in pixels. As a capability, the range should span the video source's pre-set width values with min being the smallest width and max being the largest width. |
height | ConstrainLong | The height or height range, in pixels. As a capability, the range should span the video source's pre-set height values with min being the smallest height and max being the largest height. |
frameRate | ConstrainDouble | The exact frame rate (frames per second) or frame rate range. If this frame rate cannot be determined (e.g. the source does not natively provide a frame rate, or the frame rate cannot be determined from the source stream), then this value MUST refer to the User Agent's vsync display rate. |
aspectRatio | ConstrainDouble | The exact aspect ratio (width in pixels divided by height in pixels, represented as a double rounded to the tenth decimal place) or aspect ratio range. |
facingMode | ConstrainDOMString | This string (or each string, when a list) should be one of the members of VideoFacingModeEnum. The members describe the directions that the camera can face, as seen from the user's perspective. Note that getConstraints() may not return exactly the same string for strings not in this enum. This preserves the possibility of using a future version of WebIDL enum for this property. |
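The aspectRatio definition (width divided by height, rounded to the tenth decimal place) can be worked through in a few lines. This is a sketch for illustration; aspectRatio here is a hypothetical helper function, not part of the API.

```javascript
// Compute width / height rounded to the tenth decimal place.
function aspectRatio(width, height) {
  if (!(width > 0) || !(height > 0)) {
    throw new RangeError("width and height must be positive");
  }
  const scale = 1e10; // ten decimal places
  return Math.round((width / height) * scale) / scale;
}

const ratio4by3 = aspectRatio(800, 600);    // 1.3333333333
const ratio16by9 = aspectRatio(1920, 1080); // 1.7777777778
const ratio16by10 = aspectRatio(1920, 1200); // 1.6
```

Rounding to a fixed number of places means two sources with the same nominal ratio report the same value, which matters when a script compares a setting against an exact constraint.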
enum VideoFacingModeEnum
{
"user",
"environment",
"left",
"right"
};
VideoFacingModeEnum Enumeration description | |
---|---|
user | The source is facing toward the user (a self-view camera). |
environment | The source is facing away from the user (viewing the environment). |
left | The source is facing to the left of the user. |
right | The source is facing to the right of the user. |
Below is an illustration of the video facing modes in relation to the user.
The following constrainable properties are defined to apply only to audio MediaStreamTrack objects:
Property Name | Values | Notes |
---|---|---|
volume | ConstrainDouble | The volume or volume range, as a multiplier of the linear audio sample values. A volume of 0.0 is silence, while a volume of 1.0 is the maximum supported volume. A volume of 0.5 will result in an approximately 6 dB reduction in the sound pressure level relative to the maximum volume. Note that any ConstraintSet that specifies values outside of this range of 0 to 1 can never be satisfied. |
sampleRate | ConstrainLong | The sample rate in samples per second for the audio data. |
sampleSize | ConstrainLong | The linear sample size in bits. This constraint can only be satisfied for audio devices that produce linear samples. |
echoCancellation | ConstrainBoolean | When one or more audio streams are being played while microphones are capturing audio, it is often desirable to attempt to remove the sound being played from the input signals recorded by the microphones. This is referred to as echo cancellation. There are cases where it is not needed and it is desirable to turn it off so that no audio artifacts are introduced. This allows applications to control this behavior. |
autoGainControl | ConstrainBoolean | Automatic gain control is often desirable on the input signal recorded by the microphone. There are cases where it is not needed and it is desirable to turn it off so that the audio is not altered. This allows applications to control this behavior. |
noiseSuppression | ConstrainBoolean | Noise suppression is often desirable on the input signal recorded by the microphone. There are cases where it is not needed and it is desirable to turn it off so that the audio is not altered. This allows applications to control this behavior. |
latency | ConstrainDouble | The latency or latency range, in seconds. The latency is the time between start of processing (for instance, when sound occurs in the real world) to the data being available to the next step in the process. Low latency is critical for some applications; high latency may be acceptable for other applications because it helps with power constraints. The number is expected to be the target latency of the configuration; the actual latency may show some variation from that. |
channelCount | ConstrainLong | The number of independent channels of sound that the audio data contains, i.e. the number of audio samples per sample frame. |
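The relationship between a volume multiplier and the roughly 6 dB figure quoted for volume can be checked with the usual amplitude formula, 20 times the base-10 logarithm of the multiplier. The helper below is a hypothetical illustration and not part of the API.

```javascript
// Convert a linear sample multiplier in the range 0..1 into a level
// change in decibels relative to the maximum volume.
function multiplierToDecibels(multiplier) {
  if (!(multiplier >= 0 && multiplier <= 1)) {
    // Values outside 0..1 can never be satisfied as a volume constraint.
    throw new RangeError("volume must be between 0 and 1");
  }
  return 20 * Math.log10(multiplier);
}

const halfVolumeDb = multiplierToDecibels(0.5); // about -6.02
const fullVolumeDb = multiplierToDecibels(1.0); // 0, the reference level
const silenceDb = multiplierToDecibels(0.0);    // -Infinity
```

Halving the multiplier therefore lowers the level by about 6 dB, and each further halving lowers it by the same amount again.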
The addtrack and removetrack events use the MediaStreamTrackEvent interface.
The addtrack and removetrack events notify the script that the track set of a MediaStream has been updated by the User Agent.
Firing a track event named e with a MediaStreamTrack track means that an event with the name e, which does not bubble (except where otherwise stated) and is not cancelable (except where otherwise stated), and which uses the MediaStreamTrackEvent interface with the track attribute set to track, MUST be created and dispatched at the given target.
[Exposed=Window,
 Constructor(DOMString type, MediaStreamTrackEventInit eventInitDict)]
interface MediaStreamTrackEvent : Event {
    [SameObject]
    readonly attribute MediaStreamTrack track;
};
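The "fire a track event" definition can be sketched with the standard Event and EventTarget classes. This is an assumption-laden illustration, not the User Agent's implementation: ModelTrackEvent and fireTrackEvent are hypothetical names, a plain EventTarget stands in for a MediaStream, and the track is an invented plain object. It assumes an environment where Event and EventTarget are available as globals.

```javascript
// Hypothetical stand-in for MediaStreamTrackEvent.
class ModelTrackEvent extends Event {
  constructor(type, eventInitDict) {
    if (!eventInitDict || eventInitDict.track === undefined) {
      throw new TypeError("eventInitDict.track is required");
    }
    // bubbles and cancelable default to false, matching the definition.
    super(type, eventInitDict);
    Object.defineProperty(this, "track", {
      value: eventInitDict.track,
      enumerable: true,
    });
  }
}

// Create the event with its track attribute set and dispatch it at the target.
function fireTrackEvent(name, track, target) {
  return target.dispatchEvent(new ModelTrackEvent(name, { track }));
}

const stream = new EventTarget(); // stands in for a MediaStream
const received = [];
stream.addEventListener("addtrack", (event) => {
  received.push({
    id: event.track.id,
    bubbles: event.bubbles,
    cancelable: event.cancelable,
  });
});

fireTrackEvent("addtrack", { id: "track-1", kind: "video" }, stream);
```

The listener observes an event that neither bubbles nor can be canceled and that carries the track it refers to.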
MediaStreamTrackEvent
Constructs a new MediaStreamTrackEvent.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | |
eventInitDict | MediaStreamTrackEventInit | ✘ | ✘ | |
track of type MediaStreamTrack, readonly
The track attribute represents the MediaStreamTrack object associated with the event.
dictionary MediaStreamTrackEventInit : EventInit {
    required MediaStreamTrack track;
};
MediaStreamTrackEventInit Members
track of type MediaStreamTrack, required
This section is non-normative.
Browsers provide a media pipeline from sources to sinks. In a browser, sinks are the <img>, <video>, and <audio> tags. Traditional sources include streamed content, files, and web resources. The media produced by these sources typically does not change over time - these sources can be considered to be static.
The sinks that display these sources to the user (the actual tags themselves) have a variety of controls for manipulating the source content. For example, an <img> tag scales down a huge source image of 1600x1200 pixels to fit in a rectangle defined with width="400" and height="300".
The getUserMedia API adds dynamic sources such as microphones and cameras - the characteristics of these sources can change in response to application needs. These sources can be considered to be dynamic in nature. A <video> element that displays media from a dynamic source can either perform scaling or it can feed back information along the media pipeline and have the source produce content more suitable for display.
Note: This sort of feedback loop is obviously just enabling an "optimization", but it's a non-trivial gain. This optimization can save battery, allow for less network congestion, etc...
Note that MediaStream sinks (such as <video>, <audio>, and even RTCPeerConnection) will continue to have mechanisms to further transform the source stream beyond that which the Settings, Capabilities, and Constraints described in this specification offer. (The sink transformation options, including those of RTCPeerConnection, are outside the scope of this specification.)
The act of changing or applying a track constraint may affect the settings of all tracks sharing that source and consequently all down-level sinks that are using that source. Many sinks may be able to take these changes in stride, such as the <video> element or RTCPeerConnection. Others like the Recorder API may fail as a result of a source setting change.
The RTCPeerConnection is an interesting object because it acts simultaneously as both a sink and a source for over-the-network streams. As a sink, it has source transformational capabilities (e.g., lowering bit-rates, scaling-up / down resolutions, and adjusting frame-rates), and as a source it could have its own settings changed by a track source.
To illustrate how changes to a given source impact various sinks, consider the following example. This example only uses width and height, but the same principles apply to all of the Settings exposed in this specification. In the first figure a home client has obtained a video source from its local video camera. The source's width and height settings are 800 pixels and 600 pixels, respectively. Three MediaStream objects on the home client contain tracks that use this same deviceId. The three media streams are connected to three different sinks: a <video> element (A), another <video> element (B), and a peer connection (C). The peer connection is streaming the source video to a remote client. On the remote client there are two media streams with tracks that use the peer connection as a source. These two media streams are connected to two <video> element sinks (Y and Z).
Note that at this moment, all of the sinks on the home client must apply a transformation to the original source's provided dimension settings. B is scaling the video down, A is scaling the video up (resulting in loss of quality), and C is also scaling the video up slightly for sending over the network. On the remote client, sink Y is scaling the video way down, while sink Z is not applying any scaling.
In response to applyConstraints() being called, one of the tracks wants a higher resolution (1920 by 1200 pixels) from the home client's video source.
Note that the source change immediately affects all of the tracks and sinks on the home client, but does not impact any of the sinks (or sources) on the remote client. With the increase in the home client source video's dimensions, sink A no longer has to perform any scaling, while sink B must scale down even further than before. Sink C (the peer connection) must now scale down the video in order to keep the transmission constant to the remote client.
While not shown, an equally valid settings change request could be made on the remote client's side. In addition to impacting sink Y and Z in the same manner as A, B and C were impacted earlier, it could lead to re-negotiation with the peer connection on the home client in order to alter the transformation that it is applying to the home client's video source. Such a change is NOT REQUIRED to change anything related to sink A or B or the home client's video source.
Note that this specification does not define a mechanism by which a change to the remote client's video source could automatically trigger a change to the home client's video source. Implementations may choose to make such source-to-sink optimizations as long as they only do so within the constraints established by the application, as the next example demonstrates.
It is fairly obvious that changes to a given source will impact sink consumers. However, in some situations changes to a given sink may also cause implementations to adjust a source's settings. This is illustrated in the following figures. In the first figure below, the home client's video source is sending a video stream sized at 1920 by 1200 pixels. The video source is also unconstrained, such that the exact source dimensions are flexible as far as the application is concerned. Two MediaStream objects contain tracks with the same deviceId, and those MediaStreams are connected to two different <video> element sinks A and B. Sink A has been sized to width="1920" and height="1200" and is displaying the source's video content without any transformations. Sink B has been sized smaller and, as a result, is scaling the video down to fit its rectangle of 320 pixels across by 200 pixels down.
When the application changes sink A to a smaller dimension (from 1920 to 1024 pixels wide and from 1200 to 768 pixels tall), the browser's media pipeline may recognize that none of its sinks require the higher source resolution, and needless work is being done both on the part of the source and sink A. In such a case and without any other constraints forcing the source to continue producing the higher resolution video, the media pipeline MAY change the source resolution:
In the above figure, the home client's video source resolution was changed to the greater of that from sink A and B in order to optimize playback. While not shown above, the same behavior could apply to peer connections and other sinks.
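The optional optimization described here, in which the source resolution follows the greater of the sink sizes, can be sketched as a small function. This is a simplified illustration under stated assumptions: chooseSourceResolution is a hypothetical name, width and height are maximized independently, and a real source would normally pick from a set of preset modes rather than arbitrary dimensions.

```javascript
// Choose the smallest resolution that still covers every sink, capped at
// what the source can deliver.
function chooseSourceResolution(sinks, sourceMax) {
  let width = 0;
  let height = 0;
  for (const sink of sinks) {
    width = Math.max(width, sink.width);
    height = Math.max(height, sink.height);
  }
  return {
    width: Math.min(width, sourceMax.width),
    height: Math.min(height, sourceMax.height),
  };
}

// Sink sizes taken from the example: A was resized to 1024 by 768, B is 320 by 200.
const sinkA = { width: 1024, height: 768 };
const sinkB = { width: 320, height: 200 };
const chosen = chooseSourceResolution([sinkA, sinkB], { width: 1920, height: 1200 });
```

With these inputs the source can drop from 1920 by 1200 to 1024 by 768: sink A then needs no scaling and sink B continues to scale down.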
It is possible that constraints can be applied to a track which a source is unable to satisfy, either because the source itself cannot satisfy the constraint or because the source is already satisfying a conflicting constraint. When this happens, the promise returned from applyConstraints() will be rejected, without applying any of the new constraints. Since no change in constraints occurs in this case, there is also no required change to the source itself as a result of this condition. Here is an example of this behavior.
In this example, two media streams each have a video track, and both tracks share the same source. The first track initially has no constraints applied. It is connected to sink N. Sink N has a resolution of 800 by 600 pixels and is scaling down the source's resolution of 1024 by 768 to fit. The other track has a mandatory constraint forcing off the source's fill light; it is connected to sink P. Sink P has a width and height equal to that of the source.
Now, the first track adds a mandatory constraint that the fill light should be forced on. At this point, both mandatory constraints cannot be satisfied by the source (the fill light cannot be on and off at the same time). Since this state was caused by the first track's attempt to apply a conflicting constraint, the constraint application fails and there is no change to the source's settings or to the constraints on either track.
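The failure case in this example can be modelled in a few lines. The sketch is deliberately simplified and rests on assumptions: ModelSharedSource is a hypothetical class, fillLight is the illustrative property from the example rather than a constrainable property defined by this specification, only exact required values are modelled, and the result is returned synchronously where the real applyConstraints() returns a promise.

```javascript
// Hypothetical model of one source shared by several tracks.
class ModelSharedSource {
  constructor(settings) {
    this.settings = { ...settings };
    this.requiredByTrack = new Map(); // track id -> required exact values
  }

  // Return the first property whose required value conflicts with a value
  // required by a different track, or null when there is no conflict.
  findConflict(trackId, required) {
    for (const [otherId, other] of this.requiredByTrack) {
      if (otherId === trackId) continue;
      for (const name of Object.keys(required)) {
        if (name in other && other[name] !== required[name]) {
          return name;
        }
      }
    }
    return null;
  }

  applyConstraints(trackId, required) {
    const conflict = this.findConflict(trackId, required);
    if (conflict !== null) {
      // On failure nothing changes: not the settings, not any constraints.
      return { ok: false, constraint: conflict };
    }
    this.requiredByTrack.set(trackId, { ...required });
    Object.assign(this.settings, required);
    return { ok: true };
  }
}

const source = new ModelSharedSource({ width: 1024, height: 768, fillLight: false });
const secondOutcome = source.applyConstraints("second", { fillLight: false });
const firstOutcome = source.applyConstraints("first", { fillLight: true });
```

The second call fails and names the conflicting property, while the source keeps its fill light off and the first track still has no constraints recorded.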
Let's look at a slightly different situation starting from the same point. In this case, instead of the first track attempting to apply a conflicting constraint, the user physically locks the camera into a mode where the fill light is on. At this point the source can no longer satisfy the second track's mandatory constraint that the fill light be off. The second track is transitioned into the muted state and receives an overconstrained event. At the same time, the source notes that its remaining active sink only requires a resolution of 800 by 600 and so it adjusts its resolution down to match (this is an optional optimization that the User Agent is allowed to make given the situation).
At this point, it is the responsibility of the application to address the problem that led to the overconstrained situation, perhaps by removing the fill light mandatory constraint on the second track or by closing the second track altogether and informing the user.
A MediaStream may be assigned to media elements. A MediaStream is not preloadable or seekable and represents a simple, potentially infinite, linear media timeline. The timeline starts at 0 and increments linearly in real time as long as the MediaStream is playing. The timeline does not increment when the playout of the MediaStream is paused.
User Agents that support this specification MUST support the srcObject attribute of the HTMLMediaElement interface defined in [HTML51], which includes support for playing MediaStream objects.
The [HTML51] document outlines how the HTMLMediaElement works with a media provider object. The following applies when the media provider object is a MediaStream:
- Whenever an [HTML51] AudioTrack or a VideoTrack is created, the id and label attributes must be initialized to the corresponding attributes of the MediaStreamTrack, the kind attribute must be initialized to "main" and the language attribute to the empty string.
- MediaStream and MUST NOT buffer.
- Since the order in the MediaStream's track set is undefined, no requirements are put on how the AudioTrackList and VideoTrackList are ordered.
- When the MediaStream state moves from the active to the inactive state, the User Agent MUST raise an ended event on the HTMLMediaElement and set its ended attribute to true. Note that once ended equals true the HTMLMediaElement will not play media even if new MediaStreamTracks are added to the MediaStream (causing it to return to the active state) unless autoplay is true or the web application restarts the element, e.g., by calling play().
- Any calls to the fastSeek method on a HTMLMediaElement must be ignored.
The nature of the MediaStream places certain restrictions on the behavior and attribute values of the associated HTMLMediaElement and on the operations that can be performed on it, as shown below:
Attribute Name | Attribute Type | Valid Values When Using a MediaStream | Additional considerations |
---|---|---|---|
preload | DOMString | On getting: none. On setting: ignored. | A MediaStream cannot be preloaded. |
buffered | TimeRanges | buffered.length MUST return 0. | A MediaStream cannot be preloaded. Therefore, the amount buffered is always an empty TimeRange. |
currentTime | double | Any non-negative integer. The initial value is 0 and the value increments linearly in real time whenever the stream is playing. | The value is the current stream position, in seconds. On any attempt to set this attribute, the User Agent must throw an InvalidStateError exception. |
seeking | boolean | false | A MediaStream is not seekable. Therefore, this attribute MUST always have the value false. |
defaultPlaybackRate | double | On setting: ignored. On getting: return 1.0 | A MediaStream is not seekable. Therefore, this attribute MUST always have the value 1.0 and any attempt to alter it MUST be ignored. Note that this also means that the ratechange event will not fire. |
playbackRate | double | 1.0 | A MediaStream is not seekable. Therefore, this attribute MUST always have the value 1.0 and any attempt to alter it MUST be ignored. Note that this also means that the ratechange event will not fire. |
played | TimeRanges | played.length MUST return 1. played.start(0) MUST return 0. played.end(0) MUST return the last known currentTime. | A MediaStream's timeline always consists of a single range, starting at 0 and extending up to the currentTime. |
seekable | TimeRanges | seekable.length MUST return 0. | A MediaStream is not seekable. |
loop | boolean | true, false | Setting the loop attribute has no effect since a MediaStream has no defined end and therefore cannot be looped. |
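The values required for buffered, played and seekable can be summarized with a small model. This is an illustrative sketch only: modelTimeRangesFor and makeRanges are hypothetical helpers, plain objects stand in for TimeRanges, and a RangeError stands in for the exception a real TimeRanges object would throw for an out-of-range index.

```javascript
// Build a TimeRanges-like object from a list of [start, end] pairs.
function makeRanges(pairs) {
  const check = (index) => {
    if (!Number.isInteger(index) || index < 0 || index >= pairs.length) {
      throw new RangeError("index out of range");
    }
  };
  return {
    length: pairs.length,
    start(index) { check(index); return pairs[index][0]; },
    end(index) { check(index); return pairs[index][1]; },
  };
}

// Model the three range attributes of a media element playing a MediaStream.
function modelTimeRangesFor(lastKnownCurrentTime) {
  return {
    buffered: makeRanges([]),                        // nothing is ever buffered
    seekable: makeRanges([]),                        // a MediaStream is not seekable
    played: makeRanges([[0, lastKnownCurrentTime]]), // one range from 0 to currentTime
  };
}

const ranges = modelTimeRangesFor(12.5);
const playedEnd = ranges.played.end(0);
```

The single played range grows as the stream plays, while buffered and seekable stay empty for the lifetime of the element.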
When applicable, the behavior outlined above for HTMLMediaElement carries over to MediaControllers.
This section and its subsections extend the list of Error subclasses defined in [ES6] following the pattern for NativeError in section 19.5.6 of that specification. Assume the following:
The following terms used in this section are defined in [ES6].
Term/Notation | Section in [ES6] |
---|---|
Type(X) | 6 |
intrinsic object | 6.1.7.4 |
[[ErrorData]] | 19.5.1 |
internal slot | 6.1.7.2 |
NewTarget | various uses, but no definition |
active function object | 8.3 |
OrdinaryCreateFromConstructor() | 9.1.14 |
ReturnIfAbrupt() | 6.2.2.4 |
Assert | 5.2 |
String | 4.3.17-19, depending on context |
PropertyDescriptor | 6.2.4 |
[[Value]] | 6.1.7.1 |
[[Writable]] | 6.1.7.1 |
[[Enumerable]] | 6.1.7.1 |
[[Configurable]] | 6.1.7.1 |
DefinePropertyOrThrow() | 7.3.7 |
abrupt completion | 6.2.2 |
ToString() | 7.1.12 |
[[Prototype]] | 9.1 |
%Error% | 19.5.1 |
Error | 19.5 |
%ErrorPrototype% | 19.5.3 |
Object.prototype.toString | 19.1.3.6 |
The OverconstrainedError Constructor is the %OverconstrainedError% intrinsic object. When OverconstrainedError is called as a function rather than as a constructor, it creates and initializes a new OverconstrainedError object. A call of the object as a function is equivalent to calling it as a constructor with the same arguments. Thus the function call OverconstrainedError(...) is equivalent to the object creation expression new OverconstrainedError(...) with the same arguments.
The OverconstrainedError constructor is designed to be subclassable. It may be used as the value of an extends clause of a class definition. Subclass constructors that intend to inherit the specified OverconstrainedError behaviour must include a super call to the OverconstrainedError constructor to create and initialize the subclass instance with an [[ErrorData]] internal slot.
When the OverconstrainedError function is called with arguments constraint and message, the following steps are taken:
"%OverconstrainedErrorPrototype%", «[[ErrorData]]» ).
constraint", constraintDesc).
message", msgDesc).
The value of the [[Prototype]] internal slot of the OverconstrainedError constructor is the intrinsic object %Error%.
Besides the length property (whose value is 1), the OverconstrainedError constructor has the following properties:
The initial value of OverconstrainedError.prototype is the OverconstrainedError prototype object. This property has the attributes { [[Writable]]: false, [[Enumerable]]: false, [[Configurable]]: false }.
The OverconstrainedError prototype object is an ordinary object. It is not an Error instance and does not have an [[ErrorData]] internal slot.
The value of the [[Prototype]] internal slot of the OverconstrainedError prototype object is the intrinsic object %ErrorPrototype%.
The initial value of the constructor property of the prototype for the OverconstrainedError constructor is the intrinsic object %OverconstrainedError%.
The initial value of the constraint property of the prototype for the OverconstrainedError constructor is the empty String.
The initial value of the message property of the prototype for the OverconstrainedError constructor is the empty String.
The initial value of the name property of the prototype for the OverconstrainedError constructor is "OverconstrainedError".
OverconstrainedError instances are ordinary objects that inherit properties from the OverconstrainedError prototype object and have an [[ErrorData]] internal slot whose value is undefined. The only specified use of [[ErrorData]] is by Object.prototype.toString ([ES6], section 19.1.3.6) to identify instances of Error or its various subclasses.
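The observable shape described in this section can be approximated in script. The following is a sketch under stated assumptions and not the normative algorithm: ModelOverconstrainedError is a hypothetical name, the handling of undefined arguments is assumed to follow the NativeError pattern, and property attributes other than those stated in this section are guesses.

```javascript
// Callable with or without `new`, and usable in an `extends` clause.
function ModelOverconstrainedError(constraint, message) {
  const newTarget = new.target || ModelOverconstrainedError;
  // Reflect.construct(Error, ...) yields an object with the [[ErrorData]]
  // internal slot whose prototype comes from newTarget.prototype.
  const args = message === undefined ? [] : [String(message)];
  const error = Reflect.construct(Error, args, newTarget);
  if (constraint !== undefined) {
    Object.defineProperty(error, "constraint", {
      value: String(constraint),
      writable: true,
      enumerable: false,
      configurable: true,
    });
  }
  return error;
}

// The constructor inherits from Error; its prototype inherits from Error.prototype.
Object.setPrototypeOf(ModelOverconstrainedError, Error);
ModelOverconstrainedError.prototype = Object.create(Error.prototype, {
  constructor: { value: ModelOverconstrainedError, writable: true, configurable: true },
  name: { value: "OverconstrainedError", writable: true, configurable: true },
  constraint: { value: "", writable: true, configurable: true },
  message: { value: "", writable: true, configurable: true },
});

const viaCall = ModelOverconstrainedError("width", "no mode is that wide");
const viaNew = new ModelOverconstrainedError("width", "no mode is that wide");

class ModelSubError extends ModelOverconstrainedError {}
const viaSubclass = new ModelSubError("height", "too tall");
```

Both call styles produce an object that reports the name "OverconstrainedError", carries the constraint name, and is recognized as an Error by Object.prototype.toString.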
The following interface is defined for cases when an OverconstrainedError is raised as an event:
[Exposed=Window,
 Constructor(DOMString type, OverconstrainedErrorEventInit eventInitDict)]
interface OverconstrainedErrorEvent : Event {
    readonly attribute OverconstrainedError? error;
};
OverconstrainedErrorEvent
Constructs a new OverconstrainedErrorEvent.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | |
eventInitDict | OverconstrainedErrorEventInit | ✘ | ✘ | |
error of type OverconstrainedError, readonly, nullable
The OverconstrainedError describing the error that triggered the event (if any).
dictionary OverconstrainedErrorEventInit : EventInit {
    OverconstrainedError? error = null;
};
OverconstrainedErrorEventInit Members
error of type OverconstrainedError, nullable, defaulting to null
The OverconstrainedError describing the error associated with the event (if any).
This section is non-normative.
The following events fire on MediaStream objects:
Event name | Interface | Fired when... |
---|---|---|
addtrack | MediaStreamTrackEvent | A new MediaStreamTrack has been added to this stream. Note that this event is not fired when the script directly modifies the tracks of a MediaStream. |
removetrack | MediaStreamTrackEvent | A MediaStreamTrack has been removed from this stream. Note that this event is not fired when the script directly modifies the tracks of a MediaStream. |
The following events fire on MediaStreamTrack objects:
Event name | Interface | Fired when... |
---|---|---|
mute | Event | The MediaStreamTrack object's source is temporarily unable to provide data. |
unmute | Event | The MediaStreamTrack object's source is live again after having been temporarily unable to provide data. |
overconstrained | OverconstrainedErrorEvent | This error event fires for each affected track (when multiple tracks share the same source) after the User Agent has evaluated the current constraints against a given source and is not able to configure the source within the limitations established by the intersection of imposed constraints. Due to being over-constrained, the User Agent must mute each affected track. The affected track(s) will remain muted until the application adjusts the constraints to accommodate the source's current effective capabilities. |
ended | Event | The |
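The track events in the table above could be observed roughly as follows. This is a minimal sketch: watchTrack and its report callback are hypothetical names, and the track is passed in as a parameter so that the code does not assume a particular way of obtaining it.

```javascript
// Attach listeners for the track events listed in the table above.
// `track` is anything with addEventListener, normally a MediaStreamTrack.
// `report` is a hypothetical callback supplied by the application.
function watchTrack(track, report) {
  track.addEventListener("mute", () => report("muted"));
  track.addEventListener("unmute", () => report("live"));
  track.addEventListener("ended", () => report("ended"));
  track.addEventListener("overconstrained", (event) => {
    // event.error is nullable, so guard before reading the constraint name.
    const name = event.error ? event.error.constraint : "";
    report("overconstrained:" + name);
  });
}

// Possible usage in a page (not executed here):
//   watchTrack(stream.getVideoTracks()[0], console.log);
```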
The following events fire on MediaDevices objects:
Event name | Interface | Fired when... |
---|---|---|
devicechange | Event | The set of media devices available to the User Agent has changed. The current list of devices can be retrieved with the enumerateDevices() method. |
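One way an application might react to devicechange is to re-run enumerateDevices() each time the event fires. The sketch below assumes this approach; watchDeviceList and onList are hypothetical names, and the MediaDevices object is passed in rather than read from a global.

```javascript
// Re-enumerate devices whenever the User Agent reports a change.
// `mediaDevices` is normally navigator.mediaDevices; it is passed in so the
// sketch does not assume a browser global. `onList` is a hypothetical callback.
function watchDeviceList(mediaDevices, onList) {
  const refresh = () => mediaDevices.enumerateDevices().then(onList);
  mediaDevices.addEventListener("devicechange", refresh);
  // Return the initial enumeration so callers can wait for the first list.
  return refresh();
}

// Possible usage in a page (not executed here):
//   watchDeviceList(navigator.mediaDevices, (list) => console.log(list.length));
```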
This section describes an API that the script can use to query the User Agent about connected media input and output devices (for example a web camera or a headset).
The MediaDevices object is the entry point to the API used to examine and get access to media devices available to the User Agent.
On page load, run the following steps:
On the relevant global object, run the following steps:
Create three internal slots: [[devicesLiveMap]], [[devicesAccessibleMap]], and [[kindsAccessibleMap]], each initialized to a different empty object.
Create one internal slot: [[storedDeviceList]], initialized to null.
For each kind of device, kind, that getUserMedia() exposes, set [[kindsAccessibleMap]][kind] either to true if the result of retrieving the permission state of the permission associated with kind (e.g. "camera", "microphone") is "granted", or to false otherwise.
For each individual device that getUserMedia() exposes, using the device's deviceId, deviceId, set [[devicesLiveMap]][deviceId] to false, and set [[devicesAccessibleMap]][deviceId] either to true if the result of retrieving the permission state of the permission associated with the device's kind and deviceId is "granted", or to false otherwise.
For each kind of device, kind, that getUserMedia() exposes, whenever a transition occurs of the permission state of the permission associated with kind, run the following steps:
If the transition is to "granted" from another value, then set [[kindsAccessibleMap]][kind] to true.
If the transition is from "granted" to another value, then set [[kindsAccessibleMap]][kind] to false.
For each device that getUserMedia() exposes, whenever a transition occurs of the permission state of the permission associated with the device's kind and the device's deviceId, deviceId, run the following steps:
If the transition is to "granted" from another value, then set [[devicesAccessibleMap]][deviceId] to true, if it isn't already true.
If the transition is from "granted" to another value, and the device is currently stopped, then set [[devicesAccessibleMap]][deviceId] to false.
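The transition steps above can be sketched with plain objects standing in for the internal slots. This is only an illustration: scripts cannot observe the real slots, and the function names and the isStopped flag are hypothetical.

```javascript
// Sketch of the per-kind transition steps. `kindsAccessibleMap` is a plain
// object standing in for the [[kindsAccessibleMap]] internal slot.
function onKindTransition(kindsAccessibleMap, kind, from, to) {
  if (to === "granted" && from !== "granted") kindsAccessibleMap[kind] = true;
  if (from === "granted" && to !== "granted") kindsAccessibleMap[kind] = false;
}

// Sketch of the per-device transition steps. `isStopped` is a hypothetical
// flag telling whether the device is currently stopped.
function onDeviceTransition(devicesAccessibleMap, deviceId, from, to, isStopped) {
  if (to === "granted" && from !== "granted") devicesAccessibleMap[deviceId] = true;
  // Access is only withdrawn once the device is no longer in use.
  if (from === "granted" && to !== "granted" && isStopped) {
    devicesAccessibleMap[deviceId] = false;
  }
}
```

Note the asymmetry: a per-kind revocation takes effect immediately, while a per-device revocation only clears the entry when the device is stopped.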
When new media input and/or output devices are made available, or any available input and/or output device becomes unavailable, the User Agent MUST run the following steps in browsing contexts where at least one of the following criteria are met, but in no other contexts:
Set [[storedDeviceList]] to null.
Queue a task that fires a simple event named devicechange at the MediaDevices object.
If a browsing context later comes to meet the criteria (e.g. gains focus), the User Agent MUST execute the steps at that time.
The User Agent MAY combine firing multiple events into firing one event when several events are due or when multiple devices are added or removed at the same time, e.g. a camera with a microphone.
These events are potentially triggered simultaneously across browsing contexts on different origins; user agents MAY add fuzzing on the timing of events to avoid cross-origin activity correlation.
[Exposed=Window]
interface MediaDevices : EventTarget {
    attribute EventHandler ondevicechange;
    Promise<sequence<MediaDeviceInfo>> enumerateDevices();
};
ondevicechange of type EventHandler
The event type of this event handler is devicechange.
enumerateDevices
Collects information about the User Agent's available media input and output devices.
This method returns a promise. The promise will be fulfilled with a sequence of MediaDeviceInfo dictionaries representing the User Agent's available media input and output devices if enumeration is successful.
Elements of this sequence that represent input devices will be of type InputDeviceInfo which extends MediaDeviceInfo.
Camera and microphone sources should be enumerable. Specifications that add additional types of source will provide recommendations about whether the source type should be enumerable.
When the enumerateDevices() method is called, the User Agent must run the following steps:
Let p be a new promise.
Run the following steps in parallel:
If [[storedDeviceList]] is not null, then let resultList be a copy of [[storedDeviceList]], and jump to the step labeled Complete Enumeration.
Let resultList be an empty list.
If this method has been called previously within this browsing session, let oldList be the list of MediaDeviceInfo objects that was produced at that call (resultList); otherwise, let oldList be an empty list.
Probe the User Agent for available media devices, and run the following sub steps for each discovered device, device:
If device is represented by a MediaDeviceInfo object in oldList, append that object to resultList, abort these steps and continue with the next device (if any).
Let deviceInfo be a new MediaDeviceInfo object to represent device.
If a stored deviceId exists for device, initialize deviceInfo's deviceId to that value. Otherwise, let deviceInfo's deviceId member be a newly generated unique identifier.
If device belongs to the same physical device as a device already represented in oldList or resultList, initialize deviceInfo's groupId member to the groupId value of the existing MediaDeviceInfo object. Otherwise, let deviceInfo's groupId member be a newly generated unique identifier.
Append deviceInfo to resultList.
Set [[storedDeviceList]] to resultList.
Complete Enumeration: run the following sub steps to resolve p:
If any of the local devices are attached to a live MediaStreamTrack in the current browsing context, set list-permission to "granted", otherwise set list-permission to the result of retrieving the permission state of the "device-info" permission.
If list-permission is not "granted", let filteredList be a copy of resultList, and all its elements, where the label member is the empty string.
If filteredList is a non-empty list, then resolve p with filteredList. Otherwise, resolve p with resultList.
Return p.
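The label filtering performed in the Complete Enumeration sub steps might look roughly like this. The sketch is not how any User Agent implements it: completeEnumeration, hasLiveTrack and deviceInfoPermission are hypothetical names standing in for state that only the User Agent knows, and device entries are modelled as plain objects.

```javascript
// Sketch of the "Complete Enumeration" sub steps.
// hasLiveTrack: whether any local device is attached to a live track here.
// deviceInfoPermission: the state of the "device-info" permission.
function completeEnumeration(resultList, hasLiveTrack, deviceInfoPermission) {
  const listPermission = hasLiveTrack ? "granted" : deviceInfoPermission;
  if (listPermission === "granted") return resultList;
  // Copy every element and blank out its label; the originals stay untouched.
  return resultList.map((info) => ({ ...info, label: "" }));
}
```

Copying the elements, rather than clearing labels in place, keeps the stored list intact for a later call made after permission has been granted.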
Since this method returns persistent information across browsing sessions and origins via the number and grouping of media capture devices, it adds to the fingerprinting surface exposed by the user agent.
Once authorization has been granted to one of the capture devices, it provides additional persistent cross-origin information via the human-readable labels associated with available capture devices, which further adds to the fingerprinting surface.
Promise<sequence<MediaDeviceInfo>>
The algorithm described above means that the access to media device information depends on whether or not permission has been granted to the page's origin.
If no such access has been granted, the MediaDeviceInfo dictionary will contain the deviceId, kind, and groupId.
If access has been granted for a media device, the MediaDeviceInfo dictionary will contain the deviceId, kind, label, and groupId.
[Exposed=Window]
interface MediaDeviceInfo {
    readonly attribute DOMString deviceId;
    readonly attribute MediaDeviceKind kind;
    readonly attribute DOMString label;
    readonly attribute DOMString groupId;
    serializer = {attribute};
};
deviceId of type DOMString, readonly
A unique identifier for the represented device.
All enumerable devices have an identifier that MUST be unique to the page's origin. This identifier MUST be un-guessable by applications of other origins to prevent the identifier from being used to correlate the same user across different origins.
If any local devices have been attached to a live MediaStreamTrack in a page from this origin, or stored permission to access local devices has been granted to this origin, then this identifier MUST be persisted, except as detailed below. Unique and stable identifiers let the application save, identify the availability of, and directly request specific sources, across multiple visits.
However, as long as no local device has been attached to a live MediaStreamTrack in a page from this origin, and no stored permission to access local devices has been granted to this origin, then the user agent MAY clear this identifier once the last browsing session from this origin has been closed. If the user agent chooses not to clear the identifier in this condition, then it MUST provide for the user to visibly inspect and delete the identifier, like a cookie.
Since deviceId may persist across browsing sessions and to reduce its potential as a fingerprinting mechanism, deviceId is to be treated as other persistent storage mechanisms such as cookies [COOKIES], in that user agents MUST NOT persist device identifiers for sites that are blocked from using cookies, and user agents MUST reset per-origin device identifiers when other persistent storage is cleared.
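Because identifiers can be reset or cleared as described above, an application that remembers a deviceId should be prepared for it to disappear. The following sketch shows one possible way to handle that; constraintsForSavedCamera and savedId are hypothetical names, and where the application stores the identifier is left open.

```javascript
// Build getUserMedia() constraints for a camera remembered from an earlier visit.
// `savedId` is a deviceId the application stored itself; `devices` is the
// result of enumerateDevices().
function constraintsForSavedCamera(savedId, devices) {
  const stillPresent = devices.some(
    (d) => d.kind === "videoinput" && d.deviceId === savedId
  );
  // Fall back to any camera when the identifier was reset or the device is gone.
  return stillPresent
    ? { video: { deviceId: { exact: savedId } } }
    : { video: true };
}
```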
kind of type MediaDeviceKind, readonly
Describes the kind of the represented device.
label of type DOMString, readonly
A label describing this device (for example "External USB Webcam"). If the device has no associated label, then this attribute MUST return the empty string.
groupId of type DOMString, readonly
Returns the group identifier of the represented device. Two devices have the same group identifier if they belong to the same physical device; for example a monitor with a built-in camera and microphone.
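An application could use groupId to pair up inputs and outputs that live in the same physical device. A minimal sketch, assuming the entries come from enumerateDevices(); the function name is hypothetical:

```javascript
// Group enumerated devices by groupId so that, for example, a webcam and its
// built-in microphone end up in the same bucket.
function groupByPhysicalDevice(devices) {
  const groups = new Map();
  for (const d of devices) {
    if (!groups.has(d.groupId)) groups.set(d.groupId, []);
    groups.get(d.groupId).push(d);
  }
  return groups;
}
```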
Serializer
Instances of this interface are serialized as a map with entries for each of the serializable attributes.
enum MediaDeviceKind
{
"audioinput",
"audiooutput",
"videoinput"
};
MediaDeviceKind Enumeration description |
|
---|---|
audioinput |
Represents an audio input device; for example a microphone. |
audiooutput |
Represents an audio output device; for example a pair of headphones. |
videoinput |
Represents a video input device; for example a webcam. |
The InputDeviceInfo interface gives access to the capabilities of the input device it represents.
interface InputDeviceInfo : MediaDeviceInfo {
    MediaTrackCapabilities getCapabilities();
};
getCapabilities()
Returns a MediaTrackCapabilities object describing the primary audio or video track of a device's MediaStream (according to its kind value), in the absence of any user-supplied constraints. These capabilities MUST be identical to those that would have been obtained by calling getCapabilities() on the first MediaStreamTrack of this type in a MediaStream returned by getUserMedia({deviceId: id}) where id is the value of the deviceId attribute of this MediaDeviceInfo.
If no access has been granted to any local devices and this InputDeviceInfo has been filtered with respect to unique identifying information (see above description of enumerateDevices() result), then this method returns null.
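Since getCapabilities() exists only on input devices and may return null, callers need to guard both cases. The sketch below reads the frame-rate range as an example; frameRateRange is a hypothetical helper, and it assumes the capability is reported as a range with min and max members.

```javascript
// Report the frame-rate range of an input device, if the User Agent exposes it.
// `info` is an element of the enumerateDevices() result.
function frameRateRange(info) {
  // Output devices are plain MediaDeviceInfo objects without getCapabilities().
  if (typeof info.getCapabilities !== "function") return null;
  const caps = info.getCapabilities();
  // The method returns null when the entry has been filtered (see above).
  if (!caps || !caps.frameRate) return null;
  return [caps.frameRate.min, caps.frameRate.max];
}
```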
MediaTrackCapabilities
This section extends NavigatorUserMedia and MediaDevices with APIs to request permission to access media input devices available to the User Agent.
Alternatively, a local MediaStream can be captured from certain types of DOM elements, such as the video element [mediacapture-fromelement]. This can be useful for automated testing.
When on an insecure origin [mixed-content], User Agents are encouraged to warn about usage of navigator.mediaDevices.getUserMedia, navigator.getUserMedia, and any prefixed variants in their developer tools, error logs, etc. It is explicitly permitted for User Agents to remove these APIs entirely when on an insecure origin, as long as they remove all of them at once (e.g., they should not leave just the prefixed version available on insecure origins).
First, the official definition for the getUserMedia() method, and the one which developers are encouraged to use, is now at MediaDevices. This decision reflected consensus as long as the original API remained available here under the Navigator object for backwards compatibility reasons, since the working group acknowledges that early users of these APIs have been encouraged to define getUserMedia as "var getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;" in order for their code to be functional both before and after official implementations of getUserMedia() in popular browsers. To ensure functional equivalence, the getUserMedia() method here is defined in terms of the method under MediaDevices.
Second, the decision to change all other callback-based methods in the specification to be based on Promises instead required that the navigator.getUserMedia() definition reflect this in its use of navigator.mediaDevices.getUserMedia(). Because navigator.getUserMedia() is now the only callback-based method remaining in the specification, there is ongoing discussion as to a) whether it still belongs in the specification, and b) if it does, whether its syntax should remain callback-based or change in some way to use Promises. Input on these questions is encouraged, particularly from developers actively using today's implementations of this functionality.
Note that the other methods that changed from a callback-based syntax to a Promises-based syntax were not considered to have been implemented widely enough in any form to have to consider legacy usage.
getUserMedia()
Prompts the user for permission to use their Web cam or other video or audio input.
The constraints argument is a dictionary of type MediaStreamConstraints.
The successCallback will be invoked with a suitable MediaStream object as its argument if the user accepts valid tracks as described in getUserMedia() on MediaDevices.
The errorCallback will be invoked if there is a failure in finding valid tracks or if the user denies permission, as described in getUserMedia() on MediaDevices.
When the getUserMedia() method is called, the User Agent MUST run the following steps:
Let constraints be the method's first argument.
Let successCallback be the callback indicated by the method's second argument.
Let errorCallback be the callback indicated by the method's third argument.
Run the steps specified by the getUserMedia() algorithm with constraints as the argument, and let p be the resulting promise.
Upon fulfillment of p with value stream, run the following step:
Invoke successCallback with stream as the argument.
Upon rejection of p with reason r, run the following step:
Invoke errorCallback with r as the argument.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
constraints | MediaStreamConstraints | ✘ | ✘ | |
successCallback | NavigatorUserMediaSuccessCallback | ✘ | ✘ | |
errorCallback | NavigatorUserMediaErrorCallback | ✘ | ✘ | |
void
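The steps for the callback-based method amount to a thin wrapper around the promise-based one. A minimal sketch of that relationship follows; legacyGetUserMedia is a hypothetical name, and the MediaDevices object is passed in so the example does not depend on a browser global.

```javascript
// Sketch of the callback-based steps expressed in terms of the promise-based
// method. `mediaDevices` is normally navigator.mediaDevices.
function legacyGetUserMedia(mediaDevices, constraints, successCallback, errorCallback) {
  const p = mediaDevices.getUserMedia(constraints);
  // Fulfillment invokes successCallback; rejection invokes errorCallback.
  p.then(
    (stream) => successCallback(stream),
    (r) => errorCallback(r)
  );
  // Like the legacy method, this function returns nothing.
}
```

Both callbacks run asynchronously, after the wrapper itself has returned.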
First, the official definition for the getUserMedia() method, and the one which developers are encouraged to use, is now the one defined here under MediaDevices. This decision reflected consensus as long as the original API remained available at NavigatorUserMedia.getUserMedia under the Navigator object for backwards compatibility reasons, since the working group acknowledges that early users of these APIs have been encouraged to define getUserMedia as "var getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;" in order for their code to be functional both before and after official implementations of getUserMedia() in popular browsers. To ensure functional equivalence, the getUserMedia() method under NavigatorUserMedia is defined in terms of the method here.
Second, the method defined here is Promises-based, while the one defined under NavigatorUserMedia is currently still callback-based. Developers expecting to find getUserMedia() defined under NavigatorUserMedia are strongly encouraged to read the detailed Note given there.
The getSupportedConstraints method is provided to allow the application to determine which constraints the User Agent recognizes.
partial interface MediaDevices {
    MediaTrackSupportedConstraints getSupportedConstraints();
    Promise<MediaStream> getUserMedia(optional MediaStreamConstraints constraints);
};
getSupportedConstraints
Returns a dictionary whose members are the constrainable properties known to the User Agent. A supported constrainable property MUST be represented and any constrainable properties not supported by the User Agent MUST NOT be present in the returned dictionary. The values returned represent what the browser implements and will not change during a browsing session.
MediaTrackSupportedConstraints
getUserMedia
Prompts the user for permission to use their Web cam or other video or audio input.
The constraints argument is a dictionary of type MediaStreamConstraints.
This method returns a promise. The promise will be fulfilled with a suitable MediaStream object if the user accepts valid tracks as described below.
The promise will be rejected if there is a failure in finding valid tracks or if the user denies permission, as described below.
When the getUserMedia() method is called, the User Agent MUST run the following steps:
Let constraints be the method's first argument.
Let requestedMediaTypes be the set of media types in constraints with either a dictionary value or a value of "true".
If requestedMediaTypes is the empty set, return a promise rejected with a TypeError. The word "optional" occurs in the WebIDL due to WebIDL rules, but the argument MUST be supplied in order for the call to succeed.
If the current settings object's responsible document is NOT allowed to use the feature indicated by attribute name allowusermedia, return a promise rejected with a DOMException object whose name attribute has the value SecurityError.
Let p be a new promise.
Run the following steps in parallel:
Let finalSet be an (initially) empty set.
For each media type T in requestedMediaTypes,
For each possible source for that media type, construct an unconstrained MediaStreamTrack with that source as its source.
Call this set of tracks the candidateSet.
If candidateSet is the empty set, reject p with a new DOMException object whose name attribute has the value NotFoundError and abort these steps.
Run the SelectSettings algorithm on each track in candidateSet with CS as the constraint set. If the algorithm returns undefined, remove the track from candidateSet. This eliminates devices unable to satisfy the constraints, by verifying that at least one settings dictionary exists that satisfies the constraints.
If candidateSet is the empty set, let constraint be any required constraint whose fitness distance was infinity for all settings dictionaries examined while executing the SelectSettings algorithm, let message be either undefined or an informative human-readable message, and reject p with a new OverconstrainedError created by calling OverconstrainedError(constraint, message), then abort these steps.
This error gives information about what the underlying device is not capable of producing, before the user has given any authorization to any device, and can thus be used as a fingerprinting surface.
Retrieve the permission state for all candidate devices in candidateSet that are not attached to a live MediaStreamTrack in the current browsing context. Remove from candidateSet any device for which the permission state is "denied".
If candidateSet is now empty, indicating that all devices of this type are in state "denied", jump to the step labeled Permission Failure below.
Add all tracks from candidateSet to finalSet.
Optionally, e.g., based on a previously-established user preference, for security reasons, or due to platform limitations, jump to the step labeled Permission Failure below.
Let originIdentifier be the [HTML51] top-level browsing context's origin.
If the current [HTML51] browsing context is a [HTML51] nested browsing context whose origin is different from originIdentifier, let originIdentifier be the result of combining originIdentifier and the current browsing context's origin.
For the origin identified by originIdentifier, request permission for use of the devices, while considering all devices attached to a live MediaStreamTrack in the current browsing context to have permission status "granted", resulting in a set of provided media.
The provided media MUST include precisely one track of each media type in requestedMediaTypes from the finalSet. The decision of which devices to choose from the finalSet is completely up to the User Agent and may be determined by asking the user. Once selected, the source of a MediaStreamTrack MUST NOT change.
The User Agent MAY use the value of the computed "fitness distance" from the SelectSettings algorithm, or any other internally-available information about the devices, as an input to the selection algorithm.
User Agents are encouraged to default to using the user's primary or system default camera and/or microphone (when possible) to generate the media stream. User Agents MAY allow users to use any media source, including pre-recorded media files.
If the result of the request is "granted", then for each device that is sourcing the provided media, using the device's deviceId, deviceId, set [[devicesLiveMap]][deviceId] to true, if it isn't already true, and set the [[devicesAccessibleMap]][deviceId] to true, if it isn't already true.
If the result is "denied", jump to the step labeled Permission Failure below. If the user never responds, this algorithm stalls on this step.
If the user grants permission but a hardware error such as an OS/program/webpage lock prevents access, reject p with a new DOMException object whose name attribute has the value NotReadableError and abort these steps.
If the result is "granted" but device access fails for any reason other than those listed above, reject p with a new DOMException object whose name attribute has the value AbortError and abort these steps.
Let stream be the MediaStream object for which the user granted permission.
Run the ApplyConstraints algorithm on all tracks in stream with the appropriate constraints.
Resolve p with stream and abort these steps.
Permission Failure: Reject p with a new DOMException object whose name attribute has the value NotAllowedError.
Return p.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
constraints | MediaStreamConstraints | ✘ | ✘ | |
Promise<MediaStream>
In the algorithm above, constraints are checked twice - once at device selection, and once after access approval. Time may have passed between those checks, so it is conceivable that the selected device is no longer suitable. In this case, a NotReadableError will result.
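The rejection reasons named in the algorithm above can be told apart by the name attribute of the rejection value. The following sketch maps each of them to a short explanation; the function name and the message strings are illustrative only.

```javascript
// Map the rejection reasons named in the getUserMedia() algorithm to short
// messages. `error` is the value the returned promise was rejected with.
function explainGetUserMediaFailure(error) {
  switch (error.name) {
    case "TypeError":
      return "Neither audio nor video was requested.";
    case "SecurityError":
      return "This document is not allowed to use getUserMedia().";
    case "NotFoundError":
      return "No device of the requested type is available.";
    case "OverconstrainedError":
      return "No device satisfies constraint: " + error.constraint;
    case "NotAllowedError":
      return "Permission was denied.";
    case "NotReadableError":
      return "The device could not be read.";
    case "AbortError":
      return "Device access failed for another reason.";
    default:
      return "Unexpected failure: " + error.name;
  }
}

// Possible usage in a page (not executed here):
//   navigator.mediaDevices.getUserMedia({ video: true })
//     .catch((e) => console.warn(explainGetUserMediaFailure(e)));
```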
The MediaStreamConstraints dictionary is used to instruct the User Agent what sort of MediaStreamTracks to include in the MediaStream returned by getUserMedia().
dictionary MediaStreamConstraints {
    (boolean or MediaTrackConstraints) video = false;
    (boolean or MediaTrackConstraints) audio = false;
};
MediaStreamConstraints Members
video of type (boolean or MediaTrackConstraints), defaulting to false
If true, it requests that the returned MediaStream contain a video track. If a Constraints structure is provided, it further specifies the nature and settings of the video Track. If false, the MediaStream MUST NOT contain a video Track.
audio of type (boolean or MediaTrackConstraints), defaulting to false
If true, it requests that the returned MediaStream contain an audio track. If a Constraints structure is provided, it further specifies the nature and settings of the audio Track. If false, the MediaStream MUST NOT contain an audio Track.
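How the two members translate into the set of requested media types can be sketched as follows. This is an illustration of the rule that a dictionary value or true requests a type, not User Agent code; the function name simply mirrors the variable used in the getUserMedia() steps.

```javascript
// Derive the set of requested media types from a MediaStreamConstraints-like
// object. An empty result corresponds to the case that is rejected with a
// TypeError.
function requestedMediaTypes(constraints = {}) {
  const requested = new Set();
  for (const type of ["audio", "video"]) {
    const value = constraints[type];
    // A dictionary value or true requests the type; false or absent does not.
    if (value === true || (typeof value === "object" && value !== null)) {
      requested.add(type);
    }
  }
  return requested;
}
```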
This section is non-normative.
The User Agent is encouraged to reserve resources when it has determined that a given call to getUserMedia() will be successful. It is preferable to reserve the resource prior to resolving the returned promise. Subsequent calls to getUserMedia() (in this page or any other) should treat the resource that was previously allocated, as well as resources held by other applications, as busy. Resources marked as busy should not be provided as sources to the current web page, unless specified by the user. Optionally, the User Agent may choose to provide a stream sourced from a busy source but only to a page whose origin matches the owner of the original stream that is keeping the source busy.
This document recommends that in the permission grant dialog or device selection interface (if one is present), the user be allowed to select any available hardware as a source for the stream requested by the page (provided the resource is able to fulfill any specified mandatory constraints). Although not specifically recommended as best practice, note that some User Agents may support the ability to substitute a video or audio source with local files and other media. A file picker may be used to provide this functionality to the user.
This document also recommends that the user be shown all resources that are currently busy as a result of prior calls to getUserMedia() (in this page or any other page that is still alive) and be allowed to terminate that stream and utilize the resource for the current page instead. If possible in the current operating environment, it is also suggested that resources currently held by other applications be presented and treated in the same manner. If the user chooses this option, the track corresponding to the resource that was provided to the page whose stream was affected must be removed.
When permission is requested for a device, the User Agent may choose to create a permission storage entry for later use by the same origin, so that the user does not need to grant permission again at a later time. [RTCWEB-SECURITY-ARCH] Section 5.2 requires that such storing MUST only be done when the page is secure (served over HTTPS and having no mixed content). It is a User Agent choice whether it offers functionality to store permission to each device separately, all devices of a given class, or all devices; the choice needs to be apparent to the user, and permission must have been granted for the entire set whose permission is being stored, e.g., to store permission to use all cameras the user must have given permission to use all cameras and not just one.
As described, this specification does not dictate whether or not granting permission results in a stored permission. When permission is not stored, permission will last only until such time as all MediaStreamTracks sourced from that device have been stopped.
A MediaStream may contain more than one video and audio track. This makes it possible to include video from two or more webcams in a single stream object, for example. However, the current API does not allow a page to express a need for multiple video streams from independent sources.
It is recommended for multiple calls to getUserMedia() from the same page to be allowed as a way for pages to request multiple discrete video and/or audio streams.
Note also that if multiple getUserMedia() calls are done by a page, the order in which they request resources, and the order in which they complete, is not constrained by this specification.
A single call to getUserMedia() will always return a stream with either zero or one audio tracks, and either zero or one video tracks. If a script calls getUserMedia() multiple times before reaching a stable state, this document advises the UI designer that the permission dialogs should be merged, so that the user can give permission for the use of multiple cameras and/or media sources in one dialog interaction. The constraints on each getUserMedia call can be used to decide which stream gets which media sources.
The Constrainable pattern allows applications to inspect and adjust the properties of objects implementing it (the constrainable object). It is broken out as a separate set of definitions so that it can be referred to by other specifications. The core concept is the Capability, which consists of a constrainable property of an object and the set of its possible values, which may be specified either as a range or as an enumeration. For example, a camera might be capable of framerates (a property) between 20 and 50 frames per second (a range) and may be able to be positioned (a property) facing towards the user, away from the user, or to the left or right of the user (an enumerated set). The application can examine a constrainable property's supported Capabilities via the getCapabilities() accessor.
The application can select the (range of) values it wants for an object's Capabilities by means of basic and/or advanced ConstraintSets and the applyConstraints() method. A ConstraintSet consists of the names of one or more properties of the object plus the desired value (or a range of desired values) for each property. Each of those property/value pairs can be considered to be an individual constraint. For example, the application may set a ConstraintSet containing two constraints, the first stating that the framerate of a camera be between 30 and 40 frames per second (a range) and the second that the camera should be facing the user (a specific value). How the individual constraints interact depends on whether and how they are given in the basic Constraint structure, which is a ConstraintSet with an additional 'advanced' property, or whether they are in a ConstraintSet in the advanced list.
The behavior is as follows: all 'min', 'max', and 'exact' constraints in the basic Constraint structure are together treated as the 'required' set, and if it is not possible to satisfy simultaneously all of those individual constraints for the indicated property names, the User Agent MUST reject the returned promise. Otherwise, it must apply the required constraints. Next, it will consider any ConstraintSets given in the 'advanced' list, in the order in which they are specified, and will try to satisfy/apply each complete ConstraintSet (i.e., all constraints in the ConstraintSet together), but will skip a ConstraintSet if and only if it cannot satisfy/apply it in its entirety. Next, the User Agent MUST attempt to apply, individually, any 'ideal' constraints or a constraint given as a bare value for the property. Of these properties, it MUST satisfy the largest number that it can, in any order. Finally, the User Agent MUST resolve the returned promise.
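To make the grouping concrete, the sketch below splits a basic Constraint structure into the required set, the ideal (or bare) values, and the advanced list. It only classifies constraints and does not attempt to satisfy them; classifyConstraints is a hypothetical helper, and it assumes each value is a number, a string, an array, or a dictionary with min, max, exact or ideal members.

```javascript
// Split a basic Constraint structure into the three groups described above.
function classifyConstraints(constraints) {
  const required = {};
  const ideal = {};
  for (const [name, value] of Object.entries(constraints)) {
    if (name === "advanced") continue;
    const isDictionary =
      typeof value === "object" && value !== null && !Array.isArray(value);
    if (isDictionary) {
      // min, max and exact together form the required set.
      for (const key of ["min", "max", "exact"]) {
        if (key in value) {
          required[name] = required[name] || {};
          required[name][key] = value[key];
        }
      }
      if ("ideal" in value) ideal[name] = value.ideal;
    } else {
      // A bare value is treated like an ideal value.
      ideal[name] = value;
    }
  }
  return { required, ideal, advanced: constraints.advanced || [] };
}
```

For instance, a width given as { min: 640, ideal: 1280 } contributes its minimum to the required set and its ideal to the ideal group, while a bare height of 720 is purely ideal.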
Applications that use required constraints should first check, by means of getSupportedConstraints(), that all the named properties that are used are supported by the browser. The reason for this is that WebIDL drops any unsupported names from the dictionary holding the constraints, so the browser does not see them and the unsupported names end up being silently ignored. This causes confusing programming errors, as the JavaScript code will be setting constraints but the browser will be ignoring them. Browsers that support (recognize) the name of a required constraint but cannot satisfy it will generate an error, while browsers that do not support the constrainable property will not generate an error.
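The check described above can be wrapped in a small helper. This is a minimal sketch, not part of the specification: the function name missingConstraints is hypothetical, and a literal object stands in for the dictionary that navigator.mediaDevices.getSupportedConstraints() would return in a browser.

```javascript
// Returns the names in `required` that the supported-constraints
// dictionary does not list as true.
function missingConstraints(supported, required) {
  return required.filter(function (name) {
    return supported[name] !== true;
  });
}

// In a browser, `supported` would come from
// navigator.mediaDevices.getSupportedConstraints(); a literal stands in here.
var supported = { width: true, height: true, frameRate: true };
var missing = missingConstraints(supported, ["width", "frameRate", "aspectRatio"]);
if (missing.length > 0) {
  // Treat like an error: these names would be silently dropped.
  console.log("Unsupported constraint names: " + missing.join(", "));
}
```

Running the check before building the constraints object lets the application fail loudly instead of having a required constraint ignored.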
The following examples may help to understand how constraints work. The first shows a basic Constraint structure. Three constraints are given, each of which the User Agent will attempt to satisfy individually. Depending upon the resolutions available for this camera, it is possible that not all three constraints can be satisfied at the same time. If so, the User Agent will satisfy two if it can, or only one if not even two constraints can be satisfied together. Note that if not all three can be satisfied simultaneously, it is possible that there is more than one combination of two constraints that could be satisfied. If so, the User Agent will choose.
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["aspectRatio"]) {
// Treat like an error.
}
var constraints =
{
width: 1280,
height: 720,
aspectRatio: 1.5
};
This next example adds a small bit of complexity. The ideal values are still given for width and height, but this time with minimum requirements on each as well as a minimum frameRate that must be satisfied. If it cannot satisfy the frameRate, width or height minimum it will reject the promise. Otherwise, it will try to satisfy the width, height, and aspectRatio target values as well and then resolve the promise. Note that the frameRate minimum might be within the capabilities of the camera and satisfiable in ideal lighting conditions, but not in low light, and could therefore result in firing of the onoverconstrained
event handler under poor lighting conditions.
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["aspectRatio"] || !supports["frameRate"]) {
// Treat like an error.
}
var constraints =
{
frameRate: {min: 20},
width: {min: 640, ideal: 1280},
height: {min: 480, ideal: 720},
aspectRatio: 1.5
};
This example illustrates the full control possible with the Constraints structure by adding the 'advanced' property. In this case, the User Agent behaves the same way with respect to the required constraints, but before attempting to satisfy the ideal values it will process the 'advanced' list. In this example the 'advanced' list contains two ConstraintSets. The first specifies width and height constraints, and the second specifies an aspectRatio constraint. Note that in the advanced list, these bare values are treated as 'exact' values. This example represents the following: "I need my video to be at least 640 pixels wide and at least 480 pixels high. My preference is for precisely 1920x1280, but if you can't give me that, give me an aspectRatio of 4x3 if at all possible. If even that is not possible, give me a resolution as close to 1280x720 as possible."
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["width"] || !supports["height"]) {
// Treat like an error.
}
var constraints =
{
width: {min: 640, ideal: 1280},
height: {min: 480, ideal: 720},
advanced: [{width: 1920, height: 1280},
{aspectRatio: 1.3333333333}]
};
The ordering of advanced ConstraintSets is significant. In the preceding example it is impossible to satisfy both the 1920x1280 ConstraintSet and the 4x3 aspect ratio ConstraintSet at the same time. Since the 1920x1280 occurs first in the list, the User Agent will attempt to satisfy it first. Application authors can therefore implement a backoff strategy by specifying multiple advanced ConstraintSets for the same property. For example, an application might specify three advanced ConstraintSets, the first asking for a frame rate greater than 500, the second asking for a frame rate greater than 400, and the third asking for one greater than 300. If the User Agent is capable of setting a frame rate greater than 500, it will (and the subsequent two ConstraintSets will be trivially satisfied). However, if the User Agent cannot set the frame rate above 500, it will skip that ConstraintSet and attempt to set the frame rate above 400. If that fails, it will then try to set it above 300. If the User Agent cannot satisfy any of the three ConstraintSets, it will set the frame rate to any value it can get. If the developers wanted to insist on 300 as a lower bound, they could provide that as a 'min' value in the basic ConstraintSet. In that case, the User Agent would fail altogether if it couldn't get a value over 300, but would choose a value over 500 if possible, then try for a value over 400.
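The backoff strategy described above can be sketched as a helper that builds the Constraints object. The helper name backoffConstraints and its parameters are hypothetical. Note also that 'min' is an inclusive bound, so "greater than 500" is approximated here as "at least 500".

```javascript
// Builds a Constraints-shaped object: one advanced ConstraintSet per
// threshold, in the order given, plus an optional required lower bound.
function backoffConstraints(property, thresholds, requiredMin) {
  var constraints = {
    advanced: thresholds.map(function (threshold) {
      var set = {};
      set[property] = { min: threshold };
      return set;
    })
  };
  if (requiredMin !== undefined) {
    // A 'min' in the basic set is required: if it cannot be met,
    // applying the constraints fails altogether.
    constraints[property] = { min: requiredMin };
  }
  return constraints;
}

// Try for 500, then 400, and insist on at least 300.
var constraints = backoffConstraints("frameRate", [500, 400], 300);
console.log(JSON.stringify(constraints));
```

Because the advanced list is processed in order, the thresholds should be passed from most to least demanding.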
Note that, unlike basic constraints, the constraints within a ConstraintSet in the advanced list must be satisfied together or skipped together. Thus, {width: 1920, height: 1280} is a request for that specific resolution, not a request for that width or that height. One can think of the basic constraints as requesting an 'or' (non-exclusive) of the individual constraints, while each advanced ConstraintSet is requesting an 'and' of the individual constraints in the ConstraintSet. An application may inspect the full set of Constraints currently in effect via the
getConstraints()
accessor.
The specific value that the User Agent chooses for a constrainable property is referred to as a Setting. For example, if the application applies a ConstraintSet specifying that the frameRate must be at least 30 frames per second, and no greater than 40, the Setting can be any intermediate value, e.g., 32, 35, or 37 frames per second. The application can query the current settings of the object's constrainable properties via the getSettings() accessor.
Although this specification formally defines ConstrainablePattern
as a WebIDL interface, it is actually a template or pattern for other interfaces and cannot be inherited directly since the return values of the methods need to be extended, something WebIDL cannot do. Thus, each interface that wishes to make use of the functionality defined here will have to provide its own copy of the WebIDL for the functions and interfaces given here. However it can refer to the semantics defined here, which will not change. See MediaStreamTrack Interface
Definition for an example of this.
When the User Agent is no longer able to satisfy the requiredConstraints from the currently valid Constraints, the User Agent MUST queue a task that fires an OverconstrainedErrorEvent, initialized as described in the following paragraph, at the constrainable object. The event firing task MAY also be used to update the constrainable object as a result of the overconstrained situation.
The OverconstrainedErrorEvent references an OverconstrainedError whose constraint attribute is set to one of the requiredConstraints that can no longer be satisfied. The message attribute of the OverconstrainedError SHOULD contain a string that is useful for debugging. The conditions under which this error might occur are platform and application-specific. For example, the user might physically manipulate a camera in a way that makes it impossible to provide a resolution or frameRate that satisfies the constraints. The User Agent MAY take other actions as a result of the overconstrained situation.
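An application might react to this event as sketched below. This is an illustration only: the helper name watchOverconstrained is hypothetical, a plain object stands in for the constrainable object, and the sketch assumes that the event exposes its OverconstrainedError through an attribute named error, which is not stated in the text above.

```javascript
// Installs an onoverconstrained handler on a constrainable object and
// passes a short description of the failed constraint to `report`.
function watchOverconstrained(constrainable, report) {
  constrainable.onoverconstrained = function (event) {
    // Assumption: the event exposes the OverconstrainedError as `error`.
    var err = event.error;
    report(err.constraint + ": " + (err.message || "can no longer be satisfied"));
  };
}

// Stand-in object; in a page this would be, for example, a MediaStreamTrack.
var fakeTrack = {};
var reports = [];
watchOverconstrained(fakeTrack, function (text) { reports.push(text); });

// Simulate the User Agent firing the event.
fakeTrack.onoverconstrained({ error: { constraint: "frameRate", message: "low light" } });
console.log(reports[0]);
```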
[NoInterfaceObject]
interface ConstrainablePattern {
    Capabilities  getCapabilities();
    Constraints   getConstraints();
    Settings      getSettings();
    Promise<void> applyConstraints(optional Constraints constraints);
                  attribute EventHandler onoverconstrained;
};
onoverconstrained of type EventHandler
The event type of this event handler is overconstrained.
getCapabilities
The getCapabilities()
method returns the dictionary of the names of the constrainable properties that the object supports.
It is possible that the underlying hardware may not exactly map to the range defined for the constrainable property. Where this is possible, the entry SHOULD define how to translate and scale the hardware's setting onto the values defined for the property. For example, suppose that a hypothetical fluxCapacitance property ranges from -10 (min) to 10 (max), but there are common hardware devices that support only values of "off", "medium", and "full". The constrainable property definition might specify that for such hardware, the User Agent should map the range value of -10 to "off", 10 to "full", and 0 to "medium". It might also indicate that given a ConstraintSet imposing a strict value of 3, the User Agent should attempt to set the value of "medium" on the hardware, and that getSettings() should return a fluxCapacitance of 0, since that is the value defined as corresponding to "medium".
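The fluxCapacitance mapping above could be sketched as follows. The nearest-value rule used here is an assumption chosen for illustration; the text only requires that the property definition state some translation. The names LEVELS and mapFluxCapacitance are hypothetical.

```javascript
// Hypothetical hardware levels and the property values they correspond to.
var LEVELS = [
  { hardware: "off", value: -10 },
  { hardware: "medium", value: 0 },
  { hardware: "full", value: 10 }
];

// Picks the hardware level whose property value is nearest to `requested`.
// On a tie, the level listed first wins.
function mapFluxCapacitance(requested) {
  var best = LEVELS[0];
  LEVELS.forEach(function (level) {
    if (Math.abs(level.value - requested) < Math.abs(best.value - requested)) {
      best = level;
    }
  });
  // `setting` is the value getSettings() would report for the property.
  return { hardware: best.hardware, setting: best.value };
}

var result = mapFluxCapacitance(3);
console.log(result.hardware + " -> " + result.setting);
```

With a requested value of 3, the sketch selects "medium" and reports a setting of 0, matching the example in the text.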
Capabilities
getConstraints
The getConstraints() method returns the Constraints that were the argument to the most recent successful invocation of the applyConstraints algorithm on the object, maintaining the order in which they were specified. Note that some of the advanced ConstraintSets returned may not be currently satisfied. To check which ConstraintSets are currently in effect, the application should use getSettings. Instead of returning the exact constraints as described above, the UA MAY return a constraint set that has the identical effect in all situations as the applied constraints.
Constraints
getSettings
The getSettings()
method returns the current settings of all the constrainable properties of the object, whether they are platform defaults or have been set by the applyConstraints algorithm. Note that the actual setting of a property MUST be a single value.
Settings
applyConstraints
When applyConstraints()
is called, it runs the
applyConstraints algorithm on the object.
The applyConstraints algorithm for applying constraints is stated below. Here are some preliminary definitions that are used in the statement of the algorithm:
We use the term settings dictionary for the set of values that might be applied as settings to the object.
For string valued constraints, we define "==" below to be true if one of the values in the sequence is exactly the same as the value being compared against.
We define the fitness distance between a settings dictionary and a constraint set CS as the sum, for each constraint provided for a constraint name in CS, of the following values:
1. If the constraint is not supported by the browser, the fitness distance is 0.
2. If the constraint is required ('min', 'max', or 'exact'), and the settings dictionary's value for the constraint does not satisfy the constraint, the fitness distance is positive infinity.
3. If the constraint is not required, and does not apply for this type of device, the fitness distance is 0 (that is, the constraint does not influence the fitness distance).
4. For numeric constraints that are not required, the fitness distance is the result of the formula
(actual == ideal) ? 0 : |actual - ideal|/max(|actual|,|ideal|)
5. For string and enumerated constraints that are not required, the fitness distance is the result of the formula
(actual == ideal) ? 0 : 1
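The per-constraint values above can be sketched as a single function. This is a simplified illustration, not a conforming implementation: the function name and the supported flag are hypothetical, an undefined actual value stands in for "does not apply for this type of device", and sequences of strings are not handled.

```javascript
// Fitness distance contributed by one constraint, given the value that a
// candidate settings dictionary holds for the same property.
function fitnessDistance(actual, constraint, supported) {
  if (!supported) return 0; // not supported by the browser

  var required = constraint.exact !== undefined ||
                 constraint.min !== undefined ||
                 constraint.max !== undefined;
  // Simplification: `undefined` means the property does not apply here.
  if (actual === undefined) return required ? Infinity : 0;

  // Required parts: any violation makes the distance infinite.
  if (constraint.exact !== undefined && actual !== constraint.exact) return Infinity;
  if (constraint.min !== undefined && actual < constraint.min) return Infinity;
  if (constraint.max !== undefined && actual > constraint.max) return Infinity;

  var ideal = constraint.ideal;
  if (ideal === undefined || actual === ideal) return 0;
  if (typeof ideal === "number") {
    return Math.abs(actual - ideal) / Math.max(Math.abs(actual), Math.abs(ideal));
  }
  return 1; // string and enumerated values
}

console.log(fitnessDistance(720, { ideal: 1280 }, true));
console.log(fitnessDistance(480, { min: 640 }, true));
console.log(fitnessDistance("user", { ideal: "environment" }, true));
```

The distance for a whole constraint set is then the sum of these values over every constraint in the set.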
More definitions:
We define the SelectSettings algorithm as follows:
1. Note that unknown properties are discarded by WebIDL, which means that unknown/unsupported required constraints will silently disappear. To avoid this being a surprise, application authors are expected to first use the getSupportedConstraints() method as shown in the Examples below.
2. Let object be the ConstrainablePattern object on which this algorithm is applied. Let copy be an unconstrained copy of object (i.e., copy should behave as if it were object with all ConstraintSets removed.)
3. For every possible settings dictionary of copy compute its fitness distance, treating bare values of properties as ideal values. Let candidates be the set of settings dictionaries for which the fitness distance is finite.
4. If candidates is empty, return undefined as the result of the SelectSettings() algorithm.
5. Iterate over the 'advanced' ConstraintSets in the order in which they were specified. For each ConstraintSet:
   1. Compute the fitness distance between it and each settings dictionary in candidates, treating bare values of properties as exact.
   2. If the fitness distance is finite for one or more settings dictionaries in candidates, keep those settings dictionaries in candidates, discarding others.
   3. If the fitness distance is infinite for all settings dictionaries in candidates, ignore this ConstraintSet.
6. Select one settings dictionary from candidates, and return it as the result of the SelectSettings() algorithm. The UA SHOULD use the one with the smallest fitness distance, as calculated in step 3.
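The algorithm above can be illustrated over a small, explicit list of possible settings dictionaries. This is a sketch under simplifying assumptions: every property is taken to be supported, sequences of strings are not handled, and the caller supplies the list of possible settings (a real User Agent derives it from the device). All function names are hypothetical.

```javascript
// Distance for one property value against one long-form constraint.
function distance(actual, c) {
  if (c.exact !== undefined && actual !== c.exact) return Infinity;
  if (c.min !== undefined && !(actual >= c.min)) return Infinity;
  if (c.max !== undefined && !(actual <= c.max)) return Infinity;
  if (actual === undefined) return 0;
  if (c.ideal === undefined || actual === c.ideal) return 0;
  if (typeof c.ideal !== "number") return 1;
  return Math.abs(actual - c.ideal) / Math.max(Math.abs(actual), Math.abs(c.ideal));
}

// Sum of distances over every property named in a ConstraintSet.
// Bare values are wrapped as {ideal: v} or {exact: v} according to bareKey.
function setDistance(settings, constraintSet, bareKey) {
  return Object.keys(constraintSet).reduce(function (sum, name) {
    if (name === "advanced") return sum;
    var c = constraintSet[name];
    if (typeof c !== "object") {
      var wrapped = {};
      wrapped[bareKey] = c;
      c = wrapped;
    }
    return sum + distance(settings[name], c);
  }, 0);
}

function selectSettings(possibleSettings, constraints) {
  // Keep the dictionaries with a finite distance to the basic set.
  var candidates = possibleSettings.filter(function (s) {
    return setDistance(s, constraints, "ideal") !== Infinity;
  });
  if (candidates.length === 0) return undefined;

  // Apply each advanced ConstraintSet in order, or skip it entirely.
  (constraints.advanced || []).forEach(function (cs) {
    var kept = candidates.filter(function (s) {
      return setDistance(s, cs, "exact") !== Infinity;
    });
    if (kept.length > 0) candidates = kept;
  });

  // Prefer the candidate with the smallest distance to the basic set.
  return candidates.reduce(function (best, s) {
    return setDistance(s, constraints, "ideal") <
           setDistance(best, constraints, "ideal") ? s : best;
  });
}

var modes = [
  { width: 320, height: 240 },
  { width: 640, height: 480 },
  { width: 1280, height: 720 },
  { width: 1920, height: 1080 }
];
var chosen = selectSettings(modes, {
  width: { min: 640, ideal: 1280 },
  height: { min: 480, ideal: 720 },
  advanced: [{ width: 1920, height: 1280 }, { height: 480 }]
});
console.log(chosen);
```

In this run the first advanced ConstraintSet cannot be satisfied and is skipped, while the second narrows the candidates to 640x480 even though 1280x720 matches the ideal values exactly. Advanced ConstraintSets take effect before ideal values are considered.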
When the applyConstraints algorithm is called, the User Agent MUST run the following steps:
1. Let p be a new promise.
2. Let newConstraints be the argument to this function.
3. Run the following steps in parallel:
   1. Let successfulSettings be the result of running the SelectSettings algorithm with newConstraints as the constraint set.
   2. If successfulSettings is undefined, let failedConstraint be any required constraint whose fitness distance was infinity for all settings dictionaries examined while executing the SelectSettings algorithm, let message be either undefined or an informative human-readable message, reject p with a new OverconstrainedError created by calling OverconstrainedError(failedConstraint, message), and abort these steps. The existing constraints remain in effect in this case.
4. Return p.
The User Agent MAY choose new settings for the constrainable properties of the object at any time. When it does so it MUST attempt to satisfy all current Constraints, in the manner described in the algorithm above.
Any implementation that has the same result as the algorithm above is an allowed implementation. For instance, the implementation may choose to keep track of the maximum and minimum values for a setting that are OK under the constraints considered, rather than keeping track of all possible values for the setting.
When picking a settings dictionary, the UA can use any information available to it. Examples of such information may be whether the selection is done as part of device selection in getUserMedia, whether the energy usage of the camera varies between the settings dictionaries, or whether using a settings dictionary will cause the device driver to apply resampling.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
constraints | Constraints | ✘ | ✔ | A new constraint structure to apply to this object. |
Return type: Promise<void>
An example of Constraints that could be passed into applyConstraints() or returned as a value of constraints is below. It uses the constrainable properties defined for MediaStreamTrack.
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["facingMode"]) {
// Treat like an error.
}
var constraints = {
width: {
min: 640
},
height: {
min: 480
},
advanced: [{
width: 650
}, {
width: {
min: 650
}
}, {
frameRate: 60
}, {
width: {
max: 800
}
}, {
facingMode: "user"
}]
};
Here is another example, specifically for a video track where I must have a particular camera and have separate preferences for the width and height:
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["deviceId"]) {
// Treat like an error.
}
var constraints = {
deviceId: {exact: "20983-20o198-109283-098-09812"},
advanced: [{
width: {
min: 800,
max: 1200
}
}, {
height: {
min: 600
}
}]
};
And here's one for an audio track:
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["deviceId"] || !supports["volume"]) {
// Treat like an error.
}
var constraints = {
advanced: [{
deviceId: "64815-wi3c89-1839dk-x82-392aa"
}, {
volume: 0.5
}]
};
Here's an example of use of ideal:
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["aspectRatio"] || !supports["facingMode"]) {
// Treat like an error.
}
var gotten = navigator.mediaDevices.getUserMedia({
video: {
width: {min: 320, ideal: 1280, max: 1920},
height: {min: 240, ideal: 720, max: 1080},
frameRate: 30, // Shorthand for ideal.
// facingMode: "environment" would be optional.
facingMode: {exact: "environment"}
}});
Here's an example of "I want 720p, but I can accept up to 1080p and down to VGA.":
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["width"] || !supports["height"]) {
// Treat like an error.
}
var gotten = navigator.mediaDevices.getUserMedia({video: {
width: {min: 640, ideal: 1280, max: 1920},
height: {min: 480, ideal: 720, max: 1080},
}});
Here's an example of "I want a front-facing camera and it must be VGA.":
var supports = navigator.mediaDevices.getSupportedConstraints();
if(!supports["width"] || !supports["height"] || !supports["facingMode"]) {
// Treat like an error.
}
var gotten = navigator.mediaDevices.getUserMedia({video: {
facingMode: {exact: "user"},
width: {exact: 640},
height: {exact: 480}
}});
The syntax for the specification of the set of legal values depends on the type of the values. In addition to the standard atomic types (boolean, long, double, DOMString), legal values include lists of any of the atomic types, plus min-max ranges, as defined below.
List values MUST be interpreted as disjunctions. For example, if a property 'facingMode' for a camera is defined as having legal values ["left", "right", "user", "environment"], this means that 'facingMode' can have the values "left", "right", "environment", and "user". Similarly, Constraints restricting 'facingMode' to ["user", "left", "right"] would mean that the User Agent should select a camera (or point the camera, if that is possible) so that "facingMode" is either "user", "left", or "right". This Constraint would thus request that the camera not be facing away from the user, but would leave the User Agent free to let the user choose among the other directions.
dictionary DoubleRange {
    double max;
    double min;
};
DoubleRange Members
max of type double
The maximum legal value of this property.
min of type double
The minimum value of this property.
dictionary ConstrainDoubleRange : DoubleRange {
    double exact;
    double ideal;
};
ConstrainDoubleRange Members
exact of type double
The exact required value for this property.
ideal of type double
The ideal (target) value for this property.
LongRange Members
max of type long
The maximum legal value of this property.
min of type long
The minimum value of this property.
ConstrainLongRange Members
exact of type long
The exact required value for this property.
ideal of type long
The ideal (target) value for this property.
dictionary ConstrainBooleanParameters {
    boolean exact;
    boolean ideal;
};
ConstrainBooleanParameters Members
exact of type boolean
The exact required value for this property.
ideal of type boolean
The ideal (target) value for this property.
dictionary ConstrainDOMStringParameters {
    (DOMString or sequence<DOMString>) exact;
    (DOMString or sequence<DOMString>) ideal;
};
ConstrainDOMStringParameters Members
exact of type (DOMString or sequence<DOMString>)
The exact required value for this property.
ideal of type (DOMString or sequence<DOMString>)
The ideal (target) value for this property.
typedef (long or ConstrainLongRange) ConstrainLong;
ConstrainLong is used to refer to the (long or ConstrainLongRange) type.
typedef (double or ConstrainDoubleRange) ConstrainDouble;
ConstrainDouble is used to refer to the (double or ConstrainDoubleRange) type.
typedef (boolean or ConstrainBooleanParameters) ConstrainBoolean;
ConstrainBoolean is used to refer to the (boolean or ConstrainBooleanParameters) type.
typedef (DOMString or sequence<DOMString> or ConstrainDOMStringParameters) ConstrainDOMString;
ConstrainDOMString is used to refer to the (DOMString or sequence<DOMString> or ConstrainDOMStringParameters) type.
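Each of these union types allows either a bare value or a dictionary. As described elsewhere in this section, a bare value is treated as 'ideal' in the basic Constraints structure and as 'exact' inside an advanced ConstraintSet. The following sketch makes that expansion explicit; the function name expandConstraint is hypothetical.

```javascript
// Expands a bare value into its long form: 'ideal' in the basic set,
// 'exact' inside an advanced ConstraintSet. Dictionaries pass through.
function expandConstraint(value, inAdvanced) {
  // Numbers, booleans, strings and lists of strings count as bare values.
  var isBare = typeof value !== "object" || Array.isArray(value);
  if (!isBare) return value;
  return inAdvanced ? { exact: value } : { ideal: value };
}

console.log(JSON.stringify(expandConstraint(1280, false)));
console.log(JSON.stringify(expandConstraint("user", true)));
console.log(JSON.stringify(expandConstraint({ min: 640 }, false)));
```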
Capabilities is a dictionary containing one or more key-value pairs, where each key MUST be a constrainable property, and each value MUST be a subset of the set of values allowed for that property. The exact syntax of the value expression depends on the type of the property. The Capabilities dictionary specifies which constrainable properties can be applied, as constraints, to the constrainable object. Note that the Capabilities of a constrainable object MAY be a subset of the properties defined in the Web platform, with a subset of the set of values for those properties. Note that Capabilities are returned from the User Agent to the application, and cannot be specified by the application. However, the application can control the Settings that the User Agent chooses for constrainable properties by means of Constraints.
An example of a Capabilities dictionary is shown below. In this case, the constrainable object is a video source with a very limited set of Capabilities.
{
frameRate: {
min: 1.0,
max: 60.0
},
facingMode: [ "user", "left" ]
}
The next example below points out that capabilities for range values provide ranges for individual constrainable properties, not combinations. This is particularly relevant for video width and height, since the ranges for width and height are reported separately. In the example, if the constrainable object can only provide 640x480 and 800x600 resolutions the relevant capabilities returned would be:
{
width: {
min: 640,
max: 800
},
height: {
min: 480,
max: 600
},
aspectRatio: 1.3333333333
}
Note in the example above that the aspectRatio would make clear that arbitrary combinations of widths and heights are not possible, although it would still suggest that more than two resolutions were available.
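The loss of information in this example can be demonstrated with a short sketch. The helper names are hypothetical, and the list of modes stands in for whatever the device actually supports.

```javascript
// Collapses a list of supported resolutions into per-property ranges,
// the shape in which width and height capabilities are reported.
function rangesFromModes(modes) {
  var widths = modes.map(function (m) { return m.width; });
  var heights = modes.map(function (m) { return m.height; });
  return {
    width: { min: Math.min.apply(null, widths), max: Math.max.apply(null, widths) },
    height: { min: Math.min.apply(null, heights), max: Math.max.apply(null, heights) }
  };
}

// True when each value lies inside its separately reported range.
function withinRanges(caps, width, height) {
  return width >= caps.width.min && width <= caps.width.max &&
         height >= caps.height.min && height <= caps.height.max;
}

var modes = [{ width: 640, height: 480 }, { width: 800, height: 600 }];
var caps = rangesFromModes(modes);
// 640x600 fits both ranges even though the device offers no such mode.
console.log(withinRanges(caps, 640, 600));
```

An application therefore cannot infer the exact list of resolutions from the reported ranges alone.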
A specification using the Constrainable Pattern should not subclass the below dictionary, but instead provide its own definition. See MediaTrackCapabilities for an example.
dictionary Capabilities
{
};
Settings is a dictionary containing one or more key-value pairs. It MUST contain each key returned in getCapabilities() for which the property is defined on the object type it's returned on; for instance, an audio MediaStreamTrack has no "width" property. There MUST be a single value for each key and the value MUST be a member of the set defined for that property by getCapabilities(). The Settings dictionary contains the actual values that the User Agent has chosen for the object's constrainable properties. The exact syntax of the value depends on the type of the property.
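The membership requirement above can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, and it assumes a capability entry is a {min, max} range, a list of values, or a single value.

```javascript
// Checks one Settings value against the matching Capabilities entry,
// which may be a {min, max} range, a list of values, or a single value.
function settingAllowed(value, capability) {
  if (Array.isArray(capability)) return capability.indexOf(value) !== -1;
  if (capability !== null && typeof capability === "object") {
    return value >= capability.min && value <= capability.max;
  }
  return value === capability;
}

// Returns the names of settings that fall outside the capabilities.
function invalidSettings(settings, capabilities) {
  return Object.keys(settings).filter(function (name) {
    return !(name in capabilities) ||
           !settingAllowed(settings[name], capabilities[name]);
  });
}

var capabilities = {
  frameRate: { min: 1.0, max: 60.0 },
  facingMode: ["user", "left"]
};
var settings = { frameRate: 30.0, facingMode: "user" };
console.log(invalidSettings(settings, capabilities).length);
```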
A conforming User Agent MUST support all the constrainable properties defined in this specification.
An example of a Settings dictionary is shown below. This example is not very realistic in that a browser would actually be required to support more constrainable properties than just these.
{
frameRate: 30.0,
facingMode: "user"
}
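A specification using the Constrainable Pattern should not subclass the below dictionary, but instead provide its own definition. See MediaTrackSettings for an example.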
dictionary Settings
{
};
Due to the limitations of WebIDL, interfaces implementing the Constrainable Pattern cannot simply subclass Constraints and ConstraintSet as they are defined here. Instead they must provide their own definitions that follow this pattern. See MediaTrackConstraints for an example of this.
dictionary ConstraintSet
{
};
Each member of a ConstraintSet
corresponds to a constrainable property and specifies a subset of the property's legal Capability values. Applying a ConstraintSet instructs the User Agent to restrict the settings of the corresponding constrainable properties to the specified values or ranges of values. A given property MAY occur both in the basic Constraints set and in the advanced ConstraintSets list, and MAY occur at most once in each ConstraintSet in the advanced list.
dictionary Constraints : ConstraintSet {
    sequence<ConstraintSet> advanced;
};
Constraints Members
advanced of type sequence<ConstraintSet>
This is the list of ConstraintSets that the User Agent MUST attempt to satisfy, in order, skipping only those that cannot be satisfied. The order of these ConstraintSets is significant. In particular, when they are passed as an argument to applyConstraints, the User Agent MUST try to satisfy them in the order that is specified. Thus if advanced ConstraintSets C1 and C2 can be satisfied individually, but not together, then whichever of C1 and C2 is first in this list will be satisfied, and the other will not. The User Agent MUST attempt to satisfy all ConstraintSets in the list, even if some cannot be satisfied. Thus, in the preceding example, if constraint C3 is specified after C1 and C2, the User Agent will attempt to satisfy C3 even though C2 cannot be satisfied. Note that a given property name may occur only once in each ConstraintSet but may occur in more than one ConstraintSet.
This sample code exposes a button. When clicked, the button is disabled and the user is prompted to offer a stream. The user can cause the button to be re-enabled by providing a stream (e.g., giving the page access to the local camera) and then disabling the stream (e.g., revoking that access).
<input type="button" value="Start" onclick="start()" id="startBtn">
<script>
  var startBtn = document.getElementById('startBtn');

  // Request audio and video, and disable the button while the request is pending.
  function start() {
    navigator.mediaDevices.getUserMedia({
      audio: true,
      video: true
    }).then(gotStream).catch(logError);
    startBtn.disabled = true;
  }

  // Re-enable the button once the stream is no longer active.
  function gotStream(stream) {
    stream.getTracks().forEach(function (track) {
      track.onended = function () {
        startBtn.disabled = stream.active;
      };
    });
  }

  function logError(error) {
    console.log(error.name + ": " + error.message);
  }
</script>
This example allows people to take photos of themselves from the local video camera. Note that the Image Capture specification [image-capture] provides a simpler way to accomplish this.
<article>
  <style scoped>
    video { transform: scaleX(-1); }
    p { text-align: center; }
  </style>
  <h1>Snapshot Kiosk</h1>
  <section id="splash">
    <p id="errorMessage">Loading...</p>
  </section>
  <section id="app" hidden>
    <p><video id="monitor" autoplay></video> <canvas id="photo"></canvas>
    <p><input type=button value="📷" onclick="snapshot()">
  </section>
  <script>
    var video = document.getElementById('monitor');
    var canvas = document.getElementById('photo');
    navigator.mediaDevices.getUserMedia({
      video: true
    }).then(function (stream) {
      video.srcObject = stream;
      video.onloadedmetadata = function () {
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        document.getElementById('splash').hidden = true;
        document.getElementById('app').hidden = false;
      };
    }).catch(function (reason) {
      document.getElementById('errorMessage').textContent = 'No camera available.';
    });
    function snapshot() {
      canvas.getContext('2d').drawImage(video, 0, 0);
    }
  </script>
</article>
For each kind of device that getUserMedia()
exposes,
Define anyAccessible to be the logical OR of all any<kind>Accessible values.
Define anyLive to be the logical OR of all any<kind>Live values.
Then the following are requirements on the User Agent:
and the following are encouraged behaviors for the User Agent:
This section is non-normative; it specifies no new behavior, but instead summarizes information already present in other parts of the specification.
This document extends the Web platform with the ability to manage input devices for media - in this iteration, microphones, and cameras. It also allows the manipulation of audio output devices (speakers and headphones). Capturing audio and video exposes personally-identifiable information to applications, and this specification requires obtaining explicit user consent before sharing it.
Without authorization (to the "drive-by web"), it offers the ability to tell how many devices there are of each class, and how they are grouped together (e.g. a microphone and camera belonging to a single Web cam). The identifiers for the devices are designed to not be useful for a fingerprint that can track the user between origins, but the number and grouping of devices adds to the fingerprint surface. It recommends treating the per-origin persistent identifier deviceId in the same way as other persistent storage (e.g. cookies) is treated.
When authorization is given, this document describes how to get access to, and use, media data from the devices mentioned. This data may be sensitive; advice is given that indicators should be supplied to indicate that devices are in use, but both the nature of authorization and the indicators of in-use devices are platform decisions.
Authorization may be given on a case-by-case basis, or be persistent. In the case of a case-by-case authorization, it is important that the user be able to say "no" in a way that prevents the UI from blocking user interaction until permission is given - either by offering a way to say a "persistent NO" or by not using a modal permissions dialog.
When authorization to any media device is given, application developers gain access to the labels of all available media capture devices. In most cases, the labels are persistent across browsing sessions and across origins that have also been granted authorization, and thus potentially provide a way to track a given device across time and origins.
For origins to which permission has been granted, the devicechange event will be emitted across browsing contexts and origins each time a new media device is added or removed; user agents can mitigate the risk of correlation of browsing activity across origins by fuzzing the timing of these events.
Once a developer gains access to a media stream from a capture device, the developer also gains access to detailed information about the device, including its range of operating capabilities (e.g. available resolutions for a camera). These operating capabilities are for the most part persistent across browsing sessions and origins, and thus provide a way to track a given device across time and origins.
Once access to a video stream from a capture device is obtained, that stream can most likely be used to uniquely fingerprint that device (e.g. via dead pixel detection). Similarly, once access to an audio stream is obtained, that stream can most likely be used to fingerprint user location down to the level of a room or even simultaneous occupation of a room by disparate users (e.g. via analysis of ambient audio or of unique audio purposely played out of the device speaker). User-level mitigation for both audio and video consists of covering up the camera and/or microphone or revoking permission via browser chrome controls.
It is possible to use constraints so that the failure of a getUserMedia call will return information about devices on the system without prompting the user, which increases the surface available for fingerprinting. The User Agent should consider limiting the rate at which failed getUserMedia calls are allowed in order to limit this additional surface.
In the case of persistent authorization via a stored permission, it is important that it is easy to find the list of granted permissions and revoke permissions that the user wishes to revoke.
Once permission has been granted, the User Agent should make two things readily apparent to the user:
Developers of sites with stored permissions should be careful that these permissions not be abused. These permissions can be revoked using the [permissions] API.
In particular, they should not make it possible to automatically send audio or video streams from authorized media devices to an end point that a third party can select.
Indeed, if a site offered URLs such as https://webrtc.example.org/?call=user that would automatically set up calls and transmit audio/video to user, it would be open for instance to the following abuse:
Users who have granted stored permissions to https://webrtc.example.org/ could be tricked into sending their audio/video streams to an attacker EvilSpy by following a link or being redirected to https://webrtc.example.org/?call=EvilSpy.
Although [RTCWEB-SECURITY-ARCH] Section 5.2 indicates that implementations may refuse all access permissions for HTTP origins, it recommends that implementations allow one-time camera/microphone access. While allowing one-time access for HTTP origins is convenient, this makes it possible for an attacker to obtain access to the camera/microphone of an unsuspecting user.
This section is non-normative.
Although new versions of this specification may be produced in the future, it is also expected that other standards will need to define new capabilities that build upon those in this specification. The purpose of this section is to provide guidance to creators of such extensions.
Any WebIDL-defined interfaces, methods, or attributes in the specification may be extended. Two likely extension points are defining a new media type and defining a new constrainable property.
At a minimum, defining a new media type would require changes relating to
the MediaStream interface,
the kind attribute on the MediaStreamTrack interface,
how HTMLMediaElement works with a MediaStream containing a track of the new media type (see 6. MediaStreams in Media Elements),
MediaDeviceKind, if the new type has enumerable devices,
the getCapabilities() and getUserMedia() descriptions, and
the MediaStreamConstraints dictionary.
Additionally, it should include updating
the label attribute on the MediaStreamTrack interface.
It might also include
a description, with respect to MediaStreamTrackState, of how such a track might become ended.
Defining a new constrainable property will require thinking through and defining how Constraints, Capabilities, and Settings for the property (see 3. Terminology) will work. The relevant text in MediaTrackSupportedConstraints, MediaTrackCapabilities, MediaTrackConstraints, MediaTrackSettings, 4.3.8 Constrainable Properties, and MediaStreamConstraints is the model to use.
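From a script's point of view, a property defined by an extension would be expected to follow the same pattern as the existing constrainable properties: check getSupportedConstraints(), consult getCapabilities(), apply a constraint, then read the result from getSettings(). The sketch below illustrates that pattern; the function name applyIfSupported is hypothetical, and the property name passed to it stands in for whatever an extension specification defines.

```javascript
// Sketch: apply an ideal value for a constrainable property only when the
// User Agent reports the property as supported. `mediaDevices` is assumed
// to behave like navigator.mediaDevices and `track` like a MediaStreamTrack.
async function applyIfSupported(mediaDevices, track, property, idealValue) {
  const supported = mediaDevices.getSupportedConstraints();
  if (!supported[property]) {
    return { applied: false, reason: 'unsupported' };
  }
  // Capabilities describe the range the source can offer, if it is known.
  const capabilities = track.getCapabilities ? track.getCapabilities() : {};
  const range = capabilities[property];
  let value = idealValue;
  if (range && typeof range.max === 'number' && value > range.max) {
    value = range.max; // stay within what the source reports
  }
  if (range && typeof range.min === 'number' && value < range.min) {
    value = range.min;
  }
  await track.applyConstraints({ [property]: { ideal: value } });
  // Settings report the value actually in effect after the change.
  return { applied: true, setting: track.getSettings()[property] };
}
```

Checking support first matters because a constraint dictionary member that the User Agent does not recognize is ignored rather than reported as an error.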
Creators of extension specifications are strongly encouraged to notify the Media Capture Task Force of their extension by emailing the list at public-media-capture@w3.org.
Future versions of this specification and others created by the Media Capture Task Force will take into consideration all extensions they are aware of in an attempt to reduce potential usage conflicts.
It is also likely that new consumers of MediaStreams or MediaStreamTracks will be defined in the future. The following section provides guidance.
At a minimum, any new consumer of a MediaStreamTrack will need to define how a MediaStreamTrack will render in the various states in which it can be, including muted and disabled (see 4.3.1 Life-cycle and Media Flow).
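As an illustration, a consumer might map the states of a track to its output as in the sketch below. The function name describeRendering and the returned labels are hypothetical. The sketch assumes, in line with 4.3.1 Life-cycle and Media Flow, that a muted or disabled track renders as silence for audio or black frames for video; what a consumer does with an ended track is left for that consumer's specification to define.

```javascript
// Sketch: decide what a consumer outputs for a track in a given state.
// `track` is assumed to expose readyState, muted, enabled and kind, as a
// MediaStreamTrack does.
function describeRendering(track) {
  if (track.readyState === 'ended') {
    return 'ended'; // the source will provide no further data
  }
  if (track.muted || !track.enabled) {
    if (track.kind === 'audio') return 'silence';
    if (track.kind === 'video') return 'black-frames';
    return 'zero-information-equivalent'; // for a newly defined media type
  }
  return 'live-media';
}
```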
This section will be removed before publication.
getSupportedConstraints() method.
The editors wish to thank the Working Group chairs and Team Contact, Harald Alvestrand, Stefan Håkansson, Erik Lagerway and Dominique Hazaël-Massieux, for their support. Substantial text in this specification was provided by many people including Jim Barnett, Harald Alvestrand, Travis Leithead, Josh Soref, Martin Thomson, Jan-Ivar Bruaroey, Peter Thatcher, Dominique Hazaël-Massieux, and Stefan Håkansson. Dan Burnett would like to acknowledge the significant support received from Voxeo and Aspect during the development of this specification.