Copyright © 2014 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification extends the Media Capture and Streams specification [GETUSERMEDIA] to allow a depth stream to be requested from the web platform using APIs familiar to web authors.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is not complete and is subject to change. Early experimentation is encouraged to allow the Media Capture Task Force to evolve the specification based on technical discussions within the Task Force, implementation experience gained from early implementations, and feedback from other groups and individuals.
This document was published by the Device APIs Working Group and Web Real-Time Communications Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All comments are welcome.
Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Device APIs Working Group, Web Real-Time Communications Working Group) made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 August 2014 W3C Process Document.
This specification extends the MediaStream interface [GETUSERMEDIA] to enable it to also contain depth-based MediaStreamTracks. A depth-based MediaStreamTrack, referred to as a depth track, represents an abstraction of a stream of frames that can each be converted to objects which contain an array of pixel data, where each pixel represents the distance between the camera and the objects in the scene for that point in the array. A MediaStream object that contains one or more depth tracks is referred to as a depth stream.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MUST, MUST NOT, REQUIRED, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this specification are to be interpreted as described in [RFC2119].
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
The Constraints, MediaStreamConstraints, and MediaTrackConstraints dictionaries, and the MediaStreamTrack and MediaStream interfaces this specification extends, are defined in [GETUSERMEDIA]. The NavigatorUserMediaSuccessCallback callback is defined in [GETUSERMEDIA].
The ImageData interface and its data attribute are defined in [2DCONTEXT2].
The ArrayBuffer, ArrayBufferView, and Uint16Array types are defined in [TYPEDARRAY].
A depth stream is a MediaStream object that contains one or more depth tracks.
A depth track represents media sourced from a depth camera or other similar source.
Depth data represents the underlying depth data structure of an area of a canvas element.
MediaStreamConstraints dictionary
partial dictionary MediaStreamConstraints {
(boolean or MediaTrackConstraints) depth = false;
};
The depth attribute MUST return the value it was initialized to. When the object is created, this attribute MUST be initialized to false. If true, the attribute represents a request that the MediaStream object returned as an argument of the NavigatorUserMediaSuccessCallback contains a depth track. If a Constraints structure is provided, it further specifies the nature and settings of the depth track.
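The following non-normative sketch shows how a page might request a depth stream using the depth constraint; it assumes a user agent that implements this extension, and the getUserMedia entry point may still be vendor-prefixed in practice.

navigator.getUserMedia(
  { depth: true },
  function (mediaStream) {
    // NavigatorUserMediaSuccessCallback: if the request was satisfied,
    // the returned MediaStream contains a depth track.
    console.log('Received a depth stream:', mediaStream);
  },
  function (error) {
    // The request was denied or could not be satisfied.
    console.error('getUserMedia() failed:', error);
  }
);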
MediaStream interface
partial interface MediaStream {
sequence<MediaStreamTrack> getDepthTracks ();
};
The getDepthTracks() method, when invoked, MUST return a sequence of MediaStreamTrack objects representing the depth tracks in this stream.
The getDepthTracks() method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set whose kind is equal to "depth". The conversion from the track set to the sequence is user agent defined and the order does not have to be stable between calls.
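As a non-normative illustration, a page could inspect the depth tracks of a stream as follows; the logging is purely illustrative.

function logDepthTracks(mediaStream) {
  var depthTracks = mediaStream.getDepthTracks();
  for (var i = 0; i < depthTracks.length; i++) {
    // Every track in this sequence has kind "depth" (see below).
    console.log('depth track ' + i + ': id=' + depthTracks[i].id +
                ', kind=' + depthTracks[i].kind);
  }
}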
MediaStreamTrack interface
The kind attribute MUST, on getting, return the string "depth" if the object represents a depth track.
DepthData interface
Depth cameras usually produce 16-bit depth values per pixel. However, the canvas drawing surface used to draw and manipulate 2D graphics on the web platform does not currently support 16 bits per pixel.
To address the issue, this specification defines a new DepthData interface with pixel manipulation constructors, and extends the CanvasRenderingContext2D interface with methods that create, and interact with, DepthData objects.
DepthData
[Constructor(unsigned long sw, unsigned long sh),
 Constructor(Uint16Array data, unsigned long sw, optional unsigned long sh),
 Exposed=(Window,Worker)]
interface DepthData {
    readonly attribute unsigned long width;
    readonly attribute unsigned long height;
    readonly attribute Uint16Array data;
    readonly attribute CameraParameters parameters;
};
New DepthData objects MUST be initialised so that their width attribute is set to the number of entries per row in the depth data, their height attribute is set to the number of rows in the depth data, and their data attribute, except where an existing array is provided, is initialised to a new Uint16Array object. The Uint16Array object MUST use a new Canvas Depth ArrayBuffer for its storage, and MUST have a zero start offset and a length equal to the length of its storage, in bytes.
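A non-normative sketch of the two constructors defined above; the dimensions are arbitrary and the optional height argument is given explicitly here.

// Creates a 640x480 DepthData whose data is a new Uint16Array.
var fresh = new DepthData(640, 480);

// Wraps an existing Uint16Array as the data of a 640x480 DepthData.
var existing = new Uint16Array(640 * 480);
var wrapped = new DepthData(existing, 640, 480);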
A Canvas Depth ArrayBuffer is an ArrayBuffer whose data is represented in left-to-right order, row by row top to bottom, starting with the top left, with each pixel's depth component being given in that order for each pixel. Each depth component of each pixel represented in this array MUST be in the range 0..65535, representing the 16-bit value for that depth component. The depth components MUST be assigned consecutive indices starting with 0 for the top left pixel's depth component.
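Given that ordering, the depth component for the pixel at column x, row y of a DepthData object can be read as in the following non-normative sketch.

function depthAt(depthData, x, y) {
  // Row-major order with one 16-bit component per pixel:
  // index = row * width + column.
  return depthData.data[y * depthData.width + x];
}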
CameraParameters interface
What is the minimum set of metadata to expose? At minimum, the parameters of the general pinhole camera model have to be calculable. For related discussion, see the Focal length/fov capabilities and general camera intrinsics thread.
Each DepthData object is associated with a CameraParameters object. It represents the parameters of a pinhole camera model that describes the mathematical relationship between the coordinates of a 3D point and its projection onto the image plane.
CameraParameters
interface CameraParameters {
readonly attribute double focalLength;
readonly attribute double horizontalViewAngle;
readonly attribute double verticalViewAngle;
};
The focalLength attribute, on getting, MUST return the focal length of the camera in millimeters.
The horizontalViewAngle attribute, on getting, MUST return the horizontal angle of view in degrees.
The verticalViewAngle attribute, on getting, MUST return the vertical angle of view in degrees.
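As a non-normative illustration, the angles of view can be converted to focal lengths in pixels, which in turn allow a depth pixel to be back-projected to camera-space coordinates under the pinhole model. The sketch below assumes the principal point lies at the image centre and that depth values are already in the desired linear unit; neither assumption is stated by this specification.

function toCameraSpace(depthData, x, y) {
  var p = depthData.parameters;
  var toRadians = Math.PI / 180;
  // Focal lengths in pixels, derived from the angles of view.
  var fx = (depthData.width / 2) / Math.tan(p.horizontalViewAngle * toRadians / 2);
  var fy = (depthData.height / 2) / Math.tan(p.verticalViewAngle * toRadians / 2);
  var cx = depthData.width / 2;   // assumed principal point (x)
  var cy = depthData.height / 2;  // assumed principal point (y)
  var z = depthData.data[y * depthData.width + x];
  return { x: (x - cx) * z / fx, y: (y - cy) * z / fy, z: z };
}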
CanvasRenderingContext2D interface
partial interface CanvasRenderingContext2D {
    DepthData createDepthData (double sw, double sh);
    DepthData createDepthData (DepthData depthdata);
    DepthData getDepthData (double sx, double sy, double sw, double sh);
    void putDepthData (DepthData depthdata, double dx, double dy);
    void putDepthData (DepthData depthdata, double dx, double dy, double dirtyX, double dirtyY, double dirtyWidth, double dirtyHeight);
};
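The following non-normative sketch shows the intended shape of these methods; it assumes the depth track has already been rendered into the canvas (for example, by drawing a video element that is playing a depth stream), and the thresholding operation is purely an example.

var context = canvas.getContext('2d');

// Read the whole surface back as 16-bit depth values.
var depthData = context.getDepthData(0, 0, canvas.width, canvas.height);

// An arbitrary example transformation: zero out readings beyond a threshold.
for (var i = 0; i < depthData.data.length; i++) {
  if (depthData.data[i] > 2000) {
    depthData.data[i] = 0;
  }
}

// Write the modified depth pixels back at the same position.
context.putDepthData(depthData, 0, 0);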
Define the algorithms for the createDepthData(), getDepthData(), and putDepthData() methods.
We may want to file bugs against [2DCONTEXT2] to add extension points this specification can hook into to facilitate reuse of common algorithms and avoid monkey patching.
Thanks to everyone who contributed to the Use Cases and Requirements, and sent feedback and comments. Special thanks to Ningxin Hu for experimental implementations, as well as to the Project Tango team for their experiments.