Copyright © 2025 World Wide Web Consortium. W3C® liability, trademark and permissive document license rules apply.
Consider a Web application capturer which has used getDisplayMedia() to start capturing another display surface, capturee. This specification introduces a set of APIs that allow capturer the following new capabilities:
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C standards and drafts index at https://www.w3.org/TR/.
This document was published by the Web Real-Time Communications Working Group as an Editor's Draft.
Publication as an Editor's Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 03 November 2023 W3C Process Document.
Nearly all video-conferencing Web applications offer their users the ability to share display surfaces - typically a browser tab (browser), a native app's window (window), or an entire screen (monitor).
Many of these applications also show the local user a "preview tile" with a video of the captured display surface.
All these applications suffer from one key drawback - if the user wishes to interact with a captured display surface, the user must first switch to that surface, taking them away from the video-conferencing application. This presents a few issues:
It bears mentioning that Document Picture-in-Picture goes a long way towards addressing some of these issues. However, it is not always a suitable solution, as not all use cases are adequately addressed by a floating window which will often be small, which obscures arbitrary other content on the screen, and whose size and positioning must be manually controlled by the user.
This specification defines a policy-controlled feature identified by the string "captured-surface-control". Its default allowlist is "self".
The API surfaces introduced by this specification can be categorized as either read-access or write-access. Note that only the write-access APIs (forwardWheel, increaseZoomLevel, decreaseZoomLevel and resetZoomLevel) are gated by the "captured-surface-control" permissions policy.
We define a concept of an integer "zoom level" that can be applied to display surfaces of any type, and which is independent of the user agent and the platform. It is expected that in the case of browser display surfaces, this concept will match the concept of zoom level that user agents typically expose to the user.
For a given display surface of type surfaceType, we define the user agent's set of supported zoom levels for surfaceType as a non-empty set of integers including at least the default zoom level (100), and not including any integers less than 1.
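The three constraints on a set of supported zoom levels can be illustrated with a small check. This is a minimal sketch, not part of the specification; the helper name `isValidSupportedZoomLevels` and the constant `DEFAULT_ZOOM_LEVEL` are hypothetical names introduced for illustration only.

```typescript
// Default zoom level named in the definition above.
const DEFAULT_ZOOM_LEVEL = 100;

// Hypothetical helper: returns true when `levels` satisfies the stated
// constraints (non-empty, contains 100, only integers that are >= 1).
function isValidSupportedZoomLevels(levels: readonly number[]): boolean {
  if (levels.length === 0) {
    return false; // the set must be non-empty
  }
  if (levels.indexOf(DEFAULT_ZOOM_LEVEL) === -1) {
    return false; // the default zoom level must be included
  }
  // Every entry must be an integer, and none may be less than 1.
  return levels.every(
    (level) => isFinite(level) && Math.floor(level) === level && level >= 1,
  );
}
```

For example, `[50, 100, 200]` passes this check, while `[50, 200]` fails because it omits the default zoom level.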
We define the permitted event types for zoom-setting as a set composed of the following event types:
WebIDL
partial interface CaptureController {
  sequence<long> getSupportedZoomLevels();
  readonly attribute long? zoomLevel;
  Promise<undefined> increaseZoomLevel();
  Promise<undefined> decreaseZoomLevel();
  Promise<undefined> resetZoomLevel();
  attribute EventHandler onzoomlevelchange;
};
getSupportedZoomLevels()
This method allows applications to discover the set of zoom levels supported by the user agent.
When invoked, the user agent MUST run the following steps:
InvalidStateError
" DOMException
.
[[DisplaySurfaceType]]
.NotSupportedError
" DOMException
.
zoomLevel
This attribute allows applications to discover the captured display surface's zoom level.
On getting, the user agent MUST return this.[[ZoomLevel]].
increaseZoomLevel()
This method allows applications to set the captured display surface's zoom level one step higher than its current value.
When this method is invoked, the user agent MUST run the set zoom level algorithm with this as the controller and "increase" as the zoomAction.
decreaseZoomLevel()
This method allows applications to set the captured display surface's zoom level one step lower than its current value.
When this method is invoked, the user agent MUST run the set zoom level algorithm with this as the controller and "decrease" as the zoomAction.
resetZoomLevel()
This method allows applications to set the captured display surface's zoom level to 100.
When this method is invoked, the user agent MUST run the set zoom level algorithm with this as the controller and "reset" as the zoomAction.
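A capturing application might combine the read-access members described above to decide which of its own zoom controls to enable. The following is a minimal sketch under stated assumptions: `ZoomReadable` is a locally declared structural subset of CaptureController, and `zoomButtonState` and `ZoomButtonState` are hypothetical names that do not appear in this specification.

```typescript
// Structural subset of CaptureController, declared locally so that the
// sketch does not depend on browser type definitions.
interface ZoomReadable {
  getSupportedZoomLevels(): number[];
  readonly zoomLevel: number | null;
}

// Hypothetical shape describing which zoom buttons an application enables.
interface ZoomButtonState {
  canZoomIn: boolean;
  canZoomOut: boolean;
  canReset: boolean;
}

function zoomButtonState(controller: ZoomReadable): ZoomButtonState {
  const current = controller.zoomLevel;
  if (current === null) {
    // No zoom level is known (for example, capture has not started),
    // so getSupportedZoomLevels() is not consulted at all.
    return { canZoomIn: false, canZoomOut: false, canReset: false };
  }
  const levels = controller.getSupportedZoomLevels();
  return {
    canZoomIn: current < Math.max(...levels),
    canZoomOut: current > Math.min(...levels),
    canReset: current !== 100, // 100 is the default zoom level
  };
}
```

An application could recompute this state whenever a zoomlevelchange event fires, since the captured surface's zoom level may also change for reasons outside the application's control.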
onzoomlevelchange
An event handler IDL attribute whose event handler event type is zoomlevelchange.
Whenever this.[[Source]]'s zoom level changes to newZoomLevel, the user agent MUST queue a global task on the user interaction task source given the current realm's global object, which will run the following steps:
Set this.[[ZoomLevel]] to newZoomLevel.
Fire an event named zoomlevelchange at this.
Examples of causes include:
increaseZoomLevel().
WebIDL
partial interface CaptureController {
  constructor();
  Promise<undefined> forwardWheel(HTMLElement? element);
};
constructor
CaptureController's constructor is extended to also define and initialize the following internal slots:
Internal Slot | Initial value |
---|---|
[[ZoomLevel]] | null |
[[ForwardWheelElement]] | null |
[[ForwardWheelEventListener]] | null |
forwardWheel()
This method allows applications to automatically forward wheel events from an HTMLElement to the viewport of a captured display surface.
When invoked, the user agent MUST run the following steps:
DOMException
object whose name
attribute has the value
InvalidStateError
.
DOMException
object whose name
attribute has the value
InvalidStateError
.
[[DisplaySurfaceType]]
.DOMException
object whose name
attribute has the value NotSupportedError
.
Promise
.granted
", and the relevant global object does NOT have transient activation, then:
DOMException
object whose name
attribute has the value
InvalidStateError
.
This step ensures that, on the one hand, permission prompts are not shown without transient activation, while on the other hand, if the permission is already "granted", forwardWheel() may be called immediately after getDisplayMedia() resolves, even if the transient activation that permitted the call to forwardWheel() has since expired.
PermissionDescriptor
with its
name
member set to
"captured-surface-control". If the result of the request is
"denied
", then:
DOMException
object whose name
is NotAllowedError
.
[[ForwardWheelElement]]
is not null
,
remove an event listener with
this.[[ForwardWheelElement]]
as eventTarget and
this.[[ForwardWheelEventListener]]
as listener.
[[ForwardWheelEventListener]]
to null
.
[[ForwardWheelElement]]
to element.[[ForwardWheelElement]]
is not null
:
[[ForwardWheelEventListener]]
to an
event listener defined as follows:
wheel
EventListener
instance
representing a reference to a function of one argument of type Event
event. This function executes the forward wheel event algorithm
given this and event.
[[ForwardWheelElement]]
as eventTarget and
this.[[ForwardWheelEventListener]]
as listener.
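The listener bookkeeping at the end of the steps above (remove the previous listener, clear the slot, record the new element, attach a fresh listener) can be modelled in isolation. The sketch below is an illustration only and deliberately omits the capture-state, permission and transient-activation checks; `WheelTarget`, `WheelListener` and `WheelForwardingSlots` are hypothetical names, and the fields merely mirror [[ForwardWheelElement]] and [[ForwardWheelEventListener]].

```typescript
type WheelListener = (event: unknown) => void;

// Minimal stand-in for an element that can host a wheel listener.
interface WheelTarget {
  addEventListener(type: "wheel", listener: WheelListener): void;
  removeEventListener(type: "wheel", listener: WheelListener): void;
}

class WheelForwardingSlots {
  private element: WheelTarget | null = null; // models [[ForwardWheelElement]]
  private listener: WheelListener | null = null; // models [[ForwardWheelEventListener]]
  private forward: WheelListener;

  constructor(forward: WheelListener) {
    // `forward` stands in for the forward wheel event algorithm.
    this.forward = forward;
  }

  setElement(element: WheelTarget | null): void {
    // Detach the listener from the previously designated element, if any.
    if (this.element !== null && this.listener !== null) {
      this.element.removeEventListener("wheel", this.listener);
    }
    this.listener = null;
    this.element = element;
    // Attach a fresh listener when a new element was designated.
    if (this.element !== null) {
      const forward = this.forward;
      const listener: WheelListener = (event) => forward(event);
      this.listener = listener;
      this.element.addEventListener("wheel", listener);
    }
  }
}
```

In this model, passing null corresponds to calling forwardWheel(null): the old listener is removed and nothing new is attached, so forwarding stops.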
Extend the getDisplayMedia algorithm as follows:
Recall that p is the promise which the algorithm returns. Immediately before the step which resolves it, add the following steps:
null
and controller.[[DisplaySurfaceType]]
is a supported display surface type, then set
controller.[[ZoomLevel]]
to controller.[[Source]]'s zoom level.
To determine if a CaptureController controller is actively capturing, run the following steps:
[[Source]]
.null
, return false
.false
.
true
.
To determine if a CaptureController controller is self-capturing, run the following steps:
false
.[[Source]]
is a display surface of type
browser, and represents the relevant global object's
associated Document
, return true
.
false.
To determine if a display surface surfaceType is a supported display surface type, run the following steps:
true.
false.
Whether window should be supported is under discussion.
The set zoom level algorithm, given a controller of type CaptureController and a zoomAction of type DOMString as arguments, consists of running the following steps:
DOMException
object whose name
attribute has the value
InvalidStateError
.
DOMException
object whose name
attribute has the value
InvalidStateError
.
[[DisplaySurfaceType]]
.DOMException
object whose name
attribute
has the value NotSupportedError
.
Ensure that the code is running from within the context of an event handler which was triggered by the browser agent firing a trusted event, triggered by the user interacting with the user agent. To do so, run the following steps:
Window
.event
.undefined
, return a promise rejected with a
DOMException
object whose name
attribute has the value
InvalidStateError
.
isTrusted
is false
, return a promise
rejected with a DOMException
object whose name
attribute has the value InvalidStateError
.
type
is not in permitted event types for zoom-setting, return a promise rejected with a DOMException
object
whose name
attribute has the value InvalidStateError
.
It follows from these steps that increaseZoomLevel(), decreaseZoomLevel() and resetZoomLevel() are only callable with transient activation, because permitted event types for zoom-setting only contains event types that confer this activation.
In fact, our API shape implies a stronger guarantee - whereas transient activation persists for several seconds after the user action, the API shape here limits zoom-setting to immediately after the user's action.
[[Source]]
's zoom level
long
. Set its value as follows:If zoomAction is "decrease"
then:
DOMException
object whose
name
attribute has the value InvalidStateError
.
Else, if zoomAction is "increase"
then:
DOMException
object whose
name
attribute has the value InvalidStateError
.
Else:
"reset"
.100
.Promise
.Run the following steps in parallel:
PermissionDescriptor
with its
name
member set to
"captured-surface-control". If the result of the request is
"denied
", then:
DOMException
object whose name
is NotAllowedError
.
[[Source]]
's zoom level to targetZoomLevel.
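The selection of a target zoom level from zoomAction can be sketched as a pure function. This is an illustration under assumptions rather than the normative algorithm: it assumes that "increase" and "decrease" move to the nearest supported zoom level above or below the current one, and that InvalidStateError is the outcome when no such level exists. The names `ZoomAction` and `computeTargetZoomLevel` are hypothetical.

```typescript
type ZoomAction = "increase" | "decrease" | "reset";

// Returns the target zoom level, or the name of the DOMException that the
// returned promise would be rejected with (assumed condition: no further
// supported level in the requested direction).
function computeTargetZoomLevel(
  current: number,
  supported: readonly number[],
  action: ZoomAction,
): number | "InvalidStateError" {
  if (action === "reset") {
    return 100; // the default zoom level
  }
  // Copy before sorting so that the caller's array is left untouched.
  const sorted = supported.slice().sort((a, b) => a - b);
  if (action === "decrease") {
    const lower = sorted.filter((level) => level < current);
    return lower.length === 0 ? "InvalidStateError" : lower[lower.length - 1];
  }
  const higher = sorted.filter((level) => level > current);
  return higher.length === 0 ? "InvalidStateError" : higher[0];
}
```

The sketch covers only the choice of targetZoomLevel; the capture-state checks, event checks and permission request described in the steps above are left out.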
The forward wheel event algorithm takes a CaptureController controller and a WheelEvent event, and runs the following steps:
[[DisplaySurfaceType]]
.granted
", abort these steps.
isTrusted
is false
, abort these steps.offsetX
, event.offsetY
] and
this.[[ForwardWheelElement]]
.
"wheel"
using WheelEvent
with the x
attribute
initialized to scaledX, the y
attribute initialized to scaledY,
the deltaX
attribute initialized to event.deltaX and the
deltaY
attribute initialized to event.deltaY, at the
topmost event target.
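The guard conditions and the attribute mapping of the forward wheel event algorithm can be sketched as follows. This is a simplified illustration: it models only the permission check, the isTrusted check and the copying of coordinates and deltas, and it takes the coordinate scaling as a caller-supplied function. `WheelEventLike`, `ForwardedWheel` and `buildForwardedWheel` are hypothetical names, and returning null stands in for "abort these steps".

```typescript
// Only the WheelEvent members that the algorithm reads.
interface WheelEventLike {
  isTrusted: boolean;
  offsetX: number;
  offsetY: number;
  deltaX: number;
  deltaY: number;
}

// Values with which the forwarded wheel event would be initialized.
interface ForwardedWheel {
  x: number;
  y: number;
  deltaX: number;
  deltaY: number;
}

function buildForwardedWheel(
  event: WheelEventLike,
  permissionState: "granted" | "denied" | "prompt",
  scale: (x: number, y: number) => [number, number],
): ForwardedWheel | null {
  if (permissionState !== "granted") {
    return null; // abort: permission is not granted
  }
  if (!event.isTrusted) {
    return null; // abort: synthetic events are never forwarded
  }
  // Map the offsets within the element to the captured surface's viewport.
  const scaled = scale(event.offsetX, event.offsetY);
  return {
    x: scaled[0],
    y: scaled[1],
    deltaX: event.deltaX, // deltas are passed through unchanged
    deltaY: event.deltaY,
  };
}
```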
The scale element coordinates algorithm takes double coordinates [x, y] and a CaptureController controller, and runs the following steps:
Let scaleFactorX be (x / controller.[[ForwardWheelElement]].getBoundingClientRect().width).
Let scaleFactorY be (y / controller.[[ForwardWheelElement]].getBoundingClientRect().height).
Let surfaceWidth be controller.[[Source]]'s viewport's width.
Let surfaceHeight be controller.[[Source]]'s viewport's height.
Let scaledX be (scaleFactorX * surfaceWidth).
Let scaledY be (scaleFactorY * surfaceHeight).
Return [scaledX, scaledY].
This subroutine assumes that controller is actively capturing.
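The coordinate scaling can be written as a small pure function. This is a minimal sketch that assumes the vertical factor is computed from y and the element's height; `Size` and `scaleElementCoordinates` are hypothetical names, with `element` standing in for the bounding client rect of [[ForwardWheelElement]] and `surface` for the viewport of [[Source]].

```typescript
interface Size {
  width: number;
  height: number;
}

// Maps a point inside the forwarding element to the corresponding point
// in the captured surface's viewport by proportional scaling.
function scaleElementCoordinates(
  x: number,
  y: number,
  element: Size,
  surface: Size,
): [number, number] {
  const scaleFactorX = x / element.width; // horizontal fraction of the element
  const scaleFactorY = y / element.height; // vertical fraction of the element
  return [scaleFactorX * surface.width, scaleFactorY * surface.height];
}
```

For a 200 by 100 preview element and a 1920 by 1080 viewport, the point (50, 25) lies a quarter of the way along each axis and therefore maps to (480, 270).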
The API surfaces introduced in this specification allow a capturing application limited control over a captured application. These APIs allow the capturing application to gain access to additional pixels in the captured application. This specification employs multiple means to ensure that new capabilities are used in accordance with the user's intentions. Among these means:
A PermissionsPolicy called "captured-surface-control" is used.
forwardWheel() is designed such that only the user's scrolling over an Element can trigger scrolling in the captured application. This API shape ensures that the capturing application can only forward wheel events to the captured application at the time when the user agent dispatches the trusted wheel event on the capturing application itself.
increaseZoomLevel(), decreaseZoomLevel() and resetZoomLevel() are only callable from event handlers of specific event types - the permitted event types for zoom-setting. These are events dispatched directly by the user agent, triggered by user interaction. This specification intentionally excludes from this set such events as "mousemove", which users are liable to trigger inadvertently.
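The event-based gating of the zoom-setting methods can be sketched as a check on the event currently being handled. This is an illustration only: `EventLike` and `checkZoomTrigger` are hypothetical names, and the permitted event types are passed in as a parameter rather than hard-coded. The value `["click"]` in the usage note is an assumption made for the example.

```typescript
// Only the Event members that the checks read.
interface EventLike {
  readonly isTrusted: boolean;
  readonly type: string;
}

// Returns null when all checks pass, or the name of the DOMException that
// the returned promise would be rejected with.
function checkZoomTrigger(
  currentEvent: EventLike | undefined,
  permittedEventTypes: readonly string[],
): "InvalidStateError" | null {
  if (currentEvent === undefined) {
    return "InvalidStateError"; // not running inside an event handler
  }
  if (!currentEvent.isTrusted) {
    return "InvalidStateError"; // event was synthesized by script
  }
  if (permittedEventTypes.indexOf(currentEvent.type) === -1) {
    return "InvalidStateError"; // event type is not permitted for zoom-setting
  }
  return null;
}
```

Assuming a permitted set of `["click"]`, a trusted "click" passes, whereas a trusted "mousemove" or a script-dispatched "click" is rejected.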
The shape of forwardWheel() is intentionally chosen to limit the capturing application's control. The application designates a specific element; when the user scrolls over that element, the corresponding wheel events are forwarded to the captured application.
This specification does not limit the type of Element for which increaseZoomLevel(), decreaseZoomLevel(), resetZoomLevel() or forwardWheel() work. Such a limitation would accomplish nothing, because malicious applications could always overlay transparent permitted Element types on top of visible non-permitted Elements, thereby bypassing this restriction.
The limitation of interaction types is sufficient. This is accomplished by forwardWheel() through its shape, and by increaseZoomLevel(), decreaseZoomLevel() and resetZoomLevel() through their gating on event types.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key word MUST in this document is to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, it appears in all capitals, as shown here.