RTC Accessibility User Requirements

Abstract

This document outlines various accessibility-related user needs, requirements and scenarios for real-time communication (RTC). These user needs should drive accessibility requirements in various related specifications and the overall architecture that enables it. It first introduces a definition of RTC as used throughout the document and outlines how RTC accessibility can support the needs of people with disabilities. It defines the term user needs as used throughout the document and then goes on to list a range of these user needs and their related requirements. Following that, some quality-related scenarios are outlined, and finally a data table maps the user needs contained in this document to related use case requirements found in other technical specifications.

This document is most explicitly not a collection of baseline requirements. It is also important to note that some of the requirements may be implemented at a system or platform level, and some may be authoring requirements.

The following outlines a range of user needs and requirements. The user needs have also been compared to existing use cases for real-time text (RTT) such as the IETF 'Framework for Real-Time Text over IP Using the Session Initiation Protocol (SIP)' RFC 5194 and the European Procurement Standard EN 301 549. [rtt-sip] [EN301-549]

User Need 1: A deaf or hard of hearing user needs to anchor or pin certain windows in an RTC application so both a sign language interpreter and the person speaking (whose speech is being interpreted) are simultaneously visible.
REQ 1a: Provide the ability to anchor or pin specific windows so the user can associate the sign language interpreter with the correct speaker.
REQ 1b: Allow the use of flexible pinning of captions or other related content alternatives. This may be to second screen devices.
REQ 1c: Ensure the source of any captions, transcriptions or other alternatives is clear to the user, even when second screen devices are used.
REQ 1d: Atomic pieces of data such as information regarding the person currently speaking, activities such as the people entering or leaving a meeting, or the last message posted in the chat channel, can be pinned to a user interface.
REQ 1e: For pinned content, there is a need to handle and support the metadata that allows the client engine to re-aggregate or re-route any pinned content.
REQ 1f: A user should have the ability to change the size of a window, especially windows showing sign language interpreters, certified deaf interpreters, and signing people.

Note

Not all atomic items are necessarily pinned next to other atomic elements, but some may be dependent, related or updated synchronously. For example, multiple atomic data points may be destined for an 80-character braille display that has been sectioned to display 4 atomic items in up to 19 cells each (leaving at least one blank cell for spacing).

Note

Here the term atomic relates to small pieces of data. For the purposes of accessibility conformance testing, the definitions and use of the terms 'atomic' and 'atomic rules' may also be useful. [applicability-atomic] [rule-types]
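The braille display example in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `layout_braille_line` and the sample items are hypothetical, and the sketch treats one text character as one braille cell, which ignores contracted braille and any indicator cells a real braille translator would add.

```python
def layout_braille_line(items, cells=80, sections=4, gap=1):
    """Lay out atomic items in fixed-width sections of one braille line.

    Assumption: one character occupies one cell (uncontracted output).
    """
    if len(items) > sections:
        raise ValueError("more items than sections")
    section_width = cells // sections      # 80 // 4 = 20 cells per section
    content_width = section_width - gap    # 19 cells of content, 1 blank cell
    parts = []
    for item in items:
        text = item[:content_width]              # truncate to the content budget
        parts.append(text.ljust(section_width))  # pad so the blank cell remains
    # Pad any unused sections so the line always fills the display.
    return "".join(parts).ljust(cells)


# Hypothetical atomic items: current speaker, latest arrival, last chat message, own status.
line = layout_braille_line(["Speaker: Ana", "Joined: Raj", "Chat: see agenda", "Muted"])
print(repr(line))
```

Each item starts at a fixed cell position (0, 20, 40, 60), so a reader can find a given piece of information in the same place every time it updates.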

User Need 2: A deaf or hard of hearing user may need captioning of content to be private in a meeting or presentation.
REQ 2a: Ensure there is a host-operable toggle in the captioning service (whether human or automated) that facilitates going on and off record for the preserved transcript, but continues to provide captions meanwhile for 'off record' conversations.
REQ 2b: Ensure the toggle between saving recordings also applies to the saving of captions. There should be a mechanism by which both audio and captions can be paused or stopped, and both can be simultaneously restored for recording.
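One way to read REQ 2a and REQ 2b together is as a single switch that governs what is saved, while live captions keep flowing regardless. The following Python sketch models that behaviour. The class `CaptionedSession`, its fields and its method names are hypothetical and are not taken from any RTC platform or specification.

```python
from dataclasses import dataclass, field


@dataclass
class CaptionedSession:
    """Hypothetical model of a session whose recording can go off record."""
    on_record: bool = True
    displayed: list = field(default_factory=list)       # captions shown live
    transcript: list = field(default_factory=list)      # captions preserved
    recorded_audio: list = field(default_factory=list)  # audio chunks preserved

    def set_on_record(self, on_record: bool) -> None:
        # One host-operable switch governs both audio and caption saving (REQ 2b).
        self.on_record = on_record

    def handle(self, audio_chunk: str, caption: str) -> None:
        # Captions are always displayed, even while off record (REQ 2a).
        self.displayed.append(caption)
        if self.on_record:
            self.recorded_audio.append(audio_chunk)
            self.transcript.append(caption)


session = CaptionedSession()
session.handle("audio-1", "Welcome, everyone.")
session.set_on_record(False)
session.handle("audio-2", "A private aside.")
session.set_on_record(True)
session.handle("audio-3", "Back on record.")
print(session.displayed)   # all three captions were shown live
print(session.transcript)  # only the on-record captions were preserved
```

Keeping audio and caption saving behind the same flag means the two cannot drift apart, so a paused recording never leaves a stray transcript of the off-record conversation.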

User Need 3: A user may need to change device or environment and have their accessibility user preferences preserved.
REQ 3a: Ensure user profiles and accessibility preferences in RTC applications are mobile and can move with the user as they change device or environment.

User Need 4: A screen-reader user or user with a cognitive impairment needs to know a call is incoming and needs to recognise the ID of the caller. A deaf or hard of hearing user may also need to identify an incoming relay call.
REQ 4a: Provide indication of incoming calls in an unobtrusive way via a symbol set or other browser notification.
REQ 4b: Alert assistive technologies via relevant APIs.
REQ 4c: Support the presentation and display of call prefix information for relay calls.

Note

Successful design of operations required for acting on incoming calls, getting informed about who the caller is and connecting relay services should not require complicated sequences of user actions.

User Need 5: A user of speech and Augmentative and Alternative Communication (AAC), or a blind user of screen reader and braille output devices simultaneously, will need to manage audio and other output separately.
REQ 5a: Provide or support a range of browser level audio output options.
REQ 5b: Allow controlled routing of alerts and other browser output to a braille device or other hardware.

User Need 6: A deaf user needs to move parts of a live teleconference session (as separate streams) to one or more devices for greater control.
REQ 6a: Allow the separate routing of video streams, such as captioning or a sign language interpreter, to a separate high-resolution display.

User Need 7: Users with cognitive disabilities or blind users may have relative volume levels set as preferences that relate to importance, urgency or meaning.
REQ 7a: Allow the panning or setting of relative levels of different audio output.
REQ 7b: Support multichannel audio in the browser.
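As an illustration of REQ 7a, the sketch below combines a relative level with a stereo pan position for each audio source. It uses the common equal-power panning law, which is an assumption made for this example rather than anything the requirement prescribes. The function names, the source names and the preference values are all hypothetical.

```python
import math


def equal_power_pan(pan: float) -> tuple:
    """Return (left, right) gains for pan in [-1.0, 1.0], where -1.0 is hard left."""
    pan = max(-1.0, min(1.0, pan))
    angle = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)


def mix_gains(level: float, pan: float) -> tuple:
    """Combine a relative level (0.0 to 1.0) with a pan position."""
    level = max(0.0, min(1.0, level))
    left, right = equal_power_pan(pan)
    return level * left, level * right


# Hypothetical user preferences: (relative level, pan) per audio source.
preferences = {
    "screen_reader": (1.0, -1.0),  # full level, hard left
    "conference": (0.6, 1.0),      # quieter, hard right
    "alerts": (0.8, 0.0),          # centred
}
gains = {name: mix_gains(level, pan) for name, (level, pan) in preferences.items()}
for name, (left, right) in gains.items():
    print(f"{name}: left={left:.3f} right={right:.3f}")
```

Separating sources in the stereo field, in addition to setting their relative levels, lets a user tell a screen reader apart from conference audio by position as well as by volume.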

User Need 8: A user may struggle to hear audio description in a live teleconferencing situation.
REQ 8a: Ensure Audio Description (AD) recommended sound values are dynamic and have independent volume, EQ adjustment and routing capability.
REQ 8b: Support a user's custom EQ profile.
REQ 8c: If not transmitted in a live screen share - ensure the platform doesn't strip captions or descriptions that may have been part of the original video.

Note

Moving beyond mono in this context is also important, as the stereo spread allows audio descriptions to be sound staged. Applications should also inherit customization settings from the user's operating system.

User Need 9: Any user watching captioning or audio description needs to be confident it is synchronised and accurate.
REQ 9a: Ensure that any outages or loss to captioning or audio description will be repaired while preserving context and meaning.
REQ 9b: Ensure the integrity of related alternate supporting tracks or streams, such as transcriptions, so that they remain in sync with any repairs.

User Need 10: A deaf user needs to simultaneously talk on a call, send and receive real-time text and/or instant messages via a text interface and watch sign language using a video stream.
REQ 10a: Ensure support for multiple simultaneous streams.

Note

This user need may also indicate necessary support for 'Total conversation' services as defined by ITU in WebRTC applications. These are combinations of voice, video, and RTT in the same real-time session. [total-conversation]

User Need 11: In an emergency situation an Augmentative and Alternative Communication (AAC) user, deaf, speech impaired, hard of hearing or deaf blind user needs to make an emergency call, instantly send and receive related text messages and/or sign via a video stream.
REQ 11a: Provide or ensure support for RTT in WebRTC.
REQ 11b: Avoid the problem of unsent emergency messages. A user may not be aware when they have not successfully sent an emergency message. For example, RTT avoids this problem due to instantaneous data transfer but this may be an issue for other messaging methods or platforms.

User Need 12: A deaf, speech impaired, or hard of hearing user needs to communicate on a call using a video remote interpretation (VRI) service to access sign language and interpreter services.
REQ 12a: Provide or ensure support for video relay services (VRS) and remote interpretation services. This user need may relate to interoperability with third-party services; IETF has looked at standardizing a way to use Session Initiation Protocol (SIP) with VRS services. [ietf-relay]
REQ 12b: Provide VRS and VRI support for different specified sign languages and various spoken language translations. A user may also need to stream or pin both.
REQ 12c: Ensure that privacy and security options are maintained when using relay services.

Note

Successfully connecting to video or text relay services should not require a complicated sequence of user actions.

User Need 13: A deaf or deaf blind user needs to tell the difference between incoming text and outgoing text.
REQ 13a: Ensure that, when used with RTT functionality, WebRTC handles the routing of this information to a format or output of the user's choosing.

User Need 14: In a teleconference a user needs to know what participants are on the call, as well as their status.
REQ 14a: Ensure participant details, such as name and status (whether the person is muted or talking), are accessible to users of assistive technologies.
REQ 14b: Ensure participant metadata, such as their name, their affiliation or other relevant information, is correctly associated with the meeting record and can be preserved for review after the call. This should be done with the participants' consent.

User Need 15: A deaf user or user with a cognitive disability needs to access a channel containing live transcriptions during a conference call or broadcast.
REQ 15a: Honor user preferences relating to captioned content. Provide support for signing or use of symbol sets e.g. Augmentative and Alternative Communication (AAC).

User Need 16: Users with cognitive disabilities may need assistance when using audio or video communication.
REQ 16a: Ensure a WebRTC video call can host a technical or user support channel.
REQ 16b: Provide support that is customised to the needs of the user. This may be via a relay service or speech-to-speech relay service.

User Need 17: Users with cognitive disabilities may need to use symbol sets or AAC for identifying functions available in a WebRTC enabled client for voice, file or data transfer.
REQ 17a: Provide personalization support for symbol set replacements of the existing user interface rendering of current functions or controls.

Note

This relates to cognitive accessibility requirements. For related work at W3C see the 'Personalization Semantics Content Module 1.0' and 'Media Queries Level 5'. [personalization] [media-queries]

User Need 18: To translate text-to-speech interactions into comprehensible speech, a blind screen reader user depending on text to speech (TTS) to interact with their computers and smart devices needs a traditional Internet relay chat (IRC) style interface.
REQ 18a: Preserve IRC as a configuration option in user agents that implement WebRTC as opposed to having only the real-time text type interface. RTT is favoured by users who are deaf or hearing impaired. For screen reader users, TTS cannot reasonably translate text into comprehensible speech unless characters are transmitted in very close timing to one another. Typical gaps will result in stuttering and highly unintelligible speech output from the TTS engine.

Note

Some braille users will also prefer the RTT model. However, braille users desiring text displayed with standard contracted braille might better be served in the manner users relying on TTS engines are served, by buffering the data to be transmitted until an end of line character is reached.
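The buffering described in the note above, where text is held until an end of line character is reached, could be sketched as follows. The `LineBuffer` class is hypothetical and is not part of WebRTC or any RTT specification. It only illustrates how character-by-character input can be released as whole lines so that a TTS engine or contracted braille translator receives complete phrases.

```python
class LineBuffer:
    """Collect characters that arrive one at a time and release whole lines."""

    def __init__(self):
        self._pending = []

    def feed(self, chunk: str) -> list:
        """Add received text and return any lines completed by this chunk."""
        lines = []
        for char in chunk:
            if char == "\n":
                # End of line reached: release everything buffered so far.
                lines.append("".join(self._pending))
                self._pending.clear()
            else:
                self._pending.append(char)
        return lines

    def pending(self) -> str:
        """Return the text still waiting for an end of line character."""
        return "".join(self._pending)


buffer = LineBuffer()
for chunk in ["Hel", "lo the", "re\nHow ", "are you?\n"]:
    for completed in buffer.feed(chunk):
        print("speak:", completed)
```

A user agent could offer this as a configuration option alongside character-by-character RTT, so that each user chooses the delivery model that suits their output device.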

The following is a list of new user needs and requirements since the publication of the previous working draft:

Window anchoring and pinning: A deaf or hard of hearing user needs to anchor or pin certain windows in an RTC application so both a sign language interpreter and the person speaking (whose speech is being interpreted) are simultaneously visible.
REQ 1a: Provide the ability to anchor or pin specific windows so the user can associate the sign language interpreter with the correct speaker.
REQ 1b: Allow the use of flexible pinning of captions or other related content alternatives. This may be to second screen devices.
REQ 1c: Ensure the source of any captions, transcriptions or other alternatives is clear to the user, even when second screen devices are used.
REQ 1d: Atomic pieces of data such as information regarding the person currently speaking, activities such as the people entering or leaving a meeting, or the last message posted in the chat channel, can be pinned to a user interface.
REQ 1e: For pinned content, there is a need to handle and support the metadata that allows the client engine to re-aggregate or re-route any pinned content.

Pause 'on record' captioning in RTC: A deaf or hard of hearing user may need captioning of content to be private in a meeting or presentation.
REQ 2a: Ensure there is a host operable toggle in the captioning service (whether human or automated) that facilitates going on and off record for the preserved transcript, but continues to provide captions meanwhile for 'off record' conversations.
REQ 2b: Ensure the toggle between saving recordings also applies to the saving of captions. There should be a mechanism that both audio and captions can be paused or stopped, and both can be simultaneously restored for recording.

Accessibility user preferences and profiles: A user may need to change device or environment and have their accessibility user preferences preserved.
REQ 3a: Ensure user profiles and accessibility preferences in RTC applications are mobile and can move with the user as they change device or environment.

The following is a list of updated requirements to existing user needs:

Incoming calls and caller ID - REQ 4c: Support the presentation and display of call prefix information for relay calls.
Audio description in live conferencing - REQ 8b: Support a user's custom EQ profile.
Audio description in live conferencing - REQ 8c: If not transmitted in a live screen share - ensure the platform doesn't strip captions or descriptions that may have been part of the original video.
Emergency calls and RTT - REQ 11b: Avoid the problem of unsent emergency messages. A user may not be aware when they have not successfully sent an emergency message. For example, RTT avoids this problem due to instantaneous data transfer but this may be an issue for other messaging methods or platforms.
Video relay services (VRS) and video remote interpretation (VRI) - REQ 12b: Provide support for other sign languages and translations. For example, VRS calls may be made between a sign language user and a person speaking another language. There are variations in signing itself, such as Irish Sign Language (ISL), which is related to French Sign Language, and British Sign Language (BSL). A user may need to stream or pin both.
Video relay services (VRS) and video remote interpretation (VRI) - REQ 12c: Ensure that privacy and security options are maintained when using relay services.

The following are other changes in this document:

Changed the title of 'Dynamic audio description values in live conferencing' to 'Audio description in live conferencing'.
New note on the relationship between RTC and XR Accessibility User Requirements.
New note on personalization semantics and CSS media queries.
Moved 'User Need 19: A deaf user watching a signed broadcast needs a high-quality frame rate to maintain legibility and clarity in order to understand what is being signed' to the 'Quality of service issues' section.
Added note on ITU definition of Total Conversation services that relates to 'REQ 10a: Ensure support for multiple simultaneous streams'.

Note

This document has been updated based on document feedback, discussion and Research Questions Task Force consensus.

Abstract

Status of This Document

Introduction

What is real-time communication (RTC)?

1. Real-time communication and accessibility

2. User needs definition

3. User needs and requirements

3.1 Window anchoring and pinning

3.2 Pause 'on record' captioning in RTC

3.3 Accessibility user preferences and profiles

3.4 Incoming calls and caller ID

3.5 Routing and communication channel control

3.6 Audio description in live conferencing

3.7 Quality synchronisation and playback

3.8 Simultaneous voice, text & signing

3.9 Emergency calls: Support for Real-Time Text (RTT)

3.10 Text and Video relay services (VRS)

3.11 Distinguishing sent and received text with RTT

3.12 Call participants and status

3.13 Captioning support

3.14 Assistance for users with cognitive disabilities

3.15 Personalized symbol sets for users with cognitive disabilities

3.16 Internet relay chat (IRC) style interfaces

4. Relationship between RTC and XR Accessibility

5. Quality of service scenarios

5.1 Deaf users: Video resolution and frame rates

5.2 Audio frequency bandwidth

6. Quality requirements for video

A. Change Log

B. Acknowledgments

B.1 Participants of the APA working group active in the development of this document:

B.2 Previously Active Participants, Commenters, and Other Contributors

C. Enabling funders

D. References

D.1 Informative references