The EPUB Accessibility Metadata Guide details when to set the Schema.org accessibility metadata properties in the EPUB package document.

This guide only details usage for EPUB publications. The guidance in this document may not be relevant to other formats.

Introduction

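Schema.org [[schema-org]] is described as a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data. Of particular importance is that it includes a set of accessibility properties for describing creative works, among them accessMode, accessModeSufficient, accessibilityFeature, accessibilityHazard, accessibilitySummary, accessibilityAPI, and accessibilityControl.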

These properties are key to the discoverability of EPUB publications and are central to the requirements of the EPUB Accessibility standard [[epub-a11y]]. They are also used to generate accessibility statements about a publication, as described in the Accessibility Metadata Display Guide for Digital Publications.

To provide maximum value for readers, the metadata needs to be consistent, which is where the Schema.org Accessibility Properties for Discoverability Vocabulary [[a11y-discov-vocab]] comes in. This vocabulary defines controlled sets of values to use with each property (with the exception of summaries, which are free-form text).

Even with a controlled vocabulary, however, the value definitions do not always fully reflect how to apply them to EPUB publications, because Schema.org is intended to describe any resource on or referenced from the web. That is where this guide fits in. Its goal is to add clarity on how to apply the metadata to both reflowable and fixed-layout EPUB publications.

Access modes

Overview

An access mode is defined as a "human sense perceptual system or cognitive faculty through which a user may process or perceive the content of a digital resource." [[iso24751-3]] For example, if an EPUB publication contains images and video, visual perception is required to consume the content exactly as it was created.

There are four access modes that are typically specified for EPUB publications:

General application guidelines

Access modes are set in the [[schema-org]] accessMode property. Repeat the property for each access mode.

For a user to determine whether an EPUB publication is suitable for their needs, they need to know which access modes are required to consume the content, so list every applicable access mode.

Do not list access modes for content that does not contain information necessary to understand a publication. Most EPUB publications contain cover images, for example, but it is not necessary to see the cover image to read the publication. The same is true of publisher logos and of images in the content that are only for presentational purposes (i.e., they convey no information and so have empty [[html]] alt attribute values). If these are the only visual content, then it is not valid to list a visual access mode. Similarly, if the only audio an EPUB publication contains is background music (e.g., for an instructional video with text captions, or as mood music while reading), listing an auditory access mode is not valid.

Note that the access modes of the content do not reflect any adaptations that have been provided. For example, if a comic book also includes alternative text for each image, it does not have a textual access mode. See the following section on sufficient access modes for how to indicate that the available adaptations allow the content to be consumed in another mode.
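
As a minimal sketch of this point, a comic book whose images all carry alternative text might still declare only a visual access mode, with the alternative text reported as a feature rather than as an access mode. The fragment assumes an EPUB 3 package document; the alternativeText value is taken from the accessibilityFeature vocabulary [[a11y-discov-vocab]].

```xml
<!-- Illustrative fragment: the content itself is image-based -->
<meta property="schema:accessMode">visual</meta>

<!-- The adaptation is reported as a feature, not as a textual access mode -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
```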

Setting values

Primary access modes

Auditory

Indicates that the resource contains information encoded in auditory form.

This value is not set when the auditory content conveys no information. For example, an instructional video might include background music while all the necessary information to complete the task is conveyed visually and/or through text captions.

Textual

Indicates that the resource contains information encoded in textual form.

This value is not set if the only textual content is for navigational purposes. For example, an audiobook might include a table of contents, but it is not necessary to read the table of contents to read the work. Likewise, books with synchronized text-audio playback may only include headings to allow structured navigation.

Visual

Indicates that the resource contains information encoded in visual form.

This value is not set if the only visual imagery is presentational or not directly relevant to understanding the content.

If the only images within the EPUB are cover images, author photos, corporate logos, or decorative images, do not include the accessMode value "visual" in the metadata.

Visual content indicators

Text on visual

Indicates that the resource contains text encoded in visual form.

Synchronized text and audio

Setting the correct access modes and sufficient access modes for EPUB 3 publications that contain synchronized text-audio playback requires evaluating whether playback is essential to reading the publication or an additional feature.

EPUB 3's media overlays [[epub-3]] allow EPUB creators to synchronize the full text of a publication with full audio narration. These types of publications are commonly referred to as "read aloud" books, as the user chooses whether or not to turn on the narration (unlike traditional audiobooks where only the audio is available).

In this case, because the audio playback is an extra feature, EPUB creators would not list "auditory" as an access mode. Rather, they would indicate the presence of text and audio synchronization as an accessibility feature:

<meta
    property="schema:accessibilityFeature">
   synchronizedAudioText
</meta>

Although the audio is not essential to reading the publication, having full audio playback capability means that there is an auditory sufficient access mode — the user can listen to the complete publication. Consequently, the EPUB creator would declare a schema:accessModeSufficient property with the value auditory:

<meta
    property="schema:accessModeSufficient">
   auditory
</meta>

When media overlays are present, EPUB creators should not add "auditory" to all the possible sufficient access modes just because it is possible to turn on text-audio playback.

For example, a publication with media overlays that has text and images (with text alternatives) would only declare the following sufficient access modes: "textual", "auditory", and "textual,visual".

It would not declare "textual,auditory" or "textual,visual,auditory".
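
A sketch of how the metadata for this read-aloud example might look in an EPUB 3 package document follows. The accessMode and alternativeText entries are assumptions inferred from the description (text plus images with text alternatives); they are not stated explicitly in the example above.

```xml
<!-- Default encoding of the content: text and images (assumed) -->
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>

<!-- Media overlays and image alternatives reported as features -->
<meta property="schema:accessibilityFeature">synchronizedAudioText</meta>
<meta property="schema:accessibilityFeature">alternativeText</meta>

<!-- Sufficient access modes: "auditory" appears only as its own set -->
<meta property="schema:accessModeSufficient">textual</meta>
<meta property="schema:accessModeSufficient">auditory</meta>
<meta property="schema:accessModeSufficient">textual,visual</meta>
```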

Another use for media overlays — more typical among accessible republishers such as libraries that serve blind and low vision readers — is to provide full audio synchronized to the major headings of a publication. These types of publications are more like traditional audiobooks, as all the information in the work is typically available in auditory form. The minimal text does not allow meaningful visual reading or text-to-speech playback; it is only to provide structured navigation capabilities.

In this case, being able to hear the audio is essential to being able to read the publication, so EPUB creators will list an auditory access mode:

<meta
    property="schema:accessMode">
   auditory
</meta>

In a reverse of the first case discussed, EPUB creators will not identify text-audio synchronization as a feature of the publication since the amount of text provided is trivial.

Likewise, since the text in the publication only provides heading navigation, EPUB creators will not list a textual access mode. The user does not need to, and is not expected to, visually read this text.
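
Putting these points together, a minimal sketch of the access metadata for such an audio-first publication might look like the following. The accessModeSufficient entry is an assumption that follows from all the information being available in auditory form; the guidance above does not show it for this case.

```xml
<!-- Audio is essential to reading the publication -->
<meta property="schema:accessMode">auditory</meta>

<!-- Assumed: the complete work can be consumed by listening -->
<meta property="schema:accessModeSufficient">auditory</meta>

<!-- No textual access mode and no synchronizedAudioText feature are declared,
     because the headings exist only for structured navigation -->
```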

Some publications that provide the body in auditory form may include the backmatter in text form, allowing users to use text-to-speech playback to render it. In this case, there would be a textual component.

Sufficient access modes

Overview

The access modes sufficient to consume an EPUB publication express a broader picture of the potential usability than do the basic access modes. Where the basic access modes identify the default nature of the media used in the publication, sufficient access modes identify all the individual modes, and sets of modes, that allow a user to read a publication. Sufficient access modes account for the affordances and adaptations that the EPUB creators have provided, allowing users to determine whether they can read the content regardless of its default nature.

General application guidelines

Sufficient access modes are identified in the [[schema-org]] accessModeSufficient property. Repeat the property for each set of sufficient access modes.

The most strongly recommended sufficient access modes to list are the ones that consist of only a single value. Users seeking alternatives to the default encoding of the information typically want to read the content without switching reading modes (e.g., they want a purely textual alternative to use text-to-speech playback or an auditory alternative to listen to prerecorded narration). Listing the single value sets allows users to easily determine whether a publication will meet their reading needs.

The most common single sufficient access modes for EPUB publications are:

As an example of setting sufficient access modes, consider an EPUB publication that contains graphics and charts, as well as descriptions for all these images. The publication has both textual and visual content, so the EPUB creator will include the following schema:accessMode metadata entries to indicate this:

<meta
    property="schema:accessMode">
   textual
</meta>
<meta
    property="schema:accessMode">
   visual
</meta>

This metadata does not make clear whether a textual access mode is sufficient to read the entire publication, or whether a visual one is, only that the user requires the ability to read in those two modes by default. This discrepancy is why sufficiency is also important to know.

Since the EPUB creator has also included textual alternatives and/or descriptions for all the images in the example publication, the metadata can also indicate that a purely textual access mode is sufficient to read the content:

<meta
    property="schema:accessModeSufficient">
   textual
</meta>

Without this metadata, users would not have known that they could read the publication only via its textual content.

When listing individual values in the schema:accessModeSufficient property, do not simply restate each individual access mode. A schema:accessModeSufficient property with only a single value means that all information in the publication is available in that mode.

For example, the EPUB creator would not declare a sufficient access mode of "visual" for the example publication as the information is not entirely available in image-based form. (Photo books without text would be examples of works with a purely visual sufficient access mode.)

EPUB creators may also list any sets of sufficient access modes that allow full access to the information. As the most typical set of values is the combination of all the access modes, however, the information this provides users is less helpful in determining the usability of a publication.

For example, the following entry for the example publication restates that there is a combined textual and visual reading option:

<meta
    property="schema:accessModeSufficient">
   textual,visual
</meta>

The order in which EPUB creators list the access modes in a set is not important. The only requirement is to separate the values by commas.

The complete set of schema:accessMode and schema:accessModeSufficient entries for the example publication is as follows:

<meta
    property="schema:accessMode">
   textual
</meta>
<meta
    property="schema:accessMode">
   visual
</meta>
<meta
    property="schema:accessModeSufficient">
   textual,visual
</meta>
<meta
    property="schema:accessModeSufficient">
   textual
</meta>

Note that sufficiency of access is often a subjective determination of the EPUB creator based on their understanding of what information is essential to comprehending the text. Some information loss occurs by not being able to view a video, for example, but the EPUB creator might regard the visual or auditory losses as inconsequential if a transcript provides all the necessary information to understand the concepts being conveyed.

Refer to The accessModeSufficient Property [[a11y-discov-vocab]] for more information about this property and its values.

The accessModeSufficient property, as defined in [[schema-org]], allows more complicated expressions than can be represented in the EPUB 2 or 3 package document (e.g., definition of lists of values and inclusion of a human-readable description). A future version of EPUB might allow for richer metadata, but the basic expression shown in this section is sufficient for discovery purposes.
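
For comparison, the same basic expression can be sketched for both package document versions. This assumes the usual conventions: EPUB 3 uses the property attribute with the value as element content, while EPUB 2 uses the OPF name and content attributes. The value shown is only illustrative.

```xml
<!-- EPUB 3 package document -->
<meta property="schema:accessModeSufficient">textual,visual</meta>

<!-- EPUB 2 package document -->
<meta name="schema:accessModeSufficient" content="textual,visual"/>
```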

Setting values

Single reading modes

A single-value accessModeSufficient entry implies that the entire content of the EPUB can be consumed using only that access mode.

Visual-only readability

Indicates that only visual perception is necessary to consume the information.

An example of this would be a children's picture book.

Textual-only readability

Indicates that the ability to read textual content is necessary to consume the information.

Note that reading textual content does not require visual perception, as textual content can be rendered as audio using a text-to-speech capable device or embossed as braille for tactile reading.

An example of this would be a romance novel.

Auditory-only readability

Indicates that auditory perception is necessary to consume the information.

An example of this would be an audiobook.

Combined reading modes

Combining several values in one accessModeSufficient entry implies that a combination of access modes is required to completely consume and understand the content within the EPUB.

Visual & textual readability

Indicates that both visual perception and the ability to read textual content are necessary to consume the information.

An example of this would be a cookbook.

Visual & auditory readability

Indicates that both visual perception and the ability to hear the content are necessary to consume the information.

An example of this would be a narrated picture book.

Auditory & textual readability

Indicates that both the ability to hear the content and the ability to read textual content are necessary to consume the information.

Examples of this would be an audiobook with synchronized text highlighting, or an interactive dictionary where you can hear the pronunciation of the defined words.

Accessibility features

Overview

Identifying all the accessibility features and adaptations included in an EPUB publication allows users to determine whether the content is usable at a more fine-grained level than the access modes do.

For example, a math textbook might have a textual access mode, but that alone does not indicate whether MathML markup is available. Whether a visual work only provides alternative text or whether it includes extended descriptions is also important to know when gauging its usability.
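
As a rough illustration, the features of such a math textbook might be declared as follows. The fragment assumes a hypothetical title that uses MathML for its equations and provides both alternative text and extended descriptions for its figures; the values come from the accessibilityFeature vocabulary [[a11y-discov-vocab]].

```xml
<!-- Access modes alone do not reveal how the math or figures are encoded -->
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>

<!-- Features give the finer-grained picture (hypothetical publication) -->
<meta property="schema:accessibilityFeature">MathML</meta>
<meta property="schema:accessibilityFeature">alternativeText</meta>
<meta property="schema:accessibilityFeature">longDescription</meta>
```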

General application guidelines

Accessibility features are identified in the [[schema-org]] accessibilityFeature property. Repeat this property for each feature.

The EPUB format requires that some accessibility features will always be present (e.g., a table of contents). Do not exclude these features from the accessibility metadata, as users typically are not aware of what features are built into a format. Failing to include entries will reduce the discoverability of the publication when users search for specific features.
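
A minimal sketch of declaring a feature that the format already guarantees, using the tableOfContents value from the vocabulary:

```xml
<!-- Required by the format, but still declared so that feature searches find the publication -->
<meta property="schema:accessibilityFeature">tableOfContents</meta>
```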

Be aware that although the vocabulary for the accessibilityFeature property [[a11y-discov-vocab]] contains the values "none" and "unknown", these terms cannot be used to meet the reporting requirements for the property. Authors must indicate at least one feature that is not one of these values to claim conformance to EPUB Accessibility 1.1.1 [[epub-a11y-111]].

Refer to The accessibilityFeature Property [[a11y-discov-vocab]] for more information about this property and its values.

Setting values

Organization of values

The accessibilityFeature property provides a list of all the applicable accessibility characteristics of the content. It allows a user agent to discover these characteristics without having to parse or interpret the structure of the content.

For ease of reading, this section splits the vocabulary into the following distinct groups:

Structure and navigation terms identify navigation aids that are provided to simplify moving around within the media, such as the inclusion of a table of contents or an index.

Adaptation terms identify content features that provide alternate access to a resource. The inclusion of alternative text in an [[html]] alt attribute is one of the most commonly identifiable augmentation features.

Rendering control terms identify content rendering features that users have access to or can control. The ability to modify the appearance of the text is one example.

Specialized markup terms identify that content is encoded using domain-specific grammars like MathML and ChemML that can provide users a richer reading experience.

Clarity terms identify ways that the content has been enhanced for clearer readability. Audio with minimized background noise is one example, while content formatted for large print reading is another.

Tactile terms identify content that is formatted for tactile use, such as graphics and objects.

Internationalization terms identify those accessibility characteristics of the content which are required for internationalization.

Structure and navigation terms

ARIA

Indicates the resource includes ARIA roles to organize and improve the structure and navigation.

The use of this value corresponds to the inclusion of [[DPUB-ARIA]] semantics, Document Structure, Landmark, and Window roles [[WAI-ARIA]].

Index

The resource includes an index to the content.

Page Break Markers

The resource includes static page markers, such as those identified by the doc-pagebreak role [[DPUB-ARIA]].

Page Navigation

The resource includes a Page List to the content.
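
For a publication that includes both static page markers and a page list, the corresponding entries might be sketched as follows, assuming the pageBreakMarkers and pageNavigation values from the vocabulary [[a11y-discov-vocab]]:

```xml
<!-- Static page markers are present in the content -->
<meta property="schema:accessibilityFeature">pageBreakMarkers</meta>

<!-- A page list is provided for navigating to those markers -->
<meta property="schema:accessibilityFeature">pageNavigation</meta>
```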

Reading Order

The reading order of the content is clearly defined in the markup (e.g., figures, sidebars, and other secondary content have been marked up to allow them to be skipped automatically and/or manually escaped from).

Add the readingOrder value only when the accessible reading order matches the visual layout reading order.

An example of when this would not be the case is a two-page spread where the intent is to visually read across the page boundary, since assistive technologies (AT) would read down the first page and then continue reading down the second page.

The metadata to include in the package document when the visual reading order matches the accessible reading order would be:

<meta property="schema:accessibilityFeature">readingOrder</meta>

Structural Navigation

The metadata structuralNavigation is used to indicate that an EPUB document has been structured to support accessible navigation through its content. This is crucial for users of assistive technologies.

The use of headings, lists, tables, asides, etc. in the resource fully and accurately reflects the document hierarchy, allowing navigation by assistive technologies.

You should use this metadata when your document meets the following criteria:

  • Semantic Structure: The document uses semantic HTML5 elements that accurately represent the content's logical structure.
  • Heading Hierarchy: Headings (h1-h6) are used in a meaningful, hierarchical manner that reflects the document's outline.
  • Navigational Elements: The document includes:
    • Properly nested headings
    • Ordered and unordered lists where appropriate
    • Tables with appropriate headers and captions
    • Semantic elements like section, article, aside, etc.
    • Meaningful landmark roles

Benefits of Structural Navigation:

  • Enables screen readers to provide better navigation
  • Allows users to jump between sections easily
  • Provides a clear, logical understanding of the document's structure
  • Improves overall accessibility and usability

The metadata to include in the package document when proper structural elements are used to facilitate accessible navigation would be:

<meta property="schema:accessibilityFeature">structuralNavigation</meta>

Table Of Contents

The resource includes a table of contents that provides links to the major sections of the content.

The Table of Contents referred to here is the one in the 'Nav Doc', which is processed by the Reading System to access the content. The publisher could also add an additional Table of Contents in the front matter of the book in the 'spine' but this is not required for use of this metadata.

Adaptation Terms

Alternative Text

Alternative text is provided for visual content. In fixed layout books this would typically be in the form of textual descriptions of the images contained within the publication.

Audio Description

Audio descriptions are available (e.g., via an HTML track element with its kind attribute set to "descriptions").

Closed Captions

Indicates that synchronized closed captions are available for audio and video content.

Closed captions are defined separately from the video, allowing users to control whether they are rendered. Open captions, by contrast, are rendered directly onto the video during the video editing process and become a permanent part of the visual content.
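
To illustrate the distinction, closed captions in an XHTML content document are typically supplied through a separate track element, which is what allows them to be switched on and off. The sketch below is an assumption about how such content might be marked up; the file names are hypothetical.

```xml
<!-- XHTML content document: captions live in a separate track (hypothetical file names) -->
<video src="video/lesson.mp4" controls="controls">
   <track kind="captions" src="video/lesson.en.vtt" srclang="en" label="English"/>
</video>

<!-- Package document: the corresponding feature declaration -->
<meta property="schema:accessibilityFeature">closedCaptions</meta>
```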

Described Math

Textual descriptions of math equations are included in the alt attribute for image-based equations, or by other means.

Long Description

Descriptions are provided for image-based visual content and/or complex structures such as tables, mathematics, diagrams, and charts.

Open Captions

Indicates that synchronized open captions are available for audio and video content.

Open captions are part of the video stream and cannot be turned off by the user, unlike closed captions.

Sign Language

Sign language interpretation is available for audio and video content.

Transcript

Rendering Control Terms

Synchronized Audio Text
Timing Control

Specialized Markup Terms

Chemistry Represented using ChemML
Chemistry Represented using LaTeX
Chemistry Represented using MathML
Math Represented using LaTeX
Math Represented using MathML
Text-To-Speech Markup

Clarity Terms

High Contrast Audio
High Contrast Display
Large Print

Tactile Terms

Braille
Tactile Graphic
Tactile Object

Internationalization Terms

Full Ruby Annotations
Horizontal Writing
Ruby Annotations
Vertical Writing
With Additional Word Segmentation
Without Additional Word Segmentation

None

Indicates that the resource does not contain any accessibility features.

The none value must not be set with any other feature value.

Unknown

Indicates that the author has not yet checked if the resource contains accessibility features. This value is only intended as a placeholder until an accessibility review can be completed.

The unknown value must not be set with any other feature value.

Accessibility hazards

Overview

There are three widely recognized hazards that can affect readers of digital content: flashing, motion simulation, and sound.

EPUB creators have to report whether their EPUB publications contain resources that present any of these hazards to users, as they can have real physical effects.

What precisely constitutes a sound hazard, and how to test for these hazards, is not standardized as of publication of this document. EPUB creators will have to use their discretion on when to specify a sound hazard until additional guidance is developed. This guide will be updated whenever there is more clarity on this issue.

General application guidelines

Hazards are identified in the [[schema-org]] accessibilityHazard property. Repeat this property for each hazard.

Unlike other accessibility properties, the presence of hazards can be expressed both positively and negatively. This design decision was made because users most often search for content that is free from hazards that affect them, but also want to know what dangers are present in any publications they discover. To indicate that hazards are not present, use the values "noFlashingHazard", "noMotionSimulationHazard", and "noSoundHazard".

Do not skip reporting hazards just because an EPUB publication does not contain any content that could present risks. Users cannot infer a meaning when no metadata is present. The value "none" can be used in such cases instead of repeating each non-hazard. When the "none" value is used, no other hazards values may be specified.
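
The two ways of reporting a hazard-free publication can be sketched as follows. These are alternatives: a package document would use one form or the other, never both together.

```xml
<!-- Option 1: state each non-hazard explicitly -->
<meta property="schema:accessibilityHazard">noFlashingHazard</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>
<meta property="schema:accessibilityHazard">noSoundHazard</meta>

<!-- Option 2: a single "none" value, used on its own -->
<meta property="schema:accessibilityHazard">none</meta>
```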

If an EPUB publication contains a hazard, provide additional information about its source and nature in the accessibility summary.

If an EPUB creator cannot determine if a publication presents a specific hazard for users, list that hazard as unknown. The following values are used to identify individual unknown hazards: "unknownFlashingHazard", "unknownMotionSimulationHazard", and "unknownSoundHazard".

For example, determining whether sound hazards are present can be challenging as the causes are currently not well defined. In this case, EPUB creators may prefer to set the "unknownSoundHazard" value, as in the following example.
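
A minimal sketch of such a declaration, assuming the EPUB creator has verified that the publication is free of flashing and motion simulation hazards but cannot assess sound hazards:

```xml
<!-- Hazards the creator could verify as absent (assumed for this sketch) -->
<meta property="schema:accessibilityHazard">noFlashingHazard</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>

<!-- Hazard that could not be determined -->
<meta property="schema:accessibilityHazard">unknownSoundHazard</meta>
```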

If it is not possible to determine any hazards, the value "unknown" can be used in place of setting the individual hazards to unknown. This value should be used sparingly, however, as it is of no value to users. EPUB creators should make every effort to determine if hazards are present. When the "unknown" value is set, no other hazard values may be specified.

EPUB creators must ensure that information about all three types of hazards is included when not using the "unknown" or "none" values.

Refer to The accessibilityHazard Property [[a11y-discov-vocab]] for more information about this property and its values.

Setting values

Hazards

Unknown hazards

No hazards

Accessibility summary

Overview

An accessibility summary provides a brief, human-readable description of the accessibility characteristics of an EPUB publication that cannot be expressed through the other discovery metadata.

General application guidelines

An accessibility summary is provided using the [[schema-org]] accessibilitySummary property.

The accessibility summary should not simply repeat the conformance information provided in the dcterms:conformsTo property, for example, or the features listed in the schema:accessibilityFeature properties. When other accessibility metadata is present in the package document, systems that process EPUB publications can already present it to users. Repeating it in the summary only makes users hear the same information again.

EPUB creators should not include an accessibility summary when they have nothing more to add to the conformance claim and other discovery metadata.

If an EPUB publication does not meet the requirements for content accessibility in [[epub-a11y-111]], the reason(s) it fails should be noted in the summary. Similarly, if an EPUB creator is hesitant to make a formal claim of conformance, the reasons why can be explained in the summary.
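
As an illustration only, a summary that explains a conformance failure might look like the following. The wording and the chapter reference are hypothetical; an actual summary should describe the specific publication.

```xml
<!-- Hypothetical summary text explaining why conformance is not claimed -->
<meta property="schema:accessibilitySummary">
   This publication does not fully meet the content accessibility requirements:
   several complex charts in chapter 4 lack extended descriptions.
</meta>
```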

Do not repeat this property to provide translations of a summary. EPUB does not define a method for including translations. Putting different xml:lang attributes on properties does not indicate a translation and could lead to the wrong summary being rendered to users.

Accessibility APIs

It is not necessary to set the schema:accessibilityAPI property for EPUB publications. EPUB creators are not responsible for the interaction between reading systems and the underlying platform APIs.

Accessible Control

It is not necessary to set the schema:accessibilityControl property for EPUB publications. This property does not differentiate issues arising from the reading system interface from those in the underlying content, which has led to confusion about its use.

Meeting the requirements of [[wcag2]] will mitigate most known issues with the content and is sufficient for authoring purposes.

Examples

The following examples show the metadata that would be added to an EPUB publication that has textual and visual access modes, is sufficient for reading by text, contains alternative text and MathML markup, and has a flashing hazard.
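
A sketch of how this metadata might appear in an EPUB 3 package document follows. The two non-hazard entries are included on the assumption that the publication has no motion simulation or sound hazards, since all three hazard types need to be reported, and the summary text describing the flashing hazard is hypothetical.

```xml
<!-- Access modes of the default content -->
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>

<!-- The publication can be read entirely through text -->
<meta property="schema:accessModeSufficient">textual</meta>

<!-- Accessibility features -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
<meta property="schema:accessibilityFeature">MathML</meta>

<!-- Hazards: one present, two assumed absent -->
<meta property="schema:accessibilityHazard">flashing</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>
<meta property="schema:accessibilityHazard">noSoundHazard</meta>

<!-- Hypothetical summary giving the source and nature of the hazard -->
<meta property="schema:accessibilitySummary">
   One embedded video in chapter 3 contains rapidly flashing imagery.
</meta>
```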