The EPUB Accessibility Metadata Guide details when to set the Schema.org accessibility metadata properties in the EPUB package document.
This guide only details usage for EPUB publications. The guidance in this document may not be relevant to other formats.
Schema.org [[schema-org]] is described as a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data. Of particular importance is that it includes the following set of accessibility properties for describing creative works:
These properties are key to the discoverability of EPUB publications and are central to the requirements of the EPUB Accessibility standard [[epub-a11y]]. They are also used to generate accessibility statements about a publication, as described in the Accessibility Metadata Display Guide for Digital Publications.
To provide maximum value for readers, the metadata needs to be consistent, which is where the Schema.org Accessibility Properties for Discoverability Vocabulary [[a11y-discov-vocab]] comes in. This vocabulary defines controlled sets of values to use with each property (with the exception of summaries, which are free-form text).
But even with a controlled vocabulary, because Schema.org is intended to describe any resource on or referenced from the web, the value definitions do not always fully reflect how to apply them to EPUB publications. That is where this guide fits in. Its goal is to add clarity on how to apply the metadata to both reflowable and fixed-layout EPUB publications.
An access mode is defined as a "human sense perceptual system or cognitive faculty through which a user may process or perceive the content of a digital resource." [[iso24751-3]] For example, if an EPUB publication contains images and video, visual perception is required to consume the content exactly as it was created.
There are four access modes that are typically specified for EPUB publications:
textual — the publication contains text content (headings, paragraphs, etc.).
visual — the publication contains visual content such as images, graphics, diagrams, animations, and video.
auditory — the publication contains auditory content such as standalone audio clips and audio soundtracks for video content.
tactile — the publication contains tactile content such as embedded braille and tactile diagrams.
Access modes are set in the [[schema-org]] accessMode property. Repeat the property for each access mode.
For a user to determine whether an EPUB publication is suitable for their needs, they need to know which access modes are required to consume the content. List all applicable access modes in the [[schema-org]] accessMode property, repeating the property for each applicable mode.
Do not list access modes for content that does not contain information necessary to understand a publication. Most EPUB publications contain cover images, for example, but it is not necessary to see the cover image to read the publication. The same is true of publisher logos and of images in the content that are only for presentational purposes (i.e., they convey no information and so have empty [[html]] alt attribute values). If these are the only visual content, then it is not valid to list a visual access mode. Similarly, if the only audio an EPUB publication contains is background music (e.g., for an instructional video with text captions, or as mood music while reading), listing an auditory access mode is not valid.
Note that the access modes of the content do not reflect any adaptations that have been provided. For example, if a comic book also includes alternative text for each image, it does not have a textual access mode. See the following section on sufficient access modes for how to indicate that the available adaptations allow the content to be consumed in another mode.
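As a rough sketch of the comic book case just described, the package document metadata might look like the following. Pairing the access mode with the alternativeText feature value is an assumption made for illustration; the point is only that the adaptation is reported as a feature and does not add a textual access mode.

```xml
<!-- Hypothetical comic book: the default encoding of the content is image-based -->
<meta property="schema:accessMode">visual</meta>
<!-- The adaptation is reported as a feature, not as an additional access mode -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
```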
Indicates that the resource contains information encoded in auditory form.
This value is not set when the auditory content conveys no information. For example, an instructional video might include background music while all the necessary information to complete the task is conveyed visually and/or through text captions.
Indicates that the resource contains information encoded in textual form.
This value is not set if the only textual content is for navigational purposes. For example, an audiobook might include a table of contents, but it is not necessary to read the table of contents to read the work. Likewise, books with synchronized text-audio playback may only include headings to allow structured navigation.
Indicates that the resource contains information encoded in visual form.
This value is not set if the only visual imagery is presentational or not directly relevant to understanding the content.
If the only images within the EPUB are cover images, author photos, corporate logos, or decorative images, the accessMode value 'visual' would not be included in the metadata.
Indicates that the resource contains text encoded in visual form.
Setting the correct access modes and sufficient access modes for EPUB 3 publications that contain synchronized text-audio playback requires evaluating whether playback is essential to reading the publication or an additional feature.
EPUB 3's media overlays [[epub-3]] allow EPUB creators to synchronize the full text of a publication with full audio narration. These types of publications are commonly referred to as "read aloud" books, as the user chooses whether or not to turn on the narration (unlike traditional audiobooks where only the audio is available).
In this case, because the audio playback is an extra feature, EPUB creators would not list "auditory" as an access mode. Rather, they would indicate the presence of text and audio synchronization as an accessibility feature:
<meta property="schema:accessibilityFeature">synchronizedAudioText</meta>
Although the audio is not essential to reading the publication, having full audio playback capability means that there is an auditory sufficient access mode: the user can listen to the complete publication. Consequently, the EPUB creator would declare a schema:accessModeSufficient property with the value auditory:
<meta property="schema:accessModeSufficient">auditory</meta>
When media overlays are present, EPUB creators should not add "auditory" to all the possible sufficient access modes just because it is possible to turn on text-audio playback.
For example, a publication with media overlays that has text and images (with text alternatives) would only declare the following sufficient access modes: "textual", "auditory", and "textual,visual".
It would not declare "textual,auditory" or "textual,visual,auditory".
Another use for media overlays — more typical among accessible republishers such as libraries that serve blind and low vision readers — is to provide full audio synchronized to the major headings of a publication. These types of publications are more like traditional audiobooks, as all the information in the work is typically available in auditory form. The minimal text does not allow meaningful visual reading or text-to-speech playback; it is only to provide structured navigation capabilities.
In this case, being able to hear the audio is essential to being able to read the publication, so EPUB creators will list an auditory access mode:
<meta property="schema:accessMode">auditory</meta>
In a reverse of the first case discussed, EPUB creators will not identify text-audio synchronization as a feature of the publication since the amount of text provided is trivial.
Likewise, since the text in the publication only provides heading navigation, EPUB creators will not list a textual access mode. The user does not need to, and is not expected to, visually read this text.
Some publications that provide the body in auditory form may include the backmatter in text form, allowing users to use text-to-speech playback to render it. In this case, there would be a textual component.
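A minimal sketch of the access mode metadata for such a publication, assuming the textual backmatter carries information the reader needs, might be:

```xml
<!-- The body of the work is only available as audio -->
<meta property="schema:accessMode">auditory</meta>
<!-- Assumption: the backmatter is provided as real text, not just navigation headings -->
<meta property="schema:accessMode">textual</meta>
```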
The access modes sufficient to consume an EPUB publication express a broader picture of the potential usability than do the basic access modes. Where the basic access modes identify the default nature of the media used in the publication, sufficient access modes identify all the individual modes, and sets of modes, that allow a user to read a publication. Sufficient access modes account for the affordances and adaptations that the EPUB creators have provided, allowing users to determine whether they can read the content regardless of its default nature.
Sufficient access modes are identified in the [[schema-org]] accessModeSufficient property. Repeat the property for each set of sufficient access modes.
The most strongly recommended sufficient access modes to list are the ones that consist of only a single value. Users seeking alternatives to the default encoding of the information typically want to read the content without switching reading modes (e.g., they want a purely textual alternative to use text-to-speech playback or an auditory alternative to listen to prerecorded narration). Listing the single value sets allows users to easily determine whether a publication will meet their reading needs.
The most common single sufficient access modes for EPUB publications are:
textual: setting this value indicates that the publication is screen-reader friendly (i.e., all the content is available for text-to-speech rendering by an assistive technology). EPUB creators can set this value for EPUB publications that conform to Level A [[wcag2]] or higher, for example, as there must be textual alternatives for non-text content to meet these thresholds.
auditory: setting this value indicates that pre-recorded audio is available for the publication. EPUB creators can set this value for EPUB publications that provide media overlays for all the content.
As an example of setting sufficient access modes, consider an EPUB publication that contains graphics and charts, as well as descriptions for all these images. The publication has both textual and visual content, so the EPUB creator will include the following schema:accessMode metadata entries to indicate this:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
This metadata does not make clear whether a textual access mode is sufficient to read the entire publication, or whether a visual one is, only that the user requires the ability to read in those two modes by default. This discrepancy is why sufficiency is also important to know.
Since the EPUB creator has also included textual alternatives and/or descriptions for all the images in the example publication, the metadata can also indicate that a purely textual access mode is sufficient to read the content:
<meta property="schema:accessModeSufficient">textual</meta>
Without this metadata, users would not have known that they could read the publication only via its textual content.
It is important to emphasize when listing individual values in the schema:accessModeSufficient property not to simply restate each individual access mode. When adding a schema:accessModeSufficient property with only a single value, all information in the publication must be available in that mode.
For example, the EPUB creator would not declare a sufficient access mode of "visual" for the example publication as the information is not entirely available in image-based form. (Photo books without text would be examples of works with a purely visual sufficient access mode.)
EPUB creators may also list any sets of sufficient access modes that allow full access to the information. As the most typical set of values is the combination of all the access modes, however, the information this provides users is less helpful in determining the usability of a publication.
For example, the metadata the EPUB creator inputs for the example publication re-establishes that there is a textual and visual reading option:
<meta property="schema:accessModeSufficient">textual,visual</meta>
The order in which EPUB creators list the access modes in a set is not important. The only requirement is to separate the values by commas.
The complete set of schema:accessMode and schema:accessModeSufficient entries for the example publication is as follows:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">textual,visual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
Note that sufficiency of access is often a subjective determination of the EPUB creator based on their understanding of what information is essential to comprehending the text. Some information loss occurs by not being able to view a video, for example, but the EPUB creator might regard the visual or auditory losses as inconsequential if a transcript provides all the necessary information to understand the concepts being conveyed.
Refer to The accessModeSufficient Property [[a11y-discov-vocab]] for more information about this property and its values.
The accessModeSufficient property, as defined in [[schema-org]], allows more complicated expressions than can be represented in the EPUB 2 or 3 package document (e.g., definition of lists of values and inclusion of a human-readable description). A future version of EPUB might allow for richer metadata, but the basic expression shown in this section is sufficient for discovery purposes.
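For comparison, the same basic expression could be sketched in both package document versions as follows. The EPUB 2 form assumes the usual convention of a meta element with name and content attributes; the value shown is only an example.

```xml
<!-- EPUB 3 package document: property attribute with the value as element content -->
<meta property="schema:accessModeSufficient">textual,visual</meta>

<!-- EPUB 2 package document (assumed name/content convention) -->
<meta name="schema:accessModeSufficient" content="textual,visual"/>
```

Either form carries only a flat, comma-separated set of values, which is the limitation described above.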
An accessModeSufficient entry with a single value implies that the entire content of the EPUB can be consumed using only that access mode.
Indicates that only visual perception is necessary to consume the information.
An example of this would be a children's picture book.
Indicates that the ability to read textual content is necessary to consume the information.
Note that reading textual content does not require visual perception, as textual content can be rendered as audio using a text-to-speech capable device or embossed as braille for tactile reading.
An example of this would be a romance novel.
Indicates that auditory perception is necessary to consume the information.
An example of this would be an audiobook.
Combining access modes in a single accessModeSufficient entry implies that the combination of access modes is required to completely consume and understand the content within the EPUB.
Indicates that both visual perception and the ability to read textual content are necessary to consume the information.
An example of this would be a cookbook.
Indicates that both visual perception and the ability to hear the content are necessary to consume the information.
An example of this would be a narrated picture book.
Indicates that both the ability to hear the content and the ability to read textual content are necessary to consume the information.
Examples of this would be an audiobook with synchronized text highlighting, or an interactive dictionary where you could hear the pronunciation of the defined words.
Identifying all the accessibility features and adaptations included in an EPUB publication allows users to determine whether the content is usable at a more fine-grained level than the access modes do.
For example, a math textbook might have a textual access mode, but that alone does not indicate whether MathML markup is available. Whether a visual work only provides alternative text or whether it includes extended descriptions is also important to know when gauging its usability.
Accessibility features are identified in the [[schema-org]] accessibilityFeature property. Repeat this property for each feature.
The EPUB format requires that some accessibility features will always be present (e.g., a table of contents). Do not exclude these features from the accessibility metadata, as users typically are not aware what features are built into a format. Failing to include entries will reduce the discoverability of the publication when users search for specific features.
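As an illustrative sketch, a publication would still report a format-mandated feature such as the table of contents alongside its other features. The second entry below is a hypothetical additional feature, included only to show the property being repeated.

```xml
<!-- Built into every EPUB 3 publication, but still worth declaring for discovery -->
<meta property="schema:accessibilityFeature">tableOfContents</meta>
<!-- Hypothetical additional feature; repeat the property once per feature -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
```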
Be aware that although the vocabulary for the accessibilityFeature property [[a11y-discov-vocab]] contains the values "none" and "unknown", these terms cannot be used to meet the reporting requirements for the property. Authors must indicate at least one feature that is not one of these values to claim conformance to EPUB Accessibility 1.1.1 [[epub-a11y-111]].
Refer to The accessibilityFeature Property [[a11y-discov-vocab]] for more information about this property and its values.
The accessibilityFeature property provides a list of all the applicable accessibility characteristics of the content. It allows a user agent to discover these characteristics without having to parse or interpret the structure of the content.
For ease of reading, this section splits the vocabulary into the following distinct groups:
Structure and Navigation Terms identify navigation aids that are provided to simplify moving around within the media, such as the inclusion of a table of contents or an index.
Adaptation Terms identify content features that provide alternate access to a resource. The inclusion of alternative text in an [[html]] alt attribute is one of the most commonly identifiable augmentation features.
Rendering Control Terms identify content rendering features that users have access to or can control. The ability to modify the appearance of the text is one example.
Specialized Markup Terms identify that content is encoded using domain-specific grammars like MathML and LaTeX that can provide users a richer reading experience.
Clarity Terms identify ways that the content has been enhanced for clearer readability. Audio with minimized background noise is one example, while content formatted for large print reading is another.
Tactile Terms identify content that is formatted for tactile use, such as graphics and objects.
Internationalization Terms identify those accessibility characteristics of the content which are required for internationalization.
Unsupported Terms include those features that are currently not supported in EPUB.
Alternative text is provided for visual content. In fixed layout books this would typically be in the form of textual descriptions of the images contained within the publication.
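A minimal sketch of how this might look in practice follows. The image path and the wording of the alternative text are hypothetical placeholders.

```xml
<!-- In an XHTML content document (hypothetical image and description) -->
<img src="images/island-map.jpg"
     alt="Map of the island showing the harbor on the north coast and the lighthouse on the south coast" />

<!-- In the package document -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
```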
Audio descriptions are available (e.g., via an HTML track element with its kind attribute set to "descriptions").
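The track-based approach mentioned above could be sketched as follows. The file names and label are placeholders, and attributes are written in XHTML syntax because EPUB content documents are XML.

```xml
<!-- In an XHTML content document (file names are hypothetical) -->
<video controls="controls">
  <source src="video/lesson.mp4" type="video/mp4" />
  <!-- kind="descriptions" marks this text track as a description of the visual content -->
  <track kind="descriptions" src="video/lesson-descriptions.vtt"
         srclang="en" label="English descriptions" />
</video>

<!-- In the package document -->
<meta property="schema:accessibilityFeature">audioDescription</meta>
```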
Indicates that synchronized closed captions are available for audio and video content.
Closed captions are defined separately from the video, allowing users to control whether they are rendered or not, unlike open captions which would be rendered directly onto the video during the video editing process, becoming a permanent part of the visual content.
Textual descriptions of math equations are included in the alt attribute for image-based equations, or by other means.
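For an image-based equation, a sketch of the alt attribute approach might look like this. The image path and the phrasing of the description are assumptions for illustration only.

```xml
<!-- In an XHTML content document: an equation rendered as an image -->
<img src="images/quadratic-formula.png"
     alt="x equals negative b plus or minus the square root of b squared minus 4 a c, all over 2 a" />

<!-- In the package document -->
<meta property="schema:accessibilityFeature">describedMath</meta>
```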
Descriptions are provided for image-based visual content and/or complex structures such as tables, mathematics, diagrams, and charts.
<figure>
  <!-- The img element is self-closed because EPUB content documents use XHTML syntax -->
  <img src="complex-chart.jpg"
       alt="Annual sales growth bar graph"
       aria-details="sales-long-description" />
  <details id="sales-long-description">
    <summary>Detailed description of sales growth chart</summary>
    <p>This bar graph illustrates the company's annual sales growth from 2019 to 2023.
    The vertical axis represents revenue in millions of dollars, ranging from 0 to 50 million.
    The horizontal axis shows years from 2019 to 2023.
    Each bar represents the total annual revenue, with notable increases in 2021 and 2022,
    showing a significant recovery and growth after the initial impact of the global pandemic.
    The 2022 bar reaches the highest point at 45 million dollars,
    demonstrating the most successful year in the company's recent history.</p>
  </details>
</figure>
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">longDescription</meta>
Indicates that synchronized open captions are available for audio and video content.
Open captions are part of the video stream and cannot be turned off by the user, unlike closed captions.
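<!-- XHTML syntax: boolean attributes take a value and void elements are self-closed -->
<video controls="controls" width="480" height="360">
  <source src="ch1-interview-with-embedded-captions.mp4" type="video/mp4" />
  Your browser does not support the video element.
</video>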
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">openCaptions</meta>
Sign language interpretation is available for audio and video content.
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Video with Sign Language</title>
<style>
.video-container {
display: grid;
grid-template-columns: 3fr 1fr;
gap: 10px;
max-width: 1200px;
margin: 20px auto;
}
.main-video, .sign-video {
width: 100%;
}
@media (max-width: 768px) {
.video-container {
grid-template-columns: 1fr;
}
}
</style>
</head>
<body>
<div class="video-container">
<video class="main-video" controls="controls">
<source src="chapter2-video.mp4" type="video/mp4" />
<track kind="captions" src="chapter2-closed_captions.vtt" srclang="en" label="English" />
</video>
<video class="sign-video" controls="controls">
<source src="chapter2-sign-language.mp4" type="video/mp4" />
</video>
</div>
</body>
</html>
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">signLanguage</meta>
Because the example also provides closed captions, the closedCaptions value would be included as well:
<meta property="schema:accessibilityFeature">closedCaptions</meta>
Indicates that a transcript of the audio or video content is available.
<!--
The aria-label provides a specific name for the audio player.
The aria-details attribute programmatically links the audio player
to the transcript container for assistive technologies. It points to the
ID of the <details> element itself, ensuring the link is always
valid whether the details are open or closed.
-->
<audio controls="controls"
       aria-details="audio-transcript"
       aria-label="Audio recording of the interview">
  <source src="interview.mp3" type="audio/mpeg" />
  <track
    label="English"
    kind="captions"
    srclang="en"
    src="captions.vtt"
    default="default" />
  Your browser does not support the audio element.
</audio>
<!--
The <details> element has the ID that aria-details references.
The <summary> acts as the visible, interactive control for all users.
-->
<details id="audio-transcript">
<summary>View Transcript</summary>
<div>
<p>This is a transcript of the audio recording.</p>
<p>[Interviewer] Welcome to the show.</p>
<p>[Guest] Thank you for having me.</p>
</div>
</details>
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">transcript</meta>
Indicates that the visual presentation of the text can be transformed by the user to meet their reading requirements. This includes the ability to resize text, adjust line spacing and margins, change fonts, and modify color contrast (e.g., for a light, dark, or sepia mode).
This feature is characteristic of reflowable EPUBs that are built with semantic HTML and flexible CSS, avoiding practices that lock down the presentation, such as hard-coding sizes in absolute units or embedding text in images.
A publication enables transformability by using flexible CSS and by using the reflowable layout model, which is the default and can also be declared explicitly in the package document.
Example Package Document (package.opf):
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>My Book</dc:title>
<dc:creator>Author Name</dc:creator>
<dc:language>en</dc:language>
<meta property="dcterms:modified">2025-06-16T10:48:00Z</meta>
<!-- Optional: reflowable is the default layout, so this declaration only makes the choice explicit -->
<meta property="rendition:layout">reflowable</meta>
</metadata>
<manifest>
<!-- ... manifest items ... -->
</manifest>
<spine>
<!-- ... spine items ... -->
</spine>
</package>
Example Stylesheet (style.css):
/* Use CSS Custom Properties for adaptable theming */
:root {
--text-color: #121212;
--background-color: #fefefe;
--base-font-size: 1em; /* Base font size on the reading system's default */
}
/* Adapt to the user's preference for a dark theme */
@media (prefers-color-scheme: dark) {
:root {
--text-color: #e3e3e3;
--background-color: #121212;
}
}
body {
font-family: sans-serif; /* Use a generic font family that can be overridden */
font-size: var(--base-font-size);
color: var(--text-color);
background-color: var(--background-color);
line-height: 1.5; /* Use a relative unit for line spacing */
}
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">displayTransformability</meta>
Indicates that the EPUB contains synchronized pre-recorded audio narration that matches the text of the document, commonly known as a Media Overlay. The reading system can highlight units of text as they are spoken.
This feature is enabled by associating a Media Overlay file (SMIL) with a content document (XHTML) inside the EPUB Package Document. This association is the trigger for the reading system to provide the synchronized playback and text highlighting.
Example Package Document (package.opf):
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
<metadata>
<!-- ... metadata ... -->
</metadata>
<manifest>
<!-- The XHTML text document. Its 'media-overlay' attribute references the id of the SMIL item, creating the crucial link between the text and its overlay -->
<item id="chap1" href="content.xhtml" media-type="application/xhtml+xml" media-overlay="chap1-overlay" />
<!-- The SMIL file with the timing info -->
<item id="chap1-overlay" href="overlay.smil" media-type="application/smil+xml" />
<!-- The audio file -->
<item id="narration-audio" href="audio/narration.mp3" media-type="audio/mpeg" />
</manifest>
<spine>
<itemref idref="chap1" />
</spine>
</package>
Example Content Document (content.xhtml) referenced by the package:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>...</head>
<body>
<p id="p1">To be, or not to be, that is the question:</p>
<p id="p2">Whether 'tis nobler in the mind to suffer</p>
<p id="p3">The slings and arrows of outrageous fortune,</p>
</body>
</html>
Example Media Overlay (overlay.smil) referenced by the package:
<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
<body>
<seq id="seq1">
<par id="par1">
<text src="content.xhtml#p1"/>
<audio src="audio/narration.mp3" clipBegin="0s" clipEnd="3.5s"/>
</par>
<par id="par2">
<text src="content.xhtml#p2"/>
<audio src="audio/narration.mp3" clipBegin="3.5s" clipEnd="6.2s"/>
</par>
<par id="par3">
<text src="content.xhtml#p3"/>
<audio src="audio/narration.mp3" clipBegin="6.2s" clipEnd="9.0s"/>
</par>
</seq>
</body>
</smil>
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">synchronizedAudioText</meta>
Indicates that the user can control the timing of any time-based interaction to meet their needs. This allows the user to turn off, adjust, or extend time limits on activities like tests or forms.
For more details see Success Criterion 2.2.1 Timing Adjustable.
This feature is critical for users who may require more time to read, comprehend, or respond to content.
A common example is a timed quiz. To be fully accessible, the quiz must provide straightforward controls for the user to pause the timer or add more time. It should also ensure keyboard focus is clearly visible and that status updates are announced in a non-disruptive manner.
Example Content Document (quiz.xhtml):
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Timed Quiz</title>
<link rel="stylesheet" href="quiz.css" type="text/css" />
<script src="quiz.js" defer="defer"></script>
</head>
<body>
<h1>Quiz Question</h1>
<p>What is the capital of California?</p>
<textarea id="quiz-answer" aria-label="Answer"></textarea>
<div class="timer-controls">
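<!-- role="timer" is not announced on every tick, unlike a polite live region that changes each second -->
<div>Time Remaining: <span id="time-display" role="timer">60</span> seconds</div>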
<button id="start-btn">Start Timer</button>
<button id="add-time-btn">Add 30 Seconds</button>
<button id="pause-btn">Pause Timer</button>
</div>
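<!-- This empty element is used for important announcements, such as time running out, without opening a modal dialog. Note that role="alert" implies aria-live="assertive", so messages placed here interrupt other speech. -->
<div id="quiz-status" role="alert" aria-live="assertive"></div>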
</body>
</html>
Example Stylesheet (quiz.css) for enhanced visibility:
/* Add a clear, high-contrast outline to buttons when focused via keyboard */
.timer-controls button:focus-visible {
outline: 3px solid dodgerblue;
outline-offset: 2px;
}
/* Style the status message for visibility */
#quiz-status {
margin-top: 1em;
font-weight: bold;
color: #d00; /* A color for alerts */
}
Example JavaScript (quiz.js) with non-disruptive alerts:
document.addEventListener('DOMContentLoaded', () => {
const timeDisplay = document.getElementById('time-display');
const startBtn = document.getElementById('start-btn');
const addTimeBtn = document.getElementById('add-time-btn');
const pauseBtn = document.getElementById('pause-btn');
const statusDiv = document.getElementById('quiz-status');
let timeLeft = 60;
let timerId = null;
function updateTimer() {
timeLeft--;
timeDisplay.textContent = timeLeft;
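if (timeLeft <= 0) {
clearInterval(timerId);
// Reset the handle so the pause button does not treat the finished timer as still running.
timerId = null;
// Announce "Time is up!" in the status div instead of a disruptive alert().
statusDiv.textContent = 'Time is up!';
}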
}
startBtn.addEventListener('click', () => {
if (!timerId && timeLeft > 0) {
timerId = setInterval(updateTimer, 1000);
}
});
addTimeBtn.addEventListener('click', () => {
if (timeLeft > 0) {
timeLeft += 30;
timeDisplay.textContent = timeLeft;
}
});
pauseBtn.addEventListener('click', () => {
if (timerId) {
clearInterval(timerId);
timerId = null;
pauseBtn.textContent = 'Resume Timer';
} else {
if (timeLeft > 0) {
timerId = setInterval(updateTimer, 1000);
pauseBtn.textContent = 'Pause Timer';
}
}
});
});
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">timingControl</meta>
No digital rights management or other content restriction protocols have been applied to the resource.
The absence of DRM is a critical accessibility feature as it ensures that users can access the content with their preferred assistive technologies, which might otherwise be blocked by encryption. It allows for personal use and transformations of the content that may be necessary for a user to perceive and understand it (e.g., using it with specialized text-to-speech software).
This accessibility feature is not represented by a specific code structure within the publication's content, but rather by the absence of any encryption or content restriction technology being applied to the EPUB container.
A publication that is "unlocked" typically does not contain an encryption.xml file in its META-INF directory, or contains one only to declare font obfuscation. The container.xml file, also located in the META-INF directory, simply points to the package document; it does not carry encryption information itself.
Example of a standard container.xml for a DRM-free EPUB:
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
<rootfiles>
<!--
This entry points to the package document.
Encryption information is not declared in container.xml: when present, it
lives in a separate META-INF/encryption.xml file. The absence of that file
is what indicates that the book is not encrypted.
-->
<rootfile full-path="EPUB/package.opf"
media-type="application/oebps-package+xml"/>
</rootfiles>
</container>
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">unlocked</meta>
Identifies that the LaTeX typesetting system is used to encode chemical equations and formulas.
In this method, the raw LaTeX source is embedded directly in the content document. A JavaScript-based rendering engine, such as MathJax, should then be used to convert this source into a visually and accessibly rendered formula.
The core of this method is the raw LaTeX source code embedded in the text. The example below shows the LaTeX for the formula for water, using the \ce{...} macro from the mhchem package.
<p>The chemical formula for water is \(\ce{H2O}\).</p>
The example above is incomplete on its own. Without a rendering engine, it would only display the literal text "\(\ce{H2O}\)" to the user. To be functional, a library must be included to process the LaTeX.
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">latex-chemistry</meta>
The following self-contained example shows how to include the MathJax library to find and render the raw LaTeX source code from the previous step.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Chemistry with MathJax</title>
<!-- Configure MathJax to load the 'mhchem' package for chemistry -->
<script>
MathJax = {
loader: {load: ['[tex]/mhchem']},
tex: {
packages: {'[+]': ['mhchem']}
}
};
</script>
<!-- Load the MathJax library (attributes need explicit values in XHTML) -->
<script id="MathJax-script" async="async"
src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js">
</script>
</head>
<body>
<p>The chemical formula for water is \(\ce{H2O}\).</p>
</body>
</html>
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">latex-chemistry</meta>
Identifies that the Mathematical Markup Language (MathML) is used to encode chemical equations and formulas. MathML provides a structured, semantic representation that can be rendered directly by modern browsers and assistive technologies.
This method provides a robust, accessible alternative to images, as the notation can be resized, searched, copied, and interpreted correctly by screen readers.
MathML uses markup to describe the structure of a formula. The following example shows the chemical formula for water (H₂O) embedded directly in the content document. Note the required xmlns="http://www.w3.org/1998/Math/MathML" namespace on the root <math> element.
Example of H₂O represented using MathML:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Chemistry with MathML</title>
</head>
<body>
<p>The chemical formula for water is shown below:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML">
<msub>
<mi mathvariant="normal">H</mi>
<mn>2</mn>
</msub>
<mi mathvariant="normal">O</mi>
</math>
</body>
</html>
Future-facing (MathML 4): The upcoming MathML 4 specification, currently in development by the W3C as of June 2025, introduces an intent attribute to explicitly declare the semantic meaning of a formula or equation. This will allow authors to unambiguously state that a formula represents chemistry, greatly improving accessibility and machine-readability. An example might look like this:
<math intent=":chemical-formula" xmlns="http://www.w3.org/1998/Math/MathML">
<msub>
<mi mathvariant="normal">H</mi>
<mn>2</mn>
</msub>
<mi mathvariant="normal">O</mi>
</math>
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">MathML-chemistry</meta>
Identifies that the LaTeX typesetting system is used to encode mathematical or other scientific notations.
In this method, the raw LaTeX source is embedded directly in the content document. A JavaScript-based rendering engine, such as MathJax, should then be used to convert this source into a visually and accessibly rendered formula.
The core of this method is the raw LaTeX source code embedded in the text. The example below shows the LaTeX for a common mathematical formula.
<p>The quadratic formula is \[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\].</p>
The example above is incomplete on its own. Without a rendering engine, it would only display the literal text "\[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\]" to the user. To be functional, a library must be included to process it.
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">latex</meta>
The following self-contained example shows how to include the MathJax library to find and render the raw LaTeX source. Standard LaTeX mathematical commands do not require any special configuration.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>LaTeX with MathJax</title>
<!-- Load the MathJax library -->
<script id="MathJax-script" async="async"
src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml.js">
</script>
</head>
<body>
<p>The quadratic formula is \[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\].</p>
</body>
</html>
The metadata to include in the package document would be:
<meta property="schema:accessibilityFeature">latex</meta>
Identifies that the Mathematical Markup Language (MathML) is used to encode mathematical equations and formulas. MathML provides a structured, semantic representation that can be rendered directly by modern browsers and assistive technologies.
This method provides a robust, accessible alternative to images, as the notation can be resized, searched, copied, and interpreted correctly by screen readers for proper speech output or even braille display.
MathML uses markup to describe the structure of a formula, which modern reading systems can render directly without external libraries. Note the required xmlns="http://www.w3.org/1998/Math/MathML" namespace on the root <math> element.
Example of the quadratic formula represented using MathML:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Math with MathML</title>
</head>
<body>
<p>The quadratic formula is shown below:</p>
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mi>x</mi>
<mo>=</mo>
<mfrac>
<mrow>
<mrow>
<mo>-</mo>
<mi>b</mi>
</mrow>
<mo>±</mo>
<msqrt>
<mrow>
<msup>
<mi>b</mi>
<mn>2</mn>
</msup>
<mo>-</mo>
<mrow>
<mn>4</mn>
<mo>⁢</mo>
<mi>a</mi>
<mo>⁢</mo>
<mi>c</mi>
</mrow>
</mrow>
</msqrt>
</mrow>
<mrow>
<mn>2</mn>
<mo>⁢</mo>
<mi>a</mi>
</mrow>
</mfrac>
</math>
</body>
</html>
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">MathML</meta>
One or more of [SSML], [Pronunciation-Lexicon], and [CSS3-Speech] properties has been used to enhance text-to-speech playback quality.
These technologies allow an author to resolve ambiguities and improve the clarity of synthesized speech by correcting mispronunciations of names or jargon, indicating emphasis, and controlling the pacing and prosody of the narration.
TTS Markup can be applied using inline SSML attributes in the XHTML or through CSS properties from the CSS3 Speech Module. The following example shows both methods being used to enhance a single sentence.
Example Content Document (content.xhtml):
EPUB content documents support the ssml:ph and ssml:alphabet attributes. Here, ssml:ph supplies a phonemic pronunciation so that the TTS engine reads the acronym "W3C" letter by letter, and ssml:alphabet identifies the phonemic alphabet used. Note the required SSML namespace declaration on the <html> element.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:ssml="http://www.w3.org/2001/10/synthesis">
<head>
<title>TTS Markup Example</title>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<p>The new standard is from the <span ssml:alphabet="ipa" ssml:ph="ˈdʌbəljuː θriː siː">W3C</span>.
It is <span class="emphasis">very</span> important.</p>
</body>
</html>
Example Stylesheet (style.css):
The CSS3 Speech Module can be used to control how the text is spoken. Here, it adds emphasis to the targeted word.
/* Use the CSS3 Speech Module to add emphasis */
.emphasis {
/* A higher pitch can indicate emphasis to the listener */
voice-pitch: high;
/* Other properties like voice-stress could also be used */
/* voice-stress: strong; */
}
The metadata to include in the package document to advertise this feature would be:
<meta property="schema:accessibilityFeature">ttsMarkup</meta>
Indicates that the resource does not contain any accessibility features.
The none value must not be set with any other feature value.
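Following the pattern used for the other feature values, a publication with no accessibility features might carry the entry below as its only accessibilityFeature declaration. This is a minimal sketch, not text taken from the vocabulary.

```xml
<!-- The sole accessibilityFeature entry in the package document -->
<meta property="schema:accessibilityFeature">none</meta>
```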
Indicates that the author has not yet checked if the resource contains accessibility features. This value is only intended as a placeholder until an accessibility review can be completed.
The unknown value must not be set with any other feature value.
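As a sketch of the placeholder usage described above, a publication awaiting its accessibility review might declare the following and nothing else for this property. The entry would be replaced with the actual feature values once the review is complete.

```xml
<!-- Placeholder until an accessibility review has been completed -->
<meta property="schema:accessibilityFeature">unknown</meta>
```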
Definition: The resource includes annotations from the author, instructor and/or others.
Status: Deprecated
Reasoning: The use of the annotations value is now deprecated due to the general nature of annotations in published works. When authors or publishers add annotations to a work, they provide information for all readers; they are not used as a means of enhancing otherwise inaccessible content.
Definition: The work includes bookmarks to facilitate navigation to key points.
Status: Deprecated
Reasoning: The use of the bookmarks value is now deprecated due to its ambiguity. Bookmark support is typically provided by reading systems, not by the EPUB itself.
Definition: Indicates that synchronized captions are available for audio and video content.
Status: Deprecated
Reasoning: The use of the captions value is now deprecated. Authors should use the more specific closedCaptions or openCaptions values, as appropriate.
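As an illustration, a publication would declare whichever of the two specific values matches its video content in place of the deprecated captions value. The comments reflect the usual distinction between closed and open captions and are an assumption here, not wording from the vocabulary.

```xml
<!-- Captions the user can switch on or off -->
<meta property="schema:accessibilityFeature">closedCaptions</meta>

<!-- Captions that are always visible as part of the video -->
<meta property="schema:accessibilityFeature">openCaptions</meta>
```

Only the value or values that actually apply to the publication would be included.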
Definition: Indicates that chemical notations, such as molecular formulas and reaction diagrams, are represented in a structured format.
Status: Unsupported. Although the metadata value is ChemML, the underlying format it actually identifies is the Chemical Markup Language (CML).
Reasoning: There is no native support for CML in HTML. Because EPUB content is built on web standards, the vast majority of e-reader devices cannot render CML visually, making this feature unreliable without a fallback image. The name also causes significant confusion with the unrelated 'ChemML' Python library for machine learning. Refer to the Chemical Markup Language (CML) specification for details of the format.
Definition: The resource includes print page numbers.
Status: Deprecated
Reasoning: The printPageNumbers value has been replaced by pageBreakMarkers.
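A publication that previously declared printPageNumbers would, as a minimal sketch, now declare the replacement value instead:

```xml
<!-- Replaces the deprecated printPageNumbers value -->
<meta property="schema:accessibilityFeature">pageBreakMarkers</meta>
```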
Definition: Indicates that a PDF document's contents have been tagged to create a logical structure, permitting access by assistive technologies.
Status: Unsupported. This metadata property is exclusively for describing PDF documents and is not relevant in the context of an EPUB file.
Reasoning: EPUB and PDF are fundamentally different technologies. An EPUB's accessibility is achieved through its use of semantic HTML and ARIA, while a PDF's accessibility relies on its internal tagging structure. An EPUB's metadata must describe the features of the EPUB itself. Applying a PDF-specific property like taggedPDF is inappropriate and would be ignored by reading systems. While an EPUB might link to an external PDF, the metadata for the EPUB should not describe the features of that external file.
There are three widely recognized hazards that can affect readers of digital content:
flashing — if a resource flashes more than three times a second, it can cause seizures (e.g., videos and animations). See also Guideline 2.3 [[wcag2]].
motion simulation — if a resource simulates motion, it can cause a user to become nauseated (e.g., a video game drawn on the [[html]] canvas element or parallax scrolling with CSS).
sound — certain sound patterns, such as ringing and buzzing, can cause seizures, while loud or sudden changes in volume can also negatively affect users.
EPUB creators have to report whether their EPUB publications contain resources that present any of these hazards to users, as they can have real physical effects.
What precisely constitutes a sound hazard, and how to test for these hazards, is not standardized as of publication of this document. EPUB creators will have to use their discretion on when to specify a sound hazard until additional guidance is developed. This technique will be updated whenever there is more clarity on this issue.
Hazards are identified in the [[schema-org]] accessibilityHazard property. Repeat this property for each hazard.
Unlike other accessibility properties, the presence of hazards can be expressed both positively and negatively. This design decision was made because users most often search for content that is free from hazards that affect them, but also want to know what dangers are present in any publications they discover. To indicate that hazards are not present, use the values "noFlashingHazard", "noMotionSimulationHazard", and "noSoundHazard".
<meta property="schema:accessibilityHazard">flashing</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>
<meta property="schema:accessibilityHazard">noSoundHazard</meta>
Do not skip reporting hazards just because an EPUB publication does not contain any content that could present risks. Users cannot infer a meaning when no metadata is present. The value "none" can be used in such cases instead of repeating each non-hazard. When the "none" value is used, no other hazard values may be specified.
If an EPUB publication contains a hazard, provide additional information about its source and nature in the accessibility summary.
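For a publication that presents none of the three hazards, the single "none" value described above could be declared as follows. This is a minimal sketch; no other accessibilityHazard entries would accompany it.

```xml
<!-- Stands in for noFlashingHazard, noMotionSimulationHazard and noSoundHazard -->
<meta property="schema:accessibilityHazard">none</meta>
```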
If an EPUB creator cannot determine if a publication presents a specific hazard for users, list that hazard as unknown. The following values are used to identify individual unknown hazards: "unknownFlashingHazard", "unknownMotionSimulationHazard", and "unknownSoundHazard".
For example, determining whether sound hazards are present can be challenging as the causes are currently not well defined. In this case, EPUB creators may prefer to set the "unknownSoundHazard" value, as in the following example.
<meta property="schema:accessibilityHazard">noFlashingHazard</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>
<meta property="schema:accessibilityHazard">unknownSoundHazard</meta>
If it is not possible to determine any hazards, the value "unknown" can be used in place of setting the individual hazards to unknown. This value should be used sparingly, however, as it is of no value to users. EPUB creators should make every effort to determine if hazards are present. When the "unknown" value is set, no other hazard values may be specified.
EPUB creators must ensure that information about all three types of hazards is included when not using the "unknown" or "none" values.
Refer to The accessibilityHazard Property [[a11y-discov-vocab]] for more information about this property and its values.
An accessibility summary provides a brief, human-readable description of the accessibility characteristics of an EPUB publication that cannot be expressed through the other discovery metadata.
An accessibility summary is provided using the [[schema-org]] accessibilitySummary property.
The accessibility summary should not simply repeat the conformance information provided in the dcterms:conformsTo property, for example, or the features listed in the schema:accessibilityFeature properties. When other accessibility metadata is present in the package document, systems that process EPUB publications can already present it to users. Repeating it in the summary only makes them hear the information again.
<meta property="dcterms:conformsTo">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>
<meta property="schema:accessibilitySummary">In addition to the requirements of its conformance claim, this publication includes sign language interpretation for all audio content.…</meta>
EPUB creators should not include an accessibility summary when they have nothing more to add to the conformance claim and other discovery metadata.
If an EPUB publication does not meet the requirements for content accessibility in [[epub-a11y-111]], the reason(s) it fails should be noted in the summary. Similarly, if an EPUB creator is hesitant to make a formal claim of conformance, the reasons why can be explained in the summary.
<meta property="schema:accessibilitySummary"> The publication is missing alternative text for complex diagrams. The publication otherwise meets WCAG 2.0 Level A. </meta>
<meta property="schema:accessibilitySummary"> Although this publication meets the requirements of its accessibility claim, the publication contains a motion hazard in chapter 8, page 202, which could cause motion sickness in certain individuals. There is a video showing a rollercoaster going up and down rapidly. Watching the video is interesting, but not essential for understanding the concept of elevation loss and gain. </meta>
<meta property="schema:accessibilitySummary"> This publication strives to meet accepted Web Content Accessibility Guidelines (WCAG) at the AA level. Subject experts were used to create the ALT text. A comprehensive index is included with links to the top of the page. </meta>
<meta property="schema:accessibilitySummary"> AI was used to generate the alt text without review by people familiar with the material. </meta>
<meta property="schema:accessibilitySummary"> People familiar with the material used AI to assist in the generation of Alt text. </meta>
Do not repeat this property to provide translations of a summary. EPUB does not define a method for including translations. Putting different xml:lang attributes on properties does not indicate a translation and could lead to the wrong summary being rendered to users.
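As a sketch of the intended usage, the package document would carry a single summary. The xml:lang attribute is used here only to identify the language of that one summary, and the summary text is reused from the earlier example purely for illustration.

```xml
<!-- One summary only; xml:lang names the language of its text -->
<meta property="schema:accessibilitySummary" xml:lang="en">
    In addition to the requirements of its conformance claim, this
    publication includes sign language interpretation for all audio content.
</meta>
```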
It is not necessary to set the schema:accessibilityAPI property for EPUB publications. EPUB creators are not responsible for the interaction between reading systems and the underlying platform APIs.
It is not necessary to set the schema:accessibilityControl property for EPUB publications. This property does not differentiate issues arising from the reading system interface from those in the underlying content, which has led to confusion about its use.
Meeting the requirements of [[wcag2]] will mitigate most known issues with the content and is sufficient for authoring purposes.
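The following examples show the metadata that would be added to an EPUB publication that has textual and visual access modes, is sufficient for reading by text, contains a transcript, alternative text, and MathML markup, and has a flashing hazard. The first example uses the property attribute form of EPUB 3; the second uses the name and content attribute form of EPUB 2.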
<package … >
<metadata …>
…
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">textual,visual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
<meta property="schema:accessibilityFeature">transcript</meta>
<meta property="schema:accessibilityFeature">MathML</meta>
<meta property="schema:accessibilityFeature">alternativeText</meta>
<meta property="schema:accessibilityHazard">flashing</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>
<meta property="schema:accessibilityHazard">noSoundHazard</meta>
<meta property="schema:accessibilitySummary">The video in chapter 2 presents a flashing hazard. A transcript is provided that covers all the essential information contained in the video. The publication otherwise meets WCAG 2.0 Level A.</meta>
</metadata>
…
</package>
<package … >
<metadata …>
…
<meta name="schema:accessMode" content="textual"/>
<meta name="schema:accessMode" content="visual"/>
<meta name="schema:accessModeSufficient" content="textual,visual"/>
<meta name="schema:accessModeSufficient" content="textual"/>
<meta name="schema:accessibilityFeature" content="transcript"/>
<meta name="schema:accessibilityFeature" content="MathML"/>
<meta name="schema:accessibilityFeature" content="alternativeText"/>
<meta name="schema:accessibilityHazard" content="flashing"/>
<meta name="schema:accessibilityHazard" content="noMotionSimulationHazard"/>
<meta name="schema:accessibilityHazard" content="noSoundHazard"/>
<meta name="schema:accessibilitySummary" content="The video in chapter 2 presents a flashing hazard. A transcript is provided that covers all the essential information contained in the video. The publication otherwise meets WCAG 2.0 Level A."/>
</metadata>
…
</package>