The EPUB Accessibility Metadata Guide details when to set the Schema.org accessibility metadata properties in the EPUB package document.
This guide only details usage for EPUB publications. The guidance in this document may not be relevant to other formats.
Accessibility metadata expressed in the EPUB 3 package document is derived from three main sources: Schema.org [[schema-org]], the EPUB accessibility vocabulary [[epub-a11y-12]] and its related properties, and Dublin Core [[dcterms]].
Schema.org is described as a collaborative, community activity with a mission to create, maintain, and
promote schemas for structured data.
Of particular importance is that it includes the following
set of accessibility properties for describing creative works:
These properties are key to the discoverability of EPUB publications and are central to the requirements of the EPUB Accessibility standard [[epub-a11y]]. They are also used to generate accessibility statements about a publication, as described in the Accessibility Metadata Display Guide for Digital Publications.
To provide maximum value for readers, the metadata needs to be consistent, which is where the Schema.org Accessibility Properties for Discoverability Vocabulary [[a11y-discov-vocab]] comes in. This vocabulary defines controlled sets of values to use with each property (with the exception of summaries, which are free-form text).
But even with a controlled vocabulary, as Schema.org is intended to describe any resource on or referenced from the web, the value definitions do not always fully reflect how to apply them to EPUB publications. That is where this guide fits in. Its goal is to add clarity on how to apply the metadata to both reflowable and fixed-layout EPUB publications.
The properties defined in the EPUB accessibility vocabulary do not have a single focus in the same
way as the Schema.org properties. They are identifiable by the use of the a11y prefix.
One of their primary uses is to provide additional information about a conformance claim. The Dublin Core
dcterms:conformsTo property [[dcterms]] is used to declare conformance to a
standard such as the EPUB Accessibility [[epub-a11y-12]] specification, while the EPUB accessibility
properties provide additional context, such as:
a11y:certifiedBy;
a11y:certifierCredential;
a11y:certifierReport; and
a11y:contactEmail.
The Dublin Core vocabulary also includes the dcterms:date property that is used to indicate
when the evaluation was performed.
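As a rough sketch of how these properties might fit together in the package document, the fragment below chains a conformance claim, its certifier, and the evaluation date using the refines attribute. The conformance string, certifier name, credential, report URL, and date are all placeholder values assumed for illustration. The exact conformance string depends on the version of the EPUB Accessibility specification being claimed.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- Conformance claim; the string shown is illustrative only -->
   <meta property="dcterms:conformsTo" id="conf">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>
   <!-- Who certified the claim (hypothetical organization) -->
   <meta property="a11y:certifiedBy" refines="#conf" id="certifier">Example Certifier Inc.</meta>
   <!-- Credential of the certifier (hypothetical value) -->
   <meta property="a11y:certifierCredential" refines="#certifier">Example Accessibility Badge</meta>
   <!-- Link to the evaluation report (hypothetical URL) -->
   <link rel="a11y:certifierReport" refines="#certifier" href="https://example.com/reports/book-a11y.html"/>
   <!-- When the evaluation was performed (hypothetical date) -->
   <meta property="dcterms:date" refines="#certifier">2024-01-15</meta>
</metadata>
```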
The EPUB accessibility vocabulary also includes properties that are not yet, or may never, be officially
added to the vocabulary in [[epub-a11y-12]]. The most notable of these properties is
a11y:exemption, which is used to indicate that a publication fails accessibility
conformance but is exempted from conformance under the laws of a jurisdiction.
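An exemption declaration could look something like the following. The value shown is an assumption based on one of the exemption categories associated with the European Accessibility Act. Check the current EPUB accessibility vocabulary for the values that are actually defined.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- Assumed exemption value; verify against the current vocabulary -->
   <meta property="a11y:exemption">eaa-disproportionate-burden</meta>
</metadata>
```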
The combination of all this accessibility information in the EPUB package document provides users with a better picture of whether the content is going to meet their specific needs. The rest of this document delves into the specifics of when and how to apply this metadata.
Accessibility metadata can also be expressed in supply chains through the provision of ONIX records, but that is outside of the scope of this document. For more information, refer to the crosswalk between package document and ONIX metadata as well as the essentials accessibility metadata guide and advanced accessibility metadata guide.
An access mode is defined as a "human sense perceptual system or cognitive faculty through which a user may process or perceive the content of a digital resource." [[iso24751-3]] For example, if an EPUB publication contains images and video, visual perception is required to consume the content exactly as it was created.
There are four access modes that are typically specified for EPUB publications:
textual — the publication contains text content (headings, paragraphs, etc.).
visual — the publication contains visual content such as images, graphics, diagrams, animations, and video.
auditory — the publication contains auditory content such as standalone audio clips and audio soundtracks for video content.
tactile — the publication contains tactile content such as embedded braille and tactile diagrams.
For a user to determine whether an EPUB publication is suitable for their needs, they need to know
which access modes are required to consume the content. List all applicable access modes in the schema:accessMode property [[schema-org]], repeating the property
for each applicable mode. Do not group all the access modes together in a single tag.
The order in which the access modes are declared currently does not carry any meaning, but best
practice is to list the access modes by their relevance to reading the content. For example, if a
book is primarily text but has some images, list textual first and visual
second.
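A minimal sketch of this pattern, assuming a book that is primarily text with some informative images, repeats the property once per mode and lists the most relevant mode first:

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- One meta element per access mode, most relevant first -->
   <meta property="schema:accessMode">textual</meta>
   <meta property="schema:accessMode">visual</meta>
</metadata>
```

A single element containing "textual,visual" would not be a valid way to express the same information for this property.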
Do not list access modes for content that does not contain information necessary to
understand a publication. Most EPUB publications contain cover images, for example, but it is not
necessary to see the cover image to read the publication. The same is true of publisher logos and
images in the content that are only for presentational purposes (i.e., they carry no information and so have empty
[[html]] alt attribute values). If these are the only visual content, then it is not
valid to list a visual access mode. Similarly, if the only audio an EPUB publication contains is
background music (e.g., for an instructional video with text captions, or as mood music while
reading), listing an auditory access mode is not valid.
Note that the access modes of the content do not reflect any adaptations that have been provided. For example, if a comic book also includes alternative text for each image, it does not have a textual access mode. See the following section on sufficient access modes for how to indicate that the available adaptations allow the content to be consumed in another mode.
For examples of how to express primary and sufficient access modes in real world situations, refer to .
[[[epub-a11y-12]]] [[epub-a11y-12]] only recommends that the access modes of a publication be
specified in the accessMode property [[a11y-discov-vocab]] as they are not as important
for determining usability as the sufficient access
modes.
Regardless, it is considered best practice to always specify the access modes to provide users with the most complete information about the content.
The primary access modes are the ones that directly capture how the information necessary to read and understand a publication is encoded. They do not include affordances to make content accessible, such as alternative text and extended descriptions to make visual content readable. The following sections explain when to apply each of the primary access modes.
The most common access mode for EPUB publications is textual. It indicates that at least some of the information necessary to read a publication is encoded using Unicode text characters, and is declared by setting a textual access mode in the package document.
Note that the emphasis for declaring a textual access mode is that the information is necessary to read the publication. This means that not all the textual data that might be found in a publication counts towards declaring a textual access mode.
Some exceptions where a textual access mode would not be declared include:
As access modes do not take into account any affordances used to make non-textual content accessible (such as alternative text and descriptions for images or transcripts for audio), do not declare a textual access mode if these are the only text content (e.g., for a manga or comic).
A textual access mode is also not declared if the only text content is encoded as images. For example, headings represented as images in order to preserve unique styling and images of text in the panels of comics are not considered in determining if there is a textual access mode.
If the text data is preserved separately from the image data, for example by using SVG's text elements, then a textual access mode would be declared as such data is accessible to assistive technologies and can sometimes have its appearance modified by reading systems.
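To illustrate the distinction, the following hypothetical SVG heading keeps its characters in a text element rather than converting them to drawn paths, so the text remains available to assistive technologies. The dimensions, styling, and heading text are invented for the example.

```xml
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 400 120">
   <!-- Decorative background shape -->
   <rect x="0" y="0" width="400" height="120" fill="#f4f0e6"/>
   <!-- The heading is real character data, not an outline of glyph shapes -->
   <text x="20" y="75" font-family="serif" font-size="36">Chapter One</text>
</svg>
```

If the same heading were exported as path elements or embedded as a PNG, the characters would exist only as imagery and would not count toward a textual access mode.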
It is common for EPUB publications to contain some visual content, and even be entirely image based as in the case of comics and manga. Visual does not mean the information is only in static images, either. Visual content also encompasses videos as well as any dynamic content, such as interactive images or games drawn on the [[html]] [^canvas^] element.
When at least some of the information necessary to read a publication is encoded in visual form, a visual access mode is declared in the package document.
Not all visual content in a publication carries information necessary to its understanding. Examples of when a visual access mode would not be set include:
Auditory content includes any content that requires a user to be able to hear it in order to understand it. The most common forms of auditory content are standalone audio clips embedded in a publication and any video that is accompanied by an auditory track, but sounds can be important in other ways, too. If an interactive graphic drawn on the [[html]] [^canvas^] element includes auditory cues or announcements, for example, or if auditory cues are the only indication to users that time is running out on an interaction, auditory perception is required.
When at least some of the information necessary to read a publication is encoded in auditory form, an auditory access mode is declared in the package document.
Not all auditory content in a publication carries information necessary to its understanding. Examples of when an auditory access mode would not be set include:
A special case for audio in EPUB 3 involves text and audio synchronization using the media overlays feature [[epub-3]]. Setting the correct access modes and sufficient access modes for EPUB 3 publications that contain synchronized text-audio playback requires evaluating whether playback is essential to reading the publication or an additional feature.
One of the more common features of media overlays, particularly in mainstream publishing, is to provide synchronized playback of the full text of a publication with full audio narration. These types of publications are commonly referred to as "read aloud" books, as the user chooses whether or not to turn on the narration (unlike traditional audiobooks where only the audio is available).
In this case, because the audio playback is an extra feature, and is not required for reading
the publication, it is not listed as an auditory access mode. Instead, the
presence of text and audio synchronization is declared using the synchronizedAudioText accessibility
feature.
Having full audio playback capability in this case instead means that an auditory sufficient access mode needs to be declared, because the user can listen to the complete publication as an alternative to reading the text.
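Putting these pieces together, the metadata for a read aloud book might look like the following sketch. It assumes a text-only publication with full narration and no informative images.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- Primary access mode: the information is encoded as text -->
   <meta property="schema:accessMode">textual</meta>
   <!-- The optional narration is reported as a feature, not as an access mode -->
   <meta property="schema:accessibilityFeature">synchronizedAudioText</meta>
   <!-- Either pathway alone is enough to read the whole publication -->
   <meta property="schema:accessModeSufficient">textual</meta>
   <meta property="schema:accessModeSufficient">auditory</meta>
</metadata>
```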
When media overlays are present, EPUB creators should not add "auditory" to
all the possible sufficient access modes just because it is possible to turn on
text-audio playback.
For example, a publication with media overlays that has text and images (with text alternatives) would only declare the following sufficient access modes:
<meta property="schema:accessModeSufficient">textual</meta>
<meta property="schema:accessModeSufficient">auditory</meta>
<meta property="schema:accessModeSufficient">textual,visual</meta>
It would not include declarations for "textual,auditory" or
"textual,visual,auditory".
Another use for media overlays — more typical among accessible republishers such as libraries that serve blind and low vision readers — is to provide full audio synchronized to the major headings of a publication. These types of publications are more like traditional audiobooks, as all the information in the work is typically available in auditory form. The minimal text does not allow meaningful visual reading or text-to-speech playback; it is only to provide structured navigation capabilities.
In this case, being able to hear the audio is essential to being able to read the publication, so an auditory access mode is declared.
In a reverse of the first case discussed, EPUB creators will not identify text-audio synchronization as a feature of the publication since the amount of text provided is trivial.
Likewise, since the text in the publication only provides heading navigation, EPUB creators will not list a textual access mode. The user does not need to, and is not expected to, visually read this text.
Some publications that provide the body in auditory form may include the backmatter in text form, allowing users to use text-to-speech playback to render it. In this case, there would be a textual component.
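For the audio-first case, a sketch of the resulting metadata might be as simple as the following. It assumes the publication has no text beyond the navigational headings, so neither a textual access mode nor the synchronizedAudioText feature appears.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- Hearing the audio is essential to reading the publication -->
   <meta property="schema:accessMode">auditory</meta>
   <!-- The whole publication can be consumed by listening -->
   <meta property="schema:accessModeSufficient">auditory</meta>
</metadata>
```

If the backmatter were included as readable text, a second schema:accessMode declaration with the value textual would be added.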
Tactile is less common in mainstream EPUB publications but it is sometimes found in content produced specially for users who are blind. Tactile partly overlaps with both textual and visual in what it represents, but the difference is in how. Using braille Unicode characters to represent the text content is a direct corollary to a textual access mode, while tactile graphics and objects are how visual imagery gets represented.
Tactile for text only refers to the use of Unicode braille characters (see Braille Patterns). A tactile access mode is never declared simply because it is possible to convert standard text content into braille on the fly. Users seeking to find content that can be converted in this way will look for a fully textual sufficient access mode.
When at least some of the information necessary to read a publication is encoded in tactile form, a tactile access mode is declared in the package document.
The same as with the textual and visual content it represents, not all tactile content in a publication may carry information necessary to its understanding. The inclusion of inessential textual and visual content in a fully tactile edition, however, is generally quite rare. But, a mainstream publication could, for example, include links to embossable tactile graphics or printable 3D objects. As this linked content would be considered affordances, or at least supplementary aids, a tactile access mode would not be declared.
In the case of linked content, a tactile graphic or tactile object accessibility feature would be declared instead.
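As an illustration, a mainstream textbook that links to embossable tactile graphics might be described as follows. The choice of a text-and-image textbook is an assumption made for the example. Note the absence of a tactile access mode.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- Primary access modes reflect the publication itself -->
   <meta property="schema:accessMode">textual</meta>
   <meta property="schema:accessMode">visual</meta>
   <!-- The linked tactile material is reported as a feature -->
   <meta property="schema:accessibilityFeature">tactileGraphic</meta>
</metadata>
```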
The visual content indicators are a special class of access mode that extend the visual access mode. They provide additional information about the type of content that is encoded as images.
The most common method of encoding visual content in EPUB is as images. Although support for MathML has improved in recent years, it is still common, for example, to find math equations in EPUB publications represented as images to ensure their rendering. Informing users who are blind that a publication has images of math lets them know that they will have to rely on alternative text and extended descriptions instead of being able to inspect the equations in the kind of detail that MathML allows. It is also an aid to visual readers, as it indicates that the text cannot be resized (only zoomed) or have its appearance changed as necessary to make it more accessible.
But the visual content indicators are not exclusive to image formats. If a video has information that requires color perception to understand, for example, a color dependence indicator could be set. Or if an interactive chart is drawn on an [[html]] [^canvas^] element, the chart on visual indicator could be set.
More than one visual content indicator can apply to a piece of visual content. A publication with an image of a chart that uses color to represent the data, for example, would include both the chart on visual and color dependence indicators. When declaring visual content indicators, make sure to check a publication's visual content for all that apply.
As the visual content indicators are a supplementary piece of information, their use is only advisory. It is more important to ensure that a visual access mode is set, and a declaration of a visual content indicator must never occur without a declaration of a visual access mode accompanying it.
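A short sketch of how indicators accompany the visual access mode, assuming a text-based publication that contains a color-coded chart image:

```xml
<metadata>
   <!-- other metadata omitted -->
   <meta property="schema:accessMode">textual</meta>
   <!-- The visual access mode must be present whenever an indicator is declared -->
   <meta property="schema:accessMode">visual</meta>
   <!-- Indicators refine what kind of visual content is present -->
   <meta property="schema:accessMode">chartOnVisual</meta>
   <meta property="schema:accessMode">colorDependent</meta>
</metadata>
```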
The following sections describe the visual content indicator types and how to apply them.
The chartOnVisual indicator is used to declare that there are visual representations of data, such as pie charts, bar charts, scatter plots, and line charts.
For visual representations of concepts, such as in flowcharts, refer to the diagramOnVisual indicator.
The chemOnVisual indicator is used to declare that chemical equations and formulas are represented only in visual form.
The chemOnVisual indicator is typically used to indicate that chemical content is
represented using static image formats like JPEGs and PNGs. It could also apply if a publication
includes a video that, for example, shows chemical formulas and equations on screen that the
reader needs to understand but that are not described in an audio track.
If the chemical content is decorative, then this value is not declared. For example, a cartoon might show a character with a swirl of superficial equations over their head to indicate confusion. As the actual equations are irrelevant to understanding the content, this would not represent a case of visual chemical content.
MathML and LaTeX are sometimes used to represent chemical content in a publication. In these cases, although the final equation is drawn visually, the fact that the drawing is based on structured markup or formatting means it is not purely visual; assistive technologies can help users step through the underlying structure. The use of MathML and LaTeX is instead represented as accessibility features.
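For instance, a chemistry text that marks up its formulas in MathML might be described along the following lines. The MathML-chemistry token is an assumption to be checked against the current discoverability vocabulary, and the example assumes there is no other informative visual content.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- No chemOnVisual indicator: the formulas are structured markup, not images -->
   <meta property="schema:accessMode">textual</meta>
   <!-- Assumed feature token for chemistry expressed in MathML -->
   <meta property="schema:accessibilityFeature">MathML-chemistry</meta>
</metadata>
```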
The colorDependent indicator is used to declare that visual content requires the ability to perceive color in order to understand the information being conveyed.
An example of color dependence would be a bar chart where each bar is a different color and a color-coded key is the only way to determine the meaning of each color. A user who cannot differentiate the colors could not match the key values to each bar.
This value does not apply if there are secondary indicators that allow users to distinguish the information. For example, if the bar chart in the previous example also had lines running in different directions to differentiate each bar, then a user could use those patterns to match the bars to the key. Similarly, if keywords are placed within each bar instead of using a separate color key, there would be no chance for confusion.
The color dependence indicator is not intended to capture issues with CSS styling, such as hyperlinks that might not be perceivable or student exercises coded by border or background color. If a book fails accessibility conformance for these kinds of color problems, it should be noted in the accessibility summary.
The diagramOnVisual indicator is used to declare that there are visual representations of concepts, such as Venn diagrams, flowcharts, mind maps, and family trees.
For visual representations of data, such as in pie and bar charts, refer to the chartOnVisual indicator.
The mathOnVisual indicator is used to declare that math equations and formulas are represented in visual form.
The musicOnVisual indicator is used to declare that musical notation is represented visually.
Unlike math and chemical content, there is no support for rendering a markup language for music, such as MusicXML. Visual representation, often in image form, is the most common way to add music notation. (Text tablature notation can be added to EPUBs for stringed instruments, but as unstructured text it is also not easily accessible to users who cannot visually read the text.)
Images are not the only way that music can be visually encoded. An instructional video in a music book could show musical notation to allow the user to play along with a recorded sample. This would also necessitate declaring a visual music indicator.
The textOnVisual indicator is used to declare that there are images of text.
There are many reasons why images of text are included in EPUB publications. Appearance is one: faithfully reproducing the look of the text in a print edition is sometimes non-negotiable for a publisher, and font and CSS support cannot always reliably replicate an effect. Another is that an original document, like a newspaper clipping or historical document, is reproduced exactly as it was printed. Images of text also include the text encoded in the images of comics and manga, children's books, cookbooks, and other image-based fixed layout publications.
As with the other visual indicators, the representation of text in visual form also encompasses video content. A silent instructional video, for example, might include text directives directly in the video.
In all these cases, it would be appropriate to list textOnVisual as an access mode.
Only if the text in the visual content is in a decorative image or not directly relevant to
understanding the text is there no need to specify the textOnVisual indicator.
For example, the text on signs and billboards in the background of a photograph does not represent text on visual imagery if it is incidental to what the image is conveying. Similarly, a publisher's name in their logo should not be declared as text in visual content.
The use of SVG is a special case when semantic elements are used to add the text to the image. When this is done in reflowable EPUBs, the text on visual indicator can be omitted as there is the possibility to access and change the appearance of the text. But it would not be true in a fixed layout EPUB as reading systems restrict control over the appearance of the pages.
If SVG's text elements are not used to display the text, then textOnVisual is
specified since the text is not separate from the drawn image.
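By way of illustration, a fixed layout comic whose dialogue exists only inside the page images might declare the following. The comic is a hypothetical example.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- All of the information, including the dialogue, is carried by images -->
   <meta property="schema:accessMode">visual</meta>
   <!-- Signals that some of that imagery is text that cannot be restyled or read aloud -->
   <meta property="schema:accessMode">textOnVisual</meta>
</metadata>
```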
The access modes sufficient to consume an EPUB publication express a broader picture of the potential usability than do the primary access modes. The primary access modes identify the default nature of the media used in the publication (e.g., that it has text and images). Sufficient access modes, on the other hand, account for all the ways that a user can read that content.
Another way of thinking of the difference between the primary access modes and the sufficient access modes is that the sufficient access modes also account for the affordances and adaptations that have been provided. They allow users to determine whether the content can be read in a manner that is more useful to them. For example, a user with a screen reader will want to know if a book that consists of text and images can also be read entirely as text without loss of information — that there is alternative text and extended descriptions for the visual content.
Sufficient access modes are identified in the schema:accessModeSufficient property
[[schema-org]].
Unlike the other Schema.org metadata expressed in the EPUB 3 package document, sufficient access modes can consist of one or more values separated by a comma and optional space. Because sufficient access modes are telling users the different ways that the content can be read, each set is expressed in a separate tag rather than each individual value.
The most common reason for having multiple schema:accessModeSufficient declarations is
to capture both the default nature of a publication as well as the alternative reading pathways
available. For example, that a book with textual and visual content is also fully readable as
text.
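A sketch of that case, assuming a book with text and images in which every informative image has a text alternative:

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- Primary access modes: one value per element -->
   <meta property="schema:accessMode">textual</meta>
   <meta property="schema:accessMode">visual</meta>
   <!-- Default pathway: read the text and view the images -->
   <meta property="schema:accessModeSufficient">textual,visual</meta>
   <!-- Alternative pathway: everything is also available as text -->
   <meta property="schema:accessModeSufficient">textual</meta>
</metadata>
```

Each schema:accessModeSufficient element describes one complete way of reading the publication, which is why the values within a set are comma-separated while the sets themselves are kept in separate elements.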
Note that sufficiency of access is often a subjective determination based on an understanding of what information is essential to comprehending the text. Some information loss occurs by not being able to view a video, for example, but the visual or auditory losses might be considered inconsequential if a transcript provides all the necessary information to understand the concepts being conveyed.
For examples of how to express primary and sufficient access modes in real world situations, refer to .
[[[epub-a11y-12]]] [[epub-a11y-12]] requires that the sufficient access modes of a publication be
specified in the accessModeSufficient property [[a11y-discov-vocab]].
Although sufficient access modes are expressed as sets consisting of one or more values, the most strongly recommended sets to list are the ones that in fact consist of only a single value. The reason why these are the most important is that they tell users that all the content can be read in the specified method.
Someone who is blind, for example, might want to know that all the content is available in text form so they can use an assistive technology or their reading system's built-in read aloud functionality to read the book. Or they might want to know that all the content is available as pre-recorded audio for enhanced legibility. They are going to be less interested in knowing that the book can be read by someone able to perceive text, visual, and auditory content (even though this is the default sufficient access mode based on the primary access modes), as the visual content is problematic without the provision of alternative text and long descriptions.
The possible single sufficient access mode values are identical to the primary access modes:
Setting this value indicates that information necessary to read a publication is available in textual form.
Unlike a textual primary access mode, the inclusion of text alternatives such as alternative text, extended descriptions, closed captions, and transcripts does allow a textual sufficient access mode to be set.
When textual is the only sufficient access mode set, it means that all the
information necessary to read the publication is available in textual form, so it is
generally interpreted as indicating that the publication is screen-reader and dynamic
braille display friendly.
A single textual sufficient access mode can be set for EPUB publications that conform to Level A [[wcag2]] or higher, for example, as there must be textual alternatives for non-text content to meet these thresholds.
Setting this value indicates that information necessary to read a publication is available as prerecorded audio (whether prerecorded human speech or synthetic speech).
When auditory is the only sufficient access mode set, it means that all the
information necessary to read the publication is available as prerecorded audio.
A single auditory sufficient access mode typically indicates that media overlays have been used to provide audio for all the text and visual content, but it could also be set if, for example, a book of poetry also contained (non-synchronized) recordings for each poem.
Setting this value indicates that information necessary to read a publication is available in visual form, such as images and video.
When visual is the only sufficient access mode set, it means that all the
information necessary to read the publication is available in visual form. It is most often
applied to works like comics and manga.
A single visual sufficient access mode is typically the least accessible single sufficient access mode that can be set, as it indicates that only sighted readers can access the information of the publication. It does not, by itself, mean the publication is not accessible, but there would need to be another single sufficient access mode declaration to indicate the publication is readable by non-visual users.
Setting this value indicates that information necessary to read a publication is available in tactile form. It indicates some mix of braille Unicode characters, tactile graphics, and tactile objects are included in the publication.
Unlike a tactile primary access mode, the provision of links to tactile graphics and objects does allow a tactile sufficient access mode to be set. Such linked content rarely correlates with a publication being fully readable in tactile form; see the next section on combined reading modes.
When tactile is the only sufficient access mode set, it means that all the
information necessary to read the publication is available in tactile form.
A single tactile sufficient access mode is the least common single sufficient access mode, at least for mainstream publishing. It is typically associated with a braille publication produced in the EPUB format.
Although the single sufficient access modes are considered the most important to set, it is also helpful to give users a complete picture of all the ways the content can be read, especially if the primary access modes are not set.
Publications that have more than one primary access mode always have a sufficient access mode set
that is the combination of all those primary modes. For example, a book with text and images has
textual and visual primary access modes. Since those two values tell
the user that they have to be able to read text content and perceive visual content, the content has
a sufficient access mode that is also those two values.
Not all publications have more than one sufficient access mode. Novels, for example, are often purely textual and in such cases only have a single textual sufficient access mode.
Because sufficient access modes account for affordances, the possible sets of sufficient access modes
are not always strictly combinations of the primary modes, even if there is always one set that
equals all the primary modes. An example would be a comic that only provides alternative text so is
not fully accessible as text. In this case, it would have a visual primary access mode
but also a set of sufficient access modes that includes textual.
The number of possible combinations of sufficient access modes grows as there are more primary access modes and affordances to make them accessible. This is why it is not always helpful to users to have all the possibilities listed. It can make the metadata more confusing to understand.
Identifying all the accessibility features and adaptations included in an EPUB publication allows users to determine whether the content is usable at a more fine-grained level than the access modes do.
For example, a math textbook might have a textual access mode, but that alone does not indicate whether MathML markup is available. Whether a visual work only provides alternative text or whether it includes extended descriptions is also important to know when gauging its usability.
Accessibility features are identified in the schema:accessibilityFeature property [[schema-org]]. The property is
repeated for each feature; do not group all the features into one tag.
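For example, a publication with a table of contents, a logical reading order, and alternative text for its images could be described as in the following sketch. The particular mix of features is assumed for illustration.

```xml
<metadata>
   <!-- other metadata omitted -->
   <!-- One meta element per feature; never a comma-separated list -->
   <meta property="schema:accessibilityFeature">tableOfContents</meta>
   <meta property="schema:accessibilityFeature">readingOrder</meta>
   <meta property="schema:accessibilityFeature">alternativeText</meta>
</metadata>
```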
The EPUB format requires that some accessibility features will always be present (e.g., a table of contents). Do not exclude these features from the accessibility metadata, as users typically are not aware what features are built into a format. Failing to include entries will reduce the discoverability of the publication when users search for specific features.
[[[epub-a11y-12]]] [[epub-a11y-12]] requires that the accessibility features of a publication
be specified in the accessibilityFeature property [[a11y-discov-vocab]]. This means
that at least one accessibility feature must be declared in the package document metadata in
order for an EPUB publication to claim conformance.
One of the most common questions about accessibility features is not what they refer to but when they can be claimed in the package document metadata. Features may not be applicable in every situation, as is the case for extended descriptions. Or a mix of features might be used, such as to provide alternatives for audio and video content.
Ideally, a feature will only be claimed when it, possibly in combination with other features, covers all cases, as this will make the content optimally accessible. But limiting reporting to this standard would mean that users will have difficulty determining if content that may not be fully conforming to accessibility standards may still be usable to them.
Consequently, a measure of common sense is necessary when deciding whether to apply a feature that is not universal in application. The goal of this metadata is to help users determine the suitability of the content, not to make publishers feel better about content that they know they have not made fully accessible.
If only one out of a hundred images that need extended descriptions has one, for example, claiming that the publication provides this feature is clearly stretching the truth. But if half the images are described, there is some value to users.
The key when not providing full coverage is to detail why not in the accessibility summary so that users are aware of the limitations up front. The following information is especially useful to provide when making claims against incomplete coverage:
There may be times when the presence of accessibility features is not known but a statement is still needed in the package document metadata. This usually happens if a placeholder is needed until an accessibility evaluation is carried out.
To indicate that the presence of accessibility features is not known, the unknown (feature) accessibility feature is declared.
The unknown value is not intended for permanent use in an EPUB. If a publication is
distributed with the value because the publisher cannot wait on an accessibility review to
complete, it is expected that they will release an updated publication with new metadata once
the evaluation is complete.
The unknown value must not be set if any other
accessibilityFeature value is also declared. Declaring unknown
acts as a temporary placeholder, explicitly stating that an accessibility audit has not yet been
performed. It distinguishes the content from a publication that has been audited and confirmed
to have no features.
It is first worth noting that it is rare that there are no accessibility features in an EPUB publication, even if the publication as a whole does not meet minimum accessibility conformance standards. Except for some edge cases, all EPUBs can typically claim to provide a table of contents, for example, since this is a required component of the standard. It is also most often the case that reflowable EPUB publications can claim a reading order.
But in the case where an EPUB publication provides no accessibility features, the lack of features is indicated in the package document metadata using the none (feature) value.
This value must not be set if any other accessibilityFeature value
is declared. Declaring the none value provides an explicit and unambiguous
statement that no specific accessibility features have been included, which aids in content
discovery. This is different from a publication that simply omits all accessibility
metadata.
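The unknown and none values are each declared on their own, without any other feature. A sketch of both cases follows; the two fragments are alternatives and would never appear together in the same package document.

```xml
<!-- Case 1: no accessibility evaluation has been carried out yet -->
<meta property="schema:accessibilityFeature">unknown</meta>

<!-- Case 2: the publication was evaluated and has no features to report -->
<meta property="schema:accessibilityFeature">none</meta>
```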
This section addresses accessibility features based on common use patterns rather than as individual values in isolation.
All the features that are used to address the accessibility of audio and video are discussed together, for example, to better explain the purpose and complementary uses of the features. This also allows features to be discussed that have exclusive values, such as whether full or partial ruby annotations are provided.
A navigation menu is included at the start of each section to help locate which features are addressed in it. In addition, the index of terms provides links to all the features and other properties defined in this document.
In this section:
EPUB 3 publications allow the aural rendering of text content through the use of text-to-speech (TTS) playback (commonly referred to as "read aloud" capability) and media overlays. Both renderings are capable of providing audio playback of the text synchronized with highlighting of the associated text.
Text-to-speech engines, however, often fail to provide accurate readings of the text. Heteronyms, for example, where two words are spelled the same but have different pronunciations, often result in the wrong word being pronounced. Complex words, such as chemical names and technical jargon, also result in unintelligible playback, as do fictional words and names created by an author (for example, in works of science fiction). TTS playback also suffers from an inability to indicate emphasis or to control the pacing and prosody of the narration.
It is possible to provide enhancements in the markup of EPUB publications to help TTS engines render text accurately. EPUB 3's SSML attributes, for example, were intended to allow the inlining of pronunciation information, while PLS lexicons [[Pronunciation-Lexicon]] were designed to provide a reusable library of pronunciations. CSS Speech Level 1 [[CSS-Speech]] properties address issues of text-to-speech playback quality (for example, the voice to use, whether to spell out words, and when to provide emphasis).
Unfortunately, at this time none of these technologies have been proven to work at any kind of scale in EPUB reading systems. EPUB 3's SSML attributes will likely never be implemented and are best avoided. PLS lexicons are theoretically very useful but there are no known implementations after over fifteen years of existence. And CSS Speech has only had limited experimental support in Safari browsers. There is a possibility that PLS lexicons and CSS Speech will find support some day, and other technologies may come along to fix the problem of inlining SSML pronunciation information.
Despite these shortcomings, if a publisher makes the effort to enhance TTS playback, they can indicate this by adding the ttsMarkup feature to the EPUB package document.
It is only recommended to set this feature if the TTS enhancements substantively cover the rendering issues that are likely to arise — recognizing that different TTS engines have different capabilities and authors cannot anticipate every misreading they might make.
Media overlays differ from text-to-speech playback in that they provide prerecorded audio synchronized with the text. The synchronization, and granularity of playback synchronization, is set by the creator of the EPUB publication through markup files that conform to a profile of the SMIL [[SMIL]] standard.
When including media overlays, the synchronizedAudioText feature is specified in the package document.
The audio provided with media overlays is typically human narrated but some producers will also use text-to-speech engines to create the audio. TTS is often used when providing audio-synchronized playback of backmatter, which is tedious for humans to read.
Although this prerecorded synthetic speech is prone to the same failings as dynamic TTS generation,
the ttsMarkup feature is not set in the package document metadata even if the producer makes an
effort to improve their audio. There currently is no feature to differentiate the quality of
prerecorded synthetic speech.
It is possible with media overlays to synchronize dynamic text-to-speech rendering with text highlighting, although this is not commonly done or supported by reading systems.
TTS enhancements and media overlays are not mutually exclusive. TTS markup enhancements could be provided, for example, to improve the quality of the media overlays' dynamic TTS generation feature. Some non-visual readers also prefer to use a reading system's read aloud capabilities even when media overlays are available, as it allows them to read faster.
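Because the two approaches can coexist, a publication that provides both could declare both features. A minimal sketch of the package document metadata, assuming the TTS enhancements are substantive and media overlays are included:

```xml
<!-- TTS enhancements (e.g., pronunciation markup) are provided -->
<meta property="schema:accessibilityFeature">ttsMarkup</meta>

<!-- prerecorded audio is synchronized with the text via media overlays -->
<meta property="schema:accessibilityFeature">synchronizedAudioText</meta>
```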
In this section:
Providing accessible markup for chemical formulas and equations allows users to inspect the equations themselves rather than relying on images of the content with alternative text and descriptions provided by the publisher. Even when care is taken in the writing of descriptions, ambiguities can arise because human language is not always as precise as the equations themselves.
Accessible markup also provides a robust, accessible alternative to images, as the notation can be resized, searched, copied, and interpreted correctly by screen readers.
Although EPUB 3, and web browsers more broadly, do not currently support ChemML, there are still two markup syntaxes that can be used to add accessible chemical formulas and equations.
The first of these, despite its name, is MathML. MathML provides a structured, semantic representation that can be rendered directly by modern browsers and assistive technologies. MathML markup can be added directly to HTML documents or stored in separate files and embedded using the [^object^] element. Just as it is able to represent mathematical equations, its syntax is also adaptable to representing chemical equations and formulas.
When an EPUB publication contains chemical formulas and equations marked up as MathML, the MathML-chemistry feature is set in the package document.
It is not required that every chemical equation or formula be represented as MathML in order to set this value, but the coverage should be substantive. Judgement will always be involved in determining whether MathML is necessary for every equation. Some simple equations may not benefit from the additional markup overhead.
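As an illustration, the formula for water could be marked up as follows. This is only a sketch: the choice of elements and attributes for chemical notation is an assumption here, and publishers may follow different MathML conventions. The first fragment belongs in the package document metadata and the second in an XHTML content document.

```xml
<!-- package document metadata -->
<meta property="schema:accessibilityFeature">MathML-chemistry</meta>

<!-- XHTML content document: the formula for water -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msub>
      <mi mathvariant="normal">H</mi>
      <mn>2</mn>
    </msub>
    <mi mathvariant="normal">O</mi>
  </mrow>
</math>
```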
The other way to represent chemical equations and formulas is using the LaTeX typesetting system. LaTeX rendering is not natively supported in reading systems, so it is often only used by publishers for preparing print manuscripts and transformed into another form, like images, for digital publications. But it is possible to embed LaTeX-formatted equations and use a JavaScript library like MathJax to translate them for visual reading (the caveat, of course, is that not all reading systems support scripting).
Another possible use for LaTeX-formatted equations is as the alternative text for images, although this can lead to poor pronunciation by assistive technologies.
One reason for including the raw LaTeX is that it offers users the opportunity to copy the equations and use them in an application that allows them to read the syntax. But whether users are able to do this depends on whether they know how to unzip an EPUB and locate the equations, and whether digital rights management schemes even allow them access into the EPUB.
For these reasons, it is not common to find LaTeX-formatted chemical equations and formulas in EPUBs, but if the syntax is provided the latex-chemistry feature is set in the package document.
Similar to setting MathML-chemistry, it is not required that every chemical equation or
formula be represented as LaTeX in order to set this value, but the coverage should be substantive
for all complex equations.
Although in theory it is possible for an EPUB publication to contain chemical equations and formulas encoded both in MathML and LaTeX, such dual encoding is not found in practice. Publishers use one or the other syntax.
In this section:
The displayTransformability accessibility feature value is used to indicate that the user can modify the presentation of the text content of an EPUB publication to their own preferences without negatively affecting the readability.
When to set the displayTransformability value depends on the characteristics of the
language a publication is written in. In general, the following characteristics serve as a basis for
what must be modifiable:
Making sure content is transformable on these characteristics will be sufficient for many languages. But not all of these characteristics will apply to all languages. And depending on the language, it can be equally important to ensure that other characteristics are modifiable, such as:
Providing guidelines tailored to every language is outside the scope of this guide, so publishers will need to take local accessibility needs into consideration when determining if the content is sufficiently transformable. One measure of conformance is to match the transformability to common display control properties provided by reading systems tailored to the local language.
To claim this feature, note that text must not be represented as images unless it falls under one of the exceptions specified in success criterion 1.4.5 [[wcag2]] or the text is incidental to the purpose of the image (e.g., text on a billboard or street sign in the background of an image).
In general, reflowable EPUB publications will meet display transformability requirements so long as none of the content is constrained visually by its bounding box. This typically only happens if overflow is clipped using CSS to prevent scroll bars from appearing. An example would be if a publisher adds a sidebar with a fixed height and width and clips any content that exceeds these dimensions. If the text is sized to neatly fit in this box only at the publisher's preferred display, increasing the font size or spacing would cause text to progressively disappear.
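The sidebar scenario might look like the following XHTML fragment. The dimensions and the inline style attribute are hypothetical, chosen only to show the pattern; the same effect arises when the declarations live in a stylesheet.

```xml
<!-- Problematic: text that no longer fits the fixed box is clipped -->
<aside style="width: 300px; height: 200px; overflow: hidden;">
  <p>Sidebar text sized to fit only at the publisher's preferred display.</p>
</aside>

<!-- Safer: no clipping, so the box can grow as the text is enlarged -->
<aside style="min-height: 200px;">
  <p>Sidebar text that remains fully visible at larger font sizes.</p>
</aside>
```

A publication that relies on the first pattern for meaningful content would not be a good candidate for the displayTransformability claim.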
This feature is not applicable to fixed layout EPUBs because reading systems do not allow users to modify the display. Reading systems must support the accessibility of a feature before it can be claimed.
In this section:
The schema:accessibilityFeature property defines two values for indicating that the
display of an EPUB publication has been enhanced by the publisher.
The first of these is the highContrastDisplay feature, which is used to state that there is a higher-than-normal contrast between text and its background. This improves legibility for users with low vision or color vision deficiencies who require greater contrast to read comfortably.
This feature can be set when text and background color combinations meet or exceed the 7:1 contrast ratio for normal-sized text. It is the equivalent of meeting [[wcag2]] success criterion 1.4.6.
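For reference, the contrast ratio defined by [[wcag2]] compares the relative luminance of the lighter color, written here as L_1, with that of the darker color, L_2. The block below restates that definition and works through the extreme case of black text on a white background; the symbol names are a notational choice for this sketch.

```latex
% Contrast ratio between a lighter color (luminance L_1)
% and a darker color (luminance L_2)
\[
  \mathrm{contrast\ ratio} = \frac{L_1 + 0.05}{L_2 + 0.05},
  \qquad 0 \le L_2 \le L_1 \le 1
\]

% Black text on a white background: L_1 = 1, L_2 = 0
\[
  \frac{1 + 0.05}{0 + 0.05} = 21
\]
```

The ratio therefore ranges from 1:1 to 21:1, and a color pair supports the highContrastDisplay claim for normal-sized text only when it reaches at least 7:1.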
Note that images of text also have to meet the contrast requirements in order to set this feature. The only exceptions are defined in SC 1.4.6, with the most typical exception for digital publications being for publisher and imprint logos.
There is no feature for meeting the minimal contrast requirements defined in [[wcag2]] success criterion 1.4.3.
The other feature for enhanced display is largePrint, which is used to indicate that a publication has been formatted for users who require larger font sizes to read.
There is no single accepted definition of what constitutes large print, so publishers should consult regional guidelines, when available, for what is acceptable. In general, though, a font size of 14pt to 16pt is considered enlarged print, while a font size of 18pt and above is considered large print. For this reason, it is recommended to include information about the large print formatting used in the accessibility summary so that users are not left guessing which size has been used.
In practice, large print is rarely used in EPUB publications. Reflowable EPUBs can typically have their font size, family, line spacing, and other properties adjusted to whatever the user prefers (see the displayTransformability feature). This flexibility serves users much better than a specific formatting picked by the publisher.
It is also problematic to indicate that reflowable EPUB publications are formatted for large print reading when user preference settings are typically applied to the text. The formatting the publisher has provided may not be viewable unless the user reverts their own preferences.
The largePrint feature must never be used to indicate that it is possible to reformat
the text (i.e., that the display is transformable). Features are set based on the defaults of a
publication, not on what it might be possible to achieve.
The most likely scenario for setting the largePrint feature is when a print-equivalent
reproduction of a physical large print book is created using EPUB 3 fixed layouts. But the reality
is that such works are rarely, if ever, created.
In this section:
Providing text alternatives is key to making images accessible to readers who cannot perceive them or have difficulty processing the content.
There are two primary features that identify that accessible alternatives have been provided:
alternative text and extended descriptions. Alternative text is a short description that can stand
in for the image (usually added to the alt attribute in HTML documents) while extended
descriptions provide additional detail when images express information that cannot be conveyed
through a brief description.
With the exception of decorative images (those that convey no information needed to understand the text), all images require alternative text to be accessible. The accessibility feature for indicating that alternative text has been provided is alternativeText.
Publishers can declare that alternative text is provided even if not every image is described, although omitting alternative text is never recommended.
The alternative text provided for images must be meaningful both to meet accessibility requirements
and to claim the alternativeText accessibility feature. A publication that simply
auto-fills every alt attribute with the text "Image", for example, is not providing
users with a meaningful alternative.
Sometimes images will contain more information than can be expressed in a short alternative text field. In these cases, it is also necessary to provide an extended description. The accessibility feature for indicating that extended descriptions have been provided is longDescription.
Similar to the requirement for claiming that alternative text has been provided, the extended descriptions for images must fully capture all information being conveyed through the image. It is not required that every image have a description, though, only those that require more extensive detail.
The name "long description" was commonly used in the past for extended descriptions and was the
nomenclature behind the now obsolete longdesc attribute in HTML. The name was a
misnomer, however, as the descriptions provide additional detail and complement alternative
text; they are not simply longer in length. That is why "extended description" is now the common
name. Changing the schema.org vocabulary would only add development costs, so the old name is
still used for metadata.
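The following sketch shows both declarations alongside a contrast between placeholder and meaningful alternative text. The file name and the wording of the description are hypothetical, and the markup used to attach an extended description is deliberately left out because it varies between publications.

```xml
<!-- package document metadata -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
<meta property="schema:accessibilityFeature">longDescription</meta>

<!-- XHTML content document -->
<!-- Not meaningful: placeholder text does not support the claim -->
<img src="images/fig1.png" alt="Image"/>

<!-- Meaningful: a short description that can stand in for the image -->
<img src="images/fig1.png" alt="Bar chart of annual rainfall by region"/>
```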
In this section:
There are several ways to provide accessible math equations in EPUB publications, but support in reading systems has historically been a barrier to the adoption of the most accessible. For example, although MathML would intuitively seem to be the most accessible way to add math, it was not well supported in HTML until recently. Its rendering in older reading systems still often relies on a JavaScript library that greatly slows the presentation the more equations there are to render.
With improved rendering in newer reading systems, MathML is now recommended as the most accessible way to provide math content. When MathML is provided for math content, the MathML accessibility feature is set in the package document metadata.
MathML markup allows for intelligent interpretation and voice rendering by assistive technologies, allows users to explore the components of an equation, and even allows copying and pasting of equations into external applications.
Because of the history of poor support for MathML in reading systems, there is also a legacy requirement to identify all documents that contain MathML in the manifest. This was to allow reading systems to selectively turn on MathJax rendering without slowing down the rendering of documents that had no math, for example, or to present a fallback document with math encoded as images. Do not rely on vendors inferring that MathML is present from this information.
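The manifest flag in question is the mathml value of the properties attribute on a manifest item. A sketch follows, with hypothetical identifiers and file names; note that the accessibility feature is still declared separately in the metadata.

```xml
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- the feature is declared here regardless of the manifest flags -->
  <meta property="schema:accessibilityFeature">MathML</meta>
</metadata>
<manifest>
  <!-- this content document contains MathML, so it carries the property -->
  <item id="ch03" href="xhtml/ch03.xhtml"
        media-type="application/xhtml+xml" properties="mathml"/>
  <!-- no MathML in this document, so no property is set -->
  <item id="ch04" href="xhtml/ch04.xhtml"
        media-type="application/xhtml+xml"/>
</manifest>
```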
Although ideally all math equations would be represented as MathML, in practice this is often impractical to do. Basic math equations can often be represented using ASCII text without any loss of information, for example. When claiming that MathML is available as an accessibility feature, however, all, or substantively all, equations that would be ambiguous in pure text form should be represented using MathML markup.
LaTeX is another formatting syntax for expressing math equations and its presence is indicated in accessibility metadata using the latex accessibility feature.
Unlike other formal technology names, like MathML and ARIA, the spelling "LaTeX" is not used to identify the feature in the schema.org vocabulary. It is strongly recommended to use the lowercase vocabulary spelling to avoid issues with processors that are case-sensitive.
Unlike MathML, which is markup based, LaTeX uses text commands (identifiable by a starting backslash) to structure the different components of an equation.
Reading systems do not natively support the rendering of LaTeX equations the way they do MathML, but it is possible to add a JavaScript library like MathJax to visually render the equations. Due to the lack of consistent support for scripting in EPUB, however, most publishers do not include LaTeX formatting directly in their publications. It is more commonly found in the source from which an EPUB is generated, and converted to an image for the EPUB; otherwise, users without a scripting-capable reading system would have to view and read the raw LaTeX formatting.
Even though direct rendering of LaTeX is not well supported, a possible accessible compromise would be to include the raw LaTeX-formatted equation as the alternative text for an image. This would potentially allow users to copy the LaTeX to another application to inspect it. Because assistive technologies expect only plain text as the alternative, and so cannot meaningfully read or display the syntax, this practice is really only recommended when advanced math users are the target audience.
While not the most desirable from an accessibility perspective, it is possible to provide accessible images of math equations if high quality alternative text and descriptions are provided. The presence of described images is indicated in the metadata using the describedMath accessibility feature.
The problem with describing math images is that human language is often not as precise as the equations it tries to describe and there is no way for the user to inspect the original equation to try and clarify. For example, the description "the square root of x and y" could be interpreted as either taking the square root of x and adding that value to y, or adding x and y together and taking the square root of the result.
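The two readings of that description can be written unambiguously in a formal notation. A short sketch in LaTeX, which also illustrates the command syntax discussed earlier in this section:

```latex
% Reading 1: take the square root of x, then add y
\[ \sqrt{x} + y \]

% Reading 2: add x and y, then take the square root of the sum
\[ \sqrt{x + y} \]
```

With x = 9 and y = 16, the first reading gives 19 and the second gives 5, which shows how much can hinge on the ambiguity.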
Images of math are still prevalent in EPUB, however, because improvements to MathML rendering are recent and there are many current and legacy reading systems that do not render MathML well.
In this section:
Audio and video content often require a mix of features to improve their accessibility. While audio-only files only have to take their sound into consideration, video content typically mixes important visual information with audio, both of which need to be conveyed to users. Moreover, a video without any sound still conveys information that has to be captured for users.
The practices for making audio accessible are generally the same regardless of whether the audio is part of a video or is a standalone clip. The biggest difference between the two cases is that audio with video typically requires more context to be accessible because the video content is also presenting important information.
One of the most common methods for making audio content in EPUBs accessible is by adding captions. There are two ways that this can be done: closed captions and open captions.
Closed captions are when the captions are defined separately from the audio content, allowing users to control their appearance and turn them on or off as they prefer. Their presence is indicated in the package document metadata using the closedCaptions accessibility feature.
Although closed captions are most beneficial to users who are deaf or hard of hearing, they can also be helpful to users who are blind or have low vision. It is possible for assistive technologies to read aloud the captions or render them as braille (although this is not common in EPUB reading systems). Audio descriptions and transcripts are generally more helpful for users who cannot see the content.
The typical way to add closed captions for audio in XHTML content documents is using the [^track^] element [[html]], with [[WebVTT]] being the most commonly used technology for captioning.
It is not possible to add closed captions natively to an audio-only clip in HTML. Although the [^audio^] element [[html]] supports tracks, unlike the video element it does not render any visual content. To work around this, the [^video^] element can be used to play audio-only clips with closed captions.
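A minimal sketch of closed captions attached through a track element follows. The file names, language, and label are hypothetical; the same video element pattern applies when the source is an audio-only clip that needs a display surface for its captions.

```xml
<!-- XHTML content document: captions supplied as a separate WebVTT file -->
<video controls="controls" src="media/lecture.mp4">
  <track kind="captions" src="media/lecture.en.vtt"
         srclang="en" label="English captions"/>
</video>

<!-- package document metadata -->
<meta property="schema:accessibilityFeature">closedCaptions</meta>
```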
Open captions are the opposite of closed captions — they are a part of the visual display of a video and cannot be turned off or have their appearance modified, and cannot be accessed by assistive technologies. Their presence is indicated in the package document metadata using the openCaptions accessibility feature.
Open captions are most beneficial to users who are deaf or hard of hearing; they do not make auditory content accessible to users who are blind. (While sign language interpretation is often preferable for users who are deaf, it is not easy to provide as will be discussed later in this section.)
Open captions cannot be used with audio-only clips in HTML unless the audio is converted to a video (e.g., with a black background) and rendered in a [[html]] [^video^] element. The [[html]] [^audio^] element has no display.
While captions are helpful for those who cannot hear the audio, they are not always ideal. Many users who are deaf are more proficient at reading sign language than they are written text. For these users, having an interpreter to translate what is being said on screen is preferable. The inclusion of sign language interpretation is indicated in the package document metadata using the signLanguage accessibility feature.
Similar to captioning, there are two common ways to provide sign language interpretation. One is to include the interpreter as part of the video itself, often in a bubble in the lower corner of the screen. This method is best used where there is not a lot of action (e.g., for a person giving a speech) as it can obscure what is happening for all users.
The other method is to provide sign language interpretation in a separate video and link the two video elements so that they play back together. This method is problematic for ebooks both because HTML does not have a method of synchronizing two video elements (it would require scripting, which is not well supported in reading systems) and also because the author cannot know how the videos will be laid out and paginated (the sign language interpreter could end up on a separate page from the video). It is not common to find sign language interpretation in EPUBs for these reasons.
Another drawback of captions is that they only capture auditory content. For users who cannot see what is happening in a video, the context of what is occurring on screen is lost when only captions are provided. Audio descriptions fill this gap by describing what is happening visually (e.g. actor movements or expressions). Their presence is indicated in the package document metadata using the audioDescription accessibility feature.
Another common method of making both the audio and video content accessible is through a transcript. A transcript, like an audio description, can also describe more than just what is being said. For example, it can be structured more like a screenplay. For this reason, transcripts are also useful for explaining what is happening in video-only content like animated instructional videos. When transcripts are provided, their presence is indicated in the package document metadata using the transcript accessibility feature.
The one drawback of using transcripts is that they are not available in real time like captions and audio descriptions. As a result, users who can see the content but not hear it may have to watch the video once in silence and then try to match up through memory what they are reading in the transcript against what they just watched. Even with detailed transcripts that include the actions to help orient the user, the time to process the video is greatly increased.
A final audio consideration for users who are hard of hearing or have difficulty processing speech is to provide greater contrast between spoken content and background noises. High contrast audio refers to audio of speech that has no background noise, that maintains a difference of at least 20 decibels between the foreground speech and background noise, or that allows users to turn off any background noise. Its presence is indicated in the package document metadata using the highContrastAudio accessibility feature.
For specific details on evaluating when auditory content meets the requirements to be classified as high contrast, refer to success criterion 1.4.7. Not all audio has to meet this requirement: the audio must be primarily speech, and there is an exemption for musical content.
Information about how the audio meets the requirement should be provided in the accessibility summary (i.e., there is no background noise, there is at least a 20 dB difference between foreground speech and background noise, or the background noise can be turned off).
EPUB publications will often contain a mix of the above features to make audio and video content accessible. It is not required to pick only one feature for all audio and video clips, nor should publishers feel they have to limit themselves to providing only one accessible alternative per clip. As already discussed, different users will prefer different options.
In this section:
Whether an EPUB publication is produced with a print equivalent or not, being able to locate and move to static page break locations is an important feature that helps users navigate the content. It is useful in educational settings, for example, where a user who is blind might use an EPUB to read an assignment instead of the print version their peers are using.
The first feature for identifying static pagination is page break markers. These markers are inserted
into the markup of XHTML content documents using the doc-pagebreak role to make them
readable by assistive technologies.
The epub:type attribute value pagebreak is also often paired with
the doc-pagebreak role for historical reasons, but the value does not provide any
accessibility benefits.
The presence of static page break markers is indicated in the package document metadata using the pageBreakMarkers accessibility feature.
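A page break marker and the matching declaration might look like the following sketch. The id value and label text are hypothetical, and the epub prefix is assumed to be bound to the EPUB namespace on the root element of the content document.

```xml
<!-- XHTML content document: marker for the start of page 24 -->
<span id="page24" role="doc-pagebreak" aria-label="Page 24"
      epub:type="pagebreak"/>

<!-- package document metadata -->
<meta property="schema:accessibilityFeature">pageBreakMarkers</meta>
```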
Static page break markers, as their name suggests, do not change depending on the device being used or the font and spacing preferences of the user. The pagination built into many reading systems is not static because it changes depending on these display factors.
It is not required that every page break marker from the source be present, or that they be in the same order as the source, as digital publications often omit material that is only relevant to print and have front and back matter rearranged. The markers for blank pages at the end of chapters and sections are also often omitted from digital publications. When page markers are omitted, the accessibility summary should be used to explain why.
If the page break markers provided have limited utility, the pageBreakMarkers feature
should not be claimed. For example, if only the page breaks at the start of chapters are identified,
the ability to synchronize reading is no better than what the table of contents will provide. At a
minimum, if the feature is claimed in cases like this, the limitation must be explained in the
accessibility summary.
While static page break markers can be useful on their own, they are most commonly paired with a page list to allow users to quickly find a destination. The page list, as its name suggests, is a list of hyperlinks to each static page break location.
It is not required that static page break markers, as identified by a role or
epub:type attribute value, be present at the page list's hyperlink
destinations. A destination ID for each page break location is all that is required.
The presence of a page list is indicated in the package document metadata using the pageNavigation accessibility feature.
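A page list is a nav element in the EPUB navigation document. In the sketch below, the file names, fragment identifiers, and label are hypothetical, and the epub prefix is again assumed to be declared on the root element.

```xml
<!-- EPUB navigation document: links to each static page break location -->
<nav epub:type="page-list" role="doc-pagelist" aria-label="Page list">
  <ol>
    <li><a href="chapter01.xhtml#page1">1</a></li>
    <li><a href="chapter01.xhtml#page2">2</a></li>
    <li><a href="chapter01.xhtml#page3">3</a></li>
  </ol>
</nav>

<!-- package document metadata -->
<meta property="schema:accessibilityFeature">pageNavigation</meta>
```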
Similar to page break markers, a page list is most useful when it includes links to all the static page break locations. If the list is incomplete because blank pages or certain front or back matter is omitted, the summary should explain why. If the page list is mostly incomplete, such as when page breaks are only provided for chapter start locations, it is recommended not to claim it as an accessibility feature.
The resource includes the source of pagination so users can safely cite and retrieve citations, or synchronize reading in mixed reading environments where both printed and digital versions are used.
Regardless of whether static page break markers or a page list, or both, are included, the source of the pagination needs to be identified, even when there is no source (EPUBs with no print or other statically paginated version). As there may be multiple print editions of a work, all with different pagination, it is important for users to know which one the EPUB's pagination matches up with.
The source of the pagination is identified in an a11y:pageBreakSource property. It is recommended to use a unique identifier such as an ISBN or ISSN to identify the source, when available.
The pagination source is not an accessibility feature and there is no value to set for it using
the schema:accessibilityFeature property.
If a unique identifier is not available, provide as much information as possible that can uniquely identify the source.
If the pagination is unique to the EPUB, use the value "none".
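The two cases can be sketched as follows. The ISBN is a placeholder, and expressing it as a urn:isbn: URN is an assumption about formatting rather than a requirement stated here. Only one of the two declarations would appear in a given publication.

```xml
<!-- Pagination matches a print edition identified by a placeholder ISBN -->
<meta property="a11y:pageBreakSource">urn:isbn:9780000000001</meta>

<!-- Alternative: the pagination is unique to the EPUB -->
<meta property="a11y:pageBreakSource">none</meta>
```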
The absence of digital rights management (DRM) schemes is a critical accessibility feature as it ensures that users can access the content with their preferred reading systems and assistive technologies, which might otherwise be blocked by encryption. It also allows for personal use and transformations of the content that may be necessary for a user to perceive and understand it (e.g., using it with specialized text-to-speech software).
That a publication is not DRM protected is indicated in the package document metadata using the unlocked accessibility feature.
Do not set the unlocked feature unless it is known for certain that the publication will
be available to users without DRM. If, for example, the publication will be sent to a distributor
who will pass it on to multiple vendors, some of whom may apply DRM to the publication, the feature
must not be set, even if it might be true in some cases.
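When it is certain that no DRM will be applied anywhere in the distribution chain, the declaration is a single property. This is a sketch of that case only.

```xml
<!-- Declare only when the publication is known to reach users without DRM -->
<meta property="schema:accessibilityFeature">unlocked</meta>
```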
The use of DRM does not mean that a publication will not be usable by assistive technologies. When DRM is applied, publishers should ensure that they do not add restrictions that could affect access, but even when this is done vendor DRM could still have impacts on the usability.
It is important to indicate if an EPUB has a defined reading order in the markup because it ensures that users of assistive technologies can follow the narrative in order. When the visual presentation of the content does not match the order in the markup, it makes it harder for non-visual readers to understand the text.
The presence of a reading order is indicated in the package document metadata using the readingOrder accessibility feature.
Unlike many of the other accessibility features, where claims can be made even if coverage is not
complete, the readingOrder feature must not be set unless the entire work conforms.
Users cannot be expected to read the accessibility summary for problems with the reading order and
then have to make sense of the mis-ordered content.
The lack of a logical reading order at the markup level most commonly occurs in fixed layout EPUBs. Reasons why include:
Reflowable EPUBs typically do not have issues with their markup not reflecting the visual reading order, as there is less control over the layout position of content. But it is possible to create reading sequences that are out of order on a smaller scale than a page, as content can be positioned in a fixed manner within container elements. For example, a sidebar element could be given a fixed width and height and have elements positioned within that space.
Ruby annotations are used as pronunciation guides for logographic characters in languages like Chinese, Japanese, and Korean (CJK). They make difficult CJK ideographic characters more accessible for language learners, native speakers of all reading levels, and people with reading disabilities. They also provide critical information for text-to-speech (TTS) engines.
Ruby annotations are implemented by attaching the [[html]] [^ruby^] tag to ideographic characters
to provide the annotation. The [^rt^] tag contains the phonetic reading of the base character in
the ruby tag.
Unlike other accessibility features, ruby annotations distinguish between publications that provide full coverage and partial coverage of all ideographic characters. This is because it is common for CJK publishers to provide different coverage options based on reader need. Publications with partial coverage target pronunciation support for rarer characters, proper nouns, or key terms without cluttering the text with ruby on every single character.
The presence of full ruby coverage, where all ideographic characters have annotations, is indicated in the package document metadata using the fullRubyAnnotations accessibility feature.
When not all CJK ideographic characters have ruby annotations, the rubyAnnotations value is used instead.
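A short sketch of the markup and the two possible declarations. The Japanese sample text is illustrative only, and a publication declares one of the two feature values, not both.

```xml
<!-- XHTML content: the base text sits inside ruby, its reading inside rt -->
<p><ruby>漢字<rt>かんじ</rt></ruby></p>

<!-- Package document: every ideographic character is annotated -->
<meta property="schema:accessibilityFeature">fullRubyAnnotations</meta>

<!-- Alternative: only some ideographic characters are annotated -->
<meta property="schema:accessibilityFeature">rubyAnnotations</meta>
```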
As EPUB publications can be read on refreshable braille devices, it is possible for a publication to be authored using Unicode braille characters (see Braille Patterns). The advantage of using braille characters is that it allows the text to be optimized for braille users. This avoids the user having to rely on their braille device to translate the text, which can be problematic especially for specialized notations like mathematics and music.
The presence of braille Unicode characters is indicated in the package document metadata using the braille accessibility feature.
The eBraille format is designed to represent braille
character data in an EPUB-conformant package and is a better choice for authoring braille in EPUB.
It is not necessary to declare that an eBraille file has a braille feature, however,
since the format itself and its required metadata already make this fact clear. The
braille feature is expected only when braille character data supplements the
standard text.
Another advantage of the eBraille format is that it provides metadata to indicate the type of braille being included (language, code, grade, etc.). When braille is added to a standard EPUB, the accessibility summary is likely the only reliable way to provide this information to users, as distributors and reading systems are unlikely to check or report braille metadata.
Both eBraille and standard EPUB publications may contain tactile graphics. These are images that are specially designed to allow users who cannot see the images to explore them tactilely, either by printing the images (embossing) or rendering them on specialized refreshable braille devices.
The presence of tactile graphics is indicated in the package document metadata using the tactileGraphic accessibility feature.
The tactile graphic does not have to be displayed in the content to claim this feature. If links are provided to external graphic files, the feature can still be claimed.
Refer to the BANA Guidelines and Standards for Tactile Graphics for more information about tactile graphic formats and formatting.
Although it may seem counterintuitive that a digital format like EPUB could contain a tactile object, the case is similar to external tactile graphics. If an EPUB provides links to files that can be downloaded and printed, it is possible to declare that tactile objects are available.
The presence of tactile objects is indicated in the package document metadata using the tactileObject accessibility feature.
Word segmentation refers to whether additional spacing is added to languages that do not normally use whitespace characters to separate words (e.g., Chinese, Japanese, Thai, and Lao). Adding additional whitespace can make it easier to read for some users, especially those with cognitive or visual impairments who have difficulty following visually when there are no spaces.
The presence of additional word segmentation is indicated in the package document metadata using the withAdditionalWordSegmentation accessibility feature.
The lack of word segmentation is not an accessibility feature in the strictest sense, but it is included as a feature to indicate that the default authoring practice has been followed for these languages.
The lack of additional word segmentation is indicated in the package document metadata using the withoutAdditionalWordSegmentation accessibility feature.
Setting one or the other of these values, as appropriate, is recommended to avoid ambiguity (i.e., leaving the user guessing if word segmentation is not provided or if the publisher forgot to add metadata). Do not, however, set either value for languages that normally use whitespace to separate words.
If a publication combines text with added word segmentation and text without, whichever method is predominant should be set as the accessibility feature. That the full text is not available in that mode should be explained in the accessibility summary.
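For the mixed case, a sketch might pair the predominant value with a summary that notes the exception. The summary wording is hypothetical.

```xml
<!-- Predominant mode: additional word segmentation -->
<meta property="schema:accessibilityFeature">withAdditionalWordSegmentation</meta>

<!-- Hypothetical summary explaining where the other mode is used -->
<meta property="schema:accessibilitySummary">Additional word spacing is provided in the main text. The appendices use standard spacing without added word segmentation.</meta>
```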
Some languages, such as Chinese, Japanese, and Korean, can be written both vertically and horizontally, but the choice of writing direction can make it harder for some users to follow the text.
To indicate that horizontal writing is used, the package document metadata must indicate the horizontalWriting accessibility feature.
To indicate that vertical writing is used, the package document metadata must indicate the verticalWriting accessibility feature.
Setting the appropriate writing direction is always recommended, but do not set either value for languages that are normally only written in one direction. For example, it adds no benefit to users to declare that English is written horizontally.
If a publication combines writing directions (for significant amounts of text, not for minor formatting like representing foreign language text), whichever method is predominant should be set as the accessibility feature. That the full text is not available in the specified direction should be explained in the accessibility summary.
There are three types of hazards that can affect readers of digital content:
flashing — if a resource flashes more than three times a second, it can cause seizures (e.g., videos and animations).
motion simulation — if a resource simulates motion, it can cause a user to become nauseated
(e.g., a video game drawn on the [[html]] canvas element or parallax scrolling with CSS).
sound — certain sound patterns, such as ringing and buzzing, can cause seizures, while loud or sudden changes in volume can also negatively affect users.
It is not only important for users to be aware if these hazards are present, but also if they are not present or if their status is unknown. Without this information, users would have no way of knowing if a publication is free from hazards or simply has not been checked, potentially leading them to obtain content which could be physically harmful to them.
To allow for complete metadata declarations, each of the hazards defined above can be expressed in three different ways:
When a hazard is present, the flashing, motionSimulation, and sound values can be declared.
When a hazard is not present, the noFlashingHazard, noMotionSimulationHazard, and noSoundHazard values can be declared.
When the presence of a hazard is not known, the unknownFlashingHazard, unknownMotionSimulationHazard, and unknownSoundHazard values can be declared.
Each hazard status is identified in the EPUB package document using a schema:accessibilityHazard property [[schema-org]]. The property
is repeated for each status; do not group all the hazard statuses into a single tag.
These values can be paired together to identify each risk, but authors have to ensure they do not
accidentally declare conflicting statements for the same hazard type. For example,
flashing, noFlashingHazard and unknownFlashingHazard must
never be declared for the same publication.
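As a sketch, a publication with a known flashing hazard and no other hazards would carry one property per hazard type. The combination of values is illustrative.

```xml
<!-- One schema:accessibilityHazard property per hazard type -->
<meta property="schema:accessibilityHazard">flashing</meta>
<meta property="schema:accessibilityHazard">noMotionSimulationHazard</meta>
<meta property="schema:accessibilityHazard">noSoundHazard</meta>

<!-- Avoid: listing all three values inside a single meta tag -->
```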
To help simplify authoring hazard metadata, there are also two global values that can be declared:
none can be declared instead of setting a status for each individual hazard type when the publication contains no hazards.
unknown can be declared instead of setting a status for each individual hazard type when the presence of hazards has not been determined.
These values are mutually exclusive and must never be declared for the same publication. Also, when
either of these global values is declared, no individual hazard statuses should be declared as it is
likely to cause conflicting statements and confuse users. For example, when none is
declared, the flashing, motionSimulation, and sound hazards
must not also be declared, nor any of their unknown value equivalents. Best practice is to never mix
a global declaration with individual ones, even if they technically do not conflict.
It is not strictly an error to declare redundant statements. The none value could,
for example, be paired with noFlashingHazard, noMotionSimulationHazard
and/or noSoundHazard, since these statements are stating the same thing
(none is equivalent to declaring the three individual statements). But it is
strongly advised not to include redundant statements as they add no value and can lead to errors
in the future if the status of hazards has to change.
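The two global values can be sketched as follows. They are alternatives for different publications and never appear together.

```xml
<!-- A publication that has been checked and contains no hazards -->
<meta property="schema:accessibilityHazard">none</meta>

<!-- Alternative, for a publication that has not yet been checked -->
<meta property="schema:accessibilityHazard">unknown</meta>
```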
If an EPUB publication contains a hazard, or if the presence of a hazard type is not known, provide additional information in the accessibility summary. For hazards, this could include where the hazard occurs (both location within the publication and the point within the resource, when applicable). For unknown hazards, the reason why the hazard is not known should be stated, as well as whether and when a future update to the publication will provide more information.
[[[epub-a11y-12]]] [[epub-a11y-12]] requires the status of all content hazards be identified
using the accessibilityHazard property [[a11y-discov-vocab]].
Although the vocabulary for the accessibilityHazard property contains the value
"unknown", this term cannot be used to
meet the reporting requirements for the property.
It is a critical health and safety issue to alert photosensitive users if a publication is known to contain content that flashes more than three times in any one-second period, unless the flash is below the general flash and red flash thresholds.
For the precise meaning of these definitions, refer to [[wcag2]] success criterion 2.3.1.
Content that can exhibit these characteristics includes videos, animated graphics, dynamic content drawn on the [[html]] [^canvas^] element, and CSS animations.
To indicate that there is a flashing hazard, the flashing hazard value must be declared in the package document metadata.
Motion simulation is equally problematic as it can trigger severe dizziness, nausea, and disorientation in users with vestibular disorders.
Some examples of motion simulation include video games with a first-person perspective and CSS-controlled backgrounds that move at a different rate from the foreground content (parallax scrolling).
Refer to [[wcag2]] success criterion 2.3.3 for more information about motion simulation from animations. Although the success criterion makes an exception for essential motion simulation, that is only for conformance. A hazard should still be declared.
To indicate that there is a motion simulation hazard, the motionSimulation hazard value must be declared in the package document metadata.
What precisely constitutes a sound hazard, and how to test for these hazards, is not yet standardized. As a result, it is particularly important when declaring a sound hazard to explain in the accessibility summary what hazard is being declared.
To indicate that there is a sound hazard, the sound hazard value must be declared in the package document metadata.
Many books have no content that will pose a hazard to users. This is especially true of novels and similar books that consist only of headings and text. Instead of having to declare that each individual hazard type is not present, a shorthand to cover all hazards is provided in the none hazard value.
Setting the none value is the equivalent of setting the individual statements noSoundHazard, noMotionSimulationHazard, and noFlashingHazard.
If a publication consists of a mix of known, not applicable, and unknown hazards, then each hazard that is not present needs to be separately identified.
To indicate that there are no flashing hazards, the noFlashingHazard value must be declared in the package document metadata.
To indicate that there are no motion simulation hazards, the noMotionSimulationHazard hazard value must be declared in the package document metadata.
To indicate that there are no sound hazards, the noSoundHazard hazard value must be declared in the package document metadata.
The noSoundHazard value provides a positive assertion that the content will not
interfere with screen readers or other assistive technologies.
There may be times when the presence of accessibility hazards is not known but a statement is still needed in the package document metadata. This usually happens if a placeholder is needed until an accessibility evaluation is carried out.
To indicate that the presence of hazards is not known, the unknown hazard value must be declared in the package document metadata.
Setting the unknown hazard status is the equivalent of setting each of the
individual unknown hazard statuses — unknownSoundHazard, unknownMotionSimulationHazard, and unknownFlashingHazard.
If the presence of hazards has not yet been determined, it is recommended to use the
unknown value rather than the individual unknown statements for each type of
hazard. It avoids the problem of missing one of the individual declarations and makes it easier
to remove the statement once the hazards have been checked as there is only one metadata tag to
delete.
If a publication consists of a mix of known, not applicable, and unknown hazards, then each hazard that is not known needs to be separately identified.
To indicate that the presence of flashing hazards is not known, the unknownFlashingHazard value must be declared in the package document metadata.
To indicate that the presence of motion simulation hazards is not known, the unknownMotionSimulationHazard hazard value must be declared in the package document metadata.
To indicate that the presence of sound hazards is not known, the unknownSoundHazard hazard value must be declared in the package document metadata.
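A mixed case might be sketched like this, assuming a publication where flashing content has been confirmed, sound has been checked and cleared, and motion simulation has not yet been evaluated. Because the statuses differ, neither global value applies and each hazard type gets its own statement.

```xml
<!-- Known hazard -->
<meta property="schema:accessibilityHazard">flashing</meta>
<!-- Checked and not present -->
<meta property="schema:accessibilityHazard">noSoundHazard</meta>
<!-- Not yet evaluated -->
<meta property="schema:accessibilityHazard">unknownMotionSimulationHazard</meta>
```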
An accessibility summary provides a brief, human-readable description of the accessibility characteristics of an EPUB publication that cannot be expressed through the other discovery metadata.
An accessibility summary is provided using the schema:accessibilitySummary property
[[schema-org]].
The accessibility summary should not simply repeat the conformance information provided in the
dcterms:conformsTo property, for example, or the features listed in the
schema:accessibilityFeature properties. When other accessibility metadata is present in
the package document, systems that process EPUB publications can already present it to users. Repeating
it in the summary only forces users to hear the same information twice.
Do not include an accessibility summary when there is nothing more to add to the conformance claim and other discovery metadata. It is not required metadata.
If an EPUB publication does not meet the requirements for content accessibility in [[epub-a11y-12]], the reason(s) it fails should be noted in the summary. Similarly, if an EPUB creator is hesitant to make a formal claim of conformance, the reasons why can be explained in the summary.
Do not repeat the schema:accessibilitySummary property to provide translations of a
summary. EPUB does not define a method for including translations. Putting different
xml:lang attributes on properties does not indicate a translation and could lead to
the wrong summary being rendered to users.
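A sketch of a summary that adds information not carried by the other metadata. The wording and the details it describes are hypothetical; note that the property appears only once.

```xml
<!-- A single summary; do not repeat the property for translations -->
<meta property="schema:accessibilitySummary">Page break markers match the print edition, but markers for blank pages at the end of chapters are omitted. The video in chapter 3 contains flashing content between 1:10 and 1:25.</meta>
```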
[[[epub-a11y-12]]] [[epub-a11y-12]] defines metadata for indicating whether a publication meets its conformance requirements, as well as how to report information such as who performed the evaluation, what credentials the evaluator holds, and who to contact for more information about the accessibility of a publication.
Although this metadata is not as descriptive about the specific accessible characteristics of the content, it is just as important to set for multiple reasons. Primary among them is that reporting accessibility conformance to an accepted standard shows users that the content is broadly accessible, without them having to go through all the specific details to build a picture. The level of conformance, whether WCAG 2 Level A or AA, also provides insight into the user groups that will benefit from the accessibility affordances provided.
Reporting conformance also directly benefits publishers as it establishes trust with their customers. When customers see that a publisher consistently does the work to make their content accessible, they are more likely to seek out content from that publisher.
Reporting accessibility conformance is also important at a regulatory level. Many jurisdictions now have laws around the provision of accessible content. Indicating that a publication is conforming and providing details about the evaluation is often not optional metadata.
For historical reasons, the EPUB accessibility vocabulary uses "certifier" and "certified" in the names of its properties. No official authority was intended from this name, but it often causes confusion, especially in regions where certification has legal meaning. The conformance metadata is only intended to describe an evaluator performing an evaluation of the content, which is why this document and others refer to these concepts in the descriptions of the metadata.
When an EPUB publication meets the discovery and accessibility requirements of the EPUB Accessibility specification, a conformance claim can be made using the dcterms:conformsTo property [[dcterms]].
The EPUB Accessibility specification defines a controlled vocabulary of values that are allowed in the conformance statement. These values follow the pattern:
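As an illustration, and assuming the identifier strings defined in EPUB Accessibility 1.1 (the specification version, then the WCAG version, then the WCAG level), conformance claims might look like the following. The two statements are alternatives for different publications.

```xml
<!-- Claim against WCAG 2.1 Level AA -->
<meta property="dcterms:conformsTo">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>

<!-- Alternative: claim against WCAG 2.0 Level A -->
<meta property="dcterms:conformsTo">EPUB Accessibility 1.1 - WCAG 2.0 Level A</meta>
```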
These patterns only apply to the EPUB Accessibility 1.1 [[epub-a11y-11]] specification and above. The EPUB Accessibility 1.0 [[epub-a11y-10]] specification was produced by a different organization and used URLs to identify conformance.
Conforming to the 1.0 standard is no longer recommended both due to its age and because it allowed conformance claims to WCAG 2 without fully meeting the requirements of that standard. Claiming conformance to EPUB Accessibility 1.0 makes it ambiguous whether the content is fully conforming which can prove problematic where accessibility is mandated by law.
It is important to ensure that an EPUB publication fully meets the requirements of the EPUB Accessibility specification and the version and level of WCAG specified. It is not sufficient, for example, to make a conformance claim based solely on an EPUB publication passing an automated conformance checker. While these tools are helpful for finding machine-identifiable issues, there are many checks that only a human can carry out.
Although omitted from the previous examples for simplicity of reading, it is typical for a
conformance claim meta tag to also declare an id. As will be detailed
in the following sections, the ID assigned to the tag allows other statements to be attached to
it, such as who performed the evaluation and, when present, the date of the evaluation and where a detailed report can be found.
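A sketch of a conformance claim that carries an ID so other statements can refer to it. The id value "conf" is arbitrary, and the claim string assumes the EPUB Accessibility 1.1 identifier format.

```xml
<!-- The id lets evaluator metadata attach to this claim via refines="#conf" -->
<meta property="dcterms:conformsTo" id="conf">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>
```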
There is currently no standardized way to state that a publication fails to conform,
as publishers typically omit any claims when a publication fails to meet minimum standards. One
option is to use the word "none" in a dcterms:conformsTo statement, but this might
not be recognized as an accessibility claim if it is not accompanied by other evaluator
information. It could also cause display issues in bookstores and reading systems, as they will
not expect a failing conformance statement and, for example, would not translate or modify the
value to make more sense for users.
The dcterms:conformsTo property is not exclusive to making accessibility claims. It can
be used to make other types of claims, including additional accessibility claims (e.g., to a
publisher's internal conformance standard or to another regional accessibility
standard).
Providing the name of the evaluator is required when a conformance claim is made. It is helpful to users in terms of assessing the quality of the evaluation performed. A user might be warier of a publisher self-certifying their work, for example, especially if they do not declare any evaluation credentials, than of a trusted evaluation agency.
The name of the evaluator is provided in the package document metadata using the a11y:certifiedBy property. It is strongly recommended to attach
the evaluator name to the conformance statement using the refines attribute, where
the value of the attribute is a reference to the ID of the conformance claim.
It is only necessary to add an ID to the meta tag for the evaluator's name if a
credential, certification date, or the location of a more detailed report is also going to be
specified. Otherwise, it can be omitted.
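The linkage can be sketched as follows, with a hypothetical evaluator name and an arbitrary id value on the claim. No id is placed on the evaluator here because nothing else is attached to it.

```xml
<meta property="dcterms:conformsTo" id="conf">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>

<!-- refines points back to the ID of the conformance claim -->
<meta property="a11y:certifiedBy" refines="#conf">Example Evaluation Agency</meta>
```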
If multiple parties performed an evaluation, it is possible to provide more than one
a11y:certifiedBy property that links back to the conformance claim, but there
is no guarantee that all of them, or even which one, will be displayed in a bookstore or reading system's
metadata. As per the recommendation for other EPUB metadata, such as author names, it is advised
to put all the names in one tag if it is important that they all be displayed.
It is also possible to provide the evaluator's name in another language or script in EPUB 3
publications using the alternate-script
property, but this metadata is not often used.
To help users establish trust that the evaluator of the publication has the skills necessary to properly evaluate an EPUB publication for conformance, a credential can be attached to their name using the a11y:certifierCredential property.
Note that there is no need for the credential meta tag to have an
id attribute as no other metadata is attached to the credential.
Ensure that the credential is not attached to the conformance statement, as it does not make sense that a conformance claim can have a credential.
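A sketch of the chain from claim to evaluator to credential. The names, the credential text, and the id values are hypothetical; the point is that the credential refines the evaluator, not the claim.

```xml
<meta property="dcterms:conformsTo" id="conf">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>

<!-- The evaluator now needs an id because the credential attaches to it -->
<meta property="a11y:certifiedBy" refines="#conf" id="certifier">Example Evaluation Agency</meta>

<!-- The credential refines the evaluator and needs no id of its own -->
<meta property="a11y:certifierCredential" refines="#certifier">Example Accessibility Credential</meta>
```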
Although users will typically assume that an evaluation was performed prior to the release of an EPUB publication, it can be helpful to users to have the date explicitly stated. More often, though, this date is helpful for the publisher to be able to determine when an evaluation was last performed.
The date of the evaluation is attached to the evaluator's name metadata using a dcterms:date property. The date must be expressed as an ISO 8601-1 conforming date string. These typically take the form of a single four-digit year, a four-digit year followed by a two-digit month, or a four-digit year followed by a two-digit month followed by a two-digit day (with hyphens used to separate the values in all cases).
More information about the recommended date format can be found in [[[datetime]]] [[datetime]].
When including an evaluation date, it is important to remember to update it if the publication is
re-evaluated (e.g., for a revised release). Do not add a second dcterms:date
property for the new date as it could lead to the wrong information being displayed to
users.
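A sketch with a hypothetical evaluator and date. A year alone (2024) or a year and month (2024-05) would be expressed in the same place.

```xml
<meta property="dcterms:conformsTo" id="conf">EPUB Accessibility 1.1 - WCAG 2.1 Level AA</meta>
<meta property="a11y:certifiedBy" refines="#conf" id="certifier">Example Evaluation Agency</meta>

<!-- One date only; replace its value when the publication is re-evaluated -->
<meta property="dcterms:date" refines="#certifier">2024-05-14</meta>
```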
Although the date format allows a time to be included after the date, in general there is little value in providing such detailed information. The time portion of the date string is also unlikely to be presented to users, so adding a time would only have value for internal processes.
When performing a conformance evaluation, a detailed report is often produced to prove that a publication meets all the requirements of the EPUB Accessibility specification and WCAG success criteria. Making this report available can aid users who need more detailed information about the content, and may be legally necessary to provide in order to sell a publication.
When a detailed report is available, it is possible to also link to it from the package document.
Unlike the other conformance metadata, however, the link is provided in a link
element using the a11y:certifierReport property.
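A sketch of a report link, assuming a hypothetical public URL and an evaluator entry with the id "certifier". In a link element the property name goes in the rel attribute and the location in href.

```xml
<meta property="a11y:certifiedBy" refines="#conf" id="certifier">Example Evaluation Agency</meta>

<!-- Hypothetical, publicly readable report location -->
<link rel="a11y:certifierReport" refines="#certifier" href="https://example.com/reports/a11y/9780000000001.html"/>
```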
Ensure that the report is hosted in a publicly readable location. Providing a link to an internal publisher document, for example, will prevent users from accessing it. The link should be stable so that the report remains accessible over time and to prevent security issues (e.g., to prevent broken-link hijacking when a domain expires and gets bought by a malicious party).
It is possible to include a copy of the report in the EPUB publication if it cannot be hosted on a secure, public server. Including reports in the EPUB container is not common practice, however, because it increases the difficulty for users to access it. Vendors and reading systems would have to extract the report and put it someplace readable so that users can read it prior to buying or opening the publication. Otherwise, the only option is for the user to unzip the EPUB publication and locate the report themselves.
When embedding a report, the link element's href attribute would
contain a relative path to the location of the file in the zip container. Because the report
is not a publication resource, it does not have to be listed in the manifest.
In jurisdictions where accessibility conformance is mandated by law, it is also possible that some publishers are granted exemptions from full conformance. Exemptions may be granted, for example, for small and self-publishers who cannot afford the full costs of making their publications accessible or for content that cannot be made accessible without fundamentally altering its presentation.
The a11y:exemption property is used in these cases to indicate that an EPUB publication is exempt from a jurisdiction's accessibility requirements.
A publication can fail the minimum accessibility requirements for a jurisdiction while still meeting
the minimum conformance requirements of the EPUB Accessibility standard. This is because conformance
claims can be made when a publication meets WCAG 2 Level A while most legislation requires Level AA
conformance. For this reason, the exemption is not linked to a conformance claim or an evaluator
using the refines attribute.
At the time of writing, the European Accessibility Act (EAA) is the only legislation known to provide these types of exemptions. Publications must meet the minimum accessibility requirements of the Act or fall under one of the following three exemptions in order to be distributed within the bloc of member countries:
Disproportionate burden: the Act's requirements "shall apply only to the extent that compliance: … (b) does not result in the imposition of a disproportionate burden on the economic operators concerned." An exemption under this clause is declared using the value "eaa-disproportionate-burden".
Fundamental alteration: the Act's requirements "shall apply only to the extent that compliance: (a) does not require a significant change in a product or service that results in the fundamental alteration of its basic nature." An exemption under this clause is declared using the value "eaa-fundamental-alteration".
Microenterprise: "an enterprise which employs fewer than 10 persons and which has an annual turnover not exceeding EUR 2 million or an annual balance sheet total not exceeding EUR 2 million." An exemption under this clause is declared using the value "eaa-microenterprise".
Publishers should seek legal advice before claiming any exemption to EAA rules. Exemptions may also be time-limited and require a periodic reassessment to confirm that they still apply.
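A sketch of an exemption declaration, using the microenterprise value as an example. No refines attribute is present because the exemption is not linked to a conformance claim or an evaluator.

```xml
<!-- Standalone statement; the value is one of the three EAA exemption values -->
<meta property="a11y:exemption">eaa-microenterprise</meta>
```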
Providing a contact email address for accessibility inquiries is another helpful way to establish trust with users. While conformance and discovery metadata provide a useful picture of the accessibility of a publication, they cannot cover every need. Before purchasing content, users may want to reach out to the publisher for more information.
A contact email is also a helpful way for publishers to gather feedback on the actual accessibility of their content. Sometimes inaccessible content will slip through quality assurance testing. And while user testing before release is ideal, it is not always possible and cannot verify that every user's needs are met.
The email address to use to contact the publisher is specified in an a11y:contactEmail property.
The contact email does not have to go to the publisher. If the publisher uses a third party that is responsible for the accessibility of its products, that party's address can be used. The key point is that the email should go directly to someone who can handle accessibility requests. It should not be a general-purpose feedback address unless there is no other option.
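A minimal sketch of the contact email declaration follows. The address is a placeholder on the reserved example.com domain, not a real contact.

```xml
<!-- Placeholder address: replace with a mailbox monitored by someone who handles accessibility requests -->
<meta property="a11y:contactEmail">accessibility@example.com</meta>
```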
Without media overlays:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
With media overlays:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
<meta property="schema:accessModeSufficient">auditory</meta>
If text alternatives are provided and they sufficiently capture the visual content:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
If no text alternatives are provided, or they do not sufficiently capture the visual content:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
If the video has audio essential to its understanding and the auditory and visual information is fully described with textual alternatives:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessMode">auditory</meta>
<meta property="schema:accessModeSufficient">textual,visual,auditory</meta>
<meta property="schema:accessModeSufficient">textual,visual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
If no text alternatives are provided, or they do not sufficiently capture the visual content:
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
With no text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual</meta>
With partial text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
With full text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
With media overlays and partial or no text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual</meta>
<meta property="schema:accessModeSufficient">visual,auditory</meta>
With media overlays and full text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessModeSufficient">visual</meta>
<meta property="schema:accessModeSufficient">auditory</meta>
With no text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
With partial text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
With full text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
With media overlays and partial or no text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
<meta property="schema:accessModeSufficient">visual,auditory</meta>
With media overlays and full text alternatives:
<meta property="schema:accessMode">visual</meta>
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessModeSufficient">visual,textual</meta>
<meta property="schema:accessModeSufficient">textual</meta>
<meta property="schema:accessModeSufficient">auditory</meta>
<meta property="schema:accessMode">auditory</meta> <meta property="schema:accessModeSufficient">auditory</meta>
<meta property="schema:accessMode">tactile</meta> <meta property="schema:accessModeSufficient">tactile</meta>
The following examples show the metadata that would be added to an EPUB publication that has textual and visual access modes, is sufficient for reading by text, contains alternative text and MathML markup, and has a flashing hazard.
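A minimal sketch of such a metadata set is given here. It assumes the alternativeText, MathML, and flashing values from the Schema.org accessibility vocabulary for the schema:accessibilityFeature and schema:accessibilityHazard properties. The accessibility summary and any other properties a complete record might carry are left out for brevity.

```xml
<!-- Access modes: the publication contains both text and images -->
<meta property="schema:accessMode">textual</meta>
<meta property="schema:accessMode">visual</meta>

<!-- The publication can be read in full using text alone -->
<meta property="schema:accessModeSufficient">textual</meta>

<!-- Features: alternative text for images and MathML markup for equations -->
<meta property="schema:accessibilityFeature">alternativeText</meta>
<meta property="schema:accessibilityFeature">MathML</meta>

<!-- Hazard: the content includes flashing -->
<meta property="schema:accessibilityHazard">flashing</meta>
```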
It is not necessary to set the schema:accessibilityAPI property for EPUB publications. EPUB creators are not
responsible for the interaction between reading systems and the
underlying platform APIs.
It is not necessary to set the schema:accessibilityControl property for EPUB publications. This
property does not differentiate issues arising from the reading system interface
from those in the underlying content, which has led to confusion about its use.
Meeting the requirements of [[wcag2]] will mitigate most known issues with the content and is sufficient for authoring purposes.
The use of the annotations value is now deprecated due to the general nature of
annotations in published works. When authors or publishers add annotations to a work, they
provide information for all readers; they are not used as a means of enhancing otherwise
inaccessible content.
The use of the bookmarks value is now deprecated due to its ambiguity. Reading
systems typically provide bookmark support, not the EPUB itself.
The use of the captions value is now deprecated. Authors should use the more
specific closedCaptions or openCaptions values, as appropriate.
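For example, a publication whose video captions can be turned on and off by the user might replace the deprecated value as sketched below. Whether closedCaptions or openCaptions applies depends on how the captions are delivered, so the choice shown is only illustrative.

```xml
<!-- Deprecated form: <meta property="schema:accessibilityFeature">captions</meta> -->
<meta property="schema:accessibilityFeature">closedCaptions</meta>
```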
The printPageNumbers value has been replaced by pageBreakMarkers.
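In practice, the change amounts to a one-line substitution in the package document metadata, as in this sketch:

```xml
<!-- Replaced form: <meta property="schema:accessibilityFeature">printPageNumbers</meta> -->
<meta property="schema:accessibilityFeature">pageBreakMarkers</meta>
```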
There is no native support for rendering Chemical Markup
Language (ChemML or CML) in HTML. Unlike MathML, it is not a part of the HTML
standard, so the markup cannot be embedded directly in HTML documents. It is also not a core
media type in EPUB, so any use of ChemML files requires a manifest fallback to a supported
image. For these reasons, the ChemML term is not listed as a usable feature in
EPUB metadata at this time.
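The manifest fallback mentioned above might look like the following sketch. The identifiers, file paths, and the chemical/x-cml media type are assumptions chosen for illustration. The fallback attribute points from the ChemML resource to an image in a core media type, and no ChemML accessibility feature is declared in the metadata.

```xml
<manifest>
  <!-- Hypothetical ChemML resource: not a core media type, so it needs a fallback -->
  <item id="molecule-cml"
        href="chem/molecule.cml"
        media-type="chemical/x-cml"
        fallback="molecule-img"/>

  <!-- Fallback image in a supported core media type -->
  <item id="molecule-img"
        href="images/molecule.png"
        media-type="image/png"/>
</manifest>
```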
EPUB and PDF are fundamentally different technologies. An EPUB's accessibility is achieved
through its use of semantic HTML and ARIA, while a PDF's accessibility relies on its
internal tagging structure. An EPUB's metadata must describe the features of the EPUB
itself. Applying a PDF-specific property like taggedPDF is inappropriate and
would be ignored by reading systems. While an EPUB might link to an external PDF, the
metadata for the EPUB should not describe the features of that external file.