Introduction
The Web emerged in 1994, based on a model of individual pages loosely joined by hyperlinks. Clustering within
domains and with explicit navigation elements built into them, webpages evolved into websites. Despite the
Web's strong connections to print media (e.g. web resources are “pages” and the in-memory model for Web
applications is the “Document Object Model”), this document argues that the web platform may still not be
meeting certain requirements from print media that users desire.
Over centuries, “books” have assumed many forms: journals, magazines, pamphlets of long-form articles and
essays, newspapers, atlases, comics, notebooks, albums of all sorts. We can define these different
manifestations as “publications”: bound editions of meaningful media, made public.
Documents are another form of publication with a long history in both the printed and the digital world.
These are publications that are written and distributed in a more ad hoc manner, such as legal
briefs, corporate memos, and even the definitions of standards, such as the document currently being read.
We believe there is great value in combining this older tradition of portable, bounded publications with the
pervasive accessibility, addressability, and interconnectedness of the Open Web Platform (OWP). New models of
economic sustainability and innovative experiences of knowledge depend on this.
It is the task of the W3C Digital Publishing Interest Group to
explore the uniqueness, desirability, and feasibility of bringing these two great models of publishing
together. This document explores requirements based on examples of real world use cases and scenarios.
Requirements for publications on the Web are explored first, without referring to any packaging aspect that
would correspond to current practices like EPUB. This is followed by requirements of those packaging aspects,
as a structure on top of a purely Web-based distribution. The complete list of requirements is also collected
in a separate table in an appendix.
Terminology
-
This document uses the term user agent, as used by the Web community; see, for example, the
WAI glossary entry. The
publishing community often uses the term “reading system” for roughly the same notion; while there may be
subtle differences, it is better to stick to a single term for the purposes of this document.
-
A Web Publication (WP) is a collection of one or more constituent resources, organized together
in a uniquely identifiable grouping, and presented using standard Open Web Platform technologies.
-
A Packaged Web Publication (PWP) is a Web Publication whose constituent resources are
combined into a single distributable file, using some standard packaging format.
-
In this document, manifest refers to an abstract means to contain information necessary to the
proper management, rendering, and so on, of a publication. This is opposed to metadata that contains
information on the content of the publication like author, publication date, and so on. The precise format
of how such a manifest is stored is not considered in this document.
Web Standards
Open Web Platform
Web Publications should be able to make use of all features offered by the Open Web Platform (OWP).
A remarkable range of tools and frameworks has been built on top of OWP, making it possible to
develop powerful interactive layers. These include, for example, data visualization systems
(e.g., d3, built on top of SVG), possibilities to access external services like Wolfram Alpha, or tools to
create and store (possibly as part of the publication) annotations. These tools have been traditionally
developed around browsers and provide possibilities that publications should also benefit from. That
requires that Web Publications become first class citizens on the Web
platform.
-
A large, multidisciplinary, Web-based journal relies on traditional Web technologies like HTML and CSS for
its content. The journal, responding to the evolving expectations of its audience, is increasingly using
additional media such as video, audio, animated graphics, and very large images; the trend is to consider
these as integral parts of the scientific output. The journal as a result needs access to the latest
visualization and other data management tools that the OWP-based tools can offer.
-
Educational publications are increasingly making use of OWP features. In addition to video, audio, and
animations, they may also include interactive exams (possibly linked to online evaluation facilities),
visualization of data or of algorithms, and built-in interpreters for various languages (e.g., for courses
on programming). In many respects, the borderline between these publications and Web applications is
becoming fuzzy.
-
A large technology company with extensive “in-house” documentation to support technical and administrative
processes and user documentation for their various products, develops all this material in digital-only
formats. The quantity of documentation makes it impractical to produce these documents in print. Instead,
the company publishes them on the company intranet, and/or provides them to their employees and contractors
via specialized mobile applications. These documents, as a type of publication, require accessibility,
portability of annotations, and the possible inclusion of complex media.
-
News outlets often rely on video, audio, large images, and interactive elements to enhance their content.
Web Publications would allow news outlets to provide this content in a unique, identifiable format that can
facilitate archiving and offline access.
-
Image-based narrative content like comics, manga, and graphic novels relies heavily on support for large
images. This type of content may also experiment with interactivity and unique forms of navigation.
Horizontal Dependencies
A Web Publication should conform to the requirements of all horizontal dependencies: accessibility,
internationalization, device independence, security, and privacy.
Web content has to be consumed under different circumstances: it must be available to the largest possible
audience in a secure manner, providing the necessary protection of the reader’s privacy. Publication content
must be able to answer to a number of principles like accessibility, internationalization, device
independence, security, and privacy. (These are usually referred to, in the W3C context, as “horizontal”
dependencies.) These principles are, in general terms:
- Accessibility:
-
People with disabilities should be able to access the content of a publication. They should be able to
perceive, understand, navigate, and interact with it, as well as contribute to it. Accessibility
encompasses all disabilities that affect access to the content, including visual, auditory, physical,
speech, cognitive, and neurological disabilities.
- Internationalization:
-
Publications should be well adapted to any language, writing system, region, or culture. This includes
the usage, when appropriate, of left-to-right, right-to-left, horizontal or vertical writing; item
numbering, or interactive forms specific to local cultures; usage of the right character sets and of
local typographic conventions.
- Device Independence:
-
The content in a publication should be usable on a large number of devices with very different device
characteristics: different screen types and sizes, various input modalities, varying levels of processing
power, etc. These different affordances should be automatic with no, or very little, user intervention.
- Security:
-
Publications should be presented by a User Agent using a security model that is at least as secure as
(if not more secure than) the standard Web security model. Doing this will
prevent publications that contain malicious attacks, data theft, and other security incidents from
impacting users by jeopardizing the integrity of the underlying data or machine operations.
- Privacy:
-
The content in a publication should maintain and support user privacy, in spite of the fact that the
evolution of online technologies has increased the possibility for the collection and processing of personal,
and possibly sensitive, data. However, since a publication may use any part of the OWP,
it may choose to use functionality such as the ability to track a user's activity within the publication.
These principles correspond to technical requirements on the underlying technologies (i.e., OWP, and its
possible extension to Web Publications) insofar as the technologies must empower the authors (writers,
editors, publishers, etc.) to produce content that follows them. Whether authors use the possibilities of
these technologies or not is not addressed in this document.
All these constraints are formalized in the context of the usage on the Web and by extension Web
Publications. This means that they are valid for publications in general. In some cases, for example due to
legislative reasons, the demands on publications may be more stringent than for generic Web sites. The use
cases below provide some examples for the publication-specific situations. Note also that some aspects of
horizontal dependencies (e.g., accessibility or security) are also the subject of further use cases and
requirements elsewhere in this document.
-
(On Accessibility) Legal Publishing Ltd. publishes all the official texts
as issued by the government of its country. Per local legislation, the publication must be accessible,
following W3C’s WCAG Level AA requirements, to serve as official references in courts.
-
(On Privacy, Accessibility) EducationPublishing Ltd. publishes digital
textbooks to cover BigUniversity’s curricula. These (digital) educational publications also include access
to interactive tests via specialized services on the Web that regularly assess the student’s progress. The
privacy and the integrity of the student’s test data must be preserved. Because of this, and because digital
textbooks must also abide by WCAG Level AA accessibility requirements, EducationPublishing may be liable
if these requirements are not fulfilled.
-
(On Internationalization) PublicationInternational SA. publishes literary
work all over the world and in many languages. In order to continue its business in different countries, it
must be able to produce digital publications acceptable to local customers. Vertical, right-to-left, and
bidirectional writing, among other typesetting traditions, must be supported by reading systems, and it
must be possible to enable them in Web Publications. Additionally, the reading system should allow for varied
interaction with the content, such as right-to-left page navigation for content in languages like Japanese
and Arabic.
-
(On Privacy) Thomas has written a pamphlet advocating a government
overthrow. The government has decreed that the author of the pamphlet as well as the readers of the pamphlet
shall be jailed. Thomas needs to distribute the pamphlet in ways that preserve his anonymity and allow the
public to read without fear of the government cyber-police.
-
(On Device Independence) Yoshio usually reads a book on his tablet when
he is at home, but he does not carry his tablet around while commuting on the train. Instead, he prefers to
use his phone to continue reading. Publications must be able to adapt to the consumption environment, so as
to provide a good reading experience regardless of the device.
-
(On Security) LocalLibrary receives publications from a variety of
sources that they then make available to their members. It is imperative that none of these publications can
cause any damage to their own systems or those of their members.
Escalating Trust
User agents may provide a method for escalating trust for a specific publication.
Some publications may require additional capabilities (for example, access to camera or geolocation) that a
user agent might normally not enable. Today, some platform and UA vendors offer methods for otherwise
untrusted local scripts to become trusted and regain API privileges; a similar ability needs to exist for
publications as well.
-
Rebecca has assigned an exercise from a WP textbook. The exercise requires the use of geolocation to measure
distances from the user’s location to a target. The UA detects that the WP came from a trusted source (the
textbook publisher), and therefore allows the WP to use the full capabilities of the UA.
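The trust decision in this use case can be sketched as a simple origin check. This is only an illustration under stated assumptions: the set of sensitive capabilities, the trusted origin `https://textbooks.example`, and the helper names `origin_of` and `is_allowed` are all hypothetical, and a real user agent would also involve the user in the decision.

```python
from typing import Set
from urllib.parse import urlsplit

# Hypothetical set of capabilities a user agent would not enable by default.
SENSITIVE = {"geolocation", "camera"}


def origin_of(url: str) -> str:
    """Reduce a URL to its scheme and host (and port, if present)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def is_allowed(capability: str, publication_url: str, trusted_origins: Set[str]) -> bool:
    """Non-sensitive capabilities are always allowed; sensitive ones only
    when the publication comes from an origin the user agent trusts."""
    if capability not in SENSITIVE:
        return True
    return origin_of(publication_url) in trusted_origins


trusted = {"https://textbooks.example"}  # assumed to be configured in the UA
print(is_allowed("geolocation", "https://textbooks.example/physics/ch3.html", trusted))
print(is_allowed("geolocation", "https://unknown.example/book/ch1.html", trusted))
```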
Document Composition
Identification
A Web Publication, as a collection of resources, must be identified by either a
single URL or a unique handle that can resolve to a single URL.
The unique identification of a specific Web Publication is essential. If not expressed as a URL, there should
be a way to map this unique identification onto a Web Address. The Web Publication must be
identifiable as a single logical resource
with its own URL beyond the references to its constituent
resources.
-
Scholarly references demand a unique identification of the publication and, possibly, its internal structure.
That unique identification must be available as a Web link, to make it possible for other publications and
other sites (e.g., the authors’ institutional sites) to unambiguously link to the publication. These
features are essential in the scholarly community to make, for example, the assessment of individual
researchers possible.
-
Textbook editions should be clearly distinct from one another using unique identifiers. Links should be
accurate in order to avoid confusion in classes, especially since each class is not necessarily using the
latest edition.
-
Marwin wants to search for a term in the publication. As a reader, he does not know the internal structure
of the book, i.e., whether the content is one or several HTML files; he wants the search to be executed on
the whole (logical) content, regardless of its internal representation.
-
Svetlana sets her preferences in terms of font selection and size, background color, etc., for her WP textbook.
She wants those to be in effect on all chapters of the book automatically.
-
User agents that support value counters (page counters, section numbering, footnotes, endnotes) should do
so across the entire Web Publication (as opposed to individual components being numbered separately).
-
Assistive Technology such as screen readers or voice dictation control needs to have the Web Publication
presented to it as if it were a single unit.
All constituent resources, and their contents, should be identified by either a URL or a unique handle that can resolve to a URL.
The requirement that a Web Publication be uniquely identifiable can be easily extended to the constituents
of a Web Publication, as well as the fragments, parts, sections, etc., of those resources. Those
identifications should be stable and resilient to changes and new iterations of the publication.
-
Markus refers to a specific mathematical theorem in a publication. That reference must be unique, stable and
retrievable on the Web, and it should not depend on whether the publisher issues a new iteration of the
target publication (thereby possibly changing the section numbering).
-
Tanya and Kelly are collaborating on curriculum for the upcoming school year. They are able to reference the
same Web Publication in their shared documents and rely on stable, retrievable links to sections within the
WP.
-
Judit uses an annotation tool to comment on a publication authored by Pablo. She puts an annotation against
a sentence in a particular paragraph, anchoring that annotation to the sentence using a reliable way of
identifying it. That identification should not be invalidated by a subsequent change of the document by
Pablo (unless he, e.g., removes that sentence).
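One way to keep an annotation anchor valid across edits is to store the quoted text together with a little surrounding context, rather than a character offset. The sketch below is loosely modelled on the text quote selector of the W3C Web Annotation Data Model; the function name `find_quote` and its fallback strategy are assumptions made for illustration, not a mechanism defined by this document.

```python
from typing import Optional


def find_quote(text: str, exact: str, prefix: str = "", suffix: str = "") -> Optional[int]:
    """Return the offset of `exact` in `text`, preferring a match whose
    surrounding context equals `prefix` and `suffix`; None if not found."""
    fallback = None
    start = text.find(exact)
    while start != -1:
        end = start + len(exact)
        before_ok = text[:start].endswith(prefix)
        after_ok = text[end:].startswith(suffix)
        if before_ok and after_ok:
            return start  # quoted text and context both match
        if fallback is None:
            fallback = start  # remember the first match without context
        start = text.find(exact, start + 1)
    return fallback


# Pablo's revised text: a preface was added and the last sentence extended.
revised = "A new preface. Whales are fish. Whales are mammals, of course."

# Judit's annotation targets the second "Whales are", identified by its context.
offset = find_quote(revised, "Whales are", prefix="fish. ", suffix=" mammals")
print(revised[offset:offset + len("Whales are mammals")])
```

Because the anchor is found by content, inserting text earlier in the document does not invalidate it; removing the sentence makes `find_quote` return `None`, which matches the exception noted in the use case.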
Resources
The information regarding the constituent resources of a Web Publication must be easily discovered and there
should be a way to differentiate between essential and non-essential resources.
A Web Publication will likely be composed of multiple Web documents and their resources. A more complicated
Web Publication may have many resources, some of which are essential and some of which are not. Because of
this complexity, extracting in advance all the references to some or all constituent resources may be
prohibitive. It is therefore necessary for the user agent to have an easy access to the list of constituent
resources and some of their characteristics, such as media type, size, and whether they are essential.
In a publication, some content is essential to the user being able to consume it while other content could
be either absent or have a provided fallback for situations such as limited connectivity or storage. This
information, provided by the author or publisher of the Web Publication, would enable a user agent to provide
a better experience to the user. For example, the user agent can ensure that essential resources are made
available when offline.
-
Nick is reading a long-form narrative on a device with limited storage: a publication filled with text,
images, audio, and multimedia files. Nick also rides the subway where he loses Internet connectivity
frequently and without warning for long stretches of time. During offline or low-storage situations, there
are still critical parts of the publication that are consumable, mainly the text (and possibly images).
Having a reasonable fallback for video, such as a poster image or placeholder image, would allow Nick to
read the content while offline or on a device with limited storage.
-
Henry creates a Web Publication and includes the accessibility metadata indicating that the publication has
descriptions for videos. He marks the accessible descriptions as essential and marks videos as non-essential
while providing images as fallback. This enables print-disabled readers to access the accessible
descriptions and video when there is good internet connectivity, and fallback images along with accessible
descriptions when internet connectivity is poor.
-
Gösta is reading a treatise on the theory of functions. A mathematical font is essential for the proper
display of the mathematical formulas in this publication, so the author has marked the font as essential to
ensure that it is made available offline.
-
While reading an article on a new spam analysis algorithm, Lars is primarily interested in the findings of
the research. Since the research was funded by a government agency, the dataset, consisting of millions of
anonymized log files, is also available. Because of its size, the researchers have marked the dataset as
non-essential for conveying the results of the paper and therefore indicate that it can be skipped when reading
the publication offline.
-
Sarah is reading a publication about the stock exchange. The current value of the stock is fetched (from a
remote resource) when she opens the publication. However, when she is on the train (without a connection)
one week later and opens the stock exchange publication, she will continue to see the value of the stock as
it was the last time she opened the publication. It should be possible for either the content itself or the
user agent to provide some user experience that notifies her that the currently presented data is a week old.
-
Risha publishes an article which includes an interactive component that accesses a database, exposed to the
Web via a RESTful API. The interactive component is implemented as a JavaScript library. Such data cannot be
included in a packaged publication and the interactive module is of no use without such data. Risha
therefore marks the Web Publication component relative to the interactive module as not essential when
offline.
-
Chiara is listening to an audiobook on Roman history. The description when she bought it mentioned that the
maps and images referred to in the book were available as supplemental content in the file. At the end of the
audiobook, she is able to view the content, which helps her understand the audiobook better, but it is
marked as non-essential if the audiobook is taken offline.
-
Stephanie, a student, is answering questions in an assessment embedded in a Web Publication. Her internet
connection is unreliable at home. She should be able to save
those responses in the Web Publication. The assessment allows her to submit responses to a database. A notification within the Web Publication should inform her whether the data has been successfully submitted.
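The resource list described in this section, with media type, size, and an essential flag per resource, can be sketched as a small data structure. Everything concrete here is an assumption for illustration: the `Resource` fields, the file names and sizes, and the `select_for_offline` policy (essential resources first, then the smallest optional ones that fit). The document deliberately leaves the manifest format open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    url: str
    media_type: str
    size_bytes: int
    essential: bool


def select_for_offline(resources: List[Resource], budget_bytes: int) -> List[Resource]:
    """Keep every essential resource, then add non-essential ones
    (smallest first) while they still fit in the storage budget."""
    chosen = [r for r in resources if r.essential]
    used = sum(r.size_bytes for r in chosen)
    optional = sorted((r for r in resources if not r.essential), key=lambda r: r.size_bytes)
    for r in optional:
        if used + r.size_bytes <= budget_bytes:
            chosen.append(r)
            used += r.size_bytes
    return chosen


# Hypothetical resource list for a long-form narrative with one video.
manifest = [
    Resource("chapter1.html", "text/html", 40_000, essential=True),
    Resource("math-font.woff2", "font/woff2", 120_000, essential=True),
    Resource("poster.jpg", "image/jpeg", 80_000, essential=False),
    Resource("interview.mp4", "video/mp4", 250_000_000, essential=False),
]

offline = select_for_offline(manifest, budget_bytes=1_000_000)
print([r.url for r in offline])
```

With a one-megabyte budget the video is skipped while its poster image is kept, which corresponds to the fallback behaviour in Nick's use case.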
Default Reading Order
There should be a means to indicate the author’s preferred navigation structure among the resources of a Web
Publication, and User Agents should provide an accessible way of navigating the same.
-
Moby Dick contains 136 chapters. Each chapter is a separate HTML document, with a logical order for reading
them. It should be possible for the publication to inform the user agent that the proper order for
consumption of the HTML documents is sequential, starting with the first chapter.
-
The Encyclopedia of Stuff includes 1348 articles, each one in a unique HTML document. The publication must
be able to indicate to the user agent that the standard way to consume the articles is alphabetical order,
by title.
-
The third edition of the Writing Handbook has 10 chapters with 5-10 subsections apiece and 4-8 readings.
Each subsection and reading starts its own HTML page. The Web Publication should be able to inform the user
agent that the proper order for consumption of the HTML documents is sequential, starting with the first
chapter. Moreover, in the navigation structure, the user agent should be able to indicate to the user which
subheadings and readings belong to which chapter, and that those subheadings and readings are navigated
sequentially, starting with the first subheading in the parent chapter.
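The Writing Handbook case combines a hierarchy (chapters containing subsections and readings) with a linear reading order. A minimal sketch, assuming the navigation structure is available as a tree of hypothetical `NavItem` objects, derives the default order by walking that tree depth-first; the file names are made up.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class NavItem:
    title: str
    url: str
    children: List["NavItem"] = field(default_factory=list)


def reading_order(items: List[NavItem]) -> Iterator[str]:
    """Yield URLs depth-first: each chapter, then its children in order."""
    for item in items:
        yield item.url
        yield from reading_order(item.children)


handbook = [
    NavItem("Chapter 1", "ch1.html", [
        NavItem("1.1 Drafting", "ch1-s1.html"),
        NavItem("Reading: On Revision", "ch1-r1.html"),
    ]),
    NavItem("Chapter 2", "ch2.html"),
]

print(list(reading_order(handbook)))
```

Keeping the tree and deriving the sequence from it lets the user agent show which subsections belong to which chapter while still offering simple next and previous navigation.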
Navigation
A user agent should be able to reveal the navigable structure of a Web Publication as a table of contents that
is accessible to users, including those with disabilities.
The table of contents must include a link to at least one resource, and all links should refer to resources
within the publication bounds. The user agent presents an accessible table of contents, which allows the user to
access the links without navigating away from the current resource.
-
Chandrasekhar has been assigned a set of exercises in his math Web Publication textbook. To double check his work, he
wants to easily navigate to the answers in the back of the Web Publication. Using the table of contents exposed by his
user agent, he is able to navigate to the answers and return to the exercise with ease.
-
Penelope has opened her Web Publication to its preface. She would like to skip the preface and go to chapter
one. She is able to reveal the table of contents built into her Web Publication via a function in her user
agent and view a link to chapter one. She changes her mind and decides to read the preface. Since she is
still on the preface and has not navigated away, she only needs to close out of the table of contents to
continue reading the preface.
-
Matilda is reading an anthology but would like to read her favorite author's short stories first. She is
able to reveal the navigation built into her Web Publication via a function in her user agent and access the
relevant stories.
For content that requires a player interface for time-based media,
the Web Publication should provide the User Agent a way to navigate
to a specific position in the content.
-
Belinda would like to skip directly to a specific short story in a collected edition. The Web Publication provides the user agent
with metadata to access the short story.
-
Mr. Sterling asks his students to skip directly to the 23rd minute of an audiobook.
-
Ken was assigned part of a non-fiction book for a class. He wants to open the audiobook but only needs to
listen to chapter 19. The Web Publication provides the
user agent with metadata to access the chapter.
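One existing way to address a position in time-based media is the temporal dimension of the W3C Media Fragments URI syntax (`#t=` followed by an offset in seconds). The sketch below assumes the audiobook is exposed as a single audio resource at a made-up URL and reads "the 23rd minute" as 23 minutes from the start; `at_time` is a hypothetical helper, and this document does not decide whether Web Publications would use this mechanism.

```python
from urllib.parse import urldefrag


def at_time(url: str, minutes: int = 0, seconds: int = 0) -> str:
    """Return `url` with a temporal media fragment pointing at the offset,
    replacing any fragment the URL already carries."""
    base, _ = urldefrag(url)
    return f"{base}#t={minutes * 60 + seconds}"


print(at_time("https://example.org/audiobook/part1.mp3", minutes=23))
```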
Random Access to Content
Authors of a Web Publication should be able to provide the user agent with information to access random
parts of the publication.
It should be possible for the author to convey several potential reading orders
that may go beyond the “default” for the content of the publication. This alternative reading order may only
include specific parts of the publication rather than the full content of the publication.
A user agent should be able to access the resources of the publication in whatever order it chooses—beyond
the order provided by the publication itself.
-
EsteemedJournalPublisher would like to offer the users of the EsteemedJournal of Chemistry App the
opportunity to read only the abstracts of the journals in the app. The publication would therefore provide
the user a list (table of contents) of abstracts (disjointed objects in the package with semantic
information or metadata informing the package of the nature of the object).
-
A publisher wants to provide “teasers” for a book by providing a series of extracts that are meant to give
an overview of the book without the need to read the whole publication. This is typically used by
a reseller to allow a prospective client to access part of the publication free of charge.
-
EducationalPublisher publishes a complex textbook. The textbook is created in such a way that it could be
used both for beginner and advanced levels. The default reading order corresponds to beginners, but the
goal is that advanced students can follow a different path through the material, corresponding to their
level of knowledge. EducationalPublisher therefore adds alternative reading orders to the publication that
advanced users can follow.
-
Acme Publishing has published a book on wines that can be read from A-Z, or personalized to only read about
red wines or wines from a specific region.
-
A specialized user agent wishes to find all images in a publication that do not already have alternative
text and automatically provide it using an image identification service such as
LabelMe.
-
EducationalPublisher publishes a history textbook in English. The textbook contains two glossaries, one in
English and one in Spanish for Spanish-speaking students learning English as their target language to
facilitate their ability to understand key terms in the text. A specialized user agent should be able to
reveal and suppress the Spanish glossary based on what the student requires.
-
A foreign language textbook Web Publication allows for a specialized user agent to filter and display content identified as vocabulary and conjugation charts. This is done in order to
facilitate the teacher creating study guides.
-
CookbookPublisher wants to create a cookbook that either a function within the publication itself or a
feature in a user agent can filter to only show recipes with certain labels.
-
The publisher of an experimental novel would like to publish the book as a Web
Publication. The author wishes the reader to have the option to read chapters in a variety of different sequences, all of which are offered as
alternative reading orders in the Web Publication.
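Several of these use cases (the cookbook, the wine book, the vocabulary charts) amount to selecting resources by author-supplied labels. A minimal sketch, assuming each resource carries a set of labels; the recipe file names, the labels, and the `filter_by_labels` helper are invented for illustration.

```python
from typing import Dict, List, Set


def filter_by_labels(items: Dict[str, Set[str]], wanted: Set[str]) -> List[str]:
    """Return the URLs whose label set contains every wanted label,
    preserving the order in which the items were listed."""
    return [url for url, labels in items.items() if wanted <= labels]


# Hypothetical labels attached to the resources of a cookbook.
recipes: Dict[str, Set[str]] = {
    "lentil-soup.html": {"vegan", "soup"},
    "beef-stew.html": {"soup"},
    "fruit-salad.html": {"vegan", "dessert"},
}

print(filter_by_labels(recipes, {"vegan"}))
print(filter_by_labels(recipes, {"vegan", "soup"}))
```

The same function could run inside the publication or in a user agent, since it needs only the labels and not the content of the resources.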
If there is a physical book version of the Web Publication, the user must have the ability to quickly browse
to a corresponding pointer as identified in the physical book.
-
Beatrix is visually impaired and uses accessible Web Publications in her class, while her sighted classmates
use physical books. When the teacher asks the class to open page 71 and read the second paragraph, Beatrix
should be able to navigate to exactly the same position in her version as her sighted classmates.
-
Zoya borrowed a book from her library but must return it. She has decided to buy a WP version of the same
text and would like to continue reading where she left off.
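Matching a position in a physical book can be sketched as a lookup from print page labels to locations inside the Web Publication. The mapping below is entirely hypothetical (file names, fragment identifiers, and the `locate_print_page` helper are assumptions); page labels are kept as strings because print pagination is not always numeric, for example roman numerals in front matter.

```python
from typing import Dict, Optional

# Hypothetical page list: print page label -> location in the Web Publication.
page_list: Dict[str, str] = {
    "xii": "preface.html#page-xii",
    "70": "chapter4.html#page-70",
    "71": "chapter4.html#page-71",
    "72": "chapter5.html#page-72",
}


def locate_print_page(pages: Dict[str, str], label: str) -> Optional[str]:
    """Return the location for a print page label, or None if unmapped."""
    return pages.get(label)


print(locate_print_page(page_list, "71"))
```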
Alternative Modalities
A Web Publication should encompass publications such as audiobooks, graphic books, mixed media, and
interactive media.
All concepts and structures related to a Web Publication should enable the creation and/or production of
alternative renderings for visual and auditory content.
-
Sree wants to access audiobooks while commuting, jogging, doing dishes, or otherwise not able to use his
eyes or hands.
-
Daniel wants to complete his assigned class readings during his commute to work. His textbook Web Publication allows him
to access an audio version of each of the readings.
-
Khoudia, a librarian focusing on the children's section of her local library, is looking exclusively for
material rich in audio and video components so as to reach a wider age bracket.
-
James, a musician, requires that the musical score within a publication come preformatted in braille music
notation in order to read it, as he uses freely available assistive technology which does not have braille
music translations built in.
Data
Web Publications should be able to include data as resources, just as
they do with text, images, etc.
-
Rosa has submitted an article to EsteemedJournal and provided her research data in CSV format. She and
EsteemedJournal provide users access to the CSVs when accessing her article in any situation by including
the CSV data, as well as the JavaScript library to display the content in human-friendly form, as part of
the Web Publication.
-
A news organization wishes to run a series of articles containing graphs based on a set of raw data. Those
graphs are generated dynamically in each article depending on what the topic is. The data is also available
to the user for transparency.
-
An exercise in an educational Web Publication requires the student to push data to a CSV and analyze the
subsequent graphs that are dynamically created with each datapoint that the student enters.
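Treating data as a resource means the same CSV file can feed a graph and be offered to the reader unchanged. The sketch below parses a small CSV resource into points that a charting component could plot; the column names and values are made up for illustration, and `load_points` is a hypothetical helper.

```python
import csv
import io
from typing import List, Tuple

# Made-up CSV content standing in for a data resource of the publication.
csv_text = "year,value\n2014,14.6\n2015,14.8\n2016,14.9\n"


def load_points(text: str) -> List[Tuple[int, float]]:
    """Parse CSV text with 'year' and 'value' columns into (x, y) points."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(row["year"]), float(row["value"])) for row in reader]


points = load_points(csv_text)
print(points)
```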
Protection
A Web Publication should allow for application of access control and write protections of the publication.
-
A library may loan the publication for two weeks or a university may make a textbook available for its
students for the course of the year. A Web Publication should provide a means to inform user agents about
the availability period to enable the UA to control access accordingly.
-
Alice is working on potentially Nobel Prize-winning research and has drafted her paper describing her
discoveries. She asks her print-disabled friend Bob to review the paper, but needs to make sure that the Web
Publication retains specific protections on what Bob is able to do with the publication
without restricting Bob's assistive technology from accessing the content.
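The availability period from the library loan use case can be sketched as an interval check. This assumes the publication conveys a start and an end time to the user agent; the dates and the `is_available` helper are hypothetical, and a real user agent could not rely on the local clock alone.

```python
from datetime import datetime, timedelta, timezone


def is_available(now: datetime, start: datetime, end: datetime) -> bool:
    """True while `now` falls inside the half-open interval [start, end)."""
    return start <= now < end


# Hypothetical two-week loan.
loan_start = datetime(2017, 3, 1, 9, 0, tzinfo=timezone.utc)
loan_end = loan_start + timedelta(weeks=2)

print(is_available(loan_start + timedelta(days=3), loan_start, loan_end))
print(is_available(loan_start + timedelta(days=15), loan_start, loan_end))
```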
Packaging
It should be possible to create and distribute a Web Publication as a single unit over different protocols or
physical media.
This can be done through the usage of Packaged Web Publications.
-
HA, Ltd, a publisher of legal briefs, needs to distribute content in a consumable format to its clients via
secure email.
-
Dalia, a patent lawyer, wants to consume content on a multitude of devices, some of which may not always have
connectivity. In order to meet her expectations, it is necessary to have all required content grouped in a
logical structure that can be easily transferred between devices.
-
Andreas is working on his first collaborative research paper with a fellow student. He wants to share a
relevant publication that includes content, diagrams, and datasets with his writing partner. He does not have
time to learn how to share each component so that his partner can access it all without much effort; he
expects to be able to share this material as a single unit via the chatting system that they use to
collaborate.
-
Dave is reading Moby Dick on his tablet (at home with network connectivity). He then jumps on a plane with his
good friend Tzviya. After having finished reading the book, he wants to lend it to Tzviya, so that she can
start reading on her own tablet. They are both offline, but can exchange data with SD cards or Bluetooth.
-
Giselle is an independent author who has produced the audio version of her latest novel. She wants to
distribute it directly to a few beta readers before launch without worrying about sending a list of files
with no structure.
A Packaged Web Publication (PWP) should include means to map the identification of a constituent resource
between the Web and its equivalent in a package.
In order to allow a Web Publication to be packaged without any changes to the content, it may be necessary to
provide a mapping from the (absolute) URLs present in the publication to URLs that point to the constituent
resources inside the package.
-
An archival service wants to harvest (spider) a Web Publication and not have to modify the OWP content
during the process. In order to achieve that goal, its manifest would incorporate a mapping from the URIs
present in the OWP content to their new location inside the archive.
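The mapping described above can be pictured as a two-way lookup table. The following is a minimal sketch only: the `PackageUrlMap` class, its method names, and the example URLs and paths are hypothetical and are not defined by any Web Publication specification.

```python
from typing import Dict, Optional


class PackageUrlMap:
    """Two-way lookup between absolute Web URLs and paths inside a package.

    Hypothetical helper for illustration; not part of any specification.
    """

    def __init__(self, mapping: Dict[str, str]) -> None:
        self._web_to_pkg = dict(mapping)
        # Build the reverse table so that a packaged resource can be traced
        # back to its original location on the Web.
        self._pkg_to_web = {path: url for url, path in mapping.items()}

    def to_package_path(self, url: str) -> Optional[str]:
        """Return the in-package path for a Web URL, or None if it is not packaged."""
        return self._web_to_pkg.get(url)

    def to_web_url(self, path: str) -> Optional[str]:
        """Return the original Web URL for an in-package path, or None if unknown."""
        return self._pkg_to_web.get(path)


# Example mapping an archival service might record in its manifest (assumed values).
url_map = PackageUrlMap({
    "https://example.org/book/chapter1.html": "content/chapter1.html",
    "https://example.org/book/style.css": "content/style.css",
})
```

Because the content keeps its original absolute URLs, only the table changes when the publication is packaged or archived.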
The publisher should be able to provide information in a Packaged Web Publication that can be used to check
the origin of the publication and its authenticity.
-
Michael, a lawyer who uses the publications of LegalPublisher Ltd., must be 100% sure that the
publication he uses for his case has indeed been published by LegalPublisher Ltd. and not by a possible
third party. This can be done because LegalPublisher Ltd. adds the necessary cryptographic information to
the Web Publication proving its own identity.
-
Brendan wants to make sure that the version of a textbook Web Publication that he wants to purchase from an online retailer
is the correct one required for his class.
The publisher should be able to provide information in a Packaged Web Publication proving that the publication
has not been tampered with during delivery.
-
Luke has written another book, this time using all of the capabilities of the Open Web Platform that he can
think of, including using the reader's location to adapt the content. He submits the book for review to a Web
Publication retail platform where the book is signed by the publisher. When the book is purchased, the user agent detects that
the book came from a trusted source and has not been modified, therefore allowing it to use the full
capabilities of the web platform.
-
LegalPublisher Ltd. regularly publishes the official legal texts and regulations as decided by the local
government. Michael, who is a lawyer, has access to these documents via his law firm and uses them for his
cases; to do so, he must be 100% sure that the publication he accesses faithfully reproduces the latest
governmental decisions. This can be done because LegalPublisher Ltd. adds the necessary cryptographic
information to the Web Publication that becomes invalid if any resource of the Web Publication changes.
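One way to picture information "that becomes invalid if any resource of the Web Publication changes" is a list of per-resource digests. The sketch below is an assumption for illustration, not a mechanism prescribed by this document: the function names are hypothetical, and SHA-256 is simply one widely available digest. Digests alone do not prove origin; the digest list itself would still need to be signed by the publisher, which is outside the scope of this sketch.

```python
import hashlib
from typing import Dict, List


def digest_resources(resources: Dict[str, bytes]) -> Dict[str, str]:
    """Compute a SHA-256 hex digest for every resource (path -> content bytes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in resources.items()}


def changed_resources(expected: Dict[str, str], resources: Dict[str, bytes]) -> List[str]:
    """Return the sorted paths whose content no longer matches the expected digest.

    Resources that were added or removed since the digests were recorded
    are reported as changed as well.
    """
    actual = digest_resources(resources)
    paths = set(expected) | set(actual)
    return sorted(p for p in paths if expected.get(p) != actual.get(p))
```

A user agent following this approach would treat the publication as untampered only when `changed_resources` returns an empty list.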
User Agent Operation
Time-based Media
If a Web Publication contains time-based media, a user agent should provide a player interface that is
accessible.
The player interface should allow for the following use cases:
-
Experiencing an audiobook, video, or other time-based media to completion without user interaction
-
Experiencing an audiobook, video, or other time-based media at the user's desired pace without audio
distortion
-
Helga wants to cook, drive, or run while listening to an audiobook and can’t be interrupted by any request for
additional inputs.
-
Yitian is listening to an audiobook on a device where the only input available is play and pause.
-
Suraj wants to listen to an audiobook on his smart speaker by using a voice input to start the playback.
In time-based media in a Web Publication, it should be possible to navigate not only by chapter/section but also by
short segments of time.
-
Mateus is writing a book review and wants to find a specific segment in the audiobook to quote and review. He
remembers it’s early in chapter 3, so he wants to open the audiobook to chapter 3 and listen, skipping
forward in ten second increments using the player provided by the user agent, until he finds it.
-
Delta is listening to an audiobook in her car but realizes she missed the last 30 seconds because she was
merging onto a highway and wasn’t focused on the content. She wants to skip back 30 seconds to listen to the
content she missed.
-
Stanford just opened his audiobook and wants to skip through the copyright and title info as well as the
dedication without opening up the table of contents.
-
Sasha is rewatching a video in her educational Web Publication for a refresher and wants to skip over
introductory content and content that she already has a firm understanding of.
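The navigation described in these use cases amounts to moving a playback position by a time interval and relating that position to chapter boundaries. The following is a minimal sketch under assumed names (`seek`, `chapter_at`) and an assumed representation of chapters as a sorted list of start times in seconds; none of this is defined by the document.

```python
from bisect import bisect_right
from typing import List


def seek(position: float, delta: float, duration: float) -> float:
    """Move the playback position by `delta` seconds (negative to go back),
    clamped to the range [0, duration]."""
    return max(0.0, min(duration, position + delta))


def chapter_at(position: float, chapter_starts: List[float]) -> int:
    """Return the zero-based index of the chapter containing `position`.

    `chapter_starts` must be sorted in ascending order and begin at 0.0.
    """
    return bisect_right(chapter_starts, position) - 1


# Delta's case: jump back 30 seconds from 1 minute 35 seconds into a one-hour book.
new_position = seek(95.0, -30.0, 3600.0)  # 65.0
```

Clamping keeps a skip near the start or end of the media from producing a position outside it.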
If a Web Publication contains time-based media, a user should be able to understand the duration of the media,
both in its entirety and of its constituent parts.
-
Bruce is watching a series of videos in a Web Publication and wants to know how long it will be until the
next video plays. He is able to view the duration of the current video in the player interface of the user
agent.
Progression
User agents should provide the option for the user to save their progression in the publication and return the
user to the last location they saved the next time they open the publication.
-
Ann is visually impaired and is reading a sample test paper with objective-type questions. The answers to
the test are given at the end of the publication. Ann needs to read the questions one at a time and then
check whether each answer is correct. Therefore she must bookmark both the questions and the answers provided at the
end of the book so that she is able to switch between the two efficiently.
-
Mateusz runs a Dungeons and Dragons campaign. He would like to be able to mark off specific sections of
different rulebooks and share those bookmarks so that players in his game can quickly reference the material.
-
Aika is reading a novel on her 9-inch tablet, bookmarks her location, and switches to her 5-inch phone. She
would like to be able to resume reading from the same point where she left off, even though she may not be
using the same user agent on each device and so cannot rely on one user agent's own mechanism for syncing the reading position.
-
Julia prepares her lessons for the next day. In her textbook Web Publication, she bookmarks several locations while
doing her prep.
Reading State
The user must be able to leave the Web Publication and return to it at the position where they left off. The
user agent must retain the reading position, based on the last known position of the reader in the Web
Publication. The position should be based on the reader's position in the file within the reading order.
The user agent may retain reading state if the Web Publication is
revised. If the user agent provides a player interface, that
interface should allow the user to leave and return to the
content at the same position where they left off.
-
Filbert is reading a comic as a Web Publication. He exits the user agent without bookmarking or saving his
location in the Web Publication. When he returns the next day, he opens the Web Publication and continues
reading from where he had left off.
-
Sainath fell asleep the last time he was listening to his audiobook. When he opened it for his next
listening session it was on chapter 6 but he needs to navigate back to chapter 3 because that’s the last one
he remembers listening to.
-
Nelleke is listening to The Iliad on her walk to work. The audiobook is 14 hours long, but her walk
only takes 30 minutes. She would like to stop the playback when she arrives and continue where she left off
on her trip home.
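Retaining a reading position, as required above, can be sketched as storing a small record and restoring it on the next session. Everything here is an assumption made for illustration: the `ReadingPosition` fields, the JSON encoding, and in particular the choice to fall back to the start when a revised publication no longer contains the stored resource (the requirement only says the user agent may retain state after a revision).

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReadingPosition:
    resource_index: int  # index of the resource in the default reading order
    offset: float        # assumed: fraction through a text resource, or seconds for audio/video


def save_position(pos: ReadingPosition) -> str:
    """Serialize the position so that it can be stored between sessions."""
    return json.dumps(asdict(pos))


def restore_position(stored: str, resource_count: int) -> ReadingPosition:
    """Restore a stored position.

    Falls back to the start of the publication when the stored resource index
    is no longer valid, for example after a revision removed resources.
    """
    data = json.loads(stored)
    pos = ReadingPosition(int(data["resource_index"]), float(data["offset"]))
    if not 0 <= pos.resource_index < resource_count:
        return ReadingPosition(0, 0.0)
    return pos
```

In Nelleke's case, the stored offset would be the elapsed time in the current audio resource when she stops playback.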
Movement
It should be possible to see the Web Publication in a “paginated” view. When a user agent renders a Web
Publication in a paginated layout, it must lay out each document in the default reading order sequentially,
with the last page of a resource being followed by the first page of the subsequent one.
Whereas a “scrolling” view is the dominant approach on the Web in browsers, a user or author may wish to
view their publications in a paginated view. As such, it should be possible for an individual publication or
user agent to provide the ability to switch to a paginated view. This pagination may automatically adapt page
sizes to the device’s or the browser’s viewport and may contain separate headers, footers, and/or page
numbers.
This is distinct from the need to retain original page numbering (often from the print edition) which must be
available on demand and must be usable to discover specific locations in the publication.
For more detailed requirements on pagination, see here.
Time-based media, especially a Web Publication consisting solely of time-based media, such as an audiobook,
may be presented as a single page with a player module presenting the content metadata. This player may
automatically adapt size and features according to the device or browser's viewport. This view may not have
page numbering, but reading position would correspond to a time value.
For navigation within time-based media such as audio and video, refer to Time-based Media.
-
Ann reads War and Peace, which, when printed, is over 1200 pages. In order to have a better sense of
her progress in the book and to make navigation within the book easier (i.e., to support usability), she
decides to switch her reading environment to paged view.
-
Susan uses a flexible CSS layout that includes images to create a rich, interactive publication on the
history of a city. Each major historical milestone is defined as a standalone unit that would be a single
page when printed, with a timeline with the main events in the footer area of the page.
-
IndyPublisher wants to provide transition effects between pages, both within and across content documents.
-
Mr. Oayia, a classroom teacher, says, “Turn to page 137 of your textbook.” Regardless of layout and font
size, students reading digital editions need to find the same location in the textbook as one another and
as students reading the print edition.
-
Alphonse has been listening to an audiobook he enjoys and would like to share a section with a friend. He
tells his friend that the position of the quote is "Chapter 5, 2:33". His friend is able to find the section
with the audiobook player module provided by his user agent.
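When the reading position corresponds to a time value, a shared location such as "Chapter 5, 2:33" has to be converted to and from seconds. The following sketch assumes the colon-separated "M:SS" or "H:MM:SS" notation used in the use case; the function names are hypothetical.

```python
def parse_timestamp(text: str) -> int:
    """Convert "M:SS" or "H:MM:SS" into a number of seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        # Each field to the left is worth 60 times the field to its right.
        seconds = seconds * 60 + int(part)
    return seconds


def format_timestamp(seconds: int) -> str:
    """Format a number of seconds as "M:SS", or "H:MM:SS" for an hour or more."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


offset_in_chapter = parse_timestamp("2:33")  # 153 seconds into chapter 5
```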
Offline
A Web Publication should also be available offline.
The same content of the Web Publication should be accessible offline, if circumstances so dictate, without the
necessity for the user to take any particular, technical actions.
-
Omo, a student in a remote Nigerian village, is taking classes online. Connectivity in the village is
unreliable and intermittent. Omo needs to have his textbooks available regardless of actual connectivity.
-
Heather, a frequent international traveler, enjoys reading books and tour guides on her portable device,
regardless of her physical location on any given day. Because of the high roaming charges
on her mobile network, she tends to download as much of her reading material as possible wherever she can avoid
those additional charges.
-
Gemma is building a private collection of publications that
she expects to be available to her whether online or offline, over the public Internet, or within a private
local area network (LAN).
-
In-house documents may have to be accessed both online and offline, depending on the access point. While
online access might be beneficial when done from the work floor (e.g., at an airplane production line), the
same documents may need reliable offline access (e.g., in the cockpit).
-
Gyöngyi, selected as a peer reviewer for the Journal of Scholarly Publications, only has time to review her
assigned publication while commuting on the train to her university where she does not have connectivity.
Since her review process includes the creation of annotations, notes, highlights, and possibly changes on
the content itself, it is important that these changes be smoothly transferred back to the server of
the journal when she is back online.
A user agent needs to know the information required to allow the user to access content either offline or by actively
streaming it, based on the size and nature of the content, and conditions imposed by the user.
-
João lives in Recife where he has very spotty internet and slow download speeds. He enjoys listening to
audiobooks during his morning bike ride and would like to listen to his books before they complete
downloading, as fully downloading a book could take several days.
-
Rich is watching a video embedded in his textbook WP for a class. He is streaming the video rather than
waiting for it to download because he does not have enough space on his device to download all the videos
he is required to watch for the class.
-
Wendy is running late for work but wants to listen to her audiobook on her commute. As she leaves she
realizes she has not yet downloaded it, but there is no Wi-Fi available. She has plenty of data and decides
to stream the content as she makes her way to the office.
-
Sally is preparing for a flight from London to LA tomorrow. She would like to listen to an entire Web
Publication during the flight, which does not have Wi-Fi. Via her user agent, she is able to listen to the
Web Publication offline.
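The choice between offline access and streaming in these use cases depends on the size of the content, the space available, and the user's wishes. The sketch below shows one possible decision rule; the `AccessMode` enum, the `choose_access_mode` function, and its parameters are all hypothetical, and a real user agent would likely weigh more conditions (network cost, partial downloads, and so on).

```python
from enum import Enum


class AccessMode(Enum):
    DOWNLOAD = "download"
    STREAM = "stream"


def choose_access_mode(content_bytes: int, free_bytes: int, prefer_offline: bool) -> AccessMode:
    """Pick an access mode from the content size and user-imposed conditions.

    Download only when the user wants offline access and the content fits
    in the available storage; otherwise stream.
    """
    if prefer_offline and content_bytes <= free_bytes:
        return AccessMode.DOWNLOAD
    return AccessMode.STREAM
```

Under this rule, Rich (not enough space) and Wendy (no wish to wait for a download) would stream, while Sally (preparing for a flight, with enough space) would download.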
Personalization
The user must have the possibility of personalizing his or her reading experience. This may include, for example,
controlling such features as font size, choice of fonts, background and foreground color, tone of audio, etc.
-
Olga, a dyslexic student, downloads a textbook and proceeds to personalize the material with a larger font and/or
a specialized font for dyslexic readers, as well as a different contrast that, for her particular case, makes the text
easier to consume.
-
When reading a book in the sun, Mia adjusts the background color to allow for a stronger contrast so that
she can see the text.
-
While reading a book on computer programming, Ransheed wants to change the font into a local font. However,
the code samples within the text should remain in a fixed-width font.
-
Buffy is deafblind. Every morning she downloads her daily newspaper. Like most news sites, it provides many
rich multimedia presentations. As a high-quality, accessible news site, its multimedia presentations come
with captions and transcripts. Buffy does not want to waste her data plan on the useless-to-her audio and
video content, so she instructs her user agents to ignore them.
Non-WP User Agents
A non-WP user agent should be able to access the content of a Web Publication.
Since Web Publications are based on the Open Web Platform, a Web Publication's
constituent HTML pages, video, audio, images, interactive components, and other media, should be accessible
to a non-WP user agent. Creators of Web Publications should allow for the user to be able to access this
content.
Special consideration should be given to Web Publications where time-based media is the main or only component,
such as an audiobook. To allow the user to access this content, the Web Publication should make it possible
for a non-WP user agent to:
-
Display title and duration of the book to the user
-
Play chapters defined in the book one after the other. This may not include end-to-end continuous reading of
the book
-
Navigate to a chapter defined in the book and start playback from there
-
Pause and resume
-
Start playing from a page number or a time point defined in the book
-
Forward and rewind by some time interval
-
John found an audiobook Web Publication at his university, which he wants to start listening to immediately,
but he does not have an audiobook user agent installed on his university computer. He should at least be able to use
the basic functionality of the book in the vanilla browser available on the university computer.
Packaging
The distribution of a Packaged Web Publication should not affect its iterations.
Simply distributing or sharing a Packaged Web Publication to multiple destinations and devices should not
result in (technically) different iterations of the Web Publication unless they contain modifications that
make them different Web Publications.
-
Publisher Corp. Inc. publishes a new Packaged Web Publication and sends it to its distributors and
customers. This Packaged Web Publication is downloaded to devices or made available to a customer-specific
cloud. Customers can access this file from different retailers, through different applications, either
directly or downloaded from a private cloud. Thus, the Web Publication is duplicated many times, resulting
in a huge number of copies. There remains a single source manifestation, and therefore one
canonical identifier, for all of the items spread across devices and buyers.
-
EducationalPublisher wants to create a custom version of their textbook WP for an instructor. Content from a
few additional sources is added, but the original WPs are not affected.
-
Mary creates a Packaged Web Publication and sends it to Dave and Kristin. Kristin simply sends it along to
two other friends, but Dave first adds some comments to his copy before sending it to two friends. By doing
so, Dave has created a new Web Publication with its own canonical identifier,
while the version used by Mary, Kristin, and her friends remains the same as the original.
-
Slicendice Publishing publishes many Packaged Web Publications, some of which are different iterations or
subsets or combinations of others. Slicendice needs not only to be able to uniquely identify each
Web Publication but also to identify each “copy” or “delivery” (“item”) of each of those Web Publications
so that it can track what has been sold and how many of each one have been sold.
-
BigRetailer receives a Web Publication from EsteemedPublisher that it intends to add to its catalogue.
BigRetailer wants to add its own “teaser” via an alternative reading order. To achieve that, BigRetailer
provides its own version of the publication’s manifest that the user agent will use instead of the
publisher’s manifest.
The distribution of Packaged Web Publications should respect the existing processes and expectations of
professional publishing channels as well as ad-hoc methods of distribution (e.g., email).
-
Ahmed acquires a Packaged Web Publication on an e-commerce platform. He expects to be able to receive the
Web Publication as a file (rather than only having access to it online) and to be able to load it onto his
different reading devices.
-
Alice acquires a Packaged Web Publication through a subscription service and downloads it. When, later on,
she decides to unsubscribe from the service, this Web Publication becomes unavailable to her.
-
Leila has just written a report for school as a Web Publication, but she is required to email it to her
teacher. She takes advantage of the fact that it is possible to package up a publication and then sends it
off.
Archiving
We take for granted the relative durability of print artifacts, many of which have survived with little more
than benign neglect. In contrast, digital documents are unlikely to persist without more active interventions,
such as making copies, monitoring software dependencies, and validating integrity. Since future consumers of
publications represent the most open-ended user group, it is desirable that digital documents be instilled
with more of the inherent durability that characterizes print artifacts. Packaged Web Publications offer this
potential by making it easier for archiving services to locate, harvest, update, and describe digital
publications. Long-term preservation of digital publications ensures that they may continue to be accessible
beyond the tenure of individual authors, file formats, publishers, or publishing platforms.
Fundamental use cases and requirements already support our archiving requirements.
However, archiving raises additional requirements:
There should be a way to indicate whether one or more Packaged Web Publication components contain (embedded)
descriptive metadata.
An archiving service needs a reliable way to determine which, if any, Web Publication components contain
descriptive metadata, such as those described in metadata and resources.
Without such a mechanism, the archiving service will have to develop and maintain publisher- and/or
platform-specific heuristics for locating or parsing out descriptive metadata, making archiving more
expensive and decreasing the reliability of reporting.
-
An archiving service sets out to conduct an initial harvest of an article. Along with the images, markup,
scripts, and style and layout instructions that constitute the object, it is able to locate a file
containing descriptive metadata. The archiving service retrieves these resources and packages them into a
logical archival unit for ingestion into a preservation repository. A related process identifies and parses
the descriptive metadata and saves its contents into an associated management database.
There should be a way to discover that one or more new components have been added to or deleted from a Web
Publication.
An archiving service needs a reliable way to learn that one or more Packaged Web Publication components have
been added to or removed from a Packaged Web Publication in order to be able to update the associated
archive of the publication.
-
An archiving service regularly polls for changes to an article that it has already archived. One such poll
indicates that several resources have been added to the object. The archiving service retrieves these
resources and stores them as incremental updates to the appropriate archival unit in a preservation
repository.
-
A publisher issues a retraction for a published article, resulting in the addition of new resources to the
object (i.e., the retraction notice) and the removal of others (i.e., the article content). An archiving
service regularly polls for changes to this article, which it has already archived, and discovers the
retraction. The archiving service retrieves the new resources and records those that are no longer
accessible, carrying over the cumulative updates to a preservation repository.
-
A copyright dispute results in the takedown of a published book. An archiving service regularly polls for
changes to this book, which it has already archived, and discovers that it has been taken down. It records
that the resources that constitute the object are no longer accessible and propagates this update to a
preservation repository.
-
A student writing a research report wishes to refer to changes in a certain Web Publication. The student
would like to compare changes across different iterations of that Web Publication and cite an archival
service.
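Discovering added or removed components, as in the polling use cases above, can be sketched as a comparison between the component list recorded at the last harvest and the list seen at the current poll. The `diff_components` function and the example paths are hypothetical; how an archiving service actually obtains the two lists is not addressed here.

```python
from typing import Dict, Iterable, List


def diff_components(archived: Iterable[str], current: Iterable[str]) -> Dict[str, List[str]]:
    """Compare two component lists and report what was added and removed."""
    old, new = set(archived), set(current)
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
    }


# Retraction use case: the article content is withdrawn and a notice is added.
changes = diff_components(
    archived=["article.html", "figure1.png", "style.css"],
    current=["retraction.html", "style.css"],
)
```

A takedown appears as every archived component being listed under "removed" with nothing added.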
Use Cases by Category
Accessibility
People with disabilities should be able to access the content of a publication. They should be able to
perceive, understand, navigate, and interact with it, as well as contribute to it. Accessibility encompasses
all disabilities that affect access to the content, including visual, auditory, physical, speech, cognitive,
and neurological disabilities.
Internationalization
Publications should be well-adapted to any language, writing system, region, or culture. This includes the
usage, when appropriate, of left-to-right, right-to-left, horizontal, or vertical writing; item numbering;
interactive forms specific to local cultures; usage of the right character sets; and local typographic
conventions.
Device Independence
The content in a Web Publication should be usable on a large number of devices with very different device
characteristics: different screen types and sizes, various input modalities, varying levels of processing power,
etc. Adapting to these different affordances should be automatic, with no, or very little, user intervention.
Security
Publications should be presented by a user agent using a security model that is at least as secure as (if not more
secure than) the standard Web security model. Doing this will prevent
publications that contain malicious attacks, attempts at data theft, or other security threats from impacting users by
jeopardizing the integrity of the underlying data or machine operations.
Privacy
The content in a publication should maintain and support user privacy, in spite of the fact that the evolution
of online technologies has increased the possibility for the collection and processing of personal, and
possibly sensitive, data. However, since a publication may use any part of the OWP, it
may choose to use functionality such as the ability to track a user's activity within the publication.