(Packaged) Web Publications (PWP)

Ivan Herman, W3C

Semantic Web in Libraries (SWIB16), Bonn, Germany, 2016-11-29

These Slides are Available on the Web

See: https://w3c.github.io/dpub/2016/SWIB/

(Slides are in HTML)

A format to represent digital publications: EPUB 3

Cloned Milkmen, Flickr

EPUB 3 is a very mature specification

A wide variety of books have been created

Books with lots of illustrations…

An extract of the 'Petit Prince' with a typical drawing
Antoine de Saint-Exupéry: “Le Petit Prince”, Ebooks libres et gratuits
An extract of Winnie the Pooh with a typical drawing
A.A. Milne: “Winnie-the-Pooh”, Egmont UK Ltd.

Scientific presentations

Slide-like page with lots of mathematical equations
David Mao: “Calculus”

Art books

Page with an annotated high quality reproduction of a Dutch painting
Ingrid Koenen: “Dutch Golden Age”

Mangas

Page of a typical Japanese manga
“ハルコさんの彼氏”, IDPF EPUB3 Sample

Technical books

Extract of a CSS book, with codes and figures
Lea Verou: “CSS Secrets”, O'Reilly
Extract of a JavaScript book with figure and code
Michael Fogus: “Functional Programming”, O'Reilly

Books with different character sets and writing directions

An extract of a text in Hindi
“The Mahabharata in Devanāgarī (देवनागरी)”, IDPF EPUB3 Sample
A book with Hebrew characters
“Israel sailing”, IDPF EPUB3 Sample

It is not only for books! It can be…

…conference proceedings

Cover of a Springer Proceedings of an LOD conference
Article from a Springer Proceedings of an LOD conference

…journals or magazines (articles)

Cover of the JEP journal
Table of contents of the JEP journal

…official reports of all kinds

EU brochure in English
European Commission — General Report 2015
EU brochure in Bulgarian
European Commission — General Report 2015

In fact, just about anything!

Screen dump of Google Drive saving a document as EPUB
Screen dump of Apple Pages saving a document as EPUB

What is the secret?

James Arboghast, flickr

Well, at least one of the secrets…

Rough structure of an EPUB file

EPUB Packaging structure diagram
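
The packaging structure can be inspected with a few lines of code. The following is a minimal sketch, assuming only what the EPUB container format defines: a ZIP archive whose first entry is an uncompressed `mimetype` file, plus a `META-INF/container.xml` that points to the package document. The sample archive built here is deliberately incomplete, and the file names under `EPUB/` as well as the function names are assumptions made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/container.xml in the EPUB container format.
CONTAINER_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_sample_epub() -> bytes:
    """Build a tiny, incomplete EPUB-like archive in memory (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry comes first and is stored without compression.
        archive.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip")
        archive.writestr("META-INF/container.xml", CONTAINER_XML, zipfile.ZIP_DEFLATED)
        # Placeholder content: a real package document lists all resources.
        archive.writestr("EPUB/package.opf", "<package/>", zipfile.ZIP_DEFLATED)
        archive.writestr("EPUB/chapter1.xhtml", "<html/>", zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def describe_epub(data: bytes) -> dict:
    """Report the structural entry points of an EPUB archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("ocf:rootfiles/ocf:rootfile", CONTAINER_NS)
        return {
            "first_entry": names[0],
            "mimetype": archive.read("mimetype").decode("ascii"),
            "package_document": rootfile.get("full-path"),
            "resources": [n for n in names if n.startswith("EPUB/")],
        }


if __name__ == "__main__":
    print(describe_epub(build_sample_epub()))
```

Everything below the container level (the XHTML, CSS, and image files) is ordinary Web content, which is the point the following slides make.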

A good example

Figure with a complex image processing done, in fact, in CSS
Lea Verou: “CSS Secrets”, O'Reilly

Bottom line: the relationship of EPUB with OWP is fundamental

Are we all done?
I.e., are OWP and EPUB a perfect match?

There are two major areas that need work:

  1. bring OWP and Digital Publishing closer
  2. bring the Web and Digital Publishing closer

Bring OWP and Digital Publishing closer

Missing OWP features

Bring other OWP features to publishing

Bringing these to publishing should (and will…) happen

Bring the Web and Digital Publishing closer

What is, in fact, a (digital) book?

What we get today…

EU report page with signs for downloads
Dump from EU publications' page

What we get today…

EU report in EPUB
Dump from EU publications' page

What we get today…

EU report page with nice outlook, TOC, etc
Dump from EU publications' page

“This should not be the case!”
what does this mean?

Portable Web Publication at a glance

The separation between publishing as a Web site and as an offline package should be reduced to zero

ibta arabia

For example: book in a browser

Joseph Reagle's book as a web page
Joseph Reagle: “Good Faith Collaboration”, PhD Thesis, MIT Press

For example: book in a browser

Joseph Reagle's book as an ebook in reader
Joseph Reagle: “Good Faith Collaboration”, PhD Thesis, MIT Press

For example: I may not be online…

Person sitting in a station with a mobile in hand
Bryan Ong, Flickr

For example: scholarly publishing

Screen dump of an article on F1000
Julien Colomb et al.: “Sub-strains of Drosophila Canton-S…”, F1000Research

But… why not simply rely only on the current Web?
(with some facilities for offline)

The web already provides all we need!

Not quite…
(even when considering the Web only, i.e., no packaging)

Need for the concept of a “publication” of many resources

Why do we need the WP concept?

How does that translate to the Web?

a collection of resources with different URL pointers

How does that translate to the Web?

a collection of resources in a 'blob' with one URL pointer

An additional concept:
a “WP Processor”

An internal representation may also be needed

Architectural challenge: handling online/offline

Envisioned architecture:
online

Document consumed through the Web in a traditional way

Envisioned architecture:
offline

Document consumed through a Service Worker, possibly cached
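
The online and offline cases can be sketched as one lookup routine. This is a language-neutral illustration written in Python, not Service Worker code: the class `WPProcessorSketch`, its `get` method, and the network-first-with-cache-fallback policy are assumptions made for illustration, not something the slides prescribe. In a browser, the equivalent logic would sit in a Service Worker's fetch handler.

```python
from typing import Callable, Dict, Optional


class WPProcessorSketch:
    """Hypothetical stand-in for the resource lookup of a WP Processor."""

    def __init__(self, fetch_from_network: Callable[[str], Optional[bytes]]):
        # fetch_from_network returns None when the network is unreachable.
        self._fetch = fetch_from_network
        self._cache: Dict[str, bytes] = {}

    def get(self, url: str) -> Optional[bytes]:
        # Online: consume the resource from the Web and keep a copy.
        body = self._fetch(url)
        if body is not None:
            self._cache[url] = body
            return body
        # Offline: fall back on the cached copy, if there is one.
        return self._cache.get(url)


if __name__ == "__main__":
    online = {"value": True}
    site = {"https://example.org/book/ch1.html": b"<p>Chapter 1</p>"}

    def fake_fetch(url: str) -> Optional[bytes]:
        return site.get(url) if online["value"] else None

    processor = WPProcessorSketch(fake_fetch)
    processor.get("https://example.org/book/ch1.html")  # read while online
    online["value"] = False
    print(processor.get("https://example.org/book/ch1.html"))  # served from cache
```

The reader uses the same URL in both states; only the source of the bytes changes.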

Is this approach at all feasible?

Advances in modern browsers: Web and Service Workers

Work in progress

A WP Processor can be implemented using Service Workers

Service Workers are coming…

Screen dump of the service workers' draft spec

An example of online/offline book with Service Workers

Screen dump of the book “High Performance Browser Networking”

Manifests

Packaged Web Publications

Packaged Web Publications (PWP)

A layer “on top” of WP-s

a collection of resources in a 'blob' in a rectangle with one URL pointer

Structure of an EPUB3 file

EPUB Packaging structure diagram

A Packaging of a Web Publication

PWP packaging structure

PWP Packaging structure diagram with admin file in JSON
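
One way to picture such a package is as an archive holding the publication's resources plus one administrative JSON file. The sketch below is purely illustrative: the use of ZIP, the file name `manifest.json`, and the keys `identifier` and `resources` are all assumptions, since the PWP drafts were still evolving at the time of this talk.

```python
import io
import json
import zipfile
from typing import Dict

MANIFEST_NAME = "manifest.json"  # hypothetical name for the admin file


def pack(identifier: str, resources: Dict[str, bytes]) -> bytes:
    """Bundle resources and a JSON admin file into one archive."""
    manifest = {"identifier": identifier, "resources": sorted(resources)}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(MANIFEST_NAME, json.dumps(manifest, indent=2))
        for path, body in resources.items():
            archive.writestr(path, body)
    return buffer.getvalue()


def unpack(package: bytes) -> Dict[str, bytes]:
    """Map each packaged resource back to the URL it would have on the Web."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        manifest = json.loads(archive.read(MANIFEST_NAME))
        base = manifest["identifier"].rstrip("/") + "/"
        return {base + path: archive.read(path) for path in manifest["resources"]}


if __name__ == "__main__":
    package = pack(
        "https://example.org/book",
        {"index.html": b"<h1>Title</h1>", "css/book.css": b"p { margin: 0 }"},
    )
    for url in unpack(package):
        print(url)
```

After unpacking, the resources are addressed by the same URLs as in the unpackaged Web Publication, which is what lets the package act as a layer on top of it.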

A PWP Processor

Document consumed through a Service Worker, possibly unpacked

Technical challenge: addressing, identification

Is it “addressing” or “identification”?

What does a Web request return for a locator?

Ergonomic differences

Book reading needs a different approach to ergonomics

Front page of the War and Peace ebook
Lev Tolstoy: “War and Peace”, feedbooks

Personalization

But what about
EPUB???

PWP vs. EPUB3.1

Most things are the same!

Development process

Other synergy effects of convergence

Advantage for the publishers’ community

Photo of a bookshelf with lots of technical books
Jeffrey Zeldman, Flickr

Advantage for the Web community

image of a medieval manuscript
Oliver Byrne's edition of Euclid, University of British Columbia

To conclude:
Let us create real publications on the Web!

Some references

Latest PWP Use Cases and Requirements draft:
https://w3c.github.io/dpub-pwp-ucr/
Latest PWP Editors’ draft:
https://w3c.github.io/dpub-pwp/
PWP Issue list:
https://github.com/w3c/dpub-pwp/issues

Constantly evolving…

One more thing…

This is not how Web development works at W3C…

Ed Ritger, Flickr

…it is more like that!

Paul Downey, Flickr

I.e., join W3C to help move things forward!

Paul Downey, Flickr

Thank you for your attention!

This presentation:
https://w3c.github.io/dpub/2016/SWIB/
(PDF is also available for download)
My contact:
ivan@w3.org