Publishing Maintenance Working Group Telco

Meeting minutes

Annotations

Wendy Reid: and if everyone is good with that, then we can move on to talking about the first public working draft

Ivan Herman: I'm picking up this PR for Laurent Le Meur who is on vacation. This PR makes the old one obsolete
… I recorded the list of selectors we agreed on two weeks ago
… there are two issues for discussion
… one we discussed last time: how to handle the text fragment
… we didn't agree on how to do that
… so I added two approaches, it would be best to remove one of them
… I removed some of the selectors from our table, and added text fragment selector
… it has two downsides: it is not yet a standard, and we don't know when it will be
… we don't know whether the syntax will stay as it is or change, which could cause issues
… if it is a URL it has to be encoded, which would make the text fragment pretty much unreadable
… we have a quote selector in the original text
… I added that with a note that it is up to the implementers to decide if they want to use it and how
… there are two advantages to leaving this to the implementers
… the person quoting can set the quote
… and we don't have to worry about the syntax change
… but we are pushing this into the reading system's court
… the second problem: we discussed the text position selector
… which counts characters and selects the text between two integers.
… how do we do this in HTML
… the text position selector was intended for pure text files
… there is normative text that would work, but it is applied to a different selector and doesn't apply here
… all other things are editorial and don't need discussion
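To make the text position selector concrete: it picks out the text between two integer offsets in a text stream. The following is a minimal Python sketch, assuming a half-open range (start included, end excluded), offsets counted in Unicode code points, and a text stream that has already been extracted from the document. The helper name `select_by_position` is hypothetical and not taken from any specification.

```python
def select_by_position(text: str, start: int, end: int) -> str:
    """Return the characters of `text` from offset `start` up to, but not including, `end`."""
    if not 0 <= start <= end <= len(text):
        raise ValueError(
            f"offsets {start}..{end} fall outside a stream of {len(text)} characters"
        )
    # Python strings index by code point, so slicing counts characters, not bytes.
    return text[start:end]


# Illustrative text stream (an assumption, not content from the minutes).
stream = "The quick brown fox jumps over the lazy dog."
selected = select_by_position(stream, 4, 9)
print(selected)
```

The open question raised in the discussion, namely where the text stream starts in an HTML document, is outside what this sketch decides: it simply takes the stream as given.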

Laurent Le Meur: The text must be selected before using the text position selector
… when I read the two sections, quote selector, and text position selector
… there is a conditional approach: if you have to work with copyright-protected content, use the text position selector, otherwise use the quote selector

Ivan Herman: I understand now, my comments are withdrawn

Brady Duga: the normalization is ugly, but it's OK, because that's what the original text says
… but it's not clear where the text stream starts
… where do I get the position for the overall text

Laurent Le Meur: I agree that we should specify the text origin

Brady Duga: perhaps it is the inner text of the closest element that we are talking about; if there isn't one, then it is body.
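One possible reading of this suggestion, sketched in Python: the text stream is the concatenated text of a chosen element, and the body when no element is given or found. This is only an illustration under stated assumptions. The markup is a namespace-free toy snippet (real EPUB content documents are namespaced XHTML), the element is located by `id`, and `text_stream` is a hypothetical helper; none of this is the group's agreed definition.

```python
import xml.etree.ElementTree as ET

# Toy, namespace-free markup (an assumption for brevity).
XHTML = (
    "<html><body>"
    '<p id="p1">Hello <em>brave</em> world.</p>'
    '<p id="p2">Second paragraph.</p>'
    "</body></html>"
)


def text_stream(root, element_id=None):
    """Return the text of the element with `element_id`, or of <body> as a fallback."""
    target = None
    if element_id is not None:
        target = root.find(f".//*[@id='{element_id}']")
    if target is None:
        target = root.find("body")
    # itertext() walks the element and its descendants in document order.
    return "".join(target.itertext())


root = ET.fromstring(XHTML)
p2_stream = text_stream(root, "p2")
body_stream = text_stream(root)

# The same word sits at different offsets depending on the chosen origin.
word_in_p2 = p2_stream[0:6]
word_in_body = body_stream[18:24]
```

The last two lines show why the origin matters: the offsets for the same word change with the element used as the start of the stream.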

Hadrien Gardeur: I don't want to do the second option, I don't want to have to process all of the text before I start
… the risk of the syntax changing for text fragment isn't very big
… it is widely used, and the people working on this know that

Wendy Reid: where does this leave us?

Ivan Herman: we have a three-way answer for this
… 1. we keep it as it is in the PR

<Ivan Herman> 1. We keep text fragments in both places

<Ivan Herman> 2. remove it from the fragment selector

Laurent Le Meur: In both cases you have the string you want to use

<Ivan Herman> 3. we remove it from the text quote selector

Laurent Le Meur: with the text fragment selector you can keep just the beginning and the end
… the choice we've got is not between different models
… it is between a syntax that is within the specification but not widely known
… and one that is widely known but not part of the specification

Brady Duga: I think we should choose one, and keep in mind we might need to switch

Ivan Herman: I almost agree with Brady, it is safer to keep the text quote selector
… we push the specification out of the problem

Brady Duga: as long as we pick one I'm happy

Ivan Herman: the annotation standard we develop is for exchanging annotations between systems
… it is not about the styling, and the text quote selector will be much more readable

Brady Duga: the text fragment selector is probably better defined
… having a well defined algorithm is important, but that is a guess

Laurent Le Meur: I agree with Brady, the text fragment selector seems well written
… but the quote selector is not
… in the text fragment selector you can leave the middle unsaid

Brady Duga: that is a huge difference when we talk about copyright.
… some systems use percentage of the book for copyright
… with the fragment selector, we wouldn't be using up so much of the text
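The two points made about text fragments (a start and an end can stand in for the whole quote, and URL encoding hurts readability) can be seen side by side in a small Python sketch. It assumes the `#:~:text=start,end` form of the text fragment syntax as currently drafted, which the discussion notes is not yet final. The helper names `encode_term` and `text_fragment` and the sample sentence are illustrative assumptions.

```python
from typing import Optional
from urllib.parse import quote


def encode_term(term: str) -> str:
    """Percent-encode one text fragment term."""
    # quote() never escapes '-', but a dash is syntax inside a text
    # fragment, so it is escaped by hand here.
    return quote(term, safe="").replace("-", "%2D")


def text_fragment(start: str, end: Optional[str] = None) -> str:
    """Build a fragment from a start term and an optional end term."""
    directive = encode_term(start)
    if end is not None:
        directive += "," + encode_term(end)
    return "#:~:text=" + directive


quote_text = "It was the best of times, it was the worst of times"

# Whole quote carried in the fragment.
full = text_fragment(quote_text)
# Only the beginning and the end; the middle is left unsaid.
short = text_fragment("It was the best", "worst of times")

print(full)
print(short)
```

The shortened form carries less of the quoted text, which is the property raised in the copyright discussion, while both forms show how percent-encoding affects readability.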

Ivan Herman: readability is not an option in the HTML spec

Ivan Herman: I change my vote, and propose we remove the quote selector from our spec

Laurent Le Meur: at the time we want to close the recommendation, if there is no standard, should we plan now for what we will do?

GeorgeK: publishers in education use both EPUB reading systems and HTML reading systems. Using quote selector would bridge those systems

Hadrien Gardeur: I think we shouldn't publish this if the syntax isn't final. I think it is fine to wait a bit more, there is no reason to rush

Ivan Herman: I'm not sure I understand what Hadrien said

Hadrien Gardeur: I am responding to Laurent Le Meur's question

Ivan Herman: we can make a note in the spec now about the possible fallback
… today we should try to publish a spec that is as close as we can to what we want for its final form

Laurent Le Meur: if the text fragment syntax isn't finalized, I think we should still publish at the end of the year

Hadrien Gardeur: but it doesn't need to be normative

Wendy Reid: we could leave the annotation spec in the same state as the text fragment spec, as a working group note

Brady Duga: when should we worry about this? we should wait until it is an issue
… rather than solving all possible outcomes
… I think it is too early to worry about it

Ivan Herman: we should move on to publishing the first working draft
… we get the horizontal reviews
… at some point we can say that we suspend the work because we are waiting on a dependent spec
… for the time being we should move ahead believing that the syntax will be resolved

Wendy Reid: our next step is to decide we can announce this as a first draft
… we can still make changes
… this is a good time to publish the first draft and invite feedback
… is there any opposition to publishing this?

Ivan Herman: I don't oppose this. There are two more documents that we may want to publish at some point
… 1. the use case document
… we might want to publish this at the same time

<Ivan Herman> vocab

Ivan Herman: the other one is
… I have been developing the vocabulary in parallel
… it might be good to publish this together because the annotation has references to it
… and to make the links live properly, we would need to have the vocabulary document live

Wendy Reid: is the vocabulary a note?
… we have to publish this as a working draft, and develop a short name

Laurent Le Meur: is there a problem with "ann"?

Brady Duga: it looks a lot like "announcements"

Laurent Le Meur: "annot"? is that less confusing?

Ivan Herman: I would go with epub-anno

Wendy Reid: that would be consistent with our pattern

Ivan Herman: there is one more question, do we want to add a version number?
… if so, 1.0 or 3.4?

Laurent Le Meur: I would go with 3.4 since it is related to EPUB 3.4

Ivan Herman: the version number would go in the title

Shinya Takami: since the accessibility is 1.2, 1.0 is good

Matt Garrish: if this is revision-bound to 3.4, then I have no problem with it

<Avneesh Singh> EPUB accessibility is independent of EPUB 3 version.

Laurent Le Meur: if we don't modify EPUB, why would we push annotation changes?

<Shinya Takami> +1 to Brady Duga

Brady Duga: it seems like 1.0 makes more sense, then we can update annotations without updating EPUB specs

<Wendy Reid> Proposed: Publish EPUB Annotations 1.0 as a FPWD, with the shortname epub-anno

<Shinya Takami> +1

<Charles LaPierre> +1

<Ivan Herman> +1

<Wendy Reid> +1

<Dale Rogers> +1

<Brady Duga> +1

<Matt Garrish> +1

<Hadrien Gardeur> +1

<Laurent Le Meur> +1

<Toshiaki Koike> +1

<Susan Neuhaus> +1

<Grigorily Manucharian> +1

<Romain Deltour> +1

<Avneesh Singh> +1

<Masakazu Kitahara> +1

RESOLUTION: Publish EPUB Annotations 1.0 as a FPWD, with the shortname epub-anno

<GeorgeK> +1

<Wendy Reid> Proposed: Publish the EPUB Annotations Vocabulary as a draft note, with the shortname epub-anno-vocab

<GeorgeK> +1

<Wendy Reid> +1

<Susan Neuhaus> +1

<Charles LaPierre> +1

<Matt Garrish> +1

<Shinya Takami> +1

<Ivan Herman> +1

<Brady Duga> +1

<Dale Rogers> +1

<Toshiaki Koike> +1

<Grigorily Manucharian> +1

<Hadrien Gardeur> +1

<Avneesh Singh> +1

<Laurent Le Meur> +1

<Masakazu Kitahara> +1

<Romain Deltour> +1

RESOLUTION: Publish the EPUB Annotations Vocabulary as a draft note, with the shortname epub-anno-vocab

Ivan Herman: is the UCR ready to publish?

Laurent Le Meur: yes, it is stable

<Wendy Reid> Proposed: Publish the EPUB Annotations Use Cases document as a working group note, with the shortname epub-anno-ucr

<Shinya Takami> +1

<Matt Garrish> +1

<Wendy Reid> +1

<Ivan Herman> +1

<Laurent Le Meur> +1

<Charles LaPierre> +1

<Toshiaki Koike> +1

<Grigorily Manucharian> +1

<Brady Duga> +1

<Romain Deltour> +1

<Susan Neuhaus> +1

<Dale Rogers> +1

<Masakazu Kitahara> +1

<GeorgeK> +1

<Hadrien Gardeur> +1

<Avneesh Singh> +1

RESOLUTION: Publish the EPUB Annotations Use Cases document as a working group note, with the shortname epub-anno-ucr

Multi-granularity highlighting in media overlays

<Wendy Reid> w3c/epub-specs#2917

Hadrien Gardeur: a number of specialized libraries are moving away from DAISY
… they are using media overlays with human or computer audio
… they are using tools that can generate a media overlay with open source voices
… audiobook publishers are interested in this technology too
… usually we just talk about media overlays in kids' books
… but there is a wider use
… the technologies are using different levels of highlighting and aligning
… you may want to choose the level of alignment
… it wouldn't take very much to do this
… we would need additional roles in epub:type that identify structures
… I use "utterance"
… thanks to the use of seq plus [?] plus epub:type it would be easy to do
… it would be a non-normative change in that it only adds vocabulary
… and would be backward compatible
… for reading systems that are aware of this, it could allow users to make a choice between word by word or other anchoring
… we would probably see tools and reading systems capable of this in the short term
… this will add value, is compatible with existing implementations, and we have commitments for reading system support and production
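As a rough illustration of the idea, a reading system that knows about nested granularity could flatten the overlay to whichever level the user picks. The Python sketch below models the nesting as a simple tree; the `Node` class, the `kind` labels "section", "utterance", and "word", the fragment identifiers, and the timings are all hypothetical and do not reflect the actual vocabulary or SMIL markup under discussion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str          # hypothetical granularity label, e.g. "utterance" or "word"
    text_ref: str      # fragment identifier of the highlighted text element
    clip_begin: float  # audio start, in seconds
    clip_end: float    # audio end, in seconds
    children: List["Node"] = field(default_factory=list)


def highlight_units(node: Node, level: str) -> List[Node]:
    """Return the nodes to highlight at `level`, falling back to leaf nodes
    wherever the requested level is not available."""
    if node.kind == level or not node.children:
        return [node]
    units: List[Node] = []
    for child in node.children:
        units.extend(highlight_units(child, level))
    return units


utterance_1 = Node("utterance", "#u1", 0.0, 1.2, [
    Node("word", "#w1", 0.0, 0.5),
    Node("word", "#w2", 0.5, 1.2),
])
utterance_2 = Node("utterance", "#u2", 1.2, 2.5)  # no word-level timing here
section = Node("section", "#s1", 0.0, 2.5, [utterance_1, utterance_2])

by_word = [n.text_ref for n in highlight_units(section, "word")]
by_utterance = [n.text_ref for n in highlight_units(section, "utterance")]
```

The fallback to leaf nodes is one design choice among several: it keeps playback working when only part of a publication carries word-level timing.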

Wendy Reid: I've seen this. I am concerned about the nesting required
… is that feasible in a media overlay file?
… the other concern: does this need to be handled in production?
… can reading systems detect this? Can they identify a part of a document as a paragraph, say?
… a third concern is internationalization, since not all languages have the same word boundaries as, say, English

Hadrien Gardeur: about production: many files already wrap words and containers for utterance
… what is missing is the role in SMIL
… it is completely automatable
… the second question: it would be challenging to do this on the reading system side
… for instance TTS is not always accurate

Hadrien Gardeur: we don't need to worry about internationalization if we use something like utterance which is pretty open

Ivan Herman: a formal question: you are asking for new values for the role

<Shinya Takami> let's continue this topic next week

Shinya Takami: we are out of time, we can continue this discussion


12 February 2026
