Meeting minutes
Annotations
<Wendy Reid> w3c/
Wendy Reid: and if everyone is good with that, then we can move on to talking about the first public working draft
Ivan Herman: I'm picking up this PR for Laurent Le Meur who is on vacation. This PR makes the old one obsolete
… I recorded the list of selectors we agreed on two weeks ago
… there are two issues for discussion
… we discussed one last time, how to handle the text fragment
… we didn't agree on how to do that
… so I added two approaches, it would be best to remove one of them
… I removed some of the selectors from our table, and added text fragment selector
… it has two downsides, it is not yet a standard, and we don't know when it will be
… we don't know if the syntax will stay as it is or change, which could cause issues
… if it is a URL it has to be encoded, which would make the text fragment pretty much unreadable
… we have a quote selector in the original text
… I added that with a note that it is up to the implementers to decide if they want to use it and how
… there are two advantages to leaving this to the implementers
… the person quoting can set the quote
… and we don't have to worry about the syntax change
… but we are pushing this into the reading system's court
… the second problem: we discussed the text position selector
… which counts characters and selects the text between two integers.
… how do we do this in HTML
… the text position selector was intended for pure text files
… there is normative text that would work, but it is applied to a different selector and doesn't apply here
… all other things are editorial and don't need discussion
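For readers unfamiliar with the text position selector, the following is a minimal sketch of the idea described above: count characters and select the text between two integers. The selector shape follows the Web Annotation Data Model (start inclusive, end exclusive); the sample strings and the helper name `select_by_position` are assumptions made for illustration. The second call hints at the open HTML question: the same offsets applied to raw markup select something quite different.

```python
def select_by_position(text: str, start: int, end: int) -> str:
    """Return the characters from offset `start` (inclusive) to `end` (exclusive)."""
    if not 0 <= start <= end <= len(text):
        raise ValueError("offsets fall outside the text")
    return text[start:end]


# Hypothetical plain-text content and a matching selector.
plain = "Call me Ishmael."
selector = {"type": "TextPositionSelector", "start": 8, "end": 15}
selected = select_by_position(plain, selector["start"], selector["end"])

# The same offsets applied to hypothetical markup for the same sentence:
# tags are counted as characters, so the selection no longer matches.
markup = "<p>Call me <em>Ishmael</em>.</p>"
selected_in_markup = select_by_position(markup, selector["start"], selector["end"])

print(repr(selected))            # 'Ishmael'
print(repr(selected_in_markup))  # 'me <em>'
```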
Laurent Le Meur: The text must be selected before using the text position selector
… when I read the two sections, quote selector, and text position selector
… there is a conditional approach: if you have to work with copyright-protected content, use the text position selector, otherwise use the quote selector
Ivan Herman: I understand now, my comments are withdrawn
Brady Duga: the normalization is ugly, but it's OK, because that's what the original text says
… but it's not clear where the text stream starts
… where do I get the position for the overall text
Laurent Le Meur: I agree that we should specify the text origin
Brady Duga: perhaps it is the inner text of the closest element that we are talking about; if there isn't one, then it is body.
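The question of where the text stream starts can be made concrete with a small sketch. It assumes a well-formed XHTML content document, treats the "closest element" as an element looked up by a hypothetical `id`, and falls back to `body` when none is given or found. It uses `itertext()`, which is closer to DOM `textContent` than to `innerText`, and it applies no normalization; all of these are assumptions, not decisions of the group.

```python
import xml.etree.ElementTree as ET
from typing import Optional

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def text_stream(xhtml: str, element_id: Optional[str] = None) -> str:
    """Return the text used as origin for character offsets."""
    root = ET.fromstring(xhtml)
    origin = None
    if element_id is not None:
        # Look for the element carrying the requested id, if any.
        origin = next((el for el in root.iter() if el.get("id") == element_id), None)
    if origin is None:
        # Fallback suggested in the discussion: use the body element.
        origin = root.find(XHTML_NS + "body")
    if origin is None:
        raise ValueError("document has no body element")
    return "".join(origin.itertext())


# Hypothetical content document.
doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>T</title></head>'
    '<body><h1>One</h1><p id="p1">Call me <em>Ishmael</em>.</p></body></html>'
)

from_paragraph = text_stream(doc, "p1")  # origin is the paragraph
from_body = text_stream(doc)             # origin is the body

# The same offsets select different text depending on the origin.
print(repr(from_paragraph[8:15]))  # 'Ishmael'
print(repr(from_body[8:15]))       # 'me Ishm'
```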
Hadrien Gardeur: I don't want to do the second option, I don't want to have to process all of the text before I start
… the risk of the syntax changing for text fragment isn't very big
… it is widely used, and the people working on this know that
Wendy Reid: where does this leave us?
Ivan Herman: we have a three way answer for this
… 1. we keep it as it is in the PR
<Ivan Herman> 1. We keep text fragments in both places
<Ivan Herman> 2. remove it from the fragment selector
Laurent Le Meur: In both cases you have the string you want to use
<Ivan Herman> 3. we remove it from the text quote selector
Laurent Le Meur: with the text fragment selector you can keep just the beginning and the end
… the choice we've got is not between different models, it is
… between a syntax that is within the specification but not widely known
… and one that is widely known but not part of the specification
Brady Duga: I think we should choose one, and keep in mind we might need to switch
Ivan Herman: I almost agree with Brady, it is safer to keep the text quote selector
… that keeps the specification out of the problem
Brady Duga: as long as we pick one I'm happy
Ivan Herman: the annotation standard we develop is for exchanging annotations between systems
… it is not about the styling, and the text quote selector will be much more readable
Brady Duga: the text fragment selector is probably better defined
… having a well defined algorithm is important, but that is a guess
Laurent Le Meur: I agree with Brady, the text fragment selector seems well written
… but the quote selector is not
… in the text fragment selector you can leave the middle unsaid
Brady Duga: that is a huge difference when we talk about copyright.
… some systems use percentage of the book for copyright
… with the fragment selector, we wouldn't be using up so much of the text
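The difference between the two approaches can be sketched as follows, assuming the `#:~:text=start,end` form of the text fragment directive and the `TextQuoteSelector` shape from the Web Annotation Data Model. The passage and the helper names `encode_term` and `text_fragment` are hypothetical. The sketch shows both points raised in the discussion: a fragment with a start and an end leaves the middle unsaid, and percent-encoding makes a full quote hard to read.

```python
from typing import Optional
from urllib.parse import quote


def encode_term(term: str) -> str:
    # quote() leaves "-" untouched, but the directive syntax uses "-" as a
    # delimiter, so it is encoded explicitly here.
    return quote(term, safe="").replace("-", "%2D")


def text_fragment(start: str, end: Optional[str] = None) -> str:
    """Build a text fragment directive from a start term and an optional end term."""
    directive = encode_term(start)
    if end is not None:
        directive += "," + encode_term(end)
    return "#:~:text=" + directive


# Hypothetical passage to annotate.
passage = "It was the best of times, it was the worst of times, it was the age of wisdom"

# Quote selector: the whole passage is stored in the annotation.
quote_selector = {"type": "TextQuoteSelector", "exact": passage}

# Text fragment: only the beginning and the end are stored.
start, end = "It was the best", "age of wisdom"
short_fragment = text_fragment(start, end)
full_fragment = text_fragment(passage)

print(short_fragment)
print(full_fragment)
print(len(quote_selector["exact"]), "characters quoted vs", len(start) + len(end))
```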
Ivan Herman: readability is not an option in the HTML spec
Ivan Herman: I change my vote, and propose we remove the quote selector from our spec
Laurent Le Meur: at the time we want to close the recommendation, if there is no standard, should we plan now for what we will do?
GeorgeK: publishers in education use both EPUB reading systems and HTML reading systems. Using quote selector would bridge those systems
Hadrien Gardeur: I think we shouldn't publish this if the syntax isn't final. I think it is fine to wait a bit more, there is no reason to rush
Ivan Herman: I'm not sure I understand what Hadrien said
Hadrien Gardeur: I am responding to Laurent Le Meur's question
Ivan Herman: we can make a note in the spec now about the possible fallback
… today we should try to publish a spec that is as close as we can to what we want for its final form
Laurent Le Meur: if the text fragment syntax isn't finalized, I think we should still publish at the end of the year
Hadrien Gardeur: but it doesn't need to be normative
Wendy Reid: we could leave the annotation spec in the same state as the text fragment spec, as a working group note
Brady Duga: when should we worry about this? we should wait until it is an issue
… rather than solving all possible outcomes
… I think it is too early to worry about it
Ivan Herman: we should move on to publishing the first working draft
… we get the horizontal reviews
… at some point we can say that we suspend the work because we are waiting on a dependent spec
… for the time being we should move ahead believing that the syntax will be resolved
Wendy Reid: our next step is to decide whether we can announce this as a first draft
… we can still make changes
… this is a good time to publish the first draft and invite feedback
… is there any opposition to publishing this?
Ivan Herman: I don't oppose this. There are two more documents that we may want to publish at some point
… 1. the use case document
… we might want to publish this at the same time
<Ivan Herman> vocab
Ivan Herman: the other one is
… I have been developing the vocabulary in parallel
… it might be good to publish this together because the annotation has references to it
… and to make the links live properly, we would need to have the vocabulary document live
Wendy Reid: is the vocabulary a note?
… we have to publish this as a working draft, and develop a short name
Laurent Le Meur: is there a problem with "ann"?
Brady Duga: it looks a lot like "announcements"
Laurent Le Meur: is "annot" less confusing?
Ivan Herman: I would go with epub-anno
Wendy Reid: that would be consistent with our pattern
Ivan Herman: there is one more question, do we want to add a version number?
… if so, 1.0 or 3.4?
Laurent Le Meur: I would go with 3.4 since it is related to EPUB 3.4
Ivan Herman: the version number would go in the title
Shinya Takami: since the accessibility is 1.2, 1.0 is good
Matt Garrish: if this is revision bound with 3.4 then I have no problem with it
<Avneesh Singh> EPUB accessibility is independent of EPUB 3 version.
Laurent Le Meur: if we don't modify EPUB, why would we push annotation changes?
<Shinya Takami> +1 to Brady Duga
Brady Duga: it seems like 1.0 makes more sense, then we can update annotations without updating EPUB specs
<Wendy Reid> Proposed: Publish EPUB Annotations 1.0 as a FPWD, with the shortname epub-anno
<Shinya Takami> +1
<Charles LaPierre> +1
<Ivan Herman> +1
<Wendy Reid> +1
<Dale Rogers> +1
<Brady Duga> +1
<Matt Garrish> +1
<Hadrien Gardeur> +1
<Laurent Le Meur> +1
<Toshiaki Koike> +1
<Susan Neuhaus> +1
<Grigorily Manucharian> +1
<Romain Deltour> +1
<Avneesh Singh> +1
<Masakazu Kitahara> +1
RESOLUTION: Publish EPUB Annotations 1.0 as a FPWD, with the shortname epub-anno
<GeorgeK> +1
<Wendy Reid> Proposed: Publish the EPUB Annotations Vocabulary as a draft note, with the shortname epub-anno-vocab
<GeorgeK> +1
<Wendy Reid> +1
<Susan Neuhaus> +1
<Charles LaPierre> +1
<Matt Garrish> +1
<Shinya Takami> +1
<Ivan Herman> +1
<Brady Duga> +1
<Dale Rogers> +1
<Toshiaki Koike> +1
<Grigorily Manucharian> +1
<Hadrien Gardeur> +1
<Avneesh Singh> +1
<Laurent Le Meur> +1
<Masakazu Kitahara> +1
<Romain Deltour> +1
RESOLUTION: Publish the EPUB Annotations Vocabulary as a draft note, with the shortname epub-anno-vocab
Ivan Herman: is the UCR ready to publish?
Laurent Le Meur: yes it is stable
<Wendy Reid> Proposed: Publish the EPUB Annotations Use Cases document as a working group note, with the shortname epub-anno-ucr
<Shinya Takami> +1
<Matt Garrish> +1
<Wendy Reid> +1
<Ivan Herman> +1
<Laurent Le Meur> +1
<Charles LaPierre> +1
<Toshiaki Koike> +1
<Grigorily Manucharian> +1
<Brady Duga> +1
<Romain Deltour> +1
<Susan Neuhaus> +1
<Dale Rogers> +1
<Masakazu Kitahara> +1
<GeorgeK> +1
<Hadrien Gardeur> +1
<Avneesh Singh> +1
RESOLUTION: Publish the EPUB Annotations Use Cases document as a working group note, with the shortname epub-anno-ucr
Multi-granularity highlighting in media overlays
<Wendy Reid> w3c/
Hadrien Gardeur: a number of specialized libraries are moving away from DAISY
… they are using media overlays with human or computer audio
… they are using tools that can generate a media overlay with open source voices
… audiobook publishers are interested in this technology too
… usually we just talk about media overlays in kids' books
… but there is a wider use
… the technologies are using different levels of highlighting and aligning
… you may want to choose the level of alignment
… it wouldn't take very much to do this
… we would need additional roles in epub:type that identify structures
… I use "utterance"
… thanks to the use of seq plus [?] plus epub:type it would be easy to do
… it would be a non normative change in that it only adds vocabulary
… and would be backward compatible
… for reading systems that are aware of this, it could allow users to make a choice between word by word or other anchoring
… we would probably see tools and reading systems capable of this in the short term
… this will add value, is compatible with existing implementations, and we have commitments for reading system support and production
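A rough sketch of how a reading system might use such roles, assuming a media overlay where a `seq` carrying `epub:type="utterance"` wraps one `par` per word. The value `utterance` is taken from the proposal above and is not part of any published vocabulary; the file names, fragment identifiers, and the `highlights` helper are hypothetical. Word-level highlighting uses each `par`; utterance-level highlighting collapses the `seq` into a single range.

```python
import xml.etree.ElementTree as ET

SMIL = "{http://www.w3.org/ns/SMIL}"
EPUB = "{http://www.idpf.org/2007/ops}"

# Hypothetical overlay: one utterance made of two word-level par elements.
OVERLAY = """<smil xmlns="http://www.w3.org/ns/SMIL"
      xmlns:epub="http://www.idpf.org/2007/ops" version="3.0">
  <body>
    <seq epub:type="utterance" epub:textref="ch1.xhtml#u1">
      <par><text src="ch1.xhtml#w1"/><audio src="ch1.mp3" clipBegin="0.0s" clipEnd="0.4s"/></par>
      <par><text src="ch1.xhtml#w2"/><audio src="ch1.mp3" clipBegin="0.4s" clipEnd="0.9s"/></par>
    </seq>
  </body>
</smil>"""


def seconds(clock: str) -> float:
    # Handles only the "12.5s" form used here, not the full SMIL clock syntax.
    return float(clock.rstrip("s"))


def highlights(overlay: str, granularity: str):
    """Return (text target, begin, end) tuples at word or utterance level."""
    root = ET.fromstring(overlay)
    result = []
    for seq in root.iter(SMIL + "seq"):
        # epub:type may hold several space-separated values.
        if "utterance" not in (seq.get(EPUB + "type") or "").split():
            continue
        words = []
        for par in seq.findall(SMIL + "par"):
            text = par.find(SMIL + "text")
            audio = par.find(SMIL + "audio")
            words.append((text.get("src"),
                          seconds(audio.get("clipBegin")),
                          seconds(audio.get("clipEnd"))))
        if granularity == "word":
            result.extend(words)
        elif words:
            # Collapse the utterance: first word's begin to last word's end.
            result.append((seq.get(EPUB + "textref"), words[0][1], words[-1][2]))
    return result


print(highlights(OVERLAY, "word"))
print(highlights(OVERLAY, "utterance"))
```

A reading system that ignores the role would still play each `par` in order, which is consistent with the backward compatibility claimed above.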
Wendy Reid: I've seen this. I am concerned about the nesting required
… is that feasible in a media overlay file?
… the other concern is does this need to be a production concern?
… can reading systems detect this? Can they identify a part of a document as a paragraph, say?
… a third concern is internationalization, since not all languages have the same word boundaries as, say, English
Hadrien Gardeur: about production: many files already wrap words and containers for utterance
… what is missing is the role in SMIL
… it is completely automatable
… the second question: it would be challenging to do this on the reading system side
… for instance TTS is not always accurate
Hadrien Gardeur: we don't need to worry about internationalization if we use something like utterance which is pretty open
Ivan Herman: a formal question: you are asking for new values for the role
<Shinya Takami> let's continue this topic next week
Shinya Takami: we are out of time, we can continue this discussion