Meeting minutes
Do we need the concept of Annotation Sets? https://github.com/w3c/epub-specs/issues/3000
Laurent Le Meur: do we replace the annotation structure with a zip that could contain any annotation type plus any information about the book
Laurent Le Meur: it makes sense to have information about the book outside of the structure
… what shape should this take
Hadrien Gardeur: even if we have a zip we want a way of grouping the annotations together
… the concept of the set will continue to exist. we want to have a well known location
… we've agreed that there is a usefulness but a limited one for metadata
… mostly for services that need the metadata and for matching but this might not be so useful
… only when we have to present the metadata outside of the book
… so we should have few pieces of metadata
… citation styles would be a useful influence on the kinds of metadata we present
Ivan Herman: we are talking about standardizing the interchange format
… at that level, the set creates one single file for import and export
… when we accepted images, etc. we decided that zip is the format
… then later we decided that zip is the only accepted format
… we do not need the grouping function, one file that lists all the annotations is not needed
… if the annotation set's role is to hold collective metadata, there are two things there:
… information about the generator and the publication
<Laurent Le Meur> current ebook metadata relative to the book: https://
Ivan Herman: the generator isn't a necessary item in the interchange format
… the metadata for the publication, is mostly copies of data in the package document
… so we are repeating those values.
… which seems superfluous
… my conclusion is that the simplest way to do this is to make the metadata package include a copy of the metadata in the publication
Laurent Le Meur: there will be import and export of files plus push and pull of sets of annotations in the REST API
… I don't think there is an issue choosing zip as a container
… it is also possible that our format should support this
… about the generator, many files include this kind of information, though I don't suggest including the date
… this can be useful to differentiate annotations that don't conform to the specifications
… I would be worried about totally tying our format to EPUB by using the OPF in annotations
… and it mixes xml and json which is never great
… we just need descriptive information about the book, and no list of annotations
… just title, author, etc or something that links the annotation to the book
Hadrien Gardeur: we recently released Thorium on iOS and we have academic users
… almost 40% of everything that is read is in PDF
… so when we implement the highlight feature we will be asked to do it for PDF as well.
… if we want to implement this feature, it has to work for more than epub
… we need something as format agnostic as we can
… when we look at what we have for interchange, de-paged annotations, we define that we use a zip, that we have an extension and profile, and a list in .json
… so we have one document in which we have a list of all annotations even if we have a set
… by definition we need a set if we have multiple files
… now I think we have too much data, do we need dc date, dc creator?
… at a minimum we need something to identify annotations when they are separate from the book
Ivan Herman: we have currently a json structure for the set. I feel an itemized list isn't necessary
… we can have a sub directory for organizing the files
… coming back to metadata, we have never formalized that this should work outside of epub
… assuming this is a requirement now, it becomes a different discussion
… we should have a new issue about which metadata is necessary
… we must make clear how the identifiers are related to each other
… we cannot be silent about how this is related to the epub version
… I propose a pr that removes the itemized data from the annotation set/metadata
… we can discuss the exact name later, and we can further discuss it in the pr
Laurent Le Meur: you will open a PR on the metadata? and we will leave the generator for now
Hadrien Gardeur: I think having one file with all the information is more efficient
… I think there is a utility in having it all in one place
… that means if you look at annotation sets, we have metadata including items
… which is necessary because you need names in json
… I'm not sure we need type, generator, etc
… we need a minimal set of metadata
… I would go for a light weight approach but keep the set
Ivan Herman: we have a clear difference in opinion; then I will hold off on the PR for now
Laurent Le Meur: it would be useful to have the advice of other reading systems. I will try to get some feedback
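To make the two positions above concrete, here is a minimal sketch of the variant without an itemized list: a zip with one small metadata file and one file per annotation in a subdirectory. The file name `metadata.json`, the directory `annotations/`, and the metadata keys are assumptions made for illustration only; the minutes do not fix any of them.

```python
import io
import json
import zipfile

# Hypothetical layout: these names are illustrative, not taken from the draft spec.
METADATA_NAME = "metadata.json"
ANNOTATION_DIR = "annotations/"


def build_package(about, annotations):
    """Return the bytes of a zip holding one metadata file and one file per annotation."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(METADATA_NAME, json.dumps(about))
        for index, annotation in enumerate(annotations):
            archive.writestr(f"{ANNOTATION_DIR}{index:04d}.json", json.dumps(annotation))
    return buffer.getvalue()


def read_package(data):
    """Recover metadata and annotations by listing the subdirectory, with no itemized list."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        about = json.loads(archive.read(METADATA_NAME))
        names = sorted(
            name
            for name in archive.namelist()
            if name.startswith(ANNOTATION_DIR) and name.endswith(".json")
        )
        annotations = [json.loads(archive.read(name)) for name in names]
    return about, annotations
```

In this sketch the importer finds the annotations from the zip's own directory listing. The alternative discussed above, keeping the set, would instead store the annotations (or their file names) inside a single JSON document in the archive.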
Proposing a text on merging [annotations] w3c/epub-specs#3001
Ivan Herman: I have a question about when annotations are merged. What can we describe normatively?
… it turns out there is not much consistency in book metadata so we have to rely on the heuristics in the reading system
… so we need a paragraph to make clear to readers what they can expect on import
… if I am a user and not an implementor, I may never see it
… I don't want to hide the text from reading systems either
… if we are able to separate the information into implementer and user sections, then this text should go into the user part
Laurent Le Meur: most of section six is about what a user can expect
… there are sentences that are aimed at developers
… we can change the language to make clear what implementers should pay attention to
… rather than duplicating information
Dale Rogers: I notice section 6 is called "best practices for reading systems"; we could say in each section what audience it is aimed at
Laurent Le Meur: the risk is duplication and being unclear
… developers know how to deal with a use case, if we reorient this section more like use cases we can avoid splitting and duplicating
Ivan Herman: perhaps we could rewrite the section as use cases, and then have other sections or notes for implementors
… most of the sections would be use case oriented
… editorially, the simplest thing is to merge the PR and then we will move it during rewriting
Laurent Le Meur: I can take a run at rewriting it, I'll show you the branch before the PR
Susan Neuhaus: Do users look at the reading system spec? I expect not, so rewriting it with a use case framing makes sense
Laurent Le Meur: I think users won't read the spec but there is an intermediate party, people who write articles and usage notes, who would need this information
… and they will pick up on this vocabulary
Add some context to the use of the term Segment in the Target section w3c/epub-specs#2990
Laurent Le Meur: we know we can create bookmarks and annotations. A bookmark is a placeholder in a text or image but not so precise for a user
… an annotation is a highlight of text or other media, and has some range in the content
… the selectors we define can isolate a range even for a bookmark, since there is no specific marker for a bookmark
… will we accept that the selector of a bookmark can be a range or must be a single point?
Hadrien Gardeur: reading systems usually have a specific affordance for bookmarks
… like an icon on the corner of the page, we look at what is currently displayed to the users
… and check if there is a bookmark there
… if there is a very long range, the text could be displayed across many screens and cause problematic behavior
… I suspect reading systems will want to define a bookmark that won't cross the boundary of multiple screens
Ivan Herman: this is related to the previous discussion because the choice of selectors is done behind the scenes by the reading system
… so this is a matter of the affordance that Laurent Le Meur was talking about. So the situation you describe won't happen
… so reading systems should do this properly. We should say that if a reader's intent is to set a bookmark, we should use a single selector
Laurent Le Meur: a text position selector has a beginning and end, and is a range
… do you agree that a bookmark should be a single location? If so, we should add that to the spec
Laurent Le Meur: if reading system A generates a bookmark, it will still work on reading system A, but export that annotation to reading system B and B handles bookmarks differently, we could have problems
… even if we include a specification, reading system B will have a sanitisation process
… that will then decide if the bookmark would be at the beginning or end of a range
… we can try to force whatever and people will do what they want
… I don't think we will be able to control what people will do
… it might be good to have some recommendations but reading systems shouldn't expect it
Dale Rogers: a reading system can do what it wants with our declarations, and we can make those declarations
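The sanitisation step mentioned above can be sketched as follows, assuming the bookmark target uses a TextPositionSelector with `start` and `end` offsets as in the W3C Web Annotation Data Model. The function names and the policy of collapsing to one end of the range are hypothetical; nothing in the discussion settles which end a reading system would choose.

```python
def is_single_point(selector):
    """True when the selector already designates one location rather than a range."""
    return selector["start"] == selector["end"]


def sanitize_bookmark_selector(selector, anchor="start"):
    """Collapse a TextPositionSelector range to a single point (hypothetical policy).

    `anchor` picks which end of the imported range becomes the bookmark location.
    """
    if anchor not in ("start", "end"):
        raise ValueError("anchor must be 'start' or 'end'")
    position = selector[anchor]
    return {"type": "TextPositionSelector", "start": position, "end": position}


# An imported bookmark whose selector spans a long range of text.
imported = {"type": "TextPositionSelector", "start": 120, "end": 4800}
bookmark = sanitize_bookmark_selector(imported)
```

Two reading systems that pick different anchors would place the same imported bookmark at different locations, which is the interoperability concern raised in the discussion.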