Status of This Document
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a work in progress. No section should be considered final, and the absence of any content does not imply that such content is out of scope, or may not appear in the future. If you feel something should be covered here, tell us! The initial version of this document will focus on books, and at this time will not include requirements specific to magazines or newspapers. The scope will depend heavily on the willingness of people to contribute to this document. Please contact the Digital Publishing Interest Group if you would like to help. Once the document is stable, the group will publish it as an Interest Group Note.
This document was published by the Digital Publishing Interest Group as an Editor’s Draft. If you wish to make comments regarding this document, please send them to public-digipub@w3.org (subscribe, archives). All comments are welcome.
Publication as an Editor’s Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 August 2014 W3C Process Document.
1. Introduction
Not all stories worth telling can fit in a tweet, on a computer screen, or on a single piece of paper. Ever since the codex replaced the scroll, humans have divided our stories into pages. Pagination is the art and the craft of turning that scroll of content into discrete pieces, whether destined for book pages or screens. Pagination requires us to think about the document at all levels, from the total number of pages to the tiny spaces between letters. Along with graphic design and typography, it determines the look of the page.
Typography is the craft of endowing human language with a durable visual form, and thus with an independent existence.
—Robert Bringhurst, The Elements of Typographic Style
Good pagination, like good typography, aims to be invisible. As the reader turns the page, the stream of words and images in her mind should not be interrupted. Two thousand years of experience have taught us how best to do this. The goal of this document is to describe those rules, as clearly as possible, so they can be implemented in the Open Web Platform. We hope for a day where the pagination of digital books will be as beautiful and transparent as the best printed books.
Note: Our goal is for this document to describe layout and pagination for all languages that use the Latin script.
2. Fundamentals
Makeup is a highly skilled procedure. If the text is merely divided mechanically into portions of equal length, without regard to where the divisions fall, some of the pages that result are bound to be unacceptable logically or aesthetically: they will incorporate bad breaks.
—Chicago Manual of Style, 14th Edition, 19.40.
What therefore God hath joined together, let not man put asunder.
—The Bible, Matthew 19:6
Every rule of pagination boils down to a single principle: break pages with as little disruption to the reading experience as possible. A widow leaves the last line of a paragraph isolated from the rest of the thought. A recto hyphen means a word is interrupted by a page turn. A heading at the bottom of a page removes the title from the section, and the section from the title.
2.1. Tradeoffs
Pagination involves tradeoffs. Fixing a widow may result in a misaligned spread. Fixing that may result in a loose line or paragraph. What is acceptable in one book, or for one publisher, may be unacceptable to another. What is acceptable in one country, or language, may be unacceptable elsewhere.
2.2. Untangling the Vertical and the Horizontal
Page breaks are often line breaks. The tiniest change in kerning can make a paragraph longer or shorter, and thus create a widow or an orphan. The work of pagination, as done by typesetters, human or machine, inevitably involves the consideration of the lines of text. And so we will not try too hard to avoid talking about line breaks, when they potentially influence pagination.
3. The Spatial Geometry of Pages: Spreads and Bleeds
Open a printed book and what you see isn’t a single page, but two pages side by side—a spread.
Books set in Latin scripts typically share some common features.
- Books are bound on the left-hand side.
- Text is written left-to-right and top-to-bottom.
- Recto and verso pages may not be symmetric.
3.1. Crossing the Gutter
Large images, tables, or sidebars may extend across both pages of a spread.
3.2. Page size, orientation, and arrangement
The size, orientation, and arrangement of pages might vary even in a single book:
- Turned pages TK
- Foldouts are pages larger than the trim size of the book, folded in order to not extend beyond the other pages.
3.3. Bleeds
Books are printed on large sheets of paper, which are then folded and cut. Since the cutting is not infinitely precise, any object that should extend to the very edge of the page in fact needs to extend a bit outside the page boundary. A bleed is the part of an object that extends outside the page, generally by a small amount such as 9 points.
Note: Imposition and related topics are out-of-scope for this document.
4. Hyphenation and Justification
Good hyphenation and justification are critically important to the appearance and readability of text. Print typesetting systems can often achieve very good results, but most online reading systems handle both poorly.
4.1. Hyphenation
Text is often easier to read when words are allowed to break at the end of lines, thus avoiding massive variations in word-spacing or margins. But determining acceptable places to break words is a difficult problem:
All of the following are the results of automated hyphenation algorithms:
pre-ached wee-knights read-just leg-ends ex-acting co-inage
Words hyphenate differently based on pronunciation or meaning:
photo-graph, pho-togra-pher; re-cord (verb), rec-ord (noun); pre-sent (verb), pres-ent (noun); cre-ator, crea-ture
4.1.1. Parameters for Hyphenation
The following choices need to be made when considering hyphenation of text.
- Should this text be hyphenated at all? Hyphenation is generally suppressed in headings. [CSS3TEXT] includes a hyphens property, to enable or disable hyphenation.
- What’s the shortest word that can be hyphenated? Five or six is typical.
- What’s the minimum number of characters allowed before a hyphen? Two is typical, and is sometimes stated as “two-up.” PrinceXML has the prince-hyphenate-before property, but this is not in any current CSS draft.
- What’s the minimum number of characters allowed after a hyphen? Three is typical, and can be stated as “three-down.” PrinceXML has the prince-hyphenate-after property, but this is not in any current CSS draft.
- How many consecutive lines can end with a hyphen (known as a “ladder”)? Two or three is typical. PrinceXML has the prince-hyphenate-lines property, but this is not in any current CSS draft.
- Should capitalized words be hyphenated?
- Can the last word of a paragraph be hyphenated?
- Can the last word in a column, page, or spread be hyphenated?
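The numeric parameters in this list can be sketched as a simple filter. This is only an illustration in Python, not a description of any existing implementation: the function names, the default values (taken from the "typical" figures above), and the idea that candidate break offsets arrive from some separate dictionary or algorithm are all assumptions.

```python
def allowed_breaks(word, candidates, min_word=5, min_before=2, min_after=3):
    """Filter candidate break offsets by the length limits described above.

    `candidates` are offsets into `word` where some dictionary or algorithm
    (not shown here) says a hyphen may go. A break at offset i splits the
    word into word[:i] and word[i:].
    """
    if len(word) < min_word:
        return []  # word is too short to be hyphenated at all
    return [i for i in sorted(candidates)
            if i >= min_before and len(word) - i >= min_after]


def ladder_too_long(line_ends_with_hyphen, max_ladder=2):
    """Return True if more than `max_ladder` consecutive lines end in a hyphen."""
    run = 0
    for hyphenated in line_ends_with_hyphen:
        run = run + 1 if hyphenated else 0
        if run > max_ladder:
            return True
    return False
```

For example, with the defaults, a break that would leave only "ly" on the next line is rejected because fewer than three characters follow the hyphen.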
4.1.2. Choosing hyphenation points
A key question is, “who decides what is acceptable?” The answer depends on the language, the culture, the subject matter, and the material being typeset.
4.1.2.1. Language
Each language has its own conventions about hyphenation. U.S. English hyphenates differently than U.K. English. In some European languages, words may be spelled differently when hyphenated.
Of course, the same text may include words from many different languages.
4.1.2.2. Culture
Even within the same language, authorities differ on the proper hyphenation of words.
Copyeditors will often specify a canonical reference for hyphenation, which is usually a particular edition of a particular dictionary.
4.1.2.3. Subject Matter
Specialized subject matter may require additional hyphenation dictionaries. This is common in medicine, law, and science.
4.1.2.4. Exceptions
Authors should be able to provide a list of exceptions, which add to or override what the system would normally do. The format for doing so should be easily understood.
TeX uses the following format. Possible hyphenation positions are indicated with (surprise!) hyphens. Hyphenation should be prevented where hyphens are absent.
\hyphenation { sur-pris-ingly tan-ta-liz-ing-ly these }
PrinceXML uses a prefixed property to load a hyphenation dictionary:
prince-hyphenate-patterns: url('en_US.dic');
No current CSS specification includes support for this idea.
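To show how little machinery an exception list in the TeX-style format needs, here is a minimal Python sketch that turns the contents of such a list into a lookup table. The function name and the choice to store break offsets per lowercase word are assumptions made for illustration; no existing tool is being described.

```python
def parse_exceptions(text):
    """Map each listed word to the offsets where a hyphen is permitted.

    `text` is the whitespace-separated body of an exception list, for
    example "sur-pris-ingly these". A word with no hyphens maps to an
    empty list, meaning it must never be broken.
    """
    exceptions = {}
    for entry in text.split():
        parts = entry.split("-")
        word = "".join(parts).lower()
        offsets = []
        position = 0
        for part in parts[:-1]:
            position += len(part)
            offsets.append(position)  # a break is allowed after this part
        exceptions[word] = offsets
    return exceptions
```

A reading system could consult this table before falling back to its normal hyphenation rules.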
4.2. Justification
4.2.1. Algorithms
4.2.1.1. Greedy
4.2.1.2. Knuth-Plass (TeX)
4.2.1.3. Adobe (InDesign)
5. Paginating Single-Column Text
The simplest situation, which is very common, is when the content is only text, in a single column. Aside from chapter and book optimizations (to be discussed later) and line-breaking, the biggest issue is likely to be widows (see this figure for an example).
5.1. Widows
A widow occurs when the last line or lines of a paragraph fall at the top of a page. Publishers have different standards. Most frown on a single line at the top of the page, although some accept it if that line spans at least three-quarters of the text width. Others require at least two lines of a paragraph at the top of a page.
Note: [css3-break] includes the widows and orphans properties, with integer values.
[css3-break] does not consider fractional values for the widows or orphans properties.
Many typesetting systems have settings to prevent widows. CSS discusses these issues in [css3-break]. Unfortunately, these systems usually create another problem when they fix the widow. In this figure, there’s no longer a widow at the top of the page, but since the system merely moved a line from the left page to the right, it left behind an empty line, and the pages no longer align at the bottom.
More needs to be done. Removing one line of text from each page of the spread, shown in this figure, solves the problem.
5.2. Orphans
An orphan has two possible meanings in typesetting. It can refer to the minimum number of lines required before a page break (as in [css3-break]). It can also refer to the last line of a paragraph in any context. In the former context, many publishers now accept a single line of a paragraph before a page break. For the latter, standards vary widely. Some publishers want the last line to be longer than the paragraph indent. Some require one or two full words, or a certain number of characters. Most avoid having only a fragment of a word as the last line.
CSS does not currently address the second meaning of orphan.
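The first meaning of orphan, combined with the widow rule from section 5.1, reduces to a small calculation. The Python sketch below is an illustration under assumptions: the function name is hypothetical, and the two limits are treated as integer line counts in the manner of the widows and orphans properties.

```python
def legal_breaks(line_count, orphans=2, widows=2):
    """Return the positions inside a paragraph where a page break may fall.

    A position k means "break after line k": k lines stay at the bottom of
    the current page and line_count - k lines move to the top of the next.
    """
    return [k for k in range(1, line_count)
            if k >= orphans and line_count - k >= widows]
```

With both limits set to two, a three-line paragraph has no legal break and must move as a unit, which is one way the alignment problems described above arise.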
5.3. Constraints on page depth
In traditional typesetting, the first defense against bad breaks is to change the depth of the page. “Running long” or “running short” means including one more (or one fewer) line of text on each page of the spread, thus sidestepping the previously identified issue.
A typical book design includes instructions on whether it’s acceptable to run short, long, or (more rarely) both. Often there are also constraints on how many consecutive spreads (or pages) may be altered in this way. If running both long and short, it’s usually forbidden to go from one to another without an intervening normal spread.
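These constraints can be checked mechanically. The following Python sketch assumes a hypothetical encoding in which each spread is recorded as -1 (run short), 0 (normal depth), or +1 (run long), and a hypothetical limit on how many consecutive spreads may be altered; real book designs may state their rules differently.

```python
def depth_sequence_ok(adjustments, max_consecutive=2):
    """Check a sequence of per-spread depth adjustments against the rules.

    adjustments: one integer per spread, -1 short, 0 normal, +1 long.
    max_consecutive: assumed limit on consecutive altered spreads.
    """
    run = 0
    previous = 0
    for adjustment in adjustments:
        if adjustment not in (-1, 0, 1):
            raise ValueError("adjustment must be -1, 0, or 1")
        if adjustment != 0 and previous != 0 and adjustment != previous:
            return False  # long next to short with no normal spread between
        run = run + 1 if adjustment != 0 else 0
        if run > max_consecutive:
            return False  # too many altered spreads in a row
        previous = adjustment
    return True
```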
Running long or short may affect the space between the last line of text and a page footer or folio. Most publishers prefer footers to be in a fixed position. If, instead, the space between the last line of text and the footer is fixed, the footer is said to "bounce."
5.4. Facing Pages
If a document has facing pages, the publisher usually requires that they align top and bottom. Exceptions include:
- The page is the last page of a chapter.
- The page contains no text, only images or tables.
- Aligning the facing pages would make some other issue worse.
5.5. Recto and Verso Hyphens
Publishers sometimes constrain what characters may appear before a page break. Most commonly, the right-hand page of a spread may not end with a word fragment, as the reader must turn the page before reading the rest of the word. Less common is a prohibition on the verso page ending with a hyphen.
5.6. Space Breaks and Ornaments
Many novels, and some narrative non-fiction books, include small breaks in the text. These are usually represented by one to three blank lines, or by a small ornament or dingbat. Problems arise when these breaks fall at the top or bottom of a page.
If, however, the space break falls at the bottom of the page, confusion can result. In the figure below, it’s hard to tell there’s a space break, as it just looks like the page is a few lines short.
In that case, asterisks or some other ornament are added to the top or bottom of a page, as a visual reminder of the break. To get everything to work out, the spread was run short, and the space break (now with ornament) pushed to the top of the second page:
This is an example where the page position of an element determines its content as well as design. A ::page-top or ::page-bottom pseudo-element might prove useful.
6. Paragraphs and indentation
7. Fonts
Texts are built from letters. Modern typesetting systems must be able to choose from hundreds or thousands of glyphs depending on the circumstances.
7.1. Ligatures
Two or more letters may be better displayed as a single glyph:
7.2. Numbers and math
7.2.1. Lining, oldstyle, and tabular figures
Traditionally, text set in mixed-case type should use old-style figures. Text set in all caps should use lining figures. Columns of numbers (such as in tables) are clearer using tabular figures, which are of uniform width.
7.3. Alternate forms
7.3.1. Caps
Text may be set in a mix of uppercase and lowercase, exclusively in upper- or lowercase, in small caps, in caps/small caps, etc. In many cases this is purely a design choice, and the displayed text may use a different case than the source document.
7.3.2. Swashes and Stylistic Alternates
8. Initial Capitals
Large, decorative letters have been used to start new sections of text since long before printing. In fact, their use predates lowercase letters entirely.
8.1. Drop caps
A drop cap is a larger-than-usual letter at the start of a paragraph, with a baseline at least one line lower than the first baseline of the paragraph. The size of drop caps is usually indicated by how many lines they occupy—two-line and three-line drop caps are the most common.
Aligning the letter vertically is a challenge. The cap height of the letter should align with the cap height of the first line of text. The baseline of the letter should fall on the baseline of one of the following lines (the second for a 2-line drop cap, etc.).
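The two alignment conditions determine the size of the drop cap. As a worked sketch in Python, assume that each font's cap height is known as a fraction of its font size (the ratio parameters below are hypothetical inputs, not values from any specification):

```python
def drop_cap_font_size(lines, line_height, text_size, text_cap_ratio, cap_ratio):
    """Font size for a drop cap that occupies `lines` lines of body text.

    The cap must reach from the cap height of the first line down to the
    baseline of line number `lines`, so its cap height equals
    (lines - 1) * line_height plus the cap height of the body text.
    text_cap_ratio and cap_ratio are cap height divided by font size for
    the body font and the drop cap font respectively (assumed known).
    """
    if lines < 2:
        raise ValueError("a drop cap spans at least two lines")
    text_cap_height = text_size * text_cap_ratio
    cap_height = (lines - 1) * line_height + text_cap_height
    return cap_height / cap_ratio
```

For 10pt text on 14pt leading, with a cap height ratio of 0.7 in both fonts, a three-line drop cap needs a cap height of 35pt and therefore a font size of about 50pt.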
The horizontal position of the drop cap and the surrounding text is also an issue, as variations in glyph shapes may require increasing or decreasing space to the right of the drop cap, and in some cases separate adjustments may be required for each line adjacent to the drop cap.
The position of a drop cap in relation to the left margin may also need to be adjusted. Letters like "C" may need to move left slightly to visually align with the left margin.
A drop cap may be desired on a paragraph which starts with a punctuation mark, most often a quotation mark. In this case, one option is to delete the quotation mark entirely.
Input on techniques for coping with initial punctuation on drop caps would be appreciated.
8.2. Raised caps and sunken caps
A raised cap is a large letter used to start a paragraph, which uses the same baseline as the rest of the first line. A sunken cap both sinks below the text baseline, and extends above.
Note: The CSS Working Group has proposed an initial-letter property to allow for properly-aligned drop caps. See dev.w3.org/csswg/css-inline/#DropInitial.
9. Running headers and footers
Books often have material printed at the top and/or bottom of each page, outside the normal content area. These headers or footers may serve as guideposts for the reader, fodder for designers, low-tech DRM, or merely a way to know what book your fellow train passenger is reading. There’s more to running headers than is dreamt of in the open web platform…
9.1. Content
Running heads and footers may contain:
- Content from the document: book title, chapter or part titles, author name(s). Indexes and notes sections may have running heads to identify which entries are on a particular page.
- Content intended only for running heads: shortened versions of chapter titles…
- Counters of all sorts: page numbers, section numbers, chapter numbers.
- Ornaments, decorative type, or images
- Copyright statements or other boilerplate
- Date and/or time stamps
- File names
- Version numbers
- Combinations of the above
In some cases the content of running heads may have an internal structure—a chapter title might have an italic word—or may require different text styles or fonts.
In this example, the running header contains the author name, the page number, and an ornament. This seemingly simple case was quite complex, using [css3-gcpm]-like features implemented by PrinceXML.
@page body:left {
  @top-center {
    content: flow(verso);
  }
}
p.verso-cus {
  flow: static(verso);
  content: prince-glyph-index(80);
  font-family: 'Type Embellishments One';
  font-size: 10pt;
  text-align: center;
}
p.verso-cus:before {
  content: counter(page);
  display: inline;
  padding-right: 15pt;
  font-family: 'Garamond 3 LT Std';
  font-style: italic;
  font-size: 10pt;
}
p.verso-cus:after {
  content: string(flow-header-left-rw);
  display: inline;
  padding-left: 15pt;
  font-family: 'Garamond 3 LT Std';
  font-style: italic;
  font-size: 10pt;
}
9.2. Which content?
An element whose content is used in running heads may appear many times on a page. Authors must be able to specify which instance is used. [css3-gcpm] provides the start, first, last, and first-except keywords to accomplish this:
- first
- The value of the first assignment on the page is used. If there is no assignment on the page, the "entry value" is used.
- start
- If the element is the first element on the page, the value of the first assignment is used. Otherwise the "entry value" is used. The "entry value" may be empty if the element hasn’t yet appeared.
- last
- The "exit value" of the named string is used.
- first-except
- This is identical to first, except that the empty string is used on the page where the value is assigned.
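The four keywords can be read as a small decision procedure. The Python sketch below follows the wording of the definitions above; the function name and its parameters are hypothetical, and the "exit value" is assumed to be the last assignment on the page, or the entry value if the page has none.

```python
def running_head_value(keyword, entry_value, assignments, starts_page=False):
    """Pick the string shown in a running head for one page.

    entry_value: value carried in from earlier pages ("" if never set)
    assignments: values assigned on this page, in document order
    starts_page: True if the assigning element is the first element on the page
    """
    exit_value = assignments[-1] if assignments else entry_value
    if keyword == "first":
        return assignments[0] if assignments else entry_value
    if keyword == "start":
        return assignments[0] if (assignments and starts_page) else entry_value
    if keyword == "last":
        return exit_value
    if keyword == "first-except":
        # empty on a page where the value is assigned, otherwise like "first"
        return "" if assignments else entry_value
    raise ValueError("unknown keyword: " + keyword)
```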
Are these values enough to handle indexes, dictionaries, and other use cases?
9.3. Placement
Running headers and footers may appear in almost any position on a page.
- The position of the running head may be different on first pages vs subsequent pages, or the running head may be omitted on first pages
- Running heads may align to the inside or outside, and thus be different on left and right pages.
- Authors may need to control the layering of running head text (i.e. “z-index”).
- The running head may overflow the page boundary (i.e. “bleed”)
- The running head may be rotated
Running headers are addressed by [css3-gcpm].
EPUB 3.0 has now deprecated support for headers and footers using oeb-page-head and oeb-page-foot.
10. Heads
10.1. General Considerations
TK
10.2. Heads at the top of a page
When a head falls at the top of a page, a spacing adjustment is often necessary. Here’s a typical arrangement, with a line and a half of space above the head, and a half-line-space below, so that the text stays on the proper baselines.
If that head appears at the top of the page, the subsequent text will be off by a half-line.
Everything works out if we add a half-line-space back.
10.3. Heads at the bottom of a page
A head should never be the last thing on a page; it must be followed by two or three lines of text.
10.4. Bridge heads, side heads, and run-in heads
TK
11. Images
11.1. TK
Some things to note about this image:
- the caption and image are treated as a unit
- Text runs around the image+caption
- image runs right up to the gutter of the page (i.e. extends beyond usual content area)
11.2. Inline images
TK
11.3. Bleeds
TK
Images that cross spread
image before callout?
placing multiple images on page… inside/outside, top/bottom, stagger
broadside
placement of caption/title
12. Tables
12.1. Alignment
Many tables have specialized requirements for the alignment of cells in a given column.
12.1.1. Align on character
All entries in a given column may need to align to a predetermined character, most commonly a decimal point. Typically, the longest entry in the column should be centered, and then the other entries should align to that entry.
In some cases, a composite “longest entry” needs to be constructed:
| 445.85 |
| 12345.6 |
| 1.234 |
| .1 |
In this case, the user agent should act as if 12345.234 was the longest line, so the margin to the left of 12345.6 will be equal to the margin to the right of 1.234.
When a collection of whole numbers with no decimal points are in a column and are asked to align, the longest whole number should center in the column and the rest of the whole numbers should right align on the right indent of the longest whole number.
If the content of a table cell is being aligned to a character, that content should not have wrapping applied by the rendering system.
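The composite "longest entry" can be computed by measuring each side of the alignment character separately. This Python sketch is an illustration only: it counts characters as if every glyph had the same width, and the function name is hypothetical.

```python
def align_on_char(entries, char="."):
    """Pad entries so the alignment character falls in the same column.

    Every returned string has the width of the composite longest entry:
    the widest part before the character, the character itself, and the
    widest part after it. Entries without the character are treated as
    ending just before it.
    """
    split = [entry.partition(char) for entry in entries]
    left = max(len(head) for head, _, _ in split)
    right = max(len(tail) for _, _, tail in split)
    aligned = []
    for head, separator, tail in split:
        separator = separator or " "  # keep the column for whole numbers
        aligned.append(head.rjust(left) + separator + tail.ljust(right))
    return aligned
```

Applied to the four entries above, every result is nine characters wide, the width of 12345.234, and it is this composite block that would be centered in the column.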
12.1.2. Flush left center alignment
Also known as centering on the longest line, the longest line in a column is found and centered, and other entries in the column are aligned to the left edge of the longest line.
As before, header and footer cells are ignored, and the author should be able to exclude specified cells from the alignment process.
This type of alignment is often used in text, for poetry or prose extracts.
User agents should not break single-word cells.
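Centering on the longest line amounts to computing one shared indent. A minimal Python sketch, assuming equal-width characters and a hypothetical column width measured in characters:

```python
def flush_left_center(lines, column_width):
    """Center the longest line and left-align the others to its left edge."""
    longest = max(len(line) for line in lines)
    indent = max(0, (column_width - longest) // 2)
    return [" " * indent + line for line in lines]
```

All entries receive the same indent, so they stay flush left with one another while the block as a whole is centered.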
12.2. Table widths
In print, tables are not randomly sized but typically set to one of a few fixed widths. This requirement necessitates that a composition engine know how to “snap to” one of the desired widths. This may help show relationships between separate tables.
broadside
placement of caption/title
spread
multi-page
continued lines
13. Lists, Indexes, and Tables of Contents
13.1. Indexes
13.1.1. Collapsing Page Ranges
When generating indexes or referring to page ranges, one often ends up with duplicated or sequential numbers.
1, 3, 3, 7, 8, 9, 10, 16
This should be formatted as:
1, 3, 7–10, 16
with duplicates removed and consecutive numbers replaced by ranges.
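A minimal Python sketch of this collapsing step follows. The function name is hypothetical, and so is the choice that a run of just two consecutive pages also becomes a range; the `min_run` parameter lets that assumption be changed.

```python
def collapse_page_ranges(pages, min_run=2):
    """Remove duplicates and replace consecutive page numbers with ranges."""
    numbers = sorted(set(pages))
    parts = []
    i = 0
    while i < len(numbers):
        j = i
        # extend j to the end of the current run of consecutive numbers
        while j + 1 < len(numbers) and numbers[j + 1] == numbers[j] + 1:
            j += 1
        if j - i + 1 >= min_run:
            parts.append(f"{numbers[i]}\u2013{numbers[j]}")  # en dash
        else:
            parts.extend(str(n) for n in numbers[i:j + 1])
        i = j + 1
    return ", ".join(parts)
```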
14. Footnotes
Having to read footnotes resembles having to go downstairs to answer the door while in the midst of making love.
—Noël Coward
In print publishing, a footnote consists of two parts: a reference (often rendered as an asterisk or superscripted number) and the footnote body.
Footnotes themselves can be quite complicated. Footnotes can contain multiple paragraphs, block quotes, poems, lists, and tables. Footnotes can contain other footnotes (an edge case, admittedly, but David Foster Wallace was notorious for this). Footnotes can extend across multiple pages. In short, a footnote is a container that can hold almost anything.
In order to describe footnotes in HTML, one must separate the footnote reference (which is an inline element) from the footnote itself, as HTML frowns on placing complex block structures inside paragraphs. This is quite different from something like DocBook, where the content model allows a footnote element inside a paragraph, and that footnote can itself contain multiple paragraphs, etc.
<p>It was the best of times<span class="ref-footnote-rw">*</span>, it was the blurst of times.</p>
<div class="block-rw footnotes-rw">
  <p><span class="num-footnote-rw">*</span>Oh yes, but the telephone is so impersonal.</p>
  <p>I prefer the hands-on touch you only get with hired goons.</p>
</div>
There may also be more than one reference to the same footnote.
Footnote handling as described in [css3-gcpm] assumes the footnote is coded inline at the point of reference. This situation is under discussion on the www-style list.
14.1. Inline footnotes and multiple footnote regions
Some types of footnotes may be displayed inline, as in the top figure. Other books (see below) may have two separate streams of footnotes, requiring two footnote regions.
14.2. At the foot of what?
Footnotes usually fall at the bottom of the page, but may need to be placed at the end of a column, table, sidebar, or other document structure.
14.3. Breaking footnotes across pages
Some footnotes can extend across more than one page. Limits on the size of the footnote area(s) may be required, so that a page containing only footnotes is avoided.
Note: Sometimes, footnotes may require so much space that they cannot all be placed before the end of a document section. In this case, it’s acceptable to have pages that consist only of footnotes.
14.4. Numbering
Three questions must be answered when numbering footnotes. First, which numbering scheme should be used? Second, what are we actually numbering? Third, is the numbering system reset at some point in the document?
14.4.1. Numbering schemes
Footnotes are most commonly numbered with arabic numerals, lower-case letters, or a sequence of symbols: *, †, ‡, §, ||, and #. With symbols, they may be doubled or tripled after exhausting the sequence, but long before |||||| is used, the choice of numbering should be re-evaluated.
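The doubling and tripling of symbols can be sketched as follows. This Python example assumes the six-symbol sequence listed above and a hypothetical function name; it does not enforce any upper limit on repetition.

```python
SYMBOLS = ["*", "\u2020", "\u2021", "\u00a7", "||", "#"]  # *, dagger, double dagger, section sign, ||, #


def footnote_marker(n, symbols=SYMBOLS):
    """Return the marker for the n-th footnote (1-based) in a symbol scheme.

    After the sequence is exhausted, each symbol is doubled, then tripled,
    and so on.
    """
    if n < 1:
        raise ValueError("footnote numbers start at 1")
    index = (n - 1) % len(symbols)
    repeat = (n - 1) // len(symbols) + 1
    return symbols[index] * repeat
```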
14.4.2. What are we counting?
Usually, footnote numbers count footnotes. But in some cases, the reference may be a line number, paragraph number, or section number.
14.4.3. Resetting numbers
Footnote numbering may restart with each new chapter, or each new page. The former is common with numeric footnotes, the latter with footnotes using symbols.
Note: Footnotes are addressed by [css3-gcpm].
Note: Digital publications often render footnotes differently from print. They may become pop-ups, move to the end of the section, or to the end of the document. We are not currently attempting to document digital best practices around footnotes.
15. Cross-references
Books often contain text that refers to other components of the same book. Such text commonly consists of the name or title of that component, along with a number used to identify that component.
as described in Chapter 14
From the Aiguille du Midi, follow A16 to 2950m
Make Anchovy Mayonnaise (page 762), with 6 or 8 anchovies.
For another example, see Figure 1.4.
and equation (31.3) shows it does not!
Many typesetting systems allow authors to generate numbers for such components. A cross-reference needs to be able to access such generated content from another location in the document.
Note: CSS provides counters to number things; creating cross-references would require a mechanism to access the value of a counter at a particular location. The target-counter and target-counters functions in [css3-content] are designed to do this.
[css3-content] does not have a mechanism to customize the content of a cross-reference based on the type of element being referred to. See https://lists.w3.org/Archives/Public/public-digipub/2015Aug/0079.html and subsequent discussion.
16. Sidebars
Some things to notice:
- The image floats to the top of the column inside the sidebar
- The columns themselves base-align
- The sidebar title and “supertitle” are on the same line.
17. Marginalia
alignment with reference
18. Equations
Mathematics is a critical part of many books, from learning materials for kindergartners to monographs on physics.
18.1. Breaking equations
TK
18.2. Numbering equations
Equations are often numbered. In the figure below, note that the equations are centered horizontally, the equation number is flush right, and the equation number is centered vertically relative to the equation.
Note: The alignment in this figure was implemented with [css-flexbox-1].
18.3. Aligning equations
Some publishers require that all equations on a page align on the equals sign.
    x + 3z = 7 + 2y
2x + y + z = 4
Intervening text which may extend for several lines
   10 + 2y = 3x + 2z
Note: the alignment is generally scoped to a page or (more likely) a defined set of equations.
Note: this is similar to how numbers in a table column may align on a decimal point or other character.
18.4. Annotating equations
19. Columns
Often the first page of a chapter or article will be set in a single column, and subsequent pages set in multiple columns.
20. Punctuation
Spacing around punctuation marks is a known obsession of typographers.
20.1. Language-specific spacing rules
Punctuation | English | French | Spanish |
---|---|---|---|
Exclamation Point | ! | [thin space]! | ¡text! |
Colon | : | [thin space]: | : |
Question Mark | ? | [thin space]? | ¿text? |
Open Quote | “ | «[thin space] | “ |
Close Quote | ” | [thin space]» | ” |
20.2. Em-dashes and en-dashes
To space or not to space? That is the question. Even within publishing houses, arguments continue over the proper display of em-dashes. Some imprints at Hachette use closed em-dashes, others insist on thin spaces around em-dashes. If the same book is to be published in the United Kingdom, em-dashes would be replaced with en-dashes, with larger spaces around them.
Given the subtlety of many of these rules, it’s helpful to use CSS to generate typographically-sophisticated output from material written by lay authors, or to adapt content to varying publisher or language requirements.
Older drafts of [css3-gcpm] contained a text-replace property, which has been implemented by PrinceXML.
body { prince-text-replace: "—" "\200A—\200A"; }
In this example, we’re adding hair spaces around em-dashes.
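The same replacement can be sketched outside of CSS. The Python example below is an assumption-laden illustration: the function name is hypothetical, and unlike the rule above it also normalizes any ordinary, thin, or hair spaces already typed around the dash, so that authors' inconsistent spacing does not survive.

```python
import re

HAIR_SPACE = "\u200A"
EM_DASH = "\u2014"

# an em dash together with any spaces, tabs, thin spaces, or hair spaces around it
_DASH_WITH_SPACING = re.compile("[ \t\u2009\u200A]*" + EM_DASH + "[ \t\u2009\u200A]*")


def space_em_dashes(text, space=HAIR_SPACE):
    """Surround every em dash with `space`, replacing existing spacing."""
    return _DASH_WITH_SPACING.sub(lambda match: space + EM_DASH + space, text)
```

Passing an empty string for `space` produces closed em dashes instead, which is one way to adapt the same source text to different house styles.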
20.3. Number formatting
Different languages have different conventions for formatting numbers. Punctuation marks are inserted at specified points in numbers to aid readability. For example, in English commas are used to separate groups of digits, and a period to denote the decimal point.
However, in Spanish and Norwegian the roles are reversed, with the period used to group digits, and the comma for the decimal point:
Country | Grouping separator | Decimal separator |
---|---|---|
Austria, Belgium, Brazil, Canada (fr), Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Italy, Netherlands, Norway, Peru, Poland, Portugal, Romania, South Africa, Spain, Sweden, Switzerland | space | comma |
Argentina, Austria, Brazil, Denmark, Germany, Italy, Portugal, Romania, Slovenia, Spain (older) | period | comma |
Great Britain, United States | comma | period |
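The separator pairs in this table can be applied with a short routine. The Python sketch below is illustrative only: the function name is hypothetical, the input is assumed to be a plain digit string using a period as its decimal point, and digits are assumed to be grouped in threes, which does not hold for every convention.

```python
def format_number(digits, grouping=",", decimal="."):
    """Format a plain digit string such as "1234567.891" with separators."""
    sign = "-" if digits.startswith("-") else ""
    whole, _, fraction = digits.lstrip("-").partition(".")
    groups = []
    while len(whole) > 3:
        groups.insert(0, whole[-3:])  # peel off groups of three from the right
        whole = whole[:-3]
    groups.insert(0, whole)
    result = sign + grouping.join(groups)
    if fraction:
        result += decimal + fraction
    return result
```

The same input can then be rendered for each row of the table by passing a different pair of separators.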
Note: The CSSWG has informally proposed a method for formatting numbers.
21. Special Considerations for Genres
21.1. Education
- College textbooks
- Elhi
- Language
- Study guides
21.2. Trade
- Fiction
- Narrative nonfiction
- Children’s
- YA
- Religious
- Bibles
- Travel
- How-to
- Manga/Comics/Graphic Novels
21.3. STEM
21.4. Reference
- Legal
- Dictionaries
22. Digital Issues
23. Large-Scale Issues in Pagination
23.1. Book optimization
In trade publishing, we often know how many pages will be in a book before it is written. The nature of printing and binding also mandates that the number of pages in a book be some multiple of eight, sixteen, or thirty-two pages. Publishers often limit how many blank pages are allowed at the end of a book.
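The arithmetic behind this constraint is a rounding-up to the next full signature. A minimal Python sketch, with a hypothetical function name and a signature size of sixteen assumed as the default:

```python
def printed_page_count(content_pages, signature=16):
    """Round a page count up to a whole number of signatures.

    Returns (total_pages, blank_pages), where blank_pages is the number of
    empty pages left over at the end of the book.
    """
    signatures = -(-content_pages // signature)  # ceiling division
    total = signatures * signature
    return total, total - content_pages
```

A publisher's limit on blank pages can then be checked against the second value, and the content adjusted (or the signature size changed) if the limit is exceeded.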
23.2. Chapter optimization
A chapter that ends with only a few lines of text looks like a mistake, and wastes paper (or electrons!). Generally a page should contain at least five lines of text.
Appendix A: Baseline Grids
A baseline grid is a series of evenly-spaced horizontal alignment lines. This is used to provide a vertical rhythm for a design, to align adjacent content (text or graphics), and to align baselines on facing pages in printed material.
The grid lines can be spaced at line-height intervals or a factor of line-height.
Content can be aligned to the grid in various ways. Roman body text typically sets the baseline on a grid line. Graphics might have their top, bottom or both set on grid lines, or be centered between grid lines. Text blocks (consider a multi-line heading with line-height at 1.4x grid height) might have their last baseline or first baseline on a grid line, or have the block’s combined height centered between grid lines. Centering is much more important in ideographic type systems.
If normal layout would result in a misalignment, content shifts down to the next available grid line.
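Shifting content down to the next grid line is a rounding-up operation. The Python sketch below assumes positions are measured downward from the top of the page, that the grid starts at a hypothetical `origin`, and that a tiny tolerance is acceptable so that content already on a grid line is not pushed down by floating-point noise.

```python
import math


def snap_to_grid(position, grid_interval, origin=0.0):
    """Return the first grid line at or below `position` (larger value = lower)."""
    tolerance = 1e-9  # treat positions within this fraction of a step as on the line
    steps = math.ceil((position - origin) / grid_interval - tolerance)
    return origin + steps * grid_interval
```

With a 14pt grid, a baseline that would naturally fall at 30pt moves down to 42pt, while one already at 28pt stays where it is.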
Sometimes it’s necessary to have particular content opt out of aligning to a grid.
There can be one or more grids per document. Multiple grids can overlap (body grid and side content grid) or run in series (a vertical stack of pages). Grids can be nested (think of a document being represented as a graphic inside another document). A particular piece of content only aligns to a single grid.
Appendix B: Of Leading and Sinkage: The Language of Print
Translating print designs to the open web platform can be tricky.
- vertical distances are usually measured baseline to baseline.
- print designers sometimes talk about a "text page" which includes the running head.
- The basic text area is often specified with a gutter margin and a text "measure". In [css3-page] this area is described by left/right or inside/outside margins.
- Leading
- Line-height
- Recto
- Right-hand page of a spread
- Verso
- Left-hand page of a spread
The Classical Rules of Hyphenation and Pagination
- At hyphenated line-ends, leave at least two characters behind, and take at least three forward.
- Avoid leaving the stub-end of a hyphenated word, or any word shorter than four letters, as the last line of a paragraph.
- Avoid more than three consecutive hyphenated lines.
- Hyphenate proper names only as a last resort unless they occur with the frequency of common nouns.
- Hyphenate according to the conventions of the language.
- Link short numerical and mathematical expressions with hard spaces.
- Avoid beginning more than two consecutive lines with the same word.
- Never begin a page with the last line of a multi-line paragraph.
- Balance facing pages by moving single lines.
- Avoid hyphenated breaks where the text is interrupted.
- Abandon any and all rules of hyphenation and pagination that fail to serve the needs of the text.
Further Reading
Bringhurst, Robert. The Elements of Typographic Style
Felici, Jim. The Complete Manual of Typography
Haralambous, Yannis. Fonts & Encodings: From Advanced Typography to Unicode and Everything in Between
Haslam, Andrew. Book Design
Highsmith, Cyrus. Inside Paragraphs
Kane, John. A Type Primer
Knuth, Donald. Digital Typography
Lawson, Alexander. Anatomy of a Typeface
Mitchell; Wightman. Book Typography
Nickel, Kristina. Ready to Print
Steer, Vincent. Printing Design and Layout (1948)
Tracy, Walter. Letters of Credit: A View of Type Design
Tschichold, Jan. The Form of the Book: Essays on the Morality of Good Design
Acknowledgments
Eric Aubourg, Luc Audrain, Bert Bos, Tom Byrer, James Clark, Brady Duga, Ivan Herman, Tony Graham, Bill Kasdorf, Jean Kaplansky, Sanders Kleinfeld, Liam Quin, Alan Stearns, Tzviya Siegman