This document describes requirements for the layout and presentation of text in languages that use the Hebrew script when they are used by Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode.
This document describes the basic requirements for Hebrew script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of the Hebrew script. Currently the document focuses on Hebrew as used for the Hebrew language. The information here is developed in conjunction with a document that summarises gaps in support on the Web for Hebrew.
The editor's draft of this document is being developed by the Hebrew Layout Task Force, part of the W3C Internationalization Interest Group. It is published by the Internationalization Working Group. The end target for this document is a Working Group Note.
Amir Aharoni provided most of the original technical content. Richard Ishida reorganised the content along standardised lines.
See also the GitHub contributors list for the Hebrew Language Enablement project, and the discussions related to Hebrew script.
The W3C has to ensure that the needs of scripts and languages around the world are built into technologies such as HTML, CSS, SVG, etc. so that Web pages and eBooks look and behave as their readers expect.
The aim of this document is to describe basic requirements for Hebrew script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications, and for application developers, about how to support users of the Hebrew script.
This document should contain no reference to a particular technology. For example, it should not say "CSS does/doesn't do such and such", and it should not describe how a technology, such as CSS, should implement the requirements. It is technology agnostic, so that it will be evergreen, and it simply describes how the script works. The gap analysis document is the appropriate place for all kinds of technology-specific information.
This document is pointed to by a separate document, Hebrew Gap Analysis, which describes gaps in support for Hebrew on the Web, and prioritises and describes the impact of those gaps on the user.
Wherever an unsupported feature is identified through the gap analysis process, the requirements for that feature need to be documented. This document is where those requirements are described.
The Language enablement index points to this document and others, and provides a central location for developers and implementers to find information related to various scripts.
The W3C also maintains a tracking system that has links to GitHub issues in W3C repositories. There are separate links for (a) requests from developers to the user community for information about how scripts/languages work, (b) issues raised against a spec, and (c) browser bugs. For example, you can find out what information developers are currently seeking, and the resulting list can also be filtered by script.
Hebrew is an abjad. This means that in normal use the script represents only consonant and long vowel sounds. This approach is helped by the strong emphasis on consonant patterns in Semitic languages.
The Hebrew script is written right-to-left across the page, and Hebrew characters are not joined cursively as they are in, for example, Arabic text.
The script is unicameral.
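The directionality and the absence of case can both be observed in the Unicode character data. The following is a minimal sketch, assuming Python's standard unicodedata module; the sample word and the variable names are illustrative only.

```python
import unicodedata

# Sample word (shalom); the last character is a final-form mem.
sample = "\u05E9\u05DC\u05D5\u05DD"

# Bidirectional class of each character as recorded in the Unicode database.
bidi_classes = [unicodedata.bidirectional(ch) for ch in sample]

# Hebrew letters carry the strong right-to-left class "R".
all_rtl = all(cls == "R" for cls in bidi_classes)

# Unicameral: case conversion leaves Hebrew text unchanged.
case_invariant = sample.upper() == sample and sample.lower() == sample

print(bidi_classes, all_rtl, case_invariant)
```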
Words are separated by spaces, and these provide the predominant break points for line-breaking, and adjustment points for justification.
See also this overview of the Hebrew script.
The use of italic fonts in Hebrew is controversial. Some designers say that slanted fonts shouldn't be used at all. Some say that only meticulously designed slanted fonts can be used (although this is true also for Latin text).
Slanted fonts are not used in newspapers. Bold type is more likely to be used where English newspapers would use italics, although a different typeface is often used for that.
Bold type can be used for emphasis, or for highlighting specific items in text.
Most of the time Hebrew is written without diacritical marks; however, there are two notable uses for diacritics:
Some known issues:
Hebrew text is generally similar to Latin text in this regard. The only issue to note is selecting and moving through text that includes the character ־ [U+05BE HEBREW PUNCTUATION MAQAF], the Hebrew hyphen. It should behave like a hyphen.
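One way to picture this requirement is a segmenter that places the maqaf in the same character class as the Latin hyphen. The sketch below is illustrative only: the function name, the set of hyphen-like characters, and the assumption that a hyphen ends a word for selection purposes are all hypothetical, since applications differ in how they treat hyphens.

```python
import re

# Characters assumed to behave alike for selection and cursor movement:
# HYPHEN-MINUS, HYPHEN, and HEBREW PUNCTUATION MAQAF.
HYPHEN_LIKE = "\u002D\u2010\u05BE"

# A token is either one hyphen-like character or a run of other non-space characters.
_TOKEN = re.compile(
    "[" + re.escape(HYPHEN_LIKE) + "]|[^\\s" + re.escape(HYPHEN_LIKE) + "]+"
)

def segments(text):
    """Split text into word parts and hyphen-like separators (hypothetical helper)."""
    return _TOKEN.findall(text)

latin = segments("well-known")
# Two Hebrew words joined by a maqaf.
hebrew = segments("\u05D1\u05D9\u05EA\u05BE\u05E1\u05E4\u05E8")

print(latin)
print(hebrew)
```

Under these assumptions both strings split into three parts, with the hyphen or the maqaf as the middle part.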
According to the section on quotation marks (מירכאות) of the Academy of the Hebrew Language's punctuation rules, both single and double quotes are acceptable when there is no nesting. When there is nesting, different quotes are supposed to be used for the nested quote. Most of the examples in that document use double quotation marks for non-nested quotes and for the outer quote, and single quotation marks for the nested quote.
Most of the examples in the document use the characters " [U+0022 QUOTATION MARK] and ' [U+0027 APOSTROPHE]. However, it also notes the following: "In handwriting and in traditional printing the opening quotation marks are low: „–”; ‚–’ On devices that don't support typing the low quotation marks, the high quotation marks are used." Typing the low and high quotation marks is defined in the SI 1452 standard (2012 version) on the keys Alt-;, Alt-L, Alt-., and Alt-,. This standard is implemented in Windows 8, as well as in desktop Linux distributions, and in the Gboard keyboard for Android and iOS.
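The pairing described above (double marks for the outer quote, single marks for the nested quote, with low opening marks in the traditional style) can be sketched as a small lookup. The function name, the style labels, and the choice to alternate pairs at deeper nesting levels are assumptions made for illustration and are not taken from the Academy's rules.

```python
# Quote pairs by style; index 0 is the outer level, index 1 the nested level.
QUOTE_PAIRS = {
    # Low opening marks, as in handwriting and traditional printing.
    "traditional": [("\u201E", "\u201D"), ("\u201A", "\u2019")],
    # High marks typed with QUOTATION MARK and APOSTROPHE.
    "typewriter": [("\u0022", "\u0022"), ("\u0027", "\u0027")],
}

def quote(text, depth=0, style="traditional"):
    """Wrap text in the quote pair for the given nesting depth (hypothetical helper).

    Alternating pairs beyond the second level is an assumption of this sketch.
    """
    pairs = QUOTE_PAIRS[style]
    opening, closing = pairs[depth % len(pairs)]
    return opening + text + closing

inner = quote("\u05D0", depth=1)
outer = quote("\u05D1 " + inner, depth=0)
print(outer)
```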
Hebrew uses letter-spacing (tracking) to emphasise or highlight text (e.g. people's names, concepts, terms, etc.).
Drop caps are used in Hebrew occasionally. If the text uses vowel or cantillation diacritics, all the diacritics on the letter must be included in the drop cap and styled accordingly.
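To illustrate what "all the diacritics on the letter" means at the character level, the sketch below takes the first base letter together with every combining mark that follows it. It assumes Python's standard unicodedata module and uses the general category (any "M" category) to detect marks; the function name is hypothetical, and full grapheme cluster segmentation as defined by Unicode covers more cases than this.

```python
import unicodedata

def split_initial(text):
    """Return (initial, rest): the first character plus its combining marks,
    and the remainder of the text (hypothetical helper)."""
    if not text:
        return "", ""
    end = 1
    # Extend over every following character whose general category is a mark.
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return text[:end], text[end:]

# Pointed text: bet + dagesh + sheva, then resh + tsere, and so on.
word = "\u05D1\u05BC\u05B0\u05E8\u05B5\u05D0\u05E9\u05C1\u05B4\u05D9\u05EA"
initial, rest = split_initial(word)

# The initial holds the letter and both of its marks, so they are styled together.
print(len(initial), [unicodedata.name(ch) for ch in initial])
```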