Default styling for multilingual quotes & quotation marks in HTML

This page summarises some findings related to the use of quotation marks around the q element in HTML5 when dealing with multilingual text. It is concerned with the default behaviour, when no CSS styling is applied.

Note: the actual quotation marks used on this page are immaterial to the discussion. They were simply chosen to make it possible to see the actual issue, which is that the choice of when to use particular language-specific characters (whatever they are), especially in multilingual quotations, doesn't produce what people expect.

The problem

Different characters are used for quotation marks in different languages.

Opinions from people on the public-digipub and www-international mailing lists point to a desire for quotation marks to remain in the format appropriate to the language of the text which lies outside the principal quotation, rather than to change according to the language of the text inside the quotation. This also means that there is no linguistically-sensitive change to quotation marks used for quotations within quotations.

This is not currently what browsers produce if they follow the HTML5 specification. The HTML5 specification chooses quotation marks on the basis of the language of the quote that they surround.

Use case examples

For each of the 8 use cases below, the top line shows the markup of the text. The second line shows what respondents on the lists expected to see. The third line shows what browsers produce by default when they follow the rendering section of the HTML5 specification. The fourth and final line shows what the browser you are currently using produces by default (the paragraph has lang="fr-CA" added).

All examples assume that the html tag has a lang attribute with the value fr-CA.

  1. <p>Mais Lucy répond: <q>Embrassez George de ma part</q></p>

    Expected: Mais Lucy répond: « Embrassez George de ma part »

    HTML5: Mais Lucy répond: « Embrassez George de ma part »

    Your browser: Mais Lucy répond: Embrassez George de ma part

  2. <p>Mais Lucy répond: <q lang="en">Give George my love</q></p>

    Expected: Mais Lucy répond: « Give George my love »

    HTML5: Mais Lucy répond: “Give George my love”

    Your browser: Mais Lucy répond: Give George my love

  3. <p>Mais Lucy répond: <q>Embrassez George de ma part – seulement une fois. Dites-lui, <q>Embrouille</q></q></p>

    Expected: Mais Lucy répond: « Embrassez George de ma part et dites-lui, ‹ Embrouille › »

    HTML5: Mais Lucy répond: « Embrassez George de ma part et dites-lui, ‹ Embrouille › »

    Your browser: Mais Lucy répond: Embrassez George de ma part et dites-lui, Embrouille

  4. <p>Mais Lucy répond: <q lang="en">Give George my love and tell him, <q>Muddle</q></q></p>

    Expected: Mais Lucy répond: « Give George my love and tell him, ‹ Muddle › »

    HTML5: Mais Lucy répond: “Give George my love and tell him, ‘Muddle’”

    Your browser: Mais Lucy répond: Give George my love and tell him, Muddle

  5. <p>Mais Lucy répond: <q>Embrassez George de ma part – seulement une fois. Dites-lui, <q lang="en">Muddle</q></q></p>

    Expected: Mais Lucy répond: « Embrassez George de ma part et dites-lui, ‹ Muddle › »

    HTML5: Mais Lucy répond: « Embrassez George de ma part et dites-lui, ‘Muddle’ »

    Your browser: Mais Lucy répond: Embrassez George de ma part et dites-lui, Muddle

  6. <p>Mais Lucy répond: <q lang="en">Give George my love – once only. Tell him, <q lang="fr-CA">Embrouille</q></q></p>

    Expected: Mais Lucy répond: « Give George my love and tell him, ‹ Embrouille › »

    HTML5: Mais Lucy répond: “Give George my love and tell him, ‹ Embrouille ›”

    Your browser: Mais Lucy répond: Give George my love and tell him, Embrouille

  7. <p>Mais Lucy répond: <span lang="en">Give George my love – once only. Tell him, <q lang="fr-CA">Embrouille</q></span></p>

    Expected: Mais Lucy répond: Give George my love and tell him, « Embrouille »

    HTML5: Mais Lucy répond: Give George my love and tell him, « Embrouille »

    Your browser: Mais Lucy répond: Give George my love and tell him, Embrouille

  8. <p>Mais Lucy répond: <span lang="en">Give George my love – once only. Tell him, <q>Muddle</q></span></p>

    Expected: Mais Lucy répond: Give George my love and tell him, « Muddle »

    HTML5: Mais Lucy répond: Give George my love and tell him, “Muddle”

    Your browser: Mais Lucy répond: Give George my love and tell him, Muddle

There is a page that shows results of a series of tests for default handling of the q element in browsers. That page has links to the tests themselves (left column in the table).

Alternative solutions

The current default styling in the HTML5 spec is:

:root:lang(en), :not(:lang(en)) > :lang(en) { quotes: '\201c' '\201d' '\2018' '\2019' } /* “ ” ‘ ’ */

:root:lang(fr-CA), :not(:lang(fr-CA)) > :lang(fr-CA) { quotes: '\00ab' '\00bb' '\2039' '\203a' } /* « » ‹ › */

Solution A

One solution, proposed by Florian Rivoal, is to replace that with the following:

:lang(en) > *:not(q *) { quotes: "“" "”" "‘" "’" }

:lang(fr-CA) > *:not(q *) { quotes: "« " " »" "‹ " " ›" }

That CSS produces the expected effect, but Chrome, Firefox, and Edge don't support the selectors. Only Safari produces a result.
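Given the uneven selector support noted above, one way to experiment with Solution A without affecting other browsers is to guard it with a feature query. The following is only a minimal sketch and not part of the proposal. It is written as if it were the default stylesheet, it assumes the browser understands @supports selector(), and the root-only fallback rules are an assumption added here for illustration.

```css
/* Fallback (an assumption for this sketch): take the quotes from the
   language of the root element only. */
:root:lang(en) { quotes: "“" "”" "‘" "’"; }
:root:lang(fr-CA) { quotes: "« " " »" "‹ " " ›"; }

/* Apply the Solution A rules only where the browser can parse :not(q *).
   Browsers that do not understand the feature query skip the whole block. */
@supports selector(:not(q *)) {
  :lang(en) > *:not(q *) { quotes: "“" "”" "‘" "’"; }
  :lang(fr-CA) > *:not(q *) { quotes: "« " " »" "‹ " " ›"; }
}
```

If these rules are tried as author CSS rather than as default styling, a browser's built-in per-language rules, where present, would still apply to any element they match directly, so the fallback may not fully take effect.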

There is also a difference in item 8 above. If the language change occurs higher up the hierarchy on an element other than q, the quotes used will be appropriate to that language. This is quite useful for multilingual documents, where the html tag may not reflect the reader's language for a particular part of the document. It does, however, also mean that examples like #8 (which are likely to be rare), quotations within blockquotes, and so on will by default use the marks appropriate to the newly set language. This may not be a significant issue.

[there also seems to be a problem for #5]

Here is what it looks like in your browser.

  1. Your browser: Mais Lucy répond: Embrassez George de ma part

  2. Your browser: Mais Lucy répond: Give George my love

  3. Your browser: Mais Lucy répond: Embrassez George de ma part et dites-lui, Embrouille

  4. Your browser: Mais Lucy répond: Give George my love and tell him, Muddle

  5. Your browser: Mais Lucy répond: Embrassez George de ma part et dites-lui, Muddle

  6. Your browser: Mais Lucy répond: Give George my love and tell him, Embrouille

  7. Your browser: Mais Lucy répond: Give George my love and tell him, Embrouille

  8. Your browser: Mais Lucy répond: Give George my love and tell him, Muddle

Solution B

An alternative solution would be to simply use:

:root:lang(fr-CA) { quotes: "« " " »" "‹ " " ›" }

:root:lang(en) { quotes: "“" "”" "‘" "’" }

This would require the content author to use a lang attribute on the html tag (which they should do anyway). For a multilingual document, however, they would need to add their own style rule to set the quotes for the alternative language. Multilingual quotes of the type shown above would produce the expected result (i.e. only the quotation marks for the language indicated in the html tag would be used).
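The author-supplied rule mentioned above for a multilingual document might look like the following minimal sketch. The section element and the choice of English are assumptions for illustration; any container whose text is in another language would do.

```css
/* Default under Solution B: quotes follow the language declared on the html tag. */
:root:lang(fr-CA) { quotes: "« " " »" "‹ " " ›"; }

/* Hypothetical author rule: an English-language section uses English marks.
   Targeting section rather than every :lang(en) element means a
   <q lang="en"> inside French text is not matched and keeps the French marks. */
section:lang(en) { quotes: "“" "”" "‘" "’"; }
```

Because the quotes property is inherited, every q element inside such a section picks up the English marks, whatever the language of the quoted text itself.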