mathml-core

MathML Core Explainer

Authors:

Draft specification:

MathML Core - W3C Editor’s Draft

Abstract

MathML Core defines a fundamental subset of the features described in the current MathML 3 recommendation. It attempts to resolve several problems created by MathML’s origins, history and complex status, and to rigorously define its integration into the modern Web Platform. The subset is derived from what is widely developed, deployed, proven and used in practice.

Table of Contents

Goals

Non-Goals

Background: MathML

Basic example…

The <math> element provides a standard for authors to express and work with text containing generalized relationships about mathematics, in a way very similar to how <table> does for expressing text containing relationships about tabular data.

<math>
  <mfrac>
    <msup>
      <mi>x</mi>
      <msqrt>
        <mn>5</mn>
      </msqrt>
    </msup>
    <mrow>
      <mi>α</mi>
      <mo>×</mo>
      <mn>7</mn>
    </mrow>
  </mfrac>
</math>

Figure 1: MathML/DOM for the above

Visual MathML rendering as nested boxes representing the DOM tree, with corresponding tag name annotated for each box.

What is MathML-Core?

MathML Core is an attempt to create a minimal version of MathML that is well aligned with the modern web platform. It aims to resolve long-standing issues with the split evolution of philosophies between MathML specifications and the larger web platform. As part of this aim, MathML Core creates a well-defined starting point based on what is currently widely implemented. By creating a minimal version of MathML, MathML Core has an increased focus on testability and interoperability.

The elements of MathML-Core

MathML 3 contained 195 elements; MathML-Core focuses on just 32. Several of these exist only in deprecated form, to map the elements and their attributes onto newer platform concepts that explain the actual magic, in much the same way that <font> remains in HTML. MathML-Core also provides a recommended UA stylesheet for implementations and adds a couple of new math-oriented display types.

Here is a brief rundown of what those elements are

Design Discussion

Not reinventing the wheel

Applying Extensible Web principles

The biggest design decisions centered on how to apply Extensible Web principles to our own work, because MathML occupies a unique place in history and in how it “fits” into the platform. It has existing implementations, very wide adoption and expectations, and integration through the HTML parser. At the same time, the standards that might one day expose the magic behind mathematical layout, such as the CSS Layout API and related Houdini standards, are still developing and significantly in flux.

In order to balance all of this we decided on the following:

Figure 4: Example of using CSS, JavaScript or the Layout API to enhance MathML Core with user-defined features.

<style>
  math {
     font-family: STIX Two Math;
     color: blue;
  }
  mfrac {
     border: 1px dotted;
     padding: 1em;
  }
  .myFancyMathLayout {
     display: layout(myFancyMathLayout);
  }
</style>
<math>
  <mfrac>
    <mrow class="myFancyMathLayout">
        ...
    </mrow>
    <mrow onclick="myInteractiveAction()">
        ...
    </mrow>
  </mfrac>
</math>

Considered Alternatives

Leave math reliant on SVGs and/or JavaScript libraries

Writing systems define how we share information, and mathematical notations form a fundamental aspect of writing systems. Math is text, and it is a normal part of text: mathematical notations are found in all civilizations. They have been instrumental throughout history for the diffusion and development of scientific and technical knowledge. The need for browsers to natively render this kind of text was evident from the earliest days of the Web at CERN. We believe that, according to the W3C TAG’s Ethical Web Principles, it is not good for the Web, for the directly impacted communities of authors, or ultimately for society to specially disadvantage such an important aspect of communication.

Abandon MathML in favor of some new thing

There are numerous criticisms of MathML. As with other aspects of the existing platform, more succinct forms of expression exist that many authors are more comfortable writing (e.g. the linear text syntax used in LaTeX or computer algebra systems). As with other aspects of the platform, it is also possible to be more semantic than MathML currently is.

A few things don’t change, though, and among them is the difficulty of rendering mathematical formulas interoperably and with good quality. Abandoning MathML would reject an entire ecosystem and decades of work in standardization and advancement, with little hope that the current state would change in any reasonable timeframe. This would be tragic: we don’t generally require authors to use complex libraries in order to lay out text, or recommend that text be inserted as images. We believe that getting native math rendering is the right thing to do and that a tree is good.

Why a tree is good…

Trees of text relationships aren’t the most succinct or easy-to-type way to express things. However, this is true of all HTML too. That’s why a lot of HTML is generated from simpler forms like Markdown, or by tools like rich text editors and templating systems. A rich ecosystem of tooling has been developed over many years for generating and editing MathML too.

But expressing the content is only part of the challenge and the platform is heavily oriented toward solving these problems via just such a tree. Many benefits flow naturally from simply matching the platform here and expressing mathematics as a standard tree of relationships:

Building atop…

Given these abilities and this approach, building additional semantics, extensions, conversions and further explorations on top becomes very plausible. It is even entirely plausible to support shorthand expansion from forms like LaTeX or ASCII Math, in much the same way we can for Markdown. Patterns for extending shorthand notations like these are a common class of problem that should be well explored; even if such a shorthand were ever supported natively, it would probably still be rendered into a Shadow tree.

Figure 5: LaTeX source in a custom element rendered using MathML in a shadow DOM, with the Latin Modern Math font. From top to bottom: Blink (Igalia’s build), WebKit (r249360) and Gecko (Firefox 68)

<la-tex>
  {\Gamma(t)}
  = {\int_{0}^{+\infty} x^{t-1} e^{-x} dx}
  = {\frac{1}{t}
     \prod_{n=1}^\infty
     \frac{\left(1+\frac{1}{n}\right)^t}{1+\frac{t}{n}}}
  \sim {\sqrt{\frac{2\pi}{t}} \left(\frac{t}{e}\right)^t}
</la-tex>

Screenshot of a MathML formula in different browsers.
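
The pattern behind Figure 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the figure: the element name simple-frac, the toy "a/b" shorthand, and the helper functions parseShorthand, token, toMarkup and escapeText are all hypothetical, and a real LaTeX converter needs a full parser. The sketch also assumes the script runs after the element's content has been parsed (for example, from a script at the end of the body), so that textContent is already available.

```javascript
// Hypothetical toy shorthand: "a/b" becomes a fraction, anything else is
// treated as a single token. This is NOT LaTeX.
function parseShorthand(source) {
  const parts = source.trim().split("/");
  if (parts.length === 2 && parts[0].trim() && parts[1].trim()) {
    return { type: "mfrac", children: parts.map((p) => token(p.trim())) };
  }
  return token(source.trim());
}

// Plain numbers become <mn>; everything else becomes <mi>.
function token(text) {
  return { type: /^\d+(\.\d+)?$/.test(text) ? "mn" : "mi", text };
}

// Escape characters that would otherwise be parsed as markup.
function escapeText(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize the small tree above into MathML markup.
function toMarkup(node) {
  const inner = node.children
    ? node.children.map(toMarkup).join("")
    : escapeText(node.text);
  return `<${node.type}>${inner}</${node.type}>`;
}

// Browser only: expand the shorthand into MathML inside a shadow tree.
// The guard lets the pure functions above be used outside a browser too.
if (typeof customElements !== "undefined") {
  customElements.define("simple-frac", class extends HTMLElement {
    connectedCallback() {
      const root = this.shadowRoot || this.attachShadow({ mode: "open" });
      root.innerHTML = `<math>${toMarkup(parseShorthand(this.textContent))}</math>`;
    }
  });
}
```

With this in place, markup such as <simple-frac>x/2</simple-frac> would be rendered by the browser's native MathML support, while the light DOM keeps the author's original shorthand.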

Focus instead solely on lacking primitives

A big part of the challenge of focusing on lacking primitives is that it leaves open the question of what is lacking. The main proposals for what to focus on concern additional semantics, ‘stretchy characters’ and complex alignments. While we agree that these are all excellent goals, we believe that they are also very independently pursuable, and that both causes are boosted by doing so.

However, without also providing a detailed layout specification, pursuing native rendering in all browsers, or performing interoperability tests, it becomes very hard to design a full browser-compatible math rendering implementation and to introduce the necessary web platform primitives. Thus we again relegate ourselves to the current state of one of the hardest problems in a way that we don’t for other forms of text.

Enhance MathML3 but keep all or most features

Another approach would be to integrate the TeX/OpenType and HTML5/CSS improvements while preserving all or most features from MathML3. We discarded this approach for several reasons:

A Note on “Legacy Compat” and “following the platform”

MathML-Core aims to always follow the platform. Where things are noted as included for “Legacy Compat” this refers to the fact that existing, legacy patterns are mapped to their platform counterparts and deprecated. Any support for legacy/coevolutionary (deprecated) attributes just maps to existing platform solutions in the same way that legacy (deprecated) attributes like border might be supported in HTML via a UA style rule. In other words, even supporting the legacy border attribute would be explained by CSS, the box model and the border properties. We’ve chosen to only recommend supporting these kinds of mappings for things which are actually in wide use and only so that existing content keeps working while we expose (functional) platform answers.
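
The idea that a legacy attribute is “explained by CSS” can be illustrated with a small sketch. The function name legacyAttributesToCss and the lookup table are hypothetical, and the three attribute-to-property pairs are an assumption based on our reading of the draft (mathcolor, mathbackground and mathsize as hints for color, background-color and font-size); check them against the specification before relying on them. A real user agent would apply such mappings internally rather than through script.

```javascript
// Assumed mapping from legacy MathML style attributes to the CSS properties
// that explain them; verify the pairs against the MathML Core draft.
const LEGACY_ATTRIBUTE_TO_PROPERTY = new Map([
  ["mathcolor", "color"],
  ["mathbackground", "background-color"],
  ["mathsize", "font-size"],
]);

// Turn an element's legacy attributes into plain CSS declarations.
// Attributes without a known mapping are ignored.
function legacyAttributesToCss(attributes) {
  return Object.entries(attributes)
    .filter(([name]) => LEGACY_ATTRIBUTE_TO_PROPERTY.has(name))
    .map(([name, value]) => `${LEGACY_ATTRIBUTE_TO_PROPERTY.get(name)}: ${value};`)
    .join(" ");
}

// Example: <mi mathcolor="red" mathsize="2em"> carries no behavior of its
// own; it is equivalent to these ordinary CSS declarations.
console.log(legacyAttributesToCss({ mathcolor: "red", mathsize: "2em" }));
```

The point of the design is that nothing in the table introduces a new rendering concept: each legacy attribute is only a second spelling of something the platform already defines.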

Stakeholder Feedback