MNX Draft Specification

Living Document,

Issue Tracking:
GitHub
Inline In Spec
Editor:
Joe Berkovitz (Risible LLC)
Participate:
File an issue (open issues)
Not Ready For Implementation

This spec is not yet ready for implementation. It exists in this repository to record the ideas and promote discussion.

Before attempting to implement this spec, please contact the editors.


Abstract

A draft specification for the MNX music notation document format. MNX is a proposed music notation markup standard. Its aim is to improve MusicXML in fundamental ways, while retaining many of its key concepts, terms and features. MNX seeks to provide a high degree of interoperability and exchange between different applications working with music notation.

1. Status of this document

This document is an early draft and focuses on the main structural features that distinguish MNX from MusicXML 3.1. Many elements are omitted, or their details are referred to the MusicXML specification as a placeholder.

2. Introduction

2.1. Background

This section is non-normative.

MNX is a proposed music notation markup standard, which seeks to provide a high degree of interoperability and exchange between different applications working with music notation.

Many different sources of inspiration inform the design of MNX including MusicXML, the Music Encoding Initiative, the IEEE 1599 specification, and others.

2.2. MNX score types

This section is non-normative.

MNX can support multiple score types. Each score type is a specific encoding that applies to some portion of a container document.

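The present draft of this specification deals with two score types: MNX-Common (§5), a semantic encoding of Conventional Western Music Notation, and MNX-Generic (§6), a literal encoding of an arbitrary graphical score in conjunction with audio and performance data.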

2.3. Comparisons with other notation standards

This section is non-normative.

The MNX-Common score type of MNX is a lineal descendant of MusicXML, and employs many of the same concepts. However it sacrifices some features and flexibility of MusicXML in favor of tighter interoperability, and simplifies the element structure considerably. MNX-Common also moves all non-semantic information into CSS properties.

MEI is a very general and expressive medium for encoding arbitrary musical documents, with particular attention to the needs of scholars. Due to its extreme plasticity, MEI is perhaps better described as a powerful framework for building customized documents and applications, than as a single encoding method. As such, interoperability has not been a main goal of MEI to date. However there are efforts underway to define a clean MEI subset as an interoperable medium for encoding CWMN (sometimes known as "MEI Go").

IEEE 1599 is a specification that has paid unique attention to the relationships between different layers of musical information. Its Logic layer is similar in content to MNX-Common, while its Notational, Performance and Audio layers answer some of the same concerns as MNX-Generic. MNX-Generic takes a different approach to connecting these layers, and does not attempt to fully unify semantic information with visual and performance data. It relies to a greater degree on SVG, and to a lesser degree on MIDI.

2.4. Compatibility with MusicXML

This section is non-normative.

MNX uses MusicXML as a point of departure in many ways, but it does not attempt to be backward-compatible with MusicXML, nor is it a superset of MusicXML. However, a large proportion of MusicXML markup is expected to be preserved. In these examples, MusicXML constructs are used freely throughout as a way to show how proposed new concepts dovetail with existing ones.

Backward compatibility aside, it is a goal to be able to machine-translate MusicXML into MNX. This is essential for migration purposes.

2.5. Use cases

This section is non-normative.

A companion document details a set of known use cases for music notation.

2.6. Audience

This section is non-normative.

This specification is intended for authors of documents and applications that use the features defined in this specification, implementors of tools that operate on documents that use the features defined in this specification, and individuals wishing to establish the correctness of documents or implementations with respect to the requirements of this specification.

This document is probably not suited to readers who do not already have at least a passing familiarity with XML technologies. In places it sacrifices clarity for precision, and brevity for completeness. More approachable tutorials and authoring guides can provide a gentler introduction to the topic.

2.7. Design notes

This section is non-normative.

Some general principles regarding the design of this specification follow.

Make schematic form follow function.
The schema of MNX tries where possible to let the constraints of an element hierarchy serve a useful purpose, by embodying analogous constraints in music notation. For example, the sequence and event elements force a musical voice in CWMN to follow conventional rules for avoiding temporal overlap.
Preserve ease of reading and writing for simple content.
While complex scores will necessarily be generated and parsed by machines, it’s valuable to allow humans to easily create and read simple content. Therefore in some cases, encodings are intentionally more compact than strictly necessary. Examples in MNX include microsyntaxes and the use of XML attributes instead of elements for the most frequent properties.
Address both literal encoding and semantic encoding.
MNX includes two separate approaches to encoding music: high-level semantic encodings described by MNX-Common (and other future modules), and low-level literal encodings described by MNX-Generic. The literal encoding attempts to eliminate cultural and semantic assumptions within its scope, while still allowing linkage between the literal and semantic layers.
Be specific about what is valid and what is not.
MNX attempts to keep things constrained in its semantic layers, while the literal encoding is wide-open. While this constrains some potential expression towards the edges of CWMN, it enhances interoperability at the core.
Address culturally specific needs in a modular fashion
While much of the MNX specification addresses common Western music notation, nothing in the specification prevents the development of additional modules targeting other notation systems at a semantic level, or from taking other semantic approaches to Western music. The MNX-Generic module avoids cultural specificity due to its literal focus.
Separate semantic concerns from presentation/interpretation concerns.
Within its semantic encoding for MNX-Common, this specification strives to keep semantic descriptions from answering to the multitude of tiny features that control presentation and performance interpretation. These are segregated in the parallel domains of style properties and interpretation content.
Allow semantic encodings to "tunnel through" to the literal encoding.
A semantic module such as MNX-Common cannot supply all known information about rendering and performance, in cases where this knowledge lives outside the semantic markup. MNX allows semantic layers to "tunnel" through to employ the literal constructs of MNX-Generic, allowing the same primitives to describe both entire scores at a literal level, and those fragments of a semantic score that require an embedded, literal description.
Leverage existing value in the world
The ecosystem of the Web is broad and valuable. MNX attempts to exploit this by making use of existing patterns and tooling. Examples include the reuse of many CSS concepts, and the ability to employ completely standard SVG documents within MNX-Generic without need of alteration.

2.7.1. Extensibility

This section is non-normative.

Content TBD

2.8. Structure of this specification

This section is non-normative.

This specification is divided into the following major sections:

§2 Introduction

Non-normative materials providing a context for the MNX specification.

§3 Infrastructure

Scaffolding material on which the remainder of the specification relies

§4 MNX-Container

The high-level structure of MNX which organizes a hierarchy of musical resources in a document

§5 MNX-Common

A schema describing a musical score in Conventional Western Music Notation.

§6 MNX-Generic

A schema describing an arbitrary graphical score in conjunction with audio and performance data.

2.8.1. How to read this specification

As described in the conformance requirements section below, this specification describes conformance criteria for a variety of conformance classes. In particular, there are conformance requirements that apply to producers, for example authors and the documents they create, and there are conformance requirements that apply to consumers, for example Web browsers. They can be distinguished by what they are requiring: a requirement on a producer states what is allowed, while a requirement on a consumer states how software is to act.

For example, "the foo attribute’s value must be a valid integer" is a requirement on producers, as it lays out the allowed values; in contrast, the requirement "the foo attribute’s value must be parsed using the rules for parsing integers" is a requirement on consumers, as it describes how to process the content.

Requirements on producers have no bearing whatsoever on consumers.

2.8.2. Typographic conventions

This is a definition, requirement, or explanation.

This is a note.

This is an example.

This is an open issue.

This is a warning.

/* this is a CSS fragment */

The defining instance of a term is marked up like this. Uses of that term are marked up like this or like this.

The defining instance of an element, attribute, or API is marked up like this. References to that element, attribute, or API are marked up like this.

Other code fragments are marked up like this.

Byte sequences with bytes in the range 0x00 to 0x7F, inclusive, are marked up like this.

Variables are marked up like this.

In some cases, requirements are given in the form of lists with conditions and corresponding requirements. In such cases, the requirements that apply to a condition are always the first set of requirements that follow the condition, even in the case of there being multiple sets of conditions for those requirements. Such cases are presented as follows:

This is a condition
This is another condition
This is the requirement that applies to the conditions above.
This is a third condition
This is the requirement that applies to the third condition.

2.9. Suggested reading

This section is non-normative.

The following documents might be of interest to readers of this specification.

3. Infrastructure

3.1. Terminology

3.1.1. Notational idioms

A notational idiom is a set of rules in the world for encoding music as some set of visual markings, which can be interpreted by musicians to produce an audible performance.

3.1.1.1. Conventional Western music notation (CWMN)

This notational idiom comprises a set of notational rules common to (but not limited to) Western European music from circa 1600 to the present day.

3.1.2. Score profiles

A score profile is a set of constraints on the rules in a notational idiom. Score profiles are designed to narrow the set of constructs that can be produced or consumed in MNX to a practical scope.

3.2. Common syntaxes

There are various places in MNX that accept particular data types, such as note values, numbers or durations. This section describes the conformance criteria for content in those formats, and how to parse them.

3.2.1. Common parser idioms

The space characters, for the purposes of this specification, are U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR).

The White_Space characters are those that have the Unicode property "White_Space" in the Unicode PropList.txt data file. [UNICODE]

This should not be confused with the "White_Space" value (abbreviated "WS") of the "Bidi_Class" property in the Unicode UnicodeData.txt data file.

The control characters are those whose Unicode "General_Category" property has the value "Cc" in the Unicode UnicodeData.txt data file. [UNICODE]

The uppercase ASCII letters are the characters in the range U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL LETTER Z.

The lowercase ASCII letters are the characters in the range U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL LETTER Z.

The ASCII letters are the characters that are either uppercase ASCII letters or lowercase ASCII letters.

The ASCII digits are the characters in the range U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9).

The alphanumeric ASCII characters are those that are either uppercase ASCII letters, lowercase ASCII letters, or ASCII digits.

The ASCII hex digits are the characters in the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9), U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F, and U+0061 LATIN SMALL LETTER A to U+0066 LATIN SMALL LETTER F.

The uppercase ASCII hex digits are the characters in the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F only.

The lowercase ASCII hex digits are the characters in the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and U+0061 LATIN SMALL LETTER A to U+0066 LATIN SMALL LETTER F only.

Some of the micro-parsers described below follow the pattern of having an input variable that holds the string being parsed, and having a position variable pointing at the next character to parse in input.

For parsers based on this pattern, a step that requires the consumer to collect a sequence of characters means that the following algorithm must be run, with characters being the set of characters that can be collected:

  1. Let input and position be the same variables as those of the same name in the algorithm that invoked these steps.

  2. Let result be the empty string.

  3. While position doesn’t point past the end of input and the character at position is one of the characters, append that character to the end of result and advance position to the next character in input.

  4. Return result.

The step skip white space means that the consumer must collect a sequence of characters that are space characters. The collected characters are not used.

When a consumer is to strip line breaks from a string, the consumer must remove any U+000A LINE FEED (LF) and U+000D CARRIAGE RETURN (CR) characters from that string.

When a consumer is to strip leading and trailing white space from a string, the consumer must remove all space characters that are at the start or end of the string.

When a consumer is to strip and collapse white space in a string, it must replace any sequence of one or more consecutive space characters in that string with a single U+0020 SPACE character, and then strip leading and trailing white space from that string.
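
The idioms above can be sketched in Python as follows. This is a minimal illustration, not a normative implementation: the function names are hypothetical, and a position is modeled as an integer index that each helper returns rather than as a shared variable.

```python
import re

# The five "space characters" named by this specification:
# U+0020, U+0009, U+000A, U+000C, U+000D.
SPACE_CHARACTERS = " \t\n\f\r"


def collect_sequence(text, position, characters):
    """Collect characters starting at `position` while they are in `characters`.

    Returns the collected string and the advanced position.
    """
    result = []
    while position < len(text) and text[position] in characters:
        result.append(text[position])
        position += 1
    return "".join(result), position


def skip_white_space(text, position):
    """Advance past space characters; the collected characters are not used."""
    _, position = collect_sequence(text, position, SPACE_CHARACTERS)
    return position


def strip_line_breaks(text):
    """Remove every LF and CR character."""
    return text.replace("\n", "").replace("\r", "")


def strip_leading_and_trailing_white_space(text):
    """Remove space characters at the start and end of the string."""
    return text.strip(SPACE_CHARACTERS)


def strip_and_collapse_white_space(text):
    """Collapse runs of space characters to one U+0020, then strip the ends."""
    collapsed = re.sub(r"[ \t\n\f\r]+", " ", text)
    return strip_leading_and_trailing_white_space(collapsed)
```

Note that the set of space characters here is deliberately narrower than Unicode White_Space, which is why the sketch passes an explicit character set instead of relying on the default behavior of `str.strip()`.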

When a consumer has to strictly split a string on a particular delimiter character delimiter, it must use the following algorithm:

  1. Let input be the string being parsed.

  2. Let position be a pointer into input, initially pointing at the start of the string.

  3. Let tokens be an ordered list of tokens, initially empty.

  4. While position is not past the end of input:

    1. Collect a sequence of characters that are not the delimiter character.

    2. Append the string collected in the previous step to tokens.

    3. Advance position to the next character in input.

  5. Return tokens.

For the special cases of splitting a string on spaces and on commas, this algorithm does not apply (those algorithms also perform white space trimming).
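
A minimal Python sketch of the strict splitting steps follows. The function name `strictly_split` is illustrative only and does not come from this specification.

```python
def strictly_split(text, delimiter):
    """Split `text` on a single delimiter character, following the steps above."""
    tokens = []
    position = 0
    while position < len(text):
        # Collect a sequence of characters that are not the delimiter.
        start = position
        while position < len(text) and text[position] != delimiter:
            position += 1
        tokens.append(text[start:position])
        # Advance past the delimiter (or past the end of the input).
        position += 1
    return tokens
```

Read literally, the steps do not produce an empty token after a trailing delimiter, so `strictly_split("a;", ";")` gives `["a"]`, whereas Python's built-in `"a;".split(";")` gives `["a", ""]`. Empty tokens between adjacent delimiters are preserved.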

3.2.2. Numbers

3.2.2.1. Rational numbers

A string is a rational number if it is either an integer, or a pair of integers separated by a U+002F SLASH whose second element is nonzero.

The rules for parsing rational numbers are as given in the following algorithm. When invoked, the steps must be followed in the order given, aborting at the first step that returns a value. This algorithm will return a pair of integers, one for the numerator and one for the denominator which must be nonzero, or an error.

  1. Let input be the string being parsed.

  2. Let position be a pointer into input, initially pointing at the start of the string.

  3. Let fraction be an initially empty list of integers.

  4. Collect a sequence of characters that are space characters. These are skipped.

  5. While position is not past the end of input, and fraction contains fewer than two elements:

    1. Collect a sequence of characters that are not space characters, ASCII digits, U+002D HYPHEN-MINUS or U+002F SLASH characters. This skips past leading garbage.

    2. Collect a sequence of characters that are not space characters or U+002F SLASH, and let unparsed number be the result.

    3. Let number be the result of parsing unparsed number using the rules for parsing signed integers.

    4. If number is an error, set number to zero.

    5. Append number to fraction.

    6. Collect a sequence of characters that are space characters, or U+002F SLASH.

  6. If fraction has no elements, return zero.

  7. If fraction has only one element, append 1 to fraction.

  8. Return the first element of fraction as the numerator and the second element of fraction as the denominator.
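
The steps above might be sketched in Python as shown below. Several details are assumptions rather than requirements of this specification: the names are hypothetical, `parse_signed_integer` is a stand-in for the rules for parsing signed integers (which are not reproduced in this section), the "return zero" of step 6 is represented as the pair (0, 1), and a zero denominator is reported as an error because the listed steps do not say where that error is raised.

```python
import re

SPACE_CHARACTERS = " \t\n\f\r"


def parse_signed_integer(text):
    """Assumed stand-in: optional sign then ASCII digits; None signals an error."""
    match = re.match(r"[-+]?[0-9]+", text)
    return int(match.group()) if match else None


def parse_rational(text):
    """Return (numerator, denominator) following the numbered steps above."""
    position = 0
    fraction = []

    def collect(accepts):
        # Collect characters while `accepts(character)` holds.
        nonlocal position
        start = position
        while position < len(text) and accepts(text[position]):
            position += 1
        return text[start:position]

    collect(lambda c: c in SPACE_CHARACTERS)                      # step 4
    while position < len(text) and len(fraction) < 2:             # step 5
        # 5.1: skip leading garbage.
        collect(lambda c: c not in SPACE_CHARACTERS and c not in "0123456789-/")
        # 5.2: take everything up to the next space or slash.
        unparsed = collect(lambda c: c not in SPACE_CHARACTERS and c != "/")
        number = parse_signed_integer(unparsed)                   # 5.3
        if number is None:                                        # 5.4
            number = 0
        fraction.append(number)                                   # 5.5
        collect(lambda c: c in SPACE_CHARACTERS or c == "/")      # 5.6
    if not fraction:
        return (0, 1)  # step 6, with zero represented as 0/1 (assumption)
    if len(fraction) == 1:
        fraction.append(1)                                        # step 7
    numerator, denominator = fraction
    if denominator == 0:
        # Assumption: the steps do not state where this error is raised.
        raise ValueError("denominator must be nonzero")
    return numerator, denominator
```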

3.2.3. Element locations

An element location constitutes a reference to a specific element in the document. It consists of the character #, immediately followed by the XML ID of the referenced element.

3.2.4. Style property lists

MNX supports a simple and compact style property list syntax, allowing a map of key-value pairs to be represented in a single string where the keys are names of style properties.

To parse a style property list:

  1. Let input be the string being parsed.

  2. Let defs be the result of strictly splitting the string input using U+003B SEMICOLON as a delimiter.

  3. Let properties be an empty map.

  4. While defs is not empty,

    1. Let definition be the first element of defs, and remove it from defs.

    2. Collect a sequence of characters from definition that are not U+003A COLON, and let property name be the result after stripping leading and trailing white space.

    3. If property name is empty, return an error.

    4. If the next character of definition is not U+003A COLON, return an error.

    5. Skip the next character of definition.

    6. Let property value be the remaining characters of definition, after stripping leading and trailing white space.

    7. Add a new entry to properties with key property name and value property value.

  5. Return properties.

Examples include:

color: red

A definition of the property color as having the value red.

color: green;

A definition of the property color as having the value green. Note that a terminal ; is provided in this case, but has no effect.

smufl-font: Bravura; color: red;

A definition of two properties: smufl-font with value Bravura, and color with value red.
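
The parsing steps and the examples above can be exercised with a short Python sketch. The function names are illustrative, errors are modeled as `ValueError`, and the strict splitting helper is repeated so that the block is self-contained.

```python
SPACE_CHARACTERS = " \t\n\f\r"


def strictly_split(text, delimiter):
    """Strict split as described in the common parser idioms."""
    tokens = []
    position = 0
    while position < len(text):
        start = position
        while position < len(text) and text[position] != delimiter:
            position += 1
        tokens.append(text[start:position])
        position += 1
    return tokens


def parse_style_property_list(text):
    """Parse 'name: value; name: value' into a dict, raising ValueError on error."""
    properties = {}
    for definition in strictly_split(text, ";"):
        # Everything before the first colon is the property name.
        name_part, colon, value_part = definition.partition(":")
        property_name = name_part.strip(SPACE_CHARACTERS)
        if not property_name:
            raise ValueError("empty property name")
        if not colon:
            raise ValueError("missing ':' after property name")
        properties[property_name] = value_part.strip(SPACE_CHARACTERS)
    return properties
```

For instance, `parse_style_property_list("smufl-font: Bravura; color: red;")` yields a map with the two entries described above. Under a literal reading of the steps, white space after a final semicolon (as in `"color: red; "`) forms a definition with an empty property name and is therefore an error.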

3.3. Content models and categories

Each element in MNX falls into zero or more categories that group elements with similar characteristics together. Examples of content categories include event content and sequence content, among many others.

3.3.1. Element definitions

Each element in this specification has a definition that includes the following information:

Contexts

A non-normative description of where the element can be used. This information is redundant with the content models of elements that allow this one as a child, and is provided only as a convenience.

Content model

A normative description of what content must be included as children and descendants of the element.

Attributes

A normative list of attributes that may be specified on the element (except where otherwise disallowed), along with non-normative descriptions of those attributes. (The content to the left of the dash is normative, the content to the right of the dash is not.)

Style properties

A normative list of style properties that may be specified on the element (except where otherwise disallowed), along with non-normative descriptions of those properties. Where these properties may be inherited from ancestor elements, this is indicated.

This is then followed by a description of what the element represents, along with any additional normative conformance criteria that may apply to producers and consumers and implementations. Examples are sometimes also included.

4. MNX-Container

Each MNX document acts as a container document, which contains a hierarchy of components which collectively make up the document as a whole.

4.1. Structural Elements

4.1.1. The mnx element

Contexts:
None: this is the top-level element.
Content Model:
A single, required head element.
Either a collection or a score element.
Attributes:
None.

The mnx element encloses an MNX document as a whole.

<mnx xmlns="http://www.w3.org/mnx">
    <head>
      ...head content...
    </head>
    <score>
      ...musical body content...
    </score>
</mnx>

4.1.2. The head element

Contexts:
Any.
Content Model:
Metadata content.
stylesheet
Attributes:
None.

The head element supplies overall descriptive information for an MNX document, such as document-scoped metadata or stylesheet definitions.

4.1.3. The collection element

Contexts:
mnx, collection
Content Model:
Any combination of collection and score elements.
Attributes:
type - The type of the collection

The collection element describes a collection, which is a sequence of ordered elements that make up a compound musical document. Each child element of the collection may itself be either a collection or a score.

The type attribute determines the nature of the collection. Valid collection type values include:

movements
Each element comprises a movement of a work.
sections
Each element comprises a section of a work, or of a movement.
parts
Each element comprises a description of the same music, organized for different parts.

Metadata content or style properties may be included at any level of the resulting structure, causing them to apply only to those parts of the document.

The following example shows a hierarchy of collections and scores.

    <collection type="sections">
        <score>
            <title>Section 1 (for Flute and Cello)</title>
            <mnx-common>...</mnx-common>
        </score>
        <collection type="movements">
            <title>Section 2</title>
            <score>
                <title>Section 2, Movement 1 (for Solo Flute)</title>
                <mnx-common>...</mnx-common>
            </score>
            <score>
                <title>Section 2, Movement 2 (for Solo Cello)</title>
                <mnx-common>...</mnx-common>
            </score>
        </collection>
        <score>
            <title>Section 3 (for Flute and Cello)</title>
            <mnx-common>...</mnx-common>
        </score>
    </collection>
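
As an illustration of how a consumer might traverse such a hierarchy, the following Python sketch walks the example above with the standard library's `xml.etree.ElementTree`. The `outline` function is hypothetical, the musical bodies are reduced to empty `mnx-common` elements, and the fragment is assumed to carry no XML namespace; a complete document with a namespace declaration on the mnx element would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<collection type="sections">
    <score>
        <title>Section 1 (for Flute and Cello)</title>
        <mnx-common/>
    </score>
    <collection type="movements">
        <title>Section 2</title>
        <score>
            <title>Section 2, Movement 1 (for Solo Flute)</title>
            <mnx-common/>
        </score>
        <score>
            <title>Section 2, Movement 2 (for Solo Cello)</title>
            <mnx-common/>
        </score>
    </collection>
    <score>
        <title>Section 3 (for Flute and Cello)</title>
        <mnx-common/>
    </score>
</collection>
"""


def outline(element, depth=0):
    """Yield (depth, tag, collection type, title) in document order."""
    # findtext only looks at direct children, so nested titles are not picked up.
    yield depth, element.tag, element.get("type"), element.findtext("title")
    for child in element:
        if child.tag in ("collection", "score"):
            yield from outline(child, depth + 1)


root = ET.fromstring(DOCUMENT)
entries = list(outline(root))
for depth, tag, collection_type, title in entries:
    label = tag if collection_type is None else f"{tag} ({collection_type})"
    print("  " * depth + label + ": " + (title or "(untitled)"))
```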

4.2. Musical body content

A musical body consists of a score in some notational idiom supplying concrete musical content that can be rendered and/or performed.

4.2.1. The score element

Contexts:
mnx, collection.
Content Model:
Metadata content
Zero or one musical body elements.
Attributes:
src - optional relative path to an external source file

The score element encloses a self-contained description of the score for a portion or the entirety of a musical work.

If the src attribute is provided, this specifies a relative path where the score’s musical body lives. Otherwise, the body must be provided within the content of the score element.

4.3. Metadata content

Metadata content may be included in many elements to supply bibliographic data and other descriptive information.

Many elements TBD. Need to harmonize with existing metadata and bibliographic standards.

4.3.1. The title element

Contexts:
Any.
Content Model:
Text
Attributes:
None.

The title element assigns a title to its parent element in the context of the document as a whole.

5. MNX-Common

5.1. Scope

This section is non-normative.

This part of the specification is called MNX-Common, and describes a semantic dialect of MNX designed to encode Conventional Western Music Notation (CWMN).

5.2. Semantics

5.2.1. Notational concepts

This section describes various foundational concepts in music notation that are frequently referenced by this specification.

5.2.1.1. Parts and staves

A score consists of multiple parts. Each part is a grouping of related musical material that relates to a single performer or set of performers. It has the same temporal extent as the score overall, but presents a slice of content that is relevant to a single instrument or a group of related instruments.

A part may employ one or more staves. Each staff supplies a pair of dimensions, usually one for pitch and one for time, within which notes may be placed. Conventionally, the time dimension is horizontally oriented; for pitched instruments, the pitch dimension is vertically oriented. All staves within a part share the same time dimension.

For unpitched instruments, the vertical dimension indicates a choice of sound rather than a pitch, governed by a set of conventions that map note placement to sound.

Every segment of a staff possesses a clef that determines the mapping between its pitch dimension and some set of performable pitches, with additional information supplied by a key signature. Accidental symbols on notes further modify this mapping on an ad-hoc basis.

Staves in CWMN are identified within a part by a unique staff index. The topmost staff in a part has a staff index of 1; staves below the topmost staff are identified with successively increasing indices.

5.2.1.2. Notated events

A notated event in CWMN is a discrete action in the score with a notated duration. It has an onset that is relative to the start of its containing sequence as well as to other elements in that sequence, subject to the conventions of CWMN. Events belong to a specific staff within a part, denoted by its staff index.

A notated event may include one or more notes possessing a pitch, or a rest indicating silence. Events including more than one note are referred to as chords.

In both cases, the event possesses an associated note value that indicates its notated duration. This value is not literal, but is subject to performance interpretation.

The content of a notated event includes only the specific notes and chords within it. In particular a notated event does not account for ties, ornamental interpretation or many other kinds of performance reading. As such, the onset, duration, pitch and other properties of notated events will often differ from those in the corresponding performance events.

Events may further possess articulations, additional properties that modulate their musical performance in commonly understood ways. In this specification, we use the term articulation in an expanded sense to cover all such additional properties.

Notated events are represented by the event element.

5.2.1.3. Metrical position

Most notated events possess a well-defined metrical position, giving a time onset expressed as a rational number of whole note durations after the start of its containing measure. This position may be thought of as the event’s "address" and plays a determining role in the normative rendering and performance of events.
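
As a rough illustration, metrical positions can be computed with exact rational arithmetic. The sketch below rests on simplifying assumptions that this specification does not state here: the events of a sequence are taken to follow one another without gaps from the start of the measure, and each event's notated duration is already known as a fraction of a whole note. The function name is hypothetical.

```python
from fractions import Fraction


def metrical_positions(durations):
    """Return each event's onset, in whole notes, from the start of the measure.

    Assumes the events are contiguous and begin at the start of the measure.
    """
    positions = []
    onset = Fraction(0)
    for duration in durations:
        positions.append(onset)
        onset += duration
    return positions


# A quarter note, a quarter note, then a half note.
measure = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
positions = metrical_positions(measure)
```

Here the three events receive the positions 0, 1/4 and 1/2. Using `Fraction` rather than floating point keeps positions exact, which matters when they are compared as "addresses".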

5.2.1.4. Chromatic pitches

A chromatic pitch describes a pitch situated in a 12-tone temperament notated as per CWMN conventions. The description incorporates three elements:

5.2.1.5. Directions

A direction is a discrete instruction in the score that applies to notated events.

Directions do not have a duration, although they have a specific location in relation to a containing measure or sequence. Like events, directions also belong to a specific staff within a part, denoted by its one-based staff index.

Directions come in the following flavors:

Single-ended directions begin at a point in time in some part and generally continue to apply until superseded by another direction, or by some notation in the score that is understood to terminate it. An example is a piano dynamic.

Span directions begin at a point in time in some part and end at a later point within the same part. One common example of a span direction is a slur.

Liaison directions begin on one note and end on an immediately succeeding note. A common example of a liaison direction is a tie.

5.2.1.6. Notated sequences

A notated sequence is a set of notated events whose notional time intervals do not overlap, which lie within the same measure and the same part, and which occur at progressively greater temporal offsets within a measure.

Notated sequences are represented by the sequence element.

5.2.1.7. Voices

Notated sequences may also belong to voices. A voice is a set of sequences in different measures, but within the same part. The sequences within this set can be thought of as constituting a single musical voice throughout the score. Thus they are an organizational construct, rather than a notational one.

A given voice need not be expressed in every single measure of a part. It may be present in some measures, and absent in others.

Even so, the assignment of sequences to voices has concrete implications for MNX implementations. Producer implementations may interpret a voice as affecting the way that musical material is organized, for example by cutting and pasting material from a given voice in one measure into the same voice in a different measure. Consumer implementations might allow users to isolate the playback of a single voice, including only the sequences that belong to it.

5.2.1.8. Performance Interpretation

A performance interpretation is the end result of deriving a set of performance events from a score. Human performers do this by reading music, while MNX consumers will generally do this algorithmically.

In MNX-Generic, a consumer is given the exact performance interpretation as part of the document. In MNX-Common, a consumer will typically employ a set of rules that mimic the actions of a human performer, overriding these with exact performance events where these are explicitly supplied within the document.

5.2.1.9. Performance events

A performance event is a description of a single timed element of a musical performance with specific attributes for its onset, duration, pitch, dynamics, articulation and instrument. Unlike a notated event, these attributes are specific and not subject to interpretation, and they are independent of any notational concepts.

Performance events in MNX are used to describe the exact performance interpretation of some or all of a score, as distinct from its notation.

5.2.1.10. Note values

There are a variety of situations in which the note value of a musical event needs to be described, in terms of some fraction or multiple of a CWMN whole-note unit.

In CWMN, fractions for undotted base note values are constrained to be exact powers of two. The most common note values of whole, half, quarter, etc. correspond to whole-note fractions expressed by the non-positive powers 2^0, 2^-1, 2^-2, ... The less frequently used note values of breve, longa, etc. are expressed by the positive powers 2^1, 2^2, ...

In the broader case of general note values, some number of dots act as a multiplier on the base note value. These multipliers take the form (2^(n+1) - 1) / 2^n, where n is the number of dots.
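
The dot multiplier (2^(n+1) - 1) / 2^n can be checked with a few lines of Python. The function names are illustrative only.

```python
from fractions import Fraction


def dot_multiplier(dots):
    """Multiplier applied to a base note value carrying `dots` dots."""
    return Fraction(2 ** (dots + 1) - 1, 2 ** dots)


def note_value(base, dots=0):
    """Duration, as a fraction of a whole note, of a base value with dots."""
    return base * dot_multiplier(dots)
```

One dot gives a multiplier of 3/2 and two dots give 7/4, so a dotted quarter note lasts 1/4 × 3/2 = 3/8 of a whole note.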

5.2.1.11. Orientation

Events and sequences may possess an optional orientation that determines the placement and rendering of content according to a complex set of CWMN conventions. For the purposes of MNX-Common there are two orientations:

5.2.1.12. Written pitches

A note’s written pitch is the pitch that would sound if the note’s notation were performed by a concert-pitch instrument. Written pitch can be thought of as the note’s pitch from the musician’s instrument-specific perspective.

A note’s written pitch might be different from its sounded pitch, which is the pitch that an instrument generates. Written pitch differs from sounded pitch in the case of notation written for a transposing instrument, such as a clarinet.

MNX considers ottava markings (e.g., "8va," "8vb") to be purely presentational. Hence, ottava markings have no effect on a note’s written pitch. For example, if a note is rendered on the bottom staff line without an ottava marking, and a second note is rendered on the top staff space with an "8vb" marking, the two notes have the same written pitch.

5.2.2. Notational syntaxes

5.2.2.1. Note value syntax

MNX provides a microsyntax for encoding note values whose syntactic constraints map to the above requirements. Its syntax is designed to be distinguishable from other syntaxes for integers, floating point numbers or rational numbers. The syntax for base note values consists of either of the following forms:

  1. For values less than or equal to a whole note:

    1. The character U+002F SLASH

    2. One or more ASCII digits encoding the base note value as a power-of-two fractional denominator

  2. For values greater than a whole note:

    1. The character U+002A ASTERISK

    2. One or more ASCII digits encoding the base note value as a power-of-two multiplying factor

The syntax for general note values consists of these components:

  1. A base note value encoding.

  2. Zero or more occurrences of U+0064 LOWERCASE D characters. The number of occurrences supplies the number of dots.

To parse a note value, use the following procedure:

  1. Let input be the string being parsed.

  2. Let position be a pointer into input, initially pointing at the start of the string.

  3. Let number of dots be 0.

  4. If the character indicated by position is a U+002A ASTERISK character (*), let fractional be false and advance position by 1.

  5. Else, if the character indicated by position is a U+002F SLASH character (/), let fractional be true and advance position by 1.

  6. Else, return an error.

  7. Collect a sequence of characters that are ASCII digits only and let unparsed number be the result.

  8. Let base value be the result of parsing unparsed number using the rules for parsing integers.

  9. If parsing a general note value, collect a sequence of characters that are U+0064 LOWERCASE D characters. Set number of dots to the length of this sequence.

  10. If position is not at the end of the string, return an error.

  11. If base value is not equal to a power of 2, return an error.

  12. If base value is equal to 1 and fractional is false, return an error.

  13. If fractional is true, set base value to (1 / base value).

  14. Return base value and number of dots.

Here are some instances of the note value syntax:
/1

a whole note

/4

a quarter note

/8

an eighth note

/8d

a dotted eighth note

/8dd

a double-dotted eighth note

*2

a breve (double whole note)

*2d

a dotted breve
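
This example is non-normative. The procedure to parse a note value might be sketched in Python as follows. The function name parse_note_value and its general parameter are illustrative assumptions; the sketch also assumes that an empty run of digits is an error, which the procedure leaves to the rules for parsing integers.

```python
from fractions import Fraction


def parse_note_value(text, general=True):
    """Return (base value, number of dots) for a note value string.

    `general` selects general note values (dots allowed) over base ones.
    """
    position = 0
    number_of_dots = 0
    # Steps 4-6: '*' introduces a multiple, '/' a fraction of a whole note.
    marker = text[position:position + 1]
    if marker == "*":
        fractional = False
    elif marker == "/":
        fractional = True
    else:
        raise ValueError(f"expected '*' or '/' at start of {text!r}")
    position += 1
    # Step 7: collect ASCII digits.
    start = position
    while position < len(text) and text[position] in "0123456789":
        position += 1
    if position == start:
        # Assumption: an empty digit run fails the integer parsing of step 8.
        raise ValueError(f"missing digits in {text!r}")
    number = int(text[start:position])
    # Step 9: collect 'd' characters when parsing a general note value.
    if general:
        while position < len(text) and text[position] == "d":
            number_of_dots += 1
            position += 1
    # Step 10: nothing may follow.
    if position != len(text):
        raise ValueError(f"unexpected trailing characters in {text!r}")
    # Step 11: the number must be a power of two (1, 2, 4, 8, ...).
    if number < 1 or number & (number - 1):
        raise ValueError(f"{number} is not a power of two")
    # Step 12: '*1' is rejected; a whole note is written '/1'.
    if number == 1 and not fractional:
        raise ValueError("'*1' is not a valid note value")
    # Step 13: a fractional value is the reciprocal of the number.
    base_value = Fraction(1, number) if fractional else Fraction(number)
    return base_value, number_of_dots


print(parse_note_value("/8dd"))  # (Fraction(1, 8), 2)
```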

5.2.2.2. Note value quantity syntax

MNX allows the specification of a note value quantity, defined as an integer multiple of a note value. To parse a note value quantity, use the following procedure:

  1. Let input be the string being parsed.

  2. Let position be a pointer into input, initially pointing at the start of the string.

  3. Let multiplier be 1.

  4. Collect a sequence of characters that are ASCII digits only, and let unparsed number be the result.

  5. If unparsed number is not empty, set multiplier to the result of parsing unparsed number using the rules for parsing integers.

  6. Let note value be the result of parsing the remainder of input beginning at position according to the rules to parse a note value.

  7. Return multiplier and note value as the result.

Examples include:

/8

a single eighth note

6/8

six eighth notes

6/8d

six dotted eighth notes

5/1

five whole notes
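
This example is non-normative. As a compact sketch, a note value quantity can also be recognized with a regular expression rather than the character-by-character procedure; this is a substitution made for brevity, not the method the specification describes. The names parse_note_value_quantity and quantity_in_whole_notes are illustrative assumptions.

```python
import re
from fractions import Fraction

# Optional multiplier, then '/' or '*', digits, and zero or more 'd' dots.
_QUANTITY = re.compile(r"([0-9]*)([/*])([0-9]+)(d*)")


def parse_note_value_quantity(text):
    """Return (multiplier, base value, number of dots)."""
    match = _QUANTITY.fullmatch(text)
    if match is None:
        raise ValueError(f"not a note value quantity: {text!r}")
    multiplier_text, marker, digits, dots = match.groups()
    # The multiplier defaults to 1 when no leading digits are present.
    multiplier = int(multiplier_text) if multiplier_text else 1
    number = int(digits)
    if number < 1 or number & (number - 1):
        raise ValueError(f"{number} is not a power of two")
    if marker == "*" and number == 1:
        raise ValueError("'*1' is not a valid note value")
    base = Fraction(1, number) if marker == "/" else Fraction(number)
    return multiplier, base, len(dots)


def quantity_in_whole_notes(text):
    """Total length of the quantity as a fraction of a whole note."""
    multiplier, base, dots = parse_note_value_quantity(text)
    return multiplier * base * Fraction(2 ** (dots + 1) - 1, 2 ** dots)


print(quantity_in_whole_notes("6/8"))   # 3/4
print(quantity_in_whole_notes("6/8d"))  # 9/8
```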

5.2.2.3. Time signature syntax

MNX allows the specification of a time signature, consisting of a sum of ordered, undotted note value quantities defining the meter of a measure. The sum may optionally share a common denominator. To parse a time signature, use the following procedure:

  1. Let input be the string being parsed.

  2. Let tokens be the result of strictly splitting the string input using U+002B PLUS as a delimiter.

  3. If tokens is empty, return an error.

  4. Let shared denominator be true.

  5. Let fractions be an empty list.

  6. While tokens is not empty,

    1. Remove the first element of tokens and assign it to t after stripping leading and trailing white space.

    2. If t contains the characters U+002F SLASH or U+002A ASTERISK,

      1. Let nv be the result of parsing t as a note value quantity.

      2. If nv has a number of dots greater than zero, return an error.

      3. If shared denominator is true,

        1. Replace the denominator in each element of fractions with the denominator of nv.

        2. If more elements remain in tokens,

          1. Set shared denominator to false.

      4. Append nv to fractions.

    3. Else,

      1. If tokens is empty, return an error.

      2. If shared denominator is false, return an error.

      3. Let numerator be the result of parsing t as a valid integer.

      4. Append the fraction composed of numerator and the denominator 1 to fractions.

  7. Return fractions and shared denominator as the result.

Examples include:

3/4

Three-quarters time

2+3+2/8

A compound time signature of 2/8, 3/8 and 2/8, with 2+3+2 over the shared denominator 8.

2/8 + 3/4 + 2/8

A compound time signature of 2/8, 3/4 and 2/8 as separate fractions (note that the spaces are ignored)

Shared denominators are all-or-nothing. So there’s currently no way to share a denominator for only some of a time signature’s fractions, which would require a grouping construct like 2/4+(2+3/8) or such.
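
This example is non-normative. The time signature procedure might be sketched in Python as below. Each fraction is represented as a (count, unit) pair, where unit is the note value as a fraction of a whole note; this representation, the regular expression used for note value quantities, and the name parse_time_signature are assumptions made for illustration.

```python
import re
from fractions import Fraction

_QUANTITY = re.compile(r"([0-9]*)([/*])([0-9]+)(d*)")


def parse_time_signature(text):
    """Return (fractions, shared_denominator) for a time signature string."""
    tokens = [token.strip() for token in text.split("+")]
    shared_denominator = True
    fractions = []
    while tokens:
        token = tokens.pop(0)
        if "/" in token or "*" in token:
            match = _QUANTITY.fullmatch(token)
            if match is None:
                raise ValueError(f"not a note value quantity: {token!r}")
            count_text, marker, digits, dots = match.groups()
            if dots:
                raise ValueError("dotted note values are not allowed")
            number = int(digits)
            if number < 1 or number & (number - 1) or (marker == "*" and number == 1):
                raise ValueError(f"invalid note value in {token!r}")
            unit = Fraction(1, number) if marker == "/" else Fraction(number)
            if shared_denominator:
                # Bare numerators seen so far adopt this denominator.
                fractions = [(count, unit) for count, _ in fractions]
                if tokens:
                    shared_denominator = False
            fractions.append((int(count_text) if count_text else 1, unit))
        else:
            if not tokens:
                raise ValueError("the final term must carry a note value")
            if not shared_denominator:
                raise ValueError("bare numerator after an explicit denominator")
            if not re.fullmatch(r"[0-9]+", token):
                raise ValueError(f"not an integer: {token!r}")
            # Placeholder unit of 1, replaced once the denominator is seen.
            fractions.append((int(token), Fraction(1)))
    return fractions, shared_denominator


fractions, shared = parse_time_signature("2+3+2/8")
print(shared, sum(count * unit for count, unit in fractions))  # True 7/8
```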

5.2.2.4. Chromatic pitch syntax

MNX allows the specification of a chromatic pitch in a single string, by employing the rules for parsing a chromatic pitch. To parse this syntax, employ the following procedure:

  1. Let input be the string being parsed.

  2. Let position be a pointer into input, initially pointing at the start of the string.

  3. Let alteration be 0.

  4. If the character at position is not an uppercase ASCII letter in the range from U+0041 UPPERCASE A - U+0047 UPPERCASE G, return an error.

  5. Let step be the character at position, and advance position by 1.

  6. If the character at position is U+0023 HASH,

    1. While the character at position is U+0023 HASH,

      1. Increase alteration by 1.

      2. Advance position by 1.

  7. Else, if the character at position is U+0062 b,

    1. While the character at position is U+0062 b,

      1. Decrease alteration by 1.

      2. Advance position by 1.

  8. Collect a sequence of characters that are ASCII digits only and let unparsed number be the result.

  9. Let octave be the result of parsing unparsed number using the rules for parsing integers.

  10. Let alteration factor be 0.

  11. If the character at position is U+002B PLUS,

    1. Set alteration factor to 1.

    2. Advance position by 1.

  12. If the character at position is U+002D HYPHEN-MINUS,

    1. Set alteration factor to -1.

    2. Advance position by 1.

  13. If alteration factor is not equal to zero,

    1. Collect a sequence of characters that are ASCII digits, U+002E FULL STOP, or U+002F SLASH and place the result in unparsed number.

    2. If the character at position is U+006F LOWERCASE o

      1. Multiply alteration factor by 12.

      2. Advance position by 1.

    3. Else, if the character at position is U+0077 LOWERCASE w

      1. Multiply alteration factor by 2.

      2. Advance position by 1.

    4. If unparsed number contains U+002F SLASH, parse it as a rational number, otherwise parse it as a valid floating-point number. Multiply the result by alteration factor and add this to alteration.

  14. If position is not at the end of the string, return an error.

  15. Return step, octave and alteration as the result.

Examples include:

C4

Middle C

C#4

The C-sharp above middle C

Db4

The D-flat above middle C

Dbb4

The D-double-flat very near middle C

C4+0.5

The pitch one quarter-tone above middle C

C4+0.25w

The pitch one quarter-tone above middle C (identical to the above, but expressed in whole tone units)

C4+1/4w

The pitch one quarter-tone above middle C (identical to the above, but expressed as whole tone fraction)

C4+1/24o

The pitch one quarter-tone above middle C (identical to the above, but expressed as octave fraction)
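
This example is non-normative. A chromatic pitch parser might be sketched in Python as follows. It uses a regular expression in place of the step-by-step procedure and keeps the alteration as an exact rational number of semitones instead of a floating-point value; both choices, and the name parse_chromatic_pitch, are assumptions made for illustration.

```python
import re
from fractions import Fraction

_PITCH = re.compile(
    r"([A-G])"                       # step
    r"(#*|b*)"                       # sharps or flats, never mixed
    r"([0-9]+)"                      # octave
    r"(?:([+-])([0-9./]+)([ow]?))?"  # optional microtonal alteration
)

# Semitones per unit: none (semitones), 'w' (whole tones), 'o' (octaves).
_UNIT_FACTOR = {"": 1, "w": 2, "o": 12}


def parse_chromatic_pitch(text):
    """Return (step, octave, alteration in semitones)."""
    match = _PITCH.fullmatch(text)
    if match is None:
        raise ValueError(f"not a chromatic pitch: {text!r}")
    step, accidentals, octave, sign, amount, unit = match.groups()
    # Each '#' raises and each 'b' lowers the pitch by one semitone.
    alteration = Fraction(accidentals.count("#") - accidentals.count("b"))
    if sign is not None:
        factor = (1 if sign == "+" else -1) * _UNIT_FACTOR[unit]
        try:
            # Fraction accepts both "0.25" and "1/4" style strings.
            alteration += factor * Fraction(amount)
        except ZeroDivisionError:
            raise ValueError(f"zero denominator in {text!r}") from None
    return step, int(octave), alteration


for example in ["C4+0.5", "C4+0.25w", "C4+1/4w", "C4+1/24o"]:
    print(example, parse_chromatic_pitch(example)[2])  # each prints 1/2
```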

5.2.2.5. Measure location syntax

There are a variety of situations in which the measure location of a musical event needs to be described, in terms of the content of the measure.

The following cases exist for specifying measure locations:

  1. If the measure location is a metrical position in the context of some containing measure, then it is specified as a valid floating-point number or note value quantity that gives the number of whole notes from the start of the measure.
  2. If the measure location is a metrical position in the context of an arbitrary measure in the score, then it is specified as a pair of tokens separated by U+003A COLON. The first token is a measure index identifying the measure, and the second token is a valid floating-point number or note value quantity that gives the number of whole notes from the start of the identified measure. The identified measure must belong to the same measure content as the element in which the measure location is given. This requirement has the effect of ensuring that the measure index is unique, since measure locations cannot reference measures in other systems with different barring.
  3. If the measure location is identical to the metrical position of some known event in the score, then it is specified as an element location identifying the event. The identified event must belong to the same measure content as the element in which the measure location is given.
Here are some instances of the measure location syntax:
0.25

one quarter note after the start of a containing measure

3/8

three eighth notes after the start of a containing measure

4:0.25

one quarter note after the start of the measure with index 4

4:1/4

the same as the preceding example

#event235

the same metrical position as the event whose element ID is event235
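
This example is non-normative. The three forms of measure location might be distinguished as in the Python sketch below. It assumes that an element location is written as U+0023 followed by an element ID, as in the example above, and it accepts only plain decimal numbers such as 0.25 rather than every valid floating-point number. The names parse_position and parse_measure_location are illustrative.

```python
import re
from fractions import Fraction

_QUANTITY = re.compile(r"([0-9]*)([/*])([0-9]+)(d*)")
_DECIMAL = re.compile(r"[0-9]+(?:\.[0-9]+)?")  # simplified number syntax


def parse_position(text):
    """Return a metrical position as a fraction of a whole note."""
    match = _QUANTITY.fullmatch(text)
    if match:
        count, marker, digits, dots = match.groups()
        number = int(digits)
        if number < 1 or number & (number - 1) or (marker == "*" and number == 1):
            raise ValueError(f"invalid note value in {text!r}")
        base = Fraction(1, number) if marker == "/" else Fraction(number)
        dotted = Fraction(2 ** (len(dots) + 1) - 1, 2 ** len(dots))
        return (int(count) if count else 1) * base * dotted
    if _DECIMAL.fullmatch(text):
        return Fraction(text)
    raise ValueError(f"not a metrical position: {text!r}")


def parse_measure_location(text):
    """Return (measure index or None, position or None, element ID or None)."""
    if text.startswith("#"):
        return None, None, text[1:]
    index = None
    if ":" in text:
        index_text, text = text.split(":", 1)
        if not re.fullmatch(r"[0-9]+", index_text):
            raise ValueError(f"not a measure index: {index_text!r}")
        index = int(index_text)
    return index, parse_position(text), None


print(parse_measure_location("4:1/4") == parse_measure_location("4:0.25"))  # True
```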

5.2.2.6. SMuFL glyph name syntax

Some contexts, particularly the glyph property, permit the specification of a SMuFL glyph name from the catalog of glyph names defined in the SMuFL specification.

5.2.3. MNX-Common body content

These elements of an MNX-Common score supply its high-level description and structure.

5.2.3.1. The mnx-common element
Contexts:
Wherever a musical body is expected.
Content Model:
Metadata content.
stylesheet.
One or more global elements - measure content shared by sets of parts within the score.
One or more part elements - description and measure content of each part in the score.
Zero or more score-audio elements - recordings of the score and their associated synchronization data.
Attributes:
profile — profile describing constraints on the contents of this score

The mnx-common element is a musical body that describes an MNX-Common score as a whole.

The profile attribute declares a score profile that supplies constraints which the score is expected to obey.

The following values of profile are supported, along with the constraints that they represent:

standard
The measure content in all global or part elements consists of an identical number of measures.
Only a single global element exists.
Time signatures and tempo indications only occur in measures within the global element.
Key signatures within the global element only occur in enharmonic/transposed forms that are equivalent within part elements.
All notated events in a chord share the same duration

The following example provides the basic skeleton of an mnx-common musical body:

<mnx-common>
  <global>
      ...measure content describing system-wide features...
  </global>
  <part>
      ...part description content...
      ...measure content for part 1...
  </part>
  <part>
      ...part description content...
      ...measure content for part 2...
  </part>
  ...additional parts...
</mnx-common>
5.2.3.2. The global element
Contexts:
mnx-common
Content Model:
Measure content, which must not include any sequence content
Attributes:
parts - an optional set of IDs of part elements to which this global content applies

The global element represents a set of measures, each of which provides content that is shared by a set of parts within the score. Each measure element within global supplies the shared content for all other measure elements which share the same index.

By default, the content in global applies to all parts in the score.

Typical examples of such content include key signatures, time signatures and tempo indications.

Notated events like notes or rests cannot be shared between parts in CWMN. Consequently, sequence content cannot occur in the measures within global.

<global>
  <measure index="1">
    <directions>
      <tempo bpm="120" value="4"/>
      <time signature="4/4"/>
    </directions>
  </measure>
  <measure index="2"/>
  <measure index="3"/>
  <measure index="4" barline="final"/>
</global>

The following features are not supported by the standard MNX-Common profile.

The parts attribute optionally supplies a list of part element IDs as an unordered set of space-separated tokens. This restricts the scope of the global element to the set of parts given, allowing subsets of parts in the score to share dissimilar structures (for example, in scores having different parts in different meters). The default value of this attribute is the list of all part IDs.

No two global elements may include the same part in their parts list, to prevent conflicting definitions.

<global parts="p1 p2">
  <measure index="1">
    <directions>
      <time signature="6/8"/>
    </directions>
  </measure>
  <measure index="2"/>
  <measure index="3"/>
  <measure index="4" barline="final"/>
</global>
<global parts="p3 p4">
  <measure index="1">
    <directions>
      <time signature="4/4"/>
    </directions>
  </measure>
  <measure index="2"/>
  <measure index="3" barline="final"/>
</global>
<part id="p1">...</part>
<part id="p2">...</part>
<part id="p3">...</part>
<part id="p4">...</part>
5.2.3.3. The part element
Contexts:
mnx-common
Content Model:
Part description content
Measure content

The part element represents a set of measures which describe a single part within the score. The sequence of measures must match the measures found in the global element applying to this part.

<part>
  <part-name>Violin</part-name>
  <part-abbreviation>Vln</part-abbreviation>
  <instrument-sound>strings.violin</instrument-sound>

  <measure>
    <sequence>...</sequence>
  </measure>
  <measure>
    <sequence>...</sequence>
  </measure>
  <measure>
    <sequence>...</sequence>
  </measure>
  <measure>
    <sequence>...</sequence>
  </measure>
</part>

5.2.4. Measure content

Measure content supplies a sequence of measure elements, each of which supplies musical content for a time interval within a score.

The placement of the measures in measure content constitutes their score order, which is the order in which they are logically presented to a reader. This is distinct from their performance order, which is the order in which they are played by a performer.

Each measure may bear the index attribute, which provides its unique measure index within the score order for this measure content. The first measure in a system has an index of 1.

5.2.4.1. The measure element
Contexts:
global, part
Content Model:
Metadata content
Zero or one directions elements
One or more sequence elements (for measures within part elements only)
Interpretation content
Attributes:
index - an optional integer index for the measure
number - an optional textual number to be displayed for the measure
barline - an optional ending barline type for the measure

The measure element encloses the direction and sequence content that together make up the majority of musical content in an MNX-Common score.

The optional attribute index defines the one-based index of this measure within score order. This is used for cross-referencing measure elements in corresponding runs of measure content. The default value of index is 1 for the first measure element within a run of content; for all other measures its default is the index of the previous measure element plus 1.

The optional attribute number provides a non-negative integer which is to be used as a visual label for the measure. It is not required to be unique. If omitted, its default value is the same as index.
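
This example is non-normative. The defaulting rules for index and number might be applied to a run of measures as in the following Python sketch. Each measure is modeled as a dictionary holding only the attributes actually present on the element; that model and the name resolve_measure_labels are assumptions made for illustration.

```python
def resolve_measure_labels(measures):
    """Return the effective index and number of each measure in a run."""
    resolved = []
    previous_index = 0
    for attributes in measures:
        if "index" in attributes:
            index = int(attributes["index"])
        else:
            # 1 for the first measure, otherwise the previous index plus 1.
            index = previous_index + 1
        # The visual number falls back to the index when omitted.
        number = int(attributes["number"]) if "number" in attributes else index
        resolved.append({"index": index, "number": number})
        previous_index = index
    return resolved


run = [{}, {}, {"index": "10"}, {"number": "0"}]
for labels in resolve_measure_labels(run):
    print(labels["index"], labels["number"])
```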

The optional attribute barline defines a barline type for the barline drawn at the end of the measure. Allowed values include:

regular
dotted
dashed
heavy
light-light
light-heavy
heavy-light
heavy-heavy
tick
short
none

The following example shows both direction content and a single-voice sequence with two monophonic half notes:

<measure index="2">
  <directions>
    <dynamics type="f" location="0"/>
    <wedge type="diminuendo" location="0" end="1/2"/>
    <dynamics type="p" location="1/2"/>
  </directions>
  <sequence>
    <event value="/2">
      <note pitch="C4"/>
    </event>
    <event value="/2">
      <note pitch="C5"/>
    </event>
  </sequence>
</measure>
5.2.4.2. The sequence element
Contexts:
measure
Content Model:
Metadata content
Zero or one directions elements
Sequence content
Interpretation content
Attributes:
orient - default orientation of direction and sequence content
staff - default staff index of direction or sequence content
voice - optional cross-measure voice identifier

The sequence element organizes a set of musical events within a measure into a strict temporal sequence, accompanied by relevant directions. The assignment of measure positions to these events is accomplished by sequencing the content, with a starting position of 0 and a time modification ratio of 1.

The sequence content within each sequence supplies the music for a single polyphonic voice within its containing measure, including notes, chords, rests, beam groups, tuplets and grace note runs.

The optional orient attribute provides a default orientation for all content within this sequence. If not provided, the orientation is determined automatically according to the implementation’s rendering rules, and may also be overridden by descendants.

The optional staff attribute provides a default staff index for all content within this sequence. If not provided, the staff index is determined automatically according to the implementation’s rendering rules, and may also be overridden by descendants.

The optional voice attribute supplies a string that identifies the voice to which this sequence belongs. All sequence elements in a given part having the same value of voice belong to the same voice. Within a given measure element, no two sequence elements may share the same value for voice. The value of voice is an opaque identifier that does not supply information from producers to consumers.

This example shows a sequence with a single monophonic voice including a series of events that together comprise a 4/4 measure. A single staff is used, so no staff attribute is present.

<measure>
    <sequence>
        <event value="/2">...</event>
        <event value="/4">...</event>
        <event value="/4">...</event>
    </sequence>
</measure>

Here’s a more complex measure that shows two melodic voices with independent rhythms on different staves, each represented by a sequence element:

<measure>
    <sequence staff="1">
        <event value="/2">...</event>
        <event value="/4">...</event>
        <event value="/4">...</event>
    </sequence>
    <sequence staff="2">
        <event value="/2d">...</event>
        <tuplet inner="3/8" outer="1/4">
            <event value="/8">...</event>
            <event value="/8">...</event>
            <event value="/8">...</event>
        </tuplet>
    </sequence>
</measure>

If the voices in the previous example shared a single polyphonic staff, it might look like this instead:

<measure>
    <sequence orient="up">
        <event value="/2">...</event>
        <event value="/4">...</event>
        <event value="/4">...</event>
    </sequence>
    <sequence orient="down">
        <event value="/2d">...</event>
        <tuplet inner="3/8" outer="1/4">
            <event value="/8">...</event>
            <event value="/8">...</event>
            <event value="/8">...</event>
        </tuplet>
    </sequence>
</measure>

The following example shows a typical organization of sequences within a measure for a SATB-style grand staff with four voices, two on each staff:

<measure>
    <sequence orient="up" staff="1">...</sequence>
    <sequence orient="down" staff="1">...</sequence>
    <sequence orient="up" staff="2">...</sequence>
    <sequence orient="down" staff="2">...</sequence>
</measure>

When a direction content element is included in a sequence, it acquires a measure location identical to that of the following event, or to the end of the measure if there are no more events. The following example specifies a dynamics element which applies to the start of the following note:

<sequence>
  <directions>
    <dynamics location="0" type="mp"/>
  </directions>
  ...preceding sequence content...
  <event value="/4">
    <note pitch="C4"/>
  </event>
  ...following sequence content
</sequence>

Note that directions may also occur within a directions element at the start of a sequence, in which case they are assigned explicit locations. In this example, dynamic changes are specified over the course of a single whole note which occupies the entire measure:

<sequence>
  <directions>
    <dynamics type="f" location="0"/>
    <wedge type="diminuendo" location="0" end="1/2"/>
    <dynamics type="p" location="1/2"/>
  </directions>
  <event value="/1">
    <note pitch="C4"/>
  </event>
</sequence>
5.2.4.3. The directions element
Contexts:
measure, sequence
Content Model:
Direction content
Attributes:
None.

The directions element organizes direction content within a containing measure or sequence element. Within directions, each child direction element must be assigned an explicit measure location, via its location attribute.

The order of occurrence of child elements is not significant, since each child has an explicit location independent of this order.

The context of the directions element defines a scope for contained directions as follows:

  • Directions in a measure within the global element apply to all sequences of the measure in all applicable parts. Directions with an orientation of up appear above the first displayed part; those with an orientation of down below the last displayed part.

  • Directions in a measure within a part element apply to all sequences of the measure in a given part.

  • Directions in a sequence apply to the sequence in which they occur.

5.2.5. Sequence content

Sequence content supplies a series of musical events that are both presented and performed in a given order, each at a distinct time. Such events express the concepts of chords, notes and rests.

Sequence content possesses a starting position. This is the metrical position within a containing measure of the content’s first element.

Sequence content also possesses a time modification ratio. This is a rational number scale factor which implicitly applies to all positions and durations within the content.

Sequence content also may define some number of beamed groups. If defined, these are lists which accumulate events into groups to which beaming applies.

Sequence content also permits interspersed direction content whose directions are injected into the sequence either adjacent to events, or at explicitly given measure locations.

Within sequence content, nested event content is assigned metrical positions and placed in beamed groups according to the following procedure, called sequencing the content:

  1. Let sequence cursor be the starting position of the sequence content.
  2. Let content be the list of elements comprising the sequence content.
  3. While content is not empty:
    1. Let next be the initial element of content, and remove it from the head of content.
    2. If next is a beamed element:
      1. If beamed group has a value of list type, beamed groups have been illegally nested. Throw an error.
      2. Set beamed group to an empty list.
      3. Sequence the content of next, retaining the value of beamed group.
      4. Record beamed group as a group of beamed events within the sequence.
      5. Set beamed group to an undefined value.
    3. If next is an event element:
      1. If next has a measure value of yes,
        1. If sequence cursor is greater than zero, throw a processing error.
        2. Set sequence cursor to the end of the measure as defined by its time signature.
      2. Else,
        1. Set the metrical position of next to sequence cursor.
        2. If next has a duration attribute, assign it to event duration.
        3. Else, set event duration to next’s value attribute.
        4. Multiply event duration by the time modification ratio, and add the result to sequence cursor.
        5. If beamed group is a list, append next to beamed group.
    4. If next is a forward element:
      1. Set the metrical position of next to sequence cursor.
      2. Add the duration of next, multiplied by the time modification ratio, to sequence cursor.
    5. Else, if next is a tuplet element:
      1. Sequence the content of next, using sequence cursor as the starting position, retaining the current value of beamed group, and multiplying the time modification ratio by the tuplet's outer / inner ratio for the processing of the tuplet.
      2. Add the total duration of next as given by outer, multiplied by the time modification ratio, to sequence cursor.
    6. Else, if next is a grace element:
      1. Process the contents of next, assigning them a non-metrical ordering relative to preceding or following elements as appropriate.
    7. If sequence cursor exceeds the specified duration for the enclosing element (time signature for a measure, inner attribute for a tuplet), throw a processing error.
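
This example is non-normative. The core of sequencing the content might be sketched in Python as follows. The classes Event, Forward, Tuplet and Beamed are hypothetical stand-ins for the corresponding elements, with durations already converted to fractions of a whole note. Whole-measure events and grace elements are omitted, and overflow is checked only once against the measure duration, so this is a simplified illustration rather than a complete implementation.

```python
from dataclasses import dataclass, field
from fractions import Fraction
from typing import List, Optional


@dataclass
class Event:
    value: Fraction                      # notated value, dots already applied
    duration: Optional[Fraction] = None  # performed duration, if different
    position: Optional[Fraction] = None  # filled in by sequencing


@dataclass
class Forward:
    duration: Fraction
    position: Optional[Fraction] = None


@dataclass
class Tuplet:
    inner: Fraction
    outer: Fraction
    content: List[object] = field(default_factory=list)


@dataclass
class Beamed:
    content: List[object] = field(default_factory=list)


def sequence_content(content, start, ratio, beamed_group, groups):
    """Assign metrical positions; return the cursor after the last element."""
    cursor = start
    for item in content:
        if isinstance(item, Beamed):
            if beamed_group is not None:
                raise ValueError("beamed groups may not be nested")
            group = []
            cursor = sequence_content(item.content, cursor, ratio, group, groups)
            groups.append(group)
        elif isinstance(item, Event):
            item.position = cursor
            length = item.value if item.duration is None else item.duration
            cursor += length * ratio
            if beamed_group is not None:
                beamed_group.append(item)
        elif isinstance(item, Forward):
            item.position = cursor
            cursor += item.duration * ratio
        elif isinstance(item, Tuplet):
            # Inside the tuplet the ratio is scaled by outer / inner.
            inner_ratio = ratio * item.outer / item.inner
            sequence_content(item.content, cursor, inner_ratio, beamed_group, groups)
            cursor += item.outer * ratio
        else:
            raise TypeError(f"unsupported element: {item!r}")
    return cursor


def sequence_measure(content, measure_duration):
    """Sequence one measure's content and return its beamed groups."""
    groups = []
    end = sequence_content(content, Fraction(0), Fraction(1), None, groups)
    if end > measure_duration:
        raise ValueError("content overflows the measure")
    return groups


eighths = [Event(Fraction(1, 8)) for _ in range(3)]
content = [Event(Fraction(3, 4)), Tuplet(Fraction(3, 8), Fraction(1, 4), eighths)]
sequence_measure(content, Fraction(1))
print([str(event.position) for event in eighths])  # ['3/4', '5/6', '11/12']
```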
5.2.5.1. The event element
Contexts:
sequence, tuplet, beamed
Content Model:
Metadata content
Either zero or more note elements, or one rest element.
Event content
Interpretation content
Attributes:
value - the notated metrical duration of this event
measure - optional flag indicating that the event occupies the entire measure.
orient - optional orientation of this event
staff - optional staff index of this event
duration - optional performed metrical duration, if different from value
Style properties:
stem-direction - the stem direction of this event

The event element represents a notated event: a discrete period of time within a sequence during which one or more notes are performed, or in which a rest occurs.

All events other than whole-measure events require a value attribute to provide their duration as a note value. This duration is implicitly multiplied by the current time modification ratio, as specified by the process of sequencing the content of the event’s containing element.

Here’s an example of a simple event representing a half note:

<event value="/2">
  <note pitch="C4"/>
</event>

With more than one note, the event becomes a chord:

<event value="/2">
  <note pitch="C4"/>
  <note pitch="E4"/>
  <note pitch="G4"/>
</event>

With a rest element alone, the event is a rest:

<event value="/2">
  <rest/>
</event>

NOTE: It is legal for an event to have neither notes nor a rest. The result is functionally identical to a forward element, but is more constrained, since an event must have a valid note value while a forward element can have a multiple thereof.

If the optional measure attribute is given as yes, then the event is a whole-measure event which occupies the entire measure. Whole-measure events may not specify a value attribute, and must either be empty or contain exactly one rest. Here’s an example:

<event measure="yes">
  <rest/>
</event>

The optional orient attribute provides a specific orientation for this event. If not provided, the orientation is inherited from any sequence or tuplet ancestor which specified it. If no ancestor did so, it is determined automatically according to the implementation’s rendering rules.

The optional staff attribute provides a specific staff index for this event. If not provided, the staff index is inherited from any sequence or tuplet ancestor which specified it. If no ancestor did so, it is determined automatically according to the implementation’s rendering rules.

While staff could be used on a per-event basis, its primary purpose is for overriding a default staff assignment at the sequence level, as in cross-staff keyboard notation. The following example illustrates a lower-staff keyboard voice that temporarily crosses into the upper staff:

<sequence staff="2">
  <event value="/4">
    <note pitch="C3"/>
  </event>
  <event value="/4">
    <note pitch="G3"/>
  </event>
  <event value="/4" staff="1">
    <note pitch="E4"/>
  </event>
  <event value="/4">
    <note pitch="C3"/>
  </event>
</sequence>

The optional duration attribute specifies the actual, performed duration of the event, if different from the notated value given by value. The duration is specified as a note value quantity, which is less constrained than note value: it may be any desired multiple of a note value.

The duration attribute not only alters the performance of the event, but determines the amount of time the event takes up in the measure, affecting the location of subsequent events in the containing sequence. This can change the layout of the measure, as subsequent events will occur earlier or later as a result and be positioned accordingly (see sequencing the content). It also affects the validation of the measure for metrical correctness.

This attribute is unlikely to be frequently encountered, but is needed to handle cases in which composers employ notated values that are not interpreted literally. Some examples follow.

One case occurs in Brahms, in which a dotted half is used to specify a note that is traditionally understood as occupying the space of 11 sixteenth notes:

<sequence>
  <event value="/16">
    <rest/>
  </event>
  <event value="/2d" duration="11/16">
    <note pitch="E4"/>
  </event>
</sequence>

Because duration must be specified as a legal note value multiple, an enclosing tuplet element may be required in some cases to impose an appropriate time modification ratio.

5.2.5.2. The tuplet element
Contexts:
sequence, tuplet, beamed
Content Model:
Metadata content
Sequence content
Interpretation content
Attributes:
outer - duration with respect to containing element
inner - duration of the enclosed sequence content
orient - optional orientation of this tuplet
staff - optional staff index of this tuplet
show-number - optional control over the display of the tuplet ratio numbers
show-value - optional control over the display of the tuplet ratio note values
bracket - optional control over the display of brackets

The tuplet element organizes a set of musical events that form a distinct and contiguous run within a sequence, and which are subject to a common time modification ratio expressed as the quotient of two rational numbers. A tuplet behaves much like a sequence element with respect to its contents.

The required attribute outer supplies a note value quantity describing both the duration and the units of the tuplet with respect to its enclosing context. This is how much time the tuplet occupies in the measure or tuplet in which it is placed.

The required attribute inner supplies a note value quantity describing the total duration and the units of the events within the tuplet, as notated.

The contents of the tuplet are placed into a temporal sequence by performing the procedure sequencing the content with a starting position determined by the parent context, and a time modification ratio equal to the tuplet’s outer value divided by its inner value. Following this procedure, the value of the sequence cursor MUST equal the value of inner, or the contents are considered to be in error.

The optional orient attribute provides a specific orientation for all content within this tuplet. If not provided, the orientation is determined automatically according to the implementation’s rendering rules.

The optional staff attribute provides a specific staff index for all content within this tuplet. If not provided, the staff index is determined automatically according to the implementation’s rendering rules.

The optional show-number attribute controls the display of the quantity of inner and outer note value units for the tuplet. Permissible values include:

none
Do not show any tuplet number.
inner (default)
Display only the numerator of the tuplet’s inner attribute.
both
Display the numerators of the tuplet’s inner and outer attributes.

The optional show-value attribute controls the display of the note value units used inside and outside the tuplet. Permissible values include:

none (default)
Do not show any tuplet units.
inner
Display only the note value unit of the tuplet’s inner attribute.
both
Display both the note value units of the tuplet’s inner and outer attributes.

The optional bracket attribute controls the display of a bracket in conjunction with the tuplet. It is disregarded if show-number has a value of none. Values include:

auto (default)
A bracket is shown for the tuplet if and only if the notes are unbeamed.
no
Do not display a bracket.
yes
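Always display a bracket.

The following example illustrates a triplet of eighth notes that occupies the duration of one quarter note:
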
<sequence>
    ...preceding events in sequence...
    <tuplet inner="3/8" outer="1/4">
        <event value="/8">...</event>
        <event value="/8">...</event>
        <event value="/8">...</event>
    </tuplet>
    ...remaining events in sequence...
</sequence>
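The arithmetic behind the triplet example above can be sketched as follows. This is a non-normative illustration: it assumes that note value quantities are fractions of a whole note and that the sequence cursor is tracked in notated (unmodified) units, which is one reading of the requirement that the cursor equal inner. The function name sequence_tuplet is hypothetical.

```python
from fractions import Fraction

def sequence_tuplet(inner, outer, event_values):
    """Place notated event values inside a tuplet.

    Returns the time modification ratio and a list of (onset, duration)
    pairs measured in the enclosing context's time units.
    """
    ratio = outer / inner      # time modification ratio: outer divided by inner
    cursor = Fraction(0)       # notated time elapsed inside the tuplet
    placed = []
    for value in event_values:
        placed.append((cursor * ratio, value * ratio))
        cursor += value
    if cursor != inner:
        # The contents do not add up to the declared inner quantity.
        raise ValueError(f"tuplet contents total {cursor}, expected {inner}")
    return ratio, placed

eighth = Fraction(1, 8)
ratio, placed = sequence_tuplet(Fraction(3, 8), Fraction(1, 4), [eighth] * 3)
print(ratio, placed)
```

Each eighth note in the triplet is scaled by 2/3 and so occupies one twelfth of a whole note; the three together fill exactly the quarter note given by outer.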
5.2.5.3. The beamed element
Contexts:
sequence, tuplet, grace, beamed
Content Model:
Metadata content
Sequence content
Interpretation content
Attributes:
value - optional value of a beamed group within a parent
continue - optional beam group in following measure which continues this one

The beamed element defines a beamed group of events that are joined by beams to conceptually mark them as belonging to a distinct run. In CWMN, beams cannot have one end inside a grouping such as tuplets or grace notes and the other end outside the same grouping. This element acts to enforce that constraint.

NOTE: In general, the number of beams and the use of forward or backward hooks is determined automatically by implementations. There is thus no MNX-Common element that represents an individual beam. More detailed beam specifications may be overridden using style properties, but do not amount to semantic markup.

Here is a simple beaming example in a 6/8 time signature. The grouping suffices; there is no need to describe individual beams within the beamed group, such as the distinct beam that joins the two 16th notes in the second group:

<sequence>
    <beamed>
      <event value="/8">...</event>
      <event value="/8">...</event>
      <event value="/8">...</event>
    </beamed>
    <beamed>
      <event value="/8">...</event>
      <event value="/16">...</event>
      <event value="/16">...</event>
      <event value="/8">...</event>
    </beamed>
</sequence>

The beamed element may also be nested in order to place child beamed groups in a parent beamed group. A common scenario occurs when groups of short beamed notes are themselves organized at a higher level. The value attribute optionally gives the base note value to be assigned to the child group for beaming purposes.

The following example illustrates two child beamed groups of four 32nd notes each, which are in turn placed within a higher-level parent beamed group. Each of the two subgroups has a note value of one eighth note for beaming purposes, and so the subgroups are themselves connected by a single beam to constitute the larger group:

<beamed>
  <beamed value="/8">
    <event value="/32">...</event>
    <event value="/32">...</event>
    <event value="/32">...</event>
    <event value="/32">...</event>
  </beamed>
  <beamed value="/8">
    <event value="/32">...</event>
    <event value="/32">...</event>
    <event value="/32">...</event>
    <event value="/32">...</event>
  </beamed>
</beamed>

Note: The value attribute does not affect beaming within a beamed group, and so has no consequences unless beamed groups are nested.

Indirect nesting of beam groups also occurs. One example is a set of grace notes sharing their own beam, preceding a regular non-grace note within a larger beamed group:

<beamed>
  <event value="/8">...</event>
  <event value="/8">...</event>
  <grace>
    <beamed>
      <event value="/8">...</event>
      <event value="/8">...</event>
      <event value="/8">...</event>
      <event value="/8">...</event>
    </beamed>
  </grace>
  <event value="/8">...</event>
  <event value="/8">...</event>
</beamed>

Beams may continue from one measure to the next. In this case, the continue attribute gives the element location of a beamed element in the succeeding measure of the same part, which serves to identify both the continuing group and the sequence or voice in which it resides. This attribute may only be specified if no other events follow this beamed group in its measure.

A continuation of a prior beamed group may itself have a continuation. Thus, beamed groups can continue throughout a part without interruption if desired.

The following example illustrates a beam that crosses a measure boundary:

<measure index="9">
  <sequence>
      ...preceding sequence content...
      <beamed continue="#beamcont1">
        <event value="/8">...</event>
        <event value="/8">...</event>
      </beamed>
  </sequence>
</measure>
<measure index="10">
  <sequence>
      <beamed id="beamcont1">
        <event value="/8">...</event>
        <event value="/8">...</event>
      </beamed>
      ...following sequence content...
  </sequence>
</measure>
5.2.5.4. The forward element
Contexts:
sequence, tuplet
Content Model:
None.
Attributes:
duration - the metrical duration of this space

The forward element represents a discrete period of time within a sequence in which no sequence content occurs. It embodies the idea of blank space within a measure.

The duration attribute specifies a note value quantity that provides the length of this forward.

NOTE: In contrast to event's value attribute, the duration attribute is not constrained to a single note value, but may be a multiple of one.

5.2.5.5. The grace element
Contexts:
sequence, tuplet
Content Model:
Metadata content
Sequence content
Interpretation content
Attributes:
type - type of included grace notes
slash - flag indicating rhythmic character of grace notes

The grace element represents a run of events that are performed in a subordinate relationship to the surrounding non-grace events.

The type attribute describes the kind of grace notes included in this element. Values include:

steal-previous (default)

The run of grace notes occupies a time interval that ends before the expected onset of the next non-grace event, shortening the duration of the preceding non-grace event.

steal-following

The run of grace notes occupies a time interval starting at the expected onset of the next non-grace event, both delaying its onset and shortening its duration.

make-time

The run of grace notes delays the onset of the next non-grace event.
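The three type values differ only in which neighboring event gives up time. The following non-normative sketch makes that concrete. It assumes that the performed length of the grace run (grace_len) is chosen by the implementation, since this text does not define it, and that a make-time run begins at the expected onset of the next event. The function name apply_grace is hypothetical.

```python
from fractions import Fraction

def apply_grace(kind, prev, nxt, grace_len):
    """Adjust the neighbors of a grace run.

    prev and nxt are (onset, duration) pairs for the preceding and following
    non-grace events. Returns (grace_start, adjusted_prev, adjusted_next).
    """
    prev_onset, prev_dur = prev
    next_onset, next_dur = nxt
    if kind == "steal-previous":
        # The run ends at the expected onset of the next event.
        return next_onset - grace_len, (prev_onset, prev_dur - grace_len), nxt
    if kind == "steal-following":
        # The run starts at the expected onset; the next event is delayed and shortened.
        return next_onset, prev, (next_onset + grace_len, next_dur - grace_len)
    if kind == "make-time":
        # The next event is delayed but keeps its full duration.
        return next_onset, prev, (next_onset + grace_len, next_dur)
    raise ValueError(f"unknown grace type: {kind}")

quarter = Fraction(1, 4)
before = (Fraction(0), quarter)
after = (quarter, quarter)
for kind in ("steal-previous", "steal-following", "make-time"):
    print(kind, apply_grace(kind, before, after, Fraction(1, 16)))
```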

The slash attribute specifies whether grace notes are notated with a slash or not. The default value of yes specifies a slash, indicating that the grace notes are displayed with a diagonal stroke and are to be performed quickly and not in their notated rhythm. Otherwise, they are performed with the notated note values according to the performance characteristics given by type.

Direction content within a grace element that lacks a location attribute is considered to be in alignment with the following event, even though grace notes in the same run technically share the same measure location.

The following example illustrates a run of two grace notes leading up to a quarter note.

<sequence>
    ...preceding event content...
    <grace>
      <event value="/8">...</event>
      <event value="/8">...</event>
    </grace>
    <event value="/4">...</event>
    ...following event content...
</sequence>

5.2.6. Event content

Event content comprises elements that describe the musical content of a single event that is performed at a distinct time.

5.2.6.1. The note element
Contexts:
event
Content Model:
Metadata content
Note content
Liaison content
Attributes:
pitch - the musical pitch of this note
staff - an optional staff index for this note
accidental - an optional accidental for this note
value - an optional note value for this note

The note element defines a single note within an event, along with other information pertaining to the note itself rather than to its containing event.

The pitch attribute supplies the written pitch of the note as a chromatic pitch, using the rules for parsing a chromatic pitch.

Optionally, the staff attribute supplies a staff index for this note where this differs from the staff index which applies to the containing event as a whole.

The accidental attribute supplies an accidental value for this note. In the standard MNX-Common profile, this attribute must match the alteration of the pitch attribute. Omission of the attribute indicates that no accidental is to be displayed. The special value auto indicates that a consumer application should determine the proper accidental based on musical context.

(Import values here from MusicXML specification. Add style properties for editorial indications. Handle the case of "explicit" accidentals that should be preserved even if the chromatic context is changed by edits.)

The following features are not supported by the standard MNX-Common profile.

The value attribute optionally supplies a note value for this note, where this differs from the containing event's note value. The value of the note must be less than or equal to the value of the containing event.

5.2.6.2. The rest element
Contexts:
event
Content Model:
Metadata content
Attributes:
pitch - the musical pitch to which this rest should be visually registered

The rest element defines a rest within an event, along with other information pertaining to the rest rather than to its containing event.

If the pitch attribute is provided, it mandates that the rest be placed on the staff line corresponding to the provided chromatic pitch. The accidental component of the pitch is ignored for this purpose.

5.2.6.3. The articulations element
Needs migration from MusicXML.
5.2.6.4. The lyric element
Needs migration from MusicXML.
5.2.6.5. The ornaments element
Needs migration from MusicXML.
5.2.6.6. The technical element
Needs migration from MusicXML.

5.2.7. Note content

Note content comprises elements that describe the musical nature of a single note within an event.

5.2.7.1. The notehead element
Needs migration from MusicXML.
5.2.7.2. The fret element
Needs migration from MusicXML.
5.2.7.3. The string element
Needs migration from MusicXML.

5.2.8. Liaison content

Liaison content comprises elements that describe the liaison or connection between a single note within an event, and some other note.

5.2.8.1. Liaison attributes

Liaison elements share a set of common liaison attributes in an attribute group.

Attributes:
target - the optional element ID of the note at which the liaison ends

Liaisons in general must be provided with the ID of a note on which they end, given via the target attribute. The constraints on this attribute vary from one liaison to another.

5.2.8.2. The tied element
Contexts:
note
Content Model:
Metadata content
Attributes:
location - an optional measure location to end at

The tied element is used to indicate that a note is tied to a successor note element.

If the target attribute is provided, it must specify the element ID of a note which lies in the same part, and whose containing event begins directly after the end of this note’s event. The step, alteration and octave values of the target note must be identical to those of this note.

If target is not provided, the tie does not connect to a particular destination note. In this case the location attribute may be used to specify the measure location of the other end of the tie.

The value of location may either lie before or after the current note’s event, or may assume the special values incoming or outgoing. The handling is as follows:

  • If the location precedes the start of the current event, this signifies a tie starting at the given location and ending on the current note.

  • If the location occurs after the start of the current event, this signifies a tie starting at the current note and ending at the given location.

  • A value of incoming places the start location at a conventionally short distance before the current note, and ends on the current note.

  • A value of outgoing starts on the current note, and places the end location at a conventionally short distance after the current note.

If neither target nor location is given, the effect is as if outgoing were specified.

Note: If a producer implementation does not give an explicit value for target, this always signifies an unmatched tie. In this case consumer implementations must not search for a matching end note of the same pitch.
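The handling rules for target and location can be summarized as a small decision procedure. The following sketch is non-normative and assumes that measure locations have already been parsed into comparable numbers. The function name resolve_tie and its return shape are hypothetical. The text above does not say what happens when location coincides with the start of the current event, so the sketch rejects that case rather than guess.

```python
from fractions import Fraction

def resolve_tie(event_start, target=None, location=None):
    """Classify a tied element as (kind, other_end).

    kind is "matched", "incoming" or "outgoing". other_end is the target ID,
    a measure location, or None for a conventionally short unmatched tie.
    """
    if target is not None:
        return ("matched", target)
    if location is None or location == "outgoing":
        # No target and no location behaves like an explicit outgoing value.
        return ("outgoing", None)
    if location == "incoming":
        return ("incoming", None)
    if location < event_start:
        return ("incoming", location)   # tie starts at location, ends on this note
    if location > event_start:
        return ("outgoing", location)   # tie starts on this note, ends at location
    raise ValueError("location equals the event start; this case is not defined here")

print(resolve_tie(Fraction(1, 4), target="note2"))
print(resolve_tie(Fraction(1, 4), location=Fraction(1, 2)))
print(resolve_tie(Fraction(1, 4)))
```

Note that the sketch never searches for a destination note when target is absent, in keeping with the requirement on consumer implementations.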

Examples:

<sequence>
  <event value="/4">
    <note pitch="C4">
      <tied target="note2"/>
    </note>
  </event>
  <event value="/4">
    <note id="note2" pitch="C4"/>
  </event>
</sequence>
5.2.8.3. The arpeggiate element
Needs migration from MusicXML. In this case the liaison target will be constrained to lie within the same event.
5.2.8.4. The glissando element
Needs migration from MusicXML.
5.2.8.5. The slide element
Needs migration from MusicXML.
5.2.8.6. The bend element
Needs migration from MusicXML, and probably some simplification and better handling of pre-bends.
5.2.8.7. The hammer-on element
Needs migration from MusicXML.
5.2.8.8. The pull-off element
Needs migration from MusicXML.

5.2.9. Direction content

Direction content consists of some number of musical directions that modify or accompany the performance of events in one or more measures. Directions may be included in an MNX-Common score in two ways:

  • Within a directions element below a measure. The direction is considered to apply to all sequences.

  • Within a directions element below a sequence. The direction applies only to the content within this sequence.

5.2.9.1. Direction attributes

Directions share a set of common direction attributes in an attribute group.

Attributes:
location - the measure location of the direction
staff - an optional staff index
orient - an optional orientation

All directions within a directions parent element must be given an explicit measure location by supplying a location attribute. The default measure location is zero.

Conversely, directions occurring within sequence content must omit this attribute as their location is determined during the procedure of sequencing the content.

The optional staff attribute designates the staff index to which this direction applies, if such a designation makes sense. If not provided, the staff index is inherited from any sequence ancestor which specified it. If no ancestor did so, it is determined automatically according to the implementation’s rendering rules.

The optional orient attribute provides a specific orientation for this direction. If not provided, the orientation is inherited from any sequence ancestor which specified it. If no ancestor did so, it is determined automatically according to the implementation’s rendering rules.

5.2.9.2. The dynamics element
Contexts:
Direction content
Content Model:
None
Attributes:
type - the semantic nature of this dynamics direction
Attribute Groups:
Direction attributes

The dynamics element describes a textual dynamic direction to the performer.

The type attribute describes the nature of the dynamic. The following dynamics are supported:

p, pp, ppp, pppp, ppppp, pppppp,
f, ff, fff, ffff, fffff, ffffff,
mp, mf, sf, sfp, sfpp, fp, rf, rfz, sfz, sffz, fz, n, pf, sfzp
5.2.9.3. The instruction element
Contexts:
Direction content
Content Model:
The text of this direction.
Attribute Groups:
Direction attributes
The instruction element describes a run of text that is presented in a visual style consistent with general instructions to the performer.
<measure>
  <directions>
    <instruction>Molto adagio</instruction>
  </directions>
  ...following measure content...
</measure>
5.2.9.4. The expression element
Contexts:
Direction content
Content Model:
The text of this direction.
Attribute Groups:
Direction attributes

The expression element describes a run of text that is presented in a visual style consistent with describing musical expression.

<sequence>
  <directions>
    <dirgroup location="0">
      <expression>subito</expression>
      <dynamics type="p"/>
    </dirgroup>
  </directions>
  ...following sequence content...
</sequence>
5.2.9.5. The ending element
Needs migration from MusicXML. Applies from the start of the containing measure (i.e. does not belong to a <barline>)
5.2.9.6. The repeat element
Needs migration from MusicXML. Applies to the containing measure (i.e. does not belong to a <barline>). Thus, "start" means a repeat starting from the beginning of this measure; "end" means a repeat ending at the end of this measure.
5.2.9.7. The coda element
Needs migration from MusicXML. May need to distinguish musical form use from textual reference to the symbol.
5.2.9.8. The segno element
Needs migration from MusicXML. May need to distinguish musical form use from textual reference to the symbol.
5.2.9.9. The harmony element
Needs migration from MusicXML.
5.2.9.10. The symbol element
Needs migration from MusicXML -- arbitrary SMuFL glyph. (How will visual registration work?)
5.2.9.11. The dirgroup element
Contexts:
Direction content
Content Model:
Direction content, which must not include measure locations
Attribute Groups:
Direction attributes
The dirgroup element describes a set of directions which are presented sequentially, like a sentence. The typical presentation of directions in a group is to arrange them horizontally from left to right.

TBD: spacing properties

5.2.10. Spanning directions

Spanning directions are directions whose temporal extent within a part is characterized by a span.

5.2.10.1. Span attributes

Spanning directions share a set of common span attributes in an attribute group.

Attributes:
end - the measure location at which the direction ends

Spanning directions MUST be given a measure location as their endpoint, by supplying an end attribute. This measure location must lie within the same run of measure content as the location given for the start of the spanning direction.

5.2.10.2. The slur element
5.2.10.3. The wedge element
Contexts:
Direction content
Content Model:
None
Attributes:
type - the type of wedge
Attribute Groups:
Direction attributes
Span attributes

The type attribute describes the nature of the wedge:

diminuendo
The wedge represents a decrease in dynamic level.
crescendo
The wedge represents an increase in dynamic level.
5.2.10.4. The cresc element
Contexts:
Direction content
Content Model:
None
Attributes:
text - optional text to be displayed
Attribute Groups:
Direction attributes
Span attributes

The cresc element represents a crescendo or increasing dynamic level over the course of a span, notated as text followed by a dashed line.

The optional text attribute overrides the text to be displayed in the score for this element.

5.2.10.5. The dim element
Contexts:
Direction content
Content Model:
None
Attributes:
text - optional text to be displayed
Attribute Groups:
Direction attributes
Span attributes

The dim element represents a diminuendo or decreasing dynamic level over the course of a span, notated as text followed by a dashed line.

The optional text attribute overrides the text to be displayed in the score for this element.

5.2.10.6. The pedal element
Details TBD.
5.2.10.7. The bracket element
TBD: This should be separated into distinct elements with their own semantics. Brackets in MusicXML are too much of a catch-all.

Other spans exist in MusicXML and need to be migrated.

5.2.11. Staff directions

Staff directions are directions that apply as a whole to the one or more musical staves in a part, and which determine the interpretation of other notations within some applicable range of those staves.

Because it delineates disjoint ranges of staves, any staff direction has the effect of partitioning the events in a measure, such that all events lie either before or after the given directions. For example, consider a clef element describing a clef change. No matter how many polyphonic voices exist in a measure, all notes in all voices either lie to the left or to the right of this clef.

Staff directions that modify the interpretation or layout of the staff apply from the start of the measure in which they occur to all subsequent measures within the same global or part element, until changed.

Most staff directions have a fixed location, typically the beginning or end of the measure in which they occur.

5.2.11.1. The key element
Contexts:
Direction content
Content Model:
None
Attributes:
fifths - the transposition from concert pitch in fifths

The key element defines a key signature applicable to this and all following measure content, until changed.

The measure location of a key direction is ignored and is assumed to be zero.

The staff index of a key direction is ignored, as the direction always applies to all the staves in a given part (if not the whole score).

It is invalid to place more than one key element within a measure.

In the standard MNX-Common profile, key elements in global measure content are required for every key change. key elements below part are optional; if present they must indicate a value of fifths that is either identical to the corresponding value in the global key or differs from it by a multiple of 12.

The required fifths attribute is a valid integer which supplies the distance, in fifths, from a key signature with no accidentals.
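The profile constraint relating a part-level key to the global key reduces to a check on the difference of the two fifths values. The following non-normative sketch illustrates it. The function name part_key_is_consistent is hypothetical, and the example values assume the MusicXML convention that positive fifths values denote sharps and negative values denote flats, which this text does not state.

```python
def part_key_is_consistent(global_fifths, part_fifths):
    """Return True if a part key is identical to the global key or differs
    from it by a multiple of 12 fifths (an enharmonic respelling)."""
    return (part_fifths - global_fifths) % 12 == 0

# Under the assumed convention, 5 is B major and -7 is C-flat major.
print(part_key_is_consistent(5, -7))
print(part_key_is_consistent(2, 2))
print(part_key_is_consistent(0, 1))
```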

5.2.11.2. The time element
Contexts:
Direction content
Content Model:
None
Attributes:
signature - the displayed time signature
measure - a time signature which describes the content of the current measure only, and which is not displayed

The time element defines a time signature applicable to this and following measures, until changed.

The measure location of a time direction is ignored and is assumed to be zero.

It is invalid to place more than one time element within a measure.

In the standard MNX-Common profile, a time element may only appear inside global measure content, not within a part.

The required signature attribute supplies the time signature for this measure and for subsequent measures. By default this signature is displayed in normative rendering.

Here is an example of a 4/4 time signature:

<global>
  <measure>
    <directions>
      <time signature="4/4"/>
    </directions>
    ...
  </measure>
  ...
</global>

Optionally, the measure attribute may be used to override the notated time signature for the current measure only, where this value differs from signature. This is of particular use for anacruses and for shortened measures prior to a repeat or jump back to an anacrusis.

For example, here is a 3/4 time signature beginning with an anacrusis or pickup measure containing a single beat:

<global>
  <measure>
    <directions>
      <time signature="3/4" measure="1/4"/>
    </directions>
    ...this pickup measure contains only 1 beat...
  </measure>
  <measure>
    ...the following measure contains 3 beats...
  </measure>
  ...
</global>

Note: The measure attribute must be provided in all cases where the actual content of a measure is of a different length from that indicated by signature. MNX does not require consumer implementations to examine the contents of measures to determine their intended length.
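Because the intended length of a measure never has to be inferred from its contents, a consumer can derive it from the time element alone. The following non-normative sketch shows one way to do so. It assumes that both attributes use the simple count/unit form seen in the examples above. The function name measure_length is hypothetical.

```python
from fractions import Fraction

def measure_length(signature, measure=None):
    """Return the expected length of a measure in whole-note units.

    The measure attribute, when present, overrides signature for the
    current measure only.
    """
    text = measure if measure is not None else signature
    count, unit = text.split("/")
    return Fraction(int(count), int(unit))

print(measure_length("3/4", "1/4"))  # the pickup measure
print(measure_length("3/4"))         # the measures that follow
```

The result can then be compared with the total produced by sequencing the measure's content when validating metrical correctness.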

5.2.11.3. The tempo element
Contexts:
Direction content
Content Model:
None.
Attributes:
value - the note value of the tempo
bpm - the number of beats per minute

The tempo element defines a tempo, from the point of its occurrence forward until changed. The tempo is rendered as a conventional pairing of a small beat unit notation equated to a metronome marking.

In the standard MNX-Common profile, a tempo element may only appear inside global measure content, not within a part.

The required value attribute supplies the note value that serves as the beat unit of the tempo. The bpm attribute supplies the number of times per minute that this note value is asserted to occur, as a valid floating-point number.
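The relationship between value and bpm determines the performed length of any other note value. The following non-normative sketch shows the conversion, assuming note values expressed as fractions of a whole note. The function name seconds_for is hypothetical.

```python
from fractions import Fraction

def seconds_for(note_value, beat_value, bpm):
    """Return the performed length in seconds of note_value when
    beat_value occurs bpm times per minute."""
    beats = note_value / beat_value   # how many beat units the note spans
    return float(beats) * 60.0 / bpm

quarter = Fraction(1, 4)
print(seconds_for(quarter, quarter, 60))
print(seconds_for(Fraction(1, 2), quarter, 60))
```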

To notate tempi using arbitrary text, various approaches may be taken, singly or in combination:

5.2.11.4. The staves element
Contexts:
Direction content
Content Model:
None
Attributes:
number - the number of staves in this part

The staves element defines the number of staves in a part’s measures, from the point of its occurrence forward until changed.

The measure location of a staves direction is ignored and is assumed to be zero.

It is invalid to place more than one staves element within a measure.

A staves element may only appear inside part measure content, not within global.

The required number attribute supplies the number of staves in the part, as a valid positive integer.

If no staves element is encountered at the start of a part, the number of staves is taken as 1.

5.2.11.5. The clef element
Contexts:
Direction content
Content Model:
None.
Attributes:
line - the staff line associated with the clef sign
sign - the clef sign
octave - an optional number of octaves by which the clef’s normal pitches should be transposed
Attribute Groups:
Direction attributes

The clef element defines a clef associated with this staff.

The required line attribute gives the staff line for the clef symbol, counting upwards from 1 starting at the bottom line of the staff.

The required sign attribute gives the clef symbol and may assume the following values:

G
G (treble) clef
F
F (bass) clef
C
C clef
percussion
Percussion clef
jianpu
Jianpu clef

Is the none value from MusicXML needed? Why?

The optional octave attribute is a signed integer giving the number of octaves by which the pitches normally indicated by the given clef sign should be transposed. It defaults to zero.

5.2.11.6. The staff-details element

Details TBD.

Describes the nature of a particular staff; applies to this staff in all succeeding measure content in the part until changed. Expected to resemble MusicXML’s corresponding element.

5.2.11.7. The barline element
Only describes a barline explicitly placed elsewhere than the end of a measure. Barlines at the end of a measure are more simply described by the barline attribute.
5.2.11.8. The use-instrument element

Details TBD - applies a specific instrument sound to the staff as a whole.

5.2.11.9. The transpose element
Details TBD - determines the transposition applicable to all staves in the part.

Other staff direction elements exist in MusicXML and need to be migrated.

5.2.12. Part description content

Part description content consists of elements that supply information describing a part, and occur at the beginning of a part element.

5.2.12.1. The part-name element
5.2.12.2. The part-abbreviation element
5.2.12.3. The instrument-sound element

5.2.13. Interpretation content

Interpretation content may be included in some elements to control the specific way in which the element is interpreted as a musical performance.

5.2.13.1. The interpret element
Content Model:
Any number of performance-event or performance-tempo elements

The interpret element substitutes explicit MNX-Generic performance data for the performance of its MNX-Common parent element that a producer would normally generate.

Within interpret, a set of child performance-event and performance-tempo elements supply this performance information. The notated time units for all such events are equal to the same units used by the containing element to represent note values, and are thus modified in the case of elements for which the time modification ratio is not unity, such as tuplet.

The notated time coordinate of zero refers to the measure location of the containing element. All performance event times within interpret are thus relative to this origin. Negative event times are permitted, and specify performance events which precede the location of the containing element.

When interpret occurs within an event or note element, all performance-event attributes are defaulted to the values that would be generated by an implementation for the first note in the containing element.

As one example, consider this use of interpret to play a beamed pair of eighth notes in a swung triplet rhythm instead:

<beamed>
  <event value="/8">
    <note pitch="C4"/>
  </event>
  <event value="/8">
    <note pitch="D4"/>
  </event>
  <interpret>
    <performance-event start="0" duration="1/6" pitch="C4"/>
    <performance-event start="1/6" duration="1/12" pitch="D4"/>
  </interpret>
</beamed>

This scenario does not require the use of a beamed group; without the group, the interpretation could be expressed at a per-event level as in the following example. Note that the pitch attributes no longer require specification, and that the time origin for the second note now relates to its default onset time.

<sequence>
  ...preceding sequence content...
  <event value="/8">
    <note pitch="C4"/>
    <interpret>
      <performance-event start="0" duration="1/6"/>
    </interpret>
  </event>
  <event value="/8">
    <note pitch="D4"/>
    <interpret>
      <performance-event start="1/24" duration="1/12"/>
    </interpret>
  </event>
  ...following sequence content...
</sequence>
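The two swing examples above describe the same performance from different time origins. The following non-normative sketch checks that the per-event start values follow from the group-level ones, assuming that all times are fractions of a whole note and that each event's default onset is the sum of the preceding note values. The function name per_event_starts is hypothetical.

```python
from fractions import Fraction

def per_event_starts(group_starts, default_onsets):
    """Convert start times relative to a containing group into start times
    relative to each event's own default onset."""
    return [start - onset for start, onset in zip(group_starts, default_onsets)]

# Group-level performance events from the beamed example: (start, duration).
swung = [(Fraction(0), Fraction(1, 6)), (Fraction(1, 6), Fraction(1, 12))]
# Default onsets of two consecutive eighth notes.
defaults = [Fraction(0), Fraction(1, 8)]

relative = per_event_starts([start for start, _ in swung], defaults)
total = sum(duration for _, duration in swung)
print(relative, total)
```

The second note begins one sixth of a whole note after the group starts, which is 1/24 later than its notated onset of one eighth. The two performed durations still total one quarter note, the same span as the two notated eighths.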

As a final example, this case uses an instruction element to specify a tempo via an explicit interpretation:

<sequence>
  <directions>
    <instruction class="tempo">
      <interpret>
        <performance-tempo beat="/4" bpm="60"/>
      </interpret>
      Langsamer
    </instruction>
  </directions>
  ...following sequence content...
</sequence>

Note: The measure location of a containing element is not always the same as the time at which its first event is typically performed. Consider the handling of grace notes, for example.

5.2.14. Synchronization content

The following elements describe the synchronization of a semantic CWMN score with one or more audio media representing recordings of the score.

5.2.14.1. The score-audio element
Contexts:
mnx-common
Content Model:
One or more score-audio-media elements.
Zero or one score-audio-mapping elements.

The score-audio element defines one or more audio media files that constitute a single recording of a performance of the score, and whose contents are presumed to be temporally synchronized with each other.

An optional score-audio-mapping element, if given, establishes a mapping between the audio media and the semantic content of the score.

5.2.14.2. The score-audio-media element
Contexts:
score-audio
Content Model:
Metadata content.
Attributes:
src - URL of an audio file of the score

The score-audio-media element includes an audio media file, via the URL provided in the src attribute.

5.2.14.3. The score-audio-mapping element
Contexts:
score-audio
Content Model:
Zero or more score-audio-region elements.
Attributes:
system - an optional reference to a global element whose measures support the mapping

The score-audio-mapping element defines a sequence of disjoint, monotonically increasing time ranges within a specific recording. Each such range corresponds to a range of notated time within the semantic score.

Note: Semantic score ranges are not required to be disjoint nor monotonically increasing, due to the possibility of repeats and form jumps.

Each score-audio-region element describes the mapping of a single absolute audio media time range to a single notated time range. The elements must occur in forward time order in audio media time: the end value of each region must be less than or equal to the start value of the next region.

system, if present, gives the ID of the global element whose measures will be used as references for the synchronization mappings. If absent, the first global element is used.

5.2.14.4. The score-audio-region element
Contexts:
score-audio-mapping
Content Model:
None.
Attributes:
start - audio start time of the time region being mapped (inclusive)
end - audio end time of the time region (exclusive)
score-start - measure location at which the mapped region of the score starts (inclusive)
score-end - measure location at which the mapped region of the score ends (exclusive)

The score-audio-region element describes the relationship between a media time region expressed as a pair of offsets in seconds, and a contiguous region of the semantic score expressed as a pair of measure locations. Each range is considered half-open: the range includes the start point, and all intermediate times up to and excluding the end point.

The range within the semantic score is required to be a contiguous sequence of measures as specified within the score, with no gaps: if the range spans more than one measure, then it includes all intermediate measures, in forward measure index order (i.e. in their notated sequence, rather than in any presumed performance order involving repeats or form jumps). Consequently, each form jump in a performance must initiate a new score-audio-region within the parent score-audio-mapping.

start gives the start of the region within the audio media as an offset in seconds.

end gives the end of the region within the audio media as an offset in seconds. Its value must be greater than the value of start.

score-start gives the start of the region within the semantic score as a measure location, in terms of the measure indices defined by system.

score-end gives the end of the region within the semantic score as a measure location, in terms of the measure indices defined by system. Its value must logically follow the location within the score given by score-start.

Note: Gaps in mapping coverage for either the audio or the semantic score are allowed, and indeed are meaningful. Audio gaps represent portions of the recording where no music is being played; semantic score gaps represent portions of the score that were not performed in the recording.
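
As a non-normative illustration of the half-open ranges and coverage gaps described above, the following Python sketch looks up the region that contains a given audio time. The name region_at, the tuple layout and the sample values are assumptions made for this sketch only; they are not defined by MNX. The regions are assumed to be in forward audio-time order already.

```python
from bisect import bisect_right

# Each region: (start, end, score_start, score_end); audio times in seconds.
# Illustrative values only. Note the deliberate gap between 2.98 and 3.53.
REGIONS = [
    (1.43, 2.43, "1:0", "1:1"),
    (2.43, 2.98, "2:0", "2:0.5"),
    (3.53, 7.53, "3:0", "6:1"),
]

def region_at(regions, t):
    """Return the region whose half-open range [start, end) contains t.

    Returns None when t falls in a gap, i.e. a portion of the recording
    that is not mapped to any part of the score.
    """
    starts = [r[0] for r in regions]
    i = bisect_right(starts, t) - 1  # last region starting at or before t
    if i >= 0 and t < regions[i][1]:
        return regions[i]
    return None
```

Because each range excludes its end point, a time of exactly 2.43 belongs to the second region rather than the first, and a time inside the gap yields no region at all.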

The following example illustrates a possible recording and its synchronization to a score. The score contains 7 notated measures, while the audio contains 10 performed measures reflecting a 4-bar repeat with two endings, beginning at measure 3. The performance order in the recording is thus 1, 2, 3, 4, 5, 6, 3, 4, 5, 7, for a total of 10 performed measures. The tempo slows during the performance of the latter half of measure 2, and remains slower for the remainder of the recording.

<mnx-common>
  <global id="global1">...</global>
  <part>...</part>
  <part>...</part>
  <score-audio>
     <score-audio-media src="recording.mp4"/>
     <score-audio-mapping system="global1">
       <score-audio-region start="1.43" end="2.43" score-start="1:0" score-end="1:1"/>
       <score-audio-region start="2.43" end="2.98" score-start="2:0" score-end="2:0.5"/>
       <score-audio-region start="2.98" end="3.53" score-start="2:0.5" score-end="2:1"/>  ...slowing down...
       <score-audio-region start="3.53" end="7.53" score-start="3:0" score-end="6:1"/>    ...1st repeat + ending...
       <score-audio-region start="7.53" end="10.53" score-start="3:0" score-end="5:1"/>   ...2nd repeat...
       <score-audio-region start="10.53" end="11.53" score-start="7:0" score-end="7:1"/>  ...2nd and final ending.
     </score-audio-mapping>
  </score-audio>
</mnx-common>

5.3. Style properties

Style properties may be included in many elements to control the specific way in which the element is rendered. Each property is defined as a key-value pair.

Style properties are applied to each MNX semantic element according to the following procedure for style property computation:

  1. Apply the property values from each style selector definition whose style selector rule matches the semantic element. All definitions with a matching rule are applied in the order that they were encountered in processing of the document.
  2. For each class attribute belonging to the semantic element, in the order of occurrence, apply the property values from each style class definition whose class name matches the class attribute. All definitions with a matching name are applied in the order that they were encountered in processing of the document.
  3. For each style attribute belonging to the semantic element, apply its property name-value pairs in the order of their processing in the given style property list.
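
The three steps above can be sketched in Python as follows. This is a minimal, non-normative illustration under stated assumptions: a style selector rule is taken to be a bare element name (as in rule="event"), a style property list is taken to be a semicolon-separated list of "name: value" pairs, and the function names parse_style and compute_style are hypothetical. Inheritance from ancestor elements is not modelled.

```python
def parse_style(text):
    """Parse an assumed 'name: value; name: value' list into a dict."""
    props = {}
    for declaration in text.split(";"):
        if ":" in declaration:
            name, value = declaration.split(":", 1)
            props[name.strip()] = value.strip()
    return props

def compute_style(element_name, class_attr, style_attr, selectors, classes):
    """Compute the style properties of one element.

    selectors: list of (rule, style) pairs in document order.
    classes:   list of (name, style) pairs in document order.
    Later assignments override earlier ones for the same property.
    """
    computed = {}
    # Step 1: style selector definitions whose rule matches the element.
    for rule, style in selectors:
        if rule == element_name:
            computed.update(parse_style(style))
    # Step 2: style class definitions, following the order of the class tokens.
    for token in class_attr.split():
        for name, style in classes:
            if name == token:
                computed.update(parse_style(style))
    # Step 3: the element's own style attribute is applied last.
    computed.update(parse_style(style_attr))
    return computed
```

Under these assumptions, an event matched by a selector, carrying two classes and a local style attribute, ends up with the locally specified color, while properties set only by a class or selector survive unchanged.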

Properties are documented in the following places:

5.3.1. The style attribute

The style attribute supplies the value of one or more explicit style properties which apply to its parent element. The attribute value must be a valid style property list.

Here’s an example of a style property definition adding color to a note:

<note pitch="C4" style="color: #000099;"/>

5.3.2. The class attribute

The class attribute may be used on any MNX element, and supplies the value of a style class definition which applies to that element as per the rules of style property computation.

The value of this attribute supplies the names of one or more style class definitions which apply to the containing element as an ordered set of space-separated tokens. All style property values supplied by each class definition are applied to the element, in the order in which they were defined.

For example, the following applies a class named emphasis to a note:

<note class="emphasis" pitch="C4"/>

In this case, two different classes emphasis and alternate are applied, along with a local overriding color:

<note class="emphasis alternate" style="color: blue;" pitch="C4"/>

5.3.3. Common style properties

The following style properties apply to many kinds of object in MNX-Common, and in general apply to all descendants of the element within which they are specified.

5.3.3.1. The color property
Applies to:
part
Sequence content
Direction content
Event content
Note content
Liaison content
Value:
Simple color
Inherited:
yes
The color property specifies a simple color to be used to render notational objects at or below the level of the element.

Simple colors as per the HTML5 spec don’t support an alpha property, so perhaps we should adopt a separate syntactical definition here.

5.3.3.2. The smufl-font property
Applies to:
part
Sequence content
Direction content
Event content
Note content
Liaison content
Value:
A SMuFL font name
Inherited:
yes
The smufl-font property specifies the name of a SMuFL font to be used to render notational objects at or below the level of the element.
5.3.3.3. The glyph property
Applies to:
Direction content
Note content
Value:
SMuFL glyph name
Inherited:
no
The glyph property specifies a specific SMuFL glyph name to be used to render a notational object.
5.3.3.4. The display property
Applies to:
part
Sequence content
Direction content
Event content
Note content
Liaison content
Value:
normal | none
Inherited:
yes

The display property controls the way in which an element and its descendants interact with the layout of the document.

Permitted values include:

Other values are likely, making this property into more than simply a way of hiding content.

5.3.3.5. The visibility property
Applies to:
Sequence content
Direction content
Event content
Note content
Liaison content
Value:
visible | hidden
Inherited:
yes

The visibility property controls whether an element’s contents (including all of its descendants) are displayed by a consumer or not.

In contrast to display with a value of none, a visibility value of hidden does not affect the layout of any other elements in the document. For example, the contents of a hidden event will not be shown, but the place where the event would have appeared will still occupy space in the containing measure.

Permitted values include:

5.3.3.6. The perform property
Applies to:
Sequence content
Direction content
Event content
Note content
Liaison content
Value:
normal | none
Inherited:
yes
The perform property controls whether an element’s contents (including all of its descendants) are performed by a consumer or not.

Permitted values include:

5.3.3.7. The x property
Applies to:
Direction content, Sequence content
Value:
staff position
Inherited:
no

The x property places the horizontal anchor point of an event or direction at a given graphical offset relative to its specified measure location, given as a valid floating-point number of staff lines starting from the measure location’s assigned position in the layout and proceeding to the right in a positive direction.

5.3.3.8. The end-x property
Applies to:
Direction content
Value:
staff position
Inherited:
no

The end-x property places the horizontal anchor point of the end of a spanning direction at a given graphical offset relative to its specified measure location, given as a valid floating-point number of staff lines starting from the measure location’s assigned position in the layout and proceeding to the right in a positive direction.

5.3.3.9. The y property
Applies to:
Direction content
Value:
staff position | above | below
Inherited:
no
The y property places a direction at a given staff position given as a valid floating-point number of staff lines starting from the center of the top staff line and proceeding downwards in a positive direction.

The special values above and below delegate exact positioning of the direction to the implementation and request that the direction be placed respectively above or below the staff.

TBD: describe rendering model and registration

5.3.3.10. The location property
Applies to:
Direction content, Sequence content
Value:
location
Inherited:
no

For layout purposes only, the location property overrides the measure location of an event or direction determined by sequencing the content, replacing it with the value location expressed as a measure location within the containing measure.

This is useful for forcing the positioning or alignment of objects that would otherwise be placed elsewhere, without otherwise altering their semantics.

5.3.3.11. The stem-direction property
Applies to:
Event content
Value:
up | down
Inherited:
no

The stem-direction property controls the direction of any rendered stem associated with this event. If omitted, the stem direction is determined automatically by the implementation, in accordance with the orientation of the event.

The value up causes an event’s stem to be rendered pointing upwards.

The value down causes an event’s stem to be rendered pointing downwards.

5.3.3.12. The grace-slash property
Applies to:
grace
Value:
yes | no
Inherited:
no

The grace-slash property controls whether or not a slash is rendered in conjunction with the flag or beam of a grace note. If omitted, the presence of a slash is determined automatically by the implementation based on the nature of the grace note.

Many other style properties exist in MusicXML (although not as styles per se) and will need to be migrated.

5.4. Stylesheet definitions

The stylesheet element may be used to define style properties through rules that are applied uniformly to other elements, either by name matching in a style class definition or by algorithmic rule matching in a style selector definition. Taken together, these supply a set of stylesheet definitions that control the rendering and interpretation of the document.

These definitions may be placed in the head, the mnx-common element, or in a separate linked stylesheet.

5.4.1. The stylesheet element

Contexts:
head, mnx-common
Content Model:
Stylesheet definitions.

The stylesheet element serves to place stylesheet definition elements under a single container to support clean document organization and validation.

5.4.2. The style-class element

Contexts:
stylesheet
Content Model:
None.
Attributes:
name - the name of this style class definition
style - the style property list applied by this definition

The style-class element supplies a style class definition, which associates a list of style property values with a class that can be referenced elsewhere using its name alone, as per the rules of style property computation.

The name attribute supplies the name of this style class definition; the style attribute supplies the properties that make up the content of the definition.

Multiple occurrences of style-class are permitted to share the same value of name. These are equivalent to a single occurrence of style-class with the same constituent definitions of style property values in the same order.

Here is a style class definition that creates a class called emphasis, intended to color its target objects bright red and apply a thicker stem width in the case of events:

<style-class name="emphasis" style="color: #FF0000; stem-width: 0.05;"/>

5.4.3. The style-selector element

Contexts:
stylesheet
Content Model:
None.
Attributes:
rule - a set of element names to which this style selector definition applies
style - the style property list applied by this definition

The style-selector element supplies a style selector definition, which defines a list of style property values applying to all elements matching a style selector rule, as per the rules of style property computation.

The rule attribute supplies a style selector rule that is automatically applied to all semantic MNX elements to determine a set of implied style properties. The following rule syntax is supported:

The style attribute supplies the properties that make up the content of the rule.

Multiple occurrences of style-selector are permitted to share the same value of rule. These are equivalent to a single occurrence of style-selector with the same constituent definitions of style property values in the same order.

Note: The scope of a defined rule is not limited to descendants of the element in which the style-selector element occurs, but is global to the entire document.

For example, the following specifies that all event elements in the document will employ a given stem width:

<style-selector rule="event" style="stem-width: 0.05;"/>

5.5. Rendering

TBD: section describing normative MNX-Common rendering procedure, leaving room for implementation decisions. The intent is to set out the normative constraints that MNX-Common rendering must follow, including at least:

  • normative registration

5.6. Interpretation

TBD: section describing normative MNX-Common performance interpretation, leaving room for implementation decisions.

6. MNX-Generic

MNX-Generic is a general format for representing musical scores in terms of linked graphical media, audio media and performance data.

In contrast to MNX-Common, there is no attempt to represent semantics directly in MNX-Generic. Thus, MNX-Generic can be described as a low-level, literal format that represents instances of scores, rather than their semantic content. MNX-Generic is intended to support applications which must be able to faithfully execute a visual and/or audible rendition of a score, with an awareness of the relationship between what is seen and what is heard.

MNX-Generic can be employed as a target format for applications that render semantic notation into media. And even though MNX-Generic is not a semantic format, MNX-Generic elements may cross-reference elements in a semantic source document that was rendered into MNX-Generic. This supports a connection between the original semantic markup and an MNX-Generic rendering of same.

Given MNX-Generic’s characteristics as a target format, some features of MNX-Generic are employed within MNX-Common to provide literal descriptions of rendering where semantic information does not suffice to yield the desired musical result.

The only constraints on the nature of an MNX-Generic score are:

  1. The visual content of the score must be encoded in SVG.

  2. The audible content of the score must be encoded either as audio media or performance data.

6.1. Elements

6.1.1. Musical body content

6.1.1.1. The mnx-generic element
Contexts:
Wherever a musical body is expected.
Content Model:
Metadata content.
One or more score-view elements.
Performance content.
Attributes:
None.

The mnx-generic element is a musical body that describes an MNX-Generic score as a whole.

The following example illustrates an entire MNX-Generic document; the elements are described individually in the remainder of this section.

<mnx-generic>
  <score-view id="page1" view="score.svg#page1"/>
  <score-view id="page2" view="score.svg#page2"/>
  <score-view id="page3" view="score.svg#page3"/>

  <performance-audio>
    <performance-audio-media src="score.mp4"/>
    <performance-mapping>
      <performance-region start="0" end="0.72" view="page1" region="m1"/>
      <performance-region start="0.72" end="1.43" view="page1" region="m2"/>
      <performance-region start="1.43" end="2.99" view="page1" region="m3"/>
      <performance-region start="2.99" end="3.65" view="page1" region="m4"/>
    </performance-mapping>
  </performance-audio>

  <performance-data>
    <performance-tempo beat="/4" bpm="80"/>
    <performance-mapping>
      <performance-region start="0" end="1" view="page1" region="m1"/>
      <performance-region start="1" end="2" view="page1" region="m2"/>
      <performance-region start="2" end="3" view="page1" region="m3"/>
      <performance-region start="3" end="4" view="page1" region="m4"/>
    </performance-mapping>
    <performance-part instrument-sound="strings.violin">
      <performance-event start="0" duration="1/4" pitch="C4" dynamics="100"/>
      <performance-event start="1/4" duration="1/4" pitch="D4" dynamics="100"/>
      ...following events...
    </performance-part>
  </performance-data>
</mnx-generic>

6.1.2. Graphics media

6.1.2.1. The score-view element
Contexts:
mnx-generic
Content Model:
Any number of score-mapping elements.
Attributes:
view - link to an SVG view of the score

The score-view element references a specific view within a separate SVG document, via the URL provided in the view attribute. This URL must follow the rules for linking into SVG content.

Each score-view element represents a single page of the score. A default sequence of pages is established by the order of occurrence of score-view elements within the document.

The sequence of page presentation in conjunction with performance content may differ from the default sequence, according to the mapping between performance and graphics.

6.1.2.2. The score-mapping element
Contexts:
score-view
Content Model:
None.
Attributes:
graphics - an element ID within the SVG content described by the parent score-view
semantics - one or more optional IDs of corresponding element(s) within source semantic documents

The score-mapping element supplies information on the correspondence between an SVG element in a score-view, and sets of other semantic elements in this or other documents.

The graphics attribute is required, and gives a single ID of an element in the score view’s SVG content. There is no restriction on the nature or structure of this element, nor on its relationship to other elements.

The optional semantics attribute supplies one or more IDs of elements in a semantic source document, for example an MNX-Common document. This asserts that each of the referenced semantic source elements is considered as generating the SVG content described by graphics.

Note: While this element describes only a single SVG element, it is commonly the case that multiple SVG graphics may be associated with the same semantic source.

6.1.3. Performance content

The category of performance content includes both of the following:

6.1.3.1. The performance-audio element
Contexts:
mnx-generic
Content Model:
Metadata content.
Zero or more performance-tempo elements.
Zero or one performance-mapping elements.
One or more performance-audio-media elements.

The performance-audio element defines one or more audio media files that constitute a single performance of the score, and whose contents are presumed to be temporally synchronized with each other.

Additionally, performance-tempo elements may establish a proportional mapping between an arbitrary notated time unit and a time interval. This mapping may change throughout the course of the performance. If no such elements occur, the notated time unit is defined as equal to 1 second of performance time.

An optional performance-mapping element, if given, establishes a mapping between the audio performance and the graphical score.

6.1.3.2. The performance-audio-media element
Contexts:
performance-audio
Content Model:
Metadata content.
Attributes:
src - URL of an audio file of the score

The performance-audio-media element includes an audio media file, via the URL provided in the src attribute.

6.1.3.3. The performance-data element
Contexts:
mnx-generic
Content Model:
Metadata content.
One or more performance-part elements.
Zero or more performance-tempo elements.
Zero or one performance-mapping elements.
Attributes:

The performance-data element provides performance data in the form of discrete sonic events suitable for synthesis or analysis.

It consists of some number of parts, plus optional mappings between performance time and regions of graphical media.

Additionally, performance-tempo elements may establish a proportional mapping between an arbitrary notated time unit and a time interval. This mapping may change throughout the course of the performance. If no such elements occur, the notated time unit is defined as equal to 1 second of performance time.

An optional performance-mapping element, if given, establishes a mapping between the performance data and the graphical score.

6.1.3.4. The performance-part element
Contexts:
performance-data
Content Model:
Metadata content.
Zero or more performance-event elements.
Attributes:
instrument-sound - the sound ID of the instrument for this part.

The performance-part element organizes a list of performance-event elements, within a given performance.

The instrument-sound attribute gives the MusicXML sound ID of the instrument for this part.

Note: performance-part elements do not necessarily correspond to MNX-Common part elements, as they pertain to a single instrument.

6.1.3.5. The performance-tempo element
Contexts:
performance-data, performance-audio, interpret
Content Model:
None.
Attributes:
start - start time of this performance tempo
beat - notated time units per beat
bpm - number of beats per minute

The performance-tempo element describes a proportional relationship between time and an arbitrary notated time unit that may be used by the score.

This relationship applies to a time range beginning at the time in seconds specified by start and continuing until the next performance-tempo element. The default value is 0.

The beat attribute is a note value which establishes a beat as some fraction or multiple of a notated time unit (which in CWMN is a whole note by convention). The default value is 1.

The bpm attribute establishes a tempo, expressed as a valid floating-point number giving the number of beats per minute. The default value is 60.

NOTE: The defaults for both of the above attributes establish a notated time unit as equal to 1 second. Thus, if no attribute values are provided, score time is equal to real performance time.

NOTE: The set of performance-tempo elements establish a variable-rate progression of a scoring time unit relative to performance time, similar to a MIDI tempo track.
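
To make the relationship between performance time and notated time concrete, the following non-normative Python sketch converts a time in seconds into notated time units. It assumes that each tempo holds constant from its start until the next tempo (continuous changes are not modelled), that the defaults beat=1 and bpm=60 apply before the first tempo, and that beat has already been parsed into a fraction of the notated time unit. The function name notated_time_at and the tuple layout are hypothetical.

```python
from fractions import Fraction

DEFAULT_TEMPO = (0, Fraction(1), 60)  # start (seconds), beat, bpm

def notated_time_at(seconds, tempos):
    """Return the notated time reached after `seconds` of performance.

    tempos: list of (start_seconds, beat, bpm) tuples sorted by start.
    Each tempo advances notated time at beat * bpm / 60 units per second.
    """
    tempos = list(tempos)
    if not tempos or tempos[0][0] > 0:
        tempos.insert(0, DEFAULT_TEMPO)  # assumed default before first tempo
    notated = Fraction(0)
    for i, (start, beat, bpm) in enumerate(tempos):
        if seconds <= start:
            break
        # This tempo lasts until the next tempo starts, or until `seconds`.
        segment_end = tempos[i + 1][0] if i + 1 < len(tempos) else seconds
        segment_end = min(segment_end, seconds)
        rate = Fraction(beat) * Fraction(bpm) / 60
        notated += (Fraction(segment_end) - Fraction(start)) * rate
    return notated
```

For instance, with beat="/4" and bpm="80", notated time advances by one third of a whole note per second, so three seconds of performance correspond to one whole note.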

continuous changes need to be supported

6.1.3.6. The performance-event element
Contexts:
performance-part, interpret
Content Model:
None.
Attributes:
start - start time of this event
duration - duration of this event
pitch - pitch of this event
dynamics - dynamics for this event
techniques - set of performance techniques for this event
view - optional element ID of the score-view containing graphics for this event
graphics - optional SVG elements for specific event graphics

The performance-event element describes a single musical event in terms of its performance parameters.

All times given are in notated time units, whose relationship to performance time is described by performance-tempo elements. These times may be expressed in the following forms which are syntactically distinct:

start gives the starting time of the event. This specifies the actual start time, not a notated start time to be interpreted by a performer. The default value is zero.

duration gives the duration of the event. This specifies the actual duration to be performed, not a notated duration subject to interpretation by a performer.

pitch gives the pitch of the event expressed as either a valid floating-point number providing a frequency in Hertz, or a chromatic pitch.

Note: The interpretation of pitch at the event level needs to be much more carefully nailed down. Issues include how to control unpitched instruments, the temperament (if any) applied to chromatic pitches, and no doubt more.

dynamics gives the dynamics of the event expressed in a scale from 0 to 127. This scale needs to be better defined; the existing MusicXML definition as "percentage of forte" is hard to interpret clearly.

techniques gives a set of performance techniques applying to the event as an unordered set of space-separated tokens.

Note: These presumably correspond to articulatory variations of the instrument sound. Proper definition remains TBD.

If present, the view and graphics attributes together define a set of SVG graphics in a score-view element which comprise the visual representation corresponding to this event. Other than the fact of this correspondence, no other information about the graphics is encoded.

6.1.3.7. The performance-mapping element
Contexts:
performance-audio, performance-data
Content Model:
Zero or more performance-region elements.
Attributes:

The performance-mapping element defines a sequence of piecewise, non-overlapping ranges in notated time which correspond to piecewise regions within score graphics media views. In essence, it is a timeline that correlates a performance with elements within a series of views of the score from which that performance is derived.

The performance-region elements in a mapping provide the detailed descriptions of these ranges. The elements must occur in forward time order, and the end value of each region must be less than or equal to the start value of the next region.
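
The ordering constraint above lends itself to a simple check. The following non-normative Python sketch parses a performance-mapping element and reports every region that starts before the preceding region has ended. It assumes that start and end values are written either as decimal numbers or as fractions such as 1/4, which is an assumption about the time syntax rather than a statement of it; the function name check_mapping is hypothetical.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

def check_mapping(mapping_xml):
    """Return the indices of performance-region elements that violate
    forward time order, i.e. whose start precedes the previous end."""
    mapping = ET.fromstring(mapping_xml)
    offenders = []
    previous_end = None
    for index, region in enumerate(mapping.findall("performance-region")):
        start = Fraction(region.get("start"))
        end = Fraction(region.get("end"))
        if previous_end is not None and start < previous_end:
            offenders.append(index)
        previous_end = end
    return offenders
```

An empty result means the regions are in forward time order and do not overlap; regions that merely touch, where one region's end equals the next region's start, are accepted.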

6.1.3.8. The performance-region element
Contexts:
performance-mapping
Content Model:
None.
Attributes:
start - start time of the time region being mapped
end - end time of the time region
view - the element ID of the score-view containing the visual region
region - the definition of the visual region itself
cursor-start - a starting line segment for a cursor
cursor-end - an ending line segment for a cursor

The performance-region element describes the relationship between a performance time region expressed in notated time units, and a visual region of a score page. This allows consumers to understand the correspondence between regions of the graphical score and regions of one or more audio performances.

start gives the start of the time region.

end gives the end of the time region.

view identifies a view of some section of the score, by providing the XML ID of its score-view element.

region identifies the visual region for the mapping using a fragment identifier in accordance with linking into SVG content. The fragment identifier refers to the same document identified by the view attribute.

If the pair of attributes cursor-start and cursor-end are defined, then a mapping is defined between points in performance time and line segments in the visual region. Each attribute supplies an ordered set of space-separated tokens giving the cursor’s endpoints as successive X/Y pairs in user coordinates applicable to the region.

The special tokens left, right, top and bottom may be used here to define both endpoints of a cursor in terms of the corresponding edge of the region’s SVG bounding box.

Under this mapping, a time t in the time region corresponds to a line segment in the visual region connecting two points given by the respective formulae of:

  • cursor-start.p1 + (cursor-end.p1 - cursor-start.p1) * (t - start) / (end - start).

  • cursor-start.p2 + (cursor-end.p2 - cursor-start.p2) * (t - start) / (end - start).

If either or both of cursor-start and cursor-end are undefined, then the entire time region corresponds to the entire visual region, with no further decomposition.
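
The interpolation formulae above can be sketched in Python as follows. This is a non-normative illustration: each cursor is assumed to have been parsed already into a pair of (x, y) points, the special tokens left, right, top and bottom are not handled, and the function name cursor_at is hypothetical.

```python
def cursor_at(t, start, end, cursor_start, cursor_end):
    """Return the cursor line segment ((x1, y1), (x2, y2)) at time t.

    cursor_start and cursor_end are each a pair of points (p1, p2).
    Each endpoint is interpolated linearly between its starting and
    ending position as t moves from start to end.
    """
    fraction = (t - start) / (end - start)

    def interpolate(a, b):
        return (a[0] + (b[0] - a[0]) * fraction,
                a[1] + (b[1] - a[1]) * fraction)

    p1 = interpolate(cursor_start[0], cursor_end[0])
    p2 = interpolate(cursor_start[1], cursor_end[1])
    return (p1, p2)
```

With a vertical cursor that sweeps from the left edge to the right edge of a region 100 units wide, the midpoint of the time region yields a vertical segment at x = 50.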

Note: To more easily support cursor motion through curved arcs, non-parallel start and end cursors could be considered as segments of two rays whose common origin lies at the point of intersection between these cursors. Interpolation would then be performed in radial coordinates, smoothly sweeping both the angle and the distances from the origin to move the cursor’s endpoints along roughly circular arcs. Straight-line motion would be merely a special case in which the intersection lies at infinity.

Index

Terms defined by this specification

Terms defined by reference

  • [DOM] defines the following terms:
    • attribute
  • [HTML] defines the following terms:
    • non-negative integer
    • ordered set of space-separated tokens
    • rules for parsing integers
    • rules for parsing signed integers
    • semantic markup
    • signed integer
    • simple color
    • split a string on spaces
    • unordered set of space-separated tokens
    • valid floating-point number
    • valid integer
    • valid non-negative integer
  • [INFRA] defines the following terms:
    • split a string on commas

Issues Index

This is an open issue.
Shared denominators are all-or-nothing. So there’s currently no way to share a denominator for only some of a time signature’s fractions, which would require a grouping construct like 2/4+(2+3/8) or such.
Simple colors as per the HTML5 spec don’t support an alpha property, so perhaps we should adopt a separate syntactical definition here.
continuous changes need to be supported

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[UNICODE]
The Unicode Standard. URL: https://www.unicode.org/versions/latest/

Informative References

[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/