Chapter 3. Modularity
Modularity is an experimental feature introduced in version 3.3.0. It’s inspired by Modular ixml presented by Steven Pemberton at MarkupUK in 2025, but some of the details are different. It’s likely, if the Community Group continues to work on modularity as a feature of iXML, that the design will change further. Places where this design differs from the design presented in Modular ixml are marked with “👀” to draw your attention.
The idea behind modularity is to enable reuse. As the number and complexity of iXML grammars grow, good software design practice encourages reusability through a mechanism more reliable than “cut-and-paste” between grammars.
Conceptually, modularity constructs a new (monolithic) grammar that contains all of the necessary productions. It isn’t a textual inclusion of any kind. Grammars can use each other more-or-less arbitrarily.
Modularity extends the iXML grammar by adding uses and shares elements. These extensions are introduced by a “+” as shown in this example:
numerics.ixml
+uses digit, hexdigit from "digits.ixml" .
+shares decimal, hexadecimal .
number = decimal | hexadecimal .
decimal = digit+ .
hexadecimal = hexdigit+ .
The uses clause says that this grammar uses the rules for digit and hexdigit from the digits.ixml grammar. It shares the rules for decimal and hexadecimal.
A grammar with an explicit shares clause is publishing its interface. Only the explicitly shared symbols may be used by another grammar. If there’s no shares clause, 👀 any symbol may be used by another grammar.
The effect of requesting modularity is to 👀 parse the input grammar against a variant iXML grammar that defines a module. The relevant productions are:
module: s, (prolog, RS)?, (( ((uses; shares), RS)+, ixml) | ixml) .
uses: -"+uses", RS, mfrom++(-";", s), s, -'.' .
shares: -"+shares", RS, entries, s, -'.' .
mfrom>from: entries, RS, -"from", RS, source.
-entries: share++(-",", s).
share: @name.
@source: string .
As with a normal iXML grammar, the prolog comes first. In a module, that’s followed by optional uses and shares clauses that may be repeated, followed by the iXML itself. The 👀 grammar URIs are quoted and 👀 each clause ends with a full stop.
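To make the shape of the two clauses concrete, here is a minimal sketch in Python that pulls the uses and shares information out of a grammar's text. It is only an illustration: the function name parse_clauses and the regular expressions are hypothetical, it assumes each clause sits on one line with a double-quoted source, and a real processor would parse the input with the module grammar above instead.

```python
import re

# Loose patterns for the two clause kinds (an assumption for illustration;
# a real processor would use the module grammar, not regular expressions).
USES = re.compile(r'^\+uses\s+(.*?)\s*\.\s*$')
SHARES = re.compile(r'^\+shares\s+(.*?)\s*\.\s*$')
FROM = re.compile(r'^(.*?)\s+from\s+"([^"]*)"$')


def parse_clauses(text):
    """Return (uses, shares).

    uses maps a source URI to the list of names requested from it.
    shares is a list of names, or None if there is no shares clause.
    """
    uses = {}
    shares = None
    for line in text.splitlines():
        line = line.strip()
        m = USES.match(line)
        if m:
            # Several "names from source" groups may be separated by ";".
            for group in m.group(1).split(";"):
                parts = FROM.match(group.strip())
                if parts is None:
                    raise ValueError(f"malformed uses clause: {line}")
                names, source = parts.groups()
                uses.setdefault(source, []).extend(
                    n.strip() for n in names.split(","))
            continue
        m = SHARES.match(line)
        if m:
            # Clauses may repeat, so accumulate rather than overwrite.
            shares = (shares or []) + [
                n.strip() for n in m.group(1).split(",")]
    return uses, shares


numerics = '''+uses digit, hexdigit from "digits.ixml" .
+shares decimal, hexadecimal .
number = decimal | hexadecimal .'''

uses, shares = parse_clauses(numerics)
print(uses)    # {'digits.ixml': ['digit', 'hexdigit']}
print(shares)  # ['decimal', 'hexadecimal']
```

Returning None rather than an empty list for a missing shares clause keeps the “no clause means everything is available” case distinguishable from an explicit interface.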
It is an error to define a rule for a symbol that is used from another grammar.
It is an error to attempt to use a symbol from another grammar if that grammar does not share the symbol (or, if no symbols are explicitly shared, if the grammar does not contain a rule for the symbol).
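The two error conditions, together with the rule that a grammar without a shares clause offers every symbol it defines, can be sketched as a small check. The names exported and check_module, the data shapes, and the message wording are all hypothetical; this is not the processor's actual implementation.

```python
def exported(rule_names, shares):
    """Symbols a grammar offers: its shares list, or every rule if none."""
    return set(rule_names) if shares is None else set(shares)


def check_module(local_rules, uses, library):
    """Return a list of error messages for one grammar.

    local_rules: names of the rules defined in this grammar
    uses:        {source: [names]} requested from other grammars
    library:     {source: (rule_names, shares_or_None)} for those grammars
    """
    errors = []
    for source, names in uses.items():
        rule_names, shares = library[source]
        available = exported(rule_names, shares)
        for name in names:
            # Error 1: a used symbol must not also be defined locally.
            if name in local_rules:
                errors.append(
                    f"{name}: defined locally but used from {source}")
            # Error 2: the other grammar must make the symbol available.
            if name not in available:
                errors.append(f"{name}: not available from {source}")
    return errors


library = {
    "digits.ixml": (["digit", "hexdigit", "hexadecimal"],
                    ["digit", "hexdigit"]),
}

# numerics.ixml from the example above passes both checks.
print(check_module(["number", "decimal", "hexadecimal"],
                   {"digits.ixml": ["digit", "hexdigit"]}, library))  # []

# hexadecimal exists in digits.ixml but is not shared.
print(check_module(["number"],
                   {"digits.ixml": ["hexadecimal"]}, library))
```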
To complete the example above, digits.ixml is shown in Example 3.2, “A modular grammar for digits, digits.ixml” and partno.ixml is shown in Example 3.3, “A modular grammar for part numbers, partno.ixml”.
digits.ixml
+shares digit, hexdigit .
-digit = ["0"-"9"] .
-hexdigit = digit | hexadecimal .
-hexadecimal = ["A"-"F" | "a"-"f" ] .
partno.ixml
ixml version "1.1-nineml" .
+uses decimal from "numerics.ixml" .
partno: -decimal .
A 👀 modular grammar uses the renaming feature and must be in “version 1.1”. (This is an arbitrary limitation but reinforces the experimental nature of the feature.)
Observe that the nonterminal hexadecimal is used in two grammars and has different definitions. The processor is responsible for making sure that a correct grammar is constructed. The renaming feature, introduced recently in Invisible XML, allows the processor to rename nonterminals while preserving the correct serialization.
The composed grammar is included in the debugging output:
<module>
  <prolog>
    <version string="1.1-nineml"/>
  </prolog>
  <ixml>
    <rule name="partno">
      <alt>
        <nonterminal name="decimal" mark="-"/>
      </alt>
    </rule>
    <rule name="number">
      <alt>
        <nonterminal name="decimal"/>
      </alt>
      <alt>
        <nonterminal name="hexadecimal"/>
      </alt>
    </rule>
    <rule name="decimal">
      <alt>
        <repeat1>
          <nonterminal name="digit"/>
        </repeat1>
      </alt>
    </rule>
    <rule name="hexadecimal">
      <alt>
        <repeat1>
          <nonterminal name="hexdigit"/>
        </repeat1>
      </alt>
    </rule>
    <rule name="digit" mark="-">
      <alt>
        <inclusion>
          <member from="0" to="9"/>
        </inclusion>
      </alt>
    </rule>
    <rule name="hexdigit" mark="-">
      <alt>
        <nonterminal name="digit"/>
      </alt>
      <alt>
        <nonterminal name="_1_hexadecimal"/>
      </alt>
    </rule>
    <rule name="_1_hexadecimal" alias="hexadecimal" mark="-">
      <alt>
        <inclusion>
          <member from="A" to="F"/>
          <member from="a" to="f"/>
        </inclusion>
      </alt>
    </rule>
  </ixml>
</module>
Note how one of the hexadecimal nonterminals was renamed. Nonterminals are only renamed if there are collisions, and the nonterminals in the “top level” grammar are never renamed.
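One way such collision-driven renaming could work is sketched below. This is an assumption for illustration, not the processor's actual algorithm: the function compose and its data shapes are hypothetical, rules are reduced to lists of the nonterminals they reference, and the _N_ prefix simply imitates the name seen in the debugging output. Grammars are visited top-level first, so the top-level names are claimed before any others.

```python
def compose(grammars, order):
    """Merge grammars into one rule table, renaming on collision.

    grammars: {source: {"rules": {name: [referenced names]},
                        "uses":  {name: source it is used from}}}
    order:    sources, with the top-level grammar first
    Returns {final_name: {"alias": original name or None, "refs": [...]}}.
    """
    final = {}     # (source, original name) -> name in composed grammar
    taken = set()  # names already claimed in the composed grammar
    for source in order:
        for name in grammars[source]["rules"]:
            candidate, n = name, 0
            while candidate in taken:  # collision: _1_name, _2_name, ...
                n += 1
                candidate = f"_{n}_{name}"
            taken.add(candidate)
            final[(source, name)] = candidate

    composed = {}
    for source in order:
        g = grammars[source]
        for name, refs in g["rules"].items():
            # A reference resolves to the grammar it is used from,
            # or to the current grammar if it is not a used symbol.
            resolved = [final[(g["uses"].get(r, source), r)] for r in refs]
            new = final[(source, name)]
            composed[new] = {"alias": name if new != name else None,
                             "refs": resolved}
    return composed


grammars = {
    "partno.ixml": {
        "rules": {"partno": ["decimal"]},
        "uses": {"decimal": "numerics.ixml"}},
    "numerics.ixml": {
        "rules": {"number": ["decimal", "hexadecimal"],
                  "decimal": ["digit"],
                  "hexadecimal": ["hexdigit"]},
        "uses": {"digit": "digits.ixml", "hexdigit": "digits.ixml"}},
    "digits.ixml": {
        "rules": {"digit": [],
                  "hexdigit": ["digit", "hexadecimal"],
                  "hexadecimal": []},
        "uses": {}},
}

composed = compose(
    grammars, ["partno.ixml", "numerics.ixml", "digits.ixml"])
print(composed["hexdigit"]["refs"])  # ['digit', '_1_hexadecimal']
```

The alias recorded on a renamed rule is what lets the serialization keep using the original name.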
All of the rules from the three grammars are included in the composed grammar. This is a bug in the current implementation. It’s harmless, but nonterminals that are not reachable from shared symbols could easily be ignored during construction.
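Dropping the unreachable rules amounts to a graph traversal over the composed grammar. The sketch below is a hypothetical illustration, not the processor's code: rules are again reduced to the nonterminals they reference, and the choice of roots (here the start symbol of the top-level grammar) is an assumption.

```python
def reachable(rules, roots):
    """Return the set of rule names reachable from roots.

    rules: {name: [names of the nonterminals the rule references]}
    """
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen or name not in rules:
            continue
        seen.add(name)
        stack.extend(rules[name])
    return seen


def prune(rules, roots):
    """Keep only the reachable rules, preserving their original order."""
    keep = reachable(rules, roots)
    return {name: refs for name, refs in rules.items() if name in keep}


# The composed grammar from the debugging output, references only.
rules = {
    "partno": ["decimal"],
    "number": ["decimal", "hexadecimal"],
    "decimal": ["digit"],
    "hexadecimal": ["hexdigit"],
    "digit": [],
    "hexdigit": ["digit", "_1_hexadecimal"],
    "_1_hexadecimal": [],
}

print(list(prune(rules, ["partno"])))  # ['partno', 'decimal', 'digit']
```

For the part number example, only three of the seven rules would survive, and the renamed _1_hexadecimal rule would never need to be constructed.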