Chapter 1. General use
CoffeeFilter is an API for parsing with Invisible XML. You can use it to construct a parser for an Invisible XML grammar, then parse input with that parser.
The examples in this section are taken from “running code” that you can find in src/test/java/org/nineml/examples/CoffeeFilterExamples in the repository. The code lives in the unit testing framework, but doesn’t depend on that framework.
This is just an overview; consult the JavaDoc for more details.
1.1. Constructing a parser
The first example uses a small grammar for parsing numbers. It is very similar to the grammar used for the CoffeeGrinder example, except that it’s been expressed in somewhat more idiomatic iXML.
number = integer | float | scientific .
sign = '+' | '-' .
-digit = ['0'-'9'] .
digits = digit+ .
integer = sign?, digits .
point = -'.' .
float = sign?, digits, point, digits .
E = -'E' .
scientific = integer, E, sign?, digits
| float, E, sign?, digits .
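To make the shape of this language concrete, the following sketch checks candidate strings against a regular expression that accepts the same strings as the grammar above. It does not use CoffeeFilter at all: the class name NumberShape and the method isNumber are invented for illustration, and a regular expression only answers yes or no, whereas the iXML parser also produces a tree.

```java
import java.util.regex.Pattern;

public class NumberShape {
    // Regex mirror of the grammar: optional sign, digits, an optional
    // fraction (point followed by digits), and an optional exponent
    // introduced by an uppercase 'E' with its own optional sign.
    private static final Pattern NUMBER =
        Pattern.compile("[+-]?[0-9]+(\\.[0-9]+)?(E[+-]?[0-9]+)?");

    /** Returns true if the whole input is an integer, float, or scientific number. */
    public static boolean isNumber(String input) {
        return NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"42", "-3.14", "1.5E-10", "1e5", ".5", "7."};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isNumber(sample));
        }
    }
}
```

The grammar only allows an uppercase E, so 1e5 is rejected. The inputs .5 and 7. are rejected as well, because digits are required on both sides of the point.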
Begin by making an InvisibleXml object.
ParserOptions options = new ParserOptions();
InvisibleXml invisibleXml = new InvisibleXml(options);
There are a number of options that you can specify to control various aspects of the parse, but the defaults are reasonable.
1.2. Parsing the input
From the InvisibleXml object, we can get a parser for a particular grammar and parse input with it to obtain a document:
File grammar = new File("src/test/resources/numbers.ixml");
InvisibleXmlParser parser = invisibleXml.getParser(grammar);
InvisibleXmlDocument document = parser.parse(number);
Methods on the document can be used to check if the parse succeeded. If it did, we can construct a result tree.
if (document.succeeded()) {
    String tree = document.getTree();
    return String.format("%s is a number: %s", number, tree);
} else {
    return String.format("%s is not a number", number);
}
1.3. Processing the results
The getTree() method returns a string. There are other methods for accessing the forest that can use builders to construct trees:
if (document.succeeded()) {
    StringTreeBuilder builder = new StringTreeBuilder();
    ParseForest forest = document.getResult().getForest();
    Arborist.getArborist(forest).getTree(builder);
    String tree = builder.getTree();
    return String.format("%s is a number: %s", number, tree);
} else {
    return String.format("%s is not a number", number);
}
Builders can be used to construct a generic tree of “plain old Java objects” or send events to a SAX content handler to construct the tree.
1.4. Ambiguous results
In the case of ambiguous results, you can ask how many trees there are:
long parseCount = document.getNumberOfParses();
if (parseCount > 1) {
    if (document.isInfinitelyAmbiguous()) {
        return String.format("%s is a date (in infinite ways): %s", date, tree);
    }
    return String.format("%s is a date (%d ways): %s", date, parseCount, tree);
}
Beware that in an infinitely ambiguous forest, the number returned is wildly inaccurate (inasmuch as it is infinitely less than ∞!). It tells you something about how the forest divides into ambiguous branches, but doesn’t attempt to account for loops.
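One way to picture why a count that ignores loops stays finite is the toy model below. It is a hypothetical illustration and not CoffeeFilter's implementation: the ToyForest and Node classes are invented here, and the counting rule (choices add, sequences multiply, a looping alternative contributes nothing) is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyForest {
    /** A forest node. No alternatives means a terminal; otherwise each
     *  alternative is one sequence of children that can derive the node. */
    public static final class Node {
        final String name;
        final List<List<Node>> alternatives = new ArrayList<>();

        public Node(String name) { this.name = name; }

        public Node alt(Node... children) {
            alternatives.add(List.of(children));
            return this;
        }
    }

    private final Set<Node> onPath = new HashSet<>();
    private boolean sawLoop = false;

    public boolean sawLoop() { return sawLoop; }

    /** Counts trees below node, ignoring any alternative that loops back. */
    public long count(Node node) {
        if (node.alternatives.isEmpty()) {
            return 1; // a terminal contributes exactly one tree
        }
        if (!onPath.add(node)) {
            sawLoop = true; // node is already being expanded: a loop
            return 0;       // the looping alternative is not counted
        }
        long total = 0;
        for (List<Node> alternative : node.alternatives) {
            long product = 1;
            for (Node child : alternative) {
                product *= count(child);
            }
            total += product; // choices add, sequences multiply
        }
        onPath.remove(node);
        return total;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node s = new Node("S");
        s.alt(s).alt(a); // toy rule: S = S | 'a'
        ToyForest forest = new ToyForest();
        System.out.println(forest.count(s) + " tree(s), loop seen: " + forest.sawLoop());
    }
}
```

For the toy rule S = S | 'a', this sketch reports a single tree with the loop flag set, even though the rule can be unrolled any number of times. That is the sense in which a finite count undersells an infinitely ambiguous result, and why the flag should be checked separately.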