Chapter 1 General use

CoffeeFilter is an API for parsing with Invisible XML. You can use it to construct a parser for an Invisible XML grammar, then parse input with that parser.

The examples in this section are taken from “running code” that you can find in src/test/java/org/nineml/examples/CoffeeFilterExamples in the repository. The code lives in the unit testing framework, but doesn’t depend on that framework.

This is just an overview; consult the JavaDoc for more details.

1.1 Constructing a parser

The first example uses a small grammar for parsing numbers. This is very similar to the grammar used for the CoffeeGrinder example except that it’s been expressed in somewhat more idiomatic iXML.

    number = integer | float | scientific .
      sign = '+' | '-' .
    -digit = ['0'-'9'] .
    digits = digit+ .
   integer = sign?, digits .
     point = -'.' .
     float = sign?, digits, point, digits .
         E = -'E' .
scientific = integer, E, sign?, digits
           | float, E, sign?, digits .
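
To get a feel for which inputs this grammar accepts, it can help to compare it with an equivalent regular expression. The sketch below uses only the JDK; the class name NumberShape and its method are hypothetical and are not part of CoffeeFilter. It is not how CoffeeFilter parses: a regular expression only answers yes or no, whereas the iXML parser returns a tree.

```java
import java.util.regex.Pattern;

// Illustration only: a regular expression that accepts the same strings as
// the numbers grammar. NumberShape is a hypothetical helper, not a
// CoffeeFilter class.
final class NumberShape {
    // sign?, digits, then optionally (point, digits),
    // then optionally (E, sign?, digits). Note that E is upper case only.
    private static final Pattern NUMBER =
        Pattern.compile("[+-]?[0-9]+(\\.[0-9]+)?(E[+-]?[0-9]+)?");

    static boolean matches(String input) {
        return NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"42", "-3.14", "6.02E23", "1E-9", ".5", "1e5", "1."};
        for (String s : samples) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

The first four samples match. The last three do not: the grammar requires digits on both sides of the point and an upper-case E.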

Begin by making an InvisibleXml object.

ParserOptions options = new ParserOptions();
InvisibleXml invisibleXml = new InvisibleXml(options);

There are a number of options that you can specify to control various aspects of the parse, but the defaults are reasonable.

1.2 Parsing the input

From the InvisibleXml object, we can get a parser for a particular grammar and parse input with it to obtain a document. In these excerpts, number is the input string being tested:

File grammar = new File("src/test/resources/numbers.ixml");
InvisibleXmlParser parser = invisibleXml.getParser(grammar);
InvisibleXmlDocument document = parser.parse(number);

Methods on the document can be used to check if the parse succeeded. If it did, we can construct a result tree.

if (document.succeeded()) {
    String tree = document.getTree();
    return String.format("%s is a number: %s", number, tree);
} else {
    return String.format("%s is not a number", number);
}

1.3 Processing the results

The getTree() method returns a string. There are other methods for accessing the forest that can use builders to construct trees:

if (document.succeeded()) {
    StringTreeBuilder builder = new StringTreeBuilder();
    ParseForest forest = document.getResult().getForest();
    Arborist.getArborist(forest).getTree(builder);
    String tree = builder.getTree();
    return String.format("%s is a number: %s", number, tree);
} else {
    return String.format("%s is not a number", number);
}

Builders can be used to construct a generic tree of “plain old Java objects” or send events to a SAX content handler to construct the tree.
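
For readers who have not used SAX, the sketch below shows what receiving such events looks like. It uses only the JDK: the events here come from the JDK's own XML parser reading a string, not from a CoffeeFilter builder, and the class OutlineHandler is hypothetical. The sample XML is an assumption about the shape of tree the numbers grammar would yield for the input 42. How a handler is attached to a CoffeeFilter builder is described in the JavaDoc.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: turns SAX events into an indented outline.
final class OutlineHandler extends DefaultHandler {
    private final StringBuilder outline = new StringBuilder();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        outline.append("  ".repeat(depth)).append(qName).append('\n');
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            outline.append("  ".repeat(depth)).append('"').append(text).append("\"\n");
        }
    }

    // Drives the handler with the JDK parser; a CoffeeFilter builder would
    // play the role of the event source instead.
    static String outlineOf(String xml) {
        OutlineHandler handler = new OutlineHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.outline.toString();
    }

    public static void main(String[] args) {
        // Assumed tree shape for the input "42"; digit is suppressed by -digit.
        System.out.print(outlineOf("<number><integer><digits>42</digits></integer></number>"));
    }
}
```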

1.4 Ambiguous results

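In the case of ambiguous results, you can ask how many trees there are. This excerpt comes from an example that parses dates rather than numbers; date is the input string and tree is a result tree that has already been obtained: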

long parseCount = document.getNumberOfParses();
if (parseCount > 1) {
    if (document.isInfinitelyAmbiguous()) {
        return String.format("%s is a date (in infinite ways): %s", date, tree);
    }
    return String.format("%s is a date (%d ways): %s", date, parseCount, tree);
}

Beware that in an infinitely ambiguous forest, the number returned is wildly inaccurate (inasmuch as it is infinitely less than ∞!). It tells you something about how the forest divides into ambiguous branches, but doesn’t attempt to account for loops.
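
One way to picture why a count can be finite when the real answer is infinite is the toy model below. It is only a sketch under stated assumptions: ToyForest and Node are hypothetical classes, not CoffeeFilter or CoffeeGrinder types, and the counting rule (sum over alternatives, product over children, a loop counted once) is an assumption chosen for illustration, not the library's algorithm.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a parse forest; not CoffeeFilter's data structure.
final class ToyForest {
    static final class Node {
        final String name;
        // Each alternative is one way to derive this node: a list of children.
        final List<List<Node>> alternatives = new ArrayList<>();
        Node(String name) { this.name = name; }
        Node alt(Node... children) {
            alternatives.add(List.of(children));
            return this;
        }
    }

    private final Set<Node> inProgress = new HashSet<>();
    boolean loopSeen = false;

    // Sum over alternatives, product over children. A node reached again
    // while it is still being counted is a loop; it is counted once and
    // flagged instead of being followed forever.
    long count(Node node) {
        if (node.alternatives.isEmpty()) {
            return 1; // a leaf has exactly one tree
        }
        if (!inProgress.add(node)) {
            loopSeen = true;
            return 1;
        }
        long total = 0;
        for (List<Node> alternative : node.alternatives) {
            long product = 1;
            for (Node child : alternative) {
                product *= count(child);
            }
            total += product;
        }
        inProgress.remove(node);
        return total;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node s = new Node("S");
        s.alt(a).alt(s); // S can be "a", or S itself: infinitely many trees
        ToyForest forest = new ToyForest();
        System.out.println(forest.count(s)); // prints 2
        System.out.println(forest.loopSeen); // prints true
    }
}
```

In this model the count of 2 reflects the two branches at S, while the separate flag is the only sign that the true number of trees is unbounded.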