Chapter 8. Output formats
CoffeePot can produce XML or JSON output of any parse tree and CSV output of trees that are “simple enough”.
Here’s a small grammar for a “contacts file”:
contacts: (contact, NL*)+ .
contact: name, NL, (email, NL)?, (phone, NL)? .
name: letter, ~[#a; "@"]* .
email: username, "@", domainname .
phone: ["+0123456789()- "]+ .
-username: (letter; ["+-."])+ .
-domainname: (letter; ["+-."])+ .
-letter: [L] .
-NL: -#a ; -#d, -#a .
Contacts are names, email addresses, and phone numbers on separate lines. Contacts are separated by a blank line. For example:
John Doe
john@example.com
555-0100

Mary Smith
m.smith@estaff.example.com
+1-222-555-0102

Jane Doe
(512) 555-0105

Nancy Jones
nancy@example.org
8.1. XML
A conformant Invisible XML processor, parsing this data with the grammar above, produces XML:
<contacts>
  <contact>
    <name>John Doe</name>
    <email>john@example.com</email>
    <phone>555-0100</phone>
  </contact>
  <contact>
    <name>Mary Smith</name>
    <email>m.smith@estaff.example.com</email>
    <phone>+1-222-555-0102</phone>
  </contact>
  <contact>
    <name>Jane Doe</name>
    <phone>(512) 555-0105</phone>
  </contact>
  <contact>
    <name>Nancy Jones</name>
    <email>nancy@example.org</email>
  </contact>
</contacts>
8.2. JSON
Many services today prefer JSON to XML, so coffeepot provides an option, --format:json, to produce JSON:
{
  "contacts": {
    "contact": [
      {
        "name": "John Doe",
        "email": "john@example.com",
        "phone": "555-0100"
      },
      {
        "name": "Mary Smith",
        "email": "m.smith@estaff.example.com",
        "phone": "+1-222-555-0102"
      },
      {
        "name": "Jane Doe",
        "phone": "(512) 555-0105"
      },
      {
        "name": "Nancy Jones",
        "email": "nancy@example.org"
      }
    ]
  }
}
There are, in fact, two different JSON output options, json-data and json-tree. In “data” mode, shown above, the names of nonterminals become the names of properties and siblings are wrapped in an array.
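As a small illustration of consuming the “data” form, the following Python sketch loads JSON shaped like the example above and reads each contact. The embedded string is an abbreviated copy of that output, and the describe helper is a hypothetical name; only the shape of the JSON comes from CoffeePot. Optional nonterminals such as email and phone are absent from the object when they did not match, so the sketch reads them with dict.get rather than by direct indexing.

```python
import json

# Abbreviated copy of the "data" output shown above (two of the four contacts).
DATA_JSON = """
{
  "contacts": {
    "contact": [
      {"name": "John Doe", "email": "john@example.com", "phone": "555-0100"},
      {"name": "Jane Doe", "phone": "(512) 555-0105"}
    ]
  }
}
"""


def describe(contact):
    """Format one contact; a missing email or phone is shown as '-'."""
    return "{} | {} | {}".format(
        contact["name"],
        contact.get("email", "-"),
        contact.get("phone", "-"),
    )


# Nonterminal names are property names; the repeated contacts form an array.
contacts = json.loads(DATA_JSON)["contacts"]["contact"]
lines = [describe(c) for c in contacts]
print("\n".join(lines))
```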
In “tree” mode, the JSON serialization is a literal rendering of the tree structure:
{
  "content": {
    "name": "contacts",
    "content": [
      {
        "name": "contact",
        "content": [
          {
            "name": "name",
            "content": "John Doe"
          },
          {
            "name": "email",
            "content": "john@example.com"
          },
          {
            "name": "phone",
            "content": "555-0100"
          }
        ]
      },
      {
        "name": "contact",
        "content": [
          {
            "name": "name",
            "content": "Mary Smith"
          },
          {
            "name": "email",
            "content": "m.smith@estaff.example.com"
          },
          {
            "name": "phone",
            "content": "+1-222-555-0102"
          }
        ]
      },
      {
        "name": "contact",
        "content": [
          {
            "name": "name",
            "content": "Jane Doe"
          },
          {
            "name": "phone",
            "content": "(512) 555-0105"
          }
        ]
      },
      {
        "name": "contact",
        "content": [
          {
            "name": "name",
            "content": "Nancy Jones"
          },
          {
            "name": "email",
            "content": "nancy@example.org"
          }
        ]
      }
    ]
  }
}
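Because every node in the “tree” form has the same two properties, name and content, it can be walked generically. The Python sketch below is one way to do that under stated assumptions: it assumes content is either a string or a list of nodes, as in the example above, and does not attempt to handle any other shape. The leaf_fields helper is a hypothetical name, and the embedded string is an abbreviated copy of the output above.

```python
import json

# Abbreviated copy of the "tree" output shown above (one contact only).
TREE_JSON = """
{
  "content": {
    "name": "contacts",
    "content": [
      {
        "name": "contact",
        "content": [
          {"name": "name", "content": "John Doe"},
          {"name": "email", "content": "john@example.com"},
          {"name": "phone", "content": "555-0100"}
        ]
      }
    ]
  }
}
"""


def leaf_fields(node):
    """Map child name to text for a node whose children are all text leaves."""
    return {child["name"]: child["content"] for child in node["content"]}


# The outermost "content" property holds the root node of the parse tree.
root = json.loads(TREE_JSON)["content"]
rows = [leaf_fields(contact) for contact in root["content"]]
print(root["name"], rows)
```

Each entry in rows has the same shape as one contact object in the “data” form, which shows how the two serializations relate for this simple grammar.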
8.3. CSV
If the tree that results from the parse has a very specific shape, it can be rendered as CSV. The document must contain no mixed content (ignoring whitespace-only text nodes) and exactly three levels of hierarchy: a root element, its children, and its grandchildren.
The names of the root node’s children are irrelevant; each one becomes a row in the output. The names of the grandchild elements become the names of columns. There will be one column for each unique name, even if not every child contains the same grandchild elements.
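The rules above can be sketched in a few lines of Python. This is an illustration of the described mapping, not CoffeePot’s implementation: the xml_to_csv and quote helpers are hypothetical names, the input is an abbreviated copy of the XML from section 8.1, columns are assumed to appear in first-seen order, and non-empty fields are assumed to be quoted while empty ones are left bare.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the XML output from section 8.1 (three contacts).
XML = """<contacts>
<contact><name>John Doe</name><email>john@example.com</email><phone>555-0100</phone></contact>
<contact><name>Jane Doe</name><phone>(512) 555-0105</phone></contact>
<contact><name>Nancy Jones</name><email>nancy@example.org</email></contact>
</contacts>"""


def quote(value):
    """Quote a non-empty field, doubling embedded quotes; leave empty fields bare."""
    return '"' + value.replace('"', '""') + '"' if value else ""


def xml_to_csv(xml_text):
    root = ET.fromstring(xml_text)
    columns, rows = [], []
    for child in root:  # each child of the root becomes one row
        if (child.text or "").strip():
            raise ValueError("mixed content in <%s>" % child.tag)
        row = {}
        for grandchild in child:  # each grandchild name becomes a column
            if len(grandchild) or (grandchild.tail or "").strip():
                raise ValueError("not a three-level, text-only structure")
            if grandchild.tag not in columns:
                columns.append(grandchild.tag)
            # Assumption: a repeated name within one row keeps the last value.
            row[grandchild.tag] = grandchild.text or ""
        rows.append(row)
    lines = [",".join(quote(c) for c in columns)]
    for row in rows:
        # A grandchild missing from this row yields an empty field.
        lines.append(",".join(quote(row.get(c, "")) for c in columns))
    return "\n".join(lines)


csv_text = xml_to_csv(XML)
print(csv_text)
```

Whitespace-only text between elements is ignored, in line with the constraint stated above, while any other text outside a grandchild, or any element nested below a grandchild, is rejected.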
The contacts output satisfies these constraints, so it can be rendered in CSV:
"name","email","phone"
"John Doe","john@example.com","555-0100"
"Mary Smith","m.smith@estaff.example.com","+1-222-555-0102"
"Jane Doe",,"(512) 555-0105"
"Nancy Jones","nancy@example.org",