Appendix B. Unicode Character Classes

The following table lists the character classes that CoffeeFilter supports in Invisible XML grammars. The classes are the general categories defined in the Unicode Character Database for Unicode version 15.0.0; they are reproduced here for convenience.

Table B.1. Unicode 15.0.0 Character Classes
Class  Description
Lu     an uppercase letter
Ll     a lowercase letter
Lt     a digraphic character, with first part uppercase
LC     Lu | Ll | Lt
Lm     a modifier letter
Lo     other letters, including syllables and ideographs
L      Lu | Ll | Lt | Lm | Lo
Mn     a nonspacing combining mark (zero advance width)
Mc     a spacing combining mark (positive advance width)
Me     an enclosing combining mark
M      Mn | Mc | Me
Nd     a decimal digit
Nl     a letterlike numeric character
No     a numeric character of other type
N      Nd | Nl | No
Pc     a connecting punctuation mark, like a tie
Pd     a dash or hyphen punctuation mark
Ps     an opening punctuation mark (of a pair)
Pe     a closing punctuation mark (of a pair)
Pi     an initial quotation mark
Pf     a final quotation mark
Po     a punctuation mark of other type
P      Pc | Pd | Ps | Pe | Pi | Pf | Po
Sm     a symbol of mathematical use
Sc     a currency sign
Sk     a non-letterlike modifier symbol
So     a symbol of other type
S      Sm | Sc | Sk | So
Zs     a space character (of various non-zero widths)
Zl     U+2028 LINE SEPARATOR only
Zp     U+2029 PARAGRAPH SEPARATOR only
Z      Zs | Zl | Zp
Cc     a C0 or C1 control code
Cf     a format control character
Cs     a surrogate code point
Co     a private-use character
Cn     a reserved unassigned code point or a noncharacter
C      Cc | Cf | Cs | Co | Cn
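
To check which class a given character falls into, the table can be mirrored in a few lines of Python. This is only an illustrative sketch and not part of CoffeeFilter: the GROUPS mapping and the in_class helper are hypothetical names introduced here, and the lookup relies on Python's standard unicodedata module, whose Unicode version may differ from 15.0.0.

```python
import unicodedata

# Grouped classes from Table B.1, expanded to their two-letter members.
# GROUPS and in_class are hypothetical names used only for this sketch.
GROUPS = {
    "LC": {"Lu", "Ll", "Lt"},
    "L": {"Lu", "Ll", "Lt", "Lm", "Lo"},
    "M": {"Mn", "Mc", "Me"},
    "N": {"Nd", "Nl", "No"},
    "P": {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"},
    "S": {"Sm", "Sc", "Sk", "So"},
    "Z": {"Zs", "Zl", "Zp"},
    "C": {"Cc", "Cf", "Cs", "Co", "Cn"},
}


def in_class(ch: str, cls: str) -> bool:
    """Return True if the single character ch belongs to class cls."""
    # unicodedata.category returns the two-letter general category, e.g. "Lu".
    category = unicodedata.category(ch)
    # A grouped class matches any of its members; a two-letter class
    # matches only itself.
    return category in GROUPS.get(cls, {cls})


if __name__ == "__main__":
    for ch in ["A", "a", "5", "_", "\u2028"]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch))
```

The value of unicodedata.unidata_version shows which Unicode version the Python build uses; characters added or reclassified after that version may be reported differently from the 15.0.0 data that the table describes.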