Appendix B. Unicode Character Classes
The following table lists the Unicode character classes that CoffeeFilter supports in Invisible XML. The classes and their descriptions are reproduced here, for convenience, from the Unicode Character Database for Unicode version 15.0.0.
Table B.1. Unicode 15.0.0 Character Classes
Class | Description |
---|---|
Lu | an uppercase letter |
Ll | a lowercase letter |
Lt | a digraphic character, with first part uppercase |
LC | Lu \| Ll \| Lt |
Lm | a modifier letter |
Lo | other letters, including syllables and ideographs |
L | Lu \| Ll \| Lt \| Lm \| Lo |
Mn | a nonspacing combining mark (zero advance width) |
Mc | a spacing combining mark (positive advance width) |
Me | an enclosing combining mark |
M | Mn \| Mc \| Me |
Nd | a decimal digit |
Nl | a letterlike numeric character |
No | a numeric character of other type |
N | Nd \| Nl \| No |
Pc | a connecting punctuation mark, like a tie |
Pd | a dash or hyphen punctuation mark |
Ps | an opening punctuation mark (of a pair) |
Pe | a closing punctuation mark (of a pair) |
Pi | an initial quotation mark |
Pf | a final quotation mark |
Po | a punctuation mark of other type |
P | Pc \| Pd \| Ps \| Pe \| Pi \| Pf \| Po |
Sm | a symbol of mathematical use |
Sc | a currency sign |
Sk | a non-letterlike modifier symbol |
So | a symbol of other type |
S | Sm \| Sc \| Sk \| So |
Zs | a space character (of various non-zero widths) |
Zl | U+2028 LINE SEPARATOR only |
Zp | U+2029 PARAGRAPH SEPARATOR only |
Z | Zs \| Zl \| Zp |
Cc | a C0 or C1 control code |
Cf | a format control character |
Cs | a surrogate code point |
Co | a private-use character |
Cn | a reserved unassigned code point or a noncharacter |
C | Cc \| Cf \| Cs \| Co \| Cn |
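
To see which class a given character falls into, the table can be checked against a Unicode library. The following is a minimal sketch in Python, not part of CoffeeFilter: it uses the standard `unicodedata` module, and the `GROUPS` mapping and `in_class` helper are hypothetical names introduced here for illustration. The Unicode version behind `unicodedata` depends on the Python release, so results for recently assigned characters may differ from Unicode 15.0.0.

```python
import unicodedata

# Grouping classes from Table B.1, each expressed as the set of
# two-letter classes it combines (hypothetical helper data).
GROUPS = {
    "LC": {"Lu", "Ll", "Lt"},
    "L": {"Lu", "Ll", "Lt", "Lm", "Lo"},
    "M": {"Mn", "Mc", "Me"},
    "N": {"Nd", "Nl", "No"},
    "P": {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"},
    "S": {"Sm", "Sc", "Sk", "So"},
    "Z": {"Zs", "Zl", "Zp"},
    "C": {"Cc", "Cf", "Cs", "Co", "Cn"},
}


def in_class(ch: str, cls: str) -> bool:
    """Return True if the single character ch belongs to class cls.

    cls may be a two-letter class such as "Lu" or a grouping class
    such as "L"; a two-letter class is treated as a set of one.
    """
    category = unicodedata.category(ch)
    return category in GROUPS.get(cls, {cls})


print(in_class("A", "Lu"))       # True: uppercase letter
print(in_class("\u01c5", "LC"))  # True: U+01C5 is a titlecase (Lt) letter
print(in_class("5", "N"))        # True: decimal digit (Nd)
print(in_class("A", "Ll"))       # False
```

Treating a grouping class as the union of its two-letter members mirrors how the table defines `L`, `M`, `N`, and the other single-letter classes.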