Enumeration

unibp

Binary properties.

enum unibp {
    ...
}

Constants

UNI_NONCHARACTER_CODE_POINT

The Noncharacter_Code_Point character property.

UNI_ALPHABETIC

The Alphabetic character property.

UNI_LOWERCASE

The Lowercase character property.

UNI_UPPERCASE

The Uppercase character property.

UNI_HEX_DIGIT

The Hex_Digit character property.

UNI_WHITE_SPACE

The White_Space character property.

UNI_MATH

The Math character property.

UNI_DASH

The Dash character property.

UNI_DIACRITIC

The Diacritic character property.

UNI_EXTENDER

The Extender character property.

UNI_IDEOGRAPHIC

The Ideographic character property.

UNI_QUOTATION_MARK

The Quotation_Mark character property.

UNI_UNIFIED_IDEOGRAPH

The Unified_Ideograph character property.

UNI_TERMINAL_PUNCTUATION

The Terminal_Punctuation character property.

Discussion

Unicorn supports a small subset of the binary character properties defined by the Unicode Standard. The binary properties supported are those that are useful when parsing plain text.

Most binary character properties defined by the standard apply only to specific applications, e.g. text shaping or rendering. Other properties are informational, for example a character’s name or the version in which it was introduced into the Unicode Standard. The remainder are relevant only when implementing particular Unicode algorithms and are not “general” enough to expose.
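
As a rough illustration of why properties such as White_Space, Hex_Digit, and Alphabetic are useful when parsing plain text, the sketch below counts runs of code points that share a property. It does not use the library: `demo_bp`, `demo_is`, and `demo_count_runs` are hypothetical names invented for this example, the enumerators only mirror three of the constants above, and the lookup covers ASCII only. A real implementation would consult Unicode Character Database tables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for three of the constants listed above.
 * These are NOT the library's definitions. */
typedef enum {
    DEMO_WHITE_SPACE, /* mirrors UNI_WHITE_SPACE */
    DEMO_HEX_DIGIT,   /* mirrors UNI_HEX_DIGIT */
    DEMO_ALPHABETIC   /* mirrors UNI_ALPHABETIC */
} demo_bp;

/* Toy property test that only knows the ASCII range (assumption made
 * for brevity; the real properties extend well beyond ASCII). */
bool demo_is(uint32_t c, demo_bp p)
{
    switch (p) {
    case DEMO_WHITE_SPACE:
        return (c >= 0x09 && c <= 0x0D) || c == 0x20;
    case DEMO_HEX_DIGIT:
        return (c >= '0' && c <= '9') ||
               (c >= 'A' && c <= 'F') ||
               (c >= 'a' && c <= 'f');
    case DEMO_ALPHABETIC:
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }
    return false;
}

/* Count maximal runs of consecutive code points that have property p. */
size_t demo_count_runs(const uint32_t *text, size_t len, demo_bp p)
{
    size_t runs = 0;
    bool inside = false;
    for (size_t i = 0; i < len; i++) {
        bool match = demo_is(text[i], p);
        if (match && !inside)
            runs++; /* a new run starts here */
        inside = match;
    }
    return runs;
}
```

Passing the property as an enumeration value, rather than exposing one function per property, lets a single scanning routine such as `demo_count_runs` be reused for any supported property.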