Constant
UNI_GRAPHEME
Extended grapheme cluster breaks.
enum unibreak {
    UNI_GRAPHEME,
}
Discussion
A grapheme, or grapheme cluster, is a user-perceived character. Grapheme clusters are useful for user interfaces and user-facing text manipulation. For example, graphemes should be used to implement text selection, arrow key movement, and backspacing through text. When truncating text for presentation, truncate at grapheme cluster boundaries to avoid splitting a user-perceived character.
A key feature of Unicode grapheme clusters is that their boundaries remain unchanged across all canonically equivalent normalization forms. That means the grapheme boundaries fall in the same places whether the text is normalized as NFC or NFD.
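The two points above (truncating on cluster boundaries, and boundaries being stable across NFC and NFD) can be illustrated with a minimal, self-contained sketch. It does not use this library's API: the functions `is_combining_mark`, `count_clusters`, and `cluster_prefix_length` are hypothetical helpers written for illustration, and the break rule is a deliberate simplification that only treats the Combining Diacritical Marks block (U+0300 to U+036F) as cluster-extending. A real implementation would follow the full extended grapheme cluster rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true for code points in the Combining Diacritical
 * Marks block (U+0300..U+036F). This is only a small subset of the code
 * points that extend a grapheme cluster in the real rules. */
static int is_combining_mark(uint32_t cp)
{
    return cp >= 0x0300 && cp <= 0x036F;
}

/* Count clusters under the simplified rule: a new cluster starts at the
 * first code point and at every code point that is not a combining mark. */
static size_t count_clusters(const uint32_t *text, size_t length)
{
    assert(text != NULL || length == 0);
    size_t clusters = 0;
    for (size_t i = 0; i < length; i++) {
        if (i == 0 || !is_combining_mark(text[i]))
            clusters++;
    }
    return clusters;
}

/* Return how many code points to keep so that at most max_clusters whole
 * clusters survive; the cut never lands between a base and its mark. */
static size_t cluster_prefix_length(const uint32_t *text, size_t length,
                                    size_t max_clusters)
{
    assert(text != NULL || length == 0);
    size_t clusters = 0;
    for (size_t i = 0; i < length; i++) {
        if (i == 0 || !is_combining_mark(text[i])) {
            if (clusters == max_clusters)
                return i; /* text[i] would begin one cluster too many */
            clusters++;
        }
    }
    return length;
}
```

With "café" encoded as NFC (U+0063 U+0061 U+0066 U+00E9) and as NFD (U+0063 U+0061 U+0066 U+0065 U+0301), `count_clusters` reports four clusters for both, even though the code point counts differ. Truncating the NFD form to four clusters keeps all five code points, whereas cutting naively after four code points would strip the accent from the final character.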
Support for grapheme cluster break detection can be enabled in the JSON configuration as shown below:
{
    "algorithms": {
        "segmentation": [
            "grapheme"
        ]
    }
}