Function
uni_normcmp
Canonical equivalence.
Parameters
s1 | in | First source text. |
s1_len | in | Number of code units in |
s1_attr | in | Attributes of |
s2 | in | Second source text. |
s2_len | in | Number of code units in |
s2_attr | in | Attributes of |
result | out | Set to |
Return Value
UNI_OK | If the strings were compared successfully. |
UNI_BAD_OPERATION | If |
UNI_BAD_ENCODING | If |
UNI_NO_MEMORY | If dynamic memory allocation failed. |
Discussion
Checks whether s1 and s2 are canonically equivalent, that is, whether the graphemes of both strings are the same. This function is equivalent to calling uni_norm with UNI_NFD on each string and then comparing the results code point by code point.
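The "decompose, then compare code points" idea can be illustrated with a minimal sketch. This is not the library's implementation: toy_decompose, toy_canonically_equal, and TOY_MAX are hypothetical names, and the decomposition table is an assumption covering only U+00E9 and U+00F1. Real NFD requires the full Unicode decomposition data and canonical reordering of combining marks, both of which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX 16 /* hypothetical cap on input length for this sketch */

/* Hypothetical toy decomposition: knows only U+00E9 and U+00F1.
 * Real NFD needs the full Unicode decomposition data plus canonical
 * reordering of combining marks; both are omitted here. */
static size_t toy_decompose(const uint32_t *src, size_t len, uint32_t *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        switch (src[i]) {
        case 0x00E9: dst[out++] = 0x0065; dst[out++] = 0x0301; break;
        case 0x00F1: dst[out++] = 0x006E; dst[out++] = 0x0303; break;
        default:     dst[out++] = src[i]; break;
        }
    }
    return out;
}

/* Decompose both inputs, then compare code point by code point. */
static bool toy_canonically_equal(const uint32_t *s1, size_t s1_len,
                                  const uint32_t *s2, size_t s2_len)
{
    /* Each input code point expands to at most two in the toy table. */
    uint32_t d1[TOY_MAX * 2], d2[TOY_MAX * 2];
    assert(s1_len <= TOY_MAX && s2_len <= TOY_MAX);

    size_t n1 = toy_decompose(s1, s1_len, d1);
    size_t n2 = toy_decompose(s2, s2_len, d2);
    if (n1 != n2)
        return false;
    for (size_t i = 0; i < n1; i++) {
        if (d1[i] != d2[i])
            return false;
    }
    return true;
}
```

With this sketch, "café" written with the precomposed U+00E9 and "café" written as "e" followed by U+0301 compare as equal, while "cafe" without the accent does not.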
The implementation normalizes the strings incrementally while comparing them. This is the more efficient approach when it's unknown whether the text is already normalized. If it's known in advance that both strings are normalized, it's faster to compare them directly with memcmp or strcmp.
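The following sketch shows why the direct comparison is safe only for text already in the same normalization form. The helper same_bytes and the sample arrays are hypothetical names introduced for illustration; only memcmp comes from the C standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Byte-wise equality. It coincides with canonical equivalence only when
 * both inputs are already in the same normalization form. */
static bool same_bytes(const unsigned char *a, size_t a_len,
                       const unsigned char *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}

/* "é" encoded as UTF-8 in two canonically equivalent ways. */
static const unsigned char precomposed[] = {0xC3, 0xA9};       /* U+00E9 */
static const unsigned char decomposed[]  = {0x65, 0xCC, 0x81}; /* U+0065 U+0301 */
static const unsigned char decomposed2[] = {0x65, 0xCC, 0x81}; /* same, already NFD */
```

Comparing precomposed against decomposed with same_bytes reports a mismatch even though the two are canonically equivalent, which is the case uni_normcmp exists to handle. Comparing decomposed against decomposed2 gives the correct answer with no normalization work at all.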
The implementation strives to be highly performant and to avoid dynamic memory allocation when possible. Typically, memory is allocated only for unnaturally long combining character sequences, like Zalgo text. It's rare for real-world text to trigger memory allocation.