Function
uni_norm
Normalize text.
Parameters
form | in | Normalization form. |
src | in | Input text. |
src_len | in | Number of code units in src. |
src_attr | in | Attributes of src. |
dst | out | Output string; can be NULL. |
dst_len | inout | Code unit capacity of dst. |
dst_attr | in | Attributes of dst. |
Return Value
UNI_OK | On success. |
UNI_BAD_OPERATION | If |
UNI_BAD_ENCODING | If |
UNI_NO_SPACE | If the capacity of dst is insufficient. |
UNI_NO_MEMORY | If dynamic memory allocation failed. |
UNI_FEATURE_DISABLED | If Unicorn was built without support for normalizing to the form specified by form. |
Discussion
Normalizes src into the normalization form specified by form and writes the result to dst.
If dst is not NULL, then the implementation writes to dst_len the total number of code units written to dst. If the capacity of dst is insufficient, then UNI_NO_SPACE is returned; otherwise, UNI_OK is returned.
If dst is NULL, then dst_len must be zero. In that case, the function writes to dst_len the number of code units in the fully normalized text and returns UNI_OK. Call the function this way to first compute the total size needed for the destination buffer, then call it again with a sufficiently sized buffer.
Examples
This example normalizes the input string to Normalization Form D. This form is ideal for in-memory string comparison because it is quick to compute. For persistent storage or transmission, Normalization Form C is preferred.
#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char *in = u8"Åström";
    char out[32];
    unisize outlen = sizeof(out); // capacity in, code units written out
    if (uni_norm(UNI_NFD, in, -1, UNI_UTF8, out, &outlen, UNI_UTF8) != UNI_OK)
    {
        // something went wrong
        return 1;
    }
    // The precision argument of %.*s must be an int.
    printf("%.*s", (int)outlen, out);
    return 0;
}