Function

uni_norm

Normalize text.

Since v1.0

unistat uni_norm(uninormform form, const void *src, unisize src_len, uniattr src_attr, void *dst, unisize *dst_len, uniattr dst_attr)

Parameters

form	in	Normalization form.
src	in	Input text.
src_len	in	Number of code units in `src`, or `-1` if `src` is null-terminated.
src_attr	in	Attributes of `src`.
dst	out	Output string; can be `NULL`.
dst_len	inout	Code unit capacity of `dst` on input; number of code units written to `dst` on output.
dst_attr	in	Attributes of `dst`.

Return Value

UNI_OK	On success.
UNI_BAD_OPERATION	If `src` is `NULL`, if the value pointed to by `dst_len` is negative, or if `dst` is `NULL` and the value pointed to by `dst_len` is greater than zero.
UNI_BAD_ENCODING	If `src` is malformed; this is never returned if `src_attr` has `UNI_TRUST`.
UNI_NO_SPACE	If `dst` lacks the capacity to store the normalization of `src`.
UNI_NO_MEMORY	If dynamic memory allocation failed.
UNI_FEATURE_DISABLED	If Unicorn was built without support for normalizing to `form`.

Discussion

Normalizes `src` into the normalization form specified by `form` and writes the result to `dst`. If `dst` is not `NULL`, the implementation writes to `dst_len` the total number of code units written to `dst`. If the capacity of `dst` is insufficient, `UNI_NO_SPACE` is returned; otherwise the function returns `UNI_OK`.

If `dst` is `NULL`, then the value pointed to by `dst_len` must be zero. In that case the function writes to `dst_len` the number of code units in the fully normalized text and returns `UNI_OK`. Call the function this way to first compute the size needed for the destination buffer, then call it again with a sufficiently sized buffer.

Examples

This example normalizes the input string to Normalization Form D. This form is ideal for in-memory string comparison because it is quick to compute. For persistent storage or transmission, Normalization Form C is preferred.

#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char *in = u8"Åström";
    char out[32];
    unisize outlen = sizeof(out);

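    // Pass -1 as the source length because `in` is null-terminated.
    if (uni_norm(UNI_NFD, in, -1, UNI_UTF8, out, &outlen, UNI_UTF8) != UNI_OK)
    {
        // Normalization failed, e.g. malformed input or a too-small buffer.
        return 1;
    }

    // The `*` precision of `%.*s` expects an int, so cast the unisize length.
    printf("%.*s", (int)outlen, out);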
    return 0;
}
