Function

uni_casefold

Perform case folding.

Since v1.0
unistat uni_casefold(
    unicasefold casing,
    const void *src,
    unisize src_len,
    uniattr src_attr,
    void *dst,
    unisize *dst_len,
    uniattr dst_attr)

Parameters

casing in

Case fold form to apply to src.

src in

Text to case fold.

src_len in

Number of code units in src, or -1 if src is null terminated.

src_attr in

Attributes of src.

dst out

Buffer to write the case folded result to (can be NULL).

dst_len inout

On input, the capacity of dst in code units; on output, the number of code units in the case folded text.

dst_attr in

Attributes of dst.

Return Value

UNI_OK

If src was case folded successfully.

UNI_BAD_OPERATION

If src is NULL, a length is negative (other than src_len being -1), or dst_len is NULL.

UNI_BAD_ENCODING

If src is not well-formed (checks are omitted if src_attr has UNI_TRUST).

UNI_NO_SPACE

If dst was not large enough.

UNI_FEATURE_DISABLED

If the library was built without support for case folding.

UNI_NO_MEMORY

If dynamic memory allocation failed.

Discussion

Case folds src using the case fold form casing and writes the result to dst. The capacity of dst is specified by dst_len.

If dst is not NULL, then the implementation writes to dst_len the total number of code units written to dst. If the capacity of dst is insufficient, then UNI_NO_SPACE is returned; otherwise, UNI_OK is returned.

If dst is NULL, then dst_len must be zero and the implementation writes to dst_len the number of code units in the fully case folded text and returns UNI_OK. Call the function this way to compute the total length of the destination buffer before calling it again with a sufficiently sized buffer.

Examples

This example case folds a string for canonical caseless comparison.

The example will produce an output string that is longer than the input string. This is because the German 'ß' (U+00DF) case folds to 'ss' (U+0073 U+0073) and the Latin small letter 'ö' with diaeresis (U+00F6) canonically normalizes to an 'o' (U+006F) followed by a combining diaeresis character (U+0308).

#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char *in = u8"Stößen";
    char out[32];
    unisize outlen = sizeof(out);

    if (uni_casefold(UNI_CANONICAL, in, -1, UNI_UTF8, out, &outlen, UNI_UTF8) != UNI_OK)
    {
        // something went wrong
        return 1;
    }

    printf("%.*s", (int)outlen, out); // the precision argument must be an int; prints "stössen"
    return 0;
}