Function

uni_nextbrk

Compute next boundary.

Since v1.0
unistat uni_nextbrk(
    unibreak boundary,
    const void *text,
    unisize text_len,
    uniattr text_attr,
    unisize *index);

Parameters

boundary in

Boundary to detect.

text in

The text to segment.

text_len in

Number of code units in text, or -1 if it's null terminated.

text_attr in

Attributes of text.

index inout

Code point boundary as a code unit offset in text.

Return Value

UNI_OK

If the next boundary was found and index was updated.

UNI_DONE

If index is beyond the last character of text.

UNI_BAD_OPERATION

If text or index is NULL.

UNI_BAD_ENCODING

If text is malformed; this is never returned if text_attr has UNI_TRUST.

Discussion

Compute the next boundary in text, starting from a known code point whose position is given by the code unit offset index. On success, the implementation sets index to the code unit offset of the next boundary.
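To make the notion of a code unit offset concrete, here is a minimal sketch in plain C that steps between code point boundaries of a UTF-8 string. The helper next_code_point_offset is hypothetical and is not part of the library. It assumes well-formed, null-terminated UTF-8, and it only finds code point boundaries, not grapheme, word, or sentence boundaries, so it does not show how uni_nextbrk is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): returns the code unit
 * offset of the code point boundary that follows `index` in a well-formed,
 * null-terminated UTF-8 string, or -1 if `index` is at or past the end. */
long next_code_point_offset(const char *text, long index)
{
    assert(text != NULL && index >= 0);
    long len = (long)strlen(text);
    if (index >= len)
        return -1;
    /* Step over the lead byte, then over any continuation bytes,
     * which always have the bit pattern 10xxxxxx. */
    index++;
    while (index < len && ((unsigned char)text[index] & 0xC0) == 0x80)
        index++;
    return index;
}
```

For the string "Hi, 世界", each of the four ASCII characters occupies one code unit and each of the two CJK characters occupies three, so successive calls starting from offset 0 yield 1, 2, 3, 4, 7 and 10. Every extended grapheme cluster in that string is a single code point, which is why these offsets coincide with the grapheme boundaries there. In general they do not.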

Examples

This example iterates the extended grapheme clusters of a UTF-8 encoded string and prints the code unit offset of the boundary that follows each cluster.

#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char *string = u8"Hi, 世界";
    unisize index = 0;
    while (uni_nextbrk(UNI_GRAPHEME, string, -1, UNI_UTF8, &index) == UNI_OK)
    {
        printf("%d\n", (int)index); // prints '1', '2', '3', '4', '7', '10'
    }
    return 0;
}