Function

uni_nextbrk

Compute next boundary.

Since v1.0

unistat uni_nextbrk(unibreak boundary, const void *text, unisize text_len, uniattr text_attr, unisize *index)

Parameters

boundary	in	Boundary to detect.
text	in	The text to segment.
text_len	in	Number of code units in `text`, or `-1` if it is null-terminated.
text_attr	in	Attributes of `text`.
index	inout	On input, a known code point boundary given as a code unit offset in `text`; on output, the code unit offset of the next boundary.

Return Value

UNI_OK	If the break iterator was successfully repositioned.
UNI_DONE	If `index` is beyond the last character of `text`.
UNI_BAD_OPERATION	If `text` or `index` is `NULL`.
UNI_BAD_ENCODING	If `text` is malformed; this is never returned if `text_attr` has `UNI_TRUST`.

Discussion

Compute the next boundary in `text`, starting from a known code point boundary given by the code unit offset `index`. On success, the implementation sets `index` to the code unit offset of the next boundary.
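The in/out contract of `index` can be illustrated with a small stand-in that has no dependency on the library. The sketch below is not Unicorn's implementation: `next_code_point`, `step_status`, and `collect_boundaries` are hypothetical names invented for this illustration, the function only finds UTF-8 code point boundaries (not grapheme, word, or sentence boundaries), and it assumes the input is well-formed UTF-8. It mirrors the documented behavior in three respects: a length of `-1` means null-terminated text, a `NULL` argument is rejected, and the call reports completion once `index` has reached the end of the text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes standing in for UNI_OK, UNI_DONE and
   UNI_BAD_OPERATION. They are not part of any real library. */
typedef enum { STEP_OK, STEP_DONE, STEP_BAD_OPERATION } step_status;

/* Hypothetical stand-in for a boundary function. It advances *index from
   one UTF-8 code point boundary to the next. Assumes well-formed UTF-8. */
static step_status next_code_point(const char *text, long text_len, long *index)
{
    if (text == NULL || index == NULL)
        return STEP_BAD_OPERATION;
    if (text_len < 0)
        text_len = (long)strlen(text);   /* -1 means null-terminated */
    if (*index >= text_len)
        return STEP_DONE;                /* nothing left to scan */

    long i = *index + 1;
    /* Skip UTF-8 continuation bytes, which have the bit pattern 10xxxxxx. */
    while (i < text_len && ((unsigned char)text[i] & 0xC0) == 0x80)
        i++;

    *index = i;                          /* report the next boundary */
    return STEP_OK;
}

/* Calls the stand-in in a loop, feeding each result back in as the next
   starting point, and stores every boundary in out. Returns the count. */
static size_t collect_boundaries(const char *text, long *out, size_t cap)
{
    size_t n = 0;
    long index = 0;
    while (n < cap && next_code_point(text, -1, &index) == STEP_OK)
        out[n++] = index;
    return n;
}
```

Because the function writes its result back through the same pointer it reads the starting position from, the caller needs no extra state between calls: the loop simply repeats until the status is no longer the success code.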

Examples

This example iterates the extended grapheme clusters of a UTF-8 encoded string and prints the code unit offset of the boundary that follows each cluster.

#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char *string = u8"Hi, 世界";
    unisize index = 0;
    while (uni_nextbrk(UNI_GRAPHEME, string, -1, UNI_UTF8, &index) == UNI_OK)
    {
        printf("%d\n", (int)index); // prints '1', '2', '3', '4', '7', '10'
    }
    return 0;
}
