Function
uni_next
Decode a scalar value.
Parameters
text | in | Character sequence to decode. |
text_len | in | Number of code units in text. |
text_attr | in | Attributes of text. |
index | inout | Code unit offset in text. |
cp | out | Decoded scalar value. |
Return Value
UNI_OK | If the scalar was successfully decoded. |
UNI_DONE | If the end of text was reached. |
UNI_BAD_ENCODING | If text is malformed. |
UNI_BAD_OPERATION | If |
Discussion
Decodes one Unicode scalar value from text at code unit index and writes the result to cp. The index parameter is updated by the implementation to refer to the code unit beginning the next scalar.
The number of code units in text is specified by text_len and its encoding is specified by text_attr. If text_len is negative then text is assumed to be null terminated.
This function returns UNI_DONE when iteration has reached the end of text; otherwise it returns UNI_OK, indicating that a scalar was decoded. If the implementation detects that text is malformed, it returns UNI_BAD_ENCODING.
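To make the decode-and-advance contract concrete, here is a minimal, self-contained sketch of a UTF-8 decoder with the same shape: an in/out index, an out scalar, and a status result. It is not Unicorn's implementation. The names sketch_next and sketch_stat are hypothetical, only UTF-8 is handled, and advancing the index by one code unit after malformed input is an assumption of this sketch rather than documented behavior of uni_next.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for this sketch only; not Unicorn's. */
typedef enum { SKETCH_OK, SKETCH_DONE, SKETCH_BAD_ENCODING } sketch_stat;

/* Decode one UTF-8 scalar at *index and advance *index past it.
   A negative text_len means text is null terminated. */
static sketch_stat sketch_next(const char *text, long text_len,
                               size_t *index, uint32_t *cp)
{
    const unsigned char *s = (const unsigned char *)text;
    size_t len = (text_len < 0) ? strlen(text) : (size_t)text_len;
    size_t i = *index;
    size_t need;         /* continuation bytes expected after the lead byte */
    uint32_t value, min; /* decoded value and smallest non-overlong value */

    if (i >= len)
        return SKETCH_DONE;

    if (s[i] < 0x80)                { need = 0; value = s[i];        min = 0; }
    else if ((s[i] & 0xE0) == 0xC0) { need = 1; value = s[i] & 0x1F; min = 0x80; }
    else if ((s[i] & 0xF0) == 0xE0) { need = 2; value = s[i] & 0x0F; min = 0x800; }
    else if ((s[i] & 0xF8) == 0xF0) { need = 3; value = s[i] & 0x07; min = 0x10000; }
    else                            { *index = i + 1; return SKETCH_BAD_ENCODING; }

    /* Sequence truncated by the end of text. */
    if (need > len - i - 1) { *index = i + 1; return SKETCH_BAD_ENCODING; }

    for (size_t k = 1; k <= need; k++) {
        if ((s[i + k] & 0xC0) != 0x80) { *index = i + 1; return SKETCH_BAD_ENCODING; }
        value = (value << 6) | (uint32_t)(s[i + k] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (value < min || value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF)) {
        *index = i + 1;
        return SKETCH_BAD_ENCODING;
    }

    *cp = value;
    *index = i + need + 1; /* now refers to the start of the next scalar */
    return SKETCH_OK;
}
```

Calling sketch_next repeatedly with the same index variable walks the string one scalar at a time, which is the same calling pattern uni_next is described as supporting.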
The index parameter must refer to a code point boundary; otherwise the behavior is undefined.
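When an offset comes from somewhere other than a previous decode, for example a byte count from a file, a caller may want to check it before use. The sketch below shows one way to do that for well-formed UTF-8 only, where a boundary is any offset that does not land on a continuation byte (bit pattern 10xxxxxx). Both helper names are hypothetical and are not part of the library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if index is a code point boundary in
   well-formed UTF-8 text of len code units. */
static bool sketch_is_utf8_boundary(const char *text, size_t len, size_t index)
{
    if (index >= len)
        return index == len; /* the end of text counts as a boundary */
    return ((unsigned char)text[index] & 0xC0) != 0x80;
}

/* Hypothetical helper: move index back to the nearest boundary at or
   before it. */
static size_t sketch_align_back(const char *text, size_t len, size_t index)
{
    if (index > len)
        index = len;
    while (index > 0 && !sketch_is_utf8_boundary(text, len, index))
        index--;
    return index;
}
```

For UTF-16 the equivalent check would test for a low surrogate instead; that case is not covered here.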
Examples
This example prints each Unicode scalar value from a text string encoded as UTF-8.
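#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char str[] = u8"I 🕵️."; // I spy
    unisize i = 0;
    for (;;)
    {
        unichar cp;
        // A negative length means str is null terminated.
        unistat r = uni_next(str, -1, UNI_UTF8, &i, &cp);
        if (r == UNI_DONE)
        {
            break; // reached the end of str
        }
        else if (r == UNI_BAD_ENCODING)
        {
            // malformed character; stop here rather than assume
            // that i was advanced past the malformed input
            break;
        }
        else if (r != UNI_OK)
        {
            return 1; // any other error, e.g. UNI_BAD_OPERATION; cp is not valid
        }
        else
        {
            printf("U+%04X\n", (unsigned int)cp); // print scalar
        }
    }
    return 0;
}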