Function
uni_next
Decode a scalar value.
Parameters
| text | in | Character sequence to decode. |
| text_len | in | Number of code units in text. |
| text_attr | in | Attributes of text. |
| index | inout | Code unit offset in text. |
| cp | out | Decoded scalar value. |
Return Value
| UNI_OK | If the scalar was successfully decoded. |
| UNI_DONE | If the end of text was reached. |
| UNI_BAD_ENCODING | If text is malformed. |
| UNI_BAD_OPERATION | If |
Discussion
Decodes one Unicode scalar value from text at code unit index and writes the result to cp. The index parameter is updated by the implementation to refer to the code unit beginning the next scalar.
The number of code units in text is specified by text_len, and its encoding is specified by text_attr. If text_len is negative, text is assumed to be null-terminated.
This function returns UNI_DONE when iteration has reached the end of text; otherwise it returns UNI_OK, indicating there are more characters. If the implementation detects that text is malformed, it returns UNI_BAD_ENCODING.
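To make the contract above concrete, here is a minimal, self-contained sketch of a decoder with the same shape for UTF-8 input only. It is not Unicorn's implementation: the names sk_next, sk_stat, SK_OK, SK_DONE, and SK_BAD_ENCODING are hypothetical stand-ins, and the choice to leave the index untouched on error is an assumption of this sketch, not documented behavior of uni_next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; they mirror, but are not, Unicorn's. */
typedef enum { SK_OK, SK_DONE, SK_BAD_ENCODING } sk_stat;

/* Decode one scalar value from UTF-8 `text` starting at `*index`.
 * A negative `len` means `text` is null-terminated.
 * On SK_OK, `*cp` holds the scalar and `*index` refers to the next one.
 * On any other status, `*index` and `*cp` are left unchanged (sketch assumption). */
static sk_stat sk_next(const unsigned char *text, ptrdiff_t len,
                       ptrdiff_t *index, uint32_t *cp)
{
    assert(text != NULL && index != NULL && cp != NULL);
    ptrdiff_t i = *index;

    /* End of text: the offset reached len, or we hit the terminator. */
    if (len >= 0 ? i >= len : text[i] == 0)
        return SK_DONE;

    unsigned char b0 = text[i];
    int n;        /* total code units in the sequence */
    uint32_t v;   /* accumulated value */
    uint32_t min; /* smallest value allowed for this length (rejects overlongs) */
    if (b0 < 0x80)                { n = 1; v = b0;        min = 0; }
    else if ((b0 & 0xE0) == 0xC0) { n = 2; v = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { n = 3; v = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { n = 4; v = b0 & 0x07; min = 0x10000; }
    else return SK_BAD_ENCODING;  /* stray continuation or invalid lead byte */

    for (int k = 1; k < n; k++)
    {
        /* Running past the end counts as a truncated sequence. */
        if (len >= 0 ? i + k >= len : text[i + k] == 0)
            return SK_BAD_ENCODING;
        unsigned char b = text[i + k];
        if ((b & 0xC0) != 0x80)
            return SK_BAD_ENCODING;
        v = (v << 6) | (b & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (v < min || (v >= 0xD800 && v <= 0xDFFF) || v > 0x10FFFF)
        return SK_BAD_ENCODING;

    *cp = v;
    *index = i + n;
    return SK_OK;
}
```

Calling sk_next repeatedly with the same index variable walks the text one scalar at a time, which is the iteration pattern the return values describe.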
The index parameter must refer to a code point boundary; otherwise the behavior is undefined.
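For UTF-8 text, one way a caller might check this precondition before starting iteration at an arbitrary offset is to test whether the code unit at that offset is a continuation byte. The helper below is a hypothetical sketch, not part of Unicorn, and it assumes the text is otherwise well-formed UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if `index` is a code point boundary in
 * well-formed UTF-8 `text` of `len` code units. The end of text counts as a
 * boundary; any offset holding a continuation byte (10xxxxxx) does not. */
static bool sk_is_boundary_utf8(const unsigned char *text, size_t len, size_t index)
{
    assert(text != NULL);
    if (index > len)
        return false; /* out of range */
    if (index == len)
        return true;  /* end of text */
    return (text[index] & 0xC0) != 0x80;
}
```

For the three code units of "Aé" (0x41 0xC3 0xA9), offsets 0, 1, and 3 are boundaries, while offset 2 falls inside the two-unit sequence for é.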
Examples
This example prints each Unicode scalar value from a text string encoded as UTF-8.
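#include <unicorn.h>
#include <stdio.h>

int main(void)
{
    const char str[] = u8"I 🕵️."; // I spy
    unisize i = 0; // code unit offset of the next scalar
    for (;;)
    {
        unichar cp;
        // A negative length means str is null-terminated.
        unistat r = uni_next(str, -1, UNI_UTF8, &i, &cp);
        if (r == UNI_DONE)
        {
            break; // reached the end of the text
        }
        else if (r == UNI_BAD_ENCODING)
        {
            // malformed character; stop rather than loop on the same offset
            break;
        }
        else
        {
            printf("U+%04X\n", (unsigned)cp); // print scalar
        }
    }
    return 0;
}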