Function
uni_encode
Encode a scalar value.
Parameters
| cp | in | Unicode scalar value. |
| dst | out | Encoded character. |
| dst_len | inout | Capacity of dst in code units on input; number of code units needed on output. |
| dst_attr | in | Encoding form to encode cp into. |
Return Value
| UNI_OK | If the character was encoded successfully. |
| UNI_BAD_OPERATION | If |
| UNI_NO_SPACE | If the capacity of dst is insufficient. |
| UNI_FEATURE_DISABLED | If Unicorn was built without support for the encoding form specified by dst_attr. |
Discussion
This function encodes the Unicode scalar value cp into the encoding form dst_attr and writes the resulting code units to dst. The code unit capacity of dst is specified by dst_len. On output, the implementation sets dst_len to the number of code units needed to encode cp.
If the capacity of dst is insufficient, then UNI_NO_SPACE is returned.
To help size dst correctly, the following table lists the maximum number of code units needed to encode any character in each encoding form.
| Encoding Form | Longest Code Unit Sequence |
|---|---|
| UTF-8 | 4 |
| UTF-16 | 2 |
| UTF-32 | 1 |
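The maxima in the table follow from how each encoding form maps scalar value ranges to code unit counts. The sketch below is independent of Unicorn: the helper names `utf8_units`, `utf16_units`, and `utf32_units` are hypothetical and exist only for illustration, and each helper assumes its argument is a valid Unicode scalar value.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of Unicorn. Each returns the number of
   code units needed to encode one valid Unicode scalar value. */

/* UTF-8: 1 to 4 code units depending on the magnitude of the value. */
int utf8_units(uint32_t cp)
{
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* UTF-16: 1 code unit inside the Basic Multilingual Plane,
   otherwise a surrogate pair (2 code units). */
int utf16_units(uint32_t cp)
{
    return cp < 0x10000 ? 1 : 2;
}

/* UTF-32: always exactly 1 code unit. */
int utf32_units(uint32_t cp)
{
    (void)cp; /* the count does not depend on the value */
    return 1;
}
```

A buffer sized to the table's maximum for the chosen encoding form is therefore large enough for any scalar value, which is why the example on this page uses a four-element array for UTF-8.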
If dst is null and dst_len is zero, the implementation computes the number of code units needed, writes it to dst_len, and returns UNI_OK.
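To make the capacity-in, length-out contract concrete, here is a minimal standalone sketch of a UTF-8 encoder that follows the same calling convention. It is not Unicorn's implementation: `sketch_encode_utf8` and the `SKETCH_*` status codes are hypothetical names that only mirror the return values listed above. Three behaviors are assumptions of this sketch rather than documented Unicorn behavior: rejecting a null length pointer or a non-scalar value, rejecting a null buffer with a nonzero capacity, and still reporting the required length when the buffer is too small.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch only. */
typedef enum { SKETCH_OK, SKETCH_BAD_OPERATION, SKETCH_NO_SPACE } sketch_stat;

/* Encode cp as UTF-8. *dst_len holds the capacity of dst on input and
   the number of code units needed on output. */
sketch_stat sketch_encode_utf8(uint32_t cp, uint8_t *dst, size_t *dst_len)
{
    /* Assumption: a null length pointer or a non-scalar value is an error. */
    if (dst_len == NULL || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return SKETCH_BAD_OPERATION;

    size_t needed = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    size_t capacity = *dst_len;
    *dst_len = needed; /* always report the required length */

    /* Length query: a null buffer with zero capacity only asks for the size.
       Assumption: a null buffer with nonzero capacity is an error. */
    if (dst == NULL)
        return capacity == 0 ? SKETCH_OK : SKETCH_BAD_OPERATION;
    if (capacity < needed)
        return SKETCH_NO_SPACE;

    switch (needed) {
    case 1:
        dst[0] = (uint8_t)cp;
        break;
    case 2:
        dst[0] = (uint8_t)(0xC0 | (cp >> 6));
        dst[1] = (uint8_t)(0x80 | (cp & 0x3F));
        break;
    case 3:
        dst[0] = (uint8_t)(0xE0 | (cp >> 12));
        dst[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        dst[2] = (uint8_t)(0x80 | (cp & 0x3F));
        break;
    default:
        dst[0] = (uint8_t)(0xF0 | (cp >> 18));
        dst[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        dst[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        dst[3] = (uint8_t)(0x80 | (cp & 0x3F));
        break;
    }
    return SKETCH_OK;
}
```

The same two-step pattern should carry over to `uni_encode`: a first call with a null dst and a zero-valued dst_len obtains the required length, and a second call with a buffer of that size performs the encoding.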
Examples
This example encodes the Unicode scalar value for GRINNING FACE (U+1F600) as UTF-8 encoded code units. The code units are printed to stdout as hexadecimal numbers.
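    #include <unicorn.h>
    #include <stdio.h>

    int main(void)
    {
        unichar cp = 0x1F600; /* GRINNING FACE */
        uint8_t u8[4];        /* UTF-8 needs at most 4 code units */
        unisize u8_len = 4;   /* capacity on input, code units needed on output */

        if (uni_encode(cp, u8, &u8_len, UNI_UTF8) != UNI_OK)
        {
            puts("failed to encode character");
            return 1; /* do not print an unfilled buffer */
        }

        /* Print each code unit as a hexadecimal number. */
        for (unisize i = 0; i < u8_len; i++)
        {
            printf("0x%02X ", u8[i]);
        }
        putchar('\n');
        return 0;
    }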