Getting Started

Unicorn is a lightweight, embeddable implementation of common Unicode® algorithms written in C99. The easiest way to integrate Unicorn into your project is to consume it as an amalgamation. The amalgamation is a pair of header and source files that you compile directly with your project. Alternatively, you can consume Unicorn the “traditional” way by compiling it as a library and linking it with your application.

To consume Unicorn as an amalgamation, you can either (1) use a prebuilt amalgamation or (2) build the amalgamation yourself. To use the prebuilt amalgamation, download the archive with “amalgamation” in its name from the downloads page, extract the files unicorn.c and unicorn.h, and compile them with your project. The prebuilt amalgamation is easy to consume, but it includes every feature of the library, even those you do not need. To customize which features the amalgamation includes, you must build it yourself.

Building from Source

Building Unicorn, either as a traditional library or an amalgamation, requires Python 3.10 or newer.
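
Before building, it can help to confirm that the interpreter on your PATH meets the 3.10 minimum. The following is a minimal sketch that assumes the interpreter is invoked as python3; the variable name python_ok is only illustrative.

```shell
# Check that the python3 on PATH satisfies the Python 3.10 minimum.
# Assumption: the interpreter is named "python3" on this system.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)'; then
    python_ok=yes
else
    # Reached when python3 is older than 3.10 or is not installed at all.
    python_ok=no
fi
echo "Python 3.10 or newer available: $python_ok"
```

If the check reports "no", install a newer Python or adjust your PATH before running any of the build commands.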

To customize which features Unicorn is built with, edit the features.json file found in the source distribution. The schema for this file is defined in the Feature Customization guide.

Building an Amalgamation

To generate your own amalgamation, download the release archive without the word “amalgamation” in its name, extract it, and run the following from your shell:

$ ./generate.pyz

This will generate the files unicorn.c and unicorn.h in the same directory. You can compile these source files directly with your project.
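
Compiling the amalgamation with a project might look like the sketch below. It assumes a GCC- or Clang-style compiler driver reachable as cc (or through the CC variable) and a hypothetical main.c of your own that includes unicorn.h; neither the file name main.c nor the output name app comes from Unicorn itself.

```shell
# Assumptions: a GCC/Clang-style driver, and a hypothetical main.c
# (your own program) sitting next to the generated unicorn.c and unicorn.h.
CC=${CC:-cc}
built=no
if [ -f unicorn.c ] && [ -f unicorn.h ] && [ -f main.c ]; then
    # Compile the amalgamation as C99, compile your program, then link both.
    "$CC" -std=c99 -c unicorn.c -o unicorn.o &&
    "$CC" -c main.c -o main.o &&
    "$CC" unicorn.o main.o -o app &&
    built=yes
else
    echo "unicorn.c, unicorn.h, or main.c not found in $(pwd)" >&2
fi
echo "Build succeeded: $built"
```

In a real project you would normally add unicorn.c to your existing build system's source list rather than invoke the compiler by hand.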

Building a Library

To consume Unicorn as a traditional static or dynamic library, build it with:

$ ./configure
$ make
$ make install

or build with CMake:

$ cmake -B build
$ cmake --build build
$ cmake --install build

Versioning Notice

Unicorn follows semantic versioning with one deviation: the result of a function may change between minor version bumps. For example, the function uni_is might return a different value for a Unicode character between minor releases.

This deviation exists because Unicorn relies on the Unicode Standard, and updates to Unicode may cause functions to return different results. Changes to the Unicode Standard are outside of Railgun Labs' control, so while Unicorn strives to maintain backward compatibility, it cannot guarantee identical results across minor releases.
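
If your application stores or compares results produced by Unicorn, one way to act on this notice is to flag upgrades that cross a minor version boundary so that result-sensitive tests get re-run. The sketch below uses hypothetical version strings, and it assumes that patch releases leave results unchanged, which the notice implies but does not state outright.

```shell
# Hypothetical version strings; substitute the releases you are comparing.
old_version="1.2.3"
new_version="1.3.0"

# Strip the patch component, leaving "major.minor".
old_minor=${old_version%.*}
new_minor=${new_version%.*}

if [ "$old_minor" != "$new_minor" ]; then
    # Major or minor changed: function results may differ between releases.
    results_may_differ=yes
else
    # Assumption: patch-only upgrades do not change function results.
    results_may_differ=no
fi
echo "Re-verify stored results: $results_may_differ"
```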

Under normal circumstances, Railgun bumps the major version only if the C API changes (e.g. a function signature is altered or a function is removed). In the event of a major version bump due to changes in Unicode, Railgun may offer free license upgrades to all current licensees, depending on the extent of the changes. Licensees will be notified of free upgrades through private communication channels and a public announcement post.
