Text to Unicode encoding replaces each character with its code point in the \uXXXX escape format used by JSON and JavaScript source. It is useful when a downstream tool cannot handle raw non-ASCII bytes or when you want the input to look identical on every system regardless of font coverage. The tool handles characters outside the BMP by emitting surrogate pairs that recompose to the original codepoint on decode.
Force non-ASCII characters into ASCII-only \u escapes so every JSON parser sees the same bytes regardless of transport encoding.
When a legacy log pipeline mangles UTF-8, encoding the input first guarantees the text survives the trip unchanged.
Translation files often ship as \u-escaped resource bundles; build them by hand with a quick encode pass.
Emit the code point for a suspicious homograph or zero-width character so a reader can identify exactly which codepoint is involved.
By default only non-ASCII characters are escaped. ASCII letters and digits pass through unchanged for readability.
Codepoints above U+FFFF are emitted as UTF-16 surrogate pairs, which is what JavaScript and JSON expect.
Yes, switch to force-all mode to escape every character including ASCII. Output is longer but byte-identical across every ASCII-safe transport.
Yes, run the Unicode decoder on the output to get the original string back without loss.