What does text to Unicode encoding produce?

Text to Unicode encoding replaces characters with their code points in the \uXXXX escape format used by JSON and JavaScript source. By default only non-ASCII characters are escaped. It is useful when a downstream tool cannot handle raw non-ASCII bytes, or when you want the text to read identically on every system regardless of font coverage. Characters outside the Basic Multilingual Plane (BMP), that is, code points above U+FFFF, are emitted as surrogate pairs that recombine into the original code point on decode.
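As a rough illustration of the behaviour described above, here is a minimal Python sketch. The function name to_unicode_escapes, the force_all parameter, and the choice of uppercase hex digits are assumptions made for this example, not the tool's actual implementation.

```python
def to_unicode_escapes(text: str, force_all: bool = False) -> str:
    """Hypothetical encoder: escape non-ASCII characters as \\uXXXX."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80 and not force_all:
            # ASCII passes through unchanged in the default mode.
            out.append(ch)
        elif cp <= 0xFFFF:
            # BMP characters fit in a single four-digit escape.
            out.append(f"\\u{cp:04X}")
        else:
            # Supplementary characters are split into a UTF-16 surrogate pair.
            offset = cp - 0x10000
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


print(to_unicode_escapes("caf\u00e9"))       # the word "café"
print(to_unicode_escapes("\U0001F600"))      # a single emoji, U+1F600
```

Note that this sketch leaves a literal backslash in the input untouched, so text that already contains something that looks like an escape would not round-trip cleanly; a real tool may handle that case differently.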

Use Cases

Embed safe strings in JSON

Force non-ASCII characters into ASCII-only \u escapes so every JSON parser sees the same bytes regardless of transport encoding.
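For comparison, Python's standard json module does the same thing when ensure_ascii is left at its default of True. The payload below is invented for the example; note that json.dumps writes lowercase hex digits, which decoders treat the same as uppercase.

```python
import json

# Sample payload (made up): "Zürich" and a supplementary-plane emoji.
payload = {"city": "Z\u00fcrich", "mood": "\U0001F600"}

# ensure_ascii=True is the default, so non-ASCII becomes \uXXXX escapes.
encoded = json.dumps(payload)
print(encoded)

# The result is pure ASCII and parses back to the original data.
assert encoded.isascii()
assert json.loads(encoded) == payload
```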

Patch around broken encodings

When a legacy log pipeline mangles UTF-8, encoding the input first keeps the payload pure ASCII, so the text survives the trip as long as the pipeline preserves ASCII.

Prepare localization fixtures

Translation files often ship as \u-escaped resource bundles; generate them with a quick encode pass instead of escaping each character by hand.

Document tricky characters

Emit the escape for a suspicious homograph or zero-width character so a reader can identify exactly which code point is involved.
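A small Python sketch of this kind of inspection, using the standard unicodedata module. The helper name describe is hypothetical, and the sample string is constructed for the example: it contains a Cyrillic letter that resembles a Latin "a" and a trailing zero-width space.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """List each character as 'U+XXXX NAME' so look-alikes become visible."""
    report = []
    for ch in text:
        # Some characters (e.g. control codes) have no name; use a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        report.append(f"U+{ord(ch):04X} {name}")
    return report


# Looks like "pay" on screen, but the middle letter is Cyrillic
# and an invisible zero-width space follows the "y".
for line in describe("p\u0430y\u200b"):
    print(line)
```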

FAQ

Does every character get escaped?

By default only non-ASCII characters are escaped. ASCII letters and digits pass through unchanged for readability.

What about supplementary characters?

Code points above U+FFFF are emitted as UTF-16 surrogate pairs, which is what JavaScript and JSON expect. For example, U+1F600 becomes \uD83D\uDE00.
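The surrogate arithmetic is defined by UTF-16 itself, so it can be sketched independently of any particular tool. The function names below are made up for the example; the last line cross-checks the result against Python's own UTF-16 encoder.

```python
def split_surrogates(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = cp - 0x10000                  # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def join_surrogates(high: int, low: int) -> int:
    """Recombine a surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = split_surrogates(0x1F600)
print(f"\\u{high:04X}\\u{low:04X}")

# The pair matches what Python's UTF-16 encoder produces for the same character.
assert chr(0x1F600).encode("utf-16-be") == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```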

Can I encode the whole string?

Yes. Switch to force-all mode to escape every character, including ASCII. The output is longer but byte-identical across every ASCII-safe transport.
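One way to approximate force-all behaviour in Python is to go through UTF-16 code units, which yields surrogate pairs for free. This is a sketch under that assumption, with a hypothetical function name; it does not describe how the tool itself is built.

```python
def escape_all(text: str) -> str:
    """Escape every character, ASCII included, as \\uXXXX UTF-16 code units."""
    data = text.encode("utf-16-be")
    # Each UTF-16 code unit is two bytes; supplementary characters yield two units.
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{unit:04X}" for unit in units)


print(escape_all("Hi"))
print(escape_all("\U0001F600"))
```

Every code unit costs six output characters, so plain ASCII input grows sixfold.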

Is there a decoder?

Yes. Run the Unicode decoder on the output to recover the original string without loss.
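To show what decoding involves, here is a minimal Python sketch of the reverse step. The function name and the regular expression are assumptions for the example; the decoder on this site may behave differently, for instance when it meets an unpaired surrogate.

```python
import re

# Match a high+low surrogate pair first, otherwise any single \uXXXX escape.
_ESCAPE = re.compile(
    r"\\u([Dd][89ABab][0-9A-Fa-f]{2})\\u([Dd][C-Fc-f][0-9A-Fa-f]{2})"
    r"|\\u([0-9A-Fa-f]{4})"
)


def _replace(match: re.Match) -> str:
    if match.group(3) is not None:
        # Single escape: one BMP code point.
        return chr(int(match.group(3), 16))
    # Surrogate pair: recombine into one supplementary code point.
    high = int(match.group(1), 16)
    low = int(match.group(2), 16)
    return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))


def from_unicode_escapes(escaped: str) -> str:
    """Hypothetical decoder: turn \\uXXXX escapes back into characters."""
    return _ESCAPE.sub(_replace, escaped)


assert from_unicode_escapes("caf\\u00E9") == "caf\u00e9"
assert from_unicode_escapes("\\uD83D\\uDE00") == "\U0001F600"
```

Text that is not part of an escape passes through untouched, and both uppercase and lowercase hex digits are accepted.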