Why is character counting harder than it looks?

What looks like one character on screen may be several codepoints and many bytes. This tool reports four numbers for any input: visible graphemes, UTF-16 code units (the value JavaScript's .length returns), Unicode codepoints, and UTF-8 bytes. Each number answers a different question, so pick the one that matches the layer enforcing your limit: display layers usually care about graphemes, while storage layers care about bytes or code units.
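As a rough illustration of how three of these numbers differ, here is a minimal Python sketch. The function name count_units is made up for this example and is not how the tool itself is implemented. Grapheme counting is left out because the Python standard library has no grapheme segmenter; that number needs a Unicode segmentation library.

```python
def count_units(text: str) -> dict[str, int]:
    """Return three of the four counts for `text` (graphemes are omitted)."""
    return {
        # Python strings are sequences of codepoints, so len() counts codepoints.
        "codepoints": len(text),
        # Each UTF-16 code unit is 2 bytes, so halve the encoded length.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        # Size of the serialized UTF-8 form.
        "utf8_bytes": len(text.encode("utf-8")),
    }


# "e" followed by U+0301 COMBINING ACUTE ACCENT renders as a single accented letter.
print(count_units("e\u0301"))
# -> {'codepoints': 2, 'utf16_units': 2, 'utf8_bytes': 3}

# Thumbs-up (U+1F44D) followed by a skin-tone modifier (U+1F3FD): one visible emoji.
print(count_units("\U0001F44D\U0001F3FD"))
# -> {'codepoints': 2, 'utf16_units': 4, 'utf8_bytes': 8}
```

Both inputs display as a single character, yet none of the three counts is 1.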

Use Cases

Validate Twitter-style limits

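Social platforms define their own counting rules, and most are closer to graphemes than to bytes. X (formerly Twitter), for example, weights characters so that an emoji or a CJK character counts as two. Use the grapheme total as a first check on your draft, then confirm against the platform's own counter.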

Fit database columns

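A VARCHAR(n) limit is measured in bytes in some databases and in characters (codepoints) in others, so check your database's documentation. Compare your text against the matching metric to avoid truncation or errors at insert time.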
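For a column measured in bytes, a minimal Python sketch of a pre-insert check might look like this. The helper name truncate_utf8 is hypothetical, and the sketch assumes the column stores UTF-8. It cuts on a codepoint boundary, so it never produces invalid UTF-8, but it can still split a grapheme (for example, dropping a combining accent).

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Shorten `text` so its UTF-8 encoding fits in `max_bytes` bytes."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Slicing bytes may cut a multi-byte sequence in half; errors="ignore"
    # drops the incomplete tail instead of raising UnicodeDecodeError.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


print(truncate_utf8("héllo", 2))  # -> h   (the 2-byte "é" does not fit)
print(truncate_utf8("héllo", 3))  # -> hé
```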

Plan SMS segmentation

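SMS uses the GSM-7 or UCS-2 encoding. A single message holds 160 GSM-7 characters or 70 UCS-2 code units, and fewer per segment once a message is split (153 and 67). For messages that fall back to UCS-2, the UTF-16 code unit count is the number to watch; the UTF-8 byte count does not map to SMS segments.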
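A simplified Python sketch of segment estimation follows. The function name sms_segments is invented for this example. It assumes the default GSM-7 alphabet with its standard extension table, the usual 160/153 and 70/67 limits, and no national language shift tables. Real gateways also avoid splitting an escape pair or a surrogate pair across segments, which this sketch ignores, so treat the result as an estimate.

```python
import math

# Default GSM-7 alphabet (one septet per character).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these costs two septets (escape + character).
GSM7_EXTENDED = set("^{}\\[~]|€\f")


def sms_segments(text: str) -> tuple[str, int]:
    """Estimate the encoding and number of segments needed for `text`."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        septets = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        count = 1 if septets <= 160 else math.ceil(septets / 153)
        return ("GSM-7", count)
    # Any character outside GSM-7 forces the whole message into UCS-2,
    # where length is measured in UTF-16 code units.
    units = len(text.encode("utf-16-le")) // 2
    count = 1 if units <= 70 else math.ceil(units / 67)
    return ("UCS-2", count)


print(sms_segments("a" * 160))  # -> ('GSM-7', 1)
print(sms_segments("a" * 161))  # -> ('GSM-7', 2)
print(sms_segments("你" * 71))  # -> ('UCS-2', 2)
```

Note how a single non-GSM character, such as an emoji, moves the whole message to UCS-2 and cuts the single-message limit from 160 to 70.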

Inspect hidden characters

A mismatch between grapheme and codepoint counts hints at combining marks or zero-width joiners you may not have noticed.
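To see what is actually hiding in a string, one option is to list every codepoint with its Unicode name and general category. This is a small Python sketch using the standard unicodedata module; the helper name describe_codepoints is made up for illustration.

```python
import unicodedata


def describe_codepoints(text: str) -> list[tuple[str, str, str]]:
    """List (U+XXXX, Unicode name, general category) for each codepoint."""
    return [
        (
            f"U+{ord(ch):04X}",
            # Some codepoints (e.g. control characters) have no name.
            unicodedata.name(ch, "<unnamed>"),
            unicodedata.category(ch),
        )
        for ch in text
    ]


# A zero-width joiner hidden between two letters.
for row in describe_codepoints("a\u200db"):
    print(row)
# -> ('U+0061', 'LATIN SMALL LETTER A', 'Ll')
# -> ('U+200D', 'ZERO WIDTH JOINER', 'Cf')
# -> ('U+0062', 'LATIN SMALL LETTER B', 'Ll')
```

Categories starting with "M" (marks) or equal to "Cf" (format characters) are the usual suspects when the counts disagree.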

FAQ

Grapheme vs codepoint vs byte?

Graphemes are what a user perceives as one character, codepoints are the individual Unicode values that make them up, and bytes are the serialized UTF-8 form. The tool shows all three, plus UTF-16 code units.

What counts as a visible character?

Anything the Unicode text segmentation algorithm (UAX #29) groups into one extended grapheme cluster, including emoji with skin-tone modifiers and flag sequences.

Does whitespace count?

Yes. Spaces, tabs, and newlines are characters. To exclude whitespace, remove it before counting; trimming only strips it from the start and end.
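The difference between trimming and removing all whitespace can be sketched in Python as follows. The helper name count_without_whitespace is hypothetical, and the sketch counts codepoints rather than graphemes.

```python
def count_without_whitespace(text: str) -> int:
    """Count codepoints in `text`, skipping every whitespace character."""
    return sum(1 for ch in text if not ch.isspace())


sample = " a b\tc\n "
print(len(sample))                       # -> 8  (everything counted)
print(len(sample.strip()))               # -> 5  (only the ends trimmed)
print(count_without_whitespace(sample))  # -> 3  (all whitespace skipped)
```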

Is the count case sensitive?

No. Upper- and lowercase letters are one grapheme each, so casing does not affect counts. Converting case can change the length, though: uppercasing "ß" gives "SS".