UTF-8 is a brilliant design


Exploring the brilliant design of the UTF-8 encoding system, which represents over a million characters while remaining backward compatible with ASCII

2025-09-12

The first time I learned about UTF-8 encoding, I was fascinated by how well thought out and brilliantly designed it is: it can represent over a million characters from different languages and scripts and still be backward compatible with ASCII.

The code point U+1F44B represents the waving hand emoji "👋" in the Unicode character set.

A file containing this emoji is still a valid UTF-8 file, but backward compatibility with ASCII no longer applies to it, because it contains a non-ASCII character (the emoji). Only files made up entirely of ASCII characters are byte-for-byte identical in both encodings.
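To make the waving hand example concrete, here is a minimal sketch of how a single code point can be packed into UTF-8 bytes by hand. The function name `utf8_encode` is hypothetical and exists only for illustration; it follows the standard UTF-8 bit layout (1 to 4 bytes) and is checked against Python's built-in `str.encode`. It is not taken from the article.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point using the UTF-8 bit layout.

    Illustrative helper (hypothetical name), not a replacement for str.encode.
    """
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")

    if code_point < 0x80:
        # 1 byte: 0xxxxxxx, identical to the ASCII byte
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0xC0 | (code_point >> 6),
            0x80 | (code_point & 0x3F),
        ])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])


if __name__ == "__main__":
    wave = utf8_encode(0x1F44B)
    print(wave.hex())                        # f09f918b
    print(wave == "👋".encode("utf-8"))      # True

    # ASCII text produces the same bytes in both encodings.
    print("Hey".encode("utf-8") == "Hey".encode("ascii"))  # True
```

Every byte of a multi-byte sequence has its high bit set, so none of them can be mistaken for an ASCII character; that is why pure ASCII text stays unchanged while the emoji needs four bytes.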

Related news:

Decoding UTF-8. Part III: Determining Sequence Length – A Lookup Table

Unicode shenanigans: Martine écrit en UTF-8

How to chop off bytes of an UTF-8 string to fit into a small slot and look nice