
How to chop off bytes of a UTF-8 string to fit into a small slot and look nice


While researching a very weird bug in Koha, I had to figure out a way to chop a string to a specific maximum length, measured in bytes and not in characters, because in that case the horrible format USMARC is used. Its spec starts with two red flags: it's from January 2000, and it's an "implementation of the American national standard", so you can bet that it only works (well) with ASCII and will be ... interesting when handling Unicode. Older formats (like ASCII) used a fixed length (e.g. one byte = 8 bits), but could therefore only represent a limited number of letters.
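To make the byte-versus-character problem concrete, here is a minimal sketch in Python (not the Perl that Koha itself uses, and not necessarily the approach taken for the actual fix). The function name `chop_utf8` is hypothetical. It cuts the encoded bytes at the limit and then drops any multi-byte sequence that the cut split in half, so the result is always valid UTF-8. It does not try to keep combining marks or other grapheme clusters together, which would need extra handling.

```python
def chop_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of `text` whose UTF-8 encoding
    fits into `max_bytes` bytes without splitting a character."""
    if max_bytes <= 0:
        return ""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Slicing may cut a multi-byte sequence in half. Because the input
    # was valid UTF-8, only the tail can be incomplete, and
    # errors="ignore" silently drops those dangling bytes.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # "ä" takes two bytes in UTF-8, so "Käse" is four characters but five bytes.
    for limit in range(6):
        print(limit, repr(chop_utf8("Käse", limit)))
```

With a limit of 2 bytes, the "ä" no longer fits, so only "K" survives instead of a broken half-character.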
