Wc2: Investigates optimizing 'wc', the Unix word count program
Investigates optimizing 'wc', the Unix word count program - robertdavidgraham/wc2
Real wc programs spend most of their time in functions like mbrtowc(), which parses multi-byte characters, and iswspace(), which tests whether a character is whitespace. Re-implementations of wc tend to skip both. The original wc program largely ignores errors, but handling them is still an important part of producing correct results. The performance of the traditional wc program varies wildly with its input, for example whether the file is full of illegal characters or whether UTF-8 is being handled.
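To make the cost concrete, here is a minimal sketch of the kind of loop described above: decode each character with mbrtowc(), classify it with iswspace(), and count transitions into a word. The function name count_words_mb and its error policy (skip one byte on an invalid sequence and leave the word state unchanged) are assumptions for illustration; this is not the code of any real wc, nor of wc2.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not from wc or wc2): counts words in buf[0..len)
   by decoding multi-byte characters under the current LC_CTYPE locale. */
size_t count_words_mb(const char *buf, size_t len)
{
    mbstate_t state;
    size_t words = 0;
    size_t i = 0;
    int in_word = 0;

    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    while (i < len) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, buf + i, len - i, &state);

        if (n == (size_t)-1) {
            /* Invalid sequence. Assumption: skip one byte, reset the
               decoder, and leave the in-word state as it was. */
            memset(&state, 0, sizeof state);
            i++;
            continue;
        }
        if (n == (size_t)-2) {
            /* Incomplete character at the end of the buffer: stop here. */
            break;
        }
        if (n == 0) {
            /* A NUL byte was decoded; it occupies one byte of input and
               is treated below like any other non-space character. */
            n = 1;
        }

        if (iswspace((wint_t)wc)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;               /* transition from space to word */
            words++;
        }
        i += n;
    }
    return words;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first so that mbrtowc() decodes according to the user's locale; without that call the program stays in the default "C" locale. Every character costs at least two library calls here, which is the overhead a byte-at-a-time re-implementation avoids by not doing the work at all.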