
Index 1.6B Keys with Automata and Rust (2015)


I blog mostly about my own programming projects.

Along the way, we will talk about memory maps, automaton intersection with regular expressions, fuzzy searching with Levenshtein distance, and streaming set operations.

50,000,000 keys is a lot, and compressing them in merely 27 seconds is a nice result, but I don't see it as a representative example of real workloads. It was unfortunate that the biggest data set I had to try with FSTs also happened to be a near-best-case scenario.

My hope is that this article taught you a little something about using finite state machines as a data structure, one that stores a large number of keys in a small amount of space while remaining easily searchable.
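To make the idea of a finite state machine as a searchable set of keys concrete, here is a minimal sketch using only the Rust standard library. It is an assumption-laden illustration, not the article's implementation and not the API of any crate: `PrefixAutomaton`, `State`, and their methods are hypothetical names. It shares only common key prefixes (a trie); the FSTs the article describes compress further, which this sketch does not attempt.

```rust
use std::collections::BTreeMap;

/// One state of the automaton; transitions are labeled with bytes.
/// (Hypothetical type for illustration only.)
#[derive(Default)]
struct State {
    transitions: BTreeMap<u8, usize>,
    is_final: bool,
}

/// A deterministic acyclic automaton that shares key prefixes (a trie).
struct PrefixAutomaton {
    /// `states[0]` is the start state.
    states: Vec<State>,
}

impl PrefixAutomaton {
    fn new() -> Self {
        PrefixAutomaton { states: vec![State::default()] }
    }

    /// Adds a key, reusing existing states for any prefix already stored.
    fn insert(&mut self, key: &str) {
        let mut current = 0;
        for byte in key.bytes() {
            let existing = self.states[current].transitions.get(&byte).copied();
            current = match existing {
                Some(next) => next,
                None => {
                    // No transition for this byte yet: allocate a new state.
                    let next = self.states.len();
                    self.states.push(State::default());
                    self.states[current].transitions.insert(byte, next);
                    next
                }
            };
        }
        self.states[current].is_final = true;
    }

    /// Follows one transition per byte; the key is present only if the
    /// walk ends in a final state.
    fn contains(&self, key: &str) -> bool {
        let mut current = 0;
        for byte in key.bytes() {
            match self.states[current].transitions.get(&byte) {
                Some(&next) => current = next,
                None => return false,
            }
        }
        self.states[current].is_final
    }

    /// Number of states, a rough proxy for the space used.
    fn state_count(&self) -> usize {
        self.states.len()
    }
}
```

Inserting "jul", "jun", and "may" produces 8 states rather than the 10 that three unshared chains plus a start state would need, because "jul" and "jun" share the states for "ju". Lookup cost depends on the length of the key, not on how many keys are stored.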
