The Art of Managing Skunks
Since moving from academic research to industry in 2017, I’ve worked on two software projects. Each one started as a small, clean-slate[1] skunkworks effort involving 2-3 people and gradually expanded to a large, conventional software engineering effort with dozens of engineers. The first of these (from 2017 to 2021) was Delos at Meta, a Chubby/ZooKeeper/etcd-like control plane storage system. The second was a new Kafka engine (from 2022 to 2024) that can run on any disaggregated storage layer (and powers the Confluent Freight product, where S3 is used as that storage layer). Nearly every system at Meta depends in some way on Delos as of 2025 (e.g., this article describes an example dependency chain); Confluent Freight just became generally available and time will tell if it succeeds commercially, though early results are promising.
[1] This prior post might explain why I think clean-slate innovation is critical in systems.
While these systems were technically difficult to build and operate (particularly given their critical roles in the stacks of the respective companies), I found that much of the challenge lay in the management of these projects.
D. Formal communication (exposed outside the team) has to be extremely precise, high-quality, and reviewed: To paraphrase Jeff Bezos, we want “crisp documents and messy meetings.”
I hope these rules help managers and engineers find common ground – good luck starting your own clean-slate skunkworks projects!