Tiered storage won't fix Kafka
Tiered storage is a hot topic in the world of data streaming systems, and for good reason. Cloud disks are (really) expensive, object storage is cheap, and in most cases, live consumers are just reading the most recently written data.
In theory, tiered storage should also ease the operational burden by reducing the amount of data stored on each broker’s local disk, which (among other benefits) should make scaling the cluster in and out faster.
Furthermore, the issues that I have described so far tend to lie dormant at first, and then rear their heads at the worst possible time: when something has gone wrong and historical data needs to be replayed without sacrificing the reliability of the live workload. So maybe tiered storage doesn’t save you as much money as you hoped, and hey, the performance is unpredictable, but you can provision EBS volumes with dedicated IOPS, run some very careful gamedays, and test that everything will work as you expect.