Das Problem mit German Strings
And why I don't want my database to choose the best encoding for me (yet)
A batch size defines how large a chunk of data passes through the database operators at a time, so in theory it also limits the memory used by queries that do not accumulate values. At the storage layer, for example, Vortex already represents data schemas using only logical types and chooses the best encoding for each column chunk dynamically. On the execution side, the DataFusion community appears to be starting to discuss this idea, which opens the door to one day choosing the best physical encoding dynamically at plan time, based on the query and the storage format.
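To make the idea of choosing an encoding per column chunk concrete, here is a minimal Python sketch. It is not Vortex's or DataFusion's actual algorithm or API: the function names (`estimate_sizes`, `choose_encoding`, `split_into_chunks`), the three candidate encodings, and the 4-byte length, code, and run-count overheads are all assumptions made for illustration. The schema only says "string", and a toy cost model picks the physical encoding for each chunk.

```python
from itertools import groupby


def estimate_sizes(chunk: list[str]) -> dict[str, int]:
    """Estimate the encoded size in bytes of a chunk under each candidate encoding.

    Toy cost model (assumption): every stored string costs its UTF-8 bytes
    plus a 4-byte length; dictionary codes and run counts cost 4 bytes each.
    """
    def stored(value: str) -> int:
        return len(value.encode("utf-8")) + 4

    # Plain: every value is stored in full.
    plain = sum(stored(v) for v in chunk)
    # Dictionary: each distinct value once, plus one 4-byte code per row.
    dictionary = sum(stored(v) for v in set(chunk)) + 4 * len(chunk)
    # Run-length: one stored value plus a 4-byte count per run of equal neighbours.
    run_length = sum(stored(value) + 4 for value, _ in groupby(chunk))
    return {"plain": plain, "dictionary": dictionary, "run_length": run_length}


def choose_encoding(chunk: list[str]) -> str:
    """Return the name of the cheapest encoding for this chunk (ties go to the first listed)."""
    sizes = estimate_sizes(chunk)
    return min(sizes, key=sizes.get)


def split_into_chunks(values: list[str], chunk_size: int):
    """Yield consecutive slices of at most chunk_size values."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


if __name__ == "__main__":
    # One logical string column whose three chunks have very different shapes.
    column = (
        ["DE"] * 100                        # one long run
        + ["apple", "banana"] * 50          # few distinct values, no runs
        + [str(i) for i in range(100)]      # all values distinct
    )
    for index, chunk in enumerate(split_into_chunks(column, 100)):
        print(index, choose_encoding(chunk), estimate_sizes(chunk))
```

The point of the sketch is that the decision is made per chunk from the data itself, so the same logical column can end up with a different physical encoding in each chunk. A real system would use a richer set of encodings and statistics than this cost model does.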