Memory Efficient Data Streaming to Parquet Files


How to efficiently stream data into Parquet files.

While 1 GB of RAM isn’t a big deal for the average engineer’s laptop, it is considerable overhead for a connector that would otherwise only need to hold a single row at a time. The core of this approach is “transposing” incoming streaming data from a row-oriented to a column-oriented structure using an intermediate scratch file that is stored on disk rather than in memory. This strategy requires interacting with Parquet files at a relatively low level, with direct access to reading and writing column chunks and row groups.
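The transposition idea can be illustrated with a minimal sketch that uses only the Python standard library. It does not write real Parquet and is not the original connector's implementation. The two-column schema (one int64, one float64) and the names `spill_rows`, `read_column`, and `transpose` are assumptions made up for illustration. Rows are appended to an on-disk scratch file one at a time, then each column is read back by striding through that file.

```python
import struct
import tempfile

# Hypothetical fixed-width schema: one int64 column and one float64 column.
COLUMN_FORMATS = ["<q", "<d"]
ROW_FORMAT = "<qd"  # little-endian, no padding, 16 bytes per row
ROW_SIZE = struct.calcsize(ROW_FORMAT)


def spill_rows(rows, scratch):
    """Append each incoming row to the scratch file; only one row is in memory."""
    count = 0
    for row in rows:
        scratch.write(struct.pack(ROW_FORMAT, *row))
        count += 1
    scratch.flush()
    return count


def read_column(scratch, column_index, row_count):
    """Yield one column's values by striding through the row-oriented scratch file."""
    offset = sum(struct.calcsize(f) for f in COLUMN_FORMATS[:column_index])
    fmt = COLUMN_FORMATS[column_index]
    width = struct.calcsize(fmt)
    for i in range(row_count):
        scratch.seek(i * ROW_SIZE + offset)
        (value,) = struct.unpack(fmt, scratch.read(width))
        yield value


def transpose(rows):
    """Spill rows to disk, then return the data column by column."""
    with tempfile.TemporaryFile() as scratch:
        n = spill_rows(rows, scratch)
        # Lists are built here only for demonstration; a real writer would
        # stream each column's values into a column chunk instead.
        return [list(read_column(scratch, c, n)) for c in range(len(COLUMN_FORMATS))]


columns = transpose(iter([(1, 0.5), (2, 1.5), (3, 2.5)]))
```

In this sketch the scratch file plays the role of the buffer for one row group: once enough rows have been spilled, each column would be read back in turn and written out as a column chunk, so memory use stays close to a single row rather than a full row group.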
