Syncing Postgres Partitions to Parquet in S3 in Crunchy Bridge for Analytics


Marco shows how you can combine pg_partman and pg_cron on Bridge for Analytics to set up automated time-partitioning with long-term retention and fast analytics in your data lake.
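
To give a concrete picture of that setup, here is a minimal sketch in SQL. The events table, retention period, and job schedule are illustrative, and the create_parent call uses pg_partman 5.x-style named parameters; adjust for your pg_partman version.

    -- Daily range-partitioned heap table (columns are illustrative)
    CREATE TABLE public.events (
        user_id    bigint NOT NULL,
        created_at timestamptz NOT NULL DEFAULT now(),
        payload    jsonb
    ) PARTITION BY RANGE (created_at);

    -- Register it with pg_partman for daily partitions
    SELECT partman.create_parent(
        p_parent_table => 'public.events',
        p_control      => 'created_at',
        p_interval     => '1 day'
    );

    -- Keep about a week of heap partitions; older ones are dropped by maintenance
    UPDATE partman.part_config
    SET retention = '7 days',
        retention_keep_table = false
    WHERE parent_table = 'public.events';

    -- Have pg_cron run partition maintenance every hour
    SELECT cron.schedule('partman-maintenance', '0 * * * *',
                         $$CALL partman.run_maintenance_proc()$$);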

We keep up to 7 days of data in the heap partitions to power a dashboard of recent user activity and to help with debugging. For instance, if we later receive inserts for a few days ago or need to perform an update, we can simply modify the heap partition and re-copy the data. In practice, you might also want to normalize, filter, or scrub your data when copying it into the historical table, which reduces the amount of data stored and scanned and can give you some additional speedups.
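
The copy step itself might look roughly like the sketch below, assuming a Parquet-backed foreign table over an s3:// path and COPY to Parquet files in S3; the crunchy_lake_analytics server name, bucket, columns, and the user_id filter (standing in for a scrubbing step) are assumptions, not the post's exact code.

    -- Historical table backed by every Parquet file under the prefix
    CREATE FOREIGN TABLE events_historical (
        user_id    bigint,
        created_at timestamptz,
        payload    jsonb
    )
    SERVER crunchy_lake_analytics
    OPTIONS (path 's3://my-bucket/events/*.parquet');

    -- Export (or re-export) one day of data to that day's Parquet file.
    -- Partition pruning means only that day's heap partition is scanned.
    COPY (
        SELECT user_id, created_at, payload
        FROM public.events
        WHERE created_at >= date '2024-05-01'
          AND created_at <  date '2024-05-02'
          AND user_id > 0            -- example filter/scrub on the way out
    ) TO 's3://my-bucket/events/2024-05-01.parquet' WITH (format 'parquet');

Because each day maps to its own file in this layout, re-running the COPY after a late insert or an update simply rewrites that file, which is what makes the "modify the heap partition and re-copy" workflow cheap.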

Read more on:

analytics

Parquet

crunchy bridge

Related news:

The Birth of Parquet

Crunchy Bridge for Analytics: Your Data Lake in PostgreSQL

Parquet-WASM: Rust-based WebAssembly bindings to read and write Parquet data