Building and scaling Notion's data lake


How Notion built and grew our data lake to keep up with rapid growth

The overhead of monitoring and managing 480 Fivetran connectors, and of re-syncing them during Postgres re-sharding, upgrade, and maintenance periods, became extremely high and created a significant on-call burden for team members. Thanks to the scalability of Spark and Hudi, these three steps usually complete within 24 hours, so we can re-bootstrap in a manageable amount of time to accommodate new table requests and Postgres upgrade and re-sharding operations. Most importantly, the changeover unlocked massive savings in data storage and compute, along with fresher data, for a variety of analytics and product needs, enabling the successful rollout of Notion AI features in 2023 and 2024.
