
Stream Kafka Topics to Iceberg Tables with Zero-ETL


A solution from AutoMQ: open source, with no ETL pipeline to maintain

LinkedIn generated vast amounts of log data, from user activity events (such as logins, page views, and clicks) to operational metrics (such as service call latency, errors, and system resource utilization).

This is possible because table formats like Iceberg natively support schema evolution, such as adding new columns, dropping existing ones, or changing data types, without rewriting the entire dataset or disrupting downstream applications.

The Workers handle the write path: they convert Kafka records into Parquet data files, upload them to S3 object storage, and commit the metadata to the Iceberg catalog.
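
The write path and the schema evolution described above can be sketched as a toy model. This is only an illustration under loud assumptions, not AutoMQ's implementation: `ToyCatalog`, `ToyWorker`, and `read_table` are hypothetical names, a plain dict stands in for S3, JSON stands in for Parquet, and the "catalog" is an in-memory object rather than a real Iceberg catalog. It shows the three steps (convert, upload, commit) and how adding a column can be a metadata-only change that leaves older files untouched.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for an Iceberg catalog (not the real API)."""
    schema: list = field(default_factory=list)     # ordered column names
    snapshots: list = field(default_factory=list)  # each snapshot lists file paths

    def evolve_schema(self, columns):
        # Adding a column only changes metadata; existing files are not rewritten.
        for name in columns:
            if name not in self.schema:
                self.schema.append(name)

    def commit(self, path):
        # A commit creates a new snapshot: the previous files plus the new one.
        previous = self.snapshots[-1] if self.snapshots else []
        self.snapshots.append(previous + [path])

    def current_files(self):
        return self.snapshots[-1] if self.snapshots else []


class ToyWorker:
    """Hypothetical worker: convert records, upload a file, commit metadata."""

    def __init__(self, object_store, catalog):
        self.object_store = object_store  # dict of path -> bytes, stands in for S3
        self.catalog = catalog

    def write_batch(self, records):
        # 1. Convert: decode record values into rows (JSON here, Parquet in the article).
        rows = [json.loads(value) for value in records]
        columns = sorted({key for row in rows for key in row})
        self.catalog.evolve_schema(columns)
        # 2. Upload: store the data file in the object store.
        path = f"data/file-{len(self.object_store):05d}.json"
        self.object_store[path] = json.dumps(rows).encode("utf-8")
        # 3. Commit: make the file visible to readers through the catalog.
        self.catalog.commit(path)
        return path


def read_table(object_store, catalog):
    """Read committed files; columns missing from older files come back as None."""
    result = []
    for path in catalog.current_files():
        for row in json.loads(object_store[path]):
            result.append({name: row.get(name) for name in catalog.schema})
    return result


if __name__ == "__main__":
    store, catalog = {}, ToyCatalog()
    worker = ToyWorker(store, catalog)
    worker.write_batch([b'{"user": "a", "event": "login"}'])
    # The second batch introduces a new "page" column.
    worker.write_batch([b'{"user": "b", "event": "click", "page": "/home"}'])
    for row in read_table(store, catalog):
        print(row)
```

In this sketch a file becomes readable only after the commit step, which loosely mirrors why the metadata commit is the last action in the write path. The row written before the `page` column existed is read back with `page` set to `None`, without rewriting the older file.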
