Data Version Control
Open-source version control system for Data Science and Machine Learning projects. Git-like experience to organize your data, models, and experiments.
What's new
Extract and parse text from documents and create vector embeddings in a scalable, distributed way, in fewer than 70 lines of code.
Explore and enrich annotated datasets with custom embeddings, auto-labeling, and bias removal at billion-file scale, without modifying your data.
Connect to versioned data sources and code with pipelines, track experiments, and register models, all based on GitOps principles.