YaFSDP: a sharded data parallelism framework, faster for pre-training LLMs


YaFSDP (Yet another Fully Sharded Data Parallel) is developed in the yandex/YaFSDP repository on GitHub.

Read more on:

LLM

GPU

sources YaFSDP

Related news:

Arm warns of actively exploited flaw in Mali GPU kernel drivers

Nvidia execs cash out shares as GPU giant skyrockets

Show HN: We've open-sourced our LLM attention visualization library