Get the latest tech news
YaFSDP: a sharded data parallelism framework, faster for pre-training LLMs
YaFSDP: Yet another Fully Sharded Data Parallel. Contribute to yandex/YaFSDP development by creating an account on GitHub.
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Or read this on r/technology