Scalable and Performant Data Loading
An alternative is to fork the main process when creating a subprocess, but due to subtleties in how libraries are initialized, this often causes a segmentation fault, so spawning is the only safe option. Conventional data loading solutions use subprocesses, despite many side effects that hinder scaling their throughput, because of Python's Global Interpreter Lock (GIL). SPDL is a framework-agnostic data loading solution that uses multi-threading and achieves high throughput in a regular Python interpreter (built without the free-threading option enabled).
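To illustrate the general idea of thread-based loading, here is a minimal sketch using only Python's standard library. It is not SPDL's API: `load_sample` and `load_batch` are hypothetical names, and the `time.sleep` call stands in for I/O-bound work (file reads, network fetches) that releases the GIL, so several threads can make progress at once without subprocesses.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_sample(index: int) -> int:
    """Hypothetical loader: simulates I/O-bound work that releases the GIL."""
    time.sleep(0.05)  # sleeping releases the GIL, much like file or network I/O
    return index * index


def load_batch(indices, num_threads: int = 8):
    """Load samples concurrently in threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(load_sample, indices))


if __name__ == "__main__":
    start = time.perf_counter()
    batch = load_batch(range(16))
    elapsed = time.perf_counter() - start
    print(batch[:4], f"{elapsed:.2f}s")
```

Because the threads share one process, no data has to be pickled and copied between workers, which is one of the subprocess side effects the excerpt alludes to. The speedup depends on the work actually releasing the GIL; pure-Python CPU-bound code would not benefit in the same way.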