Intel's New LLM-Scaler Beta Update Brings Whisper Model & GLM-4.5-Air Support
Earlier this month Intel released LLM-Scaler 1.0 as part of its Project Battlematrix initiative. LLM-Scaler is a Docker container-based effort to deliver fast AI inference with multi-GPU scaling, PCIe P2P support, and more. Besides adding support for the Whisper and GLM-4.5-Air models, yesterday's beta also optimizes vLLM memory usage and enables the Ray back-end for pipeline parallelism.
Source: Phoronix