Pipeshift cuts GPU usage for AI inference by 75% with modular inference engine
Pipeshift offers a Lego-like system that lets teams configure the right inference stack for their AI workloads without extensive engineering effort.
"In most cases, in-house teams can take years to develop pipelines that allow for this kind of flexibility and modularization of infrastructure, leaving enterprises behind in the market while they accumulate massive tech debt," Chattopadhyay said. "This unlocks a massive reduction in scaling costs, as the GPUs can now handle workloads an order of magnitude greater, 20-30 times what they were originally able to achieve using the native platforms offered by the cloud providers. Plus, there was no auto-scaling support with tools like AWS SageMaker, which made it hard to ensure optimal use of infrastructure, pushing the company to pre-approve quotas and reserve capacity beforehand for theoretical scale that only hit 5% of the time," Chattopadhyay noted.
Or read this on VentureBeat