AMD Instinct Accelerators With So Much vRAM Have Exposed Linux Hibernation Issues
Too much vRAM and too many Instinct accelerators per server are causing system hibernation to fail on some high-end AMD AI servers running Linux.
A new patch series posted today aims to fix this hibernation failure in the Linux kernel. AMD engineer Samuel Zhang explained that Linux servers with too much vRAM can currently fail to hibernate because the hibernation process tries to evict that memory to GTT or shared memory. Granted, most high-end accelerator-powered AI servers are in use constantly, but for those who want to hibernate them during downtime to reduce power consumption, this is apparently a real problem.
Read the full article on Phoronix