Show HN: PILF, a framework for mitigating catastrophic forgetting in AI models
PILF: An IPWT-inspired continual learning framework designed to mitigate catastrophic forgetting and improve efficiency using a Surprise-gated Mixture-of-Experts (MoE) model. - dmf-archive/PILF
Dynamic Capacity: In a Mixture-of-Experts (MoE) architecture, Surprise not only adjusts the learning rate but also determines the number of experts k to activate (a rough sketch of this gating idea follows below).

The test suite is centered on a lightweight (~1M-parameter) Vision Transformer to enable rapid experimentation with these cognitive learning principles.

Advantage: No model-architecture changes are required; the dynamic policy can be dropped into existing training workflows to quickly validate its effectiveness.
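To make the gating idea concrete, here is a minimal PyTorch sketch, not PILF's actual implementation: the class name `SurpriseGatedMoE`, the helper `loss_to_surprise`, and the `k_min`/`k_max`/`scale` parameters are all hypothetical. It assumes a scalar surprise signal in [0, 1], derived here from the batch loss, which sets how many experts are routed per input.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SurpriseGatedMoE(nn.Module):
    """Toy Surprise-gated MoE layer: higher surprise activates more experts."""

    def __init__(self, dim: int, num_experts: int = 8, k_min: int = 1, k_max: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)
        self.k_min, self.k_max = k_min, k_max

    def forward(self, x: torch.Tensor, surprise: float) -> torch.Tensor:
        # Map normalized surprise in [0, 1] to an expert count k in [k_min, k_max].
        k = self.k_min + round(surprise * (self.k_max - self.k_min))
        scores = self.gate(x)                  # (batch, num_experts)
        weights, idx = scores.topk(k, dim=-1)  # route each input to its top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

def loss_to_surprise(loss: torch.Tensor, scale: float = 5.0) -> float:
    # Hypothetical proxy: squash the batch loss (standing in for surprise,
    # e.g. a negative log-likelihood) into [0, 1). `scale` is an assumed knob.
    return torch.tanh(loss.detach() / scale).item()
```

The same scalar can also modulate the learning rate, which is the other half of the policy described above; again, this is a sketch under the assumption that low surprise should mean small, cautious updates:

```python
model = SurpriseGatedMoE(dim=64)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
base_lr = 1e-3

x, target = torch.randn(32, 64), torch.randn(32, 64)
loss = F.mse_loss(model(x, surprise=0.5), target)
s = loss_to_surprise(loss)
# Familiar (low-surprise) data -> scaled-down learning rate, fewer experts.
for group in opt.param_groups:
    group["lr"] = base_lr * s
loss.backward()
opt.step()
```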