What kind of bug would make machine learning suddenly 40% worse at NetHack?
One day, a roguelike-playing system just kept biffing it, for celestial reasons.
Using Tuyls' model of expert NetHack behavior, Bartłomiej Cupiał and Maciej Wołczyk trained a neural network to play and improve itself using reinforcement learning. When the system's performance suddenly dropped, Cupiał and Wołczyk tried quite a few things: reverting their code, restoring their entire software stack from a Singularity backup, and rolling back their CUDA libraries. I submit to you that, although NetHack responded to the full moon in its intended way, this quirky, very hard-to-fathom stop on a machine-learning journey was indeed a bug, and a worthy one in the pantheon.