
Reproducing the deep double descent paper


This summer, I've been intensively trying to catch up with the current state of the machine learning world. I don't have any prior background in ML, so...

The paper discusses that introducing label noise (purposefully replacing a fraction of the training labels with incorrect ones) can act as a proxy that makes the double descent effect easier to see.

I figured I could monkeypatch the model (model.fc = t.nn.Linear(...)), but at this point I also realized that the paper uses an older ResNet variant in which the order of the convolution, activation and batch norm differs from the way PyTorch implements it. Sketches of what I mean by the label noise, the monkeypatch and the block ordering follow below.

My guess here is that adding too much noise makes the model unable to learn well enough, but maybe we'd see it converge lower if we gave it more epochs to run with?
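For concreteness, the label-noise setup as I understand it is: before training, a fixed fraction of the training labels is replaced with random wrong classes, and those corrupted labels stay fixed for the whole run. A minimal sketch of doing that on CIFAR-10 (the dataset choice, the `corrupt_labels` helper and the 15% noise fraction are just my own illustration, not something lifted from the paper):

```python
import torch as t
from torchvision import datasets, transforms

def corrupt_labels(dataset, noise_fraction: float, num_classes: int = 10, seed: int = 0):
    """Replace a fixed fraction of labels with random *incorrect* classes (sketch)."""
    g = t.Generator().manual_seed(seed)
    targets = t.tensor(dataset.targets)
    n = len(targets)
    # choose which examples get a corrupted label
    noisy_idx = t.randperm(n, generator=g)[: int(noise_fraction * n)]
    # shift each chosen label by a random offset in [1, num_classes-1],
    # which guarantees the new label differs from the original
    offsets = t.randint(1, num_classes, (len(noisy_idx),), generator=g)
    targets[noisy_idx] = (targets[noisy_idx] + offsets) % num_classes
    dataset.targets = targets.tolist()
    return dataset

train_set = datasets.CIFAR10(root="data", train=True, download=True,
                             transform=transforms.ToTensor())
train_set = corrupt_labels(train_set, noise_fraction=0.15)
```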
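The monkeypatching itself is just swapping the classification head on a stock torchvision ResNet so it outputs the right number of classes; torchvision's ResNet-18 here stands in for whatever model is actually used:

```python
import torch as t
from torchvision.models import resnet18

model = resnet18()  # stock torchvision model, 1000-way output head
# monkeypatch the final fully connected layer to match CIFAR-10's 10 classes
model.fc = t.nn.Linear(model.fc.in_features, 10)
```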
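The ordering difference is easier to see side by side. Inside a residual block, torchvision applies conv → batch norm → ReLU, while other implementations reorder these layers, e.g. the "pre-activation" variant applies batch norm → ReLU → conv. A rough sketch of the two orderings (simplified: skip connections, strides and the second conv of a real block are omitted):

```python
import torch as t
import torch.nn.functional as F

class PostActBlock(t.nn.Module):
    """conv -> batch norm -> ReLU: the ordering torchvision's ResNet uses."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = t.nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn = t.nn.BatchNorm2d(channels)

    def forward(self, x):
        return F.relu(self.bn(self.conv(x)))

class PreActBlock(t.nn.Module):
    """batch norm -> ReLU -> conv: one of the reordered variants."""
    def __init__(self, channels: int):
        super().__init__()
        self.bn = t.nn.BatchNorm2d(channels)
        self.conv = t.nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        return self.conv(F.relu(self.bn(x)))
```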
