AI models fed AI-generated data quickly spew nonsense


Researchers gave successive versions of a large language model information produced by previous generations of the AI — and observed rapid collapse.

“The message is we have to be very careful about what ends up in our training data,” says co-author Zakhar Shumaylov, an AI researcher at the University of Cambridge, UK. By its ninth iteration, the model completed a Wikipedia-style article about English church towers with a treatise on the many colours of jackrabbit tails (see ‘AI gibberish’). More subtly, the study, published in Nature on 24 July, showed that even before complete collapse, learning from AI-derived text caused models to forget the information mentioned least frequently in their data sets, and their outputs became more homogeneous.
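The tail-forgetting effect can be illustrated with a toy simulation, in which the “model” is just a categorical distribution over tokens and each generation is re-fitted to a finite sample drawn from the previous one. This is a minimal sketch of the general phenomenon, not the paper's method; the vocabulary, probabilities, and sample sizes are invented for illustration.

```python
# Toy simulation of model collapse: repeatedly re-fitting a "model" to its
# own samples. Rare tokens tend to vanish and the distribution grows more
# homogeneous, mirroring the tail-forgetting described in the article.
# All names and numbers are illustrative, not taken from the study.
import random
from collections import Counter

def train_on_samples(samples, vocab):
    """Re-estimate token probabilities from empirical frequencies."""
    counts = Counter(samples)
    total = len(samples)
    return {tok: counts[tok] / total for tok in vocab}

def generate(probs, n, rng):
    """Sample n tokens from the current 'model'."""
    toks = list(probs)
    weights = [probs[t] for t in toks]
    return rng.choices(toks, weights=weights, k=n)

rng = random.Random(0)
vocab = ["the", "church", "tower", "jackrabbit"]
# A long-tailed starting distribution: "jackrabbit" is rare.
probs = {"the": 0.6, "church": 0.25, "tower": 0.14, "jackrabbit": 0.01}

for generation in range(20):
    samples = generate(probs, n=50, rng=rng)
    probs = train_on_samples(samples, vocab)

surviving = [tok for tok, p in probs.items() if p > 0]
print(probs)      # probabilities concentrate on the common tokens
print(surviving)  # rare tokens typically drop out after a few generations
```

Once a token's estimated probability hits zero it can never be sampled again, so the loss of rare information is permanent: each generation can only narrow, never broaden, what the previous one produced.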

