
Less supervision, better results: Study shows AI models generalize more effectively on their own


Training LLMs and VLMs through reinforcement learning yields better generalization than supervised fine-tuning on hand-crafted examples.

Once a model is pre-trained on raw text and image data, companies and AI labs usually post-train it on a large dataset of hand-crafted examples in question/answer or request/response format, a process known as supervised fine-tuning (SFT). A key problem for machine learning (ML) systems is overfitting, where the model performs well on its training data but fails to generalize to unseen examples. Among the study's benchmark tasks is V-IRL, which tests a model's spatial reasoning capabilities in an open-world navigation domain with realistic visual input.
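The overfitting failure mode described above can be seen in miniature with a classic curve-fitting sketch (illustrative toy data, not from the study): a high-capacity model memorizes its training set, while a simpler model that matches the underlying structure generalizes better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy data (not from the study): a noisy linear relationship.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(scale=0.1, size=x_train.size)
x_test = np.linspace(0.0, 1.0, 100)
y_test = 2.0 * x_test  # noise-free ground truth for evaluation

def mse(coeffs, x, y):
    """Mean squared error of a polynomial fit against targets."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# A degree-9 polynomial has enough capacity to pass through every
# training point, memorizing the noise along with the signal.
overfit = np.polyfit(x_train, y_train, deg=9)

# A degree-1 fit matches the true underlying structure instead.
simple = np.polyfit(x_train, y_train, deg=1)

print("train MSE:", mse(overfit, x_train, y_train), mse(simple, x_train, y_train))
print("test MSE: ", mse(overfit, x_test, y_test), mse(simple, x_test, y_test))
# The high-capacity fit wins on training data but loses on unseen points.
```

The same tension, at much larger scale, is what the study probes when comparing supervised fine-tuning against reinforcement learning for post-training.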


Read the full article on VentureBeat.

