Less supervision, better results: Study shows AI models generalize more effectively on their own
Training LLMs and VLMs through reinforcement learning delivers better results than using hand-crafted examples.
Once a model is pre-trained on raw text and image data, companies and AI labs usually post-train it on a large dataset of hand-crafted examples in question/answer or request/response format, a process known as supervised fine-tuning (SFT). One of the key problems of machine learning (ML) systems is overfitting, where the model performs well on its training data but fails to generalize to unseen examples. Among the tasks used to evaluate the models is V-IRL, which tests a model's spatial reasoning capabilities in an open-world navigation domain that uses realistic visual input.
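To make the contrast between the two post-training signals concrete, here is a minimal, hypothetical sketch in plain Python (toy numbers, not the study's actual setup): supervised fine-tuning (SFT) pushes up the probability of a fixed hand-crafted answer, while a REINFORCE-style reinforcement learning update weights the gradient signal by a scalar reward on the model's own sampled output.

```python
import math

# Toy "model": a probability distribution over three candidate answers.
probs = {"A": 0.5, "B": 0.3, "C": 0.2}

# Supervised fine-tuning (SFT): the loss is the negative log-likelihood
# of a fixed, hand-crafted reference answer (here assumed to be "B").
reference = "B"
sft_loss = -math.log(probs[reference])

# Reinforcement learning (REINFORCE-style): the model samples its own
# answer, a reward function scores it, and the loss is the reward-weighted
# negative log-likelihood of the *sampled* answer.
sampled = "A"                              # pretend the model sampled "A"
reward = 1.0 if sampled == reference else -1.0  # toy verifier/reward model
rl_loss = -reward * math.log(probs[sampled])

print(round(sft_loss, 3))
print(round(rl_loss, 3))
```

The design point the article hinges on: SFT always imitates the fixed reference, while the RL loss only ever touches outputs the model itself produces, rewarding or penalizing them, which is one intuition for why RL-trained models can generalize rather than memorize.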
Read this article on VentureBeat.