Pixtral 12B


Pixtral 12B is the first multimodal Mistral model, released under the Apache 2.0 license.

Pixtral is trained to understand both natural images and documents. It achieves 52.5% on the MMMU reasoning benchmark, surpassing a number of larger models. The model shows strong abilities in chart and figure understanding, document question answering, multimodal reasoning, and instruction following. As a result, Pixtral can accurately interpret complex diagrams, charts, and documents at high resolution, while keeping inference fast on small images such as icons, clipart, and equations.
