Trying out QvQ – Qwen's new visual reasoning model
I thought we were done for major model releases in 2024, but apparently not: Alibaba’s Qwen team just dropped the Apache 2.0 licensed QvQ-72B-Preview, “an experimental research model focusing on …
You can try it out on Hugging Face Spaces. It accepts an image and a single prompt, then streams out a very long response in which it thinks through the problem you have posed.
Finally, I asked it to “Estimate the height of the dinosaur” against an image that, as it correctly noted, actually shows an inflatable dragon.
As a happy user of Ollama’s qwq port, I’m hoping they add a QvQ release at some point soon as well.