FastVLM: Efficient vision encoding for vision language models


The apple/ml-fastvlm repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" (CVPR 2025).

Our larger variants, which use the Qwen2-7B LLM, outperform recent works such as Cambrian-1-8B while using a single image encoder and achieving a 7.9x faster time-to-first-token (TTFT). To download all the pretrained checkpoints, run the download command provided in the repository (this may take some time depending on your connection, so it might be a good moment to grab ☕️ while you wait). To run inference on Apple devices such as iPhone, iPad, or Mac, see the app subfolder for more details.
