How Fast Are On-Device LLMs on iPhone 17 Pro and iPad Pro?

Ricky Takkar
Published February 8, 2026

TL;DR: I ran 6 quantized LLMs on Russet, which uses Apple's first-party MLX framework, on an iPhone 17 Pro and an iPad Pro M5, both with 12GB RAM. LFM2.5 1.2B at 4-bit hits 124 tokens/sec on the iPad and 70 tokens/sec on the iPhone.
