Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M


Very significant new release from Alibaba's Qwen team. Their openly licensed (sometimes Apache 2, sometimes Qwen license, I've had trouble keeping up) Qwen 2.5 LLM previously had an input token limit of 128,000 tokens.

This new model increases that to 1 million, using a new technique called Dual Chunk Attention, first described in this paper from February 2024. Frameworks that already support Qwen2.5 can also run these models for inference, but accuracy degradation may occur for sequences exceeding 262,144 tokens. I'll update this post when I figure out how to run longer prompts through the new Qwen model using GGUF weights on a Mac.
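In the meantime, here's a minimal sketch of what running one of these models through an existing Qwen2.5-compatible framework (Hugging Face transformers) might look like. The repo id "Qwen/Qwen2.5-7B-Instruct-1M" and the report.txt filename are my assumptions for illustration, and a naive setup like this won't get you anywhere near the full 1M window; per the caveat above, expect accuracy degradation past 262,144 tokens without the team's own inference stack.

```python
# A minimal sketch (not the Qwen team's recommended long-context setup).
# Assumes the Hugging Face repo id "Qwen/Qwen2.5-7B-Instruct-1M" and a
# hypothetical input file "report.txt"; long prompts need serious hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct-1M"  # assumption: published repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # pick bf16/fp16 automatically where supported
    device_map="auto",    # spread layers across available devices
)

# Build a chat-formatted prompt around a long document.
long_document = open("report.txt").read()
messages = [
    {"role": "user", "content": f"Summarize this document:\n\n{long_document}"},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```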

