Qwen1.5-110B
Introduction
Recently we have witnessed a burst of large-scale models with over 100 billion parameters in the open-source community. These models have demonstrated remarkable performance in both benchmark evaluations and Chatbot Arena. Today, we release the first 100B+ model of the Qwen1.5 series, Qwen1.5-110B, which achieves performance comparable to Meta-Llama3-70B in the base model evaluation and outstanding performance in the chat evaluations, including MT-Bench and AlpacaEval 2.
Qwen1.5-110B uses grouped query attention (GQA), which can make model serving more efficient. The model supports a context length of 32K tokens and remains multilingual, supporting a large number of languages including English, Chinese, French, Spanish, German, Russian, Korean, Japanese, Vietnamese, and Arabic.
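To make the serving benefit of GQA concrete, here is a rough back-of-the-envelope sketch of how sharing key/value heads across groups of query heads shrinks the KV cache at a 32K context. The layer count, head counts, head dimension, and 16-bit cache precision below are illustrative assumptions chosen for the arithmetic, not figures stated in this post, and the helper names `kv_cache_bytes` and `kv_head_for_query_head` are hypothetical.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for a single sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Index of the shared KV head used by a given query head under GQA."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Illustrative configuration (assumed values, not taken from the post).
NUM_LAYERS = 80
NUM_QUERY_HEADS = 64
NUM_KV_HEADS = 8          # GQA: 8 query heads share each KV head
HEAD_DIM = 128
SEQ_LEN = 32 * 1024       # "32K tokens" read as 32,768 tokens
GIB = 1024 ** 3

# Standard multi-head attention keeps one KV head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# GQA keeps only the shared KV heads.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"MHA KV cache per sequence: {mha_bytes / GIB:.0f} GiB")  # 80 GiB
print(f"GQA KV cache per sequence: {gqa_bytes / GIB:.0f} GiB")  # 10 GiB
print("Query head 13 uses KV head",
      kv_head_for_query_head(13, NUM_QUERY_HEADS, NUM_KV_HEADS))  # 1
```

Under these assumed numbers the cache shrinks by the ratio of query heads to KV heads (8x here), which leaves more memory for longer contexts or larger batches during serving.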