Meta Open-Sources Megalodon LLM for Efficient Long Sequence Modeling


Researchers from Meta, University of Southern California, Carnegie Mellon University, and University of California San Diego recently open-sourced MEGALODON, a large language model (LLM) with an unlimited context length. MEGALODON has linear computational complexity and outperforms a similarly-sized Llama 2 model on a range of benchmarks.

MEGALODON builds on the research team's previous model, MEGA (exponential moving average with gated attention), with several new features. Another scheme that InfoQ recently covered is the RWKV Project's attention-free Transformer model, which likewise has no maximum input context length. MEGALODON outperformed all baseline models on the NarrativeQA subtask, and on all tasks achieved results "competitive" with Llama 2.
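MEGA's central building block is a damped exponential moving average: each token's representation is mixed with a running summary of all earlier tokens, so every step costs constant work and the whole sequence is processed in linear time. A minimal scalar sketch of that recurrence (the function name, parameter values, and scalar simplification are illustrative assumptions, not Meta's actual implementation, which uses learned multi-dimensional parameters):

```python
def damped_ema(xs, alpha=0.6, delta=0.5):
    """Illustrative damped EMA: h_t = alpha * x_t + (1 - alpha * delta) * h_{t-1}.

    Each step only touches the previous hidden state, so the cost is
    O(1) per token and O(n) for the whole sequence -- the property that
    gives MEGA/MEGALODON linear complexity (hypothetical simplification).
    """
    h = 0.0
    out = []
    for x_t in xs:
        # Blend the new input with a decayed copy of the running summary.
        h = alpha * x_t + (1.0 - alpha * delta) * h
        out.append(h)
    return out

# A unit impulse decays geometrically, showing how older tokens
# contribute progressively less to the running summary.
print(damped_ema([1.0, 0.0, 0.0, 0.0]))
```

In the actual MEGA architecture this moving average is computed per dimension with learned decay rates and is combined with a gated attention mechanism; the sketch above only conveys the constant-memory recurrence.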
