32k context length text embedding models


TL;DR – We are excited to announce voyage-3 and voyage-3-lite embedding models, advancing the frontier of retrieval quality, latency, and cost. voyage-3 outperforms OpenAI v3 large by 7.55% on aver…

voyage-3 outperforms OpenAI v3 large across all eight evaluated domains (tech, code, web, law, finance, multilingual, conversation, and long-context) by 7.55% on average. If you are particularly interested in code, law, finance, or multilingual retrieval, the Voyage 2 series domain-specific models (voyage-code-2, voyage-law-2, voyage-finance-2, and voyage-multilingual-2) are still the best choice for their respective domains, although voyage-3 is highly competitive there as well (see Section below).

Evaluation datasets, listed as category (description): datasets

TECH (technical documentation): Cohere, 5G, OneSignal, LangChain, PyTorch
CODE (code snippets, docstrings): LeetCodeCpp, LeetCodeJava, LeetCodePython, HumanEval, MBPP, DS1000-referenceonly, DS1000, apps_5doc
LAW (cases, court opinions, statutes, patents): LeCaRDv2, LegalQuAD, LegalSummarization, AILA casedocs, AILA statutes
FINANCE (SEC filings, finance QA): RAG benchmark (Apple-10K-2022), FinanceBench, TAT-QA, Finance Alpaca, FiQA Personal Finance, Stock News Sentiment, ConvFinQA, FinQA, HC3 Finance
WEB (reviews, forum posts, policy pages): Huffpostsports, Huffpostscience, Doordash, Health4CA
LONG-CONTEXT (long documents on assorted topics: government reports, academic papers, and dialogues): NarrativeQA, Needle, Passkey, QMSum, SummScreenFD, WikimQA
CONVERSATION (meeting transcripts, dialogues): Dialog Sum, QA Conv, HQA
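The model-selection advice above can be sketched as a small lookup. This is only an illustration: the model names come from the announcement, but the helper `pick_model`, the `DOMAIN_MODELS` table, and the domain labels used as keys are hypothetical and not part of any Voyage API.

```python
# Hypothetical helper (not part of any Voyage SDK): pick an embedding model
# name for a retrieval domain, following the announcement's recommendation.

# Domain-specific Voyage 2 series models named in the announcement.
DOMAIN_MODELS = {
    "code": "voyage-code-2",
    "law": "voyage-law-2",
    "finance": "voyage-finance-2",
    "multilingual": "voyage-multilingual-2",
}

# General-purpose model used for every other domain.
DEFAULT_MODEL = "voyage-3"


def pick_model(domain: str) -> str:
    """Return the specialised model for `domain`, or voyage-3 otherwise."""
    return DOMAIN_MODELS.get(domain.strip().lower(), DEFAULT_MODEL)


if __name__ == "__main__":
    for domain in ("code", "tech", "Law"):
        print(domain, "->", pick_model(domain))
```

The returned string would then be passed as the model name to whichever embedding client is in use; that call is omitted here because it needs network access and an API key.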
