
All of human cooking compressed into 2 megabytes


We present Epicure, a family of three sibling skip-gram ingredient embeddings retrained from scratch on a multilingual recipe corpus. We aggregate 4.14M recipes from 11 sources spanning seven languages (English, Chinese, Russian, Vietnamese, Spanish, Turkish, Indonesian, German, and Indian-English) and normalise the raw ingredient strings to 1,790 canonical entries via an LLM-augmented pipeline. A 203,508-edge ingredient-ingredient NPMI graph and an 80,019-edge typed FlavorDB ingredient-compound graph (2,247 typed compound nodes across 15 categories) seed three Metapath2Vec variants that share architecture and hyperparameters and differ only in the random-walk schema. Cooc walks the co-occurrence graph only; Chem walks the typed compound metapaths only; Core blends both by injecting ingredient-ingredient walks at a controlled mixing ratio. Each model thus sits at a distinct point on the chemistry-vs-recipe-context spectrum.
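To make the ingredient-ingredient NPMI graph concrete, here is a minimal Python sketch of how such edges could be computed. It is not the authors' pipeline: the function name `npmi_edges`, the recipe-level probability estimates, and the `min_npmi` threshold are all assumptions made for illustration. The abstract does not say how co-occurrence was counted or how edges were filtered.

```python
import math
from collections import Counter
from itertools import combinations


def npmi_edges(recipes, min_npmi=0.0):
    """Return {(a, b): npmi} for ingredient pairs scoring above min_npmi.

    Assumption: probabilities are estimated per recipe, i.e. p(a) is the
    fraction of recipes containing ingredient a. The threshold is hypothetical.
    """
    n = len(recipes)
    single = Counter()
    pair = Counter()
    for recipe in recipes:
        items = sorted(set(recipe))  # de-duplicate within a recipe
        single.update(items)
        pair.update(combinations(items, 2))  # keys are sorted (a, b) tuples

    edges = {}
    for (a, b), count_ab in pair.items():
        p_ab = count_ab / n
        if p_ab >= 1.0:
            continue  # NPMI is undefined when the pair is in every recipe
        pmi = math.log(p_ab / ((single[a] / n) * (single[b] / n)))
        score = pmi / -math.log(p_ab)  # normalise into [-1, 1]
        if score > min_npmi:
            edges[(a, b)] = score
    return edges
```

Calling `npmi_edges([["tomato", "basil"], ["rice", "soy sauce"]])` would return one edge per pair. Pairs that co-occur less often than chance get a negative score and are dropped by the default threshold.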
