Taming randomness in ML models with hypothesis testing and marimo


The behavior of ML models is often affected by randomness at several levels, from the initialization of model parameters to the split of the dataset into training and evaluation sets. As a result, the predictions a model makes (including the answers an LLM gives to your questions) can differ every time you run it.
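To see this run-to-run variability concretely, here is a minimal sketch (not from the original notebook) that simulates an evaluation whose accuracy depends on a random seed, standing in for effects like parameter initialization and train/test splitting:

```python
import random
import statistics

def noisy_accuracy(seed: int) -> float:
    """Hypothetical evaluation: each of 200 test examples is classified
    correctly with probability 0.8, but the realized accuracy depends
    on the random seed, just like a real training/evaluation run."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.8 for _ in range(200)) / 200

# Re-running the "same" evaluation with different seeds gives different scores
scores = [noisy_accuracy(seed) for seed in range(30)]
print(f"mean={statistics.mean(scores):.3f} stdev={statistics.stdev(scores):.3f}")
```

Reporting only a single run hides the spread that this simulation makes visible; averaging over seeds (and reporting the standard deviation) is the minimal remedy.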

While academic papers almost always account for the variability of model predictions in their evaluations, this is far less common in industry publications such as blog posts or company announcements. You can explore why this matters by running a marimo app in your browser, which introduces the basics of statistical testing through examples built on the simple concept of dice throwing. Many thanks to Akshay Agrawal and Myles Scolnick for building marimo and providing help and super early feedback, and to Vicki Boykis and Oleg Lavrovsky for testing the notebook and offering great suggestions on how to improve it.
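To give a flavor of the dice-based approach, here is an illustrative sketch (an assumption of mine, not the notebook's actual code) of a permutation test asking whether a suspiciously high-rolling die really differs from a fair one:

```python
import random
import statistics

rng = random.Random(0)

# Two "dice": a fair one, and a loaded one whose faces favor six
fair = [rng.randint(1, 6) for _ in range(100)]
loaded = [rng.choice([1, 2, 3, 4, 5, 6, 6, 6]) for _ in range(100)]

observed = statistics.mean(loaded) - statistics.mean(fair)

# Permutation test: under the null hypothesis that both samples come from
# the same die, the labels are exchangeable, so shuffling the pooled throws
# and recomputing the difference shows how extreme the observed gap is.
pooled = fair + loaded
n_perms = 5000
count = 0
for _ in range(n_perms):
    rng.shuffle(pooled)
    diff = statistics.mean(pooled[100:]) - statistics.mean(pooled[:100])
    if diff >= observed:
        count += 1

p_value = count / n_perms
print(f"observed difference: {observed:.2f}, p-value: {p_value:.4f}")
```

A small p-value means a mean gap this large rarely arises by chance alone, so we would reject the hypothesis that the two dice are identical; the same logic applies when comparing two ML models across random seeds.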
