Using reinforcement learning and $4.80 of GPU time to find the best HN post

Convert expensive LLM prompts into fast, cheap fine-tuned models

OpenPipe is a managed fine-tuning service that makes it easy to build your own LLMs that achieve very high accuracy on a specific task.

For certain applications, such as chatbots or recommendation systems, you can proactively offer the user several potential outputs to choose from and use the option they prefer as your feedback signal.

There are some diamonds in the rough there that probably should have sparked more discussion, such as dealing with Machiavellian co-founders, pushing through burnout, and recording a signal from a single electron.
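The preference-feedback idea mentioned above (show the user several candidate outputs and record which one they pick) could be sketched roughly as follows. This is a minimal illustration, not OpenPipe's API or the article's actual pipeline: the `PreferenceEvent` class and the `to_pairwise` and `win_rates` helpers are hypothetical names introduced here, and the data is made up for the example.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceEvent:
    """One interaction: the options shown and the index the user picked (hypothetical schema)."""
    prompt: str
    options: tuple[str, ...]
    chosen_index: int


def to_pairwise(event: PreferenceEvent) -> list[tuple[str, str, str]]:
    """Expand a single choice into (prompt, preferred, rejected) triples."""
    chosen = event.options[event.chosen_index]
    return [
        (event.prompt, chosen, other)
        for i, other in enumerate(event.options)
        if i != event.chosen_index
    ]


def win_rates(events: list[PreferenceEvent]) -> dict[str, float]:
    """Fraction of times each option was picked out of the times it was shown."""
    shown: dict[str, int] = defaultdict(int)
    won: dict[str, int] = defaultdict(int)
    for event in events:
        for i, option in enumerate(event.options):
            shown[option] += 1
            if i == event.chosen_index:
                won[option] += 1
    return {option: won[option] / shown[option] for option in shown}


if __name__ == "__main__":
    # Illustrative data only: the user was shown three titles and picked the second.
    event = PreferenceEvent(
        prompt="Suggest a title",
        options=("Title A", "Title B", "Title C"),
        chosen_index=1,
    )
    for triple in to_pairwise(event):
        print(triple)
    print(win_rates([event]))
```

Triples of this shape are the usual input for training a reward model or for preference-based fine-tuning, while the per-option win rate is a cheap sanity check on the collected signal.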
