
Don’t let an LLM make decisions or implement business logic: they suck at that.

Take chess as an example. Even modern chess engines like Stockfish that incorporate neural networks are purpose-built, specialized systems with well-defined inputs and evaluation functions, not general-purpose language models trying to maintain game state through text. And even those purpose-built networks are challenging for observability; a general LLM is a nightmare, despite Anthropic’s great strides in this area.

And the rest:

- Testing LLM outputs is much harder than unit-testing known code paths.
- LLMs are much worse at math than your CPU.
- LLMs are insufficiently good at picking random numbers.
- Version control and auditing become much harder.
- Monitoring and observability get painful.
- State management through natural language is fragile.
- You’re at the mercy of API rate limits and costs.
- Security boundaries become fuzzy when everything flows through prompts.

The chess example illustrates the fundamental problem with using LLMs for core application logic, but the principle extends far beyond games.
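To make the alternative concrete, here is a minimal sketch of the split this argues for: the LLM only translates natural language into structured data, and every decision is made by ordinary, deterministic code. The names here (`call_llm`, `RefundRequest`, `decide_refund`, `REFUND_LIMIT_CENTS`) are hypothetical placeholders, not from the original post; `call_llm` stands in for whatever completion API you actually use.

```python
from dataclasses import dataclass
import json


@dataclass
class RefundRequest:
    order_id: str
    amount_cents: int


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for your LLM API call; assumed to return
    JSON like {"order_id": "A123", "amount_cents": 4999}."""
    raise NotImplementedError


def parse_refund_request(user_message: str) -> RefundRequest:
    # The LLM's only job: language in, structured data out.
    raw = call_llm(f"Extract the refund request as JSON: {user_message}")
    data = json.loads(raw)
    return RefundRequest(str(data["order_id"]), int(data["amount_cents"]))


REFUND_LIMIT_CENTS = 10_000  # assumed business rule, for illustration only


def decide_refund(req: RefundRequest, order_total_cents: int) -> bool:
    # The decision itself is plain code: deterministic, auditable,
    # versionable, and cheap to run.
    return 0 < req.amount_cents <= min(order_total_cents, REFUND_LIMIT_CENTS)
```

Everything past `parse_refund_request` can be unit-tested like any known code path, which is exactly what the list above says you lose when the logic lives in a prompt:

```python
def test_refund_within_limit():
    assert decide_refund(RefundRequest("A123", 4_999), order_total_cents=20_000)


def test_refund_over_order_total_is_rejected():
    assert not decide_refund(RefundRequest("A123", 25_000), order_total_cents=20_000)
```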
