Show HN: Factorio Learning Environment – Agents Build Factories


Claude Sonnet 3.5 builds factories

Large Language Models (LLMs) are rapidly saturating existing benchmarks, necessitating new open-ended evaluations. We introduce the Factorio Learning Environment (FLE), based on the game of Factorio, that tests agents in long-term planning, program synthesis, and resource optimization.

This reveals a telling discrepancy: while Gemini-2 and DeepSeek demonstrate early-game automation capabilities in structured lab-play, they rarely attempt to create cohesive factories during open-ended exploration, resulting in poorer overall performance. Common failures included placing entities too close together, not allocating space for connections, and incorrect inserter placement, all of which severely impacted performance in complex tasks requiring coordination of multiple production lines. Our experiments revealed several key patterns that highlight both the capabilities and limitations of current AI agents when faced with open-ended industrial challenges:
