Get the latest tech news

DeepCoder delivers top coding performance in efficient 14B open model

DeepCoder-14B competes with frontier models like o3 and o1—and the weights, code, and optimization platform are open source.

The team also designed a straightforward reward function that only provides a positive signal if the generated code passes all sampled unit tests for the problem within a specific time limit. Combined with the high-quality training examples, this outcome-focused reward system prevents the model from learning tricks like printing memorized answers for public tests or optimizing for simple edge cases without solving the core problem. GRPO+ enables DeepCoder-14 to continue for longer durations without collapsing Credit: Together AI Finally, the team extended the model’s context window iteratively, first training it on shorter reasoning sequences and gradually increasing the length.

Get the Android app

Or read this on Venture Beat