Embarrassingly simple self-distillation improves code generation


Can a large language model (LLM) improve at code generation using only its own raw outputs, without a verifier, a teacher model, or reinforcement learning? We answer in the affirmative with simple self-distillation (SSD): sample solutions from the model under particular temperature and truncation settings, then fine-tune on those samples with standard supervised fine-tuning. SSD improves Qwen3-30B-Instruct from 42.4% to 55.3% pass@1 on LiveCodeBench v6, with gains concentrated on harder problems, and it generalizes across Qwen and Llama models at the 4B, 8B, and 30B scales, including both instruct and thinking variants. To understand why such a simple method works, we trace these gains to a precision-exploration conflict in LLM decoding and show that SSD reshapes token distributions in a context-dependent way, suppressing distractor tails where precision matters while preserving useful diversity where exploration matters. Taken together, SSD offers a complementary post-training direction for improving LLM code generation.
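The sampling stage the abstract describes, and the tail-suppression effect it attributes to SSD, can be illustrated with a toy next-token distribution. The sketch below implements temperature scaling followed by nucleus (top-p) truncation in plain NumPy; the specific temperature and top-p values are illustrative assumptions, not the paper's actual configuration, and the function name `truncate_and_sample` is hypothetical.

```python
import numpy as np

def truncate_and_sample(logits, temperature=0.7, top_p=0.9, rng=None):
    """Temperature-scale logits, keep the smallest token set covering
    top_p probability mass (nucleus truncation), renormalize, and sample.
    Returns (sampled token index, truncated distribution)."""
    rng = rng or np.random.default_rng(0)
    z = np.asarray(logits, dtype=float) / temperature
    p = np.exp(z - z.max())          # numerically stable softmax
    p /= p.sum()
    order = np.argsort(p)[::-1]      # tokens by descending probability
    cum = np.cumsum(p[order])
    keep = order[: np.searchsorted(cum, top_p) + 1]
    q = np.zeros_like(p)
    q[keep] = p[keep]                # zero out the tail ...
    q /= q.sum()                     # ... and renormalize the head
    return rng.choice(len(p), p=q), q

# Peaked context (precision matters): the distractor tail is cut entirely.
_, q_peaked = truncate_and_sample([5.0, 1.0, 0.0, -1.0])

# Flat context (exploration matters): all tokens survive truncation.
_, q_flat = truncate_and_sample([1.0, 1.0, 1.0, 1.0])
```

Samples drawn this way would then feed standard supervised fine-tuning in the SSD recipe; the two calls above show the context-dependence the abstract highlights, where the same truncation rule prunes hard in peaked contexts but leaves flat contexts diverse.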

