Code in pre-training data improves LLM performance at non-coding tasks
A study by Cohere shows that including code in the pre-training data of LLMs improves their performance on non-coding tasks, including world knowledge and reasoning.
“This shows that initialization from a pre-trained model with a mix of code has a strong positive effect on NL reasoning tasks,” the researchers write. They also suggest that “performance on world knowledge tasks appears to depend on a more balanced data mixture for initialization and a larger proportion of text in the continual pre-training stage.” It is worth noting that LLMs often exhibit emergent behavior at very large scales, and the trends observed in the study might change at tens or hundreds of billions of parameters.
Or read this on VentureBeat