Does Field Ordering Affect Model Performance?

I mostly wrote this post as an excuse to try the freshly-minted and excellent pydantic-evals framework for LLM evaluations, but one interesting question that arises when working with Pydantic models to implement structured output in your AI applications is: what happens if you shuffle the order of fields in your schema?

We use the painting style classification task from HuggingFace (because it doesn't seem to be saturated by zero-shot models).

Easy Task

Model          Answer First   Answer Second
gpt 4.1        0.5227         0.5040
gpt 4.1-mini   0.4556         0.4515
gpt 4o         0.4960         0.5103
gpt 4o-mini    0.4205         0.4213

Hard Task

Model          Answer First   Answer Second
gpt 4.1        0.0777         0.0647
gpt 4.1-mini   0.0385         0.1017
gpt 4o         0.0696         0.0684
gpt 4o-mini    0.0607         0.0787

It's hard to say exactly why something works the way it does with LLMs, but we've entered a new development paradigm, and it's worth paying attention to the emerging patterns and ideas, especially the subtle ones :)
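To make "shuffling the order of fields" concrete, here is a minimal sketch of what the two schema variants could look like. The field names `answer` and `reasoning`, the three-item `Style` label set, and the `field_order` helper are all assumptions made for illustration; the excerpt does not show the actual models used in the experiment. The only thing the sketch relies on is that Pydantic keeps fields in declaration order when it generates the JSON schema.

```python
from typing import Literal

from pydantic import BaseModel

# Hypothetical label set; the real task has its own list of painting styles.
Style = Literal["Impressionism", "Cubism", "Baroque"]


class AnswerFirst(BaseModel):
    # "Answer First" variant: the label is declared before the free-text field.
    answer: Style
    reasoning: str  # assumed second field, not confirmed by the post


class AnswerSecond(BaseModel):
    # "Answer Second" variant: same fields, declared in the opposite order.
    reasoning: str
    answer: Style


def field_order(model: type[BaseModel]) -> list[str]:
    """Return property names in the order they appear in the JSON schema."""
    return list(model.model_json_schema()["properties"])


if __name__ == "__main__":
    print(field_order(AnswerFirst))
    print(field_order(AnswerSecond))
```

Both classes validate exactly the same data; only the order in which the schema presents the fields differs, which is the variable the tables above compare.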
