Understanding Moravec's Paradox
In general, whether for humans or machines, problems are harder when their search space is large and their reward signals are sparse. Difficulty also grows when a single input can map to many valid outputs, as in autoregressive text generation (e.g. "my dog is "; many continuations are correct). Take RLHF: the search space expands further, since each predicted token is fed back in as context for the next prediction, and the model doesn't know whether the text it is producing is any good until the whole sequence is finished and evaluated. A minimal sketch of this dynamic, using a toy stand-in policy and reward function (not any real model or API):
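```python
import random

# Toy illustration of the point above: tokens are sampled one at a time,
# each sampled token becomes context for the next step, and a single
# scalar reward only arrives once the whole sequence is finished.
# The vocabulary, policy, and reward below are hypothetical stand-ins.

VOCAB = ["my", "dog", "is", "cute", "fast", "asleep", "<eos>"]

def toy_policy(context):
    """Stand-in for a language model: picks the next token at random.
    A real policy would return a distribution conditioned on context."""
    return random.choice(VOCAB)

def toy_reward(tokens):
    """Stand-in for a reward model: scores only the *finished* text.
    There is no per-token feedback; this is the sparse signal."""
    return 1.0 if "cute" in tokens else 0.0

def rollout(prompt, max_len=8):
    tokens = list(prompt)
    while len(tokens) < max_len:
        next_token = toy_policy(tokens)  # sample one token...
        if next_token == "<eos>":
            break
        tokens.append(next_token)        # ...which becomes context for the next

    # Reward is only available here, after generation has finished.
    return tokens, toy_reward(tokens)

completion, reward = rollout(["my", "dog", "is"])
print(completion, reward)
```

Note that with a vocabulary of size |V| and sequences of length n, there are on the order of |V|^n possible completions, yet each one yields only a single number of feedback, which is exactly the large-search-space, sparse-reward setting described above.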