Recent Performance Improvements in Function Calls in CPython
How costly is it to call functions and builtins in your Python code? Does inlining help? How have recent CPython releases improved performance in these areas?
If you look at the bytecode from the previous section again, you will notice that the interpreter must repeatedly execute COMPARE_OP and BINARY_OP instructions for the comparison and increment operations inside the loop. This extra work reduces instruction throughput, and if any of the memory loads involved misses the cache, it can cause a long stall of hundreds of cycles until the data arrives from main memory. Finally, let's discuss the changes behind the performance improvements in the third benchmark, which implements min as a Python function and calls it from inside the loop.
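To make the comparison concrete, here is a minimal sketch of what the three loop variants might look like. The original benchmark code is not shown in this excerpt, so the function names (`my_min`, `loop_inline`, `loop_builtin`, `loop_function`) and the loop shape are assumptions for illustration only. Running the script prints the bytecode of the inlined variant and rough timings for all three.

```python
import dis
import timeit


def my_min(a, b):
    # Hypothetical pure-Python stand-in for the builtin min().
    return a if a < b else b


def loop_inline(n):
    # Variant 1 (assumed): the comparison is written inline in the loop body.
    smallest = n
    i = 0
    while i < n:
        if i < smallest:
            smallest = i
        i += 1
    return smallest


def loop_builtin(n):
    # Variant 2 (assumed): the builtin min() is called on every iteration.
    smallest = n
    i = 0
    while i < n:
        smallest = min(smallest, i)
        i += 1
    return smallest


def loop_function(n):
    # Variant 3 (assumed): a Python-level function is called on every iteration.
    smallest = n
    i = 0
    while i < n:
        smallest = my_min(smallest, i)
        i += 1
    return smallest


if __name__ == "__main__":
    # Show the loop's bytecode; opcode names vary between CPython versions.
    dis.dis(loop_inline)

    # Rough timings only; absolute numbers depend on machine and CPython version.
    for func in (loop_inline, loop_builtin, loop_function):
        seconds = timeit.timeit(lambda: func(100_000), number=20)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Running the same script under different CPython releases is one simple way to see how the relative cost of the three variants has shifted over time.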