
Python: Why Are * And ** Faster Than / And Sqrt()?

While optimising my code I realised the following:

>>> from timeit import Timer as T
>>> T(lambda : 1234567890 / 4.0).repeat()
[0.22256922721862793, 0.20560789108

Solution 1:

The (somewhat unexpected) reason for your results is that Python seems to fold constant expressions involving floating-point multiplication and exponentiation, but not division. math.sqrt() is a different beast altogether since there's no bytecode for it and it involves a function call.

On Python 2.6.5, the following code:

x1 = 1234567890.0 / 4.0
x2 = 1234567890.0 * 0.25
x3 = 1234567890.0 ** 0.5
x4 = math.sqrt(1234567890.0)

compiles to the following bytecodes:

  # x1 = 1234567890.0 / 4.0
  4           0 LOAD_CONST               1 (1234567890.0)
              3 LOAD_CONST               2 (4.0)
              6 BINARY_DIVIDE
              7 STORE_FAST               0 (x1)

  # x2 = 1234567890.0 * 0.25
  5          10 LOAD_CONST               5 (308641972.5)
             13 STORE_FAST               1 (x2)

  # x3 = 1234567890.0 ** 0.5
  6          16 LOAD_CONST               6 (35136.418286444619)
             19 STORE_FAST               2 (x3)

  # x4 = math.sqrt(1234567890.0)
  7          22 LOAD_GLOBAL              0 (math)
             25 LOAD_ATTR                1 (sqrt)
             28 LOAD_CONST               1 (1234567890.0)
             31 CALL_FUNCTION            1
             34 STORE_FAST               3 (x4)

As you can see, multiplication and exponentiation take no time at all since they're done when the code is compiled. Division takes longer since it happens at runtime. Square root is not only the most computationally expensive operation of the four, it also incurs various overheads that the others do not (attribute lookup, function call etc).
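The folding can be checked directly by compiling a statement and looking at the constants stored in the resulting code object. What follows is a minimal sketch, assuming CPython 3; the helper name `constants_of` is made up for illustration. Note that on Python 3 `/` always means true division, so the division is likely to be folded there as well, unlike on 2.6.5.

```python
import dis


def constants_of(source):
    """Return the constants stored in the code object compiled from source.

    Hypothetical helper for illustration; assumes CPython.
    """
    return compile(source, "<example>", "exec").co_consts


# Both operands are literals, so the compiler can do the multiplication
# itself and store only the product.
print(constants_of("x2 = 1234567890.0 * 0.25"))

# One operand is a name, so the multiplication has to happen at run time
# and only the literal 0.25 is stored.
print(constants_of("x2 = y * 0.25"))

# The disassembly shows the same thing at the bytecode level.
dis.dis(compile("x2 = 1234567890.0 * 0.25", "<example>", "exec"))
```

The exact opcodes printed by `dis` differ between Python versions, so the constants tuple is the more portable thing to inspect.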

If you eliminate the effect of constant folding, there's little to separate multiplication and division:

In [16]: x = 1234567890.0

In [17]: %timeit x / 4.0
10000000 loops, best of 3: 87.8 ns per loop

In [18]: %timeit x * 0.25
10000000 loops, best of 3: 91.6 ns per loop

math.sqrt(x) is actually a little bit faster than x ** 0.5, presumably because it's a special case of the latter and can therefore be done more efficiently, in spite of the overheads:

In [19]: %timeit x ** 0.5
1000000 loops, best of 3: 211 ns per loop

In [20]: %timeit math.sqrt(x)
10000000 loops, best of 3: 181 ns per loop
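Outside IPython, the same comparison can be approximated with the standard `timeit` module. This is only a sketch: the helper name `best_time` and its defaults are assumptions, and the absolute numbers will depend on the machine and the Python version.

```python
import timeit


def best_time(expr, number=1000000, repeat=3):
    """Best total time in seconds for `number` evaluations of expr.

    Hypothetical helper for illustration.
    """
    # x is bound in the setup code, so expr contains a name rather than
    # two literals and cannot be folded at compile time.
    setup = "import math; x = 1234567890.0"
    return min(timeit.repeat(expr, setup=setup, number=number, repeat=repeat))


if __name__ == "__main__":
    for expr in ("x / 4.0", "x * 0.25", "x ** 0.5", "math.sqrt(x)"):
        print("%-14s %.3f s" % (expr, best_time(expr)))
```

Taking the minimum of several repeats mirrors the "best of 3" figure that `%timeit` reports.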

edit 2011-11-16: Constant expression folding is done by Python's peephole optimizer. The source code (peephole.c) contains the following comment that explains why constant division isn't folded:

case BINARY_DIVIDE:
        /* Cannot fold this operation statically since
           the result can depend on the run-time presence
           of the -Qnew flag */
        return 0;

The -Qnew flag enables "true division" defined in PEP 238.
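To see why the flag matters, here is a short illustration, assuming Python 3, where true division is always on and the floored behaviour is spelled `//`. On Python 2 without -Qnew, `/` applied to two ints gave the floored result instead, so the compiler could not know at compile time which value to store.

```python
# True division (PEP 238): "/" keeps the fractional part, even for ints.
true_result = 7 / 2

# Floor division: what "/" did for two ints under classic Python 2 rules.
floor_result = 7 // 2

print(true_result, floor_result)
```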
