timeit module hangs with bigger values of pow() - python

I am trying to measure how long the pow() function takes to compute a modular exponentiation. With the values of g, x and p assigned to variables, the code raises an error; with the values written directly into the pow() call, the code hangs. The same code runs quickly when I time it with time() or clock().
I wanted more accuracy, so after testing with clock() and time() I moved to the timeit module.
The code works fine with small values such as pow(2, 3, 5), which makes sense. How can I time this efficiently with the timeit module?
I am also a beginner in Python, so forgive me if there is a stupid mistake in the code.
import math
import random
import hashlib
import time
from timeit import Timer
g = 141802876407053547664378835005750805370737584038368838959151050908654130616798415530564917923311706921535439557793280725844349256960807398107370211978304
x = 1207729835787890214
p = 4870352607375058055471602136317178172283784073796673298937466544646468718314482464390112574915498953621226853454222898392076852427324057496200810018794472
t = Timer('pow(g,x,p)', 'import math')
z = t.timeit()
print ('the value of z is: '), z
Thanks

There are two issues here:
First, timeit can't directly access your script's globals: the statement runs in timeit's own namespace, so g, x and p are undefined there (see the linked question). You can use this to fix the error:
t = Timer('pow(g,x,p)', 'from __main__ import g,x,p')
Or just put the numerical values directly in the statement string.
Second, by default the timeit module runs the statement 1,000,000 times, which takes far too long here. You can pass a smaller number of iterations, for example:
z = t.timeit(1000)
This will prevent what seems like a hang (but is actually just a very long calculation).
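Both fixes can be combined in one short script. The sketch below assumes Python 3.5 or later, where Timer accepts a globals argument as an alternative to the "from __main__ import" setup string. The iteration count of 1000 is taken from the answer above, and the names timer, number, total and per_call are only illustrative.

```python
import timeit

# Values copied from the question.
g = 141802876407053547664378835005750805370737584038368838959151050908654130616798415530564917923311706921535439557793280725844349256960807398107370211978304
x = 1207729835787890214
p = 4870352607375058055471602136317178172283784073796673298937466544646468718314482464390112574915498953621226853454222898392076852427324057496200810018794472

# Pass the names the statement needs explicitly (globals= needs Python 3.5+).
timer = timeit.Timer('pow(g, x, p)', globals={'g': g, 'x': x, 'p': p})

# Run far fewer iterations than the default of 1,000,000.
number = 1000
total = timer.timeit(number=number)

# timeit returns the total time for all iterations, so divide to get per-call time.
per_call = total / number
print('total: {:.6f} s, per call: {:.3e} s'.format(total, per_call))
```

Dividing by number matters when comparing against time() measurements of a single call, because timeit reports the time for the whole batch.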


How to use python timeit to get median runtime

I want to benchmark a bit of python code (not the language I am used to, but I have to do some comparisons in python). I have understood that timeit is a good tool for this, and I have a code like this:
n = 10
duration = timeit.Timer(my_func).timeit(number=n)
duration/n
to measure the mean runtime of the function. Now I want the median time instead (I want to compare against a figure reported as a median, so it would be good to use the same measure in all cases). timeit only seems to return the total runtime, not the time of each individual run, so I am not sure how to find the median runtime. What is the best way to get this?
You can use the repeat method instead, which gives you the individual times as a list:
import timeit
from statistics import median
def my_func():
    for _ in range(1000000):
        pass
n = 10
durations = timeit.Timer(my_func).repeat(repeat=n, number=1)
print(median(durations))
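When a single call is very short, timing it with number=1 can be dominated by timer overhead. One possible variation, sketched below, times small batches and divides each batch total by the batch size. Note that this gives the median of per-batch averages, not the median of individual calls. The helper name median_time and the stand-in workload are assumptions for illustration only.

```python
import timeit
from statistics import median

def my_func():
    # Stand-in workload; replace with the code under test.
    sum(range(1000))

def median_time(func, repeat=7, number=100):
    """Return the median per-call time over `repeat` batches of `number` calls."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    # Each entry in totals is the time for a whole batch, so normalise it.
    return median(t / number for t in totals)

result = median_time(my_func)
print('median time per call: {:.3e} s'.format(result))
```

With number=1 this reduces to the approach in the answer above.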

Importing Numpy increases the execution time of the first iteration in timeit.repeat

Using timeit.repeat to measure the execution time of some expressions, I realized that the first iteration takes significantly longer if I imported numpy beforehand. Consider the following example:
from __future__ import print_function
import timeit
import numpy as np # comment this line
results = timeit.repeat(
    "d['a']",
    setup="d = dict(zip('abc', '123'))",
    repeat=5, number=10**6
)
print(['{:.2e}'.format(x) for x in results])
I obtain the following results:
['5.38e-02', '2.72e-02', '2.70e-02', '2.68e-02', '2.70e-02']
The first iteration took significantly longer than the remaining ones (I verified this pattern by running the code multiple times).
Now when commenting out the import numpy as np line in the above code, then the timing results change as follows:
['2.73e-02', '2.71e-02', '2.65e-02', '2.68e-02', '2.66e-02']
Here the execution time of the first iteration is comparable to the others.
This behavior doesn't occur for Python 3.8 where I obtain similar timings, no matter if Numpy has been imported or not:
['2.64e-02', '2.89e-02', '2.65e-02', '2.63e-02', '2.63e-02'] # with 'import numpy'
['2.63e-02', '2.56e-02', '2.52e-02', '2.50e-02', '2.51e-02'] # without 'import numpy'
What causes this increase in execution time in conjunction with import numpy for Python 2.7?
Detailed version information:
Python 2.7.12: numpy==1.14.0
Python 3.8.1: numpy==1.18.1

Accurate timing for imports in Python

The timeit module is great for measuring the execution time of small code snippets, but when the code changes global state (as an import does) it's really hard to get accurate timings.
For example, if I want to time how long it takes to import a module, the first import takes much longer than subsequent ones, because afterwards the submodules and dependencies are already imported and the files are already cached. So using a bigger number of repeats, like in:
>>> import timeit
>>> timeit.timeit('import numpy', number=1)
0.2819331711316805
>>> # Start a new Python session:
>>> timeit.timeit('import numpy', number=1000)
0.3035142574359181
doesn't really work, because the time for one execution is almost the same as for 1000 rounds. I could execute the command to "reload" the package:
>>> timeit.timeit('imp.reload(numpy)', 'import importlib as imp; import numpy', number=1000)
3.6543283935557156
But the fact that 1000 reloads are only about 10 times slower than the first import suggests this is not accurate either.
It also seems impossible to unload a module entirely ("Unload a module in Python").
So the question is: What would be an appropriate way to accurately measure the import time?
Since it's nearly impossible to fully unload a module, one option is to start a fresh interpreter for every measurement.
You could run a loop in a Python script that, n times, runs one Python command importing numpy and another doing nothing, then subtract the two totals and average:
import subprocess, sys, time
n = 100
python_load_time = 0
numpy_load_time = 0
for i in range(n):
    # fresh interpreter that imports numpy
    s = time.time()
    subprocess.call([sys.executable, "-c", "import numpy"])
    numpy_load_time += time.time() - s
    # fresh interpreter that does nothing (startup cost only)
    s = time.time()
    subprocess.call([sys.executable, "-c", "pass"])
    python_load_time += time.time() - s
print("average numpy load time = {}".format((numpy_load_time - python_load_time) / n))
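The same idea can be wrapped in a reusable helper that reports a median instead of a mean, which is less sensitive to an occasional slow process start. This is only a sketch: run_fresh and import_cost are hypothetical names, the standard-library module json stands in for numpy so the example runs anywhere, and the result still includes operating-system noise and file-cache effects.

```python
import statistics
import subprocess
import sys
import time

def run_fresh(code):
    """Return the wall-clock time of one fresh interpreter running `code`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return time.perf_counter() - start

def import_cost(module, runs=5):
    """Estimate the import time as median(import run) minus median(empty run)."""
    with_import = [run_fresh("import " + module) for _ in range(runs)]
    baseline = [run_fresh("pass") for _ in range(runs)]
    return statistics.median(with_import) - statistics.median(baseline)

cost = import_cost("json", runs=3)
print("approximate import cost of json: {:.4f} s".format(cost))
```

For a cheap module the difference can come out close to zero or even slightly negative, which shows how large the process-start noise is compared with the import itself.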

Sympy reconfigures the randomness seed

Using the Python symbolic computation module "Sympy" in a simulation is proving very difficult: I need reliable, fixed inputs, and for that I use seed() from the random module.
However, every time I call a simple sympy function, it seems to overwrite the seed with a new value, so I get new output every time. I have searched a little and found this, but none of the results offers a solution.
Consider this code:
from sympy import *
import random
random.seed(1)
for _ in range(2):
    x = symbols('x')
    equ = (x** random.randint(1,5)) ** Rational(random.randint(1,5)/2)
    print(equ)
This outputs
(x**2)**(5/2)
x**4
on the first run, and
(x**2)**(5/2)
(x**5)**(3/2)
on the second run, and every time I run the script it returns new output. I need a way to fix this and enforce the use of seed().
Does this help? From the docs on random:
"You can instantiate your own instances of Random to get generators that don’t share state"
Usage:
import random
# Create a new pseudo random number generator
prng = random.Random()
prng.seed(1)
This number generator keeps its own state, so it will be unaffected by anything sympy does to the module-level generator.
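A small experiment can illustrate the isolation without involving sympy at all. In the sketch below, reseeding the module-level generator stands in for whatever sympy does internally (an assumption made only for this demonstration), and draw is a hypothetical helper.

```python
import random

def draw(seed, disturb):
    """Draw five values from a private generator, optionally disturbing the global one."""
    prng = random.Random(seed)  # private generator with its own state
    values = []
    for _ in range(5):
        if disturb:
            random.seed()       # reseed the shared module-level generator
            random.random()     # and consume a value from it
        values.append(prng.randint(1, 5))
    return values

quiet = draw(1, disturb=False)
noisy = draw(1, disturb=True)
print(quiet == noisy)  # True: the private generator ignores the global reseeding
```

In the question's code this would mean calling prng.randint(1, 5) in place of random.randint(1, 5).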

Forcing timeit to automatically choose number of times statement is executed

In the command line, I can do:
python -m timeit 'a = 1'
According to the docs:
If -n is not given, a suitable number of loops is calculated by trying successive powers
of 10 until the total time is at least 0.2 seconds.
This works great, but how can I get the same behavior when using timeit in my program? If I leave out the number argument to the timeit.timeit call, it will simply default to 1000000.
The docs don't make it obvious that such functionality exists, so I'll assume it doesn't. Fortunately, defining a wrapper that does it for you isn't too hard:
import timeit
def auto_timeit(stmt='pass', setup='pass'):
    n = 1
    t = timeit.timeit(stmt, setup, number=n)
    while t < 0.2:
        n *= 10
        t = timeit.timeit(stmt, setup, number=n)
    return t / n # normalise to time-per-run
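Here is an alternative to writing a wrapper: run the command-line interface in a subprocess, which prints its result to standard output:
import subprocess
import sys
stmt = 'a = 1'  # the statement to time
subprocess.call([sys.executable, '-m', 'timeit', stmt])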
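On Python 3.6 and later there is also a built-in option: timeit.Timer has an autorange() method that increases the loop count until the total time is at least 0.2 seconds and returns both the loop count and the total time. A minimal sketch, assuming such a Python version (the variable names are illustrative):

```python
import timeit

timer = timeit.Timer('a = 1')

# autorange() picks the loop count itself and returns (loops, total_seconds).
number, total = timer.autorange()

print('{} loops, {:.3e} s per loop'.format(number, total / number))
```

Unlike the subprocess approach, this keeps the result available as numbers inside the program instead of text on standard output.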
