Forcing timeit to automatically choose number of times statement is executed - python

In the command line, I can do:
python -m timeit 'a = 1'
According to the docs:
If -n is not given, a suitable number of loops is calculated by trying successive powers
of 10 until the total time is at least 0.2 seconds.
This works great, but how can I get the same behavior when using timeit in my program? If I leave out the number argument to the timeit.timeit call, it will simply default to 1000000.

The docs don't make it obvious that such functionality exists, so I'll assume it doesn't. Fortunately, defining a wrapper that does it for you isn't too hard:
import timeit
def auto_timeit(stmt='pass', setup='pass'):
    n = 1
    t = timeit.timeit(stmt, setup, number=n)
    while t < 0.2:
        n *= 10
        t = timeit.timeit(stmt, setup, number=n)
    return t / n  # normalise to time-per-run
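On Python 3.6 and later, timeit.Timer also has an autorange() method that picks the loop count in much the same way as the command line does. A minimal sketch, assuming the statement 'a = 1' from the question; the names `timer`, `number`, `total` and `per_run` are only illustrative:

```python
import timeit

# Timer.autorange() (Python 3.6+) keeps increasing the loop count
# until the total run time is at least 0.2 seconds.
timer = timeit.Timer('a = 1')
number, total = timer.autorange()

per_run = total / number  # normalise to time-per-run
print(f"{number} loops, {per_run:.3e} s per loop")
```

It returns both the chosen loop count and the total time, so the per-run figure can be derived the same way as in the wrapper above.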

Here is an alternative to writing a wrapper:
import subprocess
import sys
subprocess.call([sys.executable, '-m', 'timeit', stmt])
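If the report is needed inside the program rather than just printed to the terminal, the output of the child process can be captured. This is a sketch under the assumption of Python 3.7+ (for capture_output); `stmt` is set to the example statement from the question and `report` is a hypothetical variable name:

```python
import subprocess
import sys

stmt = 'a = 1'  # example statement taken from the question

# Run "python -m timeit" with the current interpreter and capture its report.
result = subprocess.run(
    [sys.executable, '-m', 'timeit', stmt],
    capture_output=True,
    text=True,
    check=True,
)
report = result.stdout.strip()
print(report)
```

The captured text is the same human-readable summary line the command line prints, so it would still need parsing if a number is required.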

Related

How to use python timeit to get median runtime

I want to benchmark a bit of Python code (not the language I am used to, but I have to do some comparisons in Python). I understand that timeit is a good tool for this, and I have code like this:
n = 10
duration = timeit.Timer(my_func).timeit(number=n)
duration/n
to measure the mean runtime of the function. Now, I want to instead have the median time (the reason is that I want to make a comparison to something I get in median time, and it would be good to use the same measure in all cases). Now, timeit only seems to return the full runtime, and not the time of each individual run, so I am not sure how to find the median runtime. What is the best way to get this?
You can use the repeat method instead, which gives you the individual times as a list:
import timeit
from statistics import median
def my_func():
    for _ in range(1000000):
        pass
n = 10
durations = timeit.Timer(my_func).repeat(repeat=n, number=1)
print(median(durations))

Using timeit in interactive mode

How can I implement this ipython code in python?
[1] %timeit sum(list(range(1000)))
Ps: I want to do it in a single line of code. I have tried several times but failed every time.
Thanks.
This will give you the time taken (in seconds):
from timeit import timeit
timeTaken = timeit(lambda: sum(list(range(1000))), number=100000)
You can look up the timeit documentation for more options.
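To get something closer to what %timeit reports (a per-loop time taken from the best of several repeats), the module-level timeit.repeat function can be used. A rough sketch; the loop count of 10000 and the repeat count of 5 are arbitrary choices, not what IPython uses internally:

```python
from timeit import repeat

number = 10000  # loops per timing; an arbitrary choice for this sketch

# Five independent timings of `number` loops each.
times = repeat(lambda: sum(list(range(1000))), repeat=5, number=number)

# Report the fastest repeat, normalised to a single loop.
best_per_loop = min(times) / number
print(f"best of 5: {best_per_loop * 1e6:.2f} usec per loop")
```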

Accurate timing for imports in Python

The timeit module is great for measuring the execution time of small code snippets, but when the code changes global state (like an import does) it's really hard to get accurate timings.
For example, if I want to measure the time it takes to import a module, the first import will take much longer than subsequent imports, because by then the submodules and dependencies are already imported and the files are already cached. So using a larger number of repeats, as in:
>>> import timeit
>>> timeit.timeit('import numpy', number=1)
0.2819331711316805
>>> # Start a new Python session:
>>> timeit.timeit('import numpy', number=1000)
0.3035142574359181
doesn't really work, because the time for one execution is almost the same as for 1000 rounds. I could execute the command to "reload" the package:
>>> timeit.timeit('imp.reload(numpy)', 'import importlib as imp; import numpy', number=1000)
3.6543283935557156
But that it's only 10 times slower than the first import seems to suggest it's not accurate either.
It also seems impossible to unload a module entirely ("Unload a module in Python").
So the question is: What would be an appropriate way to accurately measure the import time?
Since it's nearly impossible to fully unload a module, maybe the inspiration behind this answer is this...
You could write a Python script with a loop that, x times, runs one python command that imports numpy and another that does nothing, then subtracts the two totals and averages:
import subprocess, time
n = 100
python_load_time = 0
numpy_load_time = 0
for i in range(n):
    s = time.time()
    subprocess.call(["python", "-c", "import numpy"])
    numpy_load_time += time.time() - s
    s = time.time()
    subprocess.call(["python", "-c", "pass"])
    python_load_time += time.time() - s
print("average numpy load time = {}".format((numpy_load_time - python_load_time) / n))
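The same idea can be wrapped in a small helper that uses the current interpreter (sys.executable) and time.perf_counter(). This is only a sketch: `startup_time` is a hypothetical helper name, and the standard-library module json stands in for numpy so that it runs without extra packages:

```python
import subprocess
import sys
import time


def startup_time(code, runs=5):
    """Average wall-clock seconds to run `python -c code` in a fresh process."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        total += time.perf_counter() - start
    return total / runs


# "json" is a stand-in module chosen so the sketch runs without numpy.
baseline = startup_time("pass")
with_import = startup_time("import json")
import_cost = with_import - baseline
print(f"estimated import time: {import_cost:.4f} s")
```

On Python 3.7 and later, running python -X importtime -c "import numpy" prints a per-module breakdown of import times to stderr, which may be a more direct way to get this information.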

Calling function using Timeit

I'm trying to time several things in Python, including upload time to Amazon's S3 Cloud Storage, and am having a little trouble. I can time my hash, and a few other things, but not the upload. I thought this post would finally get me there, but I can't seem to find salvation. Any help would be appreciated. Very new to Python, thanks!
import timeit
import boto
from boto.s3.key import Key
accKey = r"xxxxxxxxxxx"
secKey = r"yyyyyyyyyyyyyyyyyyyyyyyyy"
bucket_name = 'sweet_data'
c = boto.connect_s3(accKey, secKey)
b = c.get_bucket(bucket_name)
k = Key(b)
p = '/my/aws.path'
f = 'C:\\my.file'
def upload_data(p, f):
    k.key = p
    k.set_contents_from_filename(f)
    return
t = timeit.Timer(lambda: upload_data(p, f), "from aws_lib import upload_data; p=%r; f = %r" % (p,f))
# Just calling the function works fine
#upload_data(p, f)
I know this is heresy in the Python community, but I actually recommend not using timeit, especially for something like this. For your purposes, I believe it will be good enough (and possibly even better than timeit!) if you simply use time.time() to time things. In other words, do something like
from time import time
t0 = time()
myfunc()
t1 = time()
print(t1 - t0)
Note that depending on your platform, you might want to try time.clock() instead (see the related Stack Overflow questions on this), and if you're on Python 3.3 or later, you have better options, such as time.perf_counter(), thanks to PEP 418. time.clock() itself was removed in Python 3.8.
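On Python 3.3 and later, the same pattern could look like the sketch below, using time.perf_counter(). Here `myfunc` is a placeholder workload standing in for whatever is being timed (the S3 upload in the question):

```python
import time


def myfunc():
    # Placeholder workload standing in for the code being timed.
    return sum(range(100000))


t0 = time.perf_counter()
myfunc()
t1 = time.perf_counter()

elapsed = t1 - t0  # wall-clock seconds for a single call
print(f"{elapsed:.6f} s")
```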
You can use the command line interface to timeit.
Just save your code as a module without the timing stuff. For example:
# file: test.py
data = range(5)
def foo(l):
    return sum(l)
Then you can run the timing code from the command line, like this:
$ python -mtimeit -s 'import test;' 'test.foo(test.data)'
See also:
http://docs.python.org/2/library/timeit.html#command-line-interface
http://docs.python.org/2/library/timeit.html#examples

timeit module hangs with bigger values of pow()

I am trying to measure the time taken by the pow function to compute a modular exponentiation. With the values of g, x, p hardcoded, the code gives an error, and with the values placed in the pow function, the code hangs. The same piece of code works fine when I use time() and clock() to measure the time it takes.
I wanted accuracy, so I have now moved to the timeit module after testing with the clock() and time() functions.
The code works fine with small values such as pow(2, 3, 5), which makes sense. How can I improve the efficiency of timing this with the timeit module?
Also, I am a beginner in Python; forgive me if there is any stupid mistake in the code.
import math
import random
import hashlib
import time
from timeit import Timer
g = 141802876407053547664378835005750805370737584038368838959151050908654130616798415530564917923311706921535439557793280725844349256960807398107370211978304
x = 1207729835787890214
p = 4870352607375058055471602136317178172283784073796673298937466544646468718314482464390112574915498953621226853454222898392076852427324057496200810018794472
t = Timer('pow(g,x,p)', 'import math')
z = t.timeit()
print('the value of z is:', z)
Thanks
There are two issues here:
1. You can't directly access globals from timeit (see this question). You can use this to fix the error:
t = Timer('pow(g,x,p)', 'from __main__ import g,x,p')
Or just put the numerical values directly in the string.
2. By default, the timeit module runs 1000000 iterations, which will take much too long here. You can change the number of iterations, for example:
z = t.timeit(1000)
This will prevent what seems like a hang (but is actually just a very long calculation).
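On Python 3.5 and later, another option is the globals argument of Timer, which supplies the names the statement needs without a setup import. A small sketch, assuming short stand-in values for g, x and p (the large integers from the question would be passed the same way):

```python
from timeit import Timer

# Small stand-in values; the question's large integers work the same way.
g, x, p = 2, 3, 5

# Python 3.5+: pass the names the statement needs via `globals`.
t = Timer('pow(g, x, p)', globals={'g': g, 'x': x, 'p': p})

# Limit the number of iterations instead of the default 1000000.
z = t.timeit(number=1000)
print('the value of z is:', z)
```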
