(Python 2.7.8, Windows)
There are so many questions about this particular subject already, but I cannot seem to get any of the answers working.
So, what I'm trying to accomplish is timing how long a function takes to execute.
I have functions.py and main.py in following fashion:
# functions.py
def function(list):
    # does something
    return list
...
# main.py
import functions
# ...stuff...
while True:
    list = ...  # gets list from file
    functions.function(list)  # <--- this needs to get timed
Now, I tried time.time() at the start and end points first, but it's not accurate enough (the difference tends to be 0.0), and after some googling it seems that this isn't the way to go anyway. Apparently what I should use is the timeit module. However, I cannot understand how to get the function into it.
Any help?
As you mentioned, there's a Python module made for this very task, timeit. Its syntax, while a little idiosyncratic, is quite easy to understand:
timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000)
stmt is the function call to be measured, in your case: functions.function(list)
setup is the code you need to create the context necessary for stmt to execute, in your case: import functions; list = gets list from file
number is how many times timeit runs stmt; the result is the total time for all those runs, so divide by number to get an average. You might want to lower it, since calling your function a million times could take a while.
tl;dr:
timeit.timeit(stmt='functions.function(list)', setup='import functions; list = gets list from file', number=100)
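If wrestling with the setup string gets awkward, timeit also accepts a callable, so you can close over the list you already loaded. A minimal sketch, assuming functions.py is importable from main.py and my_list stands in for the list you read from the file:
import timeit
import functions

my_list = [1, 2, 3]  # stand-in for the list read from your file

# timeit calls the lambda `number` times and returns the total elapsed seconds
total = timeit.timeit(lambda: functions.function(my_list), number=100)
print(total / 100)   # average seconds per call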
See this demo of time.time():
>>> import time
>>> def check(n):
... start = time.time()
... for x in range(n):
... pass
... stop = time.time()
... return stop-start
...
>>> check(1000)
0.0001239776611328125
>>> check(10000)
0.0012159347534179688
>>> check(100)
1.71661376953125e-05
The above function returns how much time in seconds the for loop takes for n iterations.
So the algorithm is:
start = time.time()
# your stuff
stop = time.time()
time_taken = stop - start
Or use timeit.default_timer, which picks the most precise clock available for your platform:
import timeit
start = timeit.default_timer()
my_function()
elapsed = timeit.default_timer() - start
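Because a single call can still be shorter than the clock's resolution, it helps to time many calls and divide. A rough sketch, where my_function is just a stand-in for whatever you want to time:
import timeit

def my_function():
    # stand-in for your real code
    return sum(range(1000))

n = 10000
start = timeit.default_timer()
for _ in range(n):
    my_function()
elapsed = (timeit.default_timer() - start) / n  # average seconds per call
print(elapsed)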
I was doing some tests on finding a number in a list of numbers with Python. With the following code,
from time import time

search = 9999999
numbers = []
for i in range(100000000):
    numbers.append(i)

start_time = time()
is_in = search in numbers
end_time = time()
print(is_in, end_time - start_time)
I got the output as follows:
True 0.10372281074523926
However, the amount of time that actually passes seems to be much more than the output suggests (nearly 4 seconds). In addition, when I change the search value to 0, it outputs the following:
True 0.0
But still, the time the program needs to terminate is nearly 4-5 seconds (measured by human instinct). I wonder what the reason behind this is. Why does it not finish after the measured 0.1 seconds, and why does searching for 0 result in 0.0 seconds?
How long do you think it takes to build your numbers list, especially when doing so in the most inefficient way? Well, let's check it, but let's check it the right way: using timeit:
>>> def foo():
... l = []
... for i in range(100000000): l.append(i)
... return l
...
>>> import timeit
>>> timeit.timeit("foo()", "from __main__ import foo", number=1)
6.561729616951197
So on this desktop (which is a rather decent machine), just creating this list already takes 6.5 seconds.
Now let's test the linear search:
>>> def search(i, num):
... return i in num
...
>>> numbers = foo()
>>> timeit.timeit("search(9999999, numbers)", "from __main__ import search, numbers", number=1)
0.06766342208720744
So we need 6.5 seconds to build the list, and 0.067 seconds to do a linear search. Note that in both cases we only executed the code under test a single time (the number=1 argument to timeit), which is not really accurate due to OS process scheduling. For a more accurate reading you want to repeat the operation thousands of times or more (the default value is actually 1000000!) so you get a reasonably representative average value.
Now, just for fun, let's rewrite foo():
>>> def foo():
... return list(range(100000000))
...
>>> timeit.timeit("foo()", "from __main__ import foo", number=1)
2.594872738001868
That's still long, but it's about 2.5 times faster. If you wonder why: this way the runtime can allocate the required memory for the full list right from the start instead of having to grow it again and again and again.
And for a much more efficient (and constant time !) search:
>>> numset = set(numbers)
>>> timeit.timeit("search(9999999, numset)", "from __main__ import search, numset", number=1)
3.505963832139969e-06
Wait!!! 3.5 something seconds??? But no, notice the e-06 at the end: it's actually about 0.0000035 seconds, so almost 20000 times faster.
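As noted above, number=1 is not a very reliable measurement. timeit.repeat runs the statement in several batches and returns one total per batch; taking the minimum batch and dividing by number gives a fairly stable per-call figure. A small sketch reusing search and numset from above (the set version, so thousands of runs stay cheap; the values will vary by machine):
# 5 batches of 100000 runs each; the minimum batch is usually the least noisy
batches = timeit.repeat("search(9999999, numset)",
                        "from __main__ import search, numset",
                        repeat=5, number=100000)
print(min(batches) / 100000)   # best average time per call, in seconds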
I need help with this code:
import timeit

mysetup = ""

mycode = '''
def gener():
    ...my code here...
    return x
'''

# timeit statement
print(timeit.timeit(setup=mysetup,
                    stmt=mycode,
                    number=1000000))
print("done")
print("done")
As a result I got 0.0008606994517737132.
As I read it, this unit is in "seconds".
So my function executed 1 million times in 0.8 ms?
I think this is not real; it's far too fast.
I also tried the basic option:
start = time.time()
# my code here
end = time.time()
print(end - start)
and got 0.23901081085205078 for a single execution, which seems a little slow...
So what am I doing wrong?
Thanks
The way you have defined mycode for the timeit call, all that is going to happen is that the function gener will be defined, not run. You need to call the function in your code block in order to measure its execution time.
As for what length of time is reasonable (too fast/too slow) it very much depends on what your code is doing. But I suspect you have executed the function in method 2 and only defined it in method 1, hence the discrepancy.
Edit: example code
To illustrate the difference, in the example below the block code1 just defines a function, it does not execute it. The block code2 defines and executes the function.
import timeit

code1 = '''
import time
def gener():
    time.sleep(0.01)
'''

code2 = '''
import time
def gener():
    time.sleep(0.01)
gener()
'''
We should expect running time.sleep(0.01) 100 times to take approximately 1 second. Running timeit for code1 returns ~ 10^-5 seconds, because the function gener is not actually being called:
timeit.timeit(stmt=code1, number=100)
Running timeit for code2 returns the expected result of ~1 second:
timeit.timeit(stmt=code2, number=100)
Further to this, the point of the setup argument is to do setup (the parts of the code which are not meant to be timed). If you want timeit to capture the execution time of gener, you should use this:
import timeit

setup = '''
import time
def gener():
    time.sleep(0.01)
'''

stmt = "gener()"
timeit.timeit(setup=setup, stmt=stmt, number=100)
This returns the time taken to run gener 100 times, not including the time taken to define it.
Here is a general way to measure the time taken by blocks of code.
import time

class timer(object):
    """
    A simple timer used to time blocks of code. Usage as follows:

    with timer("optional_name"):
        some code ...
        some more code
    """
    def __init__(self, name=None):
        self.name = name

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.end = time.time()
        self.interval = self.end - self.start
        if self.name:
            print("{} - Elapsed time: {:.4f}s".format(self.name, self.interval))
        else:
            print("Elapsed time: {:.4f}s".format(self.interval))
gist available here: https://gist.github.com/Jakobovski/191b9e95ac964b61e8abc7436111d1f9
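For instance, timing a block in line with the docstring's usage (the loop body here is only a placeholder):
with timer("build list"):
    data = [i * i for i in range(10 ** 6)]
# prints the elapsed time in the "{name} - Elapsed time: {secs}s" format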
If you want to time a function, timeit can be used like so:
import timeit

# defining some function you want to time
def test(n):
    s = 0
    for i in range(n):
        s += i
    return s

# defining a function which runs the function to be timed with the desired
# input arguments; this gives us a callable that takes no arguments
timed_func = lambda: test(1000)

N = 10000  # number of repeats
time_per_run = timeit.timeit(stmt=timed_func, number=N) / N
For your case you can do:
# defining some function you want to time
def gener():
    ...my code here...
    return x

N = 1000000  # number of repeats
time_per_run = timeit.timeit(stmt=gener, number=N) / N
Any importing of libraries can be done globally before calling timeit, and timeit will use the globally imported libraries,
e.g.
import numpy as np

# defining some function you want to time
def gener():
    ...my code here...
    x = np.sqrt(y)
    return x

N = 1000000  # number of repeats
time_per_run = timeit.timeit(stmt=gener, number=N) / N
Working code
# importing the required module
import timeit

# code snippet to be executed only once
mysetup = '''
from collections import OrderedDict
def gener():
    some lines of code here
    return x'''

# code snippet whose execution time is to be measured
mycode = "gener()"

# timeit statement
nb = 10
print("The code ran {} times in:".format(nb))
print("{} seconds".format(timeit.timeit(setup=mysetup,
                                        stmt=mycode,
                                        number=nb)))
print("done")
The execution time was almost the same as the one I got with the basic measurement below:
start = time.time()
# my code here
end = time.time()
print(end - start)
0.23 sec with timeit and 0.24 with the basic measurement code above; they both fluctuate...
So thanks, question resolved
I have to time the implementation I did of an algorithm in one of my classes, and I am using the time.time() function to do so. After implementing it, I have to run that algorithm on a number of data files which contains small and bigger data sets in order to formally analyse its complexity.
Unfortunately, on the small data sets I get a runtime of 0 seconds, even though the function shows a precision of 0.000000000000000001 when looking at the runtimes of the bigger data sets, and I cannot believe that the smaller data sets really take less time than that.
My question is: Is there a problem using this function (and if so, is there another function I can use that has a better precision)? Or am I doing something wrong?
Here is my code if ever you need it:
import sys, time
import random
from utility import parseSystemArguments, printResults
...

def main(ville):
    start = time.time()
    solution = dynamique(ville)  # Algorithm implementation
    end = time.time()
    return (end - start, solution)

if __name__ == "__main__":
    sys.argv.insert(1, "-a")
    sys.argv.insert(2, "3")
    (algoNumber, ville, printList) = parseSystemArguments()
    (algoTime, solution) = main(ville)
    printResults(algoTime, solution, printList)
The printResults function:
def printResults(time, solution, printList=True):
    print("Temps d'execution = " + str(time) + "s")
    if printList:
        print(solution)
The solution to my problem was to use the timeit module instead of the time module.
import timeit
...

def main(ville):
    start = timeit.default_timer()
    solution = dynamique(ville)
    end = timeit.default_timer()
    return (end - start, solution)
Don't confuse the resolution of the system time with the resolution of a floating point number. The time resolution on a computer is only as frequent as the system clock is updated. How often the system clock is updated varies from machine to machine, so to ensure that you will see a difference with time, you will need to make sure it executes for a millisecond or more. Try putting it into a loop like this:
start = time.time()
k = 100000
for i in range(k):
    solution = dynamique(ville)
end = time.time()
return ((end - start) / k, solution)
In the final tally, you then need to divide by the number of loop iterations to know how long your code actually runs once through. You may need to increase k to get a good measure of the execution time, or you may need to decrease it if your computer is running in the loop for a very long time.
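Wrapped up as a small helper, the same idea could look like this (dynamique and ville are the names from your code; k is the repetition count):
import time

def average_runtime(func, arg, k=1000):
    """Run func(arg) k times; return (average seconds per call, last result)."""
    start = time.time()
    for _ in range(k):
        solution = func(arg)
    end = time.time()
    return (end - start) / k, solution

# usage: algoTime, solution = average_runtime(dynamique, ville)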
I made a little function using timeit just so I could be lazy and do less typing, which isn't panning out as planned.
The (relevant) code:
def timing(function, retries=10000, formatSeconds=3, repeat=10):
    """Test how long a function takes to run. Defaults are set to run
    10 times of 10000 tries each. Will display time as 1 of 4 types.
    0 = seconds, 1 = milliseconds, 2 = microseconds and 3 = nanoseconds.
    Pass in parameters as: (function, retries=10000, formatSeconds=3, repeat=10)"""
    t = timeit.Timer(lambda: function)
    result = t.repeat(repeat=repeat, number=retries)
    rlist = [i / retries for i in result]
It runs fine but it keeps returning:
timeprofile.timing(find_boundaries(numpy.asarray(Image.open(
r'D:\Python\image\image4.jpg')),79))
10 runs of 10000 cycles each:
Best time: 137.94764 Worst:158.16651 Avg: 143.25466 nanosecs/pass
Now, if I do from the interpreter:
import timeit
from timeit import Timer
t = timeit.Timer(lambda: (find_boundaries(numpy.asarray(Image.open(r'D:\Python\image\image4.jpg')),79)))
result = t.repeat(repeat=5,number=100)
result = [i/100 for i in result]
I end up with [0.007723014775432375, 0.007615270149786965, 0.0075242365377505395,
0.007420834966038683, 0.0074086862470653615], or about 8 milliseconds.
And if I run the profiler on the script, it also gives approximately the same result of about 8 milliseconds.
I'm not really sure what the problem is, although I reckon it has something to do with how it's calling the function. When I check the data in the debugger it shows the function as a dictionary with a len of 53, and each key contains 1 to 15 tuples with a pair of 2-3 digit numbers in each.
So, if anyone knows why it's doing that and would like to explain it to me, and how to fix it, that'd be great!
Yes, there is a difference. When you run:
timeprofile.timing(find_boundaries(numpy.asarray(Image.open(
r'D:\Python\image\image4.jpg')),79))
You are not passing in a function reference; you are calling the function and passing in the result of that call. You are timing the static result instead of some_function(with, arguments).
Move out the lambda:
timeprofile.timing(lambda: (find_boundaries(numpy.asarray(Image.open(
r'D:\Python\image\image4.jpg')),79)))
This means you need to remove it from your timing function, and instead pass the function straight to the Timer() class:
t = timeit.Timer(function)
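The difference is easy to see with a toy function (slow_square below is purely illustrative):
import time
import timeit

def slow_square(x):
    time.sleep(0.01)   # stand-in for expensive work
    return x * x

# Wrong: the call happens right here, once; only its cheap result is left over.
result = slow_square(5)

# Right: wrap the call in a lambda so the Timer itself invokes the function.
elapsed = timeit.timeit(lambda: slow_square(5), number=10)
print(elapsed)   # roughly 0.1 s, since the sleep dominates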
I'm programming in python on windows and would like to accurately measure the time it takes for a function to run. I have written a function "time_it" that takes another function, runs it, and returns the time it took to run.
def time_it(f, *args):
    start = time.clock()
    f(*args)
    return (time.clock() - start) * 1000
I call this 1000 times and average the result. (The 1000 constant at the end is to give the answer in milliseconds.)
This function seems to work, but I have this nagging feeling that I'm doing something wrong, and that by doing it this way I'm using more time than the function actually uses when it's running.
Is there a more standard or accepted way to do this?
When I changed my test function to call print so that it takes longer, my time_it function returns an average of 2.5 ms while cProfile.run('f()') returns an average of 7.0 ms. I figured my function would overestimate the time if anything, so what is going on here?
One additional note: it is the relative time of functions compared to each other that I care about, not the absolute time, as this will obviously vary depending on hardware and other factors.
Use the timeit module from the Python standard library.
Basic usage:
from timeit import Timer
# first argument is the code to be run, the second "setup" argument is only run once,
# and is not included in the execution time.
t = Timer("""x.index(123)""", setup="""x = range(1000)""")
print t.timeit()  # prints float, for example 5.8254
# ..or..
print t.timeit(1000)  # repeat 1000 times instead of the default 1 million
Instead of writing your own profiling code, I suggest you check out the built-in Python profilers (profile or cProfile, depending on your needs): http://docs.python.org/library/profile.html
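A minimal example of the built-in profiler, with f standing in for the code under test:
import cProfile

def f():
    return sum(i * i for i in range(100000))

# prints call counts plus total and cumulative time per function
cProfile.run('f()')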
You can create a "timeme" decorator like so
import time

def timeme(method):
    def wrapper(*args, **kw):
        startTime = int(round(time.time() * 1000))
        result = method(*args, **kw)
        endTime = int(round(time.time() * 1000))
        print(endTime - startTime, 'ms')
        return result
    return wrapper

@timeme
def func1(a, b, c='c', sleep=1):
    time.sleep(sleep)
    print(a, b, c)

func1('a', 'b', 'c', 0)
func1('a', 'b', 'c', 0.5)
func1('a', 'b', 'c', 0.6)
func1('a', 'b', 'c', 1)
This code is very inaccurate
total = 0
for i in range(1000):
    start = time.clock()
    function()
    end = time.clock()
    total += end - start
time = total / 1000
This code is less inaccurate
start = time.clock()
for i in range(1000):
    function()
end = time.clock()
time = (end - start) / 1000
The very inaccurate version suffers from measurement bias if the run-time of the function is close to the accuracy of the clock. Most of the measured times are merely random numbers between 0 and a few ticks of the clock.
Depending on your system workload, the "time" you observe from a single function may be entirely an artifact of OS scheduling and other uncontrollable overheads.
The second version (less inaccurate) has less measurement bias. If your function is really fast, you may need to run it 10,000 times to damp out OS scheduling and other overheads.
Both are, of course, terribly misleading. The run time for your program -- as a whole -- is not the sum of the function run-times. You can only use the numbers for relative comparisons. They are not absolute measurements that convey much meaning.
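Note that timeit already implements the "time a whole batch, then divide" approach, and its repeat() lets you take the best of several batches, which is usually good enough for relative comparisons. A sketch, assuming function is defined and takes no arguments:
import timeit

# 5 batches of 1000 calls each; divide the best batch by the call count
per_call = min(timeit.repeat(function, number=1000, repeat=5)) / 1000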
If you want to time a Python method even if the block you measure may throw, one good approach is to use the with statement. Define a Timer class as:
import time

class Timer:
    def __enter__(self):
        self.start = time.clock()
        return self

    def __exit__(self, *args):
        self.end = time.clock()
        self.interval = self.end - self.start
Then you may want to time a connection method that may throw. Use
import httplib

with Timer() as t:
    conn = httplib.HTTPConnection('google.com')
    conn.request('GET', '/')

print('Request took %.03f sec.' % t.interval)
The __exit__() method will be called even if the connection request throws. More precisely, you'd have to use try/finally to see the result in case it throws, as in:
try:
    with Timer() as t:
        conn = httplib.HTTPConnection('google.com')
        conn.request('GET', '/')
finally:
    print('Request took %.03f sec.' % t.interval)
More details here.
This is neater
from contextlib import contextmanager
import time

@contextmanager
def timeblock(label):
    start = time.clock()
    try:
        yield
    finally:
        end = time.clock()
        print('{} : {}'.format(label, end - start))

with timeblock("just a test"):
    print("yippee")
Similar to @AlexMartelli's answer,
import timeit
timeit.timeit(fun, number=10000)
can do the trick.
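For this to work, fun must be a zero-argument callable; if it needs arguments, wrap it in a lambda. A short sketch with a throwaway function:
import timeit

def fun():
    return sum(range(100))

total = timeit.timeit(fun, number=10000)                       # total seconds
per_call = timeit.timeit(lambda: fun(), number=10000) / 10000  # average per call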