How do I profile/benchmark an asynchronous Python script (which uses asyncio)?
Normally you would do something like
totalMem = tracemalloc.get_traced_memory()[0]
totalTime = time.time()
retValue = myFunction()
totalTime = time.time() - totalTime
totalMem = tracemalloc.get_traced_memory()[0] - totalMem
This way I would save the total time taken by the function.
I learned how to use decorators and I did just that - and dumped all stats into a text file for later analysis.
But when you have an asyncio script, things get pretty different: the function will block while doing an "await aiohttpSession.get()", and control will go back to the event loop, which will run other functions.
This way, the elapsed time and changes in total allocated memory won't reveal anything, because I will have measured more than just that function.
The only way it would work would be something like
class MyTracer:
    def __init__(self):
        self.totalTime = 0
        self.totalMem = 0
        self.startTime = time.time()
        self.startMem = tracemalloc.get_traced_memory()[0]

    def stop(self):
        self.totalTime += time.time() - self.startTime
        self.totalMem += tracemalloc.get_traced_memory()[0] - self.startMem

    def start(self):
        self.startTime = time.time()
        self.startMem = tracemalloc.get_traced_memory()[0]
And now, somehow, insert it in the code:
async def myFunction():
    tracer = MyTracer()
    session = aiohttp.ClientSession()

    # do something

    tracer.stop()
    # the time elapsed here, and the changes in the memory allocation, are not from the current function
    retValue = await (await session.get('https://hoochie-mama.org/cosmo-kramer',
        headers={
            'User-Agent': 'YoYo Mama! v3.0',
            'Cookies': 'those cookies are making me thirsty!',
        })).text()
    tracer.start()

    # do more things

    tracer.stop()

    # now "tracer" has the info about total time spent in this function, and the memory allocated by it
    # (the memory stats could be negative if the function releases more than it allocates)
Is there a way to accomplish this, I mean, profile all my asyncio code without having to insert all this code?
Or is there a module already capable of doing just that?
Check out the Yappi profiler, which has support for coroutine profiling. Its page on coroutine profiling describes the problem you're facing very clearly:
The main issue with coroutines is that, under the hood when a coroutine yields or in other words context switches, Yappi receives a return event just like we exit from the function. That means the time spent while the coroutine is in yield state does not get accumulated to the output. This is a problem especially for wall time as in wall time you want to see whole time spent in that function or coroutine. Another problem is call count. You see every time a coroutine yields, call count gets incremented since it is a regular function exit.
They also describe, at a high level, how Yappi solves this problem:
With v1.2, Yappi corrects above issues with coroutine profiling. Under the hood, it differentiates the yield from real function exit and if wall time is selected as the clock_type it will accumulate the time and corrects the call count metric.
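For reference, here is a minimal sketch of what wall-clock coroutine profiling with Yappi might look like (the fetch/main coroutines are placeholders standing in for your aiohttp code):

import asyncio
import yappi

async def fetch():
    await asyncio.sleep(1)  # stands in for an aiohttp request

async def main():
    await asyncio.gather(fetch(), fetch())

yappi.set_clock_type("wall")  # include time spent awaiting, not just CPU time
yappi.start()
asyncio.run(main())
yappi.stop()
yappi.get_func_stats().print_all()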
Related
I have a Python class and want to measure the time it takes to instantiate the class and execute a method across numerous, e.g., 100, runs.
I noticed that the first run takes considerably longer than subsequent runs. I assume that is caused by branch prediction, since the input does not change. However, I want to measure the time it takes "from scratch", i.e., without the benefit of branch prediction. Note that constructing a realistic input is difficult in this case, so the runs have to be executed on the same input.
To tackle this, I tried creating a new object on each run and delete the old object:
import time

class Myobject:
    def mymethod(self):
        """
        Does something complex.
        """
        pass

def benchmark(runs=100):
    """
    The argument runs corresponds to the number of times the benchmark is to be executed.
    """
    times_per_run = []
    r = range(runs)
    for _ in r:
        t2_start = time.perf_counter()
        # instantiation
        obj = Myobject()
        # method execution
        obj.mymethod()
        del obj
        t2_stop = time.perf_counter()
        times_per_run.append(t2_stop - t2_start)
    print(times_per_run)

benchmark(runs=10)
Executing this code shows that the average time per run varies significantly. The first run takes consistently longer. How do I eliminate the benefit of branch prediction when benchmarking across multiple runs?
To avoid the benefits of warm-up (see the comments on the post), I used the subprocess module to trigger the runs individually while measuring the time for each run, then aggregated the results afterwards:
import logging
import subprocess
import time

def benchmark(runs=100):
    times_per_run = []
    command = "python3 ./myclass.py"
    for _ in range(runs):
        t1_start = time.perf_counter()
        subprocess.run(command, capture_output=True, shell=True, check=False)
        t1_stop = time.perf_counter()
        times_per_run.append(t1_stop - t1_start)
    logging.info(f"Average time per run: {sum(times_per_run) / runs}")

benchmark()
This yields stable results.
I'm an amateur coder. I'm working on a small game for a project in biology, but I have come across an issue in my code. I have a loop that adds +1 to the variable sunlight every two seconds. However, all code below the loop is non-functional now that I have made the loop. I'm guessing it's because it's waiting for the loop to finish. Is there any way to have the loop always run but allow the code to run through its sequence at the same time?
print("Game started!")
sunlight = 0
while True:
time.sleep(2)
sunlight += 1
commands = input("Type stats to see which molecules you have, type carbon to get carbon\ndioxide, and type water to get water: ")
if commands == ("stats"):
print("Sunlight: ",sunlight,"")
As you are a beginner, I would not recommend using multithreading or asyncio. Instead, just record the start time, and when the user enters "stats", the elapsed time // 2 will be equal to sunlight.
import time

start_time = time.time()

while True:
    commands = input("Type stats to see which molecules you have, type carbon to get carbon\ndioxide, and type water to get water: ")
    if commands == ("stats"):
        sunlight = (time.time() - start_time) // 2  # elapsed time // 2
        print("Sunlight: ", sunlight, "")
Your sunlight variable basically functions as a clock; it counts half of the number of seconds since the program began. Rather than implement your own clock using time.sleep(), it's better to just use an existing clock from the time library.
The function time.monotonic returns a number of seconds, so you can use this to get the current sunlight by saving the start time, then each time you want to know the value of sunlight, take the difference between the current time and the start time, divided by 2.
start_time = time.monotonic()

def get_sunlight():
    current_time = time.monotonic()
    return int(current_time - start_time) // 2
It is better to use the monotonic() function than the clock() function for this purpose, since the clock() function is deprecated as of Python 3.3:
The time.clock() function is deprecated because it is not portable: it behaves differently depending on the operating system.
It's also better than the time() function for this purpose, because changes to the system clock (such as going forwards or back due to daylight saving time) will affect the result of time():
While this function normally returns non-decreasing values, it can return a lower value than a previous call if the system clock has been set back between the two calls.
You should look into the threading module; that's probably the right tool here. You can fire off a thread running your sunlight incrementer that updates a global variable (not a good idea in general, but you seem to have just one writer, so you can get by until you have time to pick up more advanced parallel-processing concepts). A minimal sketch follows the reference link below.
Reference: https://www.geeksforgeeks.org/multithreading-python-set-1/
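Here is a minimal sketch of that idea, reusing the sunlight variable from the question (the shortened prompt and the daemon thread are assumptions about how you would wire it into your game):

import threading
import time

sunlight = 0

def grow_sunlight():
    # background loop: add 1 sunlight every two seconds
    global sunlight
    while True:
        time.sleep(2)
        sunlight += 1

# daemon=True lets the program exit even though the loop never ends
threading.Thread(target=grow_sunlight, daemon=True).start()

while True:
    commands = input("Type stats to see which molecules you have: ")
    if commands == "stats":
        print("Sunlight:", sunlight)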
I'm trying to run a method every minute.
The method does some operations on the internet so it might take anywhere from 1 second to 30 seconds.
What I want to do is calculate the time spent by this method and then sleep for the remaining time, to make sure that the method itself runs every minute.
Currently my code looks like this:
def do_operation():
    access_db()
    sleep(60)
As you can see this does not take into account the delay whatsoever, and although it works, it will at some point fail and skip a minute completely, which should never happen.
import time

def do_operation():
    start = time.time()
    access_db()
    # sleep only for whatever is left of the minute; never pass a negative value
    time.sleep(max(0, 60 - (time.time() - start)))
This code will allow you to run a callable in defined intervals:
import time
import random

def recurring(interval, callable):
    i = 0
    start = time.time()
    while True:
        i += 1
        callable()
        remaining_delay = max(start + (i * interval) - time.time(), 0)
        time.sleep(remaining_delay)

def tick_delay():
    print('tick start')
    time.sleep(random.randrange(1, 4))
    print('tick end')

recurring(5, tick_delay)
Notes
The function tick_delay sleeps for some seconds to simulate a function which can take an undefined amount of time.
If the callable takes longer than the defined loop interval, the next iteration will be scheduled immediately after the last one ended. To have the callable run in parallel you need to use threading or asyncio; a threaded variant is sketched below.
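A minimal sketch of the threaded variant, reusing the recurring/tick_delay names from above; each tick is started on schedule in its own thread instead of waiting for the previous one to finish:

import threading
import time
import random

def recurring_parallel(interval, callable):
    i = 0
    start = time.time()
    while True:
        i += 1
        # run the callable in its own thread so a slow call does not delay the schedule
        threading.Thread(target=callable).start()
        remaining_delay = max(start + (i * interval) - time.time(), 0)
        time.sleep(remaining_delay)

def tick_delay():
    print('tick start')
    time.sleep(random.randrange(1, 4))
    print('tick end')

recurring_parallel(5, tick_delay)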
I created a simple application, and I realised that my code is running extremely slowly. The application involves calling the same method over and over again. I tried investigating the problem, and it turned out that calling the same function/method several times resulted in Python sometimes taking 15 milliseconds to execute an empty function (pass).
I'm running Windows 10 Home 64-bit on a Lenovo ThinkPad with an i7 CPU.
The less code the function/method has, the smaller the chance of hitting a 15 ms runtime; however, it never goes away entirely.
Here's the code:
import time

class Clock:
    def __init__(self):
        self.t = time.time()

    def restart(self):
        dt = time.time() - self.t
        t = time.time()
        return dt * 1000

def method():
    pass

for i in range(100000):
    c = Clock()
    dt = c.restart()
    if dt > 1.:
        print(str(i) + ' ' + str(dt))
I'd expect that I never get anything printed out, however an average result looks like this:
6497 15.619516372680664
44412 15.622615814208984
63348 15.621185302734375
On average, in 1 to 4 out of 100000 iterations, the time elapsed between starting the clock and getting the result (an empty function call plus a simple subtraction and variable assignment) is about 15.62 milliseconds, which makes the run time really slow.
Occasionally the elapsed time is 1 millisecond.
Thank you for your help!
In your code you are making the call to time.time() twice, which requires the system to retrieve the time from the OS. You can read more here:
How does python's time.time() method work?
As you mentioned you are on Windows, it is probably better for you to use time.clock() instead (note that time.clock() has been deprecated since Python 3.3 and was removed in 3.8; time.perf_counter() is the modern replacement). I will defer you to this link, since it does a much better job of explaining: https://www.pythoncentral.io/measure-time-in-python-time-time-vs-time-clock/
The link also takes garbage collection into account for performance and shows how to disable it during testing.
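For what it's worth, here is a minimal sketch of the Clock class from the question rewritten around time.perf_counter(), which has much finer resolution on Windows than time.time() (the assumption being that the 15 ms steps come from the coarse tick of time.time() on that platform):

import time

class Clock:
    def __init__(self):
        self.t = time.perf_counter()

    def restart(self):
        # milliseconds since the last (re)start, measured with a high-resolution counter
        now = time.perf_counter()
        dt = (now - self.t) * 1000
        self.t = now
        return dt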
Hope it answers your questions!
The script that I'm writing sometimes makes requests to an API, and the API requires that requests are limited to a maximum of 1 per second.
What is the most straightforward way of limiting my requests to the API to 1 every second?
Would it involve storing the current time in a file each time a request is made?
You could use a separate thread for the CGI calls and a queuing mechanism that loops, with a call to sleep on each iteration; a rough sketch follows the quoted documentation below.
From the documentation for the time module:
time.sleep(secs)
Suspend execution for the given number of seconds. The argument may be a floating point number to indicate a more precise sleep time. The actual suspension time may be less than that requested because any caught signal will terminate the sleep() following execution of that signal’s catching routine. Also, the suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
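A minimal sketch of the thread-plus-queue idea, assuming a hypothetical make_request function standing in for the actual API call; requests are put on a queue and a single worker thread drains it at no more than one per second:

import queue
import threading
import time

request_queue = queue.Queue()

def make_request(item):
    # placeholder for the real API call
    print("requesting", item)

def worker():
    while True:
        item = request_queue.get()
        make_request(item)
        request_queue.task_done()
        time.sleep(1)  # at most one request per second

threading.Thread(target=worker, daemon=True).start()

for i in range(5):
    request_queue.put(i)

request_queue.join()  # wait until all queued requests have been processed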
One can use a rate-limiting Python decorator on the function one wishes to rate-limit, like this one from Greg Burek:
import time

def RateLimited(maxPerSecond):
    minInterval = 1.0 / float(maxPerSecond)
    def decorate(func):
        lastTimeCalled = [0.0]
        def rateLimitedFunction(*args, **kargs):
            # time.clock() was removed in Python 3.8; time.monotonic() serves the same purpose here
            elapsed = time.monotonic() - lastTimeCalled[0]
            leftToWait = minInterval - elapsed
            if leftToWait > 0:
                time.sleep(leftToWait)
            ret = func(*args, **kargs)
            lastTimeCalled[0] = time.monotonic()
            return ret
        return rateLimitedFunction
    return decorate

@RateLimited(2)  # 2 per second at most
def PrintNumber(num):
    print(num)

if __name__ == "__main__":
    print("This should print 1,2,3... at about 2 per second.")
    for i in range(1, 100):
        PrintNumber(i)