I created a simple application, and I realised that my code is running extremely slowly. The application calls the same method over and over again. I tried investigating the problem, and it turned out that calling the same function / method several times resulted in Python sometimes taking 15 milliseconds to execute an empty function (pass).
I'm running Windows 10 Home 64-bit on a Lenovo ThinkPad with an i7 CPU.
The less code the function / method has, the smaller the chance of a 15 ms runtime; however, it never goes away.
Here's the code:
import time

class Clock:
    def __init__(self):
        self.t = time.time()

    def restart(self):
        dt = time.time() - self.t
        self.t = time.time()
        return dt * 1000

def method():
    pass

for i in range(100000):
    c = Clock()
    method()
    dt = c.restart()
    if dt > 1.:
        print(str(i) + ' ' + str(dt))
I'd expect that I never get anything printed out, however an average result looks like this:
6497 15.619516372680664
44412 15.622615814208984
63348 15.621185302734375
On average, 1-4 times out of 100,000, the time elapsed between starting the clock and getting the result (an empty function call plus a simple subtraction and variable assignment) is 15.62... milliseconds, which makes the run time really slow.
Occasionally the elapsed time is 1 millisecond.
Thank you for your help!
In your code you call time.time() twice, and each call requires the system to retrieve the time from the OS. You can read more here:
How does python's time.time() method work?
Since you mentioned you are on Windows, it is probably better for you to use time.clock() instead; I'll defer you to this link, which does a much better job of explaining: https://www.pythoncentral.io/measure-time-in-python-time-time-vs-time-clock/
The link also takes garbage collection's effect on performance into account and shows how to disable it during testing.
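For comparison, here is a minimal sketch of the same measurement done with time.perf_counter(), the high-resolution timer that replaced time.clock() (deprecated in 3.3, removed in 3.8), with garbage collection disabled during the test as the linked article suggests:
import gc
import time

class Clock:
    def __init__(self):
        # perf_counter() has sub-microsecond resolution, unlike time.time(),
        # which on Windows typically only updates every ~15.6 ms
        self.t = time.perf_counter()

    def restart(self):
        dt = time.perf_counter() - self.t
        self.t = time.perf_counter()
        return dt * 1000

gc.disable()  # keep garbage-collection pauses out of the measurement
for i in range(100000):
    c = Clock()
    dt = c.restart()
    if dt > 1.:
        print(str(i) + ' ' + str(dt))
gc.enable()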
Hope it answers your questions!
I'm an amateur coder. I'm working on a small little game for a project in biology, but I have come across an issue in my code. I have a loop that adds +1 to the variable sunlight every two seconds. However, all code below the loop is non-functional now that I have made the loop. I'm guessing it's because it's waiting for the loop to finish. Is there any way to have the loop always run but allow the code to run through its sequence at the same time?
print("Game started!")
sunlight = 0
while True:
time.sleep(2)
sunlight += 1
commands = input("Type stats to see which molecules you have, type carbon to get carbon\ndioxide, and type water to get water: ")
if commands == ("stats"):
print("Sunlight: ",sunlight,"")
As you are a beginner, I would not recommend using multithreading or asyncio. Instead, just record the start time, and when the user enters "stats", the elapsed time // 2 will be equal to sunlight.
import time

start_time = time.time()

while True:
    commands = input("Type stats to see which molecules you have, type carbon to get carbon\ndioxide, and type water to get water: ")
    if commands == ("stats"):
        sunlight = (time.time() - start_time) // 2  # elapsed time // 2
        print("Sunlight: ", sunlight, "")
Your sunlight variable basically functions as a clock; it counts half of the number of seconds since the program begins. Rather than implement your own clock using time.sleep(), it's better to just use an existing clock from the time library.
The function time.monotonic returns a number of seconds, so you can use this to get the current sunlight by saving the start time, then each time you want to know the value of sunlight, take the difference between the current time and the start time, divided by 2.
import time

start_time = time.monotonic()

def get_sunlight():
    current_time = time.monotonic()
    return int(current_time - start_time) // 2
It is better to use the monotonic() function than the clock() function for this purpose, since the clock() function is deprecated as of Python 3.3:
The time.clock() function is deprecated because it is not portable: it behaves differently depending on the operating system.
It's also better than the time() function for this purpose, because changes to the system clock (such as going forwards or back due to daylight savings time) will affect the result of time():
While this function normally returns non-decreasing values, it can return a lower value than a previous call if the system clock has been set back between the two calls.
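Plugged into the original input loop, that could look roughly like this (a sketch that reuses get_sunlight() from above and the question's prompt text):
print("Game started!")
while True:
    commands = input("Type stats to see which molecules you have, type carbon to get carbon\ndioxide, and type water to get water: ")
    if commands == "stats":
        print("Sunlight: ", get_sunlight(), "")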
You should look into the threading library; the reference below is probably a good resource. You can fire off a thread running your sunlight incrementer that updates a global variable (not a good idea in general, but you seem to have just one writer, so you can get by until you have time to pick up more advanced parallel-processing concepts).
Reference: https://www.geeksforgeeks.org/multithreading-python-set-1/
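A rough sketch of that idea using the standard threading module (a daemon thread so the program can still exit; names are just for illustration):
import threading
import time

sunlight = 0

def grow_sunlight():
    global sunlight
    while True:
        time.sleep(2)   # every two seconds...
        sunlight += 1   # ...add one sunlight

# daemon=True keeps this background loop from blocking program exit
threading.Thread(target=grow_sunlight, daemon=True).start()

print("Game started!")
while True:
    commands = input("Type stats to see which molecules you have, type carbon to get carbon\ndioxide, and type water to get water: ")
    if commands == "stats":
        print("Sunlight: ", sunlight, "")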
How do I profile/benchmark an asynchronous Python script (which uses asyncio)?
Usually I would do
totalMem = tracemalloc.get_traced_memory()[0]
totalTime = time.time()
retValue = myFunction()
totalTime = time.time() - totalTime
totalMem = tracemalloc.get_traced_memory()[0] - totalMem
This way I would save the total time taken by the function.
I learned how to use decorators and I did just that - and dumped all stats into a text file for later analysis.
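Such a timing decorator might look roughly like this (just a sketch: the profiled name and the stats.txt file are placeholders, and tracemalloc.start() has to be called beforehand for the memory figures to mean anything):
import functools
import time
import tracemalloc

def profiled(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        startMem = tracemalloc.get_traced_memory()[0]
        startTime = time.time()
        retValue = func(*args, **kwargs)
        totalTime = time.time() - startTime
        totalMem = tracemalloc.get_traced_memory()[0] - startMem
        # dump the stats to a text file for later analysis
        with open('stats.txt', 'a') as statsFile:
            statsFile.write('%s: %f s, %d bytes\n' % (func.__name__, totalTime, totalMem))
        return retValue
    return wrapper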
But when you have an asyncio script, things get pretty different: the function will block while doing an "await aiohttpSession.get()", and control will go back to the event loop, which will run other functions.
This way, the elapsed time and changes in total allocated memory won't reveal anything, because I will have measured more than just that function.
The only way it would work would be something like
class MyTracer:
    def __init__(self):
        self.totalTime = 0
        self.totalMem = 0
        self.startTime = time.time()
        self.startMem = tracemalloc.get_traced_memory()[0]

    def stop(self):
        self.totalTime += time.time() - self.startTime
        self.totalMem += tracemalloc.get_traced_memory()[0] - self.startMem

    def start(self):
        self.startTime = time.time()
        self.startMem = tracemalloc.get_traced_memory()[0]
And now, somehow, insert it in the code:
async def myFunction():
    tracer = MyTracer()
    session = aiohttp.ClientSession()

    # do something

    tracer.stop()
    # the time elapsed here, and the changes in the memory allocation, are not from the current function
    retValue = await (await session.get('https://hoochie-mama.org/cosmo-kramer',
                      headers={
                          'User-Agent': 'YoYo Mama! v3.0',
                          'Cookies': 'those cookies are making me thirsty!',
                      })).text()
    tracer.start()

    # do more things

    tracer.stop()
    # now "tracer" has the info about total time spent in this function, and the memory allocated by it
    # (the memory stats could be negative if the function releases more than allocates)
Is there a way to accomplish this, I mean, profile all my asyncio code without having to insert all this code?
Or is there a module already capable of doing just that?
Check out the Yappi profiler, which has support for coroutine profiling. Their page on coroutine profiling describes the problem you're facing very clearly:
The main issue with coroutines is that, under the hood when a coroutine yields or in other words context switches, Yappi receives a return event just like we exit from the function. That means the time spent while the coroutine is in yield state does not get accumulated to the output. This is a problem especially for wall time as in wall time you want to see whole time spent in that function or coroutine. Another problem is call count. You see every time a coroutine yields, call count gets incremented since it is a regular function exit.
They also describe, at a very high level, how Yappi solves this problem:
With v1.2, Yappi corrects above issues with coroutine profiling. Under the hood, it differentiates the yield from real function exit and if wall time is selected as the clock_type it will accumulate the time and corrects the call count metric.
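A minimal usage sketch (assuming Yappi 1.2+ is installed from PyPI; the coroutine below just stands in for real work such as the aiohttp calls):
import asyncio
import yappi

async def fetch():
    # stands in for an "await aiohttpSession.get()" style call
    await asyncio.sleep(0.1)

async def main():
    await asyncio.gather(fetch(), fetch())

yappi.set_clock_type("wall")       # wall time includes the time spent awaiting
yappi.start()
asyncio.run(main())
yappi.stop()
yappi.get_func_stats().print_all()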
I'm using a Python 3 script to automate some jobs.
I need to measure the time taken by these external jobs, so I decided to use Python 3's built-in time() combined with the subprocess module:
with open(in_files[i],'r') as f, open(sol_files[i],'w') as f_sol:
    start = time.time()
    process = subprocess.run(['./'+src_files[i]], stdin=f, stdout=f_sol)
    end = time.time()
The elapsed time calculated by this Python snippet is 0.73 seconds.
However, the equivalent bash command:
time ./file < input_file > output_file
Is significantly faster: 0.5 seconds
What could be causing this huge discrepancy? Maybe context switching with the Python interpreter due to the redirection? Maybe something related to buffering?
Similar code without the redirection does not show this behavior:
start = time.time()
process = subprocess.run(['sleep','1'])
end = time.time()
The above code reports an elapsed time of 1 s plus a negligible overhead.
Best regards
It was a stupid mistake.
time.time() does not have good precision on most systems.
Note that even though the time is always returned as a floating point number, not all systems provide time with a better precision than 1 second. While this function normally returns non-decreasing values, it can return a lower value than a previous call if the system clock has been set back between the two calls.
Python 3 Time Module Documentation
perf_counter() or process_time() works just fine. Nothing is wrong with subprocess.
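For reference, the same measurement done with perf_counter() (a sketch using the ./file, input_file and output_file names from the bash command above):
import subprocess
import time

with open('input_file', 'r') as f, open('output_file', 'w') as f_sol:
    start = time.perf_counter()
    process = subprocess.run(['./file'], stdin=f, stdout=f_sol)
    end = time.perf_counter()

print('Elapsed: {:.3f} s'.format(end - start))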
I'm currently reading physics at university, and I'm learning Python as a little hobby.
To practise both at the same time, I figured I'll write a little "physics engine" that calculates the movement of an object based on x, y and z coordinates. I'm only going to output the movement as text (at least for now!), but I want the position updates to be real-time.
To do that I need to update the position of an object, let's say a hundred times a second, and print it to the screen. So every 10 ms the program prints the current position.
So if the calculations take 2 ms, the loop must wait 8 ms before it prints and recalculates the next position.
What's the best way to construct a loop like that, and is 100 times a second a reasonable frequency, or would you go slower, like 25 times/sec?
The basic way to wait in python is to import time and use time.sleep. Then the question is, how long to sleep? This depends on how you want to handle cases where your loop misses the desired timing. The following implementation tries to catch up to the target interval if it misses.
import time
import random

def doTimeConsumingStep(N):
    """
    This represents the computational part of your simulation.
    For the sake of illustration, I've set it up so that it takes a random
    amount of time which is occasionally longer than the interval you want.
    """
    r = random.random()
    computationTime = N * (r + 0.2)
    print("...computing for %f seconds..." % (computationTime,))
    time.sleep(computationTime)

def timerTest(N=1):
    repsCompleted = 0
    beginningOfTime = time.perf_counter()
    start = time.perf_counter()
    goAgainAt = start + N
    while 1:
        print("Loop #%d at time %f" % (repsCompleted, time.perf_counter() - beginningOfTime))
        repsCompleted += 1
        doTimeConsumingStep(N)
        # If we missed our interval, iterate immediately and increment the target time
        if time.perf_counter() > goAgainAt:
            print("Oops, missed an iteration")
            goAgainAt += N
            continue
        # Otherwise, wait for next interval
        timeToSleep = goAgainAt - time.perf_counter()
        goAgainAt += N
        time.sleep(timeToSleep)

if __name__ == "__main__":
    timerTest()
Note that you will occasionally miss your desired timing on a normal OS, so code like this is necessary. Even with asynchronous frameworks like tulip and twisted, you can't guarantee timing on a normal operating system.
Since you cannot know in advance how long each iteration will take, you need some sort of event-driven loop. A possible solution would be using the twisted module, which is based on the reactor pattern.
from twisted.internet import task
from twisted.internet import reactor

delay = 0.1

def work():
    print("called")

l = task.LoopingCall(work)
l.start(delay)
reactor.run()
However, as has been noted, don't expect a true real-time responsiveness.
A word of warning: you cannot expect real-time behaviour on a non-real-time system. The sleep family of calls guarantees at least the given delay, but may well delay you for longer.
Therefore, once you return from sleep, query the current time and schedule the next iteration "into the future" (accounting for the calculation time).
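A rough sketch of that idea, scheduling each iteration against a fixed timeline instead of sleeping a fixed amount (update_position() is just a placeholder for the real work):
import time

def update_position():
    pass   # placeholder for the actual calculation and printing

interval = 0.01                            # 10 ms per update
next_tick = time.monotonic() + interval
while True:
    update_position()
    next_tick += interval
    # sleep only for whatever remains of this interval, never a negative amount
    time.sleep(max(0.0, next_tick - time.monotonic()))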
I need to wait for about 25ms in one of my functions. Sometimes this function is called when the processor is occupied with other things and other times it has the processor all to itself.
I've tried time.sleep(.25), but sometimes it's actually 25ms and other times it takes much longer. Is there a way to sleep for an exact amount of time regardless of processor availability?
Because you're working with a preemptive operating system, there's no way you can guarantee that your process will be able to have control of the CPU in 25ms.
If you'd still like to try, it would be better to have a busy loop that polls until 25ms has passed. Something like this might work:
import time

target_time = time.perf_counter() + 0.025
while time.perf_counter() < target_time:
    pass
0.25 seconds are 250 ms, not 25. Apart from this, there is no way to wait for exactly 25 ms on common operating systems – you would need some real-time operating system.
What system are you on? If you're on Windows you may want to do something like this for exact timing:
import ctypes
kernel32 = ctypes.windll.kernel32

# This sets the priority of the thread to realtime--the same priority as the mouse pointer.
kernel32.SetThreadPriority(kernel32.GetCurrentThread(), 31)

# This creates a timer. This only needs to be done once.
timer = kernel32.CreateWaitableTimerA(ctypes.c_void_p(), True, ctypes.c_void_p())

# The kernel measures in 100-nanosecond intervals, and a negative due time means a
# relative delay, so .25 seconds is -.25 * 10000000 intervals.
delay = ctypes.c_longlong(int(-.25 * 10000000))

kernel32.SetWaitableTimer(timer, ctypes.byref(delay), 0, ctypes.c_void_p(), ctypes.c_void_p(), False)
kernel32.WaitForSingleObject(timer, 0xffffffff)
This code will pretty much guarantee your process will sleep .25 seconds. Watch out, though--you may want to lower the priority to 2 or 3 unless it's absolutely critical that this sleeps for .25 seconds. Certainly don't set the priority too high for an end-user product.
Edit: in Windows 10 this nonsense seems unnecessary. Try it like so:
>>> from time import sleep
>>> import timeit
>>> '%.2f%% overhead' % (timeit.timeit('sleep(0.025)', number=100, globals=globals()) / 0.025 - 100)
'0.29% overhead'
.29%, or thereabout, is fairly low overhead, and usually more than accurate enough.
Previous Windows versions default to a much coarser timer resolution (about 15.6 ms on modern NT-based systems, and historically as much as 55 ms), which means your 25 ms sleep call can take noticeably longer than requested. To get the sleep resolution down to 1 millisecond, you need to set the resolution used by Windows by calling timeBeginPeriod:
import ctypes
winmm = ctypes.WinDLL('winmm')
winmm.timeBeginPeriod(1)
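Keep in mind that every timeBeginPeriod call should eventually be matched by a timeEndPeriod call with the same value, so the system can return to its default resolution; a sketch:
import ctypes
import time

winmm = ctypes.WinDLL('winmm')
winmm.timeBeginPeriod(1)    # request 1 ms timer resolution
try:
    time.sleep(0.025)       # now lands much closer to 25 ms
finally:
    winmm.timeEndPeriod(1)  # restore the default resolution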
Another solution for accurate timings and delays is to use the perf_counter() function from the time module. It is especially useful on Windows, as time.sleep is not accurate to the millisecond. See the example below, where the function accurate_delay creates a delay in milliseconds.
import time

def accurate_delay(delay):
    ''' Function to provide accurate time delay in millisecond
    '''
    _ = time.perf_counter() + delay/1000
    while time.perf_counter() < _:
        pass

delay = 10

t_start = time.perf_counter()
print('Wait for {:.0f} ms. Start: {:.5f}'.format(delay, t_start))
accurate_delay(delay)
t_end = time.perf_counter()
print('End time: {:.5f}. Delay is {:.5f} ms'.
      format(t_end, 1000*(t_end - t_start)))
sum = 0
ntests = 1000
for _ in range(ntests):
    t_start = time.perf_counter()
    accurate_delay(delay)
    t_end = time.perf_counter()
    print('Test completed: {:.2f}%'.format(_/ntests * 100), end='\r', flush=True)
    sum = sum + 1000*(t_end - t_start) - delay

print('Average difference in time delay is {:.5f} ms.'.format(sum/ntests))
What you intend to build is a real-time application. Python (and probably the OS you are using) is not intended for this kind of application, where the timing constraints are so strict.
To achieve what you are looking for, you need an RTOS (Real-Time Operating System) and to develop your application in a suitable programming language (usually C) following real-time best practices.
From the docs of the sleep method:
Suspend execution for the given number of seconds. The argument may be
a floating point number to indicate a more precise sleep time. The
actual suspension time may be less than that requested because any
caught signal will terminate the sleep() following execution of that
signal’s catching routine. Also, the suspension time may be longer
than requested by an arbitrary amount because of the scheduling of
other activity in the system.
The fact is that it depends on your underlying OS.