Could someone give me some help with limiting a loop to N iterations per minute in Python?
Let's say I have:
limit = 5
for items in recvData:
    # > limit iterations in the past minute? -> sleep for 60 seconds from the last iteration before proceeding?
    ... do work ...
How would I do the time check / sleeping to give the correct flow? I'm not worried about blocking the executing thread/process while it waits.
Thanks
It should be noted that this is not "hard real time" code. This will be off slightly because of OS scheduling and the like. That being said, unless you know that you need hard real time, this should suffice.
import time

limit = 5
starttime = time.time()
for i, item in enumerate(recvData):
    if not (i + 1) % limit:              # every `limit` iterations...
        sleeptime = starttime + 60 - time.time()
        if sleeptime > 0:
            time.sleep(sleeptime)        # ...sleep out the rest of the minute
        starttime = time.time()
    # processing code
Use grouper from the itertools recipes, combined with a time check.
import itertools, datetime, time

limit = 5

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return itertools.izip_longest(fillvalue=fillvalue, *args)  # zip_longest on Python 3

for items in grouper(limit, recvData):
    prev_minute = datetime.datetime.now().minute
    for item in items:                   # note: the last group may be padded with fillvalue
        pass                             # do stuff
    now = datetime.datetime.now()
    if now.minute == prev_minute:        # still in the same minute: wait it out
        time.sleep(60 - now.second)
This very much depends on the type of work you're doing inside the loop and on how accurately you want this mechanism to work.
At the moment I can come up with two possible schemes:
If one iteration takes roughly constant time, then you could calculate an average from the first few iterations and sleep for (period - iterationTime) afterwards.
Otherwise, you can poll the time and recalculate the average every step (or every few steps).
Depending on the standard deviation of your single-iteration execution times, both schemes can work quite well, but if the execution times vary a lot, neither of them will. Also, if you want evenly distributed loop cycles, and not just to hold the average rate, you have to distribute the sleeps and do one after each iteration (a sketch of this follows below).
I am not familiar enough with Python to know how expensive it is to query the time, or what other Python-specific issues might pop up with sleeping, though.
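As a rough illustration, here is a minimal sketch of the polling variant, assuming the recvData iterable from the question and a hypothetical do_work function; the 5-per-minute period is also an assumption:

import time

period = 60.0 / 5                # assumed target: 5 iterations per minute
durations = []                   # observed per-iteration run times

for item in recvData:            # recvData as in the question
    t0 = time.time()
    do_work(item)                # hypothetical stand-in for the loop body
    durations.append(time.time() - t0)
    avg = sum(durations) / len(durations)
    if period > avg:
        time.sleep(period - avg) # sleep out the remainder of the period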
This would be the most appropriate answer:
What's a good rate limiting algorithm?
I especially like the second answer using the decorator!
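In that spirit, a rate-limiting decorator might look something like the sketch below. This is my own illustration, not the linked answer's exact code; rate_limited and process are made-up names:

import time
import functools

def rate_limited(max_per_minute):
    # enforce a minimum gap between successive calls of the wrapped function
    min_interval = 60.0 / max_per_minute
    def decorator(func):
        last_called = [0.0]          # mutable cell so the wrapper can update it
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = min_interval - (time.time() - last_called[0])
            if wait > 0:
                time.sleep(wait)
            last_called[0] = time.time()
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limited(5)                     # at most 5 calls per minute
def process(item):
    pass                             # do work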
Related
I am trying to log data at a high sampling rate using a Raspberry Pi 3 B+. In order to achieve a fixed sampling rate, I am delaying the while loop, but I always get a sample rate that is a little less than I specify.
For 2500 Hz I get ~2450 Hz
For 5000 Hz I get ~4800 Hz
For 10000 Hz I get ~9300 Hz
Here is the code that I use to delay the while loop:
import time

count = 0
while True:
    sample_rate = 5000
    time_start = time.perf_counter()
    count += 1
    while (time.perf_counter() - time_start) < (1 / sample_rate):
        pass
    if count == sample_rate:
        print(1 / (time.perf_counter() - time_start))
        count = 0
I have also tried updating to Python 3.7 and used time.perf_counter_ns(), but it does not make a difference.
The problem you are seeing is that your code restarts the delay timer from the current real time on every pass through the loop, so time spent in untimed code, plus jitter from OS multitasking, accumulates and drags the overall rate below what you want.
To greatly increase the timing accuracy, use the fact that each loop "should" finish one period (1/sample_rate) after it should have started. Maintain that start time as an absolute calculation rather than re-reading the real time, and wait until one period after that absolute start time; then there is no drift in the timing.
I put your timing into timing_orig and my revised code using absolute times into timing_new; results are below.
import time

def timing_orig(ratehz, timefun=time.clock):
    count = 0
    while True:
        sample_rate = ratehz
        time_start = timefun()
        count += 1
        while (timefun() - time_start) < (1.0 / sample_rate):
            pass
        if count == ratehz:
            break

def timing_new(ratehz, timefun=time.clock):
    count = 0
    delta = (1.0 / ratehz)
    # record the start of the sequence of timed periods
    time_start = timefun()
    while True:
        count += 1
        # this period ends delta from "now" (now is the time_start PLUS a number of deltas)
        time_next = time_start + delta
        # wait until the end time has passed
        while timefun() < time_next:
            pass
        # calculate the idealised "now" as delta from the start of this period
        time_start = time_next
        if count == ratehz:
            break

def timing(functotime, ratehz, ntimes, timefun=time.clock):
    starttime = timefun()
    for n in range(int(ntimes)):
        functotime(ratehz, timefun)
    endtime = timefun()
    # print endtime-starttime
    return ratehz * ntimes / (endtime - starttime)

if __name__ == '__main__':
    print "new 5000", timing(timing_new, 5000.0, 10.0)
    print "old 5000", timing(timing_orig, 5000.0, 10.0)
    print "new 10000", timing(timing_new, 10000.0, 10.0)
    print "old 10000", timing(timing_orig, 10000.0, 10.0)
    print "new 50000", timing(timing_new, 50000.0, 10.0)
    print "old 50000", timing(timing_orig, 50000.0, 10.0)
    print "new 100000", timing(timing_new, 100000.0, 10.0)
    print "old 100000", timing(timing_orig, 100000.0, 10.0)
Results:
new 5000 4999.96331002
old 5000 4991.73952992
new 10000 9999.92662005
old 10000 9956.9314274
new 50000 49999.6477761
old 50000 49591.6104893
new 100000 99999.2172809
old 100000 94841.227219
Note I didn't use time.sleep() because it introduced too much jitter. Also, note that even though this minimal example shows very accurate timing even up to 100khz on my Windows laptop, if you put more code into the loop than there is time to execute, the timing will run correspondingly slow.
Apologies: I used Python 2.7, which doesn't have the very convenient time.perf_counter() function. On Python 3, add an extra parameter timefun=time.perf_counter (the function itself, not its result) to each of the calls to timing(), and on 3.8+ also replace the time.clock defaults, since time.clock has been removed.
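For example, on Python 3 the driver lines would become (note the function itself is passed, without parentheses, and the prints gain parentheses):

print("new 5000", timing(timing_new, 5000.0, 10.0, timefun=time.perf_counter))
print("old 5000", timing(timing_orig, 5000.0, 10.0, timefun=time.perf_counter))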
I think you can fix this pretty easily by rearranging your code as such:
import time

count = 0
sample_rate = 5000
while True:
    time_start = time.perf_counter()
    # do all the real stuff here
    while (time.perf_counter() - time_start) < (1 / sample_rate):
        pass
This way Python does the waiting after you execute the code, rather than before, so the time the interpreter takes to run your code is not added on top of each sample period. As danny said, it's an interpreted language, so that might introduce timing inconsistencies, but this way should at least decrease the effect you are seeing.
Edit for proof that this works:
import sys
import time

count = 0
sample_rate = int(sys.argv[1])
run_start = time.time()
while True:
    time_start = time.time()
    a = range(10)
    b = range(10)
    for x in a:
        for y in b:
            c = a + b
    count += 1
    if count == sample_rate * 2:
        break
    while (time.time() - time_start) < (1.0 / sample_rate):
        pass
real_rate = sample_rate * 2 / (time.time() - run_start)
print real_rate, real_rate / sample_rate
So the testing code does a solid amount of random junk for two seconds' worth of samples and then prints the real rate and the fraction of the requested rate actually achieved. Here are some results:
~ ><> python t.py 1000
999.378471674 0.999378471674
~ ><> python t.py 2000
1995.98713838 0.99799356919
~ ><> python t.py 5000
4980.90553757 0.996181107514
~ ><> python t.py 10000
9939.73553783 0.993973553783
~ ><> python t.py 40000
38343.706669 0.958592666726
So, not perfect. But definitely better than a ~700Hz drop at a desired 10000. The accepted answer is definitely the right one.
import time

def find(a):
    count = 0
    for item in a:
        count = count + 1
        if item == 2:
            return count

a = [7,4,5,10,3,5,88,5,5,5,5,5,5,5,5,5,5,55,
     5,5,5,5,5,5,5,5,5,5,5,5,55,5,5,5,5,5,
     5,5,5,5,5,2,5,5,5,55,5,55,5,5,5,6]

print (len(a))
sTime = time.time()
print (find(a))
eTime = time.time()
ave = eTime - sTime
print (ave)
I want to measure the execution time of this function.
Why does my print (ave) show 0?
To accurately time code execution you should use the timeit module rather than time. timeit easily allows repeating a block of code many times for timing, which avoids the near-zero results that are the cause of your question.
import timeit

s = """
def find(a):
    count = 0
    for item in a:
        count = count + 1
        if item == 2:
            return count

a = [7,4,5,10,3,5,88,5,5,5,5,5,5,5,5,5,5,55,5,5,5,5,5,5,5,5,5,5,5,5,55,5,5,5,5,5,5,5,5,5,5,2,5,5,5,55,5,55,5,5,5,6]
find(a)
"""

print(timeit.timeit(stmt=s, number=100000))
This will measure the amount of time it takes to run the code in multiline string s 100,000 times. Note that I replaced print(find(a)) with just find(a) to avoid having the result printed 100,000 times.
Running many times is advantageous for several reasons:
In general, code runs very quickly. Summing many quick runs results in a number which is actually meaningful and useful
Run time is dependent on many variable, uncontrollable factors (such as other processes using computing power). Running many times helps normalize this
If you are using timeit to compare two methodologies to see which is faster, multiple runs will make it easier to see the conclusive result
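As a side note, timeit.timeit also accepts a callable instead of a source string, which avoids quoting the code. A minimal sketch, assuming find and a are already defined in the enclosing module:

import timeit

# time the call directly; the lambda closes over find and a
print(timeit.timeit(lambda: find(a), number=100000))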
I'm not sure, either; I get a time about 1.4E-5.
Try putting the call into a loop to measure more iterations:
for i in range(10000):
    result = find(a)
print(result)
I'd like to create a revenue counter for the sales team at work and would love to use Python. E.g. Joe Bloggs shifts his target from 22.1 to 23.1 (a difference of 1.0). I'd like the counter to tick evenly from 22.1 to 23.1 over an hour.
I've created this script, which works fine for counting a minute (it runs 2 seconds over the minute); however, when it's supposed to run for an hour, it runs for only 47 minutes.
Question: Does anyone know why it runs faster when I set it to an hour? Is time.sleep inaccurate?
import time

def rev_counter(time_length):
    time_start = (time.strftime("%H:%M:%S"))
    prev_pp = 22.1
    new_pp = 23.1
    difference = new_pp - prev_pp
    iter_difference = (difference / 100000.)  # divide by 100,000 to show 10 decimal places
    time_difference = ((time_length / difference) / 100000.)
    i = prev_pp
    while i < new_pp:
        print("%.10f" % i)
        i = i + iter_difference
        time.sleep(time_difference)
    time_end = (time.strftime("%H:%M:%S"))
    print "Time started at", time_start
    print "Time ended at", time_end

rev_counter(60)    # 60 seconds. Returns 62 seconds
rev_counter(600)   # 10 minutes. Returns 10 minutes, 20 secs
rev_counter(3600)  # 1 hour. Returns 47 minutes
Please note this quote from the Python documentation for time.sleep()
The actual suspension time may be less than that requested because any
caught signal will terminate the sleep() following execution of that
signal's catching routine. Also, the suspension time may be longer
than requested by an arbitrary amount because of the scheduling of
other activity in the system.
As a suggestion, if faced with this problem, I would use a variable to track the time that the interval starts. When sleep wakes up, check to see if the expected time has elapsed. If not, restart a sleep for the difference, etc.
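A minimal sketch of that suggestion (the sleep_until helper name is mine, not from any library):

import time

def sleep_until(deadline):
    # deadline is an absolute time.time() value; keep re-sleeping
    # until it has actually been reached
    while True:
        remaining = deadline - time.time()
        if remaining <= 0:
            return
        time.sleep(remaining)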
First of all, your loop doesn't contain only sleep statements; the things you do between the calls to time.sleep take time, too. So with 10 repetitions you'll spend only 10% as much time on those as when you have 100 iterations through your loop.
Is time.sleep inaccurate?
Yes. Or well. Quite.
I come from a real-time signal processing background. PC clocks are only somewhat accurate, and the time you spend in your OS, your standard libraries, your scripting language run time and your scripting logic between the point in time when a piece of hardware notifies you that your time has elapsed and the point in time your software notices is significant.
I just noticed time.sleep taking way too long (5 to 30,000 times longer than requested for input values between .0001 and 1 second), and searching for an answer, found this thread. I ran some tests and it is consistently doing this (see code and results below). The weird thing is, after I restarted, it was back to normal and working very accurately. When code started to hang, time.sleep was taking 10,000 times too long?!
So a restart is a temporary solution, but not sure what the cause is/ permanent solution is.
import numpy as np
import time

def test_sleep(N, w):
    data = []
    for i in xrange(N):
        t0 = time.time()
        time.sleep(w)
        t1 = time.time()
        data.append(t1 - t0)
    print "ave = %s, min = %s, max = %s" % (np.average(data), np.min(data), np.max(data))
    return data
data1 = test_sleep(20,.0001)
Out: ave = 2.95489487648, min = 1.11787080765, max = 3.23506307602
print data1
Out: [3.1929759979248047,
3.121081829071045,
3.1982388496398926,
3.1221959590911865,
3.098078966140747,
3.131525993347168,
3.12644100189209,
3.1535091400146484,
3.2167508602142334,
3.1277999877929688,
3.1103289127349854,
3.125699996948242,
3.1129801273345947,
3.1223208904266357,
3.1313750743865967,
3.1280829906463623,
1.117870807647705,
1.3357980251312256,
3.235063076019287,
3.189779043197632]
data2 = test_sleep(20, 1)
Out: ave = 9.44276217222, min = 1.00008392334, max = 10.9998381138
print data2
Out: [10.999573945999146,
10.999622106552124,
3.8115758895874023,
1.0000839233398438,
3.3502109050750732,
10.999613046646118,
10.99983811378479,
10.999617099761963,
10.999662160873413,
10.999619960784912,
10.999650955200195,
10.99962306022644,
10.999721050262451,
10.999620914459229,
10.999532222747803,
10.99965500831604,
10.999596118927002,
10.999563932418823,
10.999600887298584,
4.6992621421813965]
I have always been curious about this as the simple way is definitely not efficient. How would you efficiently go about outputting a value every x seconds?
Here is an example of what I mean:
import time

num = 50000000

startTime = time.time()
j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
print time.time() - startTime
# output time: 24 seconds

startTime = time.time()
newTime = time.time()
j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
    if time.time() - newTime > 0.5:
        newTime = time.time()
        print i
print time.time() - startTime
# output time: 32 seconds
A whole third slower when outputting the progress every half a second.
I know this is because it requires an extra calculation every loop, but the same applies to other similar checks you may want to do. How would you go about implementing something like this without seriously affecting the execution time?
Well, you know that you're doing many iterations per second, so you really don't need to make the time.time() call on every iteration. You can use a modulo operator to only actually check if you need to output something every N iterations of the loop.
startTime = time.time()
newTime = time.time()
j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
    if i % 50 == 0:  # only check every 50th iteration
        if time.time() - newTime > 0.5:
            newTime = time.time()
            print i, newTime
print time.time() - startTime
# 45 seconds (the original version took 42 on my system)
Checking only every 50th iteration reduces my run time from 56 seconds to 43 (the original with no printing took 42, and Tom Page's solution took 50 seconds), and the iterations complete quickly enough that it still outputs exactly every 0.5 seconds according to time.time():
0 1409083225.39
605000 1409083225.89
1201450 1409083226.39
1821150 1409083226.89
2439250 1409083227.39
3054400 1409083227.89
3644100 1409083228.39
4254350 1409083228.89
4831600 1409083229.39
5433450 1409083229.89
6034850 1409083230.39
6644400 1409083230.89
7252650 1409083231.39
7840100 1409083231.89
8438300 1409083232.39
9061200 1409083232.89
9667350 1409083233.39
...
You might save a few clock cycles by keeping track of the next time that a print is due:
nexttime = time.time() + 0.5
Then your condition will be a simple comparison:
if time.time() >= nexttime
as opposed to a subtraction followed by a comparison:
if time.time() - newTime > 0.5
You'll only have to do an addition (nexttime += 0.5) after each message, as opposed to a subtraction after each iteration.
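Applied to the loop from the question, the bookkeeping would look roughly like this (a sketch along the lines described above; I haven't timed it):

import time

num = 50000000
nexttime = time.time() + 0.5
j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
    if time.time() >= nexttime:
        print i                  # Python 2 print, as in the question
        nexttime += 0.5          # one addition per message, not per iteration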
I tried it with a sideband thread doing the printing. It added 5 seconds to the execution time on Python 2.x, but virtually no extra time on Python 3.x; Python 2.x threads have a lot of overhead. Here's my example with timings included as comments:
import time
import threading

def showit(event):
    global i  # could pass in a mutable object instead
    while not event.is_set():
        event.wait(.5)
        print 'value is', i

num = 50000000

startTime = time.time()
j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
print time.time() - startTime
# output time: 23 seconds

event = threading.Event()
showit_thread = threading.Thread(target=showit, args=(event,))
showit_thread.start()

startTime = time.time()
j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
event.set()
time.sleep(.1)
print time.time() - startTime
# output time: 28 seconds
If you want to wait a specified period of time before doing something, just use the time.sleep() method.
import time

for i in range(100):
    print(i)
    time.sleep(0.5)
This will wait half a second before printing the next value of i.
If you don't care about Windows, signal.setitimer will be simpler than using a background thread, and on many *nix platforms a whole lot more efficient.
Here's an example:
import signal
import time

num = 50000000
startTime = time.time()

def ontimer(sig, frame):
    global i
    print(i)

signal.signal(signal.SIGVTALRM, ontimer)
signal.setitimer(signal.ITIMER_VIRTUAL, 0.5, 0.5)

j = 0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5

signal.setitimer(signal.ITIMER_VIRTUAL, 0)
print(time.time() - startTime)
This is about as close to free as you're going to get performance-wise.
In some use cases, a virtual timer isn't sufficiently accurate, so you need to change that to ITIMER_REAL and change the signal to SIGALRM. That's a little more expensive, but still pretty cheap, and still dead simple.
On some (older) *nix platforms, alarm may be more efficient than setitimer, but unfortunately alarm only takes integral seconds, so you can't use it to fire twice per second.
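For completeness, here is what the ITIMER_REAL variant mentioned above might look like as a standalone sketch (*nix-only; the tick message is mine):

import signal

def ontimer(sig, frame):
    print('tick')                # fires every 0.5 s of wall-clock time

signal.signal(signal.SIGALRM, ontimer)
signal.setitimer(signal.ITIMER_REAL, 0.5, 0.5)

j = 0
for i in range(5000000):
    j = (((j + 10) ** 0.5) ** 2) ** 0.5

signal.setitimer(signal.ITIMER_REAL, 0)   # cancel the timer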
Timings from my MacBook Pro:
no output: 15.02s
SIGVTALRM: 15.03s
SIGALRM: 15.44s
thread: 19.9s
checking time.time(): 22.3s
(I didn't test with either dano's optimization or Tom Page's; obviously those will reduce the 22.3, but they're not going to get it down to 15.44…)
Part of the problem here is that you're using time.time.
On my MacBook Pro, time.time takes more than 1/3rd as long as all of the work you're doing:
In [2]: %timeit time.time()
10000000 loops, best of 3: 105 ns per loop
In [3]: %timeit (((j+10)**0.5)**2)**0.5
1000000 loops, best of 3: 268 ns per loop
And that 105ns is fast for time—e.g., an older Windows box with no better hardware timer than ACPI can take 100x longer.
On top of that, time.time is not guaranteed to have enough precision to do what you want anyway:
Note that even though the time is always returned as a floating point number, not all systems provide time with a better precision than 1 second.
Even on platforms where it has better precision than 1 second, it may have a lower accuracy; e.g., it may only be updated once per scheduler tick.
And time isn't even guaranteed to be monotonic; on some platforms, if the system time changes, time may go down.
Calling it less often will solve the first problem, but not the others.
So, what can you do?
Unfortunately, there's no built-in answer, at least not with Python 2.7. The best solution is different on different platforms—probably GetTickCount64 on Windows, clock_gettime with the appropriate clock ID on most modern *nixes, gettimeofday on most other *nixes. These are relatively easy to use via ctypes if you don't want to distribute a C extension… but someone really should wrap it all up in a module and post it on PyPI, and unfortunately I couldn't find one…
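One addendum for readers on Python 3.3+: the standard library has since added time.monotonic() and time.perf_counter(), which address exactly these problems, so the ctypes route is only needed on 2.x. For example:

import time

t0 = time.perf_counter()         # high resolution, guaranteed monotonic
for _ in range(1000000):
    pass                         # stand-in for the timed work
print(time.perf_counter() - t0)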
I have to time the implementation I did of an algorithm in one of my classes, and I am using the time.time() function to do so. After implementing it, I have to run that algorithm on a number of data files which contains small and bigger data sets in order to formally analyse its complexity.
Unfortunately, on the small data sets I get a runtime of 0 seconds, even though the function shows a precision of 0.000000000000000001 on the runtimes of the bigger data sets, and I cannot believe that the smaller sets really take less time than that.
My question is: Is there a problem using this function (and if so, is there another function I can use that has a better precision)? Or am I doing something wrong?
Here is my code if ever you need it:
import sys, time
import random
from utility import parseSystemArguments, printResults
...

def main(ville):
    start = time.time()
    solution = dynamique(ville)  # algorithm implementation
    end = time.time()
    return (end - start, solution)

if __name__ == "__main__":
    sys.argv.insert(1, "-a")
    sys.argv.insert(2, "3")
    (algoNumber, ville, printList) = parseSystemArguments()
    (algoTime, solution) = main(ville)
    printResults(algoTime, solution, printList)
The printResults function:
def printResults(time, solution, printList=True):
    print ("Temps d'execution = " + str(time) + "s")
    if printList:
        print (solution)
The solution to my problem was to use the timeit module instead of the time module.
import timeit
...

def main(ville):
    start = timeit.default_timer()
    solution = dynamique(ville)
    end = timeit.default_timer()
    return (end - start, solution)
Don't confuse the resolution of the system time with the resolution of a floating-point number. The time resolution on a computer is only as fine as the rate at which the system clock is updated, and how often that happens varies from machine to machine. So to be sure of seeing a nonzero difference with time, you need the measured code to run for a millisecond or more. Try putting it into a loop like this:
start = time.time()
k = 100000
for i in range(k):
    solution = dynamique(ville)
end = time.time()
return ((end - start) / k, solution)
In the final tally, you then need to divide by the number of loop iterations to know how long your code actually runs once through. You may need to increase k to get a good measure of the execution time, or you may need to decrease it if your computer is running in the loop for a very long time.