Implement sub millisecond processing in python without busywait - python

How would I implement processing of an array with millisecond precision using Python under Linux (running on a single-core Raspberry Pi)?
I am trying to parse information from a MIDI file, which has been preprocessed into an array. Every millisecond I check whether the array has entries at the current timestamp and trigger some functions if it does.
Currently I am using time.time() and busy waiting (as concluded here). This eats up all the CPU, so I am looking for a better solution.
# iterate through all milliseconds
for current_ms in xrange(0, last+1):
    start = time()
    # check if events are to be processed
    try:
        events = allEvents[current_ms]
        # iterate over all events for this millisecond
        for event in events:
            # check if event contains note information
            if 'note' in event:
                # check if mapping to pin exists
                if event['note'] in mapping:
                    pin = mapping[event['note']]
                    # check if event contains on/off information
                    if 'mode' in event:
                        if event['mode'] == 0:
                            pin_off(pin)
                        elif event['mode'] == 1:
                            pin_on(pin)
                        else:
                            debug("unknown mode in event:" + str(event))
                else:
                    debug("no mapping for note:" + str(event['note']))
    except:
        pass
    end = time()
    # fill the rest of the millisecond
    while (end-start) < (1.0/(1000.0)):
        end = time()
where last is the millisecond of the last event (known from preprocessing).
This is not a question about time() vs clock(), but about sleep vs busy wait.
I can't really sleep in the "fill rest of millisecond" loop, because the accuracy of sleep() is too low. If I were to use ctypes, how would I go about it correctly?
Is there some Timer library which would call a callback each millisecond reliably?
My current implementation is on GitHub. With this approach I get a skew of around 4 or 5 ms on the drum_sample, which is 3.7 s in total (with a mock, so no real hardware attached). On a 30.7 s sample, the skew is around 32 ms (so it's at least not linear!).
I have tried using time.sleep() and nanosleep() via ctypes with the following code
import time
import timeit
import ctypes
libc = ctypes.CDLL('libc.so.6')
class Timespec(ctypes.Structure):
    """ timespec struct for nanosleep, see:
        http://linux.die.net/man/2/nanosleep """
    _fields_ = [('tv_sec', ctypes.c_long),
                ('tv_nsec', ctypes.c_long)]
libc.nanosleep.argtypes = [ctypes.POINTER(Timespec),
                           ctypes.POINTER(Timespec)]
nanosleep_req = Timespec()
nanosleep_rem = Timespec()
def nsleep(us):
    """ Delay microseconds with libc nanosleep() using ctypes. """
    #print('nsleep: {0:.9f}'.format(us))
    if (us >= 1000000):
        sec = int(us // 1000000)  # tv_sec is a c_long, so it needs an int
        us %= 1000000
    else:
        sec = 0
    nanosleep_req.tv_sec = sec
    nanosleep_req.tv_nsec = int(us * 1000)
    libc.nanosleep(nanosleep_req, nanosleep_rem)
LOOPS = 10000
def do_sleep(min_sleep):
    #print('try: {0:.9f}'.format(min_sleep))
    total = 0.0
    for i in xrange(0, LOOPS):
        start = timeit.default_timer()
        nsleep(min_sleep*1000*1000)
        #time.sleep(min_sleep)
        end = timeit.default_timer()
        total += end - start
    return (total / LOOPS)
iterations = 5
iteration = 1
min_sleep = 0.001
result = None
while True:
    result = do_sleep(min_sleep)
    #print('res: {0:.9f}'.format(result))
    if result > 1.5 * min_sleep:
        if iteration > iterations:
            break
        else:
            min_sleep = result
            iteration += 1
    else:
        min_sleep /= 2.0
print('FIN: {0:.9f}'.format(result))
The result on my i5 is
FIN: 0.000165443
while on the RPi it is
FIN: 0.000578617
which suggests a minimum sleep period of about 0.1 or 0.5 milliseconds. Given the jitter (it tends to sleep longer), that at most helps me reduce the load a little bit.

One possible solution, using the sched module:
import sched
import time
def f(t0):
    print 'Time elapsed since t0:', time.time() - t0
s = sched.scheduler(time.time, time.sleep)
t0 = time.time()  # reference time for all absolute deadlines
for i in range(10):
    s.enterabs(t0 + 10 + i, 0, f, (t0,))
s.run()
Result:
Time elapsed since t0: 10.0058200359
Time elapsed since t0: 11.0022959709
Time elapsed since t0: 12.0017120838
Time elapsed since t0: 13.0022599697
Time elapsed since t0: 14.0022521019
Time elapsed since t0: 15.0015859604
Time elapsed since t0: 16.0023040771
Time elapsed since t0: 17.0023028851
Time elapsed since t0: 18.0023078918
Time elapsed since t0: 19.002286911
Apart from a constant offset of about 2 milliseconds (which you could calibrate), the jitter seems to be on the order of 1 or 2 milliseconds (as reported by time.time itself). Not sure if that is good enough for your application.
If you need to do some useful work in the meantime, you should look into multi-threading or multi-processing.
Note: a standard Linux distribution that runs on an RPi is not a hard real-time operating system. Python can also show non-deterministic timing, e.g. when it starts a garbage collection. So your code might run fine with low jitter most of the time, but you might have occasional 'hiccups' where there is a bit of delay.
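For the MIDI case in the question, the same absolute-deadline idea can be applied without visiting every millisecond: sleep until the next timestamp that actually has events. The following is only a minimal sketch under assumptions: play_events and handle are hypothetical names, the events are assumed to live in a dict keyed by millisecond offset, and the accuracy is still limited by how late time.sleep() wakes up on the given system.

```python
import time

def play_events(all_events, handle, clock=time.monotonic, sleep=time.sleep):
    """Dispatch events keyed by millisecond offset, sleeping between them.

    all_events: dict mapping a millisecond offset to a list of events.
    handle:     callback invoked once per event (hypothetical stand-in
                for the pin_on/pin_off logic in the question).
    """
    start = clock()
    for offset_ms in sorted(all_events):
        # Absolute deadline measured from the fixed start time, so a late
        # wake-up in one iteration does not push back the following ones.
        deadline = start + offset_ms / 1000.0
        remaining = deadline - clock()
        if remaining > 0:
            sleep(remaining)
        for event in all_events[offset_ms]:
            handle(event)
```

Because the clock and sleep functions are parameters, the loop can be exercised with a fake clock, and the final stretch before each deadline could be replaced by a short busy wait if the sleep jitter turns out to be too large.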

Related

How to increase sleep/pause timing accuracy in python?

I ran an experiment to compare sleep/pause timing accuracy in python and C++
Experiment summary:
In a loop of 1000000 iterations, sleep 1 microsecond in each iteration.
Expected duration: 1.000000 second (for 100% accurate program)
In python:
import pause
import datetime
import time
start = time.time()
dt = datetime.datetime.now()
for i in range(1000000):
    dt += datetime.timedelta(microseconds=1)
    pause.until(dt)
end = time.time()
print(end - start)
Expected: 1.000000 sec, Actual (approximate): 2.603796
In C++:
#include <iostream>
#include <chrono>
#include <thread>
using namespace std;
using usec = std::chrono::microseconds;
using clk = std::chrono::steady_clock;
using datetime = clk::time_point;
int main()
{
    datetime dt;
    usec timedelta = static_cast<usec>(1);
    dt = clk::now();
    const auto start = dt;
    for(int i=0; i < 1000000; ++i) {
        dt += timedelta;
        this_thread::sleep_until(dt);
    }
    const auto end = clk::now();
    chrono::duration<double> elapsed_seconds = end - start;
    cout << elapsed_seconds.count();
    return 0;
}
Expected: 1.000000 sec, Actual (approximate): 1.000040
It is obvious that C++ is much more accurate, but I am developing a project in python and need to increase the accuracy. Any ideas?
P.S It's OK if you suggest another python library/technique as long as it is more accurate :)
The problem is not only that Python's sleep timer is inaccurate, but also that each part of the loop takes some time.
Your original code has a run-time of ~1.9528656005859375 on my system.
If I only run this part of your code without any sleep:
for i in range(100000):
    dt += datetime.timedelta(microseconds=1)
Then the required time for that loop is already ~0.45999741554260254.
If I only run
for i in range(1000000):
    pause.milliseconds(0)
Then the run-time of the code is ~0.5583224296569824.
Always using the same date:
dt = datetime.datetime.now()
for i in range(1000000):
    pause.until(dt)
results in a run-time of ~1.326077938079834.
If you do the same with the timestamp:
dt = datetime.datetime.now()
ts = dt.timestamp()
for i in range(1000000):
    pause.until(ts)
Then the run-time changes to ~0.36722803115844727.
And if you increment the timestamp by one microsecond:
dt = datetime.datetime.now()
ts = dt.timestamp()
for i in range(1000000):
    ts += 0.000001
    pause.until(ts)
Then you get a run-time of ~0.9536933898925781.
That it is smaller than 1 is due to floating-point inaccuracy: adding print(ts-dt.timestamp()) after the loop shows ~0.95367431640625, so the pause duration itself is correct, but ts += 0.000001 accumulates an error.
You will get the best result if you count the iterations and add iterationCount/1000000 to the start time:
dt = datetime.datetime.now()
ts = dt.timestamp()
for i in range(1000000):
    pause.until(ts+i/1000000)
And this results in ~1.000023365020752.
So in my case pause itself would already allow an accuracy of better than 1 microsecond. The problem is actually in the datetime part that is required for both datetime.timedelta and sleep_until.
So if you want microsecond accuracy, you need to look for a time library that performs better than datetime.
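The accumulation effect described above can be reproduced without any sleeping at all. The sketch below is an illustration under assumptions: the start value 1600000000.0 is an arbitrary Unix timestamp chosen for the demo, and accumulated_elapsed and direct_elapsed are hypothetical helper names. A float of that magnitude cannot represent a step of exactly one microsecond, so every += lands on a slightly shorter step, and the total comes out close to the ~0.9537 reported above rather than 1.

```python
def accumulated_elapsed(start, steps, step=1e-6):
    """Advance a timestamp by repeated addition; return the total advance."""
    ts = start
    for _ in range(steps):
        ts += step  # each addition is rounded to the nearest representable float
    return ts - start

def direct_elapsed(start, steps, step=1e-6):
    """Compute the final timestamp from the start time in a single step."""
    return (start + steps * step) - start

start = 1600000000.0  # arbitrary, illustrative Unix timestamp
print("accumulated:", accumulated_elapsed(start, 1000000))
print("direct:     ", direct_elapsed(start, 1000000))
```

Computing each deadline from the start time, as in the last loop above, rounds only once per deadline, so the error does not grow with the iteration count.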
import pause
import datetime
import time
start = time.time()
dt = datetime.datetime.now()
for i in range(1000000):
    dt += datetime.timedelta(microseconds=1)
    pause.until(1)
end = time.time()
print(end - start)
OUTPUT:
1.0014092922210693
The pause library documentation says:
"The precision should be within 0.001 of a second, however, this will depend on how precise your system sleep is and other performance factors."
If you multiply 0.001 by 1000000 you will get a large accumulated error.
A couple of questions:
Why do you need to sleep?
What is the minimum required accuracy?
How time consistent are the operations you are calling? If these function calls vary by more than 0.001 then the accumulated error will be more due to the operations you are performing than can be attributed to the pauses/sleeps.
Sleeping a thread is inherently non-deterministic. You cannot really talk about 'precision' for thread sleep in general, only in the context of a particular system and platform, because too many factors can play a role (for example, the number of CPU cores).
To illustrate the point, a thought experiment:
Suppose you made many threads (at least 1000) and scheduled them to run at the same exact time. What 'precision' would you then expect ?

Python - Accurate time.sleep

I am working on a project in which an accurate timer is really crucial. I am working in Python and am using the time.sleep() function.
I noticed that time.sleep() adds additional delay because of scheduling (refer to the time.sleep docs). Because of that, the longer my program runs, the more inaccurate the timer is.
Is there any more accurate timer/ticker to sleep the program or solution for this problem?
Any help would be appreciated. Cheers.
I had a solution similar to above, but it became processor heavy very quickly. Here is a processor-heavy idea and a workaround.
import time
# Note: time.clock() only exists up to Python 3.7; on newer versions use time.perf_counter()
def processor_heavy_sleep(ms):  # fine for ms, starts to work the computer hard in second range.
    start = time.clock()
    end = start + ms / 1000.
    while time.clock() < end:
        continue
    return start, time.clock()
def efficient_sleep(secs, expected_inaccuracy=0.5):  # for longer times
    start = time.clock()
    end = secs + start
    time.sleep(secs - expected_inaccuracy)
    while time.clock() < end:
        continue
    return start, time.clock()
output of efficient_sleep(5, 0.5) 3 times was:
(3.1999303695151594e-07, 5.0000003199930365)
(5.00005983869791, 10.00005983869791)
(10.000092477987678, 15.000092477987678)
This is on windows. I'm running it for 100 loops right now. Here are the results.
(485.003749358414, 490.003749358414)
(490.0037919174879, 495.0037922374809)
(495.00382903668014, 500.00382903668014)
The sleeps remain accurate, but the calls are always delayed a little. If you need a scheduler that accurately calls every xxx secs to the millisecond, that would be a different thing.
the longer my program runs, the more inaccurate the timer is.
So, for example by expecting 0.5s delay, it will be time.sleep(0.5 - (start-end)). But still didn't solve the issue
You seem to be complaining about two effects: 1) the fact that time.sleep() may take longer than you expect, and 2) the inherent creep in using a series of time.sleep() calls.
You can't do anything about the first, short of switching to a real-time OS. The underlying OS calls are defined to sleep for at least as long as requested. They only guarantee that you won't wake early; they make no guarantee that you won't wake up late.
As for the second, you ought to figure your sleep time according to an unchanging epoch, not from your wake-up time. For example:
import time
import random
target = time.time()
def myticker():
    # Sleep for 0.5s between tasks, with no creep
    global target
    target += 0.5
    now = time.time()
    if target > now:
        time.sleep(target - now)
def main():
    previous = time.time()
    for _ in range(100):
        now = time.time()
        print(now - previous)
        previous = now
        # simulate some work
        time.sleep(random.random() / 10) # Always < tick frequency
        # time.sleep(random.random()) # Not always < tick frequency
        myticker()
if __name__ == "__main__":
    main()
Working on Linux with zero knowledge of Windows, I may be being naive here but is there some reason that writing your own sleep function, won't work for you?
Something like:
import time
def sleep_time():
    start_time = time.time()
    while (time.time() - start_time) < 0.0001:
        continue
end_time = time.time() + 60 # run for a minute
cnt = 0
while time.time() < end_time:
    cnt += 1
    print('sleeping',cnt)
    sleep_time()
    print('Awake')
print("Slept ",cnt," Times")

Python, Raspberry pi, call a task every 10 milliseconds precisely

I'm currently trying to have a function called every 10ms to acquire data from a sensor.
Basically I was triggering the callback from a gpio interrupt but I changed my sensor and the one I'm currently using doesn't have a INT pin to drive the callback.
So my goal is to have the same behavior but with an internal interrupt generated by a timer.
I tried this from this topic
import threading
import time
def work():
    threading.Timer(0.25, work).start()
    print(time.time())
    print("stackoverflow")
work()
But when I run it I can see that the timer is not really precise and it's deviating over time as you can see.
1494418413.1584847
stackoverflow
1494418413.1686869
stackoverflow
1494418413.1788757
stackoverflow
1494418413.1890721
stackoverflow
1494418413.1992736
stackoverflow
1494418413.2094712
stackoverflow
1494418413.2196639
stackoverflow
1494418413.2298684
stackoverflow
1494418413.2400634
stackoverflow
1494418413.2502584
stackoverflow
1494418413.2604961
stackoverflow
1494418413.270702
stackoverflow
1494418413.2808678
stackoverflow
1494418413.2910736
stackoverflow
1494418413.301277
stackoverflow
So the timer drifts by about 0.2 milliseconds every 10 milliseconds, which adds up to quite a big bias after a few seconds.
I know that Python is not really made for "real-time", but I think there should be a way to do it.
If someone has already had to handle time constraints with Python, I would be glad to have some advice.
Thanks.
This code works on my laptop and logs the delta between target and actual time. The main thing is to minimise what is done in the work() function, because e.g. printing and scrolling the screen can take a long time.
The key thing is to start the next timer based on the difference between the time when that call is made and the target.
I slowed down the interval to 0.1s so it is easier to see the jitter, which on my Win7 x64 can exceed 10ms; that would cause problems by passing a negative value to the Timer() call :-o
This logs 100 samples, then prints them. If you redirect the output to a .csv file you can load it into Excel to display graphs.
from multiprocessing import Queue
import threading
import time
# this accumulates record of the difference between the target and actual times
actualdeltas = []
INTERVAL = 0.1
def work(queue, target):
    # first thing to do is record the jitter - the difference between target and actual time
    actualdeltas.append(time.clock()-target+INTERVAL)
    # t0 = time.clock()
    # print("Current time\t" + str(time.clock()))
    # print("Target\t" + str(target))
    # print("Delay\t" + str(target - time.clock()))
    # print()
    # t0 = time.clock()
    if len(actualdeltas) > 100:
        # print the accumulated deltas then exit
        for d in actualdeltas:
            print d
        return
    threading.Timer(target - time.clock(), work, [queue, target+INTERVAL]).start()
myQueue = Queue()
target = time.clock() + INTERVAL
work(myQueue, target)
Typical output (i.e. don't rely on millisecond timing on Windows in Python):
0.00947008617187
0.0029628920052
0.0121824719378
0.00582923077099
0.00131316206917
0.0105631524709
0.00437298744466
-0.000251418553351
0.00897956530515
0.0028528821332
0.0118192949105
0.00546301269675
0.0145723546788
0.00910063698529
I tried your solution but I got strange results.
Here is my code :
from multiprocessing import Queue
import threading
import time
def work(queue, target):
    t0 = time.clock()
    print("Target\t" + str(target))
    print("Current time\t" + str(t0))
    print("Delay\t" + str(target - t0))
    print()
    threading.Timer(target - t0, work, [queue, target+0.01]).start()
myQueue = Queue()
target = time.clock() + 0.01
work(myQueue, target)
And here is the output
Target 0.054099
Current time 0.044101
Delay 0.009998
Target 0.064099
Current time 0.045622
Delay 0.018477
Target 0.074099
Current time 0.046161
Delay 0.027937999999999998
Target 0.084099
Current time 0.0465
Delay 0.037598999999999994
Target 0.09409899999999999
Current time 0.046877
Delay 0.047221999999999986
Target 0.10409899999999998
Current time 0.047211
Delay 0.05688799999999998
Target 0.11409899999999998
Current time 0.047606
Delay 0.06649299999999997
So we can see that the target increases by 10ms each time, and for the first loop the delay for the timer seems to be good.
The point is that instead of starting again at current_time + delay, it starts again at 0.045622, which represents a delay of 0.001521 instead of 0.01000.
Did I miss something? My code seems to follow your logic, doesn't it?
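One likely explanation, offered here as an assumption rather than something confirmed in the thread: on Unix, time.clock() in Python versions before 3.8 reported processor time rather than wall-clock time, so it barely advances while the process is idle waiting for a Timer. The sketch below shows the same effect on current Python versions, using time.process_time() (CPU time) next to time.monotonic() (wall clock); measure_sleep is a hypothetical helper name.

```python
import time

def measure_sleep(seconds):
    """Return (wall_elapsed, cpu_elapsed) measured around a time.sleep() call."""
    wall_start = time.monotonic()
    cpu_start = time.process_time()
    time.sleep(seconds)
    wall_elapsed = time.monotonic() - wall_start
    cpu_elapsed = time.process_time() - cpu_start
    return wall_elapsed, cpu_elapsed

wall, cpu = measure_sleep(0.05)
print("wall clock: {:.4f} s, CPU time: {:.4f} s".format(wall, cpu))
```

While sleeping, the wall clock advances by roughly the requested time but the CPU clock hardly moves, which would match the "Current time" column above. A wall-clock source such as time.time() or time.monotonic() avoids this when computing timer delays.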
Working example for @Chupo_cro
Here is my working example
from multiprocessing import Queue
import RPi.GPIO as GPIO
import threading
import time
import os
INTERVAL = 0.01
ledState = True
GPIO.setmode(GPIO.BCM)
GPIO.setup(2, GPIO.OUT, initial=GPIO.LOW)
def work(queue, target):
    global ledState  # must be declared before ledState is used in this function
    try:
        threading.Timer(target-time.time(), work, [queue, target+INTERVAL]).start()
        GPIO.output(2, ledState)
        ledState = not ledState
    except KeyboardInterrupt:
        GPIO.cleanup()
try:
    myQueue = Queue()
    target = time.time() + INTERVAL
    work(myQueue, target)
except KeyboardInterrupt:
    GPIO.cleanup()

Accurate sleep/delay within Python while loop

I have a while True loop which sends variables to an external function, and then uses the returned values. This send/receive process has a user-configurable frequency, which is saved and read from an external .ini configuration file.
I've tried time.sleep(1 / Frequency), but am not satisfied with the accuracy, given the number of threads being used elsewhere. E.g. a frequency of 60Hz (period of 0.0166667) is giving an 'actual' time.sleep() period of ~0.0311.
My preference would be to use an additional while loop, which compares the current time to the start time plus the period, as follows:
EndTime = time.time() + (1 / Frequency)
while time.time() - EndTime < 0:
    time.sleep(0)
This would fit into the end of my while True function as follows:
while True:
    A = random.randint(0, 5)
    B = random.randint(0, 10)
    C = random.randint(0, 20)
    Values = ExternalFunction.main(Variable_A = A, Variable_B = B, Variable_C = C)
    Return_A = Values['A_Out']
    Return_B = Values['B_Out']
    Return_C = Values['C_Out']
    #Updated other functions with Return_A, Return_B and Return_C
    EndTime = time.time() + (1 / Frequency)
    while time.time() - EndTime < 0:
        time.sleep(0)
I'm missing something, as the addition of the while loop causes the function to execute once only. How can I get the above to function correctly? Is this the best approach to 'accurate' frequency control on a non-real time operating system? Should I be using threading for this particular component? I'm testing this function on both Windows 7 (64-bit) and Ubuntu (64-bit).
If I understood your question correctly, you want to execute ExternalFunction.main at a given frequency. The problem is that the execution of ExternalFunction.main itself takes some time. If you don't need very fine precision -- it seems that you don't -- my suggestion is doing something like this.
import time
frequency = 1 # Hz
period = 1.0/frequency
while True:
    time_before = time.time()
    [...]
    ExternalFunction.main([...])
    [...]
    while (time.time() - time_before) < period:
        time.sleep(0.001) # precision here
You may tune the precision to your needs. Greater precision (smaller number) will make the inner while loop execute more often.
This achieves decent results when not using threads. However, when using Python threads, the GIL (Global Interpreter Lock) makes sure only one thread runs at a time. If you have a huge number of threads, it may take too long for the program to get back to your main thread. Increasing the frequency at which Python switches between threads may give you more accurate delays.
Add this to the beginning of your code to increase the thread switching frequency.
import sys
sys.setcheckinterval(1)
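Here 1 is the number of bytecode instructions executed on each thread before switching (the default is 100). A larger number improves performance but increases the thread switching time. Note that sys.setcheckinterval() was deprecated in Python 3.2 and removed in Python 3.9; Python 3 switches threads on a time interval instead, which is controlled with sys.setswitchinterval().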
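On Python 3, the corresponding setting is a time interval in seconds rather than an instruction count. A minimal sketch, where the value 0.0005 is only an illustrative choice and not a recommendation from the answer above:

```python
import sys

default_interval = sys.getswitchinterval()  # 0.005 s unless it was changed
sys.setswitchinterval(0.0005)               # ask for a thread switch every 0.5 ms
print("switch interval:", default_interval, "->", sys.getswitchinterval())
```

A shorter interval lets a waiting thread get the interpreter back sooner, at the cost of more switching overhead, so it is worth measuring whether it actually helps in a given program.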
You may want to try python-pause
Pause until a unix time, with millisecond precision:
import pause
pause.until(1370640569.7747359)
Pause using datetime:
import pause, datetime
dt = datetime.datetime(2013, 6, 2, 14, 36, 34, 383752)
pause.until(dt)
You may use it like:
import pause, datetime
freqHz=60.0
td=datetime.timedelta(seconds=1/freqHz)
dt=datetime.datetime.now()
while True:
    #Your code here
    dt+=td
    pause.until(dt)
Another solution for an accurate delay is to use the perf_counter() function from module time. Especially useful in windows as time.sleep is not accurate in milliseconds. See below example where function accurate_delay creates a delay in milliseconds.
import time
def accurate_delay(delay):
    ''' Function to provide accurate time delay in millisecond
    '''
    deadline = time.perf_counter() + delay/1000
    # busy-wait until the deadline has passed
    while time.perf_counter() < deadline:
        pass
delay = 10
t_start = time.perf_counter()
print('Wait for {:.0f} ms. Start: {:.5f}'.format(delay, t_start))
accurate_delay(delay)
t_end = time.perf_counter()
print('End time: {:.5f}. Delay is {:.5f} ms'.
      format(t_end, 1000*(t_end - t_start)))
total = 0
ntests = 1000
for n in range(ntests):
    t_start = time.perf_counter()
    accurate_delay(delay)
    t_end = time.perf_counter()
    print('Test completed: {:.2f}%'.format(n/ntests * 100), end='\r', flush=True)
    total = total + 1000*(t_end - t_start) - delay
print('Average difference in time delay is {:.5f} ms.'.format(total/ntests))

What is the best/most efficient way to output value every x seconds during a loop

I have always been curious about this as the simple way is definitely not efficient. How would you efficiently go about outputting a value every x seconds?
Here is an example of what I mean:
import time
num = 50000000
startTime = time.time()
j=0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
print time.time() - startTime
#output time: 24 seconds
startTime = time.time()
newTime = time.time()
j=0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
    if time.time() - newTime > 0.5:
        newTime = time.time()
        print i
print time.time() - startTime
#output time: 32 seconds
A whole 1/3rd faster when not outputting the progress every half a second.
I know this is because it requires an extra calculation every loop, but the same applies with other similar checks you may want to do - how would you go about implementing something like this without seriously affecting the execution time?
Well, you know that you're doing many iterations per second, so you really don't need to make the time.time() call on every iteration. You can use a modulo operator to only actually check if you need to output something every N iterations of the loop.
startTime = time.time()
newTime = time.time()
j=0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
    if i % 50 == 0: # Only check every 50th iteration
        if time.time() - newTime > 0.5:
            newTime = time.time()
            print i, newTime
print time.time() - startTime
# 45 seconds (the original version took 42 on my system)
Checking only every 50 iterations reduces my run time from 56 seconds to 43 (the original with no printing took 42, and Tom Page's solution took 50 seconds), and the iterations complete quickly enough that it's still outputting exactly every 0.5 seconds according to time.time():
0 1409083225.39
605000 1409083225.89
1201450 1409083226.39
1821150 1409083226.89
2439250 1409083227.39
3054400 1409083227.89
3644100 1409083228.39
4254350 1409083228.89
4831600 1409083229.39
5433450 1409083229.89
6034850 1409083230.39
6644400 1409083230.89
7252650 1409083231.39
7840100 1409083231.89
8438300 1409083232.39
9061200 1409083232.89
9667350 1409083233.39
...
You might save a few clock cycles by keeping track of the next time that a print is due:
nexttime = time.time() + 0.5
Your condition then becomes a simple comparison:
if time.time() >= nexttime
as opposed to a subtraction followed by a comparison:
if time.time() - newTime > 0.5
You'll only have to do an addition after each message, as opposed to a subtraction on each iteration.
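The two suggestions (look at the clock only every N iterations, and compare against a precomputed deadline) combine naturally. The following is a sketch under assumptions: run_with_progress and its parameters are hypothetical names, time.monotonic is swapped in for time.time, and the clock and report functions are injectable purely so the behaviour can be checked without waiting.

```python
import time

def run_with_progress(num, interval=0.5, check_every=50,
                      clock=time.monotonic, report=print):
    """Run the loop from the question, reporting i about every `interval` seconds."""
    next_report = clock() + interval
    j = 0.0
    for i in range(num):
        j = (((j + 10) ** 0.5) ** 2) ** 0.5
        # Only look at the clock on every `check_every`-th iteration.
        if i % check_every == 0 and clock() >= next_report:
            report(i)
            # Advance from the previous deadline rather than from "now",
            # so late checks do not shift the following reports.
            next_report += interval
    return j
```

Called as run_with_progress(50000000), it prints the loop index about twice per second. The best value for check_every depends on how expensive one iteration is compared with one clock call.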
I tried it with a sideband thread doing the printing. It added 5 seconds to exec time on python 2.x but virtually not extra time on python 3.x. Python 2.x threads have a lot of overhead. Here's my example with timing included as comments:
import time
import threading
def showit(event):
    global i # could pass in a mutable object instead
    while not event.is_set():
        event.wait(.5)
        print 'value is', i
num = 50000000
startTime = time.time()
j=0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
print time.time() - startTime
#output time: 23 seconds
event = threading.Event()
showit_thread = threading.Thread(target=showit, args=(event,))
showit_thread.start()
startTime = time.time()
j=0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
event.set()
time.sleep(.1)
print time.time() - startTime
#output time: 28 seconds
If you want to wait a specified period of time before doing something, just use the time.sleep() method.
for i in range(100):
    print(i)
    time.sleep(0.5)
This will wait half a second before printing the next value of i.
If you don't care about Windows, signal.setitimer will be simpler than using a background thread, and on many *nix platforms a whole lot more efficient.
Here's an example:
import signal
import time
num = 50000000
startTime = time.time()
def ontimer(sig, frame):
    global i
    print(i)
signal.signal(signal.SIGVTALRM, ontimer)
signal.setitimer(signal.ITIMER_VIRTUAL, 0.5, 0.5)
j=0
for i in range(num):
    j = (((j+10)**0.5)**2)**0.5
signal.setitimer(signal.ITIMER_VIRTUAL, 0)
print(time.time() - startTime)
This is about as close to free as you're going to get performance-wise.
In some use cases, a virtual timer isn't sufficiently accurate, so you need to change that to ITIMER_REAL and change the signal to SIGALRM. That's a little more expensive, but still pretty cheap, and still dead simple.
On some (older) *nix platforms, alarm may be more efficient than setitimer, but unfortunately alarm only takes integral seconds, so you can't use it to fire twice per second.
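The ITIMER_REAL / SIGALRM variant mentioned above might look like the sketch below. It rests on several assumptions: it only works on Unix-like systems, it must run in the main thread (Python only allows signal handlers to be installed there), and count_ticks with its 50 ms interval is a hypothetical demo, not a measured recommendation.

```python
import signal
import time

def count_ticks(duration, interval):
    """Collect wall-clock timer ticks for `duration` seconds; return their times."""
    ticks = []

    def on_tick(signum, frame):
        ticks.append(time.monotonic())

    previous_handler = signal.signal(signal.SIGALRM, on_tick)
    # First expiry after `interval` seconds, then repeating every `interval`.
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline:
            time.sleep(0.01)  # stand-in for the real work of the main loop
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)           # disarm the timer
        signal.signal(signal.SIGALRM, previous_handler)   # restore old handler
    return ticks

ticks = count_ticks(0.3, 0.05)
print(len(ticks), "ticks received")
```

Python runs the handler between bytecode instructions of the main thread, so the callback can still be delayed by whatever the main thread is doing at that moment.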
Timings from my MacBook Pro:
no output: 15.02s
SIGVTALRM: 15.03s
SIGALRM: 15.44s
thread: 19.9s
checking time.time(): 22.3s
(I didn't test with either dano's optimization or Tom Page's; obviously those will reduce the 22.3, but they're not going to get it down to 15.44…)
Part of the problem here is that you're using time.time.
On my MacBook Pro, time.time takes more than 1/3rd as long as all of the work you're doing:
In [2]: %timeit time.time()
10000000 loops, best of 3: 105 ns per loop
In [3]: %timeit (((j+10)**0.5)**2)**0.5
1000000 loops, best of 3: 268 ns per loop
And that 105ns is fast for time—e.g., an older Windows box with no better hardware timer than ACPI can take 100x longer.
On top of that, time.time is not guaranteed to have enough precision to do what you want anyway:
Note that even though the time is always returned as a floating point number, not all systems provide time with a better precision than 1 second.
Even on platforms where it has better precision than 1 second, it may have a lower accuracy; e.g., it may only be updated once per scheduler tick.
And time isn't even guaranteed to be monotonic; on some platforms, if the system time changes, time may go down.
Calling it less often will solve the first problem, but not the others.
So, what can you do?
Unfortunately, there's no built-in answer, at least not with Python 2.7. The best solution is different on different platforms—probably GetTickCount64 on Windows, clock_gettime with the appropriate clock ID on most modern *nixes, gettimeofday on most other *nixes. These are relatively easy to use via ctypes if you don't want to distribute a C extension… but someone really should wrap it all up in a module and post it on PyPI, and unfortunately I couldn't find one…
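For readers on Python 3.3 or later, the standard library has since gained time.monotonic() and time.perf_counter(), along with time.get_clock_info() to inspect what a clock provides on the current platform. A small sketch that prints those properties; the reported values differ between systems, so no particular output is assumed here:

```python
import time

for name in ("time", "monotonic", "perf_counter"):
    info = time.get_clock_info(name)
    print("{:<13} monotonic={!s:<5} resolution={:.1e} s  ({})".format(
        name, info.monotonic, info.resolution, info.implementation))
```

This makes it easy to check on a given machine whether a clock can go backwards and how fine its reported resolution is before relying on it for timing.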
