Does python iterate at a constant speed? - python

I writing some code to get sensor readings from GPIO against time.
To make sure the measurements corresponds to a specific time, I want to know if python iterates at a constant speed (so that the gap between iterations is constant) - and what is its minimum time gap between iterations.
If they're not, can someone let me know how to make the time gap constant.
Thank you!

No, Python does not and can not iterate at constant speed.
Python is just another process on your Raspberry PI, and your OS is responsible for allocating it time to run on the CPU (called multi-tasking). Other processes also get allotted time. This means Python is never going to be running all the time and any processing times are going to be depend on what the other processes are doing.
Iteration itself is also delegated to specific types; how the next item is produced then varies widely, and even if Python was given constant access to the CPU iteration would still vary. Whatever you do in your loop body also takes time, and unless the inputs and outputs are always exactly the same, will almost certainly take a variable amount of time to do the work.
Instead of trying to time your loops, measure time with time.time() or timeit.default_timer (depending on how precise you need to be, on your Raspberry it'll be the same function) in a loop and adjust your actions based on that.

Related

Time variable in script became negative during execution

I have a Python program running on my Raspberry Pi 3B doing a bunch of image processing and so on. I wanted to gather some data measurements from the program by writing it into a .csv file, and I wanted to write the corresponding time with each measurement. I used time.clock() (see code snippet below) to find the time before each write operation, but somewhere between 2147 seconds and 2148 seconds, the time becomes negative (see snippet table below). I expect some kind over overflow occurred, but I'm having trouble understanding in which manner it overflowed. The Raspberry Pi is a 32 bit system, and as I understand, the time.clock() method returns a float. Shouldn't the time variable have overflowed only at much larger values, or is the time variable in this case not 32 bits?
Since then, I've read various threads that indicate time.time() might have been a better method for this use-case, and I might do that in future tests, but I just want to see what I can do with the values I've gathered thus far. I believe I can do some processing on the logged time to "de-overflow", for the lack of a better word, and use it as is. Any thoughts?
import time
import csv
def somefunction(someX, someY, csvwriter):
t = time.clock()
x = somefunc(someX)
y = somefunc(someY)
csvwriter.writerow([t, x, y])
return
Time (s)
X value
Y value
2146.978524
-0.0019
0.00032
2147.30423
-0.00191
0.00023
-2147.336675
-0.00182
0.00034
-2147.000555
-0.00164
0.00037
I doubt this is an 32-bit issue. The third bullet point near the beginning of the Python 3.7 documentation of the time module says:
The functions in this module may not handle dates and times before the epoch or far in the future. The cut-off point in the future is determined by the C library; for 32-bit systems, it is typically in 2038.
That said, I don't really know what the problem is. Perhaps using the time.perf_counter() or time.process_time() functions instead would avoid the issue.

Unexpected time.sleep() behaviour

Recently, when creating a loop with a very short wait at the end, I ran into an unexpected behaviour of time.sleep() when used in quick succession.
I used this piece of code to look further into my problem
import time
import statistics
def average_wait(func):
waits=[]
loops=0
while loops<1000:
start=time.time()
func(1/1000)
waits.append(time.time()-start)
loops+=1
print(waits)
print("Average wait for 0.001: {}".format(statistics.mean(waits)))
average_wait(time.sleep)
This function usually returns something around 0.0013 which is many many times less accurate than just calling time.sleep() once, upon further inspection of this problem by looking at the waits list, I found that the amount of time time.sleep() was actually sleeping for was either almost exactly the right amount of time or almost exactly double the amount of time.
Here is a sample from waits:
[0.0010008811950683594, 0.0020041465759277344, 0.0009999275207519531, 0.0019621849060058594, 0.0010418891906738281]
Is there any reason for this behaviour and anything that can be done to avoid it?
From the time.time() documentation:
Note that even though the time is always returned as a floating point number, not all systems provide time with a better precision than 1 second.
The precision is platform dependent. Moreover, it produces wall-clock time, and your process is never the only thing running on a modern OS, other processes also are given time to process and you'll see variation in timings in your own process because of that.
The module offers different clocks, with more precision and some are per-process. See the time.get_clock_info() function to see what precision they offer. Note time.process_time() offers per-process time but excludes sleep time.
Next, time.sleep() is also not going to sleep in exact time spans; again from the relevant documentation:
[T]he suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
It too is subject to OS scheduling.
Together, these effects can easily add up to the millisecond variation in timings you see in your experiments. So this is not a doubling of time slept; even if you used different values for time.sleep() you'd still see a similar deviation from the requested time.

Is there a python module that will give an estimate of remaining time for a long running process?

I have a long running process that is mostly IO bound. It is basically just a loop uploading items somewhere, some of these items take more time than others, some days the whole process is slower so the time can't be hardcoded.
Is there a module that given the progress through the loop in terms of (current position, final position) could evaluate the first few iterations then give an estimate of the remaining time, but also update on every iteration?
I'm thinking something like the progress output you get from tools like wget and apt-get.
I guess I could write it myself but I wondered if something like this exists already.

What does "{built-in method mainloop}" mean in cProfile?

Sorted by total time, the second longest executing function is "{built-in method mainloop}" ? I looked at the same entry with pstats_viewer.py and clicked it and it says :
Function Exclusive time Inclusive time Primitive calls Total calls Exclusive per call Inclusive per call
Tkinter.py:359:mainloop 0.00s 561.03s (26.3%) 1 1 0.00s 561.03s
What does this mean?
Edit
Here's part of the cProfile output from a longer run of my code. The more ODE's I solve, the more time is devoted to mainloop. This is crazy! I thought that my runtime was getting killed by either branch divergence in my CUDA kernel or Host-GPU memory transfers. God, I'm a horrible programmer!
How have I made Tkinter take so much of my runtime?
mainloop is the event loop in Tkinter. It waits for events and processes them as they come in.
This is a recurring thing that you will see in all GUIs as well as any other event-driven frameworks like Twisted or Tornado.
First of all, it's a lot easier to see if you change tabs to spaces, as in:
Function Exclusive time Inclusive time Primitive calls Total calls Exclusive per call Inclusive per call
Tkinter.py:359:mainloop 0.00s 561.03s (26.3%) 1 1 0.00s 561.03s
Exclusive time means time that the program counter was in that routine. For a top-level routine you would expect this to be practically zero.
Inclusive time means including time in all routines that the routine calls. For a top-level routine you would expect this to be practically 100%.
(I don't understand what that 26.3% means.)
If you are trying to get more speed, what you need to do is find activity that 1) has a high percent inclusive time, and 2) that you can do something about.
This link shows the method I use.
After you speed something up, you will still find things that take a high percent inclusive time, but the overall elapsed time will be less.
Eventually you will get to a point where some things still take a high percent, but you can no longer figure out how to improve it.

Looping through huge ranges in python

How do I (efficiently) loop through huge ranges in Python? Whenever I try to run something like-
for x in range(105, 10000000000000):
#do stuff
it takes an eternity to complete. I've tried many things like iter, set, xrange but still no success. Is there any external module which can help me do that?
Given the range, I think it's safe to say that almost regardless of what you do or how you do it, it's going to take a while.
To be more specific, you're executing approximately 1E+13 iterations of your loop. With decent code and a reasonable fast processor, you might be able to execute around 4E+9 instructions per second (e.g., 2 instructions per clock cycle at 2 GHz). That obviously isn't exact, but for the moment let's just go with it, and see where we end up.
So, even if we assume each iteration of the loop requires executing only one, single-cycle instruction, it's going to take approximately: 1E+13/4E+9 = 2.5E3 seconds = ~42 minutes. Given that you're doing some work inside the loop, and there's some overhead for the loop itself, we're clearly going to be execution more than one machine-code instruction per iteration. If we have to execute, say, 1000 instructions per iteration (may be a reasonable first approximation, given that you haven't told us anything about what you're doing), then we're looking at something like 600-700 hours for the loop to execute.
Bottom line: changing how you represent your range may help, but if you need to iterate across a range this large, there's no getting away from the fact that it's going to take a while.
Assuming you have a 3GHz processor, and that you just need one cycle for each entry in the range, you would need ~3,333 seconds (~1 hour) to process it.
This leads me to think that there's somehting fundamentally wrong with what you're doing. Maybe you should restructure your problem.

Categories