What does "{built-in method mainloop}" mean in cProfile? - python

Sorted by total time, the second longest executing function is "{built-in method mainloop}"? I looked at the same entry with pstats_viewer.py, clicked it, and it says:
Function Exclusive time Inclusive time Primitive calls Total calls Exclusive per call Inclusive per call
Tkinter.py:359:mainloop 0.00s 561.03s (26.3%) 1 1 0.00s 561.03s
What does this mean?
Edit
Here's part of the cProfile output from a longer run of my code. The more ODEs I solve, the more time is devoted to mainloop. This is crazy! I thought that my runtime was getting killed by either branch divergence in my CUDA kernel or host-GPU memory transfers. God, I'm a horrible programmer!
How have I made Tkinter take so much of my runtime?

mainloop is the event loop in Tkinter. It waits for events and processes them as they come in.
This is a recurring thing that you will see in all GUIs as well as any other event-driven frameworks like Twisted or Tornado.
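A minimal sketch of why mainloop dominates a profile (the widget setup here is illustrative, not from the question): once mainloop() is called, every event callback runs inside it, so all of the time spent in callbacks is charged to mainloop's inclusive time.

import Tkinter  # "tkinter" on Python 3

def heavy_computation():
    # Stand-in for real work done in response to an event.
    sum(i * i for i in range(10 ** 6))

def on_click():
    # mainloop dispatches this callback, so this time counts toward
    # mainloop's inclusive (cumulative) time in the profile.
    heavy_computation()

root = Tkinter.Tk()
button = Tkinter.Button(root, text="Run", command=on_click)
button.pack()
root.mainloop()  # blocks here; mainloop's exclusive time stays near zero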

First of all, it's a lot easier to see if you change tabs to spaces, as in:
Function                 Exclusive time  Inclusive time   Primitive calls  Total calls  Exclusive per call  Inclusive per call
Tkinter.py:359:mainloop  0.00s           561.03s (26.3%)  1                1            0.00s               561.03s
Exclusive time means time that the program counter was in that routine. For a top-level routine you would expect this to be practically zero.
Inclusive time means including time in all routines that the routine calls. For a top-level routine you would expect this to be practically 100%.
(I don't understand what that 26.3% means.)
If you are trying to get more speed, what you need to do is find activity that 1) has a high percent of inclusive time, and 2) you can do something about.
This link shows the method I use.
After you speed something up, you will still find things that take a high percent inclusive time, but the overall elapsed time will be less.
Eventually you will get to a point where some things still take a high percent, but you can no longer figure out how to improve it.
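As a concrete starting point, cProfile plus pstats can sort output by inclusive (cumulative) time; a minimal sketch, where my_main() is a placeholder for your program's entry point:

import cProfile
import pstats

cProfile.run("my_main()", "profile.out")  # my_main is a placeholder
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(20)  # top 20 by inclusive time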

Related

Understanding Pycharm's profiler's results vs. cProfile results and how to get more detail on standard library functions

I am working on driving down the execution time on a program I've refactored, and I'm having trouble understanding the profiler output in PyCharm and how it relates to the output I would get if I run cProfile directly. (My output is shown below, with two lines of interest highlighted that I want to be sure I understand correctly before attempting to make fixes.) In particular, what do the Time and Own Time columns represent? I am guessing Own Time is the time consumed by the function, minus the time of any other calls made within that function, and time is the total time spent in each function (i.e. they just renamed tottime and cumtime, respectively), but I can't find anything that documents that clearly.
Also, what can I do to find more information about a particularly costly function using either PyCharm's profiler or vanilla cProfile? For example, _strptime seems to be costing me a lot of time, but I know it is being used in four different functions in my code. I'd like to see a breakdown of how those 2 million calls are spread across my various functions. I'm guessing there's a disproportionate number in the calc_near_geo_size_and_latency function, but I'd like more proof of that before I go rewriting code. (I realize that I could just profile the functions individually and compare, but I'm hoping for something more concise.)
I'm using Python 3.6 and PyCharm Professional 2018.3.
In particular, what do the Time and Own Time columns represent? I am guessing Own Time is the time consumed by the function, minus the time of any other calls made within that function, and time is the total time spent in each function (i.e. they just renamed tottime and cumtime, respectively), but I can't find anything that documents that clearly.
You can see definitions of own time and time here: https://www.jetbrains.com/help/profiler/Reference__Dialog_Boxes__Properties.html
Own time - Own execution time of the chosen function. The percentage of own time spent in this call related to overall time spent in this call in the parentheses.
Time - Execution time of the chosen function plus all time taken by functions called by this function. The percentage of time spent in this call related to time spent in all calls in the parentheses.
This is also confirmed by a small test:
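For example, a minimal test of that mapping (the names here are illustrative): parent does some inline work and also calls child, so parent's Own Time should cover only the inline work while its Time also includes the child's.

import cProfile

def child():
    for _ in range(5 * 10 ** 6):
        pass

def parent():
    for _ in range(10 ** 6):  # inline work: shows up in parent's Own Time
        pass
    child()                   # shows up only in parent's Time (cumulative)

cProfile.run("parent()")
# Expected: parent's tottime covers only its own loop (Own Time),
# while its cumtime also includes child's loop (Time).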
Also, what can I do to find more information about a particularly costly function using either PyCharm's profiler or vanilla cProfile?
By default PyCharm uses cProfile as its profiler. Perhaps you're asking about using cProfile on the command line? There are plenty of examples of doing so here: https://docs.python.org/3.6/library/profile.html
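For instance, the documented command-line interface looks like this (your_script.py is a placeholder):

python -m cProfile -s cumtime your_script.py      # print stats sorted by cumulative time
python -m cProfile -o output.prof your_script.py  # save raw stats for later use with pstats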
For example, _strptime seems to be costing me a lot of time, but I know it is being used in four different functions in my code. I'd like to see a breakdown of how those 2 million calls are spread across my various functions.
Note that the act of measuring something will have an impact on the measurement retrieved. For a function or method that is called many times, especially 2 million, the profiler itself will have a significant impact on the measured value.
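One pstats feature that helps with the breakdown question: print_callers() shows, for each function matching a pattern, which functions called it and how many times. A sketch against a saved profile (output.prof is a placeholder filename):

import pstats

stats = pstats.Stats("output.prof")
stats.print_callers("_strptime")
# For each matched function, lists its callers with call counts, so you
# can see how the 2 million calls are split across
# calc_near_geo_size_and_latency and the other three functions.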

Unexpected time.sleep() behaviour

Recently, when creating a loop with a very short wait at the end, I ran into an unexpected behaviour of time.sleep() when used in quick succession.
I used this piece of code to look further into my problem:
import time
import statistics

def average_wait(func):
    waits = []
    loops = 0
    while loops < 1000:
        start = time.time()
        func(1 / 1000)
        waits.append(time.time() - start)
        loops += 1
    print(waits)
    print("Average wait for 0.001: {}".format(statistics.mean(waits)))

average_wait(time.sleep)
This function usually returns something around 0.0013, which is many times less accurate than a single call to time.sleep(). Upon further inspection of the problem by looking at the waits list, I found that the amount of time time.sleep() was actually sleeping for was either almost exactly the right amount of time or almost exactly double it.
Here is a sample from waits:
[0.0010008811950683594, 0.0020041465759277344, 0.0009999275207519531, 0.0019621849060058594, 0.0010418891906738281]
Is there any reason for this behaviour and anything that can be done to avoid it?
From the time.time() documentation:
Note that even though the time is always returned as a floating point number, not all systems provide time with a better precision than 1 second.
The precision is platform dependent. Moreover, it produces wall-clock time, and your process is never the only thing running on a modern OS; other processes are also given time to run, and you'll see variation in your own process's timings because of that.
The module offers different clocks, some with more precision and some per-process. See the time.get_clock_info() function to see what precision they offer. Note that time.process_time() offers per-process time but excludes sleep time.
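For example, you can compare what the clocks report on your platform:

import time

for name in ("time", "monotonic", "perf_counter", "process_time"):
    info = time.get_clock_info(name)
    print(name, "resolution:", info.resolution, "monotonic:", info.monotonic)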
Next, time.sleep() is also not going to sleep in exact time spans; again from the relevant documentation:
[T]he suspension time may be longer than requested by an arbitrary amount because of the scheduling of other activity in the system.
It too is subject to OS scheduling.
Together, these effects can easily add up to the millisecond variation in timings you see in your experiments. So this is not a doubling of time slept; even if you used different values for time.sleep() you'd still see a similar deviation from the requested time.
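You can see this by measuring the overshoot directly rather than the elapsed time; a sketch using time.perf_counter():

import time
import statistics

overshoots = []
for _ in range(1000):
    start = time.perf_counter()
    time.sleep(1 / 1000)
    overshoots.append(time.perf_counter() - start - 1 / 1000)

print("Mean overshoot: {:.6f}s".format(statistics.mean(overshoots)))
# The overshoot stays positive and varies run to run, regardless of
# the requested sleep duration.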

Is there a python module that will give an estimate of remaining time for a long running process?

I have a long running process that is mostly IO bound. It is basically just a loop uploading items somewhere, some of these items take more time than others, some days the whole process is slower so the time can't be hardcoded.
Is there a module that given the progress through the loop in terms of (current position, final position) could evaluate the first few iterations then give an estimate of the remaining time, but also update on every iteration?
I'm thinking something like the progress output you get from tools like wget and apt-get.
I guess I could write it myself but I wondered if something like this exists already.
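For what it's worth, the core of such an estimator is only a few lines; a hand-rolled sketch (the tqdm package provides this kind of progress/ETA display ready-made):

import time

def eta_seconds(start_time, current, final):
    # Estimate time remaining from the average rate so far.
    elapsed = time.time() - start_time
    if elapsed <= 0 or current <= 0:
        return float("inf")
    rate = current / elapsed
    return (final - current) / rate

start = time.time()
total = 100
for i in range(1, total + 1):
    time.sleep(0.05)  # stand-in for one upload
    print("{}/{} ETA: {:.1f}s".format(i, total, eta_seconds(start, i, total)))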

Does python iterate at a constant speed?

I'm writing some code to get sensor readings from GPIO against time.
To make sure the measurements correspond to a specific time, I want to know whether Python iterates at a constant speed (so that the gap between iterations is constant), and what its minimum time gap between iterations is.
If it doesn't, can someone let me know how to make the time gap constant?
Thank you!
No, Python does not and can not iterate at constant speed.
Python is just another process on your Raspberry Pi, and your OS is responsible for allocating it time to run on the CPU (called multi-tasking). Other processes also get allotted time. This means Python is never going to be running all the time, and any processing times are going to depend on what the other processes are doing.
Iteration itself is also delegated to specific types; how the next item is produced varies widely, and even if Python were given constant access to the CPU, iteration would still vary. Whatever you do in your loop body also takes time, and unless the inputs and outputs are always exactly the same, it will almost certainly take a variable amount of time to do the work.
Instead of trying to time your loops, measure time with time.time() or timeit.default_timer (depending on how precise you need to be, on your Raspberry it'll be the same function) in a loop and adjust your actions based on that.
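A sketch of that approach: timestamp each reading and sleep only until the next scheduled tick, so scheduling jitter is absorbed each iteration instead of accumulating as drift (read_sensor() is a stub for your actual GPIO read):

import time

INTERVAL = 0.1  # desired seconds between samples

def read_sensor():
    return 0  # stub: replace with your actual GPIO read

readings = []
next_tick = time.time()
while len(readings) < 100:
    readings.append((time.time(), read_sensor()))  # store the real timestamp
    next_tick += INTERVAL
    delay = next_tick - time.time()
    if delay > 0:
        time.sleep(delay)  # sleep only as long as needed to hit the next tick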

Looping through huge ranges in python

How do I (efficiently) loop through huge ranges in Python? Whenever I try to run something like-
for x in range(105, 10000000000000):
    # do stuff
it takes an eternity to complete. I've tried many things like iter, set, xrange but still no success. Is there any external module which can help me do that?
Given the range, I think it's safe to say that almost regardless of what you do or how you do it, it's going to take a while.
To be more specific, you're executing approximately 1E+13 iterations of your loop. With decent code and a reasonably fast processor, you might be able to execute around 4E+9 instructions per second (e.g., 2 instructions per clock cycle at 2 GHz). That obviously isn't exact, but for the moment let's just go with it, and see where we end up.
So, even if we assume each iteration of the loop requires executing only one single-cycle instruction, it's going to take approximately: 1E+13 / 4E+9 = 2.5E+3 seconds = ~42 minutes. Given that you're doing some work inside the loop, and there's some overhead for the loop itself, we're clearly going to be executing more than one machine-code instruction per iteration. If we have to execute, say, 1000 instructions per iteration (maybe a reasonable first approximation, given that you haven't told us anything about what you're doing), then we're looking at something like 600-700 hours for the loop to execute.
Bottom line: changing how you represent your range may help, but if you need to iterate across a range this large, there's no getting away from the fact that it's going to take a while.
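To sanity-check these numbers for your own loop body, you can time a small slice of the range and extrapolate (do_stuff() is a placeholder):

import timeit

def do_stuff(x):
    pass  # stub: replace with your real loop body

sample = 10 ** 6
elapsed = timeit.timeit(lambda: [do_stuff(x) for x in range(sample)], number=1)
total_iters = 10000000000000 - 105
print("Estimated total: {:.0f} hours".format(elapsed * total_iters / sample / 3600))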
Assuming you have a 3GHz processor, and that you just need one cycle for each entry in the range, you would need ~3,333 seconds (~1 hour) to process it.
This leads me to think that there's something fundamentally wrong with what you're doing. Maybe you should restructure your problem.
