Why do errors take longer than the script itself? - python

Something I've noticed is that when there is an error in our script (regardless of the programming language), it often takes longer to "execute" and then output the error, compared to its execution time when there are no errors in our script.
Why does this happen? Shouldn't outputting an error take less time because the script is not being fully run? Or does the computer still attempt to fully run the script regardless of whether there is an error or not?
For example, I have a Python script that takes approximately 10 seconds to run if there are no errors. When there is an error, however, it takes 15 seconds on average. I've noticed something similar in NodeJS, so I'm assuming this is the case for many programming languages? Apologies if this is a bad question - I'm relatively new to programming and still lack some fundamental understanding.

The program doesn't attempt to run the script fully when an error occurs; execution is interrupted at the point where the error happens. That is the default behaviour, though you can always set up your own exception handlers in your scripts to run some code of your own.
That said, raising and handling (and logging) an exception also requires code execution - internal code of the language runtime that builds the exception object, unwinds the stack, and formats the traceback - so the error path does take some time.
It's hard to tell why your script takes longer in the error case without looking at the script itself. Personally, I've never noticed such a difference in general, but maybe I just didn't pay attention...
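As a rough illustration (a minimal sketch, not the asker's script), you can time a plain call against one that raises and formats an exception; the error path is slower because of the extra runtime work:

import timeit
import traceback

def clean_call():
    # normal path: no exception machinery involved
    return sum(range(100))

def error_call():
    # error path: build the exception, unwind the stack, format the traceback
    try:
        raise ValueError("boom")
    except ValueError:
        traceback.format_exc()

print("no error:      ", timeit.timeit(clean_call, number=100_000))
print("raise + format:", timeit.timeit(error_call, number=100_000))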

Related

"IOStream.flush timed out" errors when multithreading

I am new to Python programming and am having a problem with a multithreaded program (using the "threading" module) that runs fine at the beginning but, after a while, starts repeatedly printing "IOStream.flush timed out" errors.
I am not even sure how to debug such an error, because I don't know which line is causing it. I read a bit about this error and saw that it might be related to memory consumption, so I tried profiling my program using a memory profiler in the Spyder IDE. Nothing jumped out at me, however (although I admit that I am not sure what to look for when it comes to Python memory leaks).
A few more observations:
I have an outer loop that runs my function over a large number of files. The files are just numeric data with the same formatting (they are quite large, though, and there is download latency, which is why I have made my application multithreaded so that each thread works on different files). If I have a long list of files, the problem occurs; if I shorten the list, the program concludes without problems. I am not sure why that is, although if it is some kind of memory leak, then I would assume the problem grows the longer the program runs, until it reaches some kind of memory limit.
Normally, I use 128 threads in the program. If I reduce the number of threads to 48 or fewer, the program works fine and completes correctly, so clearly the problem is caused by multithreading. This makes it a bit trickier to debug and figure out what is causing the problem. It seems that somewhere around 64 threads the problems start.
The program never explicitly crashes out. Once it gets to the point where it has this error, it just keeps repeatedly printing "IOStream.flush timed out". I have to close the Spyder IDE to stop it (Restart kernel doesn't work).
Right before this error happens, the program appears to stall. At least no more "prints" happen to the console (the various threads are all printing debug information to the screen). The last lines printed are standard debugging/status print statements that usually work when the number of threads is reduced or the number of files to process is decreased.
I have no idea how to debug this and get to the bottom of the problem. Any suggestions on how to get to the bottom of this would be much appreciated. Thanks in advance!
Specs:
Python 3.8.8
Spyder 4.2.5
Windows 10
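One way to test the thread-count hypothesis is to cap concurrency with a bounded pool instead of spawning one thread per file. The sketch below stays under the ~64-thread level where the problems reportedly start; process_file and the file list are hypothetical stand-ins for the asker's actual code:

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def process_file(path):
    # hypothetical stand-in for the real download-and-analyze work
    time.sleep(0.1)
    return f"done {path}"

files = [f"file_{i}.dat" for i in range(200)]  # hypothetical file list

# 48 workers were reported to complete correctly; the pool recycles this
# fixed set of threads instead of keeping 128 alive at once.
with ThreadPoolExecutor(max_workers=48) as pool:
    futures = [pool.submit(process_file, f) for f in files]
    for fut in as_completed(futures):
        print(fut.result())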

What are the potential reasons for my problem with running multiprocessed Python script from elsewhere?

I have a Python script that uses multiprocessing, specifically Pool().map() from the package multiprocessing.
When I run the script locally on my machine, I remember to guard my code with if __name__ == "__main__", and I run it by simply pressing Run in my IDE.
It works, does everything I expect it to.
However, at work, we have this server (I believe it uses C#) which takes Python scripts and executes them. When I upload my script to this server, the multiprocessing part of the code fails.
It is hard for me to tell what's going on (I do not have access to the server, so I have limited information to work with, and really this problem is not my responsibility, so I am asking purely out of curiosity). However, it seems that the code does not throw an error, but rather gets stuck in an infinite loop of some sort, creating and ending new sessions over and over. No work happens inside these sessions; they just begin and end.
Moreover, the code never actually enters the multiprocessed part of the code (i.e. the functions that I map to the Pool), since if it did, it would fail: for debugging, I put a raise Exception at the very start of the code that is meant to run in parallel, just to check if it ever reaches it … but it never does.
Any clue what is going on, and how it is meant to be fixed?
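For reference, a minimal sketch of the guard pattern the asker describes, with a hypothetical square function standing in for the real mapped work. Under the spawn start method, any Pool creation that is reachable at import time makes each freshly spawned worker re-execute it, which produces exactly this kind of loop of sessions that begin and end without doing work:

from multiprocessing import Pool

def square(x):
    # hypothetical stand-in for the function actually mapped to the Pool
    return x * x

if __name__ == "__main__":
    # Everything that creates processes stays behind the guard, so a
    # spawned worker importing this module only defines square() and
    # finishes its import cleanly instead of creating another Pool.
    with Pool() as pool:
        print(pool.map(square, range(10)))

If the server executes uploaded scripts in a way that never sets __name__ to "__main__" (for example, by importing them), the guarded block is skipped entirely, which would also explain why the raise Exception inside the parallel code is never reached.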

Implementing a dead man's switch for running code

I have some Python code that's a bit buggy. It runs in an infinite loop, and I expect it to print something about once every 50 milliseconds, but sometimes it hangs and stops printing, or even outright segfaults. Unfortunately, the problem seems to be fairly rare, so I've had trouble nailing down its exact cause. I want to keep the code up and running while I debug the problem, so in the meantime I'd like to create a dead man's switch that runs my code, but stops it if it doesn't print anything within a certain time frame (say, 5 seconds) or exits, and then executes a command to notify me that something went wrong (e.g. 'spd-say terminated').
I put this into the terminal for my first attempt at this:
python alert.py; spd-say terminated;
Unfortunately, it didn't seem to work - at this point I realized that the code was not only crashing but also hanging (and I'm also not sure whether this approach would even work if the code crashes). I'm not very familiar with bash yet (I assume that's what I'm using when I run stuff in the terminal), and I'm not sure how I could set up something like what I want. I'm also open to using things other than bash if doing this in bash would be particularly difficult for some reason.
What would be the best way to implement what I want to do?
You could run two Python programs with a pipe between them. On one side, your buggy script writes something to the pipe at least once every 5 seconds. On the receiving end, a very simple watchdog script checks how long it has been since it last received anything; if that time exceeds 5 seconds, it kills the buggy script and runs your notification command.
This way you decouple your watchdog from your buggy script.
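A minimal sketch of that idea in Python, assuming the script and notification command from the question (alert.py and spd-say); the reader thread and queue are details of this sketch, not a prescribed design:

import subprocess
import threading
import queue

# Run the buggy script as a child process and capture its stdout.
proc = subprocess.Popen(["python", "alert.py"], stdout=subprocess.PIPE, text=True)
lines = queue.Queue()

def reader():
    for line in proc.stdout:
        lines.put(line)

threading.Thread(target=reader, daemon=True).start()

while True:
    try:
        print(lines.get(timeout=5), end="")  # forward normal output
    except queue.Empty:
        proc.kill()  # hung: nothing printed for 5 seconds
        subprocess.run(["spd-say", "terminated"])
        break
    if proc.poll() is not None and lines.empty():
        subprocess.run(["spd-say", "terminated"])  # crashed or exited
        break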

Debug a Python program which seems paused for no reason

I am writing a Python program to analyze log files. Basically, I have about 30000 medium-size log files, and my Python script is designed to perform some simple (line-by-line) analysis of each one. It takes roughly less than 5 seconds to process one file.
So once I set up the processing, I just left it there. When I came back after about 14 hours, my Python script had simply paused right after analyzing one log file; it seems it never wrote the analysis output for that file to the file system, and that's it. No further progress.
I checked the memory usage, which seems fine (less than 1G), and I also tried writing to the file system (a touch test), which works as normal. So my question is: how should I proceed to debug the issue? Could anyone share some thoughts on that? I hope this is not too general. Thanks.
You may use the standard-library trace module ("Trace or track Python statement execution") and/or pdb, The Python Debugger.
Try the tool https://github.com/khamidou/lptrace with the command:
sudo python lptrace -p <process_id>
It will print every Python function your program invokes and may help you see where your program is stuck or looping forever.
If it does not output anything, your program is probably stuck, so try
pstack <process_id>
to check the stack trace and find out where it is stuck. The output of pstack is C frames, but I believe you can still find something useful in it to solve your problem.
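Another option the answer above doesn't mention: the standard-library faulthandler module can periodically dump every thread's Python stack, so if the script stalls again after hours, the last dump shows exactly which line it is sitting on. A minimal sketch, placed at the top of the long-running script:

import faulthandler
import sys

# Dump all thread stacks to stderr every 600 seconds until the process
# exits; harmless while the program is healthy, invaluable if it hangs.
faulthandler.dump_traceback_later(600, repeat=True, file=sys.stderr)

# ... the 30000-file processing loop would run here ...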

Possible to run a delayed code execution?

Is it possible to run a small set of code automatically after a script has run?
I am asking this because, for some reason, if I add this set of code to the main script, it works, but it displays a list of errors (the item is already there, yet it states that it cannot find it, or something of that sort).
I realized that after running my script, Maya seems to 'load' its own refresh routine, along with some plugins made by my company. As such, if I run the small set of code after my main script execution and the Maya/plugin 'refresher', it works with no problem. I'd like to make the process as automated as possible, all within a single script if that is possible...
So, is it possible to do this? Some sort of delayed execution method?
FYI, the main script's execution time depends on the number of elements in the scene. The more there are, the longer it takes...
Maya has a command, maya.cmds.evalDeferred, that is meant for exactly this purpose. It waits until no more Maya processing is pending and then evaluates itself.
You can also use maya.cmds.scriptJob for the same purpose.
Note: while eval is generally considered dangerous and insecure, in a Maya context it's really quite normal, mainly because everything in Maya is inherently insecure: nearly all GUI items are just eval commands that the user may modify. So the second you let anybody use your Maya shell, your security is breached.
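A minimal sketch of deferring a snippet until Maya is idle (this requires a running Maya session; post_setup is a hypothetical stand-in for the small set of code in question):

import maya.cmds as cmds

def post_setup():
    # the code that currently errors when run inline with the main script
    print("runs after Maya finishes its pending processing")

# evalDeferred accepts either a Python callable or a string to evaluate.
cmds.evalDeferred(post_setup)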
