Related question on SO (asked by myself earlier today): Why does error traceback show edited script instead of what actually ran? Now I know why it happens; what I want to know is how to deal with it.
I see that questions like How do I debug efficiently with spyder in Python? and How do I print debug messages in the Google Chrome JavaScript Console? are well received, so I suppose asking about debugging practices is on-topic, right?
Background
I write a script that raises an exception at line n, run it from the terminal, add a line in the middle while the script is still running, and save the modified file. So the script file is modified while the interpreter is running it; in particular, the line number of the very line that will raise the exception has changed. The error traceback reported by the Python interpreter then shows me line n of the "modified" version of the script, not of the actual "running" version.
Minimal Example
Let's say I run a script:
import time
time.sleep(5)
raise Exception
and while the interpreter is stuck at time.sleep(5), I add a line after that one.
So now I have:
import time
time.sleep(5)
print("Hello World")
raise Exception
Then the interpreter wakes up from the sleep, the next command, raise Exception, is executed, and the program terminates with the following traceback:
Traceback (most recent call last):
  File "test/minimal_error.py", line 4, in <module>
    print("Hello World")
Exception
So it correctly reports the line number (counted in the original script, so actually useless if we only have the modified one) and the error message ("Exception"). But it shows a totally wrong line of source code: to be of any help, raise Exception should be displayed, not print("Hello World"), which wasn't even executed by the interpreter.
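This happens because the traceback machinery reads source lines from the file on disk (via the linecache module) at the moment the traceback is rendered, not when the code is compiled. A short sketch demonstrates the effect; the file name demo_script.py is arbitrary:

import linecache
import traceback

# Write the "original" script: a single line that raises.
with open("demo_script.py", "w") as f:
    f.write("raise Exception\n")

source = open("demo_script.py").read()
try:
    exec(compile(source, "demo_script.py", "exec"))
except Exception:
    # "Edit" the file after the exception was raised but before the
    # traceback is formatted.
    with open("demo_script.py", "w") as f:
        f.write('print("Hello World")\n')
    linecache.checkcache("demo_script.py")  # drop stale cached lines, if any
    traceback.print_exc()  # line 1 is displayed as print("Hello World")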
Why this matters
In real practice, I implement one part of a program, run it to see if that part is doing fine, and while it is still running, I move on to the next thing I have to implement. And when the script throws an error, I have to find which actual line of code caused the error. I usually just read the error message and try to deduce the original code that caused it.
Sometimes it isn't easy to guess, so I copy the script to the clipboard, roll back the code by undoing everything I've written since running the script, check the line that caused the error, and paste the clipboard back. This is sometimes very annoying, because it isn't always possible to remember the exact state the script was in when I ran it. ("Do I need to undo more to roll back? Or is this the exact script I ran?")
Sometimes the script runs for more than ten minutes, or even an hour, before it raises an exception. In such cases, "rollback by undo" is practically impossible. Sometimes I don't even know how long a script will run before actually running it, and I obviously can't just sit there keeping my script unmodified until it terminates.
Question
By what practice can I correctly track down the command that caused the exception?
One hypothetical solution is to copy the script to a new file every time I want to run it, run the copied version, and keep editing the original one. But I think this is too bothersome to do every ten minutes, whenever I need to run a script to see if it works well.
Another way is to git-commit every time I want to run it, so that I can come back and see the original version when I need to, but this will make the commit history very dirty, so I think this is even worse than the other one.
I also tried python -m pdb script.py, but it shows the same "modified version of line n", just as the plain traceback does.
So is there any practical solution I can practice, say, every ten minutes?
Instead of committing every time you run the script, simply use git stash; this way you will not add dirty commits to your history.
So before you run a script, git stash your local changes; when an error appears, inspect it against the code that actually ran; then git stash pop.
Read more in the git stash documentation.
This solution assumes that the running script is at the HEAD of the current branch.
Another solution, if the above condition doesn't apply, is to create an arbitrary branch, call it running-script, git stash the local changes that are not yet committed, check out the new branch, git stash apply, and run the script. Then check out your original branch again, re-apply the stash, and resume your work.
You could write a bash script that automates this process, as follows:
git stash
git checkout -b running-script # branch name: potential param
git stash apply # re-create the uncommitted changes on the new branch
git commit -am "snapshot of the running script" # freeze the exact code that will run
RUN script # replace with the actual command to run the script in the background
git checkout original-branch # branch name: potential param
git stash pop # restore your uncommitted changes and drop the stash
You could have the running-script and original-branch passed to the bash file as params.
@chepner's comment is valid:
I'm pretty sure the practical solution is "don't do this". Don't modify running code.
As a relatively simple workaround, you could accomplish this with a bash script (or similar scripted approach in whatever environment you are using if bash isn't available).
For bash, a script like the one below would work. It takes the filename as a parameter and uses date to create a unique temporary filename, then copies the file to it and executes it. In this manner, you always have a static copy of the running code and you can use aliasing to make it trivial to use:
filename="$1"
# extract file name and extension
extension="${filename##*.}"
filename="${filename%.*}"
# create a unique temporary name (using date)
today=$(date +%Y-%m-%d-%H:%M:%S) # or whatever pattern you desire
newname="$filename-$today.$extension"
# copy and run the python script (quoting guards against spaces in names)
cp "$1" "$newname"
echo "Executing from $newname..."
/path/to/python "$newname"
# clean it up when done, if you care to
rm "$newname"
You could then alias this to python if you want so you don't have to think about doing it, with something like this in your .bashrc or .bash_aliases:
alias python="source path/to/copy_execute.sh"
Although it may be better to give it a different name, like
alias mypy="source path/to/copy_execute.sh"
Then, you can run your script, modify, and run some more with mypy myscript.py and you won't ever be editing the currently executing code.
One drawback is that while this script will clean up and delete the files after it is done running, it will create a lot of temp files that stick around while it runs. To get around this, you could always copy to somewhere in /tmp or elsewhere where the temporary files won't get in the way. Another issue is that this gets more complicated for large code bases, which you may not want to copy all over the place. I'll leave that one to you.
A similar approach could be crafted for Windows with powershell or cmd.
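The same idea also ports to Python itself, which makes it cross-platform for free. Below is a minimal sketch under that assumption; the run_copy helper and the temp-file handling are my own choices, not part of the original script:

import os
import shutil
import subprocess
import sys
import tempfile

def run_copy(script_path):
    # Create a unique temp file, copy the script into it, run the
    # copy, and delete it afterwards.
    fd, temp_name = tempfile.mkstemp(suffix=".py")
    os.close(fd)
    shutil.copy(script_path, temp_name)
    print("Executing from %s..." % temp_name)
    try:
        subprocess.run([sys.executable, temp_name])
    finally:
        os.remove(temp_name)

if __name__ == "__main__":
    run_copy(sys.argv[1])

Because the copy lives in the system temp directory, it also sidesteps the clutter of temp files mentioned above.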
I'm probably going to give an oversimplified answer, and it may not be applicable in every scenario.
Use PyCharm
I usually work with code that takes minutes to hours to finish, and I constantly need to run it to see how it is performing while I continue coding. If it fails, I get the original line that threw the error.
I also have to run it on a GUI-less Ubuntu server, so this is how I do it to get the right error every time:
Code in PyCharm.
Test in PyCharm and continue coding. (I get the right error if it fails.)
Once I'm comfortable with the performance, I move it to the server and run it again. (I also get the right error there.)
I am not saying the problem will be completely avoided, but you can reduce it.
If you are coding all your logic in a single file, stop doing that.
Here are a few recommendations:
Split your code logic into multiple files, for example:
utility,
helper,
model,
component,
train,
test,
feature
Keep each function as small as about 10 lines, if possible.
A class should not be more than about 125 lines.
A file should not exceed about 150 lines.
Now if an exception occurs, the traceback will likely spread across several files, and chances are that not all of those files were modified at once while implementing your changes.
The good news is that if the exception started in a file you have not changed, it is easy to catch the offending line and fix it; otherwise it still takes only minimal effort to find the exact line.
If you are also using git and have not committed yet, you can compare revisions to recover the exact code that might be causing the error.
Hope this minimizes your problem.
Related
I am writing a Python program to analyze log files. Basically I have about 30000 medium-size log files, and my Python script is designed to perform some simple (line-by-line) analysis of each one. It takes less than 5 seconds to process one file.
Once I set up the processing, I just left it running, and when I came back after about 14 hours my Python script had simply paused right after analyzing one log file; it seems the analysis output for that file was never written to the file system, and that's it. No further progress.
I checked the memory usage and it seems fine (less than 1 GB); I also tried writing to the file system (a touch test), which works as normal. So my question is: how should I proceed to debug this issue? Could anyone share some thoughts? I hope this is not too general. Thanks.
You may use the trace module ("Trace or track Python statement execution") and/or pdb, The Python Debugger module.
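For example, here is a minimal sketch of the trace module, where slow_function stands in for your real entry point; the command-line equivalent is python -m trace --trace script.py:

import trace

def slow_function():
    total = 0
    for i in range(3):
        total += i
    return total

tracer = trace.Trace(count=0, trace=1)  # print each line as it executes
tracer.run("slow_function()")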
Try the tool lptrace (https://github.com/khamidou/lptrace) with the command:
sudo python lptrace -p <process_id>
It will print every Python function your program invokes, and may help you understand where your program is stuck or whether it is in an infinite loop.
If it does not output anything, your program is probably stuck, so try
pstack <process_id>
to check the stack trace and find out where it is stuck. The output of pstack is C frames, but I believe you can still find something useful to solve your problem.
I have created a Python program that gets input files from a Windows folder and updates an Excel sheet every 15 minutes. The program is always open, running in the background.
The program ran properly for 2 weeks, then suddenly closed with the error message "A problem caused the program to stop working correctly and it was closed". I checked the log files and didn't see any error message.
I checked the Windows Event Viewer and the error was present with the text below, which I could not interpret properly. Can anyone please tell me the possible causes of the error?
Program.exe
0.0.0.0
5a2e9e81
python36.dll
3.6.5150.1013
5abd3161
c00000fd
0000000000041476
1ba8
01d45e9fe43cba57
C:\Python code\program.exe
C:\Users\aisteam\AppData\Local\Temp\2_MEI51602\python36.dll
a9da018c-e2e3-4821-9387-cce82ff29186
Make sure that your Python code robustly handles errors such as the file it wants to update being locked, which is what Excel does by design while the file is open in Excel. You could easily make your code create a new Excel file each time, or wait until the file isn't locked and then update it. Either way, you need to make your code better at telling you what it is doing: e.g. by logging its activity (important to implement now, because the logging needs to be in place before your code stops unexpectedly for, err, an unexpected reason), and by carefully managing exceptions (i.e. don't simply code try/except: pass!).
BUT don’t do this sort of code with an unconditional except and nothing but a pass in the except: statement) because it will make errors HARDER to figure out:
try:
    something
except:
    pass
Always be specific about the exception you expect, and even if you are not going to re-raise, always, always, always log the exception.
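A minimal sketch of what "specific exception plus logging" can look like; update_excel, the file name, and the retry policy are placeholders of mine, not details from the question:

import logging
import time

logging.basicConfig(filename="updater.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

def update_excel(path):
    # Stand-in for the real "update the sheet" logic.
    with open(path, "a") as f:
        f.write("updated\n")

for attempt in range(3):
    try:
        update_excel("report.xlsx")
        logging.info("update succeeded")
        break
    except PermissionError:
        # Typically what you see on Windows while Excel holds the file
        # open; logging.exception records the full traceback.
        logging.exception("file locked, retrying in 60 s")
        time.sleep(60)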
So this one is a doozie, and a little too specific to find an answer online.
I am writing to a file in C++ and reading that file in Python at the same time to move a robot. Or trying to.
When I try running both programs at the same time, the C++ one runs first and then the Python one runs.
Here's the command I use:
./ColorFollow & python fileToHex.py
This happens even if I switch the order of commands.
Even if I run them in different terminals (which is the same thing, just covering all bases).
Both the Python and C++ code read / write in 'infinite' loops, so these two should run until I say stop.
The code works fine; when the Python script finally runs the robot moves as intended. It's just that the code doesn't run at the same time.
Is there a way to make this happen, or is this impossible?
If you need more information, lemme know, but the code is pretty much what you'd expect it to be.
If you are using Linux, & runs ./ColorFollow in the background, so ColorFollow and fileToHex.py run as separate, concurrent processes.
At the same time, the composition ./ColorFollow | python fileToHex.py looks interesting, because it redirects the stdout of ColorFollow to the stdin of fileToHex.py; it can synchronize the scripts, with ColorFollow printing some code string upon exit, which fileToHex.py then reads and exits as well.
I would create an empty file like /var/run/ColorFollow.flag and write 1 to it when one of the processes exits. Not a pipe, because we do not care which process starts first. So if the next loop step of ColorFollow sees 1 in the file, it deletes the file and exits (meaning fileToHex already exited). The same goes for fileToHex: check the flag file on each loop step and, if it exists, delete it and exit.
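A sketch of that handshake for the Python side is below; the C++ side would mirror the same checks. The flag path is whatever both processes agree on (it must be writable, so /tmp is used here instead of /var/run):

import os
import time

FLAG = "/tmp/ColorFollow.flag"

def do_one_step():
    # Stand-in for one iteration of the real read-file-and-move-robot loop.
    time.sleep(0.1)
    return True  # return False when this process decides to stop

while True:
    if os.path.exists(FLAG):
        os.remove(FLAG)  # the other process exited first, so we stop too
        break
    if not do_one_step():
        with open(FLAG, "w") as f:
            f.write("1")  # tell the other process we are done
        break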
I have a project that requires an endless loop, which I wrote in Python. The script loops through an array of variables (created once at the beginning of the script), performs a task using each one, and starts over. This works fine 99.9% of the time, but every once in a while the electrons get stuck and it crashes. THIS IS NOT AN ISSUE, and is caused by hardware constraints, not my script. The error my Python script outputs is as follows, and occurs when the Arduino on the other side of the I2C bus is busy doing something else and unable to respond:
Traceback (most recent call last):
  File "i2cGet.py", line 105, in <module>
    i2c.i2cWrite(0x54, 1, fanCurrent)
  File "/home/pi/arduino_Interface/i2c.py", line 43, in i2cWrite
    bus.write_byte(address,data)
IOError: [Errno 5] Input/output error
To take care of this, I have a bash "starter" that restarts the Python script when it crashes, using until python script.py; do. This also works just fine, but I am working on the event logging for this process and need a way of knowing which variable in the array my Python script crashed on, so I can insert it into the log. Presumably this would be done with a custom event handler in Python, but I am having a hard time figuring out what needs to be done to accomplish this.
The question in summary: How do I create a custom event handler in Python, and then retrieve the event for use in my bash starter script?
Please excuse my terminology for event/event handler. I am simply looking for a way, in the event of an error, to pass along the last element in the array I was working with: in this case the 0x54 in the i2cWrite call, though it may not always appear there if the script crashes somewhere else in the loop. Perhaps the more correct term is error handler, not event handler...
At this point, the only reason I am doing this in bash is that I had a starter script I used for persistence in a previous project that worked well, and it was easy to adapt it to fit this project. If it is easier to handle this with a Python starter, I can easily (I think) port this over.
Thanks for your help!
until event=$(python script.py)
do
...
done
The Python script should print the event to stdout; you can then refer to $event in the body of the loop.
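On the Python side, that can look something like the sketch below, where i2c_write is a stand-in for the real i2c.i2cWrite and the failure is simulated at random:

import random
import sys

def i2c_write(address, register, value):
    # Stand-in for i2c.i2cWrite; fails occasionally like a busy Arduino.
    if random.random() < 0.01:
        raise IOError("[Errno 5] Input/output error")

addresses = [0x54, 0x55, 0x56]  # the array created at script start

while True:
    for address in addresses:
        try:
            i2c_write(address, 1, 42)
        except IOError:
            print(hex(address))  # captured as $event by the bash loop
            sys.exit(1)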
I'm writing a Python script that performs a series of operations in a loop, by making subprocess calls, like so:
os.system('./svm_learn -z p -t 2 trial-input model')
os.system('./svm_classify test-input model pred')
os.system('python read-svm-rank.py')
score = os.popen('python scorer.py -g gold-test -i out').readline()
When I make the calls individually one after the other in the shell they work fine. But within the script they always break. I've traced the source of the error and it seems that the output files are getting truncated towards the end (leading me to believe that calls are being made without previous ones being completed).
I tried with subprocess.Popen and then using the wait() method of the Popen object, but to no avail. The script still breaks.
Any ideas what's going on here?
I'd probably first rewrite a little to use the subprocess module instead of the os module.
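For example, here is a sketch of the same four calls using subprocess.run (Python 3.7+); check=True turns a non-zero exit status into a CalledProcessError, so no later step runs against a truncated output file:

import subprocess

subprocess.run(["./svm_learn", "-z", "p", "-t", "2", "trial-input", "model"],
               check=True)
subprocess.run(["./svm_classify", "test-input", "model", "pred"], check=True)
subprocess.run(["python", "read-svm-rank.py"], check=True)

result = subprocess.run(["python", "scorer.py", "-g", "gold-test", "-i", "out"],
                        check=True, capture_output=True, text=True)
score = result.stdout.splitlines()[0] if result.stdout else ""  # like readline()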
Then I'd probably scrutinize what's going wrong by studying a system call trace:
http://stromberg.dnsalias.org/~strombrg/debugging-with-syscall-tracers.html
Hopefully there'll be an "E" error code near the end of the file that'll tell you what error is being encountered.
Another option would be to comment out subsets of your subprocesses (assuming the n+1th doesn't depend heavily on the output of the nth), to pin down which one of them is having problems. After that, you could sprinkle some extra error reporting in the offending script to see what it's doing.
But if you're not put off by C-ish syscall traces, that might be easier.