What does a good Python debugging workflow look like?

My current Python debugging workflow feels extremely slow and not very satisfying. How can I improve it?
Setting: I work with some third-party python packages from github.
Workflow:
Run into an error after entering some command in the terminal (Ubuntu WSL, Python 3.7).
Read the error output in the terminal; most likely the first or last message is the helpful one.
From the last message I take the code reference (Ctrl+left click in VS Code) and look at the code.
I find some function call in the third-party module that looks entirely unrelated to the problem.
I add import pdb to the module, and a pdb.set_trace() before that function call.
I run the program again, and it stops at the breakpoint.
Using n, r, u and d I try to navigate closer to the source of the error.
I eventually find the condition that raises the error in some other module, where some property of a certain variable is checked. The variable itself is defined some levels up in the stack.
Re-running the program and stopping at the same breakpoint as before, I try to navigate to the point where the variable is set. I don't know at which level of the stack it is set, so I sometimes miss it. I set intermediate breakpoints to save some work when re-running.
I finally find the actual cause of the error. I can check out the workspace and eventually fix the error.
I go through all the modules and remove the import pdb and pdb.set_trace() lines.
Thanks for any suggestions

Are you using an IDE? That is not fully clear from your question.
IDEs tend to have graphical ways of setting breakpoints and stepping,
which saves the hassle of changing the source.
Without going into IDE opinions, examples of IDEs with debuggers are Spyder, Thonny and others.
You can also run the debugger from the command line to avoid changing the source, but I don't think that's the way to go if you are looking to reduce the cognitive load.
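One way to avoid both editing third-party source and stepping blindly is to inspect the traceback after the error has happened. From the command line, python -m pdb -c continue yourscript.py runs the script and drops into post-mortem mode on an uncaught exception. The sketch below shows a related idea in plain Python: walking the traceback to find which stack level holds a given variable. The functions load_config and validate and the variable name threshold are hypothetical stand-ins for third-party code.

```python
import traceback


def frames_with_local(exc, name):
    """Return (function name, line number, value) for each traceback frame holding `name`."""
    hits = []
    for frame, lineno in traceback.walk_tb(exc.__traceback__):
        if name in frame.f_locals:
            hits.append((frame.f_code.co_name, lineno, frame.f_locals[name]))
    return hits


# Hypothetical stand-ins for third-party code.
def load_config():
    threshold = -1  # the bad value originates here, one level above the check
    return validate(threshold)


def validate(value):
    if value < 0:
        raise ValueError("value must be non-negative")
    return value


try:
    load_config()
except ValueError as exc:
    hits = frames_with_local(exc, "threshold")
    for func_name, lineno, value in hits:
        print(f"{func_name} (line {lineno}): threshold = {value!r}")
```

Inside a post-mortem pdb session the same information is reachable with u and d, but a helper like this answers "at which level is it set?" in one pass.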

Yes, these are the things you have to do. In addition, you can add logging wherever applicable to pinpoint exactly where the error occurred.
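A minimal sketch of what such logging could look like, assuming you control the functions you want to observe. The logger name debug_workflow, the decorator log_calls and the function scale are made up for illustration.

```python
import functools
import logging

logger = logging.getLogger("debug_workflow")  # hypothetical logger name


def log_calls(func):
    """Log arguments, return value and any exception of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("calling %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the message together with the traceback
            logger.exception("%s raised", func.__name__)
            raise
        logger.debug("%s returned %r", func.__name__, result)
        return result
    return wrapper


@log_calls
def scale(value, factor=2):  # hypothetical example function
    return value * factor
```

To see the messages, configure logging once at program start, for example with logging.basicConfig(level=logging.DEBUG).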

Related

how to make vscode detect / auto reload modules after editing them?

I've seen a few questions asking this, but none of the solutions worked for me.
I am developing a few functions/classes in different modules and have a main.py script that calls everything.
The problem is, when I make a change to a function in another module i.e. module1.py, VSCode does not detect the changes when I call the function in main.py after updating, it's still the older version.
I can get around this by doing something like:
from importlib import reload
reload(module1)
but this gets old real quick especially when I'm importing specific functions or classes from a module.
Simply re-running the imports at the top of my main.py doesn't actually do anything. The changes are only picked up if I kill the shell and reopen it from the beginning, which is not ideal if I am incrementally developing something.
I've read on a few questions that I could include this:
"files.useExperimentalFileWatcher" : true
into my settings.json, but it does not seem to be a known configuration setting in my version, 1.45.1.
This is something Spyder handles by default, and makes it very easy to code incrementally when calling functions and classes from multiple modules in the pkg you are developing.
How can I achieve this in VSCode? To be clear, I don't want to use IPython autoreload magic command.
Much appreciated
FYI here are the other questions I saw, but did not get a working solution out of, amongst others with similar questions/answers :
link1
link2
There is no support for this in VS Code as Python's reload mechanism is not reliable enough to use outside of the REPL, and even then you should be careful. It isn't a perfect solution and can lead to stale code lying about which can easily trip you up (and I know this because I wrote importlib.reload() 😁).
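The stale-code problem this answer warns about can be illustrated with a small, self-contained sketch. The module name module1_demo and the function greet are made up for the demonstration and are written to a temporary directory.

```python
import importlib
import pathlib
import sys
import tempfile

# Create a throwaway module on disk (hypothetical name: module1_demo).
tmp_dir = tempfile.mkdtemp()
module_path = pathlib.Path(tmp_dir) / "module1_demo.py"
module_path.write_text("def greet():\n    return 'old'\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import module1_demo
from module1_demo import greet  # binds the current function object to a separate name

# "Edit" the module. The new source has a different length, so any cached
# bytecode from the first import is treated as out of date.
module_path.write_text("def greet():\n    return 'edited'\n")
importlib.reload(module1_demo)

via_module = module1_demo.greet()  # attribute lookup on the reloaded module
via_old_name = greet()             # the name bound before the reload

print("module1_demo.greet():", via_module)
print("greet():", via_old_name)
```

Names imported with from ... import keep pointing at the objects that existed before the reload, which is the situation the question describes when importing specific functions or classes.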

Detect all missing imports in PyCharm/Python

Is there a simple way, preferably in PyCharm (2017.1) but via command-line python (3.5) if necessary, to detect all code places where a statement is referring to an unresolved reference, e.g. because an import statement is missing?
I am new to Python/PyCharm. More generally, any syntax errors or anything in a similar vein would be a bonus. All I am looking for is the kind of errors I would get if I were "compiling" and "linking" in another language.
I have looked at Can PyCharm list all of Python errors in a project? and PyCharm's "Inspect Code". It is way more complex than I had in mind (and takes ages to run). I see that Python Rope: How to Find all missing imports and errors in all sub modules refactoring recommends pylint, but I wasn't looking for lint-like. I just want darn-obvious errors!
I am tasked with porting a fair-size (32K lines) application, which (apparently) runs under Windows, to Linux. The first thing I want to do is get rid of some of the imports all over the place. If my application executes a line which then has an unresolved reference I get a runtime error, but I want to pick them all up at edit-time. And there will be paths of code which are Windows-only, but I still want to know of any errors like this.
To answer my own question:
From Can PyCharm list all of Python errors in a project?, you can indeed use Code|Inspect Code to get all these errors/warnings in PyCharm as a list where you can click to get to the code. It does take a long time, but at least it's built into PyCharm, and the errors reported correspond to what you see in the editor window.
PyCharm will list all errors and 'warnings' for each source file at the right-hand side of the editor window.
They are represented as short lines or small blocks, depending on the size of the error or 'warning'. Errors are shown in red.
You click on them to take you to the place of the problem in the source.
Warnings are mostly Python style-guide violations (PEP 8).
Another option is to use pylint. This is lint for Python. This will detect missing imports.
You can integrate it into PyCharm by following the instructions in https://stackoverflow.com/a/46409649/4459346
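If only error-level findings are wanted, pylint can be run with its --errors-only option. For the narrower porting question of which imported modules cannot be found on the current platform, a rough sketch using only the standard library is shown below. It checks top-level absolute imports only and does not detect undefined names; the function name unresolved_imports and the sample source are assumptions for illustration.

```python
import ast
import importlib.util


def unresolved_imports(source):
    """Return sorted top-level module names imported in `source` that cannot be located."""
    missing = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # not an import, or a relative import (skipped in this sketch)
        for name in names:
            top_level = name.split(".")[0]
            try:
                found = importlib.util.find_spec(top_level) is not None
            except (ImportError, ValueError):
                found = False
            if not found:
                missing.add(top_level)
    return sorted(missing)


# Hypothetical sample source; in practice read each .py file of the project.
sample = "import os\nimport json.decoder\nfrom no_such_module_xyz import thing\n"
result = unresolved_imports(sample)
print(result)
```

Because the source is only parsed and never executed, imports inside Windows-only code paths are reported as well.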

SPSS Python Custom Extension Command Debugging

I have created a Custom Extension Command in Python. I installed it, but as expected I am getting errors (quote from SPSS log output - the only way I know for debugging Python programs in SPSS):
Extension command TEST_EXTENSION could not be loaded. The module or a module that it requires may be missing, or there may be syntax errors in it.
The error is probably from the .xml or from the Run(args) function. The CustomFunction() I am implementing was tested thoroughly.
What would be a good practice for debugging this, and the other potential errors ? The official IBM-SPSS-Statistics-Extension-Command says to
set the
SPSS_EXTENSIONS_RAISE
environment variable to "true"
but I don't know how to do that, nor whether this will work regardless of the source of the error.
#horace
You set the environment variable on Windows via the Control Panel > System > Advanced system settings > Environment Variables. The exact wording varies with different Windows versions. I usually choose System variables, although either will usually work. You need to restart Statistics after that. Once you have set this variable, errors in the Python code will produce a traceback. The traceback is ordinarily suppressed as it is of no use to users, but it is very helpful for developers.
The traceback only appears for errors in the Python code. The "could not be loaded" error you reported happens before Python gets control, so no traceback would be produced. There are two common causes for this error. The first is that the xml file defining the extension command or the corresponding Python module was not found by Statistics. The extension command definitions are loaded at Statistics startup or by running the EXTENSION command. Execute SHOW EXT. from the Syntax Editor to see the places where Statistics looks for extension files.
The second cause is a syntax error in the Python code. Run
begin program.
import yourmodule
end program.
to see if any errors are reported.
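Outside of Statistics, the same import check can be sketched in plain Python so that the full traceback is visible. The helper name check_import is made up, and no_such_module_xyz stands in for a module that fails to load.

```python
import importlib
import traceback


def check_import(module_name):
    """Try to import `module_name`; return None on success or the traceback text."""
    try:
        importlib.import_module(module_name)
    except Exception:  # covers SyntaxError and ImportError among others
        return traceback.format_exc()
    return None


ok_report = check_import("json")
missing_report = check_import("no_such_module_xyz")  # hypothetical failing module
if missing_report:
    print(missing_report)
```

A syntax error in the module would show up the same way, with the file name and line number in the traceback.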
More generally, there are two useful strategies for debugging. The first is to run the code in external mode, where you run the code from Python. That way you can step through the code using your IDE or the plain Python debugger. See the programmability documentation for details. There are some limitations on what can be done in external mode, but it is often a good solution.
The second is to use an IDE that supports remote debugging. I use Wing IDE, but there are other IDEs that can do this. That lets me jump into the debugger from within Statistics, step through the Python code, and do all the other things you want in a debugger.
HTH

Is there a way to see which lines of the code are currently running?

I guess this is kind of a weird question, but let's say you run some Python code that does something computationally expensive, like image processing. I'm running Ubuntu 12.04, by the way. So I'm running the code, and I open another terminal and type top to see what's doing what. This is OK as it tells me that Python is doing its job, but what if I want to see which line of the code is being run? Is this possible? More importantly, is it worth it to get this information? I can post some sample code of the processing if necessary.
Don't blink, unless your "line of code" is unbelievably slow there is no way for such a thing to be useful. What you probably want is a Python Profiler. I suggest you start looking in http://docs.python.org/2/library/profile.html for info related to profiling your python code.
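A minimal profiling sketch with the standard library's cProfile and pstats modules. The function slow_sum is a made-up stand-in for the expensive processing.

```python
import cProfile
import io
import pstats


def slow_sum(n):  # hypothetical stand-in for the expensive work
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200_000)
profiler.disable()

# Collect the report in a string: the five entries with the highest cumulative time.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

For a whole script, python -m cProfile -s cumulative yourscript.py gives a similar report without changing the code.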
It is usually very slow, but you can trace your code:
python -m trace --count -C . somefile.py ...
A more manual but traditional way is logging: you can insert print statements before and after slow operations.
You can find slow places in your code using a profiler.
And you can run your code step by step with a debugger. Just insert import pdb; pdb.set_trace() (or ipdb if you like IPython) before the slow operation.
This is the classic use case for a debugger. Have a look at Eclipse with the PyDev plugin, which is an IDE for Python with a useful debugger integration.
For example, a debugger allows you to set breakpoints where the execution will stop in order to let you manually step through the relevant lines of code to see how it goes. At the same time, you can inspect the variables' contents. You will thereby get a better understanding of what is happening, where and why it fails, and so on.
Go and get yourself a debugger!
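For the original question of seeing which line is executing right now, one possible sketch samples the stack of a running thread with sys._current_frames(). The names busy_work and sample_stack are made up, and the worker thread stands in for the long-running processing.

```python
import sys
import threading
import time
import traceback


def busy_work(stop_event):  # hypothetical long-running job
    while not stop_event.is_set():
        sum(i * i for i in range(1000))


def sample_stack(thread_id):
    """Return the current call stack of the given thread as a list of frame summaries."""
    frame = sys._current_frames().get(thread_id)
    return traceback.extract_stack(frame) if frame is not None else []


stop = threading.Event()
worker = threading.Thread(target=busy_work, args=(stop,))
worker.start()
time.sleep(0.1)  # let the worker get going before sampling

stack = sample_stack(worker.ident)
stop.set()
worker.join()

for entry in stack:
    print(f"{entry.filename}:{entry.lineno} in {entry.name}")
```

On Linux, the standard library's faulthandler module offers a related option: after faulthandler.register(signal.SIGUSR1), sending that signal to the process from another terminal makes it print the current traceback of each thread.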

Python pdb not breaking in files properly?

I wish I could provide a simple sample case that occurs using standard library code, but unfortunately it only happens when using one of our in-house libraries, which in turn is built on top of SQLAlchemy.
Basically, the problem is that this break command:
(Pdb) print sqlalchemy.engine.base.__file__
/prod/eggs/SQLAlchemy-0.5.5-py2.5.egg/sqlalchemy/engine/base.py
(Pdb) break /prod/eggs/SQLAlchemy-0.5.5-py2.5.egg/sqlalchemy/engine/base.py:946
Is just being totally ignored, it seems, by pdb. As in, even though I am positive the code is being hit (both because I can see log messages, and because I've used sys.settrace to check which lines in which files are being hit), pdb is just not breaking there.
I suspect that somehow the use of an egg is confusing pdb as to what files are being used (I can't reproduce the error if I use a non-egg'ed library, like pickle; there everything works fine).
It's a shot in the dark, but has anyone come across this before?
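The kind of check with sys.settrace mentioned above can be sketched roughly as follows. This is an illustration, not the code used in the question; record_lines and target are made-up names.

```python
import sys


def record_lines(func, *args, **kwargs):
    """Call `func` and return (result, list of (function name, line number) executed)."""
    hits = []

    def tracer(frame, event, arg):
        if event == "line":
            hits.append((frame.f_code.co_name, frame.f_lineno))
        return tracer  # keep tracing inside this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return result, hits


def target(x):  # hypothetical function under observation
    y = x + 1
    return y * 2


value, lines_hit = record_lines(target, 3)
for name, lineno in lines_hit:
    print(name, lineno)
```

Note that pdb itself relies on sys.settrace, so a custom trace function installed while the debugger is active replaces the debugger's own; the sketch restores the previous tracer for that reason.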
I wonder if somehow there's an old .pyc that can't be deleted because of permissions being messed up. Nuke all of the .pycs in your python-path, and see if that helps.
This blog post might be related to your trouble.
I don't suppose this is yet another problem caused by setuptools? I ask because I notice the ".egg" in that path...
What version of Python are you running? I observe similar behavior on Python 2.7.3. Curiously, I do not see the same behavior on IPython 0.12.1.
In Python 2.7.3, the debugger and the stack trace report the wrong location for where an exception occurred.
In IPython 0.12.1, the debugger and the stack trace report the correct location, but once the exception occurs the program exits, which makes post-mortem debugging difficult.
