I have a core dump of a running CPython program and would like to execute Python code in the dumped process's context.
I have loaded the core and the interpreter into gdb with gdb python core-dump-file.
I know about python-interactive, but it isn't able to see the context (ex: import sys; sys.modules doesn't give me any of the process's modules)
How can I do this?
I don't mind calling CPython's C functions if that is the only possible way.
1) First, check whether your gdb was built with Python support.
You can do this (at the gdb prompt) with:
(gdb) python print("Hi from python")
To check which version of Python gdb is using, try:
(gdb) python import sys
(gdb) python print(sys.version)
If these commands fail, it probably means that your gdb was never built with Python support in the first place.
In that case, build gdb from source and add --with-python="Path to python" in the configure step,
e.g.
./configure --with-python=/usr/bin/python36
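The same check can also be scripted from outside gdb. This is only a minimal sketch: the helper names are made up for illustration, it assumes gdb is on your PATH, and it relies on gdb's standard --batch and -ex options.

```python
import shutil
import subprocess


def gdb_python_check_command(gdb="gdb"):
    """Build a gdb command line that prints the embedded Python version."""
    return [gdb, "--batch", "-ex", "python import sys; print(sys.version)"]


def gdb_embedded_python_version(gdb="gdb"):
    """Return the version string, or None if gdb is missing or prints nothing."""
    if shutil.which(gdb) is None:
        return None
    try:
        result = subprocess.run(
            gdb_python_check_command(gdb),
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Assumption: a gdb without Python support writes nothing useful to stdout.
    return result.stdout.strip() or None
```

Calling gdb_embedded_python_version() returns None when the check does not produce a version string, which is the case where rebuilding gdb would be worth trying.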
Hope this helps!!
I am debugging decode_raw_op_test from TensorFlow. The test file is written in Python, but it executes code from the underlying C++ files.
Using pdb, I can debug the Python test file, but pdb doesn't step into the C++ code. Is there a way to debug the underlying C++ code?
(I tried using gdb on decode_raw_op_test, but it gives "File not in executable format: File format not recognized".)
Debugging a mixed Python and C++ program is tricky. You can use gdb to debug the C++ parts of TensorFlow, however. There are two main ways to do this:
Run python under gdb, rather than the test script itself. Let's say that your test script is in bazel-bin/tensorflow/python/kernel_tests/decode_raw_op_test. You would run the following command:
$ gdb --args python bazel-bin/tensorflow/python/kernel_tests/decode_raw_op_test
(gdb) run
Note that gdb does not have great support for debugging the Python parts of the code. I'd recommend narrowing down the test case that you run to a single, simple test, and setting a breakpoint on a TensorFlow C API method, such as TF_Run, which is the main entry point from Python into C++ in TensorFlow.
Attach gdb to a running process. You can get the process ID of the Python test using ps and then run (where $PID is the process ID):
$ gdb -p $PID
You will probably need to arrange for your Python code to block so that there's time to attach. Calling the raw_input() function (input() in Python 3) is an easy way to do this.
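A minimal sketch of that blocking step, assuming Python 3 (so input() rather than raw_input()). The wait_for_debugger name and its parameters are hypothetical, introduced only for this example:

```python
import os
import sys


def wait_for_debugger(read_line=input, out=sys.stderr):
    """Print this process's PID and block until the user presses Enter."""
    pid = os.getpid()
    print(f"PID {pid}: attach with 'gdb -p {pid}', then press Enter",
          file=out, flush=True)
    read_line()  # blocks here, giving you time to attach gdb
    return pid
```

Put a call to wait_for_debugger() at the top of the test, attach from another terminal, set your breakpoint in gdb, type continue, and then press Enter in the Python terminal.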
I was able to debug using the steps below:
gdb python
then on gdb prompt, type
run bazel-bin/tensorflow/python/kernel_tests/decode_raw_op_test
Adding to mrry's answer: in today's TF2 environment, the main entry point is TFE_Execute, so that is where you should add the breakpoint.
I have a python script test.py.
Running python test.py gives the brief message:
Segmentation fault: 11
In general, where should I start to debug such an issue?
In general (and without specifics for your code), the best thing to do is start putting print statements into your test script until you can narrow things down to a single line that is causing the segfault. Put in a bunch of print statements and move them around as you start narrowing down which print statements run and which ones don't because they are after the segfault.
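One detail matters when doing this: buffered output can be lost when the process dies, so each marker should be flushed right away. A small sketch of such a helper (the checkpoint name and the labels are made up for illustration):

```python
import sys


def checkpoint(label, out=None):
    """Write a marker and flush immediately so it survives a hard crash."""
    stream = out if out is not None else sys.stderr
    stream.write(f"[checkpoint] {label}\n")
    stream.flush()


checkpoint("before suspect call")
# ... the call you suspect goes here ...
checkpoint("after suspect call")
```

The last marker you see before "Segmentation fault" tells you which region to subdivide next.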
If your Python script is causing a segmentation fault, that usually means that some Python module that is implemented in C is doing something wrong. You should be able to tell easily using gdb. Try running:
gdb `which python`
# This starts an interactive gdb session. Type:
set args /path/to/python/script.py
r
# The program will now run. Interact with it until the segfault occurs. Then type:
bt
This will give you the C call stack leading up to the segmentation fault. (gdb may print a message about missing debug symbols, and give you the command to run to install them. Debug symbols will give you a more detailed stack trace, including function names and line numbers of files.) Use this information to more quickly determine what call in Python is causing the segfault.
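If the crash happens without any interaction, the session above can be run non-interactively. This is a rough sketch rather than a definitive recipe: gdb_backtrace_command is a hypothetical helper, and it only assembles a command line from gdb's --batch, -ex and --args options.

```python
import sys


def gdb_backtrace_command(script, python=sys.executable):
    """Build a gdb command that runs the script and prints a backtrace."""
    return [
        "gdb", "--batch",
        "-ex", "run",   # start the program
        "-ex", "bt",    # print the C call stack once it has stopped
        "--args", python, script,
    ]


print(" ".join(gdb_backtrace_command("/path/to/python/script.py")))
```

The resulting list can be passed to subprocess.run() or pasted into a shell.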
How can I examine the code of a python built-in function, for example step into sum()?
https://docs.python.org/2/library/functions.html#sum.
I expected to see what sum() does using the code below and s command in pdb:
import pdb

def adder(nums):
    x = sum(nums)
    return x

pdb.set_trace()
print(adder([1, 2, 3, 4]))
Some of the Python modules are written in C (to increase performance) and cannot be stepped through in pdb. If you really want to see what's going on in these functions it's possible, but not trivial. To examine C functions I typically use the GNU Debugger (GDB) and compile Python with debugging symbols enabled.
Download the Python source code found at https://www.python.org/downloads/
Untar the Python source code | tar xzvf Python-2.7.6.tar.gz
Enter the untarred directory and run the configuration script with debugging enabled | ./configure --with-pydebug
Compile | make
Start your custom compiled debug Python with the GNU Debugger | gdb ./python
Set a breakpoint in GDB for the sum() call | b bltinmodule.c:builtin_sum
Run your script from GDB (I called mine sumtest.py) | run ~/sumtest.py
The first thing that happens is you get prompted for your PDB call. Continue using c.
The next break is in the middle of the sum function in C. You can use info locals to list all the local variables. Just like in PDB, c is used to continue execution to the next breakpoint and s is used to step through the code one line at a time.
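Before going to those lengths, you can confirm from Python itself that a function has no Python source for pdb to step through. A small sketch using the standard inspect module (the describe helper is just for illustration):

```python
import inspect


def describe(func):
    """Report whether func is a C builtin and whether Python source exists."""
    try:
        inspect.getsource(func)
        has_source = True
    except (TypeError, OSError):
        # TypeError is raised for builtins; OSError when the source is missing
        has_source = False
    return {"builtin": inspect.isbuiltin(func), "python_source": has_source}


def adder(nums):
    return sum(nums)


print("sum:", describe(sum))
print("adder:", describe(adder))
```

On CPython, sum is reported as a builtin with no Python source, which is why the s command in pdb steps over it instead of into it.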
In Java/C# you can easily step through code to trace what might be going wrong, and IDEs make this process very user friendly.
Can you trace through python code in a similar fashion?
Yes! There's a Python debugger called pdb just for doing that!
You can launch a Python program through pdb by using pdb myscript.py or python -m pdb myscript.py.
There are a few commands you can then issue, which are documented on the pdb page.
Some useful ones to remember are:
b: set a breakpoint
c: continue debugging until you hit a breakpoint
s: step through the code
n: to go to next line of code
l: list source code for the current file (default: 11 lines including the line being executed)
u: navigate up a stack frame
d: navigate down a stack frame
p: to print the value of an expression in the current context
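To get a feel for these commands without typing them, you can feed them to pdb from a string. This is only a sketch: run_scripted and add are made-up names, and it assumes Python 3.6+ for the readrc argument of pdb.Pdb.

```python
import io
import pdb


def add(a, b):
    total = a + b
    return total


def run_scripted(commands, func, *args):
    """Run func under pdb, feeding it commands as if they were typed."""
    typed = io.StringIO("\n".join(commands) + "\n")
    transcript = io.StringIO()
    debugger = pdb.Pdb(stdin=typed, stdout=transcript,
                       nosigint=True, readrc=False)
    result = debugger.runcall(func, *args)
    return result, transcript.getvalue()


# n: run the first line, p total: print the variable, c: continue to the end
result, transcript = run_scripted(["n", "p total", "c"], add, 1000000, 234567)
print(transcript)
```

The printed transcript shows the (Pdb) prompts together with what each command produced, much as an interactive session would.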
If you don't want to use a command line debugger, some IDEs like Pydev, Wing IDE or PyCharm have a GUI debugger. Wing and PyCharm are commercial products, but Wing has a free "Personal" edition, and PyCharm has a free community edition.
You can use the Python interactive debugger, pdb.
The first step is to make the Python interpreter enter debugging mode.
A. From the Command Line
The most straightforward way is to run the script under pdb from the command line:
$ python -m pdb scriptName.py
> .../pdb_script.py(7)<module>()
-> """
(Pdb)
B. Within the Interpreter
This is useful while developing early versions of a module, when you want to experiment with it more iteratively.
$ python
Python 2.7 (r27:82508, Jul 3 2010, 21:12:11)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pdb_script
>>> import pdb
>>> pdb.run('pdb_script.MyObj(5).go()')
> <string>(1)<module>()
(Pdb)
C. From Within Your Program
For a big project or a long-running module, you can start the debugger from inside the program using
import pdb and pdb.set_trace()
like this:
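#!/usr/bin/env python
# encoding: utf-8
#
import pdb

class MyObj(object):
    count = 5  # class-level default

    def __init__(self, count):
        # Take the loop count from the caller, e.g. MyObj(5)
        self.count = count

    def go(self):
        for i in range(self.count):
            pdb.set_trace()  # drop into the debugger on every iteration
            print(i)
        return

if __name__ == '__main__':
    MyObj(5).go()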
Step-by-step debugging commands, for digging deeper:
Execute the next statement… with “n” (next)
Repeating the last debugging command… with ENTER
Quitting it all… with “q” (quit)
Printing the value of variables… with “p” (print)
a) p a
Turning off the (Pdb) prompt… with “c” (continue)
Seeing where you are… with “l” (list)
Stepping into subroutines… with “s” (step into)
Continuing… but just to the end of the current subroutine… with “r” (return)
Assign a new value
a) !b = "B"
Set a breakpoint
a) break linenumber
b) break functionname
c) break filename:linenumber
Temporary breakpoint
a) tbreak linenumber
Conditional breakpoint
a) break linenumber, condition
Note: All these commands should be executed from the (Pdb) prompt.
For in-depth knowledge, refer:-
https://pymotw.com/2/pdb/
https://pythonconquerstheuniverse.wordpress.com/2009/09/10/debugging-in-python/
There is a module called 'pdb' in python. At the top of your python script you do
import pdb
pdb.set_trace()
and you will enter debugging mode. You can use 's' to step and 'n' to go to the next line, similar to what you would do with the 'gdb' debugger.
Starting in Python 3.7, you can use the breakpoint() built-in function to enter the debugger:
foo()
breakpoint() # drop into the debugger at this point
bar()
By default, breakpoint() will import pdb and call pdb.set_trace(). However, you can control debugging behavior via sys.breakpointhook() and use of the environment variable PYTHONBREAKPOINT.
See PEP 553 for more information.
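As an illustration of the sys.breakpointhook() side, here is a minimal sketch that temporarily replaces the hook with one that only records calls. The names run_with_hook and work are made up for this example; it requires Python 3.7+.

```python
import sys


def run_with_hook(func):
    """Call func with breakpoint() redirected to a recording hook."""
    seen = []

    def recording_hook(*args, **kwargs):
        # breakpoint() passes its arguments straight through to the hook
        seen.append((args, kwargs))

    previous = sys.breakpointhook
    sys.breakpointhook = recording_hook
    try:
        func()
    finally:
        sys.breakpointhook = previous  # always restore the old hook
    return seen


def work():
    breakpoint("checkpoint-1")  # no debugger opens while the hook is replaced


print(run_with_hook(work))
```

The original hook stays available as sys.__breakpointhook__ if you ever need to reset it.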
ipdb (IPython debugger)
ipdb adds IPython functionality to pdb, offering the following HUGE improvements:
tab completion
show more context lines
syntax highlight
Much like pdb, ipdb is still far from perfect and quite rudimentary compared to GDB, but it is already a huge improvement over pdb.
Usage is analogous to pdb, just install it with:
python3 -m pip install --user ipdb
and then add to the line you want to step debug from:
__import__('ipdb').set_trace(context=21)
You likely want to add a shortcut for that from your editor, e.g. for Vim snipmate I have:
snippet ipd
__import__('ipdb').set_trace(context=21)
so I can type just ipd<tab> and it expands to the breakpoint. Then removing it is easy with dd since everything is contained in a single line.
context=21 increases the number of context lines as explained at: How can I make ipdb show more lines of context while debugging?
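If the code also has to run on machines where ipdb is not installed, a fallback to plain pdb can be sketched like this. The debugger_entry name is hypothetical, and context is the ipdb set_trace keyword used above:

```python
import functools


def debugger_entry(context=21):
    """Return ipdb's set_trace if ipdb is installed, else pdb's set_trace."""
    try:
        import ipdb  # third-party; may not be installed
    except ImportError:
        import pdb
        return pdb.set_trace
    # Pre-bind the number of context lines so callers need no arguments
    return functools.partial(ipdb.set_trace, context=context)
```

Note that a wrapper like this costs you the one-line convenience, so it is more of a project-level helper than an editor snippet.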
Alternatively, you can also debug programs from the start with:
ipdb3 main.py
but you generally don't want to do that because:
you would have to go through all function and class definitions as Python reads those lines
I don't know how to set the context size there without hacking ipdb. Patch to allow it: https://github.com/gotcha/ipdb/pull/155
Or alternatively, as in raw pdb 3.2+ you can set some breakpoints from the command line:
ipdb3 -c 'b 12' -c 'b myfunc' ~/test/a.py
although -c c is broken for some reason: https://github.com/gotcha/ipdb/issues/156
python -m module debugging has been asked at: How to debug a Python module run with python -m from the command line? and since Python 3.7 can be done with:
python -m pdb -m my_module
Serious missing features of both pdb and ipdb compared to GDB:
persistent command history across sessions: Save command history in pdb
ipdb specific annoyances:
multithreading does not work well if you don't hack some settings...
ipdb, multiple threads and autoreloading programs causing ProgrammingError
https://github.com/gotcha/ipdb/issues/51
Tested in Ubuntu 16.04, ipdb==0.11, Python 3.5.2.
VSCode
If you want to use an IDE, this is a good alternative to PyCharm.
Install VSCode
Install the Python extension, if it's not already installed
Create a file mymodule.py with Python code
To set a breakpoint, hover over a line number and click the red dot, or press F9
Hit F5 to start debugging and select Python File
It will stop at the breakpoint and you can do your usual debugging stuff like inspecting the values of variables, either at the tab VARIABLES (usually on the left) or by clicking on Debug Console (usually at the bottom next to your Terminal).
[Screenshot of a debugging session, shown in VSCodium]
More information
Python debugging in VS Code
Getting Started with Python in VS Code
Debugging in Visual Studio Code
Nowadays there is a built-in breakpoint() function, which replaces import pdb; pdb.set_trace().
It also has several new features, such as being configurable through an environment variable.
Python Tutor is an online single-step debugger meant for novices. You can put in code on the edit page then click "Visualize Execution" to start it running.
Among other things, it supports:
hiding variables, e.g. to hide a variable named x, put this at the end:
#pythontutor_hide: x
saving/sharing
a few other languages like Java, JS, Ruby, C, C++
However it also doesn't support a lot of things, for example:
Reading/writing files - use io.StringIO and io.BytesIO instead: demo
Code that is too large, runs too long, or defines too many variables or objects
Command-line arguments
Lots of standard library modules like argparse, csv, enum, html, os, sys, weakref...
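For the file limitation, the io.StringIO workaround can look like the following. It is a small illustrative sketch; count_words and the sample text are invented for the example.

```python
import io


def count_words(stream):
    """Count whitespace-separated words in any file-like object."""
    return sum(len(line.split()) for line in stream)


# Reading: an in-memory "file" instead of open("input.txt")
fake_input = io.StringIO("first line\nsecond line here\n")
print(count_words(fake_input))

# Writing: collect output in memory instead of open("output.txt", "w")
fake_output = io.StringIO()
fake_output.write("result\n")
print(fake_output.getvalue())
```

Because the function only iterates over the stream, the same code works unchanged with a real file object outside Python Tutor.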
Python 3.7+
Let's take a look at what breakpoint() can do for you in 3.7+.
I have installed ipdb and pdbpp, which are both enhanced debuggers, via
pip install pdbpp
pip install ipdb
My test script really doesn't do much; it just calls breakpoint().
#test_188_breakpoint.py
myvars=dict(foo="bar")
print("before breakpoint()")
breakpoint() # 👈
print(f"after breakpoint myvars={myvars}")
breakpoint() is linked to the PYTHONBREAKPOINT environment variable.
CASE 1: disabling breakpoint()
You can set the variable via bash as usual
export PYTHONBREAKPOINT=0
This turns off breakpoint() so that it does nothing (as long as you haven't modified sys.breakpointhook(), which is outside the scope of this answer).
This is what a run of the program looks like:
(venv38) myuser#explore$ export PYTHONBREAKPOINT=0
(venv38) myuser#explore$ python test_188_breakpoint.py
before breakpoint()
after breakpoint myvars={'foo': 'bar'}
(venv38) myuser#explore$
Didn't stop, because I disabled breakpoint. Something that pdb.set_trace() can't do 😀😀😀!
CASE 2: using the default pdb behavior:
Now, let's unset PYTHONBREAKPOINT, which puts us back to the normal, enabled-breakpoint behavior (it's only disabled when set to 0, not when empty).
(venv38) myuser#explore$ unset PYTHONBREAKPOINT
(venv38) myuser#explore$ python test_188_breakpoint.py
before breakpoint()
[0] > /Users/myuser/kds2/wk/explore/test_188_breakpoint.py(6)<module>()
-> print(f"after breakpoint myvars={myvars}")
(Pdb++) print("pdbpp replaces pdb because it was installed")
pdbpp replaces pdb because it was installed
(Pdb++) c
after breakpoint myvars={'foo': 'bar'}
It stopped, but I actually got pdbpp because it replaces pdb entirely while installed. If I uninstalled pdbpp, I'd be back to normal pdb.
Note: a standard pdb.set_trace() would still get me pdbpp
CASE 3: calling a custom debugger
But let's call ipdb instead. This time, instead of setting the environment variable, we can use bash to set it only for this one command.
(venv38) myuser#explore$ PYTHONBREAKPOINT=ipdb.set_trace py test_188_breakpoint.py
before breakpoint()
> /Users/myuser/kds2/wk/explore/test_188_breakpoint.py(6)<module>()
5 breakpoint()
----> 6 print(f"after breakpoint myvars={myvars}")
7
ipdb> print("and now I invoked ipdb instead")
and now I invoked ipdb instead
ipdb> c
after breakpoint myvars={'foo': 'bar'}
Essentially, what it does, when looking at $PYTHONBREAKPOINT:
from ipdb import set_trace # function imported on the right-most `.`
set_trace()
Again, much cleverer than a plain old pdb.set_trace() 😀😀😀
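The lookup can be approximated in a few lines. This is a simplified sketch of the behaviour described in PEP 553, not the interpreter's actual implementation, and resolve_breakpoint_target is a made-up name:

```python
import importlib


def resolve_breakpoint_target(value):
    """Map a PYTHONBREAKPOINT value to a callable (None means disabled)."""
    if value == "0":
        return None  # breakpoint() becomes a no-op
    if not value:
        value = "pdb.set_trace"  # unset or empty: the default debugger
    module_name, _, attr = value.rpartition(".")
    if not module_name:
        module_name = "builtins"  # names without a dot are looked up here
    module = importlib.import_module(module_name)
    return getattr(module, attr)


print(resolve_breakpoint_target("pdb.set_trace"))
```

Splitting on the right-most dot is what lets a value like ipdb.set_trace name any importable callable.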
In practice? I'd probably settle on one debugger.
Say I want ipdb always, I would:
export it via .profile or similar.
disable on a command by command basis, without modifying the normal value
Example (pytest and debuggers often make for unhappy couples):
(venv38) myuser#explore$ export PYTHONBREAKPOINT=ipdb.set_trace
(venv38) myuser#explore$ echo $PYTHONBREAKPOINT
ipdb.set_trace
(venv38) myuser#explore$ PYTHONBREAKPOINT=0 pytest test_188_breakpoint.py
=================================== test session starts ====================================
platform darwin -- Python 3.8.6, pytest-5.1.2, py-1.9.0, pluggy-0.13.1
rootdir: /Users/myuser/kds2/wk/explore
plugins: celery-4.4.7, cov-2.10.0
collected 0 items
================================== no tests ran in 0.03s ===================================
(venv38) myuser#explore$ echo $PYTHONBREAKPOINT
ipdb.set_trace
p.s.
I'm using bash under macOS; any POSIX shell will behave substantially the same. Windows, with either PowerShell or the DOS prompt, may have different capabilities, especially around PYTHONBREAKPOINT=<some value> <some command> to set an environment variable for only one command.
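A shell-independent way to set the variable for a single command is to launch that command from Python with a modified environment. A minimal sketch (run_with_breakpoint_setting is a hypothetical helper; Python 3.7+ assumed):

```python
import os
import subprocess
import sys


def run_with_breakpoint_setting(code, setting):
    """Run Python code in a child process with PYTHONBREAKPOINT overridden."""
    env = dict(os.environ, PYTHONBREAKPOINT=setting)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdin=subprocess.DEVNULL,  # never wait for debugger input
        capture_output=True,
        text=True,
    )


result = run_with_breakpoint_setting("breakpoint(); print('done')", "0")
print(result.stdout)
```

Only the child process sees the override, so the value exported in your own shell is left alone.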
If you come from Java/C# background I guess your best bet would be to use Eclipse with Pydev. This gives you a fully functional IDE with debugger built in. I use it with django as well.
https://wiki.python.org/moin/PythonDebuggingTools
pudb is a good drop-in replacement for pdb
PyCharm is an IDE for Python that includes a debugger. Watch this YouTube video for an introduction on using it to step through code:
PyCharm Tutorial - Debug python code using PyCharm (the debugging starts at 6:34)
Note: PyCharm is a commercial product, but the company does provide a free license to students and teachers, as well as a "lightweight" Community version that is free and open-source.
If you want an IDE with integrated debugger, try PyScripter.
Programmatically stepping and tracing through Python code is possible too (and it's easy!). Look at the sys.settrace() documentation for more details. Also here is a tutorial to get you started.
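As a starting point, here is a minimal sketch of a line tracer built on sys.settrace(). The trace_lines and sample names are made up for illustration; it records which lines of one function run, as offsets from its def line.

```python
import sys


def trace_lines(func, *args):
    """Call func(*args) and record the line offsets executed inside it."""
    code = func.__code__
    offsets = []

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            offsets.append(frame.f_lineno - code.co_firstlineno)
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active
    return result, offsets


def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total


result, offsets = trace_lines(sample, 3)
print(result, offsets)
```

Debuggers such as pdb are built on this same hook, adding breakpoints and a command prompt on top.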
Visual Studio with PTVS could be an option for you: http://www.hanselman.com/blog/OneOfMicrosoftsBestKeptSecretsPythonToolsForVisualStudioPTVS.aspx