I am looking for a way to create a Python function from a string containing the source code at runtime, in such a way that the source code is available by inspection.
My current method looks like:
src = 'def foo(x, y):' + '\n\t' + 'return x / y'
g = {'numpy': numpy}  # modules and such required for the function; add more entries as needed
l = {}
exec(src, g, l)
func = l['foo']
Which works perfectly fine, but the function has no source code/file associated with it. This makes debugging difficult (note the line the error occurs on in foo() isn't shown):
>>> foo(1, 0)
ZeroDivisionError Traceback (most recent call last)
<ipython-input-85-9df128c5d862> in <module>()
----> 1 foo(1, 0)
<string> in foo(x, y)
ZeroDivisionError: division by zero
If I define a function in the IPython interpreter, I can get the source using inspect.getsource and it will be printed in tracebacks. inspect.getsourcefile returns something like '<ipython-input-19-8efed6025c6f>' for these types of functions, which of course isn't a real file. Is there a way to do something similar in a non-interactive environment?
So I was able to partially figure this out by digging into the IPython source code. It makes use of the builtin module linecache, which contains functions for reading source code from files and caching the results. The inspect and traceback modules both use this module to get the source of a function.
The solution is to create the function the same way as in the question, but using compile with a made-up and unique file name:
source = 'def foo(x, y):' + '\n\t' + 'return x / y'
filename = '<dynamic-123456>' # Angle brackets may be required?
code = compile(source, filename, 'exec')
g = {'numpy': numpy}  # modules and such required for the function; add more entries as needed
l = {}
exec(code, g, l)
func = l['foo']
linecache contains a variable cache which is a dictionary mapping file names to (size, mtime, lines, fullname) tuples. You can simply add an entry for the fake file name:
import linecache

lines = [line + '\n' for line in source.splitlines()]
# An mtime of None means linecache.checkcache() will leave this entry alone
linecache.cache[filename] = (len(source), None, lines, filename)
The function will then work with inspect.getsource(), IPython's ?/?? syntax, and in IPython tracebacks. However, it still doesn't seem to work in built-in tracebacks. This is mostly sufficient for me because I almost always work in IPython.
EDIT: see user2357112's comment below for how to get this working with traceback printing in the builtin interpreter.
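For reference, here is a minimal end-to-end sketch of the whole approach (the filename is invented):

import inspect
import linecache

source = 'def foo(x, y):\n\treturn x / y\n'
filename = '<dynamic-123456>'

code = compile(source, filename, 'exec')
namespace = {}
exec(code, namespace)
foo = namespace['foo']

# Register the source under the fake filename so inspect can find it
lines = [line + '\n' for line in source.splitlines()]
linecache.cache[filename] = (len(source), None, lines, filename)

print(inspect.getsource(foo))  # prints the generated source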
Related
I'm currently building quite a complex system in Python, and when I'm debugging I often put simple print statements in several scripts. To keep an overview I often also want to print out the file name and line number where the print statement is located. I can of course do that manually, or with something like this:
from inspect import currentframe, getframeinfo
print getframeinfo(currentframe()).filename + ':' + str(getframeinfo(currentframe()).lineno) + ' - ', 'what I actually want to print out here'
Which prints something like:
filenameX.py:273 - what I actually want to print out here
To make it more simple, I want to be able to do something like:
print debuginfo(), 'what I actually want to print out here'
So I put it into a function somewhere and tried doing:
from debugutil import debuginfo
print debuginfo(), 'what I actually want to print out here'
print debuginfo(), 'and something else here'
Unfortunately, I get:
debugutil.py:3 - what I actually want to print out here
debugutil.py:3 - and something else here
It prints out the file name and line number on which I defined the function, instead of the line on which I call debuginfo(). This is obvious, because the code is located in the debugutil.py file.
So my question is actually: How can I get the filename and line number from which this debuginfo() function is called?
The function inspect.stack() returns a list of frame records, starting with the caller and moving out, which you can use to get the information you want:
from inspect import getframeinfo, stack
def debuginfo(message):
    caller = getframeinfo(stack()[1][0])
    print("%s:%d - %s" % (caller.filename, caller.lineno, message))  # python3 syntax print

def grr(arg):
    debuginfo(arg)  # <-- stack()[1][0] for this line

grr("aargh")  # <-- stack()[2][0] for this line
Output:
example.py:8 - aargh
If you put your trace code in another function and call that from your main code, then you need to make sure you get the stack information from the grandparent, not the parent or the trace function itself.
Below is an example of a three-level-deep system to further clarify what I mean. My main function calls a trace function, which calls yet another function to do the work.
######################################
import sys, os, inspect, time

time_start = 0.0  # initial start time

def trace_library_init():
    global time_start
    time_start = time.time()  # when the program started

def trace_library_do(relative_frame, msg=""):
    global time_start
    time_now = time.time()
    # relative_frame is 0 for the current function (this one),
    # 1 for the direct parent, 2 for the grandparent, ...
    total_stack = inspect.stack()                  # the complete stack
    total_depth = len(total_stack)                 # length of the total stack
    frameinfo = total_stack[relative_frame][0]     # info on the relative frame
    relative_depth = total_depth - relative_frame  # length of the stack there
    # Information on the function at the relative frame number
    func_name = frameinfo.f_code.co_name
    filename = os.path.basename(frameinfo.f_code.co_filename)
    line_number = frameinfo.f_lineno               # line number of the call
    func_firstlineno = frameinfo.f_code.co_firstlineno
    fileline = "%s:%d" % (filename, line_number)
    time_diff = time_now - time_start
    print("%13.6f %-20s %-24s %s" % (time_diff, fileline, func_name, msg))

################################
def trace_do(msg=""):
    trace_library_do(1, "trace within interface function")
    trace_library_do(2, msg)
    # any common tracing stuff you might want to do...

################################
def main(argc, argv):
    rc = 0
    trace_library_init()
    for i in range(3):
        trace_do("this is at step %i" % i)
        time.sleep((i + 1) * 0.1)  # in tenths of a second
    return rc

rc = main(len(sys.argv), sys.argv)
sys.exit(rc)
This will print something like:
$ python test.py
     0.000005 test.py:39           trace_do                 trace within interface function
     0.001231 test.py:49           main                     this is at step 0
     0.101541 test.py:39           trace_do                 trace within interface function
     0.101900 test.py:49           main                     this is at step 1
     0.302469 test.py:39           trace_do                 trace within interface function
     0.302828 test.py:49           main                     this is at step 2
The trace_library_do() function at the top is an example of something that you can drop into a library and then call from other tracing functions. The relative_frame value controls which entry in the Python stack gets printed.
I showed pulling out a few other interesting values in that function, like the line number of the start of the function, the total stack depth, and the full path to the file. I didn't show it, but the global and local variables in the function are also available via inspect, as is the full stack trace to all other functions below yours. There is more than enough information in what I am showing above to make hierarchical call/return timing traces. From here it's actually not much further to creating the main parts of your own source-level debugger, and it's all mostly just sitting there waiting to be used.
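As a quick illustration of those frame variables, here is a minimal sketch (the function names are mine, not from the code above):

import inspect

def show_caller_locals():
    frame = inspect.stack()[1][0]  # frame record 1 is the direct caller
    try:
        print(sorted(frame.f_locals))  # names of the caller's local variables
    finally:
        del frame  # avoid reference cycles from holding on to a frame object

def demo():
    x, y = 1, 2
    show_caller_locals()  # prints ['x', 'y']

demo()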
I'm sure someone will object that I'm using internal fields of the data returned by the inspect structures, as there may well be accessor functions that do this same thing for you. But I found these by stepping through this type of code in a Python debugger, and they work at least here. I'm running Python 2.7.12; your results may vary if you are running a different version.
In any case, I strongly recommend that you import the inspect module into some Python code of your own and look at what it can provide you, especially if you can single-step through your code in a good Python debugger. You will learn a lot about how Python works, and get to see both the benefits of the language and what is going on behind the curtain to make that possible.
Full source-level tracing with timestamps is a great way to enhance your understanding of what your code is doing, especially in a more dynamic, real-time environment. The great thing about this type of trace code is that once it's written, you don't need debugger support to see it.
An update to the accepted answer using string interpolation and displaying the caller's function name.
import inspect

def debuginfo(message):
    caller = inspect.getframeinfo(inspect.stack()[1][0])
    print(f"{caller.filename}:{caller.function}:{caller.lineno} - {message}")
The traceprint package can now do that for you:
import traceprint

def func():
    print('Hello from func')

func()

# File "/traceprint/examples/example.py", line 6, in <module>
# File "/traceprint/examples/example.py", line 4, in func
# Hello from func
PyCharm will automatically make the file link clickable / followable.
Install via pip install traceprint.
Just put the code you posted into a function:
from inspect import currentframe, getframeinfo

def my_custom_debuginfo(message):
    # use f_back so the report points at the caller, not at this helper
    caller = getframeinfo(currentframe().f_back)
    print(caller.filename + ':' + str(caller.lineno) + ' -', message)
and then use it as you want:
# ... some code here ...
my_custom_debuginfo('what I actually want to print out here')
# ... more code ...
I recommend you put that function in a separate module, that way you can reuse it every time you need it.
Discovered this question for a somewhat related problem, but I wanted more details re: the execution (and I didn't want to install an entire call graph package).
If you want more detailed information, you can retrieve a full traceback with the standard library module traceback, and either stash the stack (a list of tuple-like frame summaries) with traceback.extract_stack() or print it out with traceback.print_stack(). This was more suitable for my needs; hope it helps someone else!
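For instance, a minimal sketch of the extract_stack() route (names are illustrative):

import traceback

def whereami():
    stack = traceback.extract_stack()  # oldest frame first
    caller = stack[-2]  # -1 is this function itself, -2 is its caller
    print('%s:%d in %s' % (caller.filename, caller.lineno, caller.name))

whereami()  # prints the calling file, line number, and '<module>'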
From within a Python program, given a str variable that contains C code, I want to check quickly whether this code is syntactically correct or not. Essentially, I only need to pass it through the compiler's front end.
My current implementation dumps the string to a temp file and launches a clang process with subprocess (the non-working code below illustrates the approach). This is too slow for my needs.
src = "int main(){printf("This is a C program\n"); return 0;}"
with open(temp_file, 'w') as f:
f.write(src)
cmd = ["clang", abs_path(f), flags]
subprocess.Popen(cmd)
## etc..
After looking around, I found out about the clang.cindex module (pip install clang), which I tried out. After reading the main module a bit, lines 2763-2837 (specifically line 2828) led me to the conclusion that the following code snippet should do what I need:
import clang.cindex
......
try:
    unit = clang.cindex.TranslationUnit.from_source(temp_code_file, ##args, etc.)
    print("Compiled!")
except clang.cindex.TranslationUnitLoadError:
    print("Did not compile!")
However, it seems that even if the source file contains obvious syntactic errors, no exception is raised. Does anyone know what I am missing to make this work?
More generally, any suggestions on how to do this task as fast as possible would be more than welcome. Even with clang.cindex, I cannot get away from writing my string-represented code to a temp file, which may add overhead. Writing a Python parser could solve this, but it is overkill at the moment, no matter how much I need speed.
Parsing itself succeeds even if the file has syntax errors; the errors show up as diagnostics rather than as an exception. Consider the following example:
import clang.cindex

with open('broken.c', 'w') as f:
    f.write('foo bar baz')

unit = clang.cindex.TranslationUnit.from_source('broken.c')
for d in unit.diagnostics:
    print(d.severity, d)
Run it and you will get
3 broken.c:1:1: error: unknown type name 'foo'
3 broken.c:1:8: error: expected ';' after top level declarator
The severity member is an int whose value comes from the enum CXDiagnosticSeverity:
CXDiagnostic_Ignored = 0
CXDiagnostic_Note = 1
CXDiagnostic_Warning = 2
CXDiagnostic_Error = 3
CXDiagnostic_Fatal = 4
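As an aside on the temp-file overhead mentioned in the question: the binding's TranslationUnit.from_source also accepts an unsaved_files argument, which lets libclang parse the code from memory. A sketch combining that with the severity check (verify the exact signature against your installed clang version):

import clang.cindex

CX_DIAGNOSTIC_ERROR = 3  # CXDiagnostic_Error

def c_code_is_valid(src):
    # 'code.c' is only a label used in diagnostics; nothing is written to disk
    unit = clang.cindex.TranslationUnit.from_source(
        'code.c', unsaved_files=[('code.c', src)])
    return all(d.severity < CX_DIAGNOSTIC_ERROR for d in unit.diagnostics)

print(c_code_is_valid('int main(void){ return 0; }'))  # True
print(c_code_is_valid('foo bar baz'))                  # False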
I'm doing simulations for scientific computing, and I'm almost always going to want to be in the interactive interpreter to poke around at the output of my simulations. I'm trying to write classes to define simulated objects (neural populations), and I'd like to formalize my testing of these classes by calling a script with %run test_class_WC.py in IPython. Since the module/file containing the class changes as I debug it and add features, I'm reloading it each time.
./test_class_WC.py:
import WC_class # make sure WC_class exists
reload(WC_class) # make sure it's the most current version
import numpy as np
from WC_class import WC_unit # put the class into my global namespace?
E1 = WC_unit(Iapp=100)
E1.update() # see if it works
print E1.r
So right off the bat I'm using reload to make sure I've got the most current version of the module loaded, so I've got the freshest class definition. I'm sure this is clunky as heck (and maybe more sinister?), but it saves me the trouble of doing %run WC_class.py followed by a separate call to %run test_WC.py.
and ./WC_class.py:
class WC_unit:
    nUnits = 0
    def __init__(self, **kwargs):
        self.__dict__.update(dict(       # a bunch of params
            gee=.6,                      # I need to be able to change
            ke=.1, the=.2,               # these in test_class_WC.py
            tau=100., dt=.1, r=0., Iapp=1.), **kwargs)
        WC_unit.nUnits += 1
    def update(self):
        def f(x, k=self.ke, th=self.the):    # a function I define inside a method
            return 1/(1+np.exp(-(x-th)/k))   # using some of those params
        x = self.Iapp + self.gee * self.r
        self.r += self.dt/self.tau * (-self.r + f(x))
WC_unit basically defines a bunch of default parameters and defines an ODE that updates using basic Euler integration. I expect that test_class_WC sets up a global namespace containing np (and WC_unit, and WC_class)
When I run it, I get the following error:
In [14]: %run test_class_WC.py
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/Users/steeles/Desktop/science/WC_sequence/test_class_WC.py in <module>()
8
9 E1 = WC_unit(Iapp=100)
---> 10 E1.update()
11
12 # if bPlot:
/Users/steeles/Desktop/science/WC_sequence/WC_class.py in update(self)
19 return 1/(1+np.exp(-(x-th)/k))
20 x = self.Iapp + self.gee * self.r
---> 21 self.r += self.dt/self.tau * (-self.r + f(x))
22
23 # #class_method
/Users/steeles/Desktop/science/WC_sequence/WC_class.py in f(x, k, th)
17 def update(self):
18 def f(x,k=self.ke,th=self.the):
---> 19 return 1/(1+np.exp(-(x-th)/k))
20 x = self.Iapp + self.gee * self.r
21 self.r += self.dt/self.tau * (-self.r + f(x))
NameError: global name 'np' is not defined
Now I can get around this by just importing numpy as np at the top of the WC_class module, or even by doing from numpy import exp in test_class_WC and changing the update() method to use exp() instead of np.exp()... but I'm not doing this because it's easy; I want to learn how all this namespace/module stuff works so I stop being a Python idiot. Why is np getting lost in the WC_unit namespace? Is it because I'm dealing with two different files/modules? Does the call to np.exp inside a function have anything to do with it?
I'm also open to suggestions regarding improving my workflow and file structure, as it seems to be not particularly pythonic. My background is in MATLAB if that helps anyone understand. I'm editing my .py files in SublimeText2. Sorry the code is not very minimal, I've been having a hard time reproducing the problem.
The correct approach is to do an import numpy as np at the top of your sub-module as well. Here's why:
The key thing to note is that in Python, global actually means "shared at a module level", and the namespaces of different modules are distinct from each other except when a module explicitly imports from another module. An imported module definitely cannot reach out to its 'parent' module's namespace, which is probably a good thing all things considered; otherwise modules' behavior would depend entirely on the variables defined in whichever module imports them.
So when the stack trace says global name 'np' is not defined, it's talking about it at the module level. Python does not let the WC_class module access objects in the module that imports it by default.
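A tiny two-file illustration (the file names are hypothetical):

# helper.py: knows nothing about the namespaces of modules that import it
def use_np():
    return np.exp(0)  # NameError: 'np' is not defined in helper.py

# main.py
import numpy as np   # binds 'np' only in main.py's namespace
import helper
helper.use_np()      # raises NameError even though np exists in main.py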
(As an aside, effbot has a quick note on how to do inter-module globals)
Another key thing to note is that even if you have multiple import numpy as np in various modules of your code, the module actually only gets loaded (i.e. executed) once. Once loaded, modules (being Python objects themselves) can be found in the dictionary sys.modules, and if a module already exists in this dictionary, any import module_to_import statement simply lets the importing module access names in the namespace of module_to_import. So having import numpy as np scattered across multiple modules in your codebase isn't wasteful.
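A quick way to see this caching in action:

import sys
import numpy as np

print('numpy' in sys.modules)  # True: the module was executed once and cached
import numpy as np2            # no re-execution; this just binds another name
print(np2 is np)               # True: both names refer to the same module object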
Edit: On deeper digging, effbot has an even deeper (but still pretty quick and simple) exploration of what actually happens in module imports. For deeper exploration of the topic, you may want to check the import system discussion newly added in the Python 3 documentation.
It is normal in Python to import each module that is needed within each module that uses it. Don't count on any 'global' imports; in fact, there is no such thing, with one exception. I discovered in
Do I have to specify import when Python script is being run in Ipython?
that %run -i myscript runs the script in the IPython interactive namespace. So for quick test scripts this can save a bunch of imports.
I don't see the need for this triple import
import WC_class # make sure WC_class exists
reload(WC_class) # make sure it's the most current version
...
from WC_class import WC_unit
If all you are using from WC_class is WC_unit, just use the last line.
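(Side note: in Python 3, reload is no longer a builtin; a sketch of the equivalent:)

import importlib
import WC_class
importlib.reload(WC_class)    # re-execute the module to pick up edits
from WC_class import WC_unit  # rebind the fresh class object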
I'm using python's trace module to trace some code. When I trace code this way, I can get one of the following two results:
Call:
tracer = trace.Trace(count=False, trace=True, ignoredirs=[sys.prefix, sys.exec_prefix])
r = tracer.run('run()')
tracer.results().write_results(show_missing=True)
Result:
<filename>(<line number>): <line of code>
Call:
tracer = trace.Trace(count=False, trace=True, ignoredirs=[sys.prefix, sys.exec_prefix], countfuncs=True)
r = tracer.run('run()')
tracer.results().write_results(show_missing=True)
Result:
filename:<filepath>, modulename:<module name>, funcname: <function name>
What I really need is a trace that gives me this:
<filepath> <line number>
It would seem that I could use the above information and interleave them to get what I need, but such an attempt would fail in the following use case:
sys.path contains directory A and directory B.
There are two files A/foo.py and B/foo.py
Both A/foo.py and B/foo.py contain the function bar, defined on lines 100 - 120
There are some minor differences between A/foo.py and B/foo.py
In this scenario, using such interleaving to correctly identify which bar is called is impossible (please correct me if I'm wrong) without statically analyzing the code within each bar, which is itself very difficult for non-trivial functions.
So, how can I get the correct trace output that I need?
With a little monkey-patching, this is actually quite easy. Digging around in the source code of the trace module, it seems that callbacks are used to report on each execution step. The basic functionality of Trace.run, greatly simplified, is:
sys.settrace(globaltrace) # Set the trace callback function
exec function # Execute the function, invoking the callback as necessary
sys.settrace(None) # Reset the trace
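For reference, a bare-bones sys.settrace callback, independent of the trace module, looks like this sketch:

import sys

def tracer(frame, event, arg):
    if event == 'line':
        print(frame.f_code.co_filename, frame.f_lineno)
    return tracer  # keep receiving 'line' events for this frame

def demo():
    x = 1
    y = x + 1

sys.settrace(tracer)
demo()
sys.settrace(None)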
globaltrace is defined in Trace.__init__ depending on the arguments passed. Specifically, with the arguments in your first example, Trace.globaltrace_lt is used as the global callback, which calls Trace.localtrace_trace for each line of execution. Changing the output is then simply a matter of overriding Trace.localtrace_trace, so to get the result you want:
import trace
import sys
import time
import linecache

class Trace(trace.Trace):
    def localtrace_trace(self, frame, why, arg):
        if why == "line":
            # record the file name and line number of every trace
            filename = frame.f_code.co_filename
            lineno = frame.f_lineno
            if self.start_time:
                print('%.2f' % (time.time() - self.start_time), end=' ')
            print("%s (%d): %s" % (filename, lineno,
                                   linecache.getline(filename, lineno)), end='')
        return self.localtrace

tracer = Trace(count=False, trace=True, ignoredirs=[sys.prefix, sys.exec_prefix])
r = tracer.run('run()')
There is a difference between the two examples you give: in the first, the output is printed during the Trace.run call, while in the second it is printed during write_results. The code above follows the pattern of the former, so tracer.results().write_results() is not necessary. However, if you want to manipulate this output instead, it can be achieved by patching the trace.CoverageResults.write_results method in a similar manner.
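For example, a sketch that skips write_results and reads the results object's calledfuncs mapping directly, which is the same internal data write_results prints (it assumes a run() function exists to trace, and relies on trace module internals, so verify against your Python version):

import sys
import trace

def run():
    pass  # stand-in for the code you actually want to trace

tracer = trace.Trace(count=False, trace=False, countfuncs=True,
                     ignoredirs=[sys.prefix, sys.exec_prefix])
tracer.run('run()')

results = tracer.results()
# results.calledfuncs maps (filename, modulename, funcname) -> 1
for filename, modulename, funcname in sorted(results.calledfuncs):
    print(filename, funcname)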