How can I build a pyparsing program that allows operations to be executed on a context/state object?
An example of my program looks like this:
load 'data.txt'
remove line 1
remove line 4
The first line should load a file; lines 2 and 3 are commands that operate on the content of that file. As a result, I expect the content of the file after all commands have been executed.
load_cmd = Literal('load') + filename
remove_cmd = Literal('remove line') + line_no
more_cmd = ...

def load_action(s, loc, toks):
    # load file, where should I store it?

load_cmd.setParseAction(load_action)

def remove_line_action(s, loc, toks):
    # remove line, how to obtain data to operate on? where to write result?

remove_cmd.setParseAction(remove_line_action)

# Is this the right way to define a whole program, i.e. not only one line?
program = load_cmd + remove_cmd | more_cmd | ...

# How do I obtain the result?
program.scanString("""
load 'data.txt'
remove line 1
remove line 4
""")
I have written a few pyparsing examples of this command-parsing style; you can find them online at:
http://pyparsing.wikispaces.com/file/view/simpleBool.py/451074414/simpleBool.py
http://pyparsing.wikispaces.com/file/view/eval_arith.py/68273277/eval_arith.py
I have also written a simple Adventure-style game processor, which accepts parsed command structures and executes them against a game "world", which functions as the command executor. I presented this at PyCon 2006, but the link from the conference page has gone stale - you can find it now at http://www.ptmcg.com/geo/python/confs/pyCon2006_pres2.html (the presentation is written using S5 - mouse over the lower right corner to see the navigation buttons). The code is at http://www.ptmcg.com/geo/python/confs/adventureEngine.py.txt, and a UML diagram for the code is at http://www.ptmcg.com/geo/python/confs/pyparsing_adventure.pdf.
The general pattern I have found to work best is similar to the old Model-View-Controller pattern.
The Model is your virtual machine, which maintains the context from command to command. In simpleBool the context is just the inferred local variable scope, since each parsed statement is simply eval-ed. In eval_arith, this context is kept in the EvalConstant._vars dict, containing the names and values of pre-defined and parsed variables. In the Adventure engine, the context is kept in the Player object (with attributes that point to the current Room and the collection of Items), which is passed to the parsed command object to execute the command.
The View is the parser itself. It extracts the pieces of each command and composes an instance of a command class. The interface to the command class's exec method depends on how you have set up the Model, but in general you can envision that the exec method you define will take the Model as one of its parameters, if not its only one.
Then the Controller is a simple loop that implements the following pseudo-code:
while not finished:
    read a command, assign to commandstring
    parse commandstring, use parsed results to create commandobj (null if bad parse)
    if commandobj is not null:
        commandobj.exec(context)
    finished = context.is_finished()
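As a minimal concrete sketch of that loop in Python - assuming a pyparsing expression named program whose parse actions return Command objects, and a Context class of your own design, both of which are placeholders here:

from pyparsing import ParseException

context = Context()
finished = False
while not finished:
    commandstring = input("> ")
    try:
        # parse actions (see below) replace the raw tokens with a Command instance
        commandobj = program.parseString(commandstring)[0]
    except ParseException:
        commandobj = None
        print("I don't understand that.")
    if commandobj is not None:
        commandobj.exec(context)
    finished = context.is_finished()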
If you implement your parser using pyparsing, then you can define your Command classes as subclasses of this abstract class:
class Command(object):
    def __init__(self, s, l, t):
        self.parameters = t

    def exec(self, context):
        self._do_exec(context)

(Note that exec is a reserved word in Python 2, so this base class as written requires Python 3; under Python 2, pick another method name.)
When you define each command, the corresponding subclass can be passed directly as the command expression's parse action. For instance, a simplified GO command for moving through a maze would look like:
goExpression = Literal("GO") + oneOf("NORTH SOUTH EAST WEST")("direction")
goExpression.setParseAction(GoCommand)
For the abstract Command class above, a GoCommand class might look like:
class GoCommand(Command):
    def _do_exec(self, context):
        if context.is_valid_move(self.parameters.direction):
            context.move(self.parameters.direction)
        else:
            context.report("Sorry, you can't go " +
                           self.parameters.direction +
                           " from here.")
By parsing a statement like "GO NORTH", you would get back not a ParseResults containing the tokens "GO" and "NORTH", but a GoCommand instance, whose parameters include a named token "direction", giving the direction parameter for the GO command.
So the design steps to do this are:
design your virtual machine, and its command interface
create a class to capture the state/context in the virtual machine
design your commands, and their corresponding Command subclasses
create the pyparsing parser expressions for each command
attach the Command subclass as a parse action to each command's pyparsing expression
create an overall parser by combining all the command expressions using '|'
implement the command processor loop
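Putting those steps together for the load/remove-line commands in the question, a minimal sketch might look like this (FileContext, LoadCommand, and RemoveLineCommand are illustrative names, not pyparsing API):

from pyparsing import Literal, QuotedString, Word, nums

class FileContext:
    """The Model: holds the state that commands operate on."""
    def __init__(self):
        self.lines = []
    def load(self, filename):
        with open(filename) as f:
            self.lines = f.read().splitlines()
    def remove_line(self, line_no):
        del self.lines[line_no - 1]          # commands use 1-based line numbers

class LoadCommand:
    def __init__(self, s, loc, toks):
        self.filename = toks.filename
    def exec(self, context):
        context.load(self.filename)

class RemoveLineCommand:
    def __init__(self, s, loc, toks):
        self.line_no = int(toks.line_no)
    def exec(self, context):
        context.remove_line(self.line_no)

load_cmd = Literal("load") + QuotedString("'")("filename")
load_cmd.setParseAction(LoadCommand)
remove_cmd = Literal("remove line") + Word(nums)("line_no")
remove_cmd.setParseAction(RemoveLineCommand)
command = load_cmd | remove_cmd

context = FileContext()
program_text = "load 'data.txt'\nremove line 1\nremove line 4"
for tokens, start, end in command.scanString(program_text):
    tokens[0].exec(context)                  # each match is a Command instance
print("\n".join(context.lines))              # the result: file content after all commands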
I would do something like this:
cmdStrs = '''
load
remove line
add line
some other command
'''

def loadParse(val): print('Load --> ' + val)
def removeParse(val): print('remove --> ' + val)
def addLineParse(val): print('addLine --> ' + val)
def someOtherCommandParse(val): print('someOther --> ' + val)

commands = [l.strip() for l in cmdStrs.split('\n') if l.strip() != '']
functions = [loadParse,
             removeParse,
             addLineParse,
             someOtherCommandParse]
funcDict = dict(zip(commands, functions))

program = '''
# This is a comment
load 'data.txt' # This is another comment
remove line 1
remove line 4
'''

for l in program.split('\n'):
    l = l.strip().split('#')[0].strip()   # remove comments
    if l == '':
        continue
    commandFound = False
    for c in commands:
        if c in l:
            funcDict[c](l.split(c)[-1])
            commandFound = True
    if not commandFound:
        print('Error: Unknown command:', l)
Of course, you can put the entire thing inside a class and make it an object, but you can see the general structure. Once you have an object, you can create a version that handles contextual/state information; the functions above then simply become member functions, as in the sketch below.
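For instance, a sketch of that object-based version, with the file content kept as state on the object (all names are illustrative):

class CommandProcessor:
    def __init__(self):
        self.lines = []                          # the state/context
        self.funcDict = {'load': self.load,
                         'remove line': self.removeLine}

    def load(self, arg):
        with open(arg.strip().strip("'")) as f:
            self.lines = f.read().splitlines()

    def removeLine(self, arg):
        del self.lines[int(arg) - 1]

    def run(self, program):
        for l in program.split('\n'):
            l = l.strip().split('#')[0].strip()  # remove comments
            if l == '':
                continue
            for c, func in self.funcDict.items():
                if l.startswith(c):
                    func(l[len(c):])
                    break
            else:
                print('Error: Unknown command:', l)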
Why do I get the sense that you are starting on Python after learning Haskell? Generally people go the other way. In Python you get state for free. You don't need classes. You can use classes to handle more than one state within the same program :).
Related
I'm currently building quite a complex system in Python, and when I'm debugging I often put simple print statements in several scripts. To keep an overview I often also want to print out the file name and line number where the print statement is located. I can of course do that manually, or with something like this:
from inspect import currentframe, getframeinfo

print(getframeinfo(currentframe()).filename + ':' + str(getframeinfo(currentframe()).lineno) + ' -', 'what I actually want to print out here')
Which prints something like:
filenameX.py:273 - what I actually want to print out here
To make it more simple, I want to be able to do something like:
print(debuginfo(), 'what I actually want to print out here')
So I put it into a function somewhere and tried doing:
from debugutil import debuginfo
print(debuginfo(), 'what I actually want to print out here')
print(debuginfo(), 'and something else here')
Unfortunately, I get:
debugutil.py:3 - what I actually want to print out here
debugutil.py:3 - and something else here
It prints out the file name and line number on which I defined the function, instead of the line on which I call debuginfo(). This is obvious, because the code is located in the debugutil.py file.
So my question is actually: How can I get the filename and line number from which this debuginfo() function is called?
The function inspect.stack() returns a list of frame records, starting with the caller and moving out, which you can use to get the information you want:
from inspect import getframeinfo, stack

def debuginfo(message):
    caller = getframeinfo(stack()[1][0])
    print("%s:%d - %s" % (caller.filename, caller.lineno, message))   # Python 3 print syntax

def grr(arg):
    debuginfo(arg)   # <-- stack()[1][0] for this line

grr("aargh")         # <-- stack()[2][0] for this line
Output:
example.py:8 - aargh
If you put your trace code in another function, and call that from your main code, then you need to make sure you get the stack information from the grandparent, not the parent or the trace function itself.
Below is an example of a three-level-deep system to further clarify what I mean. My main function calls a trace function, which calls yet another function to do the work.
######################################
import sys, os, inspect, time

time_start = 0.0                 # initial start time

def trace_library_init():
    global time_start
    time_start = time.time()     # when the program started

def trace_library_do(relative_frame, msg=""):
    global time_start
    time_now = time.time()
    # relative_frame is 0 for current function (this one),
    # 1 for direct parent, or 2 for grandparent...
    total_stack = inspect.stack()                    # total complete stack
    total_depth = len(total_stack)                   # length of total stack
    frameinfo = total_stack[relative_frame][0]       # info on rel frame
    relative_depth = total_depth - relative_frame    # length of stack there
    # Information on function at the relative frame number
    func_name = frameinfo.f_code.co_name
    filename = os.path.basename(frameinfo.f_code.co_filename)
    line_number = frameinfo.f_lineno                 # of the call
    func_firstlineno = frameinfo.f_code.co_firstlineno
    fileline = "%s:%d" % (filename, line_number)
    time_diff = time_now - time_start
    print("%13.6f %-20s %-24s %s" % (time_diff, fileline, func_name, msg))

################################

def trace_do(msg=""):
    trace_library_do(1, "trace within interface function")
    trace_library_do(2, msg)
    # any common tracing stuff you might want to do...

################################

def main(argc, argv):
    rc = 0
    trace_library_init()
    for i in range(3):
        trace_do("this is at step %i" % i)
        time.sleep((i + 1) * 0.1)   # in 1/10's of a second
    return rc

rc = main(len(sys.argv), sys.argv)
sys.exit(rc)
This will print something like:
$ python test.py
0.000005 test.py:39 trace_do trace within interface func
0.001231 test.py:49 main this is at step 0
0.101541 test.py:39 trace_do trace within interface func
0.101900 test.py:49 main this is at step 1
0.302469 test.py:39 trace_do trace within interface func
0.302828 test.py:49 main this is at step 2
The trace_library_do() function at the top is an example of something that you can drop into a library, and then call it from other tracing functions. The relative depth value controls which entry in the python stack gets printed.
I showed pulling out a few other interesting values in that function, like the line number of the start of the function, the total stack depth, and the full path to the file. I didn't show it, but the global and local variables in the function are also available through inspect, as well as the full stack trace to all other functions below yours. There is more than enough information in what I am showing above to make hierarchical call/return timing traces. It's actually not that much further to creating the main parts of your own source-level debugger from here -- and it's all mostly just sitting there waiting to be used.
I'm sure someone will object that I'm using internal fields with data returned by the inspect structures, as there may well be access functions that do this same thing for you. But I found these by stepping through this type of code in a python debugger, and they work at least here. I'm running python 2.7.12; your results may vary if you run a different version.
In any case, I strongly recommend that you import the inspect code into some python code of your own and look at what it can provide you -- especially if you can single-step through your code in a good python debugger. You will learn a lot about how python works, and get to see both the benefits of the language and what is going on behind the curtain to make that possible.
Full source-level tracing with timestamps is a great way to enhance your understanding of what your code is doing, especially in more of a dynamic real-time environment. The great thing about this type of trace code is that once it's written, you don't need debugger support to see it.
An update to the accepted answer using string interpolation and displaying the caller's function name.
import inspect

def debuginfo(message):
    caller = inspect.getframeinfo(inspect.stack()[1][0])
    print(f"{caller.filename}:{caller.function}:{caller.lineno} - {message}")
The traceprint package can now do that for you:
import traceprint

def func():
    print('Hello from func')

func()

# File "/traceprint/examples/example.py", line 6, in <module>
# File "/traceprint/examples/example.py", line 4, in func
# Hello from func
PyCharm will automatically make the file link clickable / followable.
Install via pip install traceprint.
Just put the code you posted into a function, with one tweak: inspect the caller's frame (currentframe().f_back) rather than the frame inside the function itself:

from inspect import currentframe, getframeinfo

def my_custom_debuginfo(message):
    caller = getframeinfo(currentframe().f_back)   # the caller's frame, not this one
    print(caller.filename + ':' + str(caller.lineno) + ' -', message)
and then use it as you want:
# ... some code here ...
my_custom_debuginfo('what I actually want to print out here')
# ... more code ...
I recommend you put that function in a separate module, that way you can reuse it every time you need it.
Discovered this question for a somewhat related problem, but I wanted more details regarding the execution (and I didn't want to install an entire call-graph package).
If you want more detailed information, you can retrieve a full traceback with the standard library module traceback, and either stash the stack with traceback.extract_stack() or print it out with traceback.print_stack(). This was more suitable for my needs; hope it helps someone else!
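For example, a small sketch of both calls (in Python 3 the extracted entries are FrameSummary objects with filename, lineno, and name attributes):

import traceback

def where_am_i():
    stack = traceback.extract_stack()   # stash this if you want to inspect it later
    caller = stack[-2]                  # [-1] is this line, [-2] is the call site
    print("%s:%d in %s" % (caller.filename, caller.lineno, caller.name))
    traceback.print_stack()             # or just print the whole stack directly

where_am_i()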
Maybe I am completely off track here (and above my pay grade for sure), but what I want to do is give users of my app (which I am writing in Python, since that's the language I know) a Python interpreter to control some objects within my app - something similar to what many 3D and VFX packages have (Maya, Blender, Nuke). This is the code I have so far:
#main.py
import code
import networkx as nx

class Env():
    def __init__(self):
        self.graph = nx.graph.DiGraph()
        # load library with functions that will be available for the user inside the app
        import mylib
        functions = {f: getattr(mylib, f) for f in dir(mylib) if not f.startswith('__')}
        self._interpreter = code.InteractiveInterpreter(locals=functions)

    def execute_node(self, node=None):
        # IRL the main object passed to the user's interpreter will be the self.graph object,
        # but I made it more clear for this question.
        self._interpreter.locals['var'] = 42
        node = "print(var)\nprint(some_function())\nvar = 67"  # let's pretend the node object contains this code
        self._interpreter.runcode(node)

if __name__ == '__main__':
    e = Env()
    # some code, node creation and so on...
    e.execute_node()
    print(e._interpreter.locals['var'])

#mylib.py
var = None  # I have to put this here because if there is no variable the function fails at import

def some_function():
    print(var)
Output:
42    # This prints as expected
None  # The print within the function prints the value that was there when the module was initialized
67    # The last print shows the expected value
So it is clear that Python interprets the function at first import and "bakes in" the global variables it had at import time. Now the question: can I somehow easily make it use the globals passed from code.InteractiveInterpreter(), or should I look for a completely different solution (and which one)? Of course, the idea is that the two Python programs should communicate: the user should use a special library to operate the software, and the backend code should not be exposed to them. Do I make any sense? Thanks :)
This is the one-ish instance where you do want to use the exec() function, but please remember that the user may be able to run any Python code, including stuff that could run forever, mess up your main program, write (or delete) files, etc.
def run_code(code, add_locals={}):
    code_locals = {}
    code_locals.update(add_locals)   # copy in the additional locals so that dict could be reused
    exec(
        code,
        {},            # no globals (you may wish to replace this)
        code_locals,
    )
    return code_locals   # return the updated locals
class Beeper:   # define a toy object
    def beep(self, times):
        print("Beep! " * times)

beeper = Beeper()   # instantiate the object to play with

# Some user code...
user_code = """
x = 5
beeper.beep(x)
x += 3
"""

new_locals = run_code(user_code, {"beeper": beeper})
print(new_locals)
This outputs
Beep! Beep! Beep! Beep! Beep!
{'beeper': <__main__.Beeper object at 0x...>, 'x': 8}
So you can see we can use the locals the user has modified if need be.
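One caveat worth knowing about this pattern: with an empty globals dict, a function the user defines cannot see the user's other top-level names, because a def'd function resolves free names through the globals it was compiled with, not through the locals dict. A sketch that passes a single shared namespace instead avoids this (run_code_shared is a hypothetical variant, not a library function):

def run_code_shared(code, add_names={}):
    ns = {"__builtins__": __builtins__}   # one dict serves as both globals and locals
    ns.update(add_names)
    exec(code, ns)                        # omit the locals argument entirely
    return ns

user_code = """
factor = 2
def double(n):
    return n * factor
x = double(21)
"""
print(run_code_shared(user_code)["x"])    # 42

With run_code's separate scopes above, double(21) would raise NameError: name 'factor' is not defined, because the function looks up factor in the (empty) globals rather than in code_locals.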
I am working on a Django-based web app that takes a Python file as input, which contains some function; in the backend I have some lists that are passed as parameters to the user's function, which generates a single float value as output. The result will be used for some further computation.
Here is how the function inside the user's file looks:

def somefunctionname(list):
    '''some computation performed on list'''
    return float_value
At present the approach I am using is to take the user's file as a normal file input. Then in my views.py I execute the file as a module and pass the parameters with the eval function. A snippet is given below.
Here modulename is the Python file name that I took from the user and import as a module:

exec("import " + modulename)
result = eval(f"{modulename}.{somefunctionname}(arguments)")
This is working absolutely fine, but I know it is not a secure approach.
My question: is there any other way I can run the user's file securely? I know the proposed solutions can't be foolproof, but how else can I do this (for example, if it can be solved with dockerization, what would that approach be, or are there external tools I could use with an API)?
Or, if possible, can somebody tell me how I can simply sandbox this, or point me to any tutorial that can help?
Any reference or resource will be helpful.
It is an important question. In Python, sandboxing is not trivial.
It is one of the few cases where it matters which Python interpreter you are using. For example, Jython generates Java bytecode, and the JVM has its own mechanism to run code securely.
For CPython, the default interpreter, there were originally some attempts to make a restricted execution mode, but they were abandoned a long time ago.
Currently, there is an unofficial project, RestrictedPython, that might give you what you need. It is not a full sandbox, i.e. it will not give you restricted filesystem access or anything like that, but for your needs it may be just enough.
Basically, the project rewrites Python compilation in a more restricted way.
What it allows you to do is compile a piece of code and then execute it, all in a restricted mode. For example:
from RestrictedPython import safe_builtins, compile_restricted

source_code = """
print('Hello world, but secure')
"""

byte_code = compile_restricted(
    source_code,
    filename='<string>',
    mode='exec'
)
exec(byte_code, {'__builtins__': safe_builtins})

>>> Hello world, but secure
Running with __builtins__ set to safe_builtins disables the dangerous functions like opening files, import, and so on. There are also other variations of builtins and other options; take some time to read the docs, they are pretty good.
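One print-specific gotcha worth flagging (this is my understanding of recent RestrictedPython versions, so verify it against the docs): restricted code routes print calls through a _print_ hook, and the collected text is exposed to the restricted code as the special name printed. So to actually capture the output you supply RestrictedPython's PrintCollector, along these lines:

from RestrictedPython import compile_restricted, safe_builtins
from RestrictedPython.PrintCollector import PrintCollector

source_code = """
print('Hello world, but secure')
output = printed               # 'printed' is RestrictedPython's collected print text
"""

byte_code = compile_restricted(source_code, filename='<string>', mode='exec')
restricted_globals = {'__builtins__': safe_builtins, '_print_': PrintCollector}
exec(byte_code, restricted_globals)
print(restricted_globals['output'])   # Hello world, but secure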
EDIT:
Here is an example for your use case:
from RestrictedPython import safe_builtins, compile_restricted
from RestrictedPython.Eval import default_guarded_getitem

def execute_user_code(user_code, user_func, *args, **kwargs):
    """ Executes user code in a restricted env
        Args:
            user_code(str) - String containing the unsafe code
            user_func(str) - Function inside user_code to execute and return value
            *args, **kwargs - arguments passed to the user function
        Return:
            Return value of the user_func
    """

    def _apply(f, *a, **kw):
        return f(*a, **kw)

    try:
        # These are the variables we allow the user code to see. #result will contain the return value.
        restricted_locals = {
            "result": None,
            "args": args,
            "kwargs": kwargs,
        }

        # If you want the user to be able to use some of your functions inside his code,
        # you should add this function to this dictionary.
        # By default many standard actions are disabled. Here I add _apply_ to be able to access
        # args and kwargs and _getitem_ to be able to use arrays. Just think before you add
        # something else. I am not saying you shouldn't do it. You should understand what you
        # are doing, that's all.
        restricted_globals = {
            "__builtins__": safe_builtins,
            "_getitem_": default_guarded_getitem,
            "_apply_": _apply,
        }

        # Add another line to the user code that executes #user_func
        user_code += "\nresult = {0}(*args, **kwargs)".format(user_func)

        # Compile the user code
        byte_code = compile_restricted(user_code, filename="<user_code>", mode="exec")

        # Run it
        exec(byte_code, restricted_globals, restricted_locals)

        # User code has modified result inside restricted_locals. Return it.
        return restricted_locals["result"]

    except SyntaxError as e:
        # Do whatever you want if the user has code that does not compile
        raise
    except Exception as e:
        # The code did something that is not allowed. Add some nasty punishment to the user here.
        raise
Now you have a function execute_user_code, that receives some unsafe code as a string, a name of a function from this code, arguments, and returns the return value of the function with the given arguments.
Here is a very stupid example of some user code:
example = """
def test(x, name="Johny"):
return name + " likes " + str(x*x)
"""
# Lets see how this works
print(execute_user_code(example, "test", 5))
# Result: Johny likes 25
But here is what happens when the user code tries to do something unsafe:
malicious_example = """
import sys
print("Now I have the access to your system, muhahahaha")
"""

# Let's see how this works
print(execute_user_code(malicious_example, "test", 5))

# Result - evil plan failed:
#   Traceback (most recent call last):
#     File "restr.py", line 69, in <module>
#       print(execute_user_code(malicious_example, "test", 5))
#     File "restr.py", line 45, in execute_user_code
#       exec(byte_code, restricted_globals, restricted_locals)
#     File "<user_code>", line 2, in <module>
#   ImportError: __import__ not found
Possible extension:
Pay attention that the user code is compiled on each call to the function. However, it is possible that you would like to compile the user code once and then execute it with different parameters. So all you have to do is save the byte_code somewhere, then call exec with a different set of restricted_locals each time.
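A sketch of that compile-once variant, reusing example and restricted_globals from the answer above:

from RestrictedPython import compile_restricted

def compile_user_code(user_code, user_func):
    # Compile once; the appended line still selects which function to call
    user_code += "\nresult = {0}(*args, **kwargs)".format(user_func)
    return compile_restricted(user_code, filename="<user_code>", mode="exec")

def run_compiled(byte_code, restricted_globals, *args, **kwargs):
    # Execute many times, each time with a fresh set of restricted locals
    restricted_locals = {"result": None, "args": args, "kwargs": kwargs}
    exec(byte_code, restricted_globals, restricted_locals)
    return restricted_locals["result"]

byte_code = compile_user_code(example, "test")
print(run_compiled(byte_code, restricted_globals, 5))             # Johny likes 25
print(run_compiled(byte_code, restricted_globals, 3, name="Jane"))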
EDIT2:
If you want to allow import, you can write your own import function that permits only the modules you consider safe. Example:
def _import(name, globals=None, locals=None, fromlist=(), level=0):
    safe_modules = ["math"]
    if name in safe_modules:
        # Return the module so the import statement binds it correctly
        return __import__(name, globals, locals, fromlist, level)
    else:
        raise Exception("Don't you even think about it {0}".format(name))

safe_builtins['__import__'] = _import   # must be a part of builtins

restricted_globals = {
    "__builtins__": safe_builtins,
    "_getitem_": default_guarded_getitem,
    "_apply_": _apply,
}
....
i_example = """
import math
def myceil(x):
return math.ceil(x)
"""
print(execute_user_code(i_example, "myceil", 1.5))
Note that this sample import function is VERY primitive; it will not work with things like from x import y. You can look here for a more complex implementation.
EDIT3
Note that lots of Python's built-in functionality is not available out of the box in RestrictedPython; that does not mean it is not available at all, but you may need to implement some function for it to become available.
Even some obvious things, like sum or the += operator, are not obvious in the restricted environment; a sketch for += follows.
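Augmented assignment on a plain name compiles to a call to an _inplacevar_ hook that you must supply in globals. Here is a sketch of a permissive implementation; the exact hook signature is my reading of the RestrictedPython transformer source, so verify it against your version:

import operator

_iops = {"+=": operator.iadd, "-=": operator.isub,
         "*=": operator.imul, "/=": operator.itruediv}

def _inplacevar_(op, x, y):
    # RestrictedPython rewrites e.g. `x += 1` into `x = _inplacevar_('+=', x, 1)`
    return _iops[op](x, y)

restricted_globals["_inplacevar_"] = _inplacevar_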
For example, the for loop uses a _getiter_ function that you must implement and provide yourself (in globals). Since you want to avoid infinite loops, you may want to put some limit on the number of iterations allowed. Here is a sample implementation that limits the number of iterations to 100:
MAX_ITER_LEN = 100

class MaxCountIter:
    def __init__(self, dataset, max_count):
        self.i = iter(dataset)
        self.left = max_count

    def __iter__(self):
        return self

    def __next__(self):
        if self.left > 0:
            self.left -= 1
            return next(self.i)
        else:
            raise StopIteration()

def _getiter(ob):
    return MaxCountIter(ob, MAX_ITER_LEN)

....

restricted_globals = {
    "_getiter_": _getiter,
    ....

for_ex = """
def sum(x):
    y = 0
    for i in range(x):
        y = y + i
    return y
"""

print(execute_user_code(for_ex, "sum", 6))
If you don't want to limit the loop count, just use the identity function as _getiter_:

restricted_globals = {
    "_getiter_": lambda x: x,
Note that simply limiting the loop count does not guarantee security. First, loops can be nested. Second, you cannot limit the execution count of a while loop. To make it secure, you have to execute unsafe code under some timeout.
Please take a moment to read the docs.
Note that not everything is documented (although many things are); you have to learn to read the project's source code for more advanced things. The best way to learn is to try to run some code, see what kind of function is missing, and then read the project's source code to understand how to implement it.
EDIT4
There is still another problem - restricted code may have infinite loops. To avoid it, some kind of timeout is required on the code.
Unfortunately, since you are using Django, which is multi-threaded unless you explicitly specify otherwise, the simple trick for timeouts using signals will not work here; you have to use multiprocessing.
The easiest way in my opinion is to use the timeout-decorator library. Simply add a decorator to execute_user_code so it will look like this:
import timeout_decorator

@timeout_decorator.timeout(5, use_signals=False)
def execute_user_code(user_code, user_func, *args, **kwargs):
And you are done. The code will never run more than 5 seconds.
Pay attention to use_signals=False; without it, this may have some unexpected behavior in Django.
Also note that this is relatively heavy on resources (and I don't really see a way to overcome this). I mean, not really crazy heavy, but it is an extra process spawn. You should keep that in mind in your web server configuration - an API that allows executing arbitrary user code is more vulnerable to DDoS.
You can certainly sandbox the execution with docker if you are careful. You can restrict CPU cycles and max memory, close all network ports, run as a user with read-only access to the file system, and so on.
Still, this would be extremely complex to get right, I think. In my opinion you should not allow a client to execute arbitrary code like that.
My advice would be to check whether a production-ready solution already exists and use that; I was thinking of the sites that allow you to submit some code (Python, Java, whatever) that is executed on the server.
I'd like to run an IPython script in Python, i.e.:

code = '''a=1
b=a+1
b
c'''

from IPython import executor
for l in code.split("\n"):
    print(executor(l))
which would print:
None
None
2
NameError: name 'c' is not defined
Does it exist? I searched the docs, but it does not seem to be (well) documented.
In short, depending on what you want to do and how many IPython features you want to include, you will need to do more.
The first thing you need to know is that IPython separates its code into blocks, and each block has its own result.
If you don't need any of the magic IPython provides and don't want the per-block results, you could just use exec(compile(script, "<script>", "exec"), {}, {}).
If you want more than that, you will need to actually spawn an InteractiveShell instance, as features like %magic and %%magic need a working InteractiveShell.
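For that InteractiveShell route, a minimal sketch of the loop from the question (run_cell is IPython's public API in reasonably recent versions; the returned ExecutionResult carries either the value or the error):

from IPython.core.interactiveshell import InteractiveShell

shell = InteractiveShell.instance()
for block in ["a=1", "b=a+1", "b", "c"]:
    result = shell.run_cell(block)
    if result.error_in_exec is not None:
        print(repr(result.error_in_exec))   # e.g. NameError("name 'c' is not defined")
    else:
        print(result.result)                # None for statements, the value for expressions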
In one of my projects I have this function to execute code in an InteractiveShell-instance:
https://github.com/Irrational-Encoding-Wizardry/yuuno/blob/master/yuuno_ipython/ipython/utils.py#L28
If you want to just get the result of each expression, then you should parse the code using the ast module and add code to store each result.
You will see this in the function linked above from line 34 onwards.
Here is the relevant excerpt:
if isinstance(expr_ast.body[-1], ast.Expr):
    last_expr = expr_ast.body[-1]
    assign = ast.Assign(              # _yuuno_exec_last_ = <LAST_EXPR>
        targets=[ast.Name(
            id=RESULT_VAR,
            ctx=ast.Store()
        )],
        value=last_expr.value
    )
    expr_ast.body[-1] = assign
else:
    assign = ast.Assign(              # _yuuno_exec_last_ = None
        targets=[ast.Name(
            id=RESULT_VAR,
            ctx=ast.Store(),
        )],
        value=ast.NameConstant(
            value=None
        )
    )
    expr_ast.body.append(assign)

ast.fix_missing_locations(expr_ast)
Doing this for every statement in the body, instead of only the last one, and replacing each with a "print result" transformation will do the same for you.
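A self-contained sketch of the same idea, reduced to wrapping just the final expression (RESULT_VAR and run_and_get_result are illustrative names):

import ast

RESULT_VAR = "_exec_last_"

def run_and_get_result(source):
    tree = ast.parse(source)
    if isinstance(tree.body[-1], ast.Expr):
        # rewrite the trailing expression as: _exec_last_ = <expression>
        tree.body[-1] = ast.Assign(
            targets=[ast.Name(id=RESULT_VAR, ctx=ast.Store())],
            value=tree.body[-1].value,
        )
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<user_code>", "exec"), namespace)
    return namespace.get(RESULT_VAR)

print(run_and_get_result("a = 1\nb = a + 1\nb"))   # 2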