Python Code: Information on Execution Trace of loops/conditionals - python

I want to get the execution trace of a python function in terms of the loops and conditionals executed upon completion. However, I want to do this without instrumenting the original python function with additional parameters. For example:
def foo(a: int, b: int):
    while a:
        a = do_something()
        if b:
            a = do_something()

if __name__ == "__main__":
    foo(a, b)
After the execution of foo() I want an execution trace something like:
[while: true, if:false, while: true, if: true, while: false, ...] which documents the sequence of conditional evaluations in the code. Is there any way to get this information automatically for an arbitrary python function?
I understand "Coverage" python module returns the "Branch coverage" information. But I am unsure how to use it in this context?

You can use trace_conditions.py as a starting point and modify it if needed.
Example
The foo function defined in the question is used in the example below:
from trace_conditions import trace_conditions
# (1) This will just print conditions
traced_foo = trace_conditions(foo)
traced_foo(a, b)
# while c -> True
# if d -> True
# ...
# (2) This will return conditions
traced_foo = trace_conditions(foo, return_conditions=True)
result, conditions = traced_foo(a, b)
# conditions = [('while', 'c', True), ('if', 'd', True), ...]
Note: ast.unparse is used to get the string representation of a condition. It was introduced in Python 3.9. If you want to use an older version of Python, you may want to install the third-party package astunparse and use it in the function _condition_to_string. Otherwise trace_conditions will not return the string representation of conditions.
TL;DR
Idea
Basically, we want to programmatically add catchers to the function's code. For example, print catchers could look like this:
while x > 5:
    print('while x > 5', x > 5)  # <-- print condition after while
    ...  # do something
print('if x > 5', x > 5)  # <-- print condition before if
if x > 5:
    ...  # do something
So, the main idea is to use code introspection tools in python (inspect, ast, exec).
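To make the idea concrete, here is a minimal, self-contained sketch. It is not the code from trace_conditions.py: instead of inserting print statements before or after each statement, this variant wraps the condition expression itself in a call to a recorder, which also captures the final False of a while loop. The names ConditionWrapper, trace_source and _record are hypothetical, and ast.unparse requires Python 3.9+.

```python
import ast

SOURCE = """
def foo(a, b):
    while a:
        a -= 1
        if b:
            b = 0
"""


class ConditionWrapper(ast.NodeTransformer):
    """Rewrite `while <t>:` / `if <t>:` into `while _record(kind, text, <t>):`."""

    def _wrap(self, node, kind):
        self.generic_visit(node)  # handle nested loops and conditionals first
        node.test = ast.Call(
            func=ast.Name(id="_record", ctx=ast.Load()),
            args=[
                ast.Constant(value=kind),
                ast.Constant(value=ast.unparse(node.test)),
                node.test,
            ],
            keywords=[],
        )
        return node

    def visit_While(self, node):
        return self._wrap(node, "while")

    def visit_If(self, node):
        return self._wrap(node, "if")


def trace_source(source, func_name):
    """Compile `source` with wrapped conditions; return (function, trace list)."""
    trace = []

    def _record(kind, text, value):
        trace.append((kind, text, bool(value)))
        return value  # hand the original value back to the while/if

    tree = ConditionWrapper().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {"_record": _record}
    exec(compile(tree, "<traced>", "exec"), namespace)
    return namespace[func_name], trace


traced_foo, trace = trace_source(SOURCE, "foo")
traced_foo(2, 1)
print(trace)
```

The trade-off of this variant is that it injects the name _record into the compiled function's globals, which is exactly the kind of extra name the yield-based approach described later tries to avoid.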
Implementation
Here I will briefly explain the code in trace_conditions.py:
Main function trace_conditions
The main function is self-explanatory and simply reflects the whole algorithm: (1) build syntactic tree; (2) inject condition catchers; (3) compile new function.
def trace_conditions(
        func: Callable, return_conditions=False):
    catcher_type = 'yield' if return_conditions else 'print'
    tree = _build_syntactic_tree(func)
    _inject_catchers(tree, catcher_type)
    func = _compile_function(tree, globals_=inspect.stack()[1][0].f_globals)
    if return_conditions:
        func = _gather_conditions(func)
    return func
The only thing that requires explanation is globals_=inspect.stack()[1][0].f_globals. To compile a new function, we need to give Python all the modules used by that function (for example, it may use math, numpy, django, etc.). inspect.stack()[1][0].f_globals simply takes the globals of the caller's frame, that is, everything imported or defined in the module that calls trace_conditions.
Caveat!
# math_pi.py
import math

def get_pi():
    return math.pi

# test.py
from math_pi import get_pi
from trace_conditions import trace_conditions

traced = trace_conditions(get_pi)
traced()  # Error! math is not imported in this module
To solve it, you can either modify the code in trace_conditions.py or just add import math to test.py.
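One possible modification, sketched here as an assumption rather than as what trace_conditions.py does, is to compile against the globals of the module that defined the function, which every Python function exposes as func.__globals__. The helper compile_function and the in-memory stand-in for math_pi.py are hypothetical.

```python
import types

# Stand-in for math_pi.py: a module object with its own globals (hypothetical setup).
math_pi = types.ModuleType("math_pi")
exec("import math\ndef get_pi():\n    return math.pi\n", math_pi.__dict__)
get_pi = math_pi.get_pi

# Source of the function alone, as inspect.getsource would return it.
FUNC_SOURCE = "def get_pi():\n    return math.pi\n"


def compile_function(source, name, globals_):
    """Compile `source` and return the function `name`, resolving names via `globals_`."""
    namespace = dict(globals_)  # copy, so the defining module is not modified
    exec(compile(source, "<traced>", "exec"), namespace)
    return namespace[name]


# Use the globals of the defining module instead of the caller's globals.
rebuilt = compile_function(FUNC_SOURCE, "get_pi", get_pi.__globals__)
print(rebuilt())
```

Because the namespace is a copy, later changes to the original module's globals are not visible to the rebuilt function; passing func.__globals__ directly would avoid that, at the cost of defining a new name in that module.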
_build_syntactic_tree
Here we first get the source code of the function using inspect.getsource and then parse it into a syntactic tree using ast.parse. Unfortunately, this did not work when trace_conditions was applied as a decorator (the source inspection step failed), so it seems convenient decorator syntax is not possible with this approach.
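A minimal sketch of this step, assuming it needs nothing beyond inspect.getsource and ast.parse. The function names are hypothetical, and the textwrap.dedent call is an addition so that methods and nested functions, whose source is indented, still parse.

```python
import ast
import inspect
import textwrap


def build_tree_from_source(source):
    """Parse function source text into an ast.Module."""
    # dedent so that indented source (methods, nested functions) parses as top-level code
    return ast.parse(textwrap.dedent(source))


def build_syntactic_tree(func):
    """Fetch the source of `func` and parse it; needs the source file to be available."""
    return build_tree_from_source(inspect.getsource(func))


tree = build_tree_from_source("""
    def foo(a, b):
        while a:
            a -= 1
""")
print(type(tree.body[0]).__name__, tree.body[0].name)
```

Note that inspect.getsource raises OSError when the source cannot be retrieved, for example for functions created with exec.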
_inject_catchers
In this function we traverse the given syntactic tree, find while and if statements, and inject catchers before or after them. The ast module has the function walk, but it yields only the node itself (without its parent), so I implemented a slightly changed version of walk that yields the parent node as well. We need to know the parent if we want to insert a catcher before an if.
def _inject_catchers(tree, catcher_type):
    for parent, node in _walk_with_parent(tree):
        if isinstance(node, ast.While):
            _catch_after_while(node, _create_catcher(node, catcher_type))
        elif isinstance(node, ast.If):
            _catch_before_if(parent, node, _create_catcher(node, catcher_type))
    ast.fix_missing_locations(tree)
At the end we call ast.fix_missing_locations, which fills in technical fields such as lineno that are required to compile the code. You usually need to call it whenever you modify a syntactic tree.
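The parent-aware walk mentioned above could look roughly like this. It is a sketch modeled on how ast.walk traverses breadth-first, not necessarily the version in trace_conditions.py; the name walk_with_parent is hypothetical.

```python
import ast
from collections import deque


def walk_with_parent(tree):
    """Like ast.walk, but yield (parent, node) pairs; the root has parent None."""
    queue = deque([(None, tree)])
    while queue:
        parent, node = queue.popleft()
        for child in ast.iter_child_nodes(node):
            queue.append((node, child))
        yield parent, node


tree = ast.parse("def f(x):\n    if x:\n        pass\n")
pairs = [
    (type(parent).__name__, type(node).__name__)
    for parent, node in walk_with_parent(tree)
    if isinstance(node, ast.If)
]
print(pairs)
```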
Catching elif statement
The funny thing is that Python's ast grammar has no elif statement, only if-else. The ast.If node has a body field that contains the statements of the if body and an orelse field that contains the statements of the else block. An elif is simply represented by an ast.If node inside the orelse field. This fact is reflected in the function _catch_before_if.
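The elif representation is easy to check directly. A small sketch (ast.unparse requires Python 3.9+):

```python
import ast

SOURCE = """
if a:
    pass
elif b:
    pass
else:
    pass
"""

outer = ast.parse(SOURCE).body[0]
inner = outer.orelse[0]  # the elif branch

print(type(outer).__name__, ast.unparse(outer.test))
print(type(inner).__name__, ast.unparse(inner.test))
print([type(node).__name__ for node in inner.orelse])  # the final else block
```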
Catchers (and _gather_conditions)
There are several ways to catch conditions. The simplest is to print them, but this approach will not work if you want to handle them later in Python code. One straightforward way is to have a global empty list to which you append each condition and its value during execution of the function. However, this solution introduces a new name into the namespace that could clash with local names inside the function, so I decided it would be safer to yield the conditions and their information.
The function _gather_conditions adds a wrapper around the function with injected yield statements; the wrapper gathers all yielded conditions and returns both the result of the function and the conditions.
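A sketch of how such a wrapper might work, assuming the injected catchers yield (kind, text, value) tuples. The function instrumented is a hand-written stand-in for an instrumented function, and the names are hypothetical. The key point is that a generator's return value surfaces as the value attribute of StopIteration.

```python
import functools


def gather_conditions(gen_func):
    """Wrap a generator function so that it returns (result, yielded_conditions)."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        conditions = []
        gen = gen_func(*args, **kwargs)
        while True:
            try:
                conditions.append(next(gen))
            except StopIteration as stop:
                # The generator's `return x` surfaces as StopIteration.value.
                return stop.value, conditions
    return wrapper


# Hand-written stand-in for a function after yield catchers were injected.
def instrumented(x):
    yield ('if', 'x > 5', x > 5)
    if x > 5:
        return 'big'
    return 'small'


result, conditions = gather_conditions(instrumented)(7)
print(result, conditions)
```

One limitation of this sketch: a traced function that is itself a generator would have its own yielded values mixed in with the conditions.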

Related

Running function code only when NOT assigning output to variable?

I am looking for a way in Python to skip certain parts of the code inside a function, but only when the output of the function is assigned to a variable. If the function is run without any assignment, then it should run everything inside it.
Something like this:
def function():
    print('a')
    return ('a')

function()
A=function()
The first time that I call function() it should display a on the screen, while the second time nothing should print and only store value returned into A.
I have not tried anything since I am kind of new to Python, but I was imagining it would be something like the if __name__=='__main__': way of checking if a script is being used as a module or run directly.
I don't think such behavior can be achieved in Python, because within the scope of the function call there is no indication of what you will do with the returned value.
You will have to give the function an argument that tells it to skip/stop, with a default value to ease the call.
def call_and_skip(skip_instructions=False):
    if not skip_instructions:
        call_stuff_or_not()
    call_everytime()

call_and_skip()
# will not skip inside instruction
a_variable = call_and_skip(skip_instructions=True)
# will skip inside instructions
As already mentioned in the comments, what you're asking for is not technically possible: a function has no (and cannot have any) knowledge of what the calling code will do with the return value.
For a simple case like your example snippet, the obvious solution is to just remove the print call from within the function and leave it out to the caller, ie:
def fun():
    return 'a'

print(fun())
Now I assume your real code is a bit more complex than this, so such a simple solution would not work. If that's the case, the solution is to split the original function into several distinct ones and let the caller choose which part it wants to call. If you have complex state (local variables) that needs to be shared between the different parts, you can wrap the whole thing in a class, turning the sub-functions into methods and storing those variables as instance attributes.
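A minimal sketch of that class-based split; the class name Greeter and its method names are made up for illustration. The caller decides whether it only wants the value or also the output.

```python
class Greeter:
    """Splits 'compute the value' from 'display the value'."""

    def __init__(self):
        self.value = None  # shared state lives on the instance

    def compute(self):
        """Compute and store the value without printing anything."""
        self.value = 'a'
        return self.value

    def show(self):
        """Print the value, computing it first if necessary."""
        if self.value is None:
            self.compute()
        print(self.value)


greeter = Greeter()
A = greeter.compute()  # silent: only stores and returns the value
greeter.show()         # explicit request to display it
```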

python imported module dot notation as function parameter input

New to Python programming here. I need some help understanding why this does not work:
import x.y # x is a package with __init__.py
def func1(x.y.z): # I get a syntax error here
It works when I do this:
import x.y
a = x.y.z
def func1(a):
I've searched the web and can't find anything that answers this somewhat directly.
Thanks.
With def you define new functions which accept some possibly unknown(!) arguments.
So, def sin(x): means "define a function called sin that accepts one argument". Note that this code means that x can be absolutely anything, the function definition doesn't (and cannot) apply any restrictions on its type, value, size, etc.
When you do
a = "hello"
def test(a):
    pass
The a in the function definition is merely an argument that doesn't have any relation to any other a you use in your code! You could've called it x, pi, z or whatever as the name doesn't really matter (code readability aside).
When you try to write
def test(x.y.z):
    pass
You get a syntax error as there exist restrictions on the variables' and arguments' names that don't allow you to call a variable any name you want. Why? Simply because otherwise you'll get a lot of uncertainty. For example, how to parse this:
# a poorly formatted number literal or a variable definition??
1234hello = "test"
# attempt to access a member of a class (or module) or a variable definition??
x.y.z = 5
# is "yay a variable's name or a poorly formatted string literal??
x = "yay - 5
# the same question as above
f' = df/dx
A function argument is a variable, so the very same restrictions are imposed on it as well.
BTW, take a look at the SO code highlighter going nuts trying to highlight the code above.
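To illustrate the point, here is a short sketch in which math.pi stands in for the question's x.y.z (an assumption, since x.y is not a real package). A parameter name must be a plain identifier, but a dotted value can still be passed as an argument or used as a default value.

```python
import keyword
import math


def is_valid_parameter_name(name):
    """True if `name` could be used as a parameter (or variable) name."""
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_valid_parameter_name("x.y.z"))  # dotted names are attribute access
print(is_valid_parameter_name("a"))


# The dotted expression is fine as a default value or as an argument.
def circle_area(radius, pi=math.pi):
    return pi * radius ** 2


print(circle_area(1.0))
print(circle_area(1.0, pi=3.14))
```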

Alternative to exec

I'm currently trying to code a Python (3.4.4) GUI with tkinter which should allow to fit an arbitrary function to some datapoints. To start easy, I'd like to create some input-function and evaluate it. Later, I would like to plot and fit it using curve_fit from scipy.
In order to do so, I would like to create a dynamic (fitting) function from a user-input-string. I found and read about exec, but people say that (1) it is not safe to use and (2) there is always a better alternative (e.g. here and in many other places). So, I was wondering what would be the alternative in this case?
Here is some example code with two nested functions which works but it's not dynamic:
def buttonfit_press():
    def f(x):
        return x+1
    return f

print(buttonfit_press()(4))
And here is some code that gives rise to NameError: name 'f' is not defined before I can even start to use xval:
def buttonfit_press2(xval):
    actfitfunc = "f(x)=x+1"
    execstr = "def {}:\n return {}\n".format(actfitfunc.split("=")[0], actfitfunc.split("=")[1])
    exec(execstr)
    return f

print(buttonfit_press2(4))
An alternative approach with types.FunctionType discussed here (10303248) wasn't successful either...
So, my question is: Is there a good alternative I could use for this scenario? Or if not, how can I make the code with exec run?
I hope it's understandable and not too vague. Thanks in advance for your ideas and input.
@Gábor Erdős:
Either I don't understand or I disagree. If I code the same segment in the mainloop, it recognizes f and I can execute the code segment from execstr:
actfitfunc = "f(x)=x+1"
execstr = "def {}:\n return {}\n".format(actfitfunc.split("=")[0], actfitfunc.split("=")[1])
exec(execstr)
print(f(4))
>>> 5
@Łukasz Rogalski:
Printing execstr seems fine to me:
def f(x):
    return x+1
Indentation error is unlikely due to my editor, but I double-checked - it's fine.
Introducing my_locals, calling it in exec and printing it afterwards shows:
{'f': <function f at 0x000000000348D8C8>}
However, I still get NameError: name 'f' is not defined.
@user3691475:
Your example is very similar to my first example. But this is not "dynamic" in my understanding, i.e. one can not change the output of the function while the code is running.
@Dunes:
I think this is going in the right direction, thanks. However, I don't understand yet how I can evaluate and use this function in the next step? What I mean is: in order to be able to fit it, I have to extract fitting variables (i.e. a in f(x)=a*x+b) or evaluate the function at various x-values (i.e. print(f(3.14))).
The problem with exec/eval is that they can execute arbitrary code. So to use exec or eval you need to either carefully parse the code fragment to ensure it doesn't contain malicious code (an incredibly hard task), or be sure that the source of the code can be trusted. If you're making a small program for personal use then that's fine. A big program that's responsible for sensitive data or money, definitely not. It would seem your use case counts as having a trusted source.
If all you want is to create an arbitrary function at runtime, then just use a combination of a lambda expression and eval, e.g.
func_str = "lambda x: x + 1" # equates to f(x)=x+1
func = eval(func_str)
assert func(4) == 5
The reason why your attempt isn't working is that locals(), in the context of a function, creates a copy of the local namespace. Mutations to the resulting dictionary do not affect the current local namespace. You would need to do something like:
def g():
    src = """
def f(x):
    return x + 1
"""
    exec_namespace = {}  # exec will place the function f in this dictionary
    exec(src, exec_namespace)
    return exec_namespace['f']  # retrieve f
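Applied to the question's "f(x)=x+1" style of input, the same namespace technique could be sketched as follows. The helper make_function is hypothetical, it assumes a trusted input string of the form name(params) = expression, and it does no validation. Listing the fit parameters after x matches the calling convention that scipy's curve_fit expects from a model function.

```python
def make_function(definition):
    """Turn a trusted string such as 'f(x, a, b) = a*x + b' into a callable."""
    header, body = definition.split("=", 1)
    name, params = header.strip().rstrip(")").split("(")
    source = "def {}({}):\n    return {}\n".format(name.strip(), params, body.strip())
    namespace = {}           # exec places the new function in this dictionary
    exec(source, namespace)  # only safe for trusted input
    return namespace[name.strip()]


f = make_function("f(x, a, b) = a*x + b")
print(f(2.0, 3.0, 1.0))  # evaluate at x=2.0 with a=3.0, b=1.0

g = make_function("f(x)=x+1")
print(g(4))
```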
I'm not sure what exactly are you trying to do, i.e. what functions are allowed, what operations are permitted, etc.
Here is an example of a function generator with one dynamic parameter:
>>> def generator(n):
...     def f(x):
...         return x+n
...     return f
...
>>> plus_one=generator(1)
>>> print(plus_one(4))
5

function_exists in PHP for python3 [duplicate]

This question already has answers here:
How do I check if a variable exists?
(14 answers)
Closed 6 years ago.
Is there something like function_exists in PHP for Python3? I am implementing something that allows users (through some web UI) to define simple rules in JSON as follows (in some weird lisp-like structure):
["_and", ["_tautology"], ["_tautology"]]
and would like to turn that into a python statement, for instance these functions
import operator
from functools import reduce

def _and(*args):
    return lambda context: reduce(operator.and_, [arg(context) for arg in args], True)

def _tautology(*_):
    return lambda *__: True
by turning that original JSON rule into
_and(_tautology(), _tautology())
Just out of curiosity, is ast made for this kind of task? I did this once before, but I am looking for something that is scalable, because what I did before was practically maintaining a dictionary as follows
mapping = {'_and': _and}
and the list would keep growing, which results in more code typed to describe what the string value means instead of implementing the functions. Or should I have used another rule engine? Because one of the rules would look like
["_and", ["_equals", "fieldA", "some_value"],
["_equals", "fieldB", "some_other_value"]]
Assuming _equals is
def _equals(field_name, value):
    return lambda context: context[field_name] == value
so that the rule is expanded to
_and(_equals('fieldA', 'some_value'),
_equals('fieldB', 'some_other_value'))
TL;DR
Main Question: is there something like function_exists for Python3, is ast suitable for this?
Secondary Question: should I use some sort of rule engine instead?
Regarding the duplicate question report: no, I am not checking if a variable exists. I want to know whether there is a function whose name matches a string value. For example, if I have a string '_and', I want to know if there is a function named _and; I am not trying to figure out whether the identifier _and is actually a function.
As Morton pointed out, you could use globals() and locals() to fetch a variable using a string containing the name.
In [32]: a = 1
In [33]: def b():
   ....:     c = 2
   ....:     print(globals()['a'])
   ....:     print(globals()['b'])
   ....:     print(locals()['c'])
   ....:
In [34]: b()
1
<function b at 0x7f425cae3ae8>
2
But! For your task I would recommend using a decorator that registers your functions to a mapping automatically.
import operator
from functools import reduce

_mapping = {}

def register(f):
    _mapping[f.__name__] = f
    return f

@register
def _and(*args):
    return lambda context: reduce(operator.and_,
                                  [arg(context) for arg in args], True)

@register
def _tautology(*_):
    return lambda *_: True
and so your function_exists would be just
_mapping[key]
AST is suitable for inspecting syntax trees generated from parsed python source, modifying existing syntax trees and generating new ones and transpiling to python from a different language by generating syntax trees from it (to name a few uses). So in a way yes, you could generate AST from your JSON and compile that. I believe that is actually what Hy does, though not from JSON, but full blown lisp syntax.

Can I be warned when I used a generator function by accident

I was working with generator functions and private functions of a class. I am wondering:
1. Why, when yielding in __someFunc (which in my case was by accident), does this function appear not to be called from within __someGenerator? Also, what is the terminology I should use when referring to these aspects of the language?
2. Can the Python interpreter warn about such cases?
Below is an example snippet of my scenario.
class someClass():
    def __init__(self):
        pass

    #Copy and paste mistake where yield ended up in a regular function
    def __someFunc(self):
        print "hello"
        #yield True #if yielding in this function it isn't called

    def __someGenerator (self):
        for i in range(0, 10):
            self.__someFunc()
            yield True
        yield False

    def someMethod(self):
        func = self.__someGenerator()
        while func.next():
            print "next"

sc = someClass()
sc.someMethod()
I got burned by this and spent some time trying to figure out why a function just wasn't getting called. I finally discovered I was yielding in a function where I didn't want to.
A "generator" isn't so much a language feature, as a name for functions that "yield." Yielding is pretty much always legal. There's not really any way for Python to know that you didn't "mean" to yield from some function.
This PEP http://www.python.org/dev/peps/pep-0255/ talks about generators, and may help you understand the background better.
I sympathize with your experience, but compilers can't figure out what you "meant for them to do", only what you actually told them to do.
I'll try to answer the first of your questions.
A regular function, when called like this:
val = func()
executes its inside statements until it ends or a return statement is reached. Then the return value of the function is assigned to val.
If a compiler recognizes the function to actually be a generator and not a regular function (it does that by looking for yield statements inside the function -- if there's at least one, it's a generator), the scenario when calling it the same way as above has different consequences. Upon calling func(), no code inside the function is executed, and a special <generator> value is assigned to val. Then, the first time you call val.next(), the actual statements of func are being executed until a yield or return is encountered, upon which the execution of the function stops, value yielded is returned and generator waits for another call to val.next().
That's why, in your example, function __someFunc didn't print "hello" -- its statements were not executed, because you haven't called self.__someFunc().next(), but only self.__someFunc().
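The behavior is easy to reproduce in isolation. The sketch below uses Python 3 syntax (next(gen) rather than the Python 2 gen.next() used in the question) and records calls in a list instead of printing:

```python
calls = []


def some_func():
    calls.append("hello")  # stands in for print "hello"
    yield True             # the stray yield makes this a generator function


gen = some_func()           # creates a generator object; the body has not run yet
calls_before = list(calls)  # still empty at this point
next(gen)                   # runs the body up to the first yield
calls_after = list(calls)

print(calls_before, calls_after)
```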
Unfortunately, I'm pretty sure there's no built-in warning mechanism for programming errors like yours.
Python doesn't know whether you want to create a generator object for later iteration or call a function. But python isn't your only tool for seeing what's going on with your code. If you're using an editor or IDE that allows customized syntax highlighting, you can tell it to give the yield keyword a different color, or even a bright background, which will help you find your errors more quickly, at least. In vim, for example, you might do:
:syntax keyword Yield yield
:highlight yield ctermbg=yellow guibg=yellow ctermfg=blue guifg=blue
Those are horrendous colors, by the way. I recommend picking something better. Another option, if your editor or IDE won't cooperate, is to set up a custom rule in a code checker like pylint. An example from pylint's source tarball:
from pylint.interfaces import IRawChecker
from pylint.checkers import BaseChecker

class MyRawChecker(BaseChecker):
    """check for line continuations with '\' instead of using triple
    quoted string or parenthesis
    """

    __implements__ = IRawChecker

    name = 'custom_raw'
    msgs = {'W9901': ('use \\ for line continuation',
                      ('Used when a \\ is used for a line continuation instead'
                       ' of using triple quoted string or parenthesis.')),
            }
    options = ()

    def process_module(self, stream):
        """process a module
        the module's content is accessible via the stream object
        """
        for (lineno, line) in enumerate(stream):
            if line.rstrip().endswith('\\'):
                self.add_message('W9901', line=lineno)

def register(linter):
    """required method to auto register this checker"""
    linter.register_checker(MyRawChecker(linter))
The pylint manual is available here: http://www.logilab.org/card/pylint_manual
And vim's syntax documentation is here: http://www.vim.org/htmldoc/syntax.html
Because the return keyword is applicable in both generator functions and regular functions, there's nothing you could possibly check (as @Christopher mentions). The return keyword in a generator indicates that a StopIteration exception should be raised.
If you try to return with a value from within a generator, the Python 2 compiler will complain at compile time, which may catch some copy-and-paste mistakes. (In Python 2, return in a generator just means "stop iteration", so a value doesn't make sense; since Python 3.3, return with a value is allowed in a generator and the value is attached to the StopIteration exception, so this check no longer applies there.)
>>> def foo():
...     yield 12
...     return 15
...
  File "<stdin>", line 3
SyntaxError: 'return' with argument inside generator
I personally just advise against copy and paste programming. :-)
From the PEP:
Note that return means "I'm done, and have nothing interesting to
return", for both generator functions and non-generator functions.
We do this.
Generators have names with "generate" or "gen" in their name. It will have a yield statement in the body. Pretty easy to check visually, since no method is much over 20 lines of code.
Other methods don't have "gen" in their name.
Also, we do not ever use __ (double underscore) names under any circumstances. 32,000 lines of code. No __ names.
The "generator vs. non-generator" method function is entirely a design question: what did the programmer "intend" to happen? The compiler can't easily validate your intent; it can only validate what you actually typed.
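A naming convention like this can also be checked mechanically. The sketch below is one possible approach (Python 3), not something the answers above describe: it uses inspect.isgeneratorfunction to list generator methods whose names lack the marker. The helper name unexpected_generators and the "gen" marker are assumptions for illustration.

```python
import inspect


def unexpected_generators(cls, marker="gen"):
    """Return names of generator methods of `cls` whose name lacks `marker`."""
    return sorted(
        name
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if inspect.isgeneratorfunction(member) and marker not in name.lower()
    )


class SomeClass:
    def some_func(self):
        print("hello")
        yield True  # accidental yield

    def gen_values(self):
        yield 1     # intentional generator, named accordingly

    def some_method(self):
        return 42


print(unexpected_generators(SomeClass))
```

Such a check could run in a unit test, so an accidental yield shows up as a test failure rather than as a silently skipped call.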
