Related
I have an object of a custom class that I am trying to serialize and permanently store.
When I serialize it, store it, load it and use it in the same run, it works fine. It only messes up when I've ended the process and then try to load it again from the pickle file. This is the code that works fine:
first_model = NgramModel(3, name="debug")
for paragraph in text:
first_model.train(paragraph_to_sentences(text))
# paragraph to sentences just uses regex to do the equivalent of splitting by punctuation
print(first_model.context_options)
# context_options is a dict (counter)
first_model = NgramModel.load_existing_model("debug")
#load_existing_model loads the pickle file. Look in the class code
print(first_model.context_options)
However, when I run this alone, it prints an empty counter:
first_model = NgramModel.load_existing_model("debug")
print(first_model.context_options)
This is a shortened version of the class file (the only two methods that touch the pickle/dill are update_pickle_state and load_existing_model):
import os
import dill
from itertools import count
from collections import Counter
from os import path
class NgramModel:
context_options: dict[tuple, set[str]] = {}
ngram_count: Counter[tuple] = Counter()
n = 0
pickle_path: str = None
num_paragraphs = 0
num_sentences = 0
def __init__(self, n: int, **kwargs):
self.n = n
self.pickle_path = NgramModel.pathify(kwargs.get('name', NgramModel.gen_pickle_name())) #use name if exists else generate random name
def train(self, paragraph_as_list: list[str]):
'''really the central method that coordinates everything else. Takes a list of sentences, generates data(n-grams) from each, updates the fields, and saves the instance (self) to a pickle file'''
self.num_paragraphs += 1
for sentence in paragraph_as_list:
self.num_sentences += 1
generated = self.generate_Ngrams(sentence)
self.ngram_count.update(generated)
for ngram in generated:
self.add_to_set(ngram)
self.update_pickle_state()
def update_pickle_state(self):
'''saves instance to pickle file'''
file = open(self.pickle_path, "wb")
dill.dump(self, file)
file.close()
#staticmethod
def load_existing_model(name: str):
'''returns object from pickle file'''
path = NgramModel.pathify(name)
file = open(path, "rb")
obj: NgramModel = dill.load(file)
return obj
def generate_Ngrams(self, string: str):
'''ref: https://www.analyticsvidhya.com/blog/2021/09/what-are-n-grams-and-how-to-implement-them-in-python/'''
words = string.split(" ")
words = ["<start>"] * (self.n - 1) + words + ["<end>"] * (self.n - 1)
list_of_tup = []
for i in range(len(words) + 1 - self.n):
list_of_tup.append((tuple(words[i + j] for j in range(self.n - 1)), words[i + self.n - 1]))
return list_of_tup
def add_to_set(self, ngram: tuple[tuple[str, ...], str]):
if ngram[0] not in self.context_options:
self.context_options[ngram[0]] = set()
self.context_options[ngram[0]].add(ngram[1])
#staticmethod
def pathify(name):
'''converts name to path'''
return f"models/{name}.pickle"
#staticmethod
def gen_pickle_name():
for i in count():
new_name = f"unnamed-pickle-{i}"
if not path.exists(NgramModel.pathify(new_name)):
return new_name
All the other fields print properly and are complete and correct except the two dicts
The problem is that is that context_options is a mutable class-member, not an instance member. If I had to guess, dill is only pickling instance members, since the class definition holds class members. That would account for why you see a "filled-out" context_options when you're working in the same shell but not when you load fresh — you're using the dirtied class member in the former case.
It's for stuff like this that you generally don't want to use mutable class members (or similarly, mutable default values in function signatures). More typical is to use something like context_options: dict[tuple, set[str]] = None and then check if it's None in the __init__ to set it to a default value, e.g., an empty dict. Alternatively, you could use a #dataclass and provide a field initializer, i.e.
#dataclasses.dataclass
class NgramModel:
context_options: dict[tuple, set[str]] = dataclasses.field(default_factory=dict)
...
You can observe what I mean about it being a mutable class member with, for instance...
if __name__ == '__main__':
ng = NgramModel(3, name="debug")
print(ng.context_options) # {}
ng.context_options[("foo", "bar")] = {"baz", "qux"}
print(ng.context_options) # {('foo', 'bar'): {'baz', 'qux'}}
ng2 = NgramModel(3, name="debug")
print(ng2.context_options) # {('foo', 'bar'): {'baz', 'qux'}}
I would expect a brand new ng2 to have the same context that the brand new ng had - empty (or whatever an appropriate default is).
This question already has answers here:
Getting the name of a variable as a string
(32 answers)
Closed 4 months ago.
Is it possible to get the original variable name of a variable passed to a function? E.g.
foobar = "foo"
def func(var):
print var.origname
So that:
func(foobar)
Returns:
>>foobar
EDIT:
All I was trying to do was make a function like:
def log(soup):
f = open(varname+'.html', 'w')
print >>f, soup.prettify()
f.close()
.. and have the function generate the filename from the name of the variable passed to it.
I suppose if it's not possible I'll just have to pass the variable and the variable's name as a string each time.
EDIT: To make it clear, I don't recommend using this AT ALL, it will break, it's a mess, it won't help you in any way, but it's doable for entertainment/education purposes.
You can hack around with the inspect module, I don't recommend that, but you can do it...
import inspect
def foo(a, f, b):
frame = inspect.currentframe()
frame = inspect.getouterframes(frame)[1]
string = inspect.getframeinfo(frame[0]).code_context[0].strip()
args = string[string.find('(') + 1:-1].split(',')
names = []
for i in args:
if i.find('=') != -1:
names.append(i.split('=')[1].strip())
else:
names.append(i)
print names
def main():
e = 1
c = 2
foo(e, 1000, b = c)
main()
Output:
['e', '1000', 'c']
To add to Michael Mrozek's answer, you can extract the exact parameters versus the full code by:
import re
import traceback
def func(var):
stack = traceback.extract_stack()
filename, lineno, function_name, code = stack[-2]
vars_name = re.compile(r'\((.*?)\).*$').search(code).groups()[0]
print vars_name
return
foobar = "foo"
func(foobar)
# PRINTS: foobar
Looks like Ivo beat me to inspect, but here's another implementation:
import inspect
def varName(var):
lcls = inspect.stack()[2][0].f_locals
for name in lcls:
if id(var) == id(lcls[name]):
return name
return None
def foo(x=None):
lcl='not me'
return varName(x)
def bar():
lcl = 'hi'
return foo(lcl)
bar()
# 'lcl'
Of course, it can be fooled:
def baz():
lcl = 'hi'
x='hi'
return foo(lcl)
baz()
# 'x'
Moral: don't do it.
Another way you can try if you know what the calling code will look like is to use traceback:
def func(var):
stack = traceback.extract_stack()
filename, lineno, function_name, code = stack[-2]
code will contain the line of code that was used to call func (in your example, it would be the string func(foobar)). You can parse that to pull out the argument
You can't. It's evaluated before being passed to the function. All you can do is pass it as a string.
#Ivo Wetzel's answer works in the case of function call are made in one line, like
e = 1 + 7
c = 3
foo(e, 100, b=c)
In case that function call is not in one line, like:
e = 1 + 7
c = 3
foo(e,
1000,
b = c)
below code works:
import inspect, ast
def foo(a, f, b):
frame = inspect.currentframe()
frame = inspect.getouterframes(frame)[1]
string = inspect.findsource(frame[0])[0]
nodes = ast.parse(''.join(string))
i_expr = -1
for (i, node) in enumerate(nodes.body):
if hasattr(node, 'value') and isinstance(node.value, ast.Call)
and hasattr(node.value.func, 'id') and node.value.func.id == 'foo' # Here goes name of the function:
i_expr = i
break
i_expr_next = min(i_expr + 1, len(nodes.body)-1)
lineno_start = nodes.body[i_expr].lineno
lineno_end = nodes.body[i_expr_next].lineno if i_expr_next != i_expr else len(string)
str_func_call = ''.join([i.strip() for i in string[lineno_start - 1: lineno_end]])
params = str_func_call[str_func_call.find('(') + 1:-1].split(',')
print(params)
You will get:
[u'e', u'1000', u'b = c']
But still, this might break.
You can use python-varname package
from varname import nameof
s = 'Hey!'
print (nameof(s))
Output:
s
Package below:
https://github.com/pwwang/python-varname
For posterity, here's some code I wrote for this task, in general I think there is a missing module in Python to give everyone nice and robust inspection of the caller environment. Similar to what rlang eval framework provides for R.
import re, inspect, ast
#Convoluted frame stack walk and source scrape to get what the calling statement to a function looked like.
#Specifically return the name of the variable passed as parameter found at position pos in the parameter list.
def _caller_param_name(pos):
#The parameter name to return
param = None
#Get the frame object for this function call
thisframe = inspect.currentframe()
try:
#Get the parent calling frames details
frames = inspect.getouterframes(thisframe)
#Function this function was just called from that we wish to find the calling parameter name for
function = frames[1][3]
#Get all the details of where the calling statement was
frame,filename,line_number,function_name,source,source_index = frames[2]
#Read in the source file in the parent calling frame upto where the call was made
with open(filename) as source_file:
head=[source_file.next() for x in xrange(line_number)]
source_file.close()
#Build all lines of the calling statement, this deals with when a function is called with parameters listed on each line
lines = []
#Compile a regex for matching the start of the function being called
regex = re.compile(r'\.?\s*%s\s*\(' % (function))
#Work backwards from the parent calling frame line number until we see the start of the calling statement (usually the same line!!!)
for line in reversed(head):
lines.append(line.strip())
if re.search(regex, line):
break
#Put the lines we have groked back into sourcefile order rather than reverse order
lines.reverse()
#Join all the lines that were part of the calling statement
call = "".join(lines)
#Grab the parameter list from the calling statement for the function we were called from
match = re.search('\.?\s*%s\s*\((.*)\)' % (function), call)
paramlist = match.group(1)
#If the function was called with no parameters raise an exception
if paramlist == "":
raise LookupError("Function called with no parameters.")
#Use the Python abstract syntax tree parser to create a parsed form of the function parameter list 'Name' nodes are variable names
parameter = ast.parse(paramlist).body[0].value
#If there were multiple parameters get the positional requested
if type(parameter).__name__ == 'Tuple':
#If we asked for a parameter outside of what was passed complain
if pos >= len(parameter.elts):
raise LookupError("The function call did not have a parameter at postion %s" % pos)
parameter = parameter.elts[pos]
#If there was only a single parameter and another was requested raise an exception
elif pos != 0:
raise LookupError("There was only a single calling parameter found. Parameter indices start at 0.")
#If the parameter was the name of a variable we can use it otherwise pass back None
if type(parameter).__name__ == 'Name':
param = parameter.id
finally:
#Remove the frame reference to prevent cyclic references screwing the garbage collector
del thisframe
#Return the parameter name we found
return param
If you want a Key Value Pair relationship, maybe using a Dictionary would be better?
...or if you're trying to create some auto-documentation from your code, perhaps something like Doxygen (http://www.doxygen.nl/) could do the job for you?
I wondered how IceCream solves this problem. So I looked into the source code and came up with the following (slightly simplified) solution. It might not be 100% bullet-proof (e.g. I dropped get_text_with_indentation and I assume exactly one function argument), but it works well for different test cases. It does not need to parse source code itself, so it should be more robust and simpler than previous solutions.
#!/usr/bin/env python3
import inspect
from executing import Source
def func(var):
callFrame = inspect.currentframe().f_back
callNode = Source.executing(callFrame).node
source = Source.for_frame(callFrame)
expression = source.asttokens().get_text(callNode.args[0])
print(expression, '=', var)
i = 1
f = 2.0
dct = {'key': 'value'}
obj = type('', (), {'value': 42})
func(i)
func(f)
func(s)
func(dct['key'])
func(obj.value)
Output:
i = 1
f = 2.0
s = string
dct['key'] = value
obj.value = 42
Update: If you want to move the "magic" into a separate function, you simply have to go one frame further back with an additional f_back.
def get_name_of_argument():
callFrame = inspect.currentframe().f_back.f_back
callNode = Source.executing(callFrame).node
source = Source.for_frame(callFrame)
return source.asttokens().get_text(callNode.args[0])
def func(var):
print(get_name_of_argument(), '=', var)
If you want to get the caller params as in #Matt Oates answer answer without using the source file (ie from Jupyter Notebook), this code (combined from #Aeon answer) will do the trick (at least in some simple cases):
def get_caller_params():
# get the frame object for this function call
thisframe = inspect.currentframe()
# get the parent calling frames details
frames = inspect.getouterframes(thisframe)
# frame 0 is the frame of this function
# frame 1 is the frame of the caller function (the one we want to inspect)
# frame 2 is the frame of the code that calls the caller
caller_function_name = frames[1][3]
code_that_calls_caller = inspect.findsource(frames[2][0])[0]
# parse code to get nodes of abstract syntact tree of the call
nodes = ast.parse(''.join(code_that_calls_caller))
# find the node that calls the function
i_expr = -1
for (i, node) in enumerate(nodes.body):
if _node_is_our_function_call(node, caller_function_name):
i_expr = i
break
# line with the call start
idx_start = nodes.body[i_expr].lineno - 1
# line with the end of the call
if i_expr < len(nodes.body) - 1:
# next expression marks the end of the call
idx_end = nodes.body[i_expr + 1].lineno - 1
else:
# end of the source marks the end of the call
idx_end = len(code_that_calls_caller)
call_lines = code_that_calls_caller[idx_start:idx_end]
str_func_call = ''.join([line.strip() for line in call_lines])
str_call_params = str_func_call[str_func_call.find('(') + 1:-1]
params = [p.strip() for p in str_call_params.split(',')]
return params
def _node_is_our_function_call(node, our_function_name):
node_is_call = hasattr(node, 'value') and isinstance(node.value, ast.Call)
if not node_is_call:
return False
function_name_correct = hasattr(node.value.func, 'id') and node.value.func.id == our_function_name
return function_name_correct
You can then run it as this:
def test(*par_values):
par_names = get_caller_params()
for name, val in zip(par_names, par_values):
print(name, val)
a = 1
b = 2
string = 'text'
test(a, b,
string
)
to get the desired output:
a 1
b 2
string text
Since you can have multiple variables with the same content, instead of passing the variable (content), it might be safer (and will be simpler) to pass it's name in a string and get the variable content from the locals dictionary in the callers stack frame. :
def displayvar(name):
import sys
return name+" = "+repr(sys._getframe(1).f_locals[name])
If it just so happens that the variable is a callable (function), it will have a __name__ property.
E.g. a wrapper to log the execution time of a function:
def time_it(func, *args, **kwargs):
start = perf_counter()
result = func(*args, **kwargs)
duration = perf_counter() - start
print(f'{func.__name__} ran in {duration * 1000}ms')
return result
I would need to separate the signature from the body of a function in Python, i.e. :
def mult(a, b):
"Multiplication"
if a or b is None:
# border case
return None
else:
return a * b
(the function is there for demonstration only).
I know that inspect.getsourcelines(mult) will get me the full code, but I would like to have only the body, i.e. less the signature and the docstring:
if a or b is None:
# border case
return None
else:
return a * b
Is there an elegant way to obtain that, ideally using Python3 built-in parsing tools, rather than string manipulation?
As I don't know any function that does this, here is a homemade function that should work.
I don't have time for more in-depth use-cases. So i'll leave my incomplete answer here.
def get_body(func):
""" Using the magic method __doc__, we KNOW the size of the docstring.
We then, just substract this from the total length of the function
"""
try:
lines_to_skip = len(func.__doc__.split('\n'))
except AttributeError:
lines_to_skip = 0
lines = getsourcelines(func)[0]
return ''.join( lines[lines_to_skip+1:] )
def ex_0(a, b):
""" Please !
Don't
Show
This
"""
if a or b is None:
# border case
return None
else:
return a * b
def ex_1(a, b):
''' Please !Don'tShowThis'''
if a or b is None:
# border case
return None
else:
return a * b
def ex_2(a, b):
if a or b is None:
# border case
return None
else:
return a * b
def ex_3(bullshit, hello):
pass
get_body(ex_0)
# if a or b is None:
# # border case
# return None
# else:
# return a * b
get_body(ex_1)
# if a or b is None:
# # border case
# return None
# else:
# return a * b
get_body(ex_2)
# if a or b is None:
# # border case
# return None
# else:
# return a * b
get_body(ex_3)
# pass
you can use this also
i = inspect.getsource(fun_name).index('\n')
j = inspect.getsource(fun_name).rindex(':',0,i)
print(inspect.getsource(fun_name)[j+1:])
If you know your code will always have a non-empty docstring like a above, You can do this with the built in inspect library.
The below snippet should get you acquainted with how to see the source code body of a function.
def hello_world():
"""Python port of original Fortran code"""
print("hello_world")
import inspect
source = inspect.getsource(hello_world)
doc = hello_world.__doc__
code = source.split(doc)
body = code[1]
print(body)
EDIT:
I thought about this some more, you can first remove the defintion line and then strip out the docstring if it exists:
def hello_world():
"""Python port of original Fortran code"""
print("hello_world")
import inspect
source = inspect.getsource(hello_world)
doc = hello_world.__doc__
code = source.split(':',1)
body= code[1].replace(doc, "")
body = body.replace('""""""',"")
print(body)
Thanks a lot for the inspiring answers! I tried to take this further and I found a solution that uses Python's ast facilities.
import inspect, ast
from astunparse import unparse
def mult(a,
b): # comment
"""
This is a 'doctstring', "hello"
"""
if not a is None or b is None:
# border case
return None
else:
return a * b
def get_body(f, doc_string=False):
"""
Get the body text of a function, i.e. without the signature.
NOTE: Comments are stripped.
"""
COMMENT_START = ("'", '"')
code = inspect.getsource(mult)
# print("Function's code:\n", code)
module_tree = ast.parse(code) # this creates a module
# print("Dump:\n", ast.dump(module_tree))
# the first element of the module is evidently the function:
function_body = module_tree.body[0].body
# strip the code lines to remove the additional lines:
lines = [unparse(code_line).strip() for code_line in function_body]
# for i, line in enumerate(lines): print("%s: %s" % (i, line.strip()))
# in principle the notion of body contains the docstring:
if not doc_string:
lines = (line for line in lines if not line.startswith(COMMENT_START))
return '\n'.join(lines).strip()
s =get_body(mult)
print("---------")
print(s)
Here is the result:
$ python3 function_body.py
---------
if ((not (a is None)) or (b is None)):
return None
else:
return (a * b)
You can choose if you want the docstring or not. The downside of this approach (which should not be a downside in my use case) is that comments are stripped.
I also left some commented print statements, in case someone would like to explore the various steps.
I want to write a decorator that inject custom local variable into function.
interface may like this.
def enclose(name, value):
...
def decorator(func):
def wrapper(*args, **kwargs):
return func(*args, **kwargs)
return wrapper
return decorator
expectation:
#enclose('param1', 1)
def f():
param1 += 1
print param1
f() will compile and run without error
output:
2
Is it possible to do this in python? why?
I thought I'd try this out just to see how hard it would be. Pretty hard as it turns out.
First thing was how do you implement this? Is the extra parameter an injected local variable, an additional argument to the function or a nonlocal variable. An injected local variable will be a fresh object each time, but how to create more complicated objects... An additional argument will record mutations to the object, but assignments to the name will be forgotten between function invocations. Additionally, this will require either parsing of the source to find where to place the argument, or directly manipulating code objects. Finally, declaring the variables nonlocal will record mutations to the object and assignments to the name. Effectively a nonlocal is global, but only reachable by the decorated function. Again, using a nonlocal will requiring parsing the source and finding where to place the nonlocal declaration or direct manipulation of a code object.
In the end I decided with using a nonlocal variable and parsing the function source. Originally I was going to manipulate code objects, but it seemed too complicated.
Here is the code for the decorator:
import re
import types
import inspect
class DummyInject:
def __call__(self, **kwargs):
return lambda func: func
def __getattr__(self, name):
return self
class Inject:
function_end = re.compile(r"\)\s*:\s*\n")
indent = re.compile("\s+")
decorator = re.compile("#([a-zA-Z0-9_]+)[.a-zA-Z0-9_]*")
exec_source = """
def create_new_func({closure_names}):
{func_source}
{indent}return {func_name}"""
nonlocal_declaration = "{indent}nonlocal {closure_names};"
def __init__(self, **closure_vars):
self.closure_vars = closure_vars
def __call__(self, func):
lines, line_number = inspect.getsourcelines(func)
self.inject_nonlocal_declaration(lines)
new_func = self.create_new_function(lines, func)
return new_func
def inject_nonlocal_declaration(self, lines):
"""hides nonlocal declaration in first line of function."""
function_body_start = self.get_function_body_start(lines)
nonlocals = self.nonlocal_declaration.format(
indent=self.indent.match(lines[function_body_start]).group(),
closure_names=", ".join(self.closure_vars)
)
lines[function_body_start] = nonlocals + lines[function_body_start]
return lines
def get_function_body_start(self, lines):
line_iter = enumerate(lines)
found_function_header = False
for i, line in line_iter:
if self.function_end.search(line):
found_function_header = True
break
assert found_function_header
for i, line in line_iter:
if not line.strip().startswith("#"):
break
return i
def create_new_function(self, lines, func):
# prepares source -- eg. making sure indenting is correct
declaration_indent, body_indent = self.get_indent(lines)
if not declaration_indent:
lines = [body_indent + line for line in lines]
exec_code = self.exec_source.format(
closure_names=", ".join(self.closure_vars),
func_source="".join(lines),
indent=declaration_indent if declaration_indent else body_indent,
func_name=func.__name__
)
# create new func -- mainly only want code object contained by new func
lvars = {"closure_vars": self.closure_vars}
gvars = self.get_decorators(exec_code, func.__globals__)
exec(exec_code, gvars, lvars)
new_func = eval("create_new_func(**closure_vars)", gvars, lvars)
# add back bits that enable function to work well
# includes original global references and
new_func = self.readd_old_references(new_func, func)
return new_func
def readd_old_references(self, new_func, old_func):
"""Adds back globals, function name and source reference."""
func = types.FunctionType(
code=self.add_src_ref(new_func.__code__, old_func.__code__),
globals=old_func.__globals__,
name=old_func.__name__,
argdefs=old_func.__defaults__,
closure=new_func.__closure__
)
func.__doc__ = old_func.__doc__
return func
def add_src_ref(self, new_code, old_code):
return types.CodeType(
new_code.co_argcount,
new_code.co_kwonlyargcount,
new_code.co_nlocals,
new_code.co_stacksize,
new_code.co_flags,
new_code.co_code,
new_code.co_consts,
new_code.co_names,
new_code.co_varnames,
old_code.co_filename, # reuse filename
new_code.co_name,
old_code.co_firstlineno, # reuse line number
new_code.co_lnotab,
new_code.co_freevars,
new_code.co_cellvars
)
def get_decorators(self, source, global_vars):
"""Creates a namespace for exec function creation in. Must remove
any reference to Inject decorator to prevent infinite recursion."""
namespace = {}
for match in self.decorator.finditer(source):
decorator = eval(match.group()[1:], global_vars)
basename = match.group(1)
if decorator is Inject:
namespace[basename] = DummyInject()
else:
namespace[basename] = global_vars[basename]
return namespace
def get_indent(self, lines):
"""Takes a set of lines used to create a function and returns the
outer indentation that the function is declared in and the inner
indentation of the body of the function."""
body_indent = None
function_body_start = self.get_function_body_start(lines)
for line in lines[function_body_start:]:
match = self.indent.match(line)
if match:
body_indent = match.group()
break
assert body_indent
match = self.indent.match(lines[0])
if not match:
declaration_indent = ""
else:
declaration_indent = match.group()
return declaration_indent, body_indent
if __name__ == "__main__":
a = 1
#Inject(b=10)
def f(c, d=1000):
"f uses injected variables"
return a + b + c + d
#Inject(var=None)
def g():
"""Purposefully generate exception to show stacktraces are still
meaningful."""
create_name_error # line number 164
print(f(100)) # prints 1111
assert f(100) == 1111
assert f.__doc__ == "f uses injected variables" # show doc is retained
try:
g()
except NameError:
raise
else:
assert False
# stack trace shows NameError on line 164
Which outputs the following:
1111
Traceback (most recent call last):
File "inject.py", line 171, in <module>
g()
File "inject.py", line 164, in g
create_name_error # line number 164
NameError: name 'create_name_error' is not defined
The whole thing is hideously ugly, but it works. It's also worth noting that if Inject is used for method, then any injected values are shared between all instances of the class.
You can do it using globals but I don't recommend this approach.
def enclose(name, value):
globals()[name] = value
def decorator(func):
def wrapper(*args, **kwargs):
return func(*args, **kwargs)
return wrapper
return decorator
#enclose('param1', 1)
def f():
global param1
param1 += 1
print(param1)
f()
Suppose I have a function like f(a, b, c=None). The aim is to call the function like f(*args, **kwargs), and then construct a new set of args and kwargs such that:
If the function had default values, I should be able to acquire their values. For example, if I call it like f(1, 2), I should be able to get the tuple (1, 2, None) and/or the dictionary {'c': None}.
If the value of any of the arguments was modified inside the function, get the new value. For example, if I call it like f(1, 100000, 3) and the function does if b > 500: b = 5 modifying the local variable, I should be able to get the the tuple (1, 5, 3).
The aim here is to create a a decorator that finishes the job of a function. The original function acts as a preamble setting up the data for the actual execution, and the decorator finishes the job.
Edit: I'm adding an example of what I'm trying to do. It's a module for making proxies for other classes.
class Spam(object):
"""A fictional class that we'll make a proxy for"""
def eggs(self, start, stop, step):
"""A fictional method"""
return range(start, stop, step)
class ProxyForSpam(clsproxy.Proxy):
proxy_for = Spam
#clsproxy.signature_preamble
def eggs(self, start, stop, step=1):
start = max(0, start)
stop = min(100, stop)
And then, we'll have that:
ProxyForSpam().eggs(-10, 200) -> Spam().eggs(0, 100, 1)
ProxyForSpam().eggs(3, 4) -> Spam().eggs(3, 4, 1)
There are two recipes available here, one which requires an external library and another that uses only the standard library. They don't quite do what you want, in that they actually modify the function being executed to obtain its locals() rather than obtain the locals() after function execution, which is impossible, since the local stack no longer exists after the function finishes execution.
Another option is to see what debuggers, such as WinPDB or even the pdb module do. I suspect they use the inspect module (possibly along with others), to get the frame inside which a function is executing and retrieve locals() that way.
EDIT: After reading some code in the standard library, the file you want to look at is probably bdb.py, which should be wherever the rest of your Python standard library is. Specifically, look at set_trace() and related functions. This will give you an idea of how the Python debugger breaks into the class. You might even be able to use it directly. To get the frame to pass to set_trace() look at the inspect module.
I've stumbled upon this very need today and wanted to share my solution.
import sys
def call_function_get_frame(func, *args, **kwargs):
"""
Calls the function *func* with the specified arguments and keyword
arguments and snatches its local frame before it actually executes.
"""
frame = None
trace = sys.gettrace()
def snatch_locals(_frame, name, arg):
nonlocal frame
if frame is None and name == 'call':
frame = _frame
sys.settrace(trace)
return trace
sys.settrace(snatch_locals)
try:
result = func(*args, **kwargs)
finally:
sys.settrace(trace)
return frame, result
The idea is to use sys.trace() to catch the frame of the next 'call'. Tested on CPython 3.6.
Example usage
import types
def namespace_decorator(func):
frame, result = call_function_get_frame(func)
try:
module = types.ModuleType(func.__name__)
module.__dict__.update(frame.f_locals)
return module
finally:
del frame
#namespace_decorator
def mynamespace():
eggs = 'spam'
class Bar:
def hello(self):
print("Hello, World!")
assert mynamespace.eggs == 'spam'
mynamespace.Bar().hello()
I don't see how you could do this non-intrusively -- after the function is done executing, it doesn't exist any more -- there's no way you can reach inside something that doesn't exist.
If you can control the functions that are being used, you can do an intrusive approach like
def fn(x, y, z, vars):
'''
vars is an empty dict that we use to pass things back to the caller
'''
x += 1
y -= 1
z *= 2
vars.update(locals())
>>> updated = {}
>>> fn(1, 2, 3, updated)
>>> print updated
{'y': 1, 'x': 2, 'z': 6, 'vars': {...}}
>>>
...or you can just require that those functions return locals() -- as #Thomas K asks above, what are you really trying to do here?
Witchcraft below read on your OWN danger(!)
I have no clue what you want to do with this, it's possible but it's an awful hack...
Anyways, I HAVE WARNED YOU(!), be lucky if such things don't work in your favorite language...
from inspect import getargspec, ismethod
import inspect
def main():
#get_modified_values
def foo(a, f, b):
print a, f, b
a = 10
if a == 2:
return a
f = 'Hello World'
b = 1223
e = 1
c = 2
foo(e, 1000, b = c)
# intercept a function and retrieve the modifed values
def get_modified_values(target):
def wrapper(*args, **kwargs):
# get the applied args
kargs = getcallargs(target, *args, **kwargs)
# get the source code
src = inspect.getsource(target)
lines = src.split('\n')
# oh noes string patching of the function
unindent = len(lines[0]) - len(lines[0].lstrip())
indent = lines[0][:len(lines[0]) - len(lines[0].lstrip())]
lines[0] = ''
lines[1] = indent + 'def _temp(_args, ' + lines[1].split('(')[1]
setter = []
for k in kargs.keys():
setter.append('_args["%s"] = %s' % (k, k))
i = 0
while i < len(lines):
indent = lines[i][:len(lines[i]) - len(lines[i].lstrip())]
if lines[i].find('return ') != -1 or lines[i].find('return\n') != -1:
for e in setter:
lines.insert(i, indent + e)
i += len(setter)
elif i == len(lines) - 2:
for e in setter:
lines.insert(i + 1, indent + e)
break
i += 1
for i in range(0, len(lines)):
lines[i] = lines[i][unindent:]
data = '\n'.join(lines) + "\n"
# setup variables
frame = inspect.currentframe()
loc = inspect.getouterframes(frame)[1][0].f_locals
glob = inspect.getouterframes(frame)[1][0].f_globals
loc['_temp'] = None
# compile patched function and call it
func = compile(data, '<witchstuff>', 'exec')
eval(func, glob, loc)
loc['_temp'](kargs, *args, **kwargs)
# there you go....
print kargs
# >> {'a': 10, 'b': 1223, 'f': 'Hello World'}
return wrapper
# from python 2.7 inspect module
def getcallargs(func, *positional, **named):
"""Get the mapping of arguments to values.
A dict is returned, with keys the function argument names (including the
names of the * and ** arguments, if any), and values the respective bound
values from 'positional' and 'named'."""
args, varargs, varkw, defaults = getargspec(func)
f_name = func.__name__
arg2value = {}
# The following closures are basically because of tuple parameter unpacking.
assigned_tuple_params = []
def assign(arg, value):
if isinstance(arg, str):
arg2value[arg] = value
else:
assigned_tuple_params.append(arg)
value = iter(value)
for i, subarg in enumerate(arg):
try:
subvalue = next(value)
except StopIteration:
raise ValueError('need more than %d %s to unpack' %
(i, 'values' if i > 1 else 'value'))
assign(subarg,subvalue)
try:
next(value)
except StopIteration:
pass
else:
raise ValueError('too many values to unpack')
def is_assigned(arg):
if isinstance(arg,str):
return arg in arg2value
return arg in assigned_tuple_params
if ismethod(func) and func.im_self is not None:
# implicit 'self' (or 'cls' for classmethods) argument
positional = (func.im_self,) + positional
num_pos = len(positional)
num_total = num_pos + len(named)
num_args = len(args)
num_defaults = len(defaults) if defaults else 0
for arg, value in zip(args, positional):
assign(arg, value)
if varargs:
if num_pos > num_args:
assign(varargs, positional[-(num_pos-num_args):])
else:
assign(varargs, ())
elif 0 < num_args < num_pos:
raise TypeError('%s() takes %s %d %s (%d given)' % (
f_name, 'at most' if defaults else 'exactly', num_args,
'arguments' if num_args > 1 else 'argument', num_total))
elif num_args == 0 and num_total:
raise TypeError('%s() takes no arguments (%d given)' %
(f_name, num_total))
for arg in args:
if isinstance(arg, str) and arg in named:
if is_assigned(arg):
raise TypeError("%s() got multiple values for keyword "
"argument '%s'" % (f_name, arg))
else:
assign(arg, named.pop(arg))
if defaults: # fill in any missing values with the defaults
for arg, value in zip(args[-num_defaults:], defaults):
if not is_assigned(arg):
assign(arg, value)
if varkw:
assign(varkw, named)
elif named:
unexpected = next(iter(named))
if isinstance(unexpected, unicode):
unexpected = unexpected.encode(sys.getdefaultencoding(), 'replace')
raise TypeError("%s() got an unexpected keyword argument '%s'" %
(f_name, unexpected))
unassigned = num_args - len([arg for arg in args if is_assigned(arg)])
if unassigned:
num_required = num_args - num_defaults
raise TypeError('%s() takes %s %d %s (%d given)' % (
f_name, 'at least' if defaults else 'exactly', num_required,
'arguments' if num_required > 1 else 'argument', num_total))
return arg2value
main()
Output:
1 1000 2
{'a': 10, 'b': 1223, 'f': 'Hello World'}
There you go... I'm not responsible for any small children that get eaten by demons or something the like (or if it breaks on complicated functions).
PS: The inspect module is the pure EVIL.
Since you are trying to manipulate variables in one function, and do some job based on those variables on another function, the cleanest way to do it is having these variables to be an object's attributes.
It could be a dictionary - that could be defined inside the decorator - therefore access to it inside the decorated function would be as a "nonlocal" variable. That cleans up the default parameter tuple of this dictionary, that #bgporter proposed.:
def eggs(self, a, b, c=None):
# nonlocal parms ## uncomment in Python 3
parms["a"] = a
...
To be even more clean, you probably should have all these parameters as attributes of the instance (self) - so that no "magical" variable has to be used inside the decorated function.
As for doing it "magically" without having the parameters set as attributes of certain object explicitly, nor having the decorated function to return the parameters themselves (which is also an option) - that is, to have it to work transparently with any decorated function - I can't think of a way that does not involve manipulating the bytecode of the function itself.
If you can think of a way to make the wrapped function raise an exception at return time, you could trap the exception and check the execution trace.
If it is so important to do it automatically that you consider altering the function bytecode an option, feel free to ask me further.