How to pop elements from another function's kwargs? - python

I have a function that is responsible for getting data from the kwargs of several other functions.
The other functions pass their own kwargs to this function along with a keep argument that determines whether or not to keep these properties in the kwargs - i.e. whether to use get or pop.
def _handle_kwargs(keep, **kwargs):
    # keep: whether to keep the kwarg when we're done with it (i.e. get or pop)
    if keep: func = getattr(kwargs, 'get')
    else: func = getattr(kwargs, 'pop')
    # get or pop some kwargs individually
    debug = func('debug', False)
    assert isinstance(debug, bool)
    ...
    # repeated for several different possible kwargs
    return debug, some_other_kwarg, ...

def normal_function(**kwargs):
    debug, some_other_kwarg = _handle_kwargs(False, **kwargs)
Getting the values from the kwargs works fine. However, if I try to pop the kwargs, then they are still present in the original function's kwargs. I suspect this is because _handle_kwargs is only modifying its own kwargs.
How can I ensure that the kwargs are removed if I use pop, even if that's coming from another function?

I doubt you can do that passing to **kwargs, as it appears to be passed by value, but if it's ok to modify the inner function, you could pass kwargs as a plain dictionary, i.e. without the **.
def test(x):
    print(x)
    x.pop('test')
    print(x)

def real(**kwargs):
    test(kwargs)
    print(kwargs)

real(test='nothing', real='something')
Output
{'test': 'nothing', 'real': 'something'}
{'real': 'something'}
{'real': 'something'}

The problem is that you don't pass a dictionary to _handle_kwargs. The **kwargs syntax when calling a function actually "explodes" kwargs.
That is, if kwargs is {'a':1, 'b':2}, then _handle_kwargs(False, **kwargs) is equivalent to _handle_kwargs(False, a=kwargs['a'], b=kwargs['b']). You don't pass the kwargs dict at all!
_handle_kwargs collects them into a new dictionary, so it won't affect the original one.
The solution is very simple.
First, define def _handle_kwargs(keep, kwargs): without asterisks, so that it just receives a dict.
Second, call it like so:
def normal_function(**kwargs):
    debug, some_other_kwarg = _handle_kwargs(False, kwargs)
Note the second line: it calls _handle_kwargs without asterisks and just passes the dict.

Related

function not recognizing args and kwargs

I am trying to define a function like so:
def get_event_stats(elengths, einds, *args, **kwargs):
    master_list = []
    if avg:
        for arg in args:
            pass  # do stuff...
    if tot:
        for arg in args:
            pass  # do stuff...
    return master_list
I would like elengths and einds to be fixed positional args (these are just arrays of ints). I am trying to use the function by passing it a variable length list of arrays as *args and some **kwargs, in this example two (avg and tot), but potentially more, for example,
avg_event_time = get_event_stats(event_lengths, eventInds, *alist, avg=True, tot=False)
where
alist = [time, counts]
and my kwargs are avg and tot, which are given the value of True and False respectively. Regardless of how I've tried to implement this function, I get some kind of error. What am I missing here in the correct usage of *args and **kwargs?
**kwargs creates a dict; it doesn't inject arbitrary names into your local namespace. If you want to know whether a particular keyword was passed, you can't test if avg: (there is no variable named avg). You need to check whether 'avg' is in the dict, e.g. if 'avg' in kwargs:. To check both existence and "truthiness", so that passing avg=False is equivalent to not passing it at all, test if kwargs.get('avg'):. Using kwargs.get('avg') ensures no exception is thrown if avg wasn't passed at all, unlike if kwargs['avg']:.
Note: You should really move to Python 3 if at all possible. It makes writing this function much more obvious and clean, as you could avoid the need for kwargs completely, and verify no unrecognized keyword arguments were passed by just defining the function as:
def get_event_stats(elengths, einds, *args, avg=False, tot=False):
    master_list = []
    if avg:
        for arg in args:
            pass  # do stuff...
    if tot:
        for arg in args:
            pass  # do stuff...
    return master_list
Note how the body of the function you already wrote works without modification if you explicitly name your keyword arguments after the positional varargs. This makes your code far more self-documenting, as well as more efficient and better at checking itself: the clean Py3 code will raise an error naming the unrecognized argument if you pass avrg=True to it, while the **kwargs approach would require explicit checks for unknown arguments that would slow you down and bloat the code.
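A small sketch of that behaviour under Python 3 follows. The averaging and summing bodies are placeholders invented for illustration, since the original elides the real work as "do stuff".

```python
def get_event_stats(elengths, einds, *args, avg=False, tot=False):
    """Toy stand-in: the per-array work here is hypothetical."""
    master_list = []
    if avg:
        master_list.extend(sum(a) / len(a) for a in args)
    if tot:
        master_list.extend(sum(a) for a in args)
    return master_list


print(get_event_stats([], [], [1, 2, 3], [4, 6], avg=True))  # [2.0, 5.0]

try:
    get_event_stats([], [], [1, 2, 3], avrg=True)  # misspelled keyword
except TypeError as exc:
    # Python itself rejects the unknown keyword; no manual check is needed.
    print(exc)
```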
The closest you could get to the Py3 error-checks with minimal overhead and similar correctness/readability would be:
def get_event_stats(elengths, einds, *args, **kwargs):
    master_list = []
    # Unpack the recognized arguments (with default values), so kwargs left should be empty
    avg = kwargs.pop('avg', False)
    tot = kwargs.pop('tot', False)
    # If any keywords left over, they're unrecognized, raise an error
    if kwargs:
        # Arbitrarily select alphabetically first unknown keyword arg
        raise TypeError('get_event_stats() got an unexpected keyword argument {!r}'.format(min(kwargs)))
    if avg:
        for arg in args:
            pass  # do stuff...
    if tot:
        for arg in args:
            pass  # do stuff...
    return master_list
If you meant that avg and tot should be passed in as keyword args, like in your example get_event_stats(..., avg=True, tot=False), then they are populated in kwargs. You can look them up in the kwargs dict using a key lookup (like kwargs['avg']).
However, if they are not present at all, that will raise a KeyError, so use the dict.get() method instead: kwargs.get('avg') returns None if the key is not present, and None is falsy. Or use kwargs.get('avg', False) if you explicitly want False when it's not present.
def get_event_stats(elengths, einds, *args, **kwargs):
    master_list = []
    if kwargs.get('avg'):
        for arg in args:
            pass  # do stuff...
    if kwargs.get('tot'):
        for arg in args:
            pass  # do stuff...
    return master_list

Data structure of memoization in db

What is the best data structure for caching (saving/storing/memoizing) the results of many functions in a database?
Suppose a function calc_regress with the following definition in Python:
import scipy.stats as st

def calc_regress(ind_key, dep_key, count=30):
    independent_list = sql_select_recent_values(count, ind_key)
    dependant_list = sql_select_recent_values(count, dep_key)
    return st.linregress(independent_list, dependant_list)
I have seen the answers to "What kind of table structure should be used to store memoized function parameters and results in a relational database?", but they seem to solve the problem for just one function, while I have about 500 functions.
Option A
You could use the structure in the linked answer, un-normalized with the number of columns = max number of arguments among the 500 functions. Also need to add a column for the function name.
Then you could do a SELECT * FROM expensive_func_results WHERE func_name = 'calc_regress' AND arg1 = ind_key AND arg2 = dep_key and arg3 = count, etc.
Of course, that's not a very good design to use. For the same function called with fewer parameters, columns with null values/non-matches need to be ignored; otherwise you'll get multiple result rows.
Option B
Create the table/structure as func_name, arguments, result where 'arguments' is always a kwargs dictionary or positional args but not mixed per entry. Even with the kwargs dict stored as a string, order of keys->values in it is not predictable/consistent even if it's the same args. So you'll need to order it before converting to a string and storing it. When you want to query, you'll use SELECT * FROM expensive_func_results WHERE func_name = 'calc_regress' AND arguments = 'str(kwargs_dict)', where str(kwargs_dict) is something you'll set programmatically. It could also be set to the result of inspect.getargspec, (or inspect.getcallargs) though you'll have to check for consistency.
You won't be able to do queries on the argument combos unless you provide all the arguments to the query or partial match with LIKE.
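As a rough sketch of how the 'arguments' column in Option B could be made consistent, the snippet below binds a call to the function's signature, fills in defaults, and serializes the result with sorted keys. The helper name canonical_args and the choice of JSON are assumptions for illustration, not part of the original answer; values that JSON cannot encode fall back to str().

```python
import inspect
import json


def canonical_args(func, *args, **kwargs):
    """Return a stable JSON string describing one call of `func`."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # make omitted defaults explicit
    # sort_keys yields the same string regardless of keyword order;
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(dict(bound.arguments), sort_keys=True, default=str)


def calc_regress(ind_key, dep_key, count=30):
    """Placeholder with the question's signature; the body is irrelevant here."""


print(canonical_args(calc_regress, 1, 2))
# {"count": 30, "dep_key": 2, "ind_key": 1}
```

Positional, keyword, and default-relying calls of the same function then map to the same stored string, which is what the equality test in the WHERE clause needs.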
Option C
Normalised all the way: One table func_calls as func_name, args_combo_id, arg_name_idx, arg_value. Each row of the table will store one arg for one combo of that function's calling args. Another table func_results as func_name, args_combo_id, result. You could also normalise further for func_name to be mapped to a func_id.
In this one, the order of keyword args doesn't matter since you'll be doing an inner join to select each parameter. This query will have to be built programmatically or done via a stored procedure, since the number of joins required to fetch all the parameters is determined by the number of parameters. Your function above has 3 params but you may have another with 10. arg_name_idx is 'argument name or index', so it also works for mixed kwargs + args. Some duplication may occur in cases like calc_regress(ind_key=1, dep_key=2, count=30) and calc_regress(1, 2, 30), since the args_combo_id will be different for both but the result will obviously be the same. The same goes for calc_regress(1, 2) relying on the default value for count; these cases should be avoided, and the table entry should have all args. Again, the inspect module may help in this area.
[Edit] PS: Additionally, for the func_name, you may need to use a fully qualified name to avoid conflicts across modules in your package. And decorators may interfere with that as well; without a deco.__name__ = func.__name__, etc.
PPS: If objects are being passed to functions being memoized in the db, make sure that their __str__ is something useful & repeatable/consistent to store as arg values.
This particular case doesn't require you to re-create objects from the arg values in the db, otherwise, you'd need to make __str__ or __repr__ like the way __repr__ was intended to be (but isn't generally done):
this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment).
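I'd use a key-value store here, where the key could be a concatenation of the id of the function object (to guarantee the key's uniqueness) and its arguments, while the value would be the function's return value. Note that id() is only stable within a single interpreter process, so a key that has to survive restarts would need something persistent, such as the function's qualified name.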
So calc_regress(1, 5, 30) call would produce an example key 139694472779248_1_5_30 where the first part is id(calc_regress). An example key producing function:
>>> def produce_cache_key(fun, *args, **kwargs):
...     args_key = '_'.join(str(a) for a in args)
...     kwargs_key = '_'.join('%s%s' % (k, v) for k, v in kwargs.items())
...     return '%s_%s_%s' % (id(fun), args_key, kwargs_key)
You could keep your results in memory using a dictionary and a decorator:
>>> def cache_result(cache):
...     def decorator(fun):
...         def wrapper(*args, **kwargs):
...             key = produce_cache_key(fun, *args, **kwargs)
...             if key not in cache:
...                 cache[key] = fun(*args, **kwargs)
...             return cache[key]
...         return wrapper
...     return decorator
...
>>> cache = {}
>>> @cache_result(cache)
... def fx(x, y, z=0):
...     print('Doing some expensive job...')
...
>>> fx(1, 2, z=1)
Doing some expensive job...
>>> fx(1, 2, z=1)
>>>

Python closure and function attributes

I tried to reimplement something like partial (which later will have more behavior). Now in the following example lazycall1 seems to work just as fine as lazycall2, so I don't understand why the documentation of partial suggests using the longer second version. Any suggestions? Can it get me in trouble?
def lazycall1(func, *args, **kwargs):
    def f():
        func(*args, **kwargs)
    return f

def lazycall2(func, *args, **kwargs):
    def f():
        func(*args, **kwargs)
    f.func=func # why do I need that?
    f.args=args
    f.kwargs=kwargs
    return f

def A(x):
    print("A", x)

def B(x):
    print("B", x)

a1=lazycall1(A, 1)
b1=lazycall1(B, 2)
a1()
b1()
a2=lazycall2(A, 3)
b2=lazycall2(B, 4)
a2()
b2()
EDIT: Actually the answers given so far aren't quite right. Even with double arguments it would work. Is there another reason?
def lazycall(func, *args):
    def f(*args2):
        return func(*(args+args2))
    return f

def sum_up(a, b):
    return a+b

plusone=lazycall(sum_up, 1)
plustwo=lazycall(sum_up, 2)
print(plusone(6)) #7
print(plustwo(9)) #11
The only extra thing the second form has is a few extra attributes. These might be helpful if you start passing around the functions returned by lazycall2, so that the receiving function can make decisions based on these values.
functools.partial can accept additional arguments - or overridden arguments - in the inner, returned function. Your inner f() functions don't, so there's no need for what you're doing in lazycall2. However, if you wanted to do something like this:
def sum(a, b):
    return a+b

plusone = lazycall3(sum, 1)
plusone(6) # 7
You'd need to do what is shown in those docs.
Look closer at the argument names in the inner function newfunc in the Python documentation page you link to, they are different than those passed to the inner function, args vs. fargs, keywords vs. fkeywords. Their implementation of partial saves the arguments that the outer function was given and adds them to the arguments given to the inner function.
Since you reuse the exact same argument names in your inner function, the original arguments to the outer function won't be accessible in there.
As for setting the func, args, and kwargs attributes on the returned function: a function is an object in Python, and you can set attributes on it. These attributes give you access to the original function and arguments after you have passed them into lazycall2. So a2.func will be A and a2.args will be (3,).
If you don't need to keep track of the original function and arguments, you should be fine with your lazycall1.
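For comparison, here is a hedged sketch that combines both ideas discussed above: merging call-time arguments with the stored ones, and exposing the stored pieces as attributes. It follows the general shape described for functools.partial but is not its actual implementation; the attribute name keywords is chosen here for illustration.

```python
def lazycall(func, *args, **kwargs):
    def f(*more_args, **more_kwargs):
        # Call-time keywords override the ones stored at creation time.
        merged = {**kwargs, **more_kwargs}
        return func(*(args + more_args), **merged)
    # Attributes let other code inspect what this callable wraps.
    f.func = func
    f.args = args
    f.keywords = kwargs
    return f


def sum_up(a, b):
    return a + b


plusone = lazycall(sum_up, 1)
print(plusone(6))              # 7
print(plusone.func is sum_up)  # True
print(plusone.args)            # (1,)
```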

Setting parameter names and defaults in method created at runtime

I need to give an automatically constructed method positional and default parameters. If I didn't care about the parameter names or defaults I would have the method take *args and **kwargs. Here's an example of what I'm trying to do (note that
def make_method(params):
    # Create a method whose parameters are defined in params.
    return method

params = {'param1': 'default_value', 'param2': None}
method = make_method(params)
For each non-None value I want to create a positional argument in method and for each None value I want to create a default parameter.
The problem is that I don't know how to set the parameter names and default values at runtime. I don't want to use *args and **kwargs for method, since it would be better to have the correct signature when using help() etc. Is there any way I can do this?
This is a HACK! Also I am not sure I understand how to generate kwargs based on what you asked. But this is a start. Although, I am not sure if it's a good start :)
def make_method(params):
    def impl(*args, **kwargs):
        # do stuff with args & kwargs
        print "I was called with {args} and {kwargs}".format(
            args=args,
            kwargs=kwargs)
        pass
    name = 'some_name'
    scope = dict(__impl=impl)
    exec """\
def {name}({pos_arg_defs}):
    'Very helpful function.'
    return __impl({pos_arg_names})
""".format(name=name,
           pos_arg_defs=', '.join('='.join((k, repr(v)))
                                  for k, v in params.iteritems()
                                  if v is not None),
           pos_arg_names=', '.join(k
                                   for k, v in params.iteritems()
                                   if v is not None)) in scope
    return scope[name]
Here's some REPL output:
>>> help(make_method(dict(foo=42, bar=None, zomg='abc')))
Help on function some_name:

some_name(zomg='abc', foo=42)
    Very helpful function.
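On Python 3, a possible alternative to exec is to build an inspect.Signature and attach it as __signature__, which help() and inspect.signature() take into account. The sketch below is only an illustration under one assumption that differs from the exec version above: a None value is treated as "required, no default", and any other value becomes the parameter's default.

```python
import inspect


def make_method(params):
    """Build a function whose reported signature is derived from `params`."""
    kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
    # Parameters without defaults must come first in a valid signature.
    required = [inspect.Parameter(n, kind)
                for n, v in params.items() if v is None]
    optional = [inspect.Parameter(n, kind, default=v)
                for n, v in params.items() if v is not None]
    sig = inspect.Signature(required + optional)

    def method(*args, **kwargs):
        # bind() raises TypeError for missing or unexpected arguments.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        return dict(bound.arguments)

    method.__signature__ = sig
    return method


method = make_method({'param1': 'default_value', 'param2': None})
print(inspect.signature(method))  # (param2, param1='default_value')
print(method(1))  # {'param2': 1, 'param1': 'default_value'}
```

The wrapper still accepts *args and **kwargs internally, so the argument checking is done by sig.bind rather than by the interpreter's own call machinery.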

python check if function accepts **kwargs

Is there a way to check whether a function accepts **kwargs before calling it? For example:
def FuncA(**kwargs):
    print('ok')

def FuncB(id = None):
    print('ok')

def FuncC():
    print('ok')

args = {'id': '1'}
FuncA(**args)
FuncB(**args)
FuncC(**args)
When I run this FuncA and FuncB would be okay but FuncC errors with got an unexpected keyword argument 'id' as it doesn't accept any arguments
try:
    f(**kwargs)
except TypeError:
    pass  # do stuff
It's easier to ask forgiveness than permission.
import inspect

def foo(a, b, **kwargs):
    pass

# getargspec is the Python 2 API; it was removed in Python 3.11
args, varargs, varkw, defaults = inspect.getargspec(foo)
assert varkw == 'kwargs'
This only works for Python functions. Functions defined in C extensions (and built-ins) may be tricky and sometimes interpret their arguments in quite creative ways. There's no way to reliably detect which arguments such functions expect. Refer to function's docstring and other human-readable documentation.
func is the function in question.
With Python 2, it's:
inspect.getargspec(func).keywords is not None
Python 3 is a bit trickier. Following https://www.python.org/dev/peps/pep-0362/, the kind of the parameter must be VAR_KEYWORD:
Parameter.VAR_KEYWORD - a dict of keyword arguments that aren't bound to any other parameter. This corresponds to a "**kwargs" parameter in a Python function definition.
any(param for param in inspect.signature(func).parameters.values() if param.kind == param.VAR_KEYWORD)
For Python 3 you should use inspect.getfullargspec.
import inspect
def foo(**bar):
    pass
arg_spec = inspect.getfullargspec(foo)
assert arg_spec.varkw and arg_spec.varkw == 'bar'
Seeing that there are a multitude of different answers in this thread, I thought I would give my two cents, using inspect.signature().
Suppose you have this method:
def foo(**kwargs):
    pass
You can test if **kwargs are in this method's signature:
import inspect
sig = inspect.signature(foo)
params = sig.parameters.values()
has_kwargs = any([True for p in params if p.kind == p.VAR_KEYWORD])
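Wrapped into a reusable helper, the same check might look like the sketch below. The helper name accepts_var_keyword and the three sample functions are illustrative only and mirror the FuncA/FuncB/FuncC cases from the question.

```python
import inspect


def accepts_var_keyword(func):
    """Return True if `func` declares a **kwargs-style parameter."""
    params = inspect.signature(func).parameters.values()
    return any(p.kind == inspect.Parameter.VAR_KEYWORD for p in params)


def func_a(**kwargs):
    pass


def func_b(id=None):
    pass


def func_c():
    pass


print([accepts_var_keyword(f) for f in (func_a, func_b, func_c)])
# [True, False, False]
```

Note that func_b still accepts id=... as a keyword even though the helper returns False for it; the helper only answers whether arbitrary keywords are accepted.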
More
Getting the kinds of the parameters that a method takes is also possible:
import inspect
sig = inspect.signature(foo)
params = sig.parameters.values()
for param in params:
    print(param.kind)
You can also store them in a variable like so:
kinds = [param.kind for param in params]
# [<_ParameterKind.VAR_KEYWORD: 4>]
Other than just keyword arguments, there are 5 parameter kinds in total, which are as follows:
POSITIONAL_ONLY # parameters must be positional
POSITIONAL_OR_KEYWORD # parameters can be positional or keyworded (default)
VAR_POSITIONAL # *args
KEYWORD_ONLY # parameters must be keyworded
VAR_KEYWORD # **kwargs
Descriptions can be found in the official documentation of the inspect module, under Parameter.kind.
Examples
POSITIONAL_ONLY
def foo(a, /):
    # the '/' enforces that all preceding parameters must be positional
    pass
foo(1)    # valid
foo(a=1)  # invalid
POSITIONAL_OR_KEYWORD
def foo(a):
    # 'a' can be passed via position or keyword
    # this is the default and most common parameter kind
    pass
VAR_POSITIONAL
def foo(*args):
    pass
KEYWORD_ONLY
def foo(*, a):
    # the '*' enforces that all following parameters must be keyworded
    pass
foo(a=1)  # valid
foo(1)    # invalid
VAR_KEYWORD
def foo(**kwargs):
    pass
It appears that you want to check whether the function accepts an 'id' keyword argument. You can't really do that by inspection, because the function might not be a normal function, or you might have a situation like this:
def f(*args, **kwargs):
    return h(*args, **kwargs)

g = lambda *a, **kw: h(*a, **kw)

def h(arg1=0, arg2=2):
    pass

f(id=3)  # still fails
Catching TypeError as suggested is the best way to do that, but you can't really figure out what caused the TypeError. For example, this would still raise a TypeError:
def f(id=None):
    return "%d" % id

f(**{'id': '5'})
And that might be an error that you want to debug. And if you're doing the check to avoid some side effects of the function, they might still be present if you catch it. For example:
class A(object):
    def __init__(self): self._items = set([1,2,3])
    def f(self, id): return self._items.pop() + id

a = A()
a.f(**{'id': '5'})
My suggestion is to try to identify the functions by another mechanism. For example, pass objects with methods instead of functions, and call only the objects that have a specific method. Or add a flag to the object or the function itself.
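One hedged way to separate "the arguments don't fit the signature" from "the body raised TypeError" on Python 3 is to bind the arguments first and call only if binding succeeds. The helper name call_if_accepted is hypothetical, and this still does not help with wrappers that forward *args and **kwargs, or with callables for which inspect.signature cannot determine a signature.

```python
import inspect


def call_if_accepted(func, **kwargs):
    """Call func(**kwargs) only if the keywords fit its signature.

    Returns a (called, result) tuple.
    """
    try:
        # bind() checks the arguments without running the function body.
        inspect.signature(func).bind(**kwargs)
    except TypeError:
        return False, None
    # A TypeError raised from here on comes from the body and propagates.
    return True, func(**kwargs)


def no_args():
    return 'ok'


def formats_id(id=None):
    return "%d" % id


print(call_if_accepted(no_args, id='1'))   # (False, None)
print(call_if_accepted(formats_id, id=5))  # (True, '5')
```

Because the body never runs when binding fails, side effects such as the set.pop() in the example above are avoided for mismatched calls.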
According to https://docs.python.org/2/reference/datamodel.html
you should be able to test for use of **kwargs using co_flags:
>>> def blah(a, b, kwargs):
...     pass
...
>>> def blah2(a, b, **kwargs):
...     pass
...
>>> (blah.func_code.co_flags & 0x08) != 0
False
>>> (blah2.func_code.co_flags & 0x08) != 0
True
Though, as noted in the reference this may change in the future, so I would definitely advise to be extra careful. Definitely add some unit tests to check this feature is still in place.
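The session above uses the Python 2 attribute func_code. As a sketch for Python 3, the code object is available as __code__, and the inspect module exposes the flag as the named constant CO_VARKEYWORDS (value 0x08), which avoids the magic number. The helper name has_var_keyword_flag is made up for this example.

```python
import inspect


def blah(a, b, kwargs):
    pass


def blah2(a, b, **kwargs):
    pass


def has_var_keyword_flag(func):
    """Check the code object's flags for the **kwargs marker."""
    return bool(func.__code__.co_flags & inspect.CO_VARKEYWORDS)


print(has_var_keyword_flag(blah))   # False
print(has_var_keyword_flag(blah2))  # True
```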
