Implementing **<class>? - python

Edit:
This question has been marked as a duplicate, but I don't think that it is. Implementing the suggested answer, that is, using the Mapping ABC, does not have the behavior I would like:
from collections import Mapping

class data(Mapping):
    def __init__(self, params):
        self.params = params

    def __getitem__(self, k):
        print "getting", k
        return self.params[k]

    def __len__(self):
        return len(self.params)

    def __iter__(self):
        return (k for k in self.params.keys())

def func(*args, **kwargs):
    print "In func"
    return None

ps = data({"p1": 1., "p2": 2.})

print "\ncalling...."
func(ps)

print "\ncalling...."
func(**ps)
Output:
calling....
In func
calling....
getting p2
getting p1
In func
Which, as mentioned in the question, is not what I want.
The other solution, given in the comments, is to modify the routines that are causing problems. That will certainly work, however I was looking for a quick (lazy?) fix!
Question:
How can I implement the ** operator for a class, other than via __getitem__? For example, I would like to be able to do this:
def func(**kwargs):
    <do some clever stuff>

x = some_generic_class()
func(**x)
without an implicit call to some_generic_class.__getitem__(). In my application I have already implemented __getitem__ with some data logging which I do not want to perform when the class is referenced as above.
If it's not possible to overload the ** operator, is it possible to detect when __getitem__ is being called as a result of the class being passed to a function, rather than explicitly?
Background:
I am working on a physics model that is built out of a set of packages which are chosen according to user input at runtime. The flexible structure of the model means that I rarely know the required parameters, and so I pass a dict of parameter names and values between the models. In order to make this more user-friendly I am now trying to develop a class paramlist that overloads the dict functionality with a set of routines that do some consistency checking, set default values, etc. The idea is that I pass an instance of paramlist rather than a dict. One of the more important aims is to keep a log of which members of paramlist have been referenced by the physics packages and which ones have not. A stripped-out version is below, which aims to maintain a second dict that logs whether a parameter has been referenced:
from copy import copy

class paramlist(object):
    def __init__(self, params):
        self.params = copy(params)
        self.used = {k: False for k in self.params}

    def __getitem__(self, k):
        try:
            v = self.params[k]
        except KeyError:
            raise KeyError("Parameter {} not in parameter list".format(k))
        else:
            self.used[k] = True
            return v

    def __setitem__(self, k, v):
        self.params[k] = v
        self.used[k] = False
Which does not have the behaviour I want:
ps = paramlist({"p1": 1.})

def donothing(*args, **kwargs):
    return None

donothing(ps)
print ps.used["p1"]
donothing(**ps)
print ps.used["p1"]
Output:
False
True
I would like the used dict to remain False in both cases, so that I can tell the user that one of their parameters was not used (implying that they screwed up and a default value has been used instead). I presume that the ** case has the effect of calling __getitem__ on every entry in the paramlist.
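For what it's worth, a minimal probe (not from the original question) makes it easy to see why ** unpacking hits __getitem__: at least in CPython 2.7, for a non-dict mapping the keys are collected via keys() and each value is then fetched via __getitem__, so any logging there will fire:

class Probe(object):
    def keys(self):
        print "keys() called"
        return ["p1"]
    def __getitem__(self, k):
        print "__getitem__({!r}) called".format(k)
        return 1.

def func(**kwargs):
    return kwargs

func(**Probe())  # prints "keys() called", then "__getitem__('p1') called"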

Related

How to access a dictionary value from within the same dictionary in Python? [duplicate]

I'm new to Python, and am sort of surprised I cannot do this.
dictionary = {
    'a': '123',
    'b': dictionary['a'] + '456'
}
I'm wondering what the Pythonic way to correctly do this in my script, because I feel like I'm not the only one that has tried to do this.
EDIT: Enough people were wondering what I'm doing with this, so here are more details for my use cases. Let's say I want to keep dictionary objects to hold file system paths. The paths are relative to other values in the dictionary. For example, this is what one of my dictionaries may look like.
dictionary = {
    'user': 'sholsapp',
    'home': '/home/' + dictionary['user']
}
It is important that at any point in time I may change dictionary['user'] and have all of the dictionary's values reflect the change. Again, this is an example of what I'm using it for, so I hope that it conveys my goal.
From my own research I think I will need to implement a class to do this.
No fear of creating new classes: you can take advantage of Python's string formatting capabilities and simply do:
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item) % self

dictionary = MyDict({
    'user': 'gnucom',
    'home': '/home/%(user)s',
    'bin': '%(home)s/bin'
})

print dictionary["home"]
print dictionary["bin"]
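Assuming the dictionary above, these two prints should give /home/gnucom and /home/gnucom/bin: each % self lookup re-enters MyDict.__getitem__, so nested references are resolved recursively.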
The nearest I came up with without creating an object:
dictionary = {
    'user': 'gnucom',
    'home': lambda: '/home/' + dictionary['user']
}

print dictionary['home']()
dictionary['user'] = 'tony'
print dictionary['home']()
>>> dictionary = {
... 'a':'123'
... }
>>> dictionary['b'] = dictionary['a'] + '456'
>>> dictionary
{'a': '123', 'b': '123456'}
It works fine; in your original attempt, dictionary hasn't been defined yet at the point you try to use it inside the literal (the literal has to be fully evaluated first).
But be careful: this assigns to the key 'b' the value referenced by the key 'a' at the time of assignment, and it is not going to do the lookup every time. If that is what you are looking for, it's possible, but it takes more work.
What you're describing in your edit is how an INI config file works. Python does have a built-in library called ConfigParser which should work for what you're describing.
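As a rough sketch (Python 3's configparser shown here; the section and key names are made up, not from the original answer), its built-in interpolation does exactly this kind of substitution:

import configparser

cfg = configparser.ConfigParser()  # BasicInterpolation by default
cfg.read_string("""
[paths]
user = gnucom
home = /home/%(user)s
bin = %(home)s/bin
""")

print(cfg["paths"]["home"])  # /home/gnucom
print(cfg["paths"]["bin"])   # /home/gnucom/bin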
This is an interesting problem. It seems like Greg has a good solution. But that's no fun ;)
jsbueno has a very elegant solution, but it only applies to strings (as you requested).
The trick to a 'general' self referential dictionary is to use a surrogate object. It takes a few (understatement) lines of code to pull off, but the usage is about what you want:
S = SurrogateDict(AdditionSurrogateDictEntry)
d = S.resolve({'user': 'gnucom',
               'home': '/home/' + S['user'],
               'config': [S['home'] + '/.emacs', S['home'] + '/.bashrc']})
The code to make that happen is not nearly so short. It lives in three classes:
import abc

class SurrogateDictEntry(object):
    __metaclass__ = abc.ABCMeta

    def __init__(self, key):
        """record the key on the real dictionary that this will resolve to a
        value for
        """
        self.key = key

    def resolve(self, d):
        """return the actual value"""
        if hasattr(self, 'op'):
            # any operation done on self will store its name in self.op.
            # if this is set, resolve it by calling the appropriate method
            # now that we can get self.value out of d
            self.value = d[self.key]
            return getattr(self, self.op + 'resolve__')()
        else:
            return d[self.key]

    @staticmethod
    def make_op(opname):
        """A convenience helper. This will be the form of all op hooks for subclasses.
        The actual logic for the op is in __op__resolve__ (e.g. __add__resolve__)
        """
        def op(self, other):
            self.stored_value = other
            self.op = opname
            return self
        op.__name__ = opname
        return op
Next comes the concrete class. Simple enough:
class AdditionSurrogateDictEntry(SurrogateDictEntry):
    __add__ = SurrogateDictEntry.make_op('__add__')
    __radd__ = SurrogateDictEntry.make_op('__radd__')

    def __add__resolve__(self):
        return self.value + self.stored_value

    def __radd__resolve__(self):
        return self.stored_value + self.value
Here's the final class
class SurrogateDict(object):
    def __init__(self, EntryClass):
        self.EntryClass = EntryClass

    def __getitem__(self, key):
        """record the key and return"""
        return self.EntryClass(key)

    @staticmethod
    def resolve(d):
        """resolve self references in d"""
        stack = [d]
        while stack:
            cur = stack.pop()
            # This just tries to set it to an appropriate iterable
            it = xrange(len(cur)) if not hasattr(cur, 'keys') else cur.keys()
            for key in it:
                # Just register your class with SurrogateDictEntry
                # and you can pass whatever.
                while isinstance(cur[key], SurrogateDictEntry):
                    cur[key] = cur[key].resolve(d)
                # I'm just going to check for __iter__ but you can add other
                # checks here for items that we should loop over.
                if hasattr(cur[key], '__iter__'):
                    stack.append(cur[key])
        return d
In response to gnucom's question about why I named the classes the way that I did:
The word surrogate is generally associated with standing in for something else, so it seemed appropriate, because that's what the SurrogateDict class does: an instance replaces the 'self' references in a dictionary literal. That being said, (other than just being straight-up stupid sometimes) naming is probably one of the hardest things for me about coding. If you (or anyone else) can suggest a better name, I'm all ears.
I'll provide a brief explanation. Throughout, S refers to an instance of SurrogateDict and d is the real dictionary.
1. A reference S[key] triggers S.__getitem__ and a SurrogateDictEntry(key) to be placed in d.
2. When S[key] = SurrogateDictEntry(key) is constructed, it stores key. This will be the key into d for the value that this entry of SurrogateDictEntry is acting as a surrogate for.
3. After S[key] is returned, it is either entered into d, or has some operation(s) performed on it. If an operation is performed on it, it triggers the relevant __op__ method, which simply stores the value that the operation is performed with and the name of the operation, and then returns itself. We can't actually resolve the operation because d hasn't been constructed yet.
4. After d is constructed, it is passed to S.resolve. This method loops through d, finding any instances of SurrogateDictEntry and replacing them with the result of calling the resolve method on the instance.
5. The SurrogateDictEntry.resolve method receives the now-constructed d as an argument and can use the value of key that it stored at construction time to get the value that it is acting as a surrogate for. If an operation was performed on it after creation, the op attribute will have been set with the name of the operation that was performed. If the class has a __op__ method, then it has a __op__resolve__ method with the actual logic that would normally be in the __op__ method. So now we have the logic (__op__resolve__) and all the necessary values (self.value, self.stored_value) to finally get the real value of d[key]. So we return that, which step 4 places in the dictionary.
6. Finally, the SurrogateDict.resolve method returns d with all references resolved.
That's a rough sketch. If you have any more questions, feel free to ask.
If you, just like me, were wondering how to make jsbueno's snippet work with {}-style substitutions, below is the example code (which is probably not very efficient, though):
import string

class MyDict(dict):
    def __init__(self, *args, **kw):
        super(MyDict, self).__init__(*args, **kw)
        self.itemlist = super(MyDict, self).keys()
        self.fmt = string.Formatter()

    def __getitem__(self, item):
        return self.fmt.vformat(dict.__getitem__(self, item), {}, self)

xs = MyDict({
    'user': 'gnucom',
    'home': '/home/{user}',
    'bin': '{home}/bin'
})
>>> xs["home"]
'/home/gnucom'
>>> xs["bin"]
'/home/gnucom/bin'
I tried to make it work with the simple replacement of % self with .format(**self), but it turns out it wouldn't work for nested expressions (like 'bin' in the above listing, which references 'home', which has its own reference to 'user') because of the evaluation order (** expansion is done before the actual format call, and it's not delayed like in the original % version).
Write a class, maybe something with properties:
class PathInfo(object):
    def __init__(self, user):
        self.user = user

    @property
    def home(self):
        return '/home/' + self.user

p = PathInfo('thc')
print p.home  # /home/thc
As sort of an extended version of @Tony's answer, you could build a dictionary subclass that calls its values if they are callables:
class CallingDict(dict):
    """Returns the result rather than the value of referenced callables.

    >>> cd = CallingDict({1: "One", 2: "Two", 'fsh': "Fish",
    ...                   "rhyme": lambda d: ' '.join((d[1], d['fsh'],
    ...                                                d[2], d['fsh']))})
    >>> cd["rhyme"]
    'One Fish Two Fish'
    >>> cd[1] = 'Red'
    >>> cd[2] = 'Blue'
    >>> cd["rhyme"]
    'Red Fish Blue Fish'
    """
    def __getitem__(self, item):
        it = super(CallingDict, self).__getitem__(item)
        if callable(it):
            return it(self)
        else:
            return it
Of course this is only usable if you're not actually going to store callables as values. If you need to be able to do that, you could wrap the lambda declaration in a function that adds some attribute to the resulting lambda, and check for it in CallingDict.__getitem__, but at that point it's getting complex and long-winded enough that it might just be easier to use a class for your data in the first place.
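For illustration, here is a rough sketch of that marking idea (the computed() helper and the attribute name are made up, not part of the original answer):

def computed(fn):
    fn.is_computed = True  # tag the callable so __getitem__ knows to call it
    return fn

class CallingDict(dict):
    def __getitem__(self, item):
        it = super(CallingDict, self).__getitem__(item)
        if callable(it) and getattr(it, 'is_computed', False):
            return it(self)
        return it

cd = CallingDict({'user': 'gnucom',
                  'home': computed(lambda d: '/home/' + d['user'])})
print(cd['home'])  # /home/gnucom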
This is very easy in a lazily evaluated language (Haskell).
Since Python is strictly evaluated, we can do a little trick to turn things lazy:
Y = lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args)))

d1 = lambda self: lambda: {
    'a': lambda: 3,
    'b': lambda: self()['a']()
}

# fix d1, and evaluate it
d2 = Y(d1)()

# to get a
d2['a']()  # 3

# to get b
d2['b']()  # 3
Syntax-wise this is not very nice, because we need to explicitly construct lazy expressions with lambda: ... and explicitly evaluate them with ...(). It's the opposite of the problem in lazy languages, which need strictness annotations; here in Python we end up needing laziness annotations.
I think with some more meta-programming and some more tricks, the above could be made easier to use.
Note that this is basically how let-rec works in some functional languages.
The jsbueno answer in Python 3:
class MyDict(dict):
    def __getitem__(self, item):
        return dict.__getitem__(self, item).format(self)

dictionary = MyDict({
    'user': 'gnucom',
    'home': '/home/{0[user]}',
    'bin': '{0[home]}/bin'
})

print(dictionary["home"])
print(dictionary["bin"])
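With the dictionary above, this should print /home/gnucom and /home/gnucom/bin, just like the %-based Python 2 version.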
Here we use the Python 3 string formatting with curly braces {} and the .format() method.
Documentation: https://docs.python.org/3/library/string.html

How to set up a class that has all the methods of, and functions like, a built-in such as float, but holds onto extra data?

I am working with 2 data sets on the order of ~ 100,000 values. These 2 data sets are simply lists. Each item in the list is a small class.
class Datum(object):
    def __init__(self, value, dtype, source, index1=None, index2=None):
        self.value = value
        self.dtype = dtype
        self.source = source
        self.index1 = index1
        self.index2 = index2
For each datum in one list, there is a matching datum in the other list that has the same dtype, source, index1, and index2, which I use to sort the two data sets such that they align. I then do various work with the matching data points' values, which are always floats.
Currently, if I want to determine the relative values of the floats in one data set, I do something like this.
minimum = min([x.value for x in data])
for datum in data:
    datum.value -= minimum
However, it would be nice to have my custom class inherit from float, and be able to act like this.
minimum = min(data)
data = [x - minimum for x in data]
I tried the following.
class Datum(float):
    def __new__(cls, value, dtype, source, index1=None, index2=None):
        new = float.__new__(cls, value)
        new.dtype = dtype
        new.source = source
        new.index1 = index1
        new.index2 = index2
        return new
However, doing
data = [x - minimum for x in data]
removes all of the extra attributes (dtype, source, index1, index2).
How should I set up a class that functions like a float, but holds onto the extra data that I instantiate it with?
UPDATE: I do many types of mathematical operations beyond subtraction, so rewriting all of the methods that work with a float would be very troublesome, and frankly I'm not sure I could rewrite them properly.
I suggest subclassing float and using a couple decorators to "capture" the float output from any method (except for __new__ of course) and returning a Datum object instead of a float object.
First we write the method decorator (which really isn't being used as a decorator below, it's just a function that modifies the output of another function, AKA a wrapper function):
def mydecorator(f, cls):
    # f is the method being modified, cls is its class (in this case, Datum)
    def func_wrapper(*args, **kwargs):
        # *args and **kwargs are all the arguments that were passed to f
        newvalue = f(*args, **kwargs)
        # newvalue now contains the output float would normally produce
        ## Now get the cls instance provided as part of args (we need one
        ## if we're going to reattach instance information later):
        try:
            self = args[0]
            ## Now check to make sure the new value is an instance of some numerical
            ## type, but NOT a bool or a cls type (which might lead to recursion).
            ## Including ints so things like modulo and round will work right.
            if (isinstance(newvalue, float) or isinstance(newvalue, int)) and not isinstance(newvalue, bool) and type(newvalue) != cls:
                ## If newvalue is a float or int, now we make a new cls instance using the
                ## newvalue for value and using the previous self instance information (args[0])
                ## for the other fields
                return cls(newvalue, self.dtype, self.source, self.index1, self.index2)
        # IndexError raised if no args provided, AttributeError raised if self isn't a cls instance
        except (IndexError, AttributeError):
            pass
        ## If newvalue isn't numerical, or we don't have a self, just return what
        ## float would normally return
        return newvalue
    # the function has now been modified and we return the modified version
    # to be used instead of the original version, f
    return func_wrapper
The first decorator only applies to a method to which it is attached. But we want it to decorate all (actually, almost all) the methods inherited from float (well, those that appear in the float's __dict__, anyway). This second decorator will apply our first decorator to all of the methods in the float subclass except for those listed as exceptions (see this answer):
def for_all_methods_in_float(decorator, *exceptions):
    def decorate(cls):
        for attr in float.__dict__:
            if callable(getattr(float, attr)) and attr not in exceptions:
                setattr(cls, attr, decorator(getattr(float, attr), cls))
        return cls
    return decorate
Now we write the subclass much the same as you had before, but decorated, and excluding __new__ from decoration (I guess we could also exclude __init__ but __init__ doesn't return anything, anyway):
@for_all_methods_in_float(mydecorator, '__new__')
class Datum(float):
    def __new__(klass, value, dtype="dtype", source="source", index1="index1", index2="index2"):
        return super(Datum, klass).__new__(klass, value)

    def __init__(self, value, dtype="dtype", source="source", index1="index1", index2="index2"):
        self.value = value
        self.dtype = dtype
        self.source = source
        self.index1 = index1
        self.index2 = index2
        super(Datum, self).__init__()
Here are our testing procedures; iteration seems to work correctly:
d1 = Datum(1.5)
d2 = Datum(3.2)
d3 = d1 + d2
assert d3.source == 'source'
L = [d1, d2, d3]
d4 = max(L)
assert d4.source == 'source'
L = [i for i in L]
assert L[0].source == 'source'
assert type(L[0]) == Datum
minimum = min(L)
assert [x - minimum for x in L][0].source == 'source'
Notes:
I am using Python 3. Not certain if that will make a difference for you.
This approach effectively overrides EVERY method of float other than the exceptions, even the ones for which the result isn't modified. There may be side effects to this (subclassing a built-in and then overriding all of its methods), e.g. a performance hit or something; I really don't know.
This will also decorate nested classes.
This same approach could also be implemented using a metaclass.
The problem is when you do:
x - minimum
in terms of types you are doing either:
datum - float, or datum - integer
Either way Python doesn't know how to do either of them, so what it does is look at the parent classes of the arguments, if it can. Since datum is a type of float, it can easily use float, and the calculation ends up being
float - float
which will obviously result in a float. Python has no way of knowing how to construct your datum object unless you tell it.
To solve this you either need to implement the mathematical operators so that Python knows how to do datum - float, or come up with a different design.
Assuming that dtype, source, index1 and index2 need to stay the same after a calculation, then as an example your class needs:
def __sub__(self, other):
    return datum(self.value - other, self.dtype, self.source, self.index1, self.index2)
This should work (not tested), and it will now allow you to do this:
d = datum(23.0, dtype="float", source="me", index1=1)
e = d - 16
print e.value, e.dtype, e.source, e.index1, e.index2
which should result in:
7.0 float me 1 None
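The same pattern would extend to the other arithmetic hooks; an untested sketch along the lines of the answer above:

def __add__(self, other):
    return datum(self.value + other, self.dtype, self.source, self.index1, self.index2)

def __rsub__(self, other):
    # right-hand subtraction, e.g. 16 - d
    return datum(other - self.value, self.dtype, self.source, self.index1, self.index2)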

Should a plugin adding new instance-methods monkey-patch or subclass/mixin and replace the parent?

As a simple example take a class Polynomial
class Polynomial(object):
    def __init__(self, coefficients):
        self.coefficients = coefficients
for polynomials of the form p(x) = a_0 + a_1*x + a_2*x^2 + ... + a_n*x^n where the list coefficients = (a_0, a_1, ..., a_n) stores those coefficients.
One plugin-module horner could then provide a function horner.evaluate_polynomial(p, x) to evaluate a Polynomial instance p at value x, i.e. return the value of p(x). But instead of calling the function that way, a call to p.evaluate(x) (or more intuitively p(x) via __call__) would be better. But how should it be done?
a) Monkey-patching, i.e.
Polynomial.evaluate = horner.evaluate_polynomial
# or Polynomial.__call__ = horner.evaluate_polynomial
b) Subclassing and replacing the class, i.e.
orgPolynomial = Polynomial

class EvaluatablePolynomial(Polynomial):
    def evaluate(self, x):
        return horner.evaluate_polynomial(self, x)

Polynomial = EvaluatablePolynomial
c) Mixin + Replacing, i.e.
orgPolynomial = Polynomial

class Evaluatable(object):
    def evaluate(self, x):
        return horner.evaluate_polynomial(self, x)

class EvaluatablePolynomial(Polynomial, Evaluatable):
    pass

Polynomial = EvaluatablePolynomial
Sure enough, monkey-patching is the shortest one (especially since I didn't include any check à la hasattr(Polynomial, 'evaluate'), but similarly a subclass should call super() then...), but is it the most Pythonic? Or are there other better alternatives?
Especially considering the possibility for multiple plugins providing the same function, e.g. zeros either using numpy or a self-made bisection, where of course only one implementing plugin should be used, which choice might be less error-prone?
The one and probably most important property of monkey-patching the function directly to the original class instead of replacing it is that references to / instances of the original class before the plugin was loaded will now also have the new attribute. In the given example, this is most likely desired behaviour and should therefore be used.
There may however be other situations where a monkey-patch modifies the behaviour of an existing method in a way that is not compatible with its original implementation and previous instances of the modified class should use the original implementation. Granted, this is not only rare but also bad design, but you should keep this possibility in mind. For some convoluted reasons code might even rely on the absence of a monkey-patch-added method, though it seems hard to come up with a non-artificial example here.
In summary, with few exceptions monkey-patching the methods into the original class (preferably with a hasattr(...) check before patching) should be the preferred way.
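A minimal sketch of that guarded monkey-patch, reusing the names from the question:

if not hasattr(Polynomial, 'evaluate'):
    Polynomial.evaluate = horner.evaluate_polynomial
else:
    raise RuntimeError("Polynomial.evaluate is already provided by another plugin")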
Edit: My current approach is to create a subclass (for simpler code completion and patching) and then use the following patch(patching_class, unpatched_class) method:
import logging
from types import FunctionType, MethodType

logger = logging.getLogger(__name__)

applied_patches = []

class PatchingError(Exception):
    pass

def patch(subcls, cls):
    if subcls not in applied_patches:
        logger.info("Monkeypatching %s methods into %s", subcls, cls)
        for methodname in subcls.__dict__:
            if methodname.startswith('_'):
                logger.debug('Skipping methodname %s', methodname)
                continue
            # TODO treat modified init
            elif hasattr(cls, methodname):
                raise PatchingError(
                    "%s already has method %s, cannot overwrite!"
                    % (cls, methodname))
            else:
                method = getattr(subcls, methodname)
                logger.debug("Adding %s %s", type(method), methodname)
                method = get_raw_method(methodname, method)
                setattr(cls, methodname, method)
        applied_patches.append(subcls)

def get_raw_method(methodname, method):
    # The following wouldn't be necessary in Python 3...
    # http://stackoverflow.com/q/18701102/321973
    if type(method) == FunctionType:
        logger.debug("Making %s static", methodname)
        method = staticmethod(method)
    else:
        assert type(method) == MethodType
        logger.debug("Unbinding %s", methodname)
        method = method.__func__
    return method
The open question is whether the respective subclass' module should directly call patch on import, or whether that should be done manually. I'm also considering writing a decorator or metaclass for such a patching subclass...
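For what it's worth, one possible shape for such a decorator, simply wrapping patch() (an untested sketch):

def patches(cls):
    def decorator(subcls):
        patch(subcls, cls)
        return subcls
    return decorator

@patches(Polynomial)
class EvaluatablePolynomial(Polynomial):
    def evaluate(self, x):
        return horner.evaluate_polynomial(self, x)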

automatic wrapper that adds an output to a function

[I am using Python 2.7]
I wanted to make a little wrapper function that adds one output to a function. Something like:
def add_output(fct, value):
    return lambda *args, **kargs: (fct(*args, **kargs), value)
Example of use:
def f(a): return a + 1

g = add_output(f, 42)
print g(12)  # print: (13,42)
This is the expected result, but it does not work if the function given to add_output returns more than one output (nor if it returns no output). In those cases, the wrapped function returns two outputs: one containing all the outputs of the initial function (or None if it returns nothing), and one with the added output:
def f1(a): return a, a + 1
def f2(a): pass

g1 = add_output(f1, 42)
g2 = add_output(f2, 42)

print g1(12)  # print: ((12,13),42) instead of (12,13,42)
print g2(12)  # print: (None,42) instead of 42
I can see this is related to the impossibility of distinguishing between one output of type tuple and several outputs. But it is disappointing not to be able to do something so simple in a dynamic language like Python...
Does anyone have an idea of a way to achieve this automatically and nicely enough, or am I at a dead end?
Note:
In case this changes anything, my real purpose is to wrap class (instance) methods so that they look like functions (for workflow stuff). However, self needs to be added to the output (in case its content is changed):
class C(object):
    def f(self): return 'foo', 'bar'

def wrap(method):
    return lambda self, *args, **kargs: (self, method(self, *args, **kargs))

f = wrap(C.f)
c = C()
f(c)  # returns (c,('foo','bar')) instead of (c,'foo','bar')
I am working with Python 2.7, so I want a solution for this version, or else I will abandon the idea. I am still interested (and maybe future readers are too) in comments about this issue for Python 3, though.
Your add_output() function is what is called a decorator in Python. Regardless, you can use one of the collections module's ABCs (Abstract Base Classes) to distinguish between different results from the function being wrapped. For example:
import collections

def add_output(fct, value):
    def wrapped(*args, **kwargs):
        result = fct(*args, **kwargs)
        if isinstance(result, collections.Sequence):
            return tuple(result) + (value,)
        elif result is None:
            return value
        else:  # non-None and non-sequence
            return (result, value)
    return wrapped

def f1(a): return a, a + 1
def f2(a): pass

g1 = add_output(f1, 42)
g2 = add_output(f2, 42)

print g1(12)  # -> (12,13,42)
print g2(12)  # -> 42
Depending on what sort of functions you plan on decorating, you might need to use the collections.Iterable ABC instead of, or in addition to, collections.Sequence.
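For example (a rough illustration, not from the original answer), a function returning a set is Iterable but not a Sequence, so with the code above it falls into the final branch and gets nested:

def f3(a): return {a, a + 1}

g3 = add_output(f3, 42)
print g3(12)  # -> ({12, 13}, 42); checking collections.Iterable as well
              # would flatten it to (12, 13, 42), at the cost of ordering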

How to call Python functions dynamically [duplicate]

This question already has answers here:
Calling a function of a module by using its name (a string)
I have this code:
fields = ['name', 'email']

def clean_name():
    pass

def clean_email():
    pass
How can I call clean_name() and clean_email() dynamically?
For example:
for field in fields:
    clean_{field}()
I used the curly brackets because that's how I used to do it in PHP, but it obviously doesn't work.
How do I do this in Python?
If you don't want to use globals or vars, and don't want to make a separate module and/or class to encapsulate the functions you want to call dynamically, you can call them as attributes of the current module:
import sys
...
getattr(sys.modules[__name__], "clean_%s" % fieldname)()
Using globals() is a very, very bad way of doing this. You should be doing it this way:
fields = {'name': clean_name, 'email': clean_email}

for key in fields:
    fields[key]()
Map your functions to values in a dictionary.
Using vars()[] is wrong too.
It would be better to have a dictionary of such functions than to look in globals().
The usual approach is to write a class with such functions:
class Cleaner(object):
    def clean_name(self):
        pass
and then use getattr to get access to them:
cleaner = Cleaner()
for f in fields:
    getattr(cleaner, 'clean_%s' % f)()
You could even move further and do something like this:
class Cleaner(object):
    def __init__(self, fields):
        self.fields = fields

    def clean(self):
        for f in self.fields:
            getattr(self, 'clean_%s' % f)()
Then inherit it and declare your clean_<name> methods on an inherited class:
cleaner = Cleaner(['one', 'two'])
cleaner.clean()
Actually, this can be extended even further to make it cleaner. The first step would probably be adding a check with hasattr() to see whether such a method exists in your class.
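A sketch of what that hasattr() guard might look like in Cleaner.clean() (hypothetical, building on the class above):

    def clean(self):
        for f in self.fields:
            name = 'clean_%s' % f
            if hasattr(self, name):
                getattr(self, name)()
            else:
                pass  # or log / raise for a missing clean_<field> method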
I have come across this problem twice now, and finally came up with a safe and not ugly solution (in my humble opinion).
RECAP of previous answers:
globals is the hacky, fast & easy method, but you have to be super consistent with your function names, and it can break at runtime if variables get overwritten. Also it's un-pythonic, unsafe, unethical, yadda yadda...
Dictionaries (i.e. string-to-function maps) are safer and easy to use... but it annoys me to no end that I have to spread dictionary assignments across my file, which are easy to lose track of.
Decorators made the dictionary solution come together for me. Decorators are a pretty way to attach side-effects & transformations to a function definition.
Example time
fields = ['name', 'email', 'address']

# set up our function dictionary
cleaners = {}

# this is a parameterized decorator
def add_cleaner(key):
    # this is the actual decorator
    def _add_cleaner(func):
        cleaners[key] = func
        return func
    return _add_cleaner
Whenever you define a cleaner function, add this to the declaration:
@add_cleaner('email')
def email_cleaner(email):
    # do stuff here
    return result
The functions are added to the dictionary as soon as their definition is parsed and can be called like this:
cleaned_email = cleaners['email'](some_email)
Alternative proposed by PeterSchorn:
def add_cleaner(func):
    cleaners[func.__name__] = func
    return func

@add_cleaner
def email():
    # clean email
    pass
This uses the function name of the cleaner method as its dictionary key.
It is more concise, though I think the method names become a little awkward.
Pick your favorite.
globals() will give you a dict of the global namespace. From this you can get the function you want:
f = globals()["clean_%s" % field]
Then call it:
f()
Here's another way:
myscript.py:
def f1():
    print 'f1'

def f2():
    print 'f2'

def f3():
    print 'f3'
test.py:
import myscript

for i in range(1, 4):
    getattr(myscript, 'f%d' % i)()
I had a requirement to call different methods of a class from one of its own methods, based on a list of method names passed as input (for running periodic tasks in FastAPI). For executing methods of Python classes, I have expanded the answer provided by @khachik. Here is how you can achieve it from inside or outside of the class:
>>> class Math:
...     def add(self, x, y):
...         return x + y
...     def test_add(self):
...         print(getattr(self, "add")(2, 3))
...
>>> m = Math()
>>> m.test_add()
5
>>> getattr(m, "add")(2, 3)
5
Closely see how you can do it from within the class using self like this:
getattr(self, "add")(2,3)
And from outside the class using an object of the class like this:
m = Math()
getattr(m, "add")(2,3)
Here's another way: define the functions then define a dict with the names as keys:
>>> z=[clean_email, clean_name]
>>> z={"email": clean_email, "name":clean_name}
>>> z['email']()
>>> z['name']()
then you loop over the names as keys.
or how about this one? Construct a string and use 'eval':
>>> field = "email"
>>> f="clean_"+field+"()"
>>> eval(f)
then just loop and construct the strings for eval.
Note that any method that requires constructing a string for evaluation is regarded as kludgy.
for field in fields:
    vars()['clean_' + field]()
In case you have a lot of functions and a different number of parameters:
class Cleaner:
    @classmethod
    def clean(cls, type, *args, **kwargs):
        getattr(cls, f"_clean_{type}")(*args, **kwargs)

    @classmethod
    def _clean_email(cls, *args, **kwargs):
        print("invoked _clean_email function")

    @classmethod
    def _clean_name(cls, *args, **kwargs):
        print("invoked _clean_name function")

for type in ["email", "name"]:
    Cleaner.clean(type)
Output:
invoked _clean_email function
invoked _clean_name function
I would use a dictionary which maps field names to cleaning functions. If some fields don't have a corresponding cleaning function, the for loop handling them can be kept simple by providing some sort of default function for those cases. Here's what I mean:
fields = ['name', 'email', 'subject']

def clean_name():
    pass

def clean_email():
    pass

# (one-time) field to cleaning-function map construction
def get_clean_func(field):
    try:
        return eval('clean_' + field)
    except NameError:
        return lambda: None  # do nothing

clean = dict((field, get_clean_func(field)) for field in fields)

# sample usage
for field in fields:
    clean[field]()
The code above constructs the function dictionary dynamically by determining whether a corresponding function named clean_<field> exists for each field named in the fields list. You would likely only have to execute it once, since it would remain the same as long as the field list or the available cleaning functions aren't changed.
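For what it's worth, the same map can be built without eval by looking the names up in globals() directly (a small variation on the answer above, not part of it):

clean = dict((field, globals().get('clean_' + field, lambda: None))
             for field in fields)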
