How to avoid excessive parameter passing? - python

I am developing a medium size program in python spread across 5 modules. The program accepts command line arguments using OptionParser in the main module, e.g. main.py. These options are later used to determine how methods in other modules behave (e.g. a.py, b.py). As I extend the ability for the user to customise the behaviour of the program, I find that I end up requiring this user-defined parameter in a method in a.py that is not directly called by main.py, but is instead called by another method in a.py:
main.py:
import a

p = some_command_line_argument_value
a.meth1(p)

a.py:
def meth1(p):
    # some code
    res = meth2(p)
    # some more code w/ res

def meth2(p):
    # do something with p
This excessive parameter passing seems wasteful and wrong, but as hard as I try, I cannot think of a design pattern that solves this problem. While I had some formal CS education (minor in CS during my B.Sc.), I've only really come to appreciate good coding practices since I started using Python. Please help me become a better programmer!

Create objects of types relevant to your program, and store the command line options relevant to each in them. Example:

# (Assume WidgetFrobnosticator is a class defined elsewhere in your program.)
f = WidgetFrobnosticator()
f.allow_concave_widgets = option_allow_concave_widgets
f.respect_weasel_pins = option_respect_weasel_pins
# Now the methods of WidgetFrobnosticator have access to your command-line
# parameters, in a way that's not dependent on the input format.

# (Likewise for PlatypusFactory.)
p = PlatypusFactory()
p.allow_parthenogenesis = option_allow_parthenogenesis
p.max_population = option_max_population
# The platypus factory knows about its own options, but not those of the
# WidgetFrobnosticator, or vice versa. This makes each class easier to read
# and implement.
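For concreteness, here is a minimal runnable sketch of this pattern wired up to OptionParser; the class, method, and option names are invented, following the example above:

from optparse import OptionParser

class WidgetFrobnosticator(object):
    def __init__(self):
        # Defaults; overridden from the command line below.
        self.allow_concave_widgets = False
        self.respect_weasel_pins = True

    def frobnosticate(self):
        # Methods read configuration from self instead of taking
        # each option as a parameter.
        if self.allow_concave_widgets:
            print("frobnosticating concave widgets too")

parser = OptionParser()
parser.add_option("--allow-concave-widgets", action="store_true",
                  dest="allow_concave_widgets", default=False)
options, args = parser.parse_args()

f = WidgetFrobnosticator()
f.allow_concave_widgets = options.allow_concave_widgets
f.frobnosticate()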

Maybe you should organize your code more into classes and objects? As I was writing this, Jimmy showed a class-instance based answer, so here is a pure class-based answer. This would be most useful if you only ever wanted a single behavior; if there is any chance at all you might want different defaults some of the time, you should use ordinary object-oriented programming in Python, i.e. pass around class instances with the property p set in the instance, not the class.
class Aclass(object):
    p = None

    @classmethod
    def init_p(cls, value):
        cls.p = value

    @classmethod
    def meth1(cls):
        # some code
        res = cls.meth2()
        # some more code w/ res

    @classmethod
    def meth2(cls):
        # do something with cls.p
        pass

from a import Aclass as ac
ac.init_p(some_command_line_argument_value)
ac.meth1()
ac.meth2()

If "a" is a real object and not just a set of independent helper methods, you can create an "p" member variable in "a" and set it when you instantiate an "a" object. Then your main class will not need to pass "p" into meth1 and meth2 once "a" has been instantiated.

[Caution: my answer isn't specific to python.]
I remember that Code Complete called this kind of parameter a "tramp parameter". Googling for "tramp parameter" doesn't return many results, however.
Some alternatives to tramp parameters might include:
Put the data in a global variable
Put the data in a static variable of a class (similar to global data)
Put the data in an instance variable of a class
Pseudo-global variable: hidden behind a singleton, or some dependency injection mechanism
Personally, I don't mind a tramp parameter as long as there's no more than one; i.e. your example is OK for me, but I wouldn't like ...
import a
p1 = some_command_line_argument_value
p2 = another_command_line_argument_value
p3 = a_further_command_line_argument_value
a.meth1(p1, p2, p3)
... instead I'd prefer ...
import a
p = several_command_line_argument_values
a.meth1(p)
... because if meth2 decides that it wants more data than before, I'd prefer if it could extract this extra data from the original parameter which it's already being passed, so that I don't need to edit meth1.
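A short sketch of that idea (the option names here are invented for illustration): bundle the values into one object, so meth2 can grow without meth1's signature ever changing.

class Options(object):
    def __init__(self, verbosity=0, max_retries=3, dry_run=False):
        self.verbosity = verbosity
        self.max_retries = max_retries
        self.dry_run = dry_run

def meth1(opts):
    # meth1 forwards the whole bundle, not three separate scalars
    return meth2(opts)

def meth2(opts):
    # meth2 can start reading opts.dry_run tomorrow without touching meth1
    return 0 if opts.dry_run else opts.max_retries

p = Options(verbosity=2)
print(meth1(p))  # prints 3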

With objects, parameter lists should normally be very small, since most appropriate information is a property of the object itself. The standard way to handle this is to configure the object properties and then call the appropriate methods of that object. In this case, set p as an attribute of a; your meth2 should also complain if p is not set, as sketched below.
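As a sketch of the "complain if p is not set" part (class and method names assumed from the question):

class A(object):
    def __init__(self):
        self.p = None  # configured later, before any real work

    def meth2(self):
        if self.p is None:
            raise ValueError("p must be set before calling meth2")
        # do something with self.p
        return self.p

a = A()
a.p = "value parsed from the command line"
print(a.meth2())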

Your example is reminiscent of the code smell Message Chains. You may find the corresponding refactoring, Hide Delegate, informative.
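For a quick illustration of Hide Delegate (the classes below are the textbook example, not from the question): instead of letting clients chain through intermediate objects, add a delegating method.

class Department(object):
    def __init__(self, manager):
        self.manager = manager

class Person(object):
    def __init__(self, department):
        self.department = department

    def get_manager(self):
        # Hide the delegate: callers ask the Person directly,
        # instead of reaching through person.department.manager.
        return self.department.manager

boss = Person(None)
worker = Person(Department(boss))
print(worker.get_manager() is boss)  # True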

Related

Exposing arbitrary method names in a Python class for an IDE to read

The situation is this. Suppose you
from nastymodule import NastyObject1, NastyObject2, NastyObject3
and these NastyObjects have, under the hood, a weird implementation that does not cleanly expose their methods due to an inextricable maze of interfaces (COM, DLL calls and the like), so the IDE is not able to suggest them. From the documentation, you read that NastyObject1 has a method do_thing, NastyObject2 has a method do_other_thing, and in fact
NO1 = NastyObject1()
res = NO1.do_thing()
NO2 = NastyObject2()
res = NO2.do_other_thing()
works perfectly as documented. The only problem is that, as I said, the IDE does not know, due to the obscure implementation, about this method do_thing, or about any other methods of that class. Now, for reasons of my own, I have to write a single NstObjWrapper class for all these NastyObjects, capable of dynamically exposing their methods.
Keep in mind that I already wrote NstObjWrapper's __getattr__, so that
NOW1 = NstObjWrapper('NastyObject1')
res = NOW1.do_thing()
NOW2 = NstObjWrapper('NastyObject2')
res = NOW2.do_other_thing()
already works; I only need to find a way to dynamically make IDEs (and any kind of class inspectors) aware that NOW1 has a do_thing method and NOW2 has a do_other_thing method.
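For context, such a __getattr__ might look roughly like this; a sketch only, since the question does not show the real code, and _load_nasty_object is a hypothetical helper standing in for however the wrapped object is actually obtained:

class NstObjWrapper(object):
    def __init__(self, class_name):
        # _load_nasty_object is hypothetical: the real COM/DLL plumbing
        # that produces the wrapped object goes here.
        self._wrapped = _load_nasty_object(class_name)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so it
        # transparently forwards do_thing etc. to the wrapped object.
        return getattr(self._wrapped, name)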
NstObjWrapper can be, if necessary, informed of the methods of the NastyObject's through an exhaustive, hardcoded dict:
methods_dict = {'NastyObject1': ['do_thing', ......],
                'NastyObject2': ['do_other_thing', .......],
                'NastyObject3': [.......]}
But since the class has to be able to wrap all the objects, which all have different methods, you cannot just define methods with the same names and have them call the wrapped NastyObject's methods.
Is this possible? How would you do it?
A pretty simple way to solve this would be the following:
class my_class_func():
    def do_thing(self):
        pass
    def func2(self):
        pass
    def func3(self):
        pass
    def func4(self):
        pass
With each func corresponding to one of the methods you want, and then telling the IDE that the object you created is of that specific class, as follows:
NOW1 = NstObjWrapper('NastyObject1') # type: my_class_func
res = NOW1.do_thing()
You would get auto-complete, since the way the "type" comment is written follows the PEP 484 convention.
Either way, I am not sure this is the best way to go.

Efficiently setting attribute values for a class instantiated within another class

I am trying to set the attribute values of a certain class AuxiliaryClass that is instantiated in a method of the MainClass class, in the most efficient way possible.
AuxiliaryClass is instantiated within a method of MainClass - see below. However, AuxiliaryClass has many different attributes and I need to set the value of those attributes once the class has been instantiated - see the last 3 lines of my code.
Note: due to design constraints I cannot explain here, my classes only contain methods, meaning that I need to declare attributes as methods (see below).
class AuxiliaryClass(object):
    def FirstMethod(self):
        return None
    ...
    def NthMethod(self):
        return None

class MainClass(object):
    def Auxiliary(self):
        return AuxiliaryClass()

def main():
    obj = MainClass()
    obj.Auxiliary().FirstMethod = some_value
    ...
    obj.Auxiliary().NthMethod = some_other_value
    # ~~> further code
Basically I want to replace these last 3 lines of code with something neater, more elegant and more efficient. I know I could use a dictionary if I was instantiating AuxiliaryClass directly:
d = {'FirstMethod': some_value,
     ...
     'NthMethod': some_other_value}
obj = AuxiliaryClass(**d)
But this does not seem to work for the structure of my problem. Finally, I need to set the values of AuxiliaryClass's attributes once MainClass has been instantiated (so I can't set the attribute's values within method Auxiliary).
Is there a better way to do this than obj.Auxiliary().IthMethod = some_value?
EDIT
A couple of people have said that the following lines:
obj.Auxiliary().FirstMethod = some_value
...
obj.Auxiliary().NthMethod = some_other_value
will have no effect because they will immediately get garbage collected. I do not really understand what this means, but if I execute the following lines (after the lines above):
print(obj.Auxiliary().FirstMethod())
...
print(obj.Auxiliary().NthMethod())
I am getting the values I entered previously.
To speed things up, and make the customization somewhat cleaner, you can cache the result of the AuxiliaryClass constructor/singleton/accessor, and loop over a dict calling setattr().
Try something like this:
init_values = {
    'FirstMethod': some_value,
    ...
    'NthMethod': some_other_value,
}

def main():
    obj = MainClass()
    aux = obj.Auxiliary()  # cache the call, only make it once
    for attr, value in init_values.items():  # Python 3 here; iteritems() in Python 2
        setattr(aux, attr, value)
    # other stuff below this point
I understand what is happening here now: my code has a series of decorators on all methods which provide memoization. I do not know exactly how they work, but when they are used, the problem described above - namely, that assignments of the form obj.Auxiliary().IthMethod = some_value are immediately garbage collected - does not occur.
Unfortunately I cannot give further details regarding these decorators, as 1) I do not understand them very well and 2) I cannot transmit this information outside my company. I think under these circumstances it is difficult to answer my question, because I cannot fully disclose all the necessary details.

Overwriting class methods without inheritance (python)

First, if you guys think the way I'm trying to do things is not Pythonic, feel free to offer alternative suggestions.
I have an object whose functionality needs to change based on outside events. What I originally did was create a new object that inherits from the original (let's call it OrigObject()) and overrides the methods that change (let's call the new object NewObject()). Then I modified both constructors so that they can take in a complete object of the other type to fill in their own values based on the passed-in object. Then, when I need to change functionality, I just execute myObject = NewObject(myObject).
I'm starting to see several problems with that approach now. First of all, other places that reference the object need to be updated to reference the new type as well (the above statement, for example, would only update the local myObject variable). That's not hard to update; the only annoying part is remembering to update it in other places each time I change the object, in order to prevent weird program behavior.
Second, I'm noticing scenarios where I need a single method from NewObject(), but the other methods from OrigObject(), and I need to be able to switch the functionality on the fly. It doesn't seem like the best solution anymore to be using inheritance, where I'd need to make M*N different classes (where M is the number of methods the class has that can change, and N is the number of variations for each method) that inherit from OrigObject().
I was thinking of using attribute remapping instead, but I seem to be running into issues with it. For example, say I have something like this:
def hybrid_type2(someobj, a):
    # do something else
    ...

class OrigObject(object):
    ...
    def hybrid_fun(self, a):
        # do something
        ...
    def switch(self, type):
        if type == 1:
            self.hybrid_fun = OrigObject.hybrid_fun
        else:
            self.hybrid_fun = hybrid_type2
Problem is, after doing this and trying to call the new hybrid_fun after switching it, I get an error saying that hybrid_type2() takes exactly 2 arguments, but I'm passing it one. The object doesn't seem to be passing itself as an argument to the new function anymore like it does with its own methods, anything I can do to remedy that?
I tried including hybrid_type2 inside the class as well, and then using self.hybrid_fun = self.hybrid_type2 works, but using self.hybrid_fun = OrigObject.hybrid_fun causes a similar error (complaining that the first argument should be of type OrigObject). I know I can instead define OrigObject.hybrid_fun()'s logic inside OrigObject.hybrid_type1() so I can revert it back the same way I'm setting it (relative to the instance rather than the class, to avoid having the object not be passed as the first argument). But I wanted to ask here if there is a cleaner approach I'm not seeing. Thanks!
EDIT:
Thanks guys, I've given points for several of the solutions that worked well. I essentially ended up using a Strategy pattern using types.MethodType(), I've accepted the answer that explained how to do the Strategy pattern in python (the Wikipedia article was more general, and the use of interfaces is not needed in Python).
Use the types module to create an instance method for a particular instance.
e.g.
import types
def strategyA(possible_self):
pass
instance = OrigObject()
instance.strategy = types.MethodType(strategyA, instance)
instance.strategy()
Note that this only affects this specific instance; no other instances will be affected.
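A self-contained sketch of using this to switch behaviour on the fly (class and function names are illustrative, echoing the question):

import types

class OrigObject(object):
    def hybrid_fun(self, a):
        return "original: %s" % a

def hybrid_type2(self, a):
    return "type2: %s" % a

obj = OrigObject()
print(obj.hybrid_fun(1))                              # original: 1
obj.hybrid_fun = types.MethodType(hybrid_type2, obj)  # rebind on this instance only
print(obj.hybrid_fun(1))                              # type2: 1
del obj.hybrid_fun                                    # revert to the class's method
print(obj.hybrid_fun(1))                              # original: 1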
You want the Strategy Pattern.
Read about descriptors in Python: functions are descriptors, so calling __get__ on one produces a bound method. The following code should work:
else:
    self.hybrid_fun = hybrid_type2.__get__(self, OrigObject)
What about defining it like so:
def hybrid_type2(someobj, a):
    # do something else
    ...

def hybrid_type1(someobj, a):
    # do something
    ...

class OrigObject(object):
    def __init__(self):
        ...
        self.run_the_fun = hybrid_type1
        ...
    def hybrid_fun(self, a):
        self.run_the_fun(self, a)
    def type_switch(self, type):
        if type == 1:
            self.run_the_fun = hybrid_type1
        else:
            self.run_the_fun = hybrid_type2
You can change the class at runtime:

class OrigObject(object):
    ...
    def hybrid_fun(self, a):
        # do something
        ...
    def switch(self):
        self.__class__ = DerivedObject

class DerivedObject(OrigObject):
    def hybrid_fun(self, a):
        # do the other thing
        ...
    def switch(self):
        self.__class__ = OrigObject

How do I check if a module/class/methods has changed and log the changes?

I am trying to compare two modules/classes/methods to find out whether a class/method has changed. We allow users to change classes/methods, and after processing, we make those changes persistent without overwriting the older classes/methods. However, before we commit the new classes, we need to establish whether the code has changed and also whether the functionality of the methods has changed, e.g. the output differs or the performance differs on the same input data. I am OK with performance changes, but my problem is changes in code and how to log what has changed. I wrote something like this:
class TestIfClassHasChanged(unittest.TestCase):
    def setUp(self):
        self.old = old_class()
        self.new = new_class()
    def test_if_code_has_changed(self):
        # simple case for one method
        old_codeobject = self.old.area.func_code.co_code
        new_codeobject = self.new.area.func_code.co_code
        self.assertEqual(old_codeobject, new_codeobject)
where area() is a method in both classes. However, if I have many methods, all I can see myself doing here is looping over all of them. Is it possible to do this at class or module level?
Secondly, if I find that the code objects are not equal, I would like to log the changes. I used inspect.getsource(self.old.area) and inspect.getsource(self.new.area) and compared the two to get the difference; could there be a better way of doing this?
You should be using a version control program to help manage development. One of the specific features you get from a version control program is the ability to track changes: you can do diffs between the current source code and previous check-ins to test whether there were any changes.
"If I have many methods, what I see here is looping over all methods. Possible to do this at class or module level?"
I will not ask why you want to do such a thing, but yes, you can. Here is an example:
import inspect
import collections

# Here I will loop over all the functions in a module.
module = __import__('inspect')  # this is fun !!!

# Get all the functions in the module.
list_functions = inspect.getmembers(module, inspect.isfunction)

# Get the classes and their corresponding methods.
list_class = inspect.getmembers(module, inspect.isclass)
class_method = collections.defaultdict(list)
for class_name, class_obj in list_class:
    for method in inspect.getmembers(class_obj, inspect.ismethod):
        class_method[class_name].append(method)
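For the second part of the question (logging what changed), a sketch using inspect.getsource together with difflib from the standard library might be cleaner than an ad-hoc comparison:

import inspect
import difflib

def log_method_diff(old_method, new_method):
    # Produce a unified diff of the two methods' source code.
    old_src = inspect.getsource(old_method).splitlines()
    new_src = inspect.getsource(new_method).splitlines()
    for line in difflib.unified_diff(old_src, new_src,
                                     fromfile='old', tofile='new',
                                     lineterm=''):
        print(line)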

Structuring a program. Classes and functions in Python

I'm writing a program that uses genetic techniques to evolve equations.
I want to be able to submit the function 'mainfunc' to the Parallel Python 'submit' function.
The function 'mainfunc' calls two or three methods defined in the Utility class.
They instantiate other classes and call various methods.
I think what I want is all of it in one NAMESPACE.
So I've instantiated some (maybe it should be all) of the classes inside the function 'mainfunc'.
I call the Utility method 'generate()'. If we were to follow its chain of execution, it would involve all of the classes and methods in the code.
Now, the equations are stored in a tree. Each time a tree is generated, mutated or cross
bred, the nodes need to be given a new key so they can be accessed from a dictionary attribute of the tree. The class 'KeySeq' generates these keys.
In Parallel Python, I'm going to send multiple instances of 'mainfunc' to the 'submit' function of PP. Each has to be able to access 'KeySeq'. It would be nice if they all accessed the same instance of KeySeq so that none of the nodes on the returned trees had the same key, but I could get around that if necessary.
So: my question is about stuffing EVERYTHING into mainfunc.
Thanks
(Edit) If I don't include everything in mainfunc, I have to try to tell PP about dependent functions, etc. by passing various arguments in various places. I'm trying to avoid that.
(Late edit) If ks.next() is called inside the generate() function, it raises the error 'NameError: global name 'ks' is not defined'.
class KeySeq:
    "Iterator to produce sequential integers for keys in dict"
    def __init__(self, data=0):
        self.data = data
    def __iter__(self):
        return self
    def next(self):
        self.data = self.data + 1
        return self.data

class One:
    'some code'

class Two:
    'some code'

class Three:
    'some code'

class Utilities:
    def generate(self, x):
        '___________'
    def obfiscate(self, y):
        '___________'
    def ruminate(self, z):
        '__________'

def mainfunc(z):
    ks = KeySeq()
    one = One()
    two = Two()
    three = Three()
    utilities = Utilities()
    list_of_interest = utilities.generate(5)
    return list_of_interest

result = mainfunc(params)
It's fine to structure your program that way. A lot of command line utilities follow the same pattern:
# imports, utilities, other functions

def main(arg):
    # ...
    pass

if __name__ == '__main__':
    import sys
    main(sys.argv[1])
That way you can call the main function from another module by importing it, or you can run it from the command line.
If you want all of the instances of mainfunc to use the same KeySeq object, you can use the default parameter value trick:
def mainfunc(ks=KeySeq()):
key = ks.next()
As long as you don't actually pass in a value of ks, all calls to mainfunc will use the instance of KeySeq that was created when the function was defined.
Here's why, in case you don't know: A function is an object. It has attributes. One of its attributes is named func_defaults; it's a tuple containing the default values of all of the arguments in its signature that have defaults. When you call a function and don't provide a value for an argument that has a default, the function retrieves the value from func_defaults. So when you call mainfunc without providing a value for ks, it gets the KeySeq() instance out of the func_defaults tuple. Which, for that instance of mainfunc, is always the same KeySeq instance.
Now, you say that you're going to send "multiple instances of mainfunc to the submit function of PP." Do you really mean multiple instances? If so, the mechanism I'm describing won't work.
But it's tricky to create multiple instances of a function (and the code you've posted doesn't). For example, this function does return a new instance of g every time it's called:
>>> def f():
...     def g(x=[]):
...         return x
...     return g
...
>>> g1 = f()
>>> g2 = f()
>>> g1().append('a')
>>> g2().append('b')
>>> g1()
['a']
>>> g2()
['b']
If I call g() with no argument, it returns the default value (initially an empty list) from its func_defaults tuple. Since g1 and g2 are different instances of the g function, their default value for the x argument is also a different instance, which the above demonstrates.
If you'd like to make this more explicit than using a tricky side-effect of default values, here's another way to do it:
def mainfunc():
    if not hasattr(mainfunc, "ks"):
        setattr(mainfunc, "ks", KeySeq())
    key = mainfunc.ks.next()
Finally, a super important point that the code you've posted overlooks: If you're going to be doing parallel processing on shared data, the code that touches that data needs to implement locking. Look at the callback.py example in the Parallel Python documentation and see how locking is used in the Sum class, and why.
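As a sketch of that last point, here is what a lock-guarded KeySeq could look like using the standard threading module (Parallel Python's own API details are not reproduced here):

import threading

class KeySeq(object):
    def __init__(self, data=0):
        self.data = data
        self.lock = threading.Lock()

    def next(self):
        # Only one worker at a time may increment and read the counter.
        with self.lock:
            self.data += 1
            return self.data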
Your concept of classes in Python is not sound, I think. Perhaps it would be a good idea to review the basics; this link will help:
Python Basics - Classes
