In my script I have to run several functions within a class under if __name__ == "__main__", but I cannot adjust the functions within the class because other people need to use it for other purposes. The class expects a number of objects to be passed in (these objects are essentially empty arrays that can be filled with certain commands). If I pass in None instead, what will happen when the class functions try to perform operations on them? I presume it will crash, because the object's methods will no longer be defined if there is no object. However, is there perhaps a way to ignore these commands by doing something outside of the class? I know try-except would probably work, but I'm trying to avoid making any edits to the class, if at all possible.
As far as I understand the question, you want to pass some mock objects to third-party scripts. Consider using the mock library for Python 2.x (available as unittest.mock for Python >= 3.3) to instantiate the objects that the functions expect.
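For example, here is a minimal sketch of the idea; UntouchableClass is a made-up stand-in for the class you cannot edit. A MagicMock passed in place of the real objects absorbs any attribute access or method call, so nothing crashes and the class itself stays untouched:

from unittest.mock import MagicMock  # on Python 2.x: pip install mock, then `from mock import MagicMock`

class UntouchableClass:
    # stand-in for the real class you cannot edit; it calls methods on the objects it is given
    def __init__(self, buffer_a, buffer_b):
        self.buffer_a = buffer_a
        self.buffer_b = buffer_b

    def process(self):
        self.buffer_a.append("result")    # would raise AttributeError if buffer_a were None
        self.buffer_b.extend([1, 2, 3])

instance = UntouchableClass(MagicMock(), MagicMock())
instance.process()                        # no crash, every call is silently absorbed
print(instance.buffer_a.method_calls)     # [call.append('result')]

As a bonus, the mock records what the class tried to do with the object, which can be handy for debugging.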
NB: if the code in question is written by your colleagues, do discuss with them changes to their code that would simplify both your work and the clarity of the logic, and keep the code free of mock objects.
I would like, given a python module, to monkey patch all functions, classes and attributes it defines. Simply put, I would like to log every interaction a script I do not directly control has with a module I do not directly control. I'm looking for an elegant solution that will not require prior knowledge of either the module or the code using it.
I found several high-level tools that help with wrapping, decorating, patching etc., and I've gone over the code of some of them, but I cannot find an elegant solution for creating a proxy of any given module and proxying it automatically, as seamlessly as possible, other than attaching logic to every interaction (recording input arguments and return values, for example).
In case someone else is looking for a more complete proxy implementation:
Although there are several Python proxy solutions similar to what the OP is looking for, I could not find one that also proxies classes and arbitrary class instances, and that automatically proxies functions' return values and arguments, which is what I needed.
I've got some code written for that purpose as part of a full proxying/logging of a Python execution, and I may turn it into a separate library in the future. If anyone is interested, you can find the core of the code in a pull request. Drop me a line if you'd like this as a standalone library.
My code will automatically return wrapper/proxy objects for any proxied object's attributes, functions and classes. The purpose is to log and replay some of the code, so I've got the equivalent "replay" code and some logic to store all proxied objects to a JSON file.
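For illustration only (this is not the code from the pull request), here is a stripped-down sketch of the approach: a proxy object that delegates attribute lookups to the wrapped module and logs every call's arguments and return value. The full version additionally proxies classes and wraps return values recursively.

import functools
import math

class ModuleProxy:
    # delegates everything to the wrapped module, logging accesses, arguments and return values
    def __init__(self, module):
        self._module = module

    def __getattr__(self, name):
        attr = getattr(self._module, name)
        print("access: %s.%s" % (self._module.__name__, name))
        if callable(attr):
            @functools.wraps(attr)
            def logged(*args, **kwargs):
                result = attr(*args, **kwargs)
                print("call: %s%r -> %r" % (name, args, result))
                return result
            return logged
        return attr

math_proxy = ModuleProxy(math)
math_proxy.sqrt(16)      # logs the access, the arguments and the return value
print(math_proxy.pi)     # plain attributes are logged and passed through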
I am about to refactor the code of a python project built on top of twisted. So far I have been using a simple settings.py module to store constants and dictionaries like:
#settings.py
MY_CONSTANT = 'whatever'
A_SLIGHTLY_COMPLEX_CONF = {'param_a': 'a', 'param_b': 'b'}
A great many modules import settings.py to do their work.
The reason I want to refactor the project is that I need to change/add configuration parameters on the fly. The approach I am about to take is to gather all configuration in a singleton and access its instance whenever I need it.
from settings import MyBloatedConfig

def first_interesting_function():
    cfg = MyBloatedConfig.get_instance()
    a_much_needed_param = cfg["a_respectable_key"]
    #do stuff

#several thousands of functions later
def gazillionth_function_in_module():
    tired_cfg = MyBloatedConfig.get_instance()
    a_frustrated_value = tired_cfg["another_respectable_key"]
    #do other stuff
This approach works but feels unpythonic and bloated. An alternative would be to externalize the cfg object in the module, like this:
CONFIG = MyBloatedConfig.get_instance()

def a_suspiciously_slimmer_function():
    suspicious_value = CONFIG["a_shady_parameter_key"]
Unfortunately this does not work if I am changing the MyBloatedConfig instance entries in another module. Since I am using the reactor pattern, storing stuff in a thread local is out of the question, as is using a queue.
For completeness, the following is the implementation I am using for the singleton pattern:
from functools import wraps

instances = {}

def singleton(cls):
    """Use class as singleton."""
    global instances
    @wraps(cls)
    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]
    return get_instance

@singleton
class MyBloatedConfig(dict):
    ....
Is there some other more pythonic way to broadcast configuration changes across different modules?
The big, global (often singleton) config object is an anti-pattern.
Whether you have settings.py, a singleton in the style of MyBloatedConfig.get_instance(), or any of the other approaches you've outlined here, you're basically using the same anti-pattern. The exact spelling doesn't matter; these are all just ways to have a true global (as distinct from a Python module-level global) shared by all of the code in your entire project.
This is an anti-pattern for a number of reasons:
It makes your code difficult to unit test. Any code that changes its behavior based on this global is going to require some kind of hacking - often monkey-patching - in order to let you unit test its behavior under different configurations. Compare this to code which is instead written to accept arguments (as in, function arguments) and alters its behavior based on the values passed to it.
It makes your code less re-usable. Since the configuration is global, you'll have to jump through hoops if you ever want to use any of the code that relies on that configuration object under two different configurations. Your singleton can only represent one configuration. So instead you'll have to swap global state back and forth to get the different behavior you want.
It makes your code harder to understand. If you look at a piece of code that uses the global configuration and you want to know how it works, you'll have to go look at the configuration. Much worse than this, though, is if you want to change your configuration you'll have to look through your entire codebase to find any code that this might affect. This leads to the configuration growing over time, as you add new items to it and only infrequently remove or modify old ones, for fear of breaking something (or for lack of time to properly track down all users of the old item).
The above problems should hint to you what the solution is. If you have a function that needs to know the value of some constant, make it accept that value as an argument. If you have a function that needs a lot of values, then create a class that can wrap up those values in a convenient container and pass an instance of that class to the function.
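As a minimal sketch of what that looks like in practice (the names RetryPolicy and fetch are invented for the example): related values get grouped into a small container object and passed explicitly, so each call site picks its own configuration and the function stays trivially testable.

from dataclasses import dataclass

@dataclass
class RetryPolicy:
    # invented container for a handful of related settings
    max_attempts: int
    delay_seconds: float

def fetch(url, policy):
    # behaviour depends only on its arguments, so it is easy to test and reuse
    for attempt in range(policy.max_attempts):
        print("fetching %s, attempt %d, waiting %.1fs" % (url, attempt + 1, policy.delay_seconds))

# two different configurations in the same process, no global state to swap
fetch("http://example.com", RetryPolicy(max_attempts=3, delay_seconds=0.5))
fetch("http://example.com", RetryPolicy(max_attempts=1, delay_seconds=0.0))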
The part of this solution that often bothers people is the part where they don't want to spend the time typing out all of this argument passing. Whereas before you had functions that might have taken one or two (or even zero) arguments, now you'll have functions that might need to take three or four arguments. And if you're converting an application written in the style of settings.py, then you may find that some of your functions used half a dozen or more items from your global configuration, and these functions suddenly have a really long signature.
I won't dispute that this is a potential issue, but it should be seen mostly as an issue with the structure and organization of the existing code. The functions that end up with grossly long signatures depended on all of that data before; the fact was just obscured from you. And as with most programming patterns that hide aspects of your program from you, this is a bad thing. Once you are passing all of these values around explicitly, you'll see where your abstractions need work. Maybe that 10-parameter function is doing too much and would work better as three different functions. Or maybe you'll notice that half of those parameters are actually related and always belong together as part of a container object. Perhaps you can even put some logic related to manipulating those parameters onto that container object.
Suppose I have an instance method that contains a lot of nested conditionals. What would be a good way to encapsulate that code? Put in another instance method of the same class or a function? Could you say why a certain approach is preferred?
If the function is only used by one class, and especially if the module has more classes with potentially more utility functions (each used only by one class), it might clarify things a bit to keep the functions as static methods instead, to make it obvious which class they belong to. Also, automated refactorings (using e.g. the rope library, or PyCharm or PyDev) will then automatically move the static method along with the class wherever the class is moved.
P.S. @staticmethods, unlike module-level functions, can be overridden in subclasses, e.g. in the case of a mathematical formula that doesn't depend on the object but does depend on the type of the object.
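A small sketch of that last point, using an invented Circle example: the static method can be overridden in a subclass, and code that works with the class (not an instance) picks up the overridden formula.

import math

class Circle:
    @staticmethod
    def area(radius):
        # rough formula, deliberately imprecise for the example
        return 3.14 * radius ** 2

class PreciseCircle(Circle):
    @staticmethod
    def area(radius):
        # overrides the formula; a module-level function could not be swapped like this
        return math.pi * radius ** 2

def describe(circle_cls, radius):
    # the formula depends on the type, not on any instance state
    return circle_cls.area(radius)

print(describe(Circle, 1.0))         # 3.14
print(describe(PreciseCircle, 1.0))  # 3.141592653589793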
There are two different questions here. The first one is what to do with multiple nested conditionals. There's no single right answer: it depends on your coding style, how the conditions interact, the architecture of your program and so on. Have a look at this Programmers.SE question and Jeff Atwood's blog post for some ideas; personally, I like
if not check1: return
code1
if not check2: return
code2
...
although some people object to the multiple exit points.
The second question is what to do with individual functions if you're writing object oriented Python. The usual answer is just to put them as functions inside the module containing the class, since there's no requirement that a function be attached to a particular class. If you want, though, you can include them in the class as static methods.
When is a class more useful to use than a function? Is there any hard and fast rule that I should know about? Is it language-dependent? I intend to write a Python script that will parse different types of JSON data, and my gut feeling is that I should use a class to do this, versus a function.
You should use a class when your routine needs to save state. Otherwise a function will suffice.
First of all, I think this isn't language-dependent (provided the language lets you define both classes and functions).
As a general rule, a class wraps a behaviour into itself. So, if you have a certain type of service to implement (with, e.g., different functions), a class is what you're looking for.
Moreover, classes (or rather objects, to be more precise) have state, and you can instantiate multiple instances of a class (so different objects with different states).
Not less important, a class can be inherited, so you can override a specific behaviour of your function with only small changes.
Use a class when you have state - something that should persist across calls.
Use a function in other cases.
Exception: if your class only stores a couple of values and has a single method besides __init__, you are better off with a function.
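A tiny sketch of that rule of thumb, with invented names: parsing a line needs no state, so a function is enough; keeping a running total must survive between calls, so a class fits.

def parse_record(line):
    # stateless: a plain function is enough
    return [field.strip() for field in line.split(",")]

class RunningTotal:
    # stateful: the total must persist across calls, so a class fits
    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value
        return self.total

print(parse_record("a, b, c"))  # ['a', 'b', 'c']
counter = RunningTotal()
counter.add(3)
print(counter.add(4))           # 7 -- state kept between calls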
For anything non-trivial, you should probably be using a class. I tend to limit all of my "free-floating" functions to a utils.py file.
This is language-dependent.
Some languages, like Java, insist that you use a class for everything. There's simply no concept of a standalone function.
Python isn't like that. It's perfectly OK - in fact recommended - to define standalone functions, and related functions can be grouped together in modules. As others have stated, the only time you really want a class in Python is when you have state that you need to keep - i.e., when you need to encapsulate data within an object.
I'm getting a bit of a headache trying to figure out how to organise modules and classes together. Coming from C++, I'm used to classes encapsulating all the data and methods required to process that data. In python there are modules however and from code I have looked at, some people have a lot of loose functions stored in modules, whereas others almost always bind their functions to classes as methods.
For example say I have a data structure and would like to write it to disk.
One way would be to implement a save method for that object so that I could just type
MyObject.save(filename)
or something like that. Another method I have seen in equal proportion is to have something like
from myutils import readwrite
readwrite.save(MyObject, filename)
This is a small example, and I'm not sure how python specific this problem is at all, but my general question is what is the best pythonic practice in terms of functions vs methods organisation?
It seems like loose functions bother you. This is the Python way. It makes sense because a module in Python is really just an object, on the same footing as any other object. It does have language-level support for loading it from a file, but other than that it's just an object.
So if I have a module foo.py:

from pprint import pprint

def show(obj):
    pprint(obj)
Then when I import it from bar.py:
import foo

class fubar(object):
    #code
    def method(self, obj):
        #more stuff
        foo.show(obj)
I am essentially accessing a method on the foo object. The data attributes of the foo module are just the globals defined in foo. A module is the language-level implementation of a singleton, without the need to prepend self to every method's argument list.
I try to write as many module level functions as possible. If some function will only work with an instance of a particular class, I will make it a method on the class. Otherwise, I try to make it work on instances of every class that is defined in the module for which it would make sense.
The rationale behind the exact example you gave is that if each class has a save method and you later change how you save data (say, from the filesystem to a database or a remote XML file), then you have to change every class. If each class instead implements an interface that yields the data it wants saved, then you can write one function to save instances of every class and only change that function once. This is known as the Single Responsibility Principle: each class should have only one reason to change.
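A rough sketch of that idea (the classes and the to_dict interface are invented for the example): each class only exposes the data it wants saved, and a single save function owns how the saving is done, so switching from files to a database means changing one function.

import json

class User:
    def __init__(self, name):
        self.name = name

    def to_dict(self):
        # the class's only saving-related responsibility: expose its data
        return {"type": "user", "name": self.name}

class Order:
    def __init__(self, items):
        self.items = items

    def to_dict(self):
        return {"type": "order", "items": self.items}

def save(obj, filename):
    # the one place to change if storage moves from files to a database
    with open(filename, "w") as handle:
        json.dump(obj.to_dict(), handle)

save(User("alice"), "user.json")
save(Order(["book", "pen"]), "order.json")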
If you have a regular old class you want to save to disk, I would just make save an instance method. If it were a serialization library that could handle different types of objects, I would do it the second way.