How do I unit test the methods in a method object?


I've performed the "Replace Method with Method Object" refactoring described by Beck.
Now, I have a class with a "run()" method and a bunch of member functions that decompose the computation into smaller units. How do I test those member functions?
My first idea is that my unit tests be basically copies of the "run()" method (with different initializations), but with assertions between each call to the member functions to check the state of the computation.
(I'm using Python and the unittest module.)
class Train:
    def __init__(self, options, points):
        self._options = options
        self._points = points
        # other initializations
    def run(self):
        self._setup_mappings_dict()
        self._setup_train_and_estimation_sets()
        if self._options.estimate_method == 'per_class':
            self._setup_priors()
        self._estimate_all_mappings()
        self._save_mappings()
    def _estimate_all_mappings(self):
        # implementation, calls to methods in this class
        ...
    # other method definitions
I definitely have expectations about what the states of the member attributes should be before and after calls to the different methods as part of the implementation of the run() method. Should I be making assertions about these "private" attributes? I don't know how else to unit test these methods.
The other option is that I really shouldn't be testing these.

I'll answer my own question. After a bit of reading and thinking, I believe I shouldn't be unit testing these private methods. I should just test the public interface. If the private methods that do the internal processing are important enough to test independently and are not just coincidences of the current implementation, then perhaps this is a sign that they should be refactored out into a separate class.

I like your answer, but I disagree.
The situation where you would use this design pattern is one where there is a fairly complex operation going on. As a result, being able to verify the individual components of such an operation, I would say, is highly desirable.
You then have the issue of dependencies on other resources (which may or may not apply in this case).
You need to be able to use some form of Inversion of Control in order to inject some form of mock to isolate the class.
Besides, most mocking frameworks will provide you with accessors to get at the private members.
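One way to read the suggestion above, as a minimal sketch rather than the asker's real code: the dependency is passed in through the constructor so a test can hand over a mock, and unittest.mock.patch.object replaces a single private step. The Train class below is a cut-down stand-in, and the storage parameter and the squaring computation are assumptions made up for illustration.

```python
import unittest
from unittest import mock


class Train:
    """Hypothetical stand-in for the asker's method object."""

    def __init__(self, points, storage):
        self._points = points
        self._storage = storage  # injected dependency (assumed, not in the original)
        self._mappings = {}

    def run(self):
        self._estimate_all_mappings()
        self._save_mappings()

    def _estimate_all_mappings(self):
        # Placeholder computation: map each point to its square.
        self._mappings = {p: p * p for p in self._points}

    def _save_mappings(self):
        self._storage.save(self._mappings)


class TrainTest(unittest.TestCase):
    def test_run_saves_estimated_mappings(self):
        storage = mock.Mock()  # mock injected through the constructor
        Train([1, 2, 3], storage).run()
        storage.save.assert_called_once_with({1: 1, 2: 4, 3: 9})

    def test_run_calls_the_estimation_step(self):
        train = Train([1], mock.Mock())
        # Replace one private step so only the orchestration in run() is exercised.
        with mock.patch.object(train, "_estimate_all_mappings") as estimate:
            train.run()
        estimate.assert_called_once_with()
```

The tests can be run with python -m unittest. Note that patching a private step ties the test to the current decomposition of run(), which is the trade-off the first answer warns about.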

There are two principles at play here. The first is that public methods should be the public API you want to expose. In this case, exposing run() is appropriate, whereas exposing estimate_all_mappings() is not, since you don't want anyone else calling that function.
The second is that a single function should only ever do one thing. In this case run() assembles the results of several other complex actions. estimate_all_mappings() is doing one of those complex actions. It, in turn, might be delegating to some other function estimate_map() that does a single estimation that estimate_all_mappings() aggregates.
Therefore, it is correct to have this sort of delegation of responsibilities. Then all that is required is to know how to test a private method.
The only reason to have another class is if there is some subset of the functionality that composes its own behavioral unit. You wouldn't, for instance, create some class B that is only ever called/used by a class A, unless there was some unit of state that is easier to pass around as an object.
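Testing a private method directly, as this answer proposes, might look like the sketch below. Python does not block access to names with a single leading underscore, so the test can put the object into the state run() would have produced and then assert on the private attribute. The Train class, the _mappings attribute, and the doubling estimate are hypothetical placeholders.

```python
import unittest


class Train:
    """Hypothetical stand-in; the real class has more steps."""

    def __init__(self, points):
        self._points = points
        self._mappings = None

    def _setup_mappings_dict(self):
        self._mappings = {point: None for point in self._points}

    def _estimate_all_mappings(self):
        for point in self._mappings:
            self._mappings[point] = self._estimate_map(point)

    def _estimate_map(self, point):
        return point * 2  # placeholder estimate


class EstimateAllMappingsTest(unittest.TestCase):
    def test_fills_every_mapping(self):
        train = Train([1, 2])
        # Reproduce the state that run() would have set up before this step.
        train._setup_mappings_dict()
        self.assertEqual(train._mappings, {1: None, 2: None})
        train._estimate_all_mappings()
        self.assertEqual(train._mappings, {1: 2, 2: 4})
```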

Related

Require guidance for a low level design problem

I have a Main class.
Five strategies need to be implemented, and the customer can choose one of them at run time.
I am using the strategy pattern here because the objective of each strategy is the same but the algorithm differs.
Apart from the common functions (defined in the strategy interface), each strategy class may have additional functions that only make sense when that particular strategy has been chosen at run time. These extra methods fulfil strategy-specific requirements, and the user may or may not need them.
There are also some common functions that I define in the Main class; common here means they could be helpful for any strategy.
class Main:
    def __init__(self, input):
        self.other_work = Extra(input)
        self.strategy = Factory(input)
Question 1:
How to use this class:
a = Main(input)
# if you want to use some extra function
a.other_work.do_this()
# if related to a particular strategy
a.strategy.uncommonStrategy1()
The challenges are:
How can the user know that this extra uncommonStrategy1 function is defined in the Strategy1 class and not in the Extra class?
I could put the uncommonStrategy1 function in the Extra class as well, but it is irrelevant to the other strategies.

The user should not need to know where functionality is implemented. Use inheritance, Python imports, or wrapper functions (as appropriate to the design) to call the specialized functions. Ideally, when triggering the work, the user will not know or care which strategy is used. (The different strategies may need to be initialized with different parameters, though, so it cannot be completely transparent.)
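A wrapper of the kind mentioned above could be sketched like this. Main forwards any attribute it does not define itself to the chosen strategy, so the caller never needs to know which class implements a given method. Strategy1, Strategy2, execute, and uncommon_strategy1 are hypothetical names, and passing the strategy into the constructor is an assumption made to keep the example self-contained.

```python
class Strategy1:
    def execute(self):
        return "strategy 1 result"

    def uncommon_strategy1(self):
        return "extra work only strategy 1 supports"


class Strategy2:
    def execute(self):
        return "strategy 2 result"


class Main:
    def __init__(self, strategy):
        self._strategy = strategy

    def execute(self):
        # Common entry point: identical call whatever the strategy is.
        return self._strategy.execute()

    def __getattr__(self, name):
        # Called only when normal lookup fails: forward to the strategy.
        # Private names are not forwarded, which also avoids recursion
        # if _strategy itself has not been set yet.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._strategy, name)


main = Main(Strategy1())
extra = main.uncommon_strategy1()  # resolved on Strategy1 via __getattr__
```

With a Strategy2 instance the same call raises AttributeError, so callers that rely on a strategy-specific method still need to know which strategy they selected.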

How can I test a Python private method (yes, I do have reason to test them)?

I am looking for the cleanest way to write unit tests for Python private methods. I know that usually you don't want to test private methods, but we have inherited a gigantic behemoth of a Python file which we need to refactor into more maintainable modules.
We don't understand its logic, but we know it works, so we are looking to use TDD to ensure our refactoring does not break the code. Currently 90% of the code is located in private methods, and the module does too much to reliably test it all purely by black-box testing.
I fully expect I'll write some tests that will get removed once the refactor is complete, but for now I'd like to be able to plug into some private methods to test them to increase my confidence that my refactor has not broken key logic as I transition to a more maintainable (and testable) layout.

In Python, "private" methods are only a signal to developers that they should be treated as private; in fact, you can access every method. When a method name starts with two underscores (and does not end with two), Python mangles the name to make it harder to access from outside the class. It does not enforce anything the way other languages do.
Let’s say that we have the following class:
class Foo:
    def __bar(self, arg):
        print(arg)
    def baz(self, arg):
        self.__bar(arg)
To access the "private" __bar method, try this:
f = Foo()
f._Foo__bar('a')
More about identifiers can be found in the documentation.
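Put into a unittest test case, the access shown above might look like the following sketch. It reuses the Foo class from this answer; the test class name is an arbitrary choice. The mangled name has to be written out explicitly, because name mangling also applies inside the test class itself.

```python
import contextlib
import io
import unittest


class Foo:
    def __bar(self, arg):
        print(arg)

    def baz(self, arg):
        self.__bar(arg)


class FooPrivateTest(unittest.TestCase):
    def test_bar_prints_its_argument(self):
        out = io.StringIO()
        with contextlib.redirect_stdout(out):
            # Use the mangled name explicitly: writing f.__bar inside this
            # class would be mangled to f._FooPrivateTest__bar and fail.
            Foo()._Foo__bar("a")
        self.assertEqual(out.getvalue(), "a\n")
```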

Tests for Basic Python Data Structure Interfaces

A fairly small question: does anyone know about a pre-made suite of Python unit tests that just check if a class conforms to one of the standard Python data structure interfaces (e.g., lists, sets, dictionaries, queues, etc.)? It's not overly hard to write them, but I'd hate to bother doing so if someone has already done this. It seems like very basic functionality that someone has probably done already.
The use case is that I am using a factory pattern to create data structures due to different restrictions related to platforms. As such, I need to be able to test that the resulting created objects still conform to the standard interfaces on the surface. Also, I should note that by "conform" I mean that the tests should check not just that the interface functions exist, but also check that they work (e.g., can set and retrieve a value in a map, for instance). Python 2.7 tests would be preferred.

First, "the standard Python data structure interfaces" are not lists, sets, dictionaries, queues, etc. Those are specific implementations of the interfaces. (And queue isn't even a data structure in the sense you're thinking of—its salient features are that its operations are atomic, and put and get optionally synchronize on a Condition, and so on.)
Anyway, the interfaces are defined in five different not-quite-compatible ways.
The Built-in Types section of the documentation describes what it means to be an iterator type, a sequence type, etc. However, these are not nearly as rigorous as you'd expect for reference documentation (at least if you're used to, say, C++ or Java).
I'm not aware of any tests for such a thing, so I think you'd have to build them from scratch.
The collections module contains Collections Abstract Base Classes that define the interfaces, and provide a way to register "virtual subclasses" via the abc module. So, you can declare "I am a mapping" by inheriting from collections.Mapping, or calling collections.Mapping.register. But that doesn't actually prove that you are a mapping, just that you're claiming to be. (If you inherit from Mapping, it also acts as a mixin that helps you complete the interface by implementing, e.g., __contains__ on top of __getitem__.)
If you want to test the ABC meaning, defuz's answer is very close, and with a little more work I think he or someone else can complete it.
The CPython C API defines an Abstract Objects Layer. While this is not actually authoritative for the language, it's obviously intended that the C-API protocols and the language-level interfaces are supposed to match. And, unlike the latter, the former are rigorously defined. And of course the source code from CPython 2.7, and maybe other implementations like PyPy, may help.
There are tests for this that come with CPython, but really, they're for testing that calling PyMapping_GetItem from C properly calls your mymapping.__getitem__ in Python, which is really at a tangent to what you want to test, so I don't think it will help much.
The actual concrete classes have additional interface on top of the protocols, that you may want to test, but that's harder to describe. In particular, the way the __new__ and __init__ methods work is often important. Implementing the Mapping protocol means someone can construct an empty Foo instance and add items to it with foo[key] = value, but it doesn't mean someone can construct Foo(key=value), or Foo({key: value}) or Foo([(key, value)]).
And for this case, there are existing tests that come with all of the standard Python implementations. CPython comes with a very extensive test suite that includes things like test_dict.py. PyPy runs all the (Python-level) CPython tests, and some extra ones besides.
You will obviously have to modify these tests to run on an arbitrary class instead of one hardcoded into the tests, and you may also have to modify them to handle whichever definition you pick. Plus, they probably test more than you asked for. You just want to know if a class conforms to the protocol, not whether its methods do the right thing, right? But still, I think they're a good starting point.
Finally, the C API defines a Concrete Objects Layer that, although it's not authoritative, matches the previous definition and is more rigorously defined.
Unfortunately, the tests for this one are definitely not going to be very useful to you, because they're checking things like whether PyDict_Check and PyDict_GetItem work on your class, which they will not for any mapping defined in pure Python.
If you do build something complete for any of these definitions, I would strongly suggest putting it on PyPI, and posting about it to python-list, so you get feedback (and bug reports).
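If you do end up building the tests from scratch, one possible shape is a reusable mixin of behavioural checks that each concrete test case points at a different class. This is only a sketch under assumptions: it is written for Python 3 (where the ABCs live in collections.abc) rather than the 2.7 the question prefers, it covers only a small part of the mutable mapping interface, and MutableMappingContract and make_mapping are invented names.

```python
import collections.abc
import unittest


class MutableMappingContract:
    """Mixin of behavioural checks; subclasses supply make_mapping()."""

    def make_mapping(self):
        raise NotImplementedError

    def test_set_then_get(self):
        m = self.make_mapping()
        m["key"] = 1
        self.assertEqual(m["key"], 1)
        self.assertIn("key", m)
        self.assertEqual(len(m), 1)

    def test_delete_removes_key(self):
        m = self.make_mapping()
        m["key"] = 1
        del m["key"]
        self.assertNotIn("key", m)
        with self.assertRaises(KeyError):
            m["key"]

    def test_iteration_yields_keys(self):
        m = self.make_mapping()
        m["a"], m["b"] = 1, 2
        self.assertEqual(sorted(m), ["a", "b"])

    def test_claims_the_abc(self):
        # Checks only the claim (inheritance or registration), not behaviour.
        self.assertIsInstance(self.make_mapping(), collections.abc.MutableMapping)


class DictContractTest(MutableMappingContract, unittest.TestCase):
    def make_mapping(self):
        return dict()  # swap in the factory-produced class under test
```

Because the mixin itself does not derive from unittest.TestCase, test discovery only picks up the concrete subclasses, one per class under test.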

There are abstract base classes in the standard collections module, built on the abc module.
Inherit your classes from these ABCs to make sure they follow the standard behavior:
import collections
class MyDict(collections.Mapping):  # on Python 3.3+, use collections.abc.Mapping
    ...
Also, you can test an existing class that does not explicitly inherit from the abstract class:
class MyPerfectDict(object):
    ...  # implementation
def is_inherit(cls, abstract):
    try:
        class Test(abstract, cls): pass
        test = Test()
    except TypeError:
        return False
    else:
        return True
is_inherit(MyPerfectDict, collections.Mapping) # False
is_inherit(dict, collections.Mapping) # True

Python classes, how to use them style-wise, and the Single Responsibility Principle [closed]

I've been programming in Python for some time and have picked up some knowledge of Python style, but I still have trouble using classes properly.
When reading about object-oriented design I often find rules like the Single Responsibility Principle, which states:
"The Single Responsibility Principle says that a class should have one, and only one, reason to change"
Reading this, I might think of breaking one class into two, like:
class ComplicatedOperations(object):
    def __init__(self, item):
        pass
    def do(self):
        ...
    ## lots of other functions
class CreateOption(object):
    def __init__(self, simple_list):
        self.simple_list = simple_list
    def to_options(self):
        operated_data = self.transform_data(self.simple_list)
        return self.default_option() + operated_data
    def default_option(self):
        return [('', '')]
    def transform_data(self, simple_list):
        return [self.make_complicated_operations_that_requires_losts_of_manipulation(item)
                for item in simple_list]
    def make_complicated_operations_that_requires_losts_of_manipulation(self, item):
        return ComplicatedOperations(item).do()
This, for me, raises lots of different questions; like:
When should I use class variables or pass arguments in class functions?
Should the ComplicatedOperations class be a class or just a bunch of functions?
Should the __init__ method be used to calculate the final result? Does that make the class hard to test?
What are the rules for Pythonistas?
Edited after answers:
So, following Augusto's reasoning, I would end up with something like this:
class ComplicatedOperations(object):
    def __init__(self):
        pass
    def do(self, item):
        ...
    ## lots of other functions
def default_option():
    return [('', '')]
def complicate_data(item):
    return ComplicatedOperations().do(item)
def transform_data_to_options(simple_list):
    # complicate_data is a module-level function, so there is no self here
    return default_option() + [complicate_data(item)
                               for item in simple_list]
(Also corrected a small bug with default_option.)

When should I use class variables or pass arguments in class functions?
In your example I would pass item into the do method. More generally, and this applies in any language, give a class only the information it needs (Least Authority), and pass everything that is not internal to your algorithm via parameters (Dependency Injection). So if ComplicatedOperations does not need item to initialize itself, do not give it as an __init__ parameter; if it needs item to do its job, give it as a parameter.
Should the ComplicatedOperations class be a class or just a bunch of functions?
I'd say it depends. If you're using various kinds of operations and they share some sort of interface or contract, absolutely. If the operation reflects some concept and all the methods are related to the class, sure. But if they are loose and unrelated, you might just use functions, or think again about Single Responsibility and split the methods up into other classes.
Should the __init__ method be used to calculate the final result? Does that make the class hard to test?
No, the __init__ method is for initialization; do the actual work in a separate method.
As a side note, because of the lack of context, I did not understand what CreateOption's role is. If it is only used as shown above, you might as well just remove it ...

I personally think of classes as concepts. I'd define an Operation class which behaves like an operation, so it contains a do() method and every other method/property that may make it unique.
As mgilson correctly says, if you cannot define and isolate any concept, maybe a simple functional approach would be better.
To answer your questions:
Use class attributes when a certain property is shared among the instances (in Python, class attributes are created when the class body is executed, so every instance sees the same value; class attributes should usually be constants). Use instance attributes for object-specific properties that the methods can use without having them passed in. This doesn't mean you should put everything in self, only what you consider characteristic of your object. Use passed arguments for values that do not belong to your object and may depend on the state of external objects (or on the execution of the program).
As said above, I'd keep one single class Operation and use a list of Operation objects to do your computations.
The __init__ method should just instantiate the object and do whatever processing is needed for the object to behave properly (in other words, make it ready to use).
Just think about the ideas you're trying to model.
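The three kinds of values described above can be seen side by side in a small sketch. The Operation class follows the name used in this answer, but scale, offset, and the arithmetic in do() are made up for illustration.

```python
class Operation:
    scale = 10  # class attribute: one value shared by every instance

    def __init__(self, offset):
        self.offset = offset  # instance attribute: specific to this object

    def do(self, item):
        # item is passed in because it is not part of the operation's own state
        return item * self.scale + self.offset


# A list of Operation objects applied to the same input.
ops = [Operation(1), Operation(2)]
results = [op.do(3) for op in ops]  # [31, 32]
```

Rebinding Operation.scale afterwards would change the result for both objects, which is why class attributes are best kept as constants.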

A class generally represents a type of object. Class instances are specific objects of that type. A classic example is an Animal class; a cat would be an instance of Animal. Class variables (I assume you mean those that belong to the instance rather than the class object itself) should be used for attributes of the instance. In this case, for example, colour could be such an attribute, set as cat.colour = "white" or bear.colour = "brown". Arguments should be used where the value could come from some source outside the class. If the Animal class has a sleep method, it might need to know the duration of the sleep and the posture that the animal sleeps in. duration would be an argument of the method, since it has no relation to the animal, but posture would be an attribute since it is determined by the animal.
In python, a class is typically used to group together a set of functions and variables which share a state. Continuing with the above example, a specific animal has a state which is shared across its methods and is defined by its attributes. If your class is just a group of functions which don't in any way depend on the state of the class, then they could just as easily be separate functions.
If __init__ is used to calculate the final result (which would have to be stored in an attribute of the class since __init__ cannot return a result), then you might as well use a function. A common pattern, however, is to do a lot of processing in __init__ via several other, sometimes private, methods of the class. The reason for this is that large complicated functions are often easier to test if they are broken down into smaller, distinct tasks, each of which can then be tested individually. However, this is usually only done when a class is needed anyway.
One approach to the whole business is to start out by deciding what functionality you need. When you have a group of functions or variables which all act on or apply to the same object, then it is time to move them into a class. Remember that Object Oriented Programming (OOP) is a design method suited to some tasks, but is not inherently superior to functional programming (in fact, some programmers would argue the opposite!), so there's no need to use classes unless there is actually a need.

Classes are an organizational structure. So, if you are not using them to organize, you are doing it wrong. :)
There are several different things you can use them for organizing:
Bundle data with methods that use said data; this defines one spot where the code interacts with this data
Bundle like functions together; this provides an understandable API, since 'everyone knows' that all math functions are in the math object
Provide defined communications between methods, sets up a 'conveyor belt' of operations with a defined interface. Each operation is a black box, and can change arbitrarily, so long as it keeps to the standard
Abstract a concept. This can include sub classes, data, methods, so on and so forth all around some central idea like database access. This class then becomes a component you can use in other projects with a minimal amount of retooling
If you don't need to do some organizational thing like the above, then you should go for simplicity and program in a procedural/functional style. Python is about having a toolbox, not a hammer.

Which is more pythonic, factory as a function in a module, or as a method on the class it creates?

I have some Python code that creates a Calendar object based on parsed VEvent objects from an iCalendar file.
The calendar object just has a method that adds events as they get parsed.
Now I want to create a factory function that creates a calendar from a file object, path, or URL.
I've been using the iCalendar python module, which implements a factory function as a class method directly on the Class that it returns an instance of:
cal = icalendar.Calendar.from_string(data)
From what little I know about Java, this is a common pattern in Java code, though I seem to find more references to a factory method being on a different class than the class you actually want to instantiate instances from.
The question is: is this also considered Pythonic? Or is it considered more Pythonic to just create a module-level function as the factory?

[Note. Be very cautious about separating "Calendar" a collection of events, and "Event" - a single event on a calendar. In your question, it seems like there could be some confusion.]
There are many variations on the Factory design pattern.
A stand-alone convenience function (e.g., calendarMaker(data))
A separate class (e.g., CalendarParser) which builds your target class (Calendar).
A class-level method (e.g., Calendar.from_string).
These have different purposes. All are Pythonic, the questions are "what do you mean?" and "what's likely to change?" Meaning is everything; change is important.
Convenience functions are Pythonic. Languages like Java can't have free-floating functions; you must wrap a lonely function in a class. Python allows you to have a lonely function without the overhead of a class. A function is relevant when your constructor has no state changes or alternate strategies or any memory of previous actions.
Sometimes folks will define a class and then provide a convenience function that makes an instance of the class, sets the usual parameters for state and strategy and any other configuration, and then calls the single relevant method of the class. This gives you both the statefulness of class plus the flexibility of a stand-alone function.
The class-level method pattern is used, but it has limitations. One, it's forced to rely on class-level variables. Since these can be confusing, a complex constructor as a static method runs into problems when you need to add features (like statefulness or alternative strategies.) Be sure you're never going to expand the static method.
Two, it's more-or-less irrelevant to the rest of the class methods and attributes. This kind of from_string is just one of many alternative encodings for your Calendar objects. You might have a from_xml, from_JSON, from_YAML and on and on. None of this has the least relevance to what a Calendar IS or what it DOES. These methods are all about how a Calendar is encoded for transmission.
What you'll see in the mature Python libraries is that factories are separate from the things they create. Encoding (as strings, XML, JSON, YAML) is subject to a great deal of more-or-less random change. The essential thing, however, rarely changes.
Separate the two concerns. Keep encoding and representation as far away from state and behavior as you can.
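As a rough sketch of that separation, the class below holds only state and behavior, while module-level functions deal with how a calendar is encoded. The one-event-per-line text format and the names calendar_from_lines and calendar_from_path are invented for illustration; they are not part of the iCalendar module or any real format.

```python
class Calendar:
    """Hypothetical minimal calendar: state and behavior only."""

    def __init__(self):
        self.events = []

    def add_event(self, event):
        self.events.append(event)


def calendar_from_lines(lines):
    """Module-level factory for an invented one-event-per-line format."""
    calendar = Calendar()
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            calendar.add_event(line)
    return calendar


def calendar_from_path(path):
    # A second factory reuses the first; Calendar itself never changes.
    with open(path, encoding="utf-8") as handle:
        return calendar_from_lines(handle)


cal = calendar_from_lines(["standup\n", "\n", " review "])
```

Adding another encoding then means adding another function, without touching the Calendar class.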

It's Pythonic not to think about esoteric differences in some pattern you read about somewhere and now want to use everywhere, like the factory pattern.
Most of the time when you would think of a @staticmethod as a solution, it's probably better to use a module function. The exception is when you put multiple classes in one module and each has a different implementation of the same interface; then it's better to use a @staticmethod.
Ultimately, whether you create your instances with a @staticmethod or with a module function makes little difference.
I'd probably use the initializer ( __init__ ) of a class because one of the more accepted "patterns" in python is that the factory for a class is the class initialization.

IMHO a module-level method is a cleaner solution. It hides behind the Python module system that gives it a unique namespace prefix, something the "factory pattern" is commonly used for.

The factory pattern has its own strengths and weaknesses. However, choosing one way to create instances usually has little pragmatic effect on your code.

A staticmethod rarely has value, but a classmethod may be useful. It depends on what you want the class and the factory function to actually do.
A factory function in a module would always make an instance of the 'right' type (where 'right' in your case is always the 'Calendar' class, but you might also make it dependent on the contents of what it is creating the instance out of).
Use a classmethod if you wish to make it dependent not on the data, but on the class you call it on. A classmethod is like a staticmethod in that you can call it on the class, without an instance, but it receives the class it was called on as first argument. This allows you to actually create an instance of that class, which may be a subclass of the original class. An example of a classmethod is dict.fromkeys(), which creates a dict from a list of keys and a single value (defaulting to None). Because it's a classmethod, when you subclass dict you get the 'fromkeys' method entirely for free. Here's an example of how one could write dict.fromkeys() oneself:
class dict_with_fromkeys(dict):
    @classmethod
    def fromkeys(cls, keys, value=None):
        self = cls()
        for key in keys:
            self[key] = value
        return self
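Applied to the calendar case, the same idea could look like the sketch below. Calendar, WorkCalendar, and from_events are hypothetical names; the point is only that cls is whichever class the method was called on.

```python
class Calendar:
    def __init__(self):
        self.events = []

    @classmethod
    def from_events(cls, events):
        # cls is the class the method was called on, so subclasses get
        # instances of their own type without overriding the factory.
        calendar = cls()
        calendar.events.extend(events)
        return calendar


class WorkCalendar(Calendar):
    pass


work = WorkCalendar.from_events(["standup"])
```

Here work is a WorkCalendar, whereas a module-level factory hard-coded to Calendar() would have returned the base class.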
