Say I have the following class and I want to test it.
class SearchRecommended:
    def __init__(self, request2template):
        self._r2t = request2template

    def handle(self, request: Request):
        return request.user().queries().add_recommendation_query().run(1).print(
            RecommendedSearchMedia(self._r2t(request))
        ).message(RecommendedSearchMessage)
The object returned by .user() belongs to the User "interface" and is database-related.
class User(Equalable, ABC):
    @abstractmethod
    def user_id(self):
        pass

    @abstractmethod
    def lang(self):
        pass

    @abstractmethod
    def queries(self) -> "UserQueries":
        pass

    @abstractmethod
    def subscriptions(self) -> "UserSubscriptions":
        pass

    @abstractmethod
    def notifications(self) -> "UserSubsNotifications":
        pass

    @abstractmethod
    def access(self) -> "UserAccess":
        pass

    def __repr__(self):
        return self.user_id()
UserQueries, UserSubscriptions, UserSubsNotifications, UserAccess are also base classes for database-interacting classes.
As far as I know, unit-tests are meant to be fast and shouldn't use the actual database connection.
Unit tests also shouldn't know too much about the inner structure of the code they are testing.
Mocking the whole database interaction layer is tedious, but mocking only the methods used in the method under test seems like "knowing too much" about the inner code.
Shouldn't my code in the .handle method be free to call whatever methods it pleases from the User interface (or the object mocking it) and the subsequent persistence-layer classes (as long as those calls are valid for the given interfaces),
unless I explicitly test for the order of the methods called?
Am I getting something wrong & what should I do?
Your handle method is not well suited to unit testing. The only thing handle does is interact with other code, and for testing interactions with other code you would rather use integration testing.
The background is that with any kind of testing, your goal is to find bugs. With unit testing you try to find the bugs in the isolated code. But if you truly isolate this code, what bugs are left to find?
The likely bugs in your code are more in the direction of "am I calling the proper methods of the other objects, with the right arguments, in the right order, and will the return values be in the form I expect them to be?" None of these questions are answered by unit testing; they are answered by integration testing.
Your unit tests need to make sure the class does what it is supposed to do.
In order to accomplish that, your class needs certain things to function; in this case, some version of the User class.
Your class knows enough about User to call its methods and use their results, so your tests have enough information to make those calls work as expected.
Your mocks don't actually have to fake a database or have real functionality; they just have to look like they do. If all you really care about is making sure that the data layer is called in order, have each step of the call chain set a flag, and verify at the end of the test that all of the flags are set. Not great, but it makes sure that this class calls the data layer as expected.
Long term, if you keep having to do something like this, build a test double for User and similar classes, adding functionality as needed.
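A minimal sketch of such a call-recording test double. Only the method names visible in handle's call chain are taken from the question; every class name here is hypothetical:

```python
# Hypothetical spies for the persistence layer: each stub records the
# call in a shared log and returns the next object in the chain.

class SpyQueryResult:
    def __init__(self, log):
        self._log = log

    def run(self, count):
        self._log.append(("run", count))
        return self


class SpyQueries:
    def __init__(self, log):
        self._log = log

    def add_recommendation_query(self):
        self._log.append(("add_recommendation_query",))
        return SpyQueryResult(self._log)


class SpyUser:
    def __init__(self):
        self.log = []

    def queries(self):
        self.log.append(("queries",))
        return SpyQueries(self.log)


# The test then exercises the chain and checks the recorded calls:
user = SpyUser()
user.queries().add_recommendation_query().run(1)
assert user.log == [("queries",), ("add_recommendation_query",), ("run", 1)]
```

The same pattern extends to the print and message steps; the point is that the double verifies the interaction without any database.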
Here's my problem:
I have a class. And I have two objects of that class: ObjectOne and ObjectTwo
I'd like my class to have certain methods for ObjectOne and different methods for ObjectTwo.
I'd also like to choose those methods from a variety depending on some condition.
And, of course, I need to call those methods from outside code.
As I see the solution on my own (just logic, no code):
I make a default class. And I make a list of functions defined somewhere.
IF 'some condition' is True I construct a child class that takes one of those functions and adds it into class as class method. Otherwise I add some default set of methods. Then I make ObjectOne of this child class.
The question is: can I do that at all? And how do I do that? And how do I call such a method once it is added? They all would surely be named differently...
I do not ask for a piece of working code here. If you could give me a hint on where to look or maybe a certain topic to learn, this would do just fine!
PS: In case you wonder, the context is this: I am making a simple game prototype, and my objects represent two game units (characters) that fight each other automatically. Something like an auto-chess. Each unit may have unique abilities and therefore should act (make decisions on the battlefield) depending on the abilities it has. At first I tried to make a unified decision-making routine that would include all possible abilities at once (such as: if hasDoubleStrike else if... etc). But it turned out to be a very complex task, because there are tens of abilities overall, each unit may have any two, so the number of combinations is... vast. So, now I am trying to distribute this logic over separate units: each one would 'know' only of its own two abilities.
I believe this is what would generally be referred to as a bad idea, but you could pass an argument into the class's constructor and then define the behavior/existence of a function depending on that condition. Like so:
class foo():
    def __init__(self, condition):
        if condition:
            self.func = lambda: print('baz')
        else:
            self.func = lambda: print('bar')

if __name__ == '__main__':
    obj1 = foo(True)
    obj2 = foo(False)
    obj1.func()
    obj2.func()
Outputs:
baz
bar
You'd likely be better off just having different classes or setting up some sort of class hierarchy.
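For the class-hierarchy route, a small sketch (all names hypothetical): a factory picks the subclass once, and callers just call the shared method without ever branching on the condition:

```python
class Unit:
    def act(self):
        raise NotImplementedError


class DoubleStrikeUnit(Unit):
    def act(self):
        return "double strike"


class HealerUnit(Unit):
    def act(self):
        return "heal"


def make_unit(has_double_strike):
    # The condition lives in one place; callers never see it.
    return DoubleStrikeUnit() if has_double_strike else HealerUnit()


print(make_unit(True).act())   # double strike
print(make_unit(False).act())  # heal
```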
So in the end, the best solution was the classic factory method and factory class. Like this:
import abc
import Actions  # a module that works as a library of standard actions

def make_creature(some_params):
    creature_factory = CreatureFactory()
    tempCreature = creature_factory.make_creature(some_params)
    return tempCreature

class CreatureFactory:
    def make_creature(self, some_params):
        ...
        if "foo" in some_params:
            return fooChildCreature()

class ParentCreature(metaclass=abc.ABCMeta):
    someStaticParams = 'abc'

    @abc.abstractmethod
    def decisionMaking(self):
        pass

class fooChildCreature(ParentCreature):
    def decisionMaking(self):
        Actions.foo_action()
        Actions.bar_action()
        # some creature-specific decision making here that calls
        # shared functions from 'Actions'

NewCreature = make_creature(some_params)
This is not ideal; it still requires a lot of manual work to define decision making for the various kinds of creatures, but it is still WAY better than anything else. Thank you very much for this advice.
I want to use metaclass to implement a factory which make processors for data coming in from different sources. Following is the skeleton code:
class ProcessorFactory:
    def __call__(self, classname, supers, classdict):
        ...

    def __new__(self, classname, supers, classdict):
        ...

    def __int__(self):
        ...

class MQ_AddOn(object):
    ...  # MQ-specific code

class File_AddOn(object):
    ...  # Filesystem-specific code

class Web_AddOn(object):
    ...  # Web-specific code

class MQ_Processor(MQ_AddOn, metaclass=ProcessorFactory()):
    ...  # code common to all channels (MQ, Filesystem, Web)

class File_Processor(File_AddOn, metaclass=ProcessorFactory()):
    ...  # code common to all channels (MQ, Filesystem, Web)

class Web_Processor(Web_AddOn, metaclass=ProcessorFactory()):
    ...  # code common to all channels (MQ, Filesystem, Web)
My question is whether there is a way, similar to macro expansion in assembly, to factor out the code common to all channels (MQ, Filesystem, Web) so that it doesn't have to be copied for each of those classes?
Sorry, but I think you will have to expand your pseudocode somewhat more for a meaningful answer.
The way to avoid copying code around is just to use the normal inheritance mechanisms in Python: if you have code common to all those classes, put that code in a common base class, which might be a mixin (i.e., there is no need for it to be the "only" base class), and you are set.
If the common code has to call methods for data acquisition or processing that are specific to each of the subclasses, just write fine-grained methods to perform that, and place calls to those in the common method.
Also, there is no need in this process for either a "metaclass" or an "inline macro expansion": just use plain methods and finer-grained methods.
class Base:
    def process(self):
        # preparing code
        ...
        # data gathering
        data = self.fetch_data()
        # common data pre-processing code:
        ...
        # specific data processing code:
        post_data = self.refine_data(data)
        # more common code
        ...
        # specific output:
        self.output(post_data)

    def fetch_data(self):
        pass

    def refine_data(self, data):
        pass

    def output(self, data):
        pass
And then, on the subclasses you implement those addressing the specific channel peculiarities. There is no big secret there.
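For instance, here is a condensed version of that skeleton with one hypothetical channel subclass that implements only the hooks, while process() itself is inherited unchanged:

```python
class Base:
    def process(self):
        # common orchestration: only the steps below vary per channel
        data = self.fetch_data()
        refined = self.refine_data(data)
        return self.output(refined)

    def fetch_data(self):
        raise NotImplementedError

    def refine_data(self, data):
        raise NotImplementedError

    def output(self, data):
        raise NotImplementedError


class FileProcessor(Base):
    # Only the channel-specific hooks are written here.
    def fetch_data(self):
        return "raw-file-data"

    def refine_data(self, data):
        return data.upper()

    def output(self, data):
        return "written: " + data


print(FileProcessor().process())  # written: RAW-FILE-DATA
```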
Even if you need a lot more things to be done, like steps that are called from more than one "leaf" class at each stage, you are better off having your class feature a "pipeline" data member, where the steps are registered and called in order; still, there is no need to (and no sense in) involve metaclasses.
In your example, the "Base" class could fit all the "common code" you mention, and be used with multiple inheritance:
class MQProcessor(MQAddOn, Base):
    ...
(When you write __call__ and __new__ methods in what would be a metaclass, one might think "OK, these are metaclass-related things that possibly make sense." But __int__ in a metaclass makes no sense at all: it would provide a way to map a class (not an instance, not some data content) to an integer. Even if you are putting these classes in a list, they need indexes to be located in the list, and you want to cross-reference those, you should just add a custom ".index" attribute to the class, not write __int__ on the metaclass.)
I am struggling to understand when it makes sense to use an instance method versus a static method. Also, I don't know if my functions are static, since there is no @staticmethod decorator. Would I be able to access the class's functions when I make a call to one of the methods?
I am working on a webscraper that sends information to a database. It's set up to run once a week. The structure of my code looks like this:
import libraries...

class Get:
    def build_url(url_parameter1, url_parameter2, request_date):
        return url_with_parameters

    def web_data(request_date, url_parameter1, url_parameter2):  # no use of self
        # using the parameters, pull the variables to look up in the database
        for a in db_info:
            url = build_url(a, url_parameter2, request_date)
            x = requests.Session().get(url, proxies).json()
            # save data to the database
        return None

    # same type of function for pulling the web data from the database and parsing it

if __name__ == '__main__':
    Get.web_data(request_date, url_parameter1, url_parameter2)
    Parse.web_data(get_date, parameter)  # to illustrate the second part of the scraper
That is the basic structure. The code is functional but I don’t know if I am using the methods (functions?) correctly and potentially missing out on ways to use my code in the future. I may even be writing bad code that will cause errors down the line that are impossibly hard to debug only because I didn’t follow best practices.
After reading about when class and instance methods are used. I cannot see why I would use them. If I want the url built or the data pulled from the website I call the build_url or get_web_data function. I don’t need an instance of the function to keep track of anything separate. I cannot imagine when I would need to keep something separate either which I think is part of the problem.
The reason I think my question is different than the previous questions is: the conceptual examples to explain the differences don't seem to help me when I am sitting down and writing code. I have not run into real world problems that are solved with the different methods that show when I should even use an instance method, yet instance methods seem to be mandatory when looking at conceptual examples of code.
Thank you!
Classes can be used to represent objects, and also to group functions under a common namespace.
When a class represents an object, like a cat, anything that this object 'can do', logically, should be an instance method, such as meowing.
But when you have a group of static functions that are all related to each other or are usually used together to achieve a common goal, like build_url and web_data, you can make your code clearer and more organized by putting them under a static class, which provides a common namespace, like you did.
Therefore, in my opinion, the structure you chose is legitimate. It is worth considering, though, that static classes are more common in strictly OOP languages like Java, while in Python it is more usual to use modules for namespace separation.
This code doesn't need to be a class at all. It should just be a pair of functions. You can't see why you would need an instance method because you have no reason to instantiate the object in the first place.
The functions you have written in your code are instance methods, but they were written incorrectly.
An instance method must have self as its first parameter, i.e.:
def build_url(self, url_parameter1, url_parameter2, request_date):
Then you call it like this:
get_inst = Get()
get_inst.build_url(url_parameter1, url_parameter2, request_date)
The self parameter is provided by Python, and it allows you to access all properties and functions (static or not) of your Get class.
If you don't need to access other functions or properties in your class, then you add the @staticmethod decorator and remove the self parameter:
@staticmethod
def build_url(url_parameter1, url_parameter2, request_date):
And then you can call it directly
Get.build_url(url_paramater1, url_parameter2, request_date)
or call it from a class instance
get_inst = Get()
get_inst.build_url(url_paramater1, url_parameter2, request_date)
But what is the problem with your current code, you might ask?
Try calling it from an instance like this and you will see the problem:
get_inst = Get()
get_inst.build_url(url_paramater1, url_parameter2, request_date)
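To make that failure concrete, here is a minimal reproduction; the body of build_url is invented just so the example runs:

```python
class Get:
    def build_url(url_parameter1, url_parameter2, request_date):
        # hypothetical body, only so the function returns something
        return url_parameter1 + "?" + url_parameter2 + "&" + request_date


# Called on the class, the self-less function happens to work:
print(Get.build_url("a", "b", "c"))  # a?b&c

# Called on an instance, Python passes the instance as the first
# argument, so the call has four arguments for three parameters:
try:
    Get().build_url("a", "b", "c")
except TypeError as exc:
    print("TypeError:", exc)
```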
Example where creating an instance is useful:
Let's say you want to make a chat client.
You could write code like this
class Chat:
    def send(server_url, message):
        connection = connect(server_url)
        connection.write(message)
        connection.close()

    def read(server_url):
        connection = connect(server_url)
        message = connection.read()
        connection.close()
        return message
But a much cleaner and better way to do it:
class Chat:
    def __init__(self, server_url):
        # Initialize the connection only once, when the instance is created
        self.connection = connect(server_url)

    def __del__(self):
        # Close the connection only once, when the instance is deleted
        self.connection.close()

    def send(self, message):
        self.connection.write(message)

    def read(self):
        return self.connection.read()
To use that last class, you do:
# Create a new instance, passing server_url as an argument
chat = Chat("http://example.com/chat")
chat.send("Hello")
chat.read()
# deleting chat causes the __del__ method to be called and the connection to be closed
del chat
From the given example, there is no need to have the Get class at all, since you are using it just as an additional namespace. You do not have any state that you want to preserve, in either the class or a class instance.
What seems like a better approach is to have a separate module and define these functions in it. This way, when importing the module, you get the namespace that you want.
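A sketch of that module-based layout; the file name and function bodies are only illustrative:

```python
# scraper.py: plain module-level functions give the same namespace
# grouping as the static class, with no class machinery.

def build_url(param1, param2, request_date):
    return "https://example.com/" + param1 + "/" + param2 + "?date=" + request_date


def web_data(request_date, param1, param2):
    # Real code would fetch the URL and store the result; the URL is
    # returned here only so the sketch has an observable effect.
    return build_url(param1, param2, request_date)


# Callers would then write:  import scraper; scraper.web_data(...)
print(web_data("2024-01-01", "a", "b"))  # https://example.com/a/b?date=2024-01-01
```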
I've run into an issue writing unit tests for class attributes created with the @property decorator. I am using the excellent py.test package for testing and would definitely prefer sticking with it because it's easy to set up fixtures. The code for which I'm writing unit tests looks something like this:
class Foo(object):
    def __init__(self, fizz, buzz):  # ...and many more parameters
        self.fizz = fizz
        self.buzz = buzz
        self.is_fizzing = ...  # some function of fizz
        self.is_fizz_and_buzz = fizz and buzz
        ...

    @property
    def is_bar(self):
        return self.is_fizz_and_buzz and self.buzz > 5

    @property
    def foo_bar(self):
        # Note that this property uses the property above
        return self.is_bar + self.fizz

    # Long list of properties that call each other
The problem occurs when writing a unit test for a property which uses multiple other properties sometimes in long chains of up to four properties. For each unit test, I need to determine which inputs I need to set up and, even worse, it may be the case that those inputs are irrelevant for testing the functionality of that particular method. As a result, I end up testing many more cases than necessary for some of these "properties."
My intuition tells me that if these were actually "properties" they shouldn't have such long, involved calculations that need to be tested. Perhaps it would be better to separate out the actual methods (making them class methods) and write new properties which call these class methods.
Another issue with the current code (correct me if I'm wrong) is that every time a property is accessed, and most properties are accessed a lot, the attribute is recalculated. This seems terribly inefficient and could be fixed by caching the computed values in new properties.
Does it even make sense to test properties? In other words, should an attribute be calculated in a property at all, or should it only be set? Rewriting the code as I described above seems so unpythonic. Why even have properties in the first place? If properties are supposed to be simple enough not to require testing, why not just define the attribute in __init__?
Sorry if these are silly questions. I'm still fairly new to python.
Edit/update:
Would it be possible (easy/pythonic) to mock the object and then perform the test on the property/attribute?
The easiest thing to do, in my opinion, is to test the property, e.g. foo_bar, as a normal method through its underlying getter function, e.g. Foo.foo_bar.fget, which @property automatically provides.
For me, using pytest & unittest.mock, this means instead of doing the below when foo_bar is a normal non-property method:
class TestFoo:
    def test_foo_bar(self):
        foo_mock = mock.create_autospec(Foo)
        # set up `foo_mock` however is necessary for testing
        assert Foo.foo_bar(foo_mock) == some_value
I would change the last line to this:
assert Foo.foo_bar.fget(foo_mock) == some_value
If you need to stub/mock other properties in the process (like Foo.is_bar), take a look at unittest.mock.PropertyMock.
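A self-contained sketch of that approach, with Foo condensed to a single property and a plain types.SimpleNamespace standing in for the mock:

```python
from types import SimpleNamespace


class Foo:
    @property
    def is_bar(self):
        return self.is_fizz_and_buzz and self.buzz > 5


# Call the getter directly through fget, supplying only the attributes
# this particular property reads; no full Foo instance is needed.
stub = SimpleNamespace(is_fizz_and_buzz=True, buzz=10)
assert Foo.is_bar.fget(stub) is True

stub_low = SimpleNamespace(is_fizz_and_buzz=True, buzz=3)
assert Foo.is_bar.fget(stub_low) is False
```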
I don't understand the need for the additional "delegate"-layer in the following code, from Learning Python, 5ed, by Mark Lutz:
class Super:
    def method(self):
        print('in Super.method')

    def delegate(self):
        self.action()


class Provider(Super):
    def action(self):
        print('in Provider.action')
This means that you must define an action method in your subclass for the delegate method call to work:
Super().delegate() ====> Error!
Provider().delegate() ====> Works, prints 'in Provider.action'!
Why not just code the delegate method in the subclass? In other words, remove delegate from Super altogether and code it in Provider only. The result is still an error from Super().delegate() and the same result from Provider().delegate().
Could you kindly provide use cases or references/ pointers? Thank you!
I can't really answer for Mark Lutz, but I can explain why I think it's a good idea. It's similar to the C++ guideline of making virtual functions non-public.
The idea is as follows: when the base class provides a public interface that subclasses override directly, there is no single interface point through which everything must flow. Do you want, at a later time, to log something, verify arguments, measure timing? Without such a single point, you can't.
So here, in Python, it's the analog of this rule:
class Super:
    def delegate(self):
        # -> Single interface point; log, verify, time, whatever <-
        self.action()  # <- This varies with the subclass.
As you can see, this idiom allows subclasses to vary some aspects, while enforcing uniformity of other aspects. IMHO, it's a very good idea.