Python: communication between "faraway" classes on the same call chain

I have this start.py:
# start.py
class Start:
    def __init__(self):
        self.mylist = []

    def run(self):
        # some code
        ...
Executing its run() method will at some point invoke the put_item(obj) method in moduleX.py:
# moduleX.py
def put_item(obj):
    # what should I write here?
    ...
run() is NOT the direct caller of put_item(obj). In fact, from run() to put_item(obj) the execution is quite complex and involves a lot of other invocations.
My problem is, when put_item(obj) is called, can I directly add the value of obj back to mylist in the class Start? For example:
s = Start()
# suppose during this execution, put_item(obj) is
# invoked 3 times, with obj equal to 1, 2, 3 respectively
s.run()
print(s.mylist)  # I want it to be [1, 2, 3]
UPDATE:
From run() to put_item(obj), the execution involves heavy usage of 3rd-party modules and function calls that I have no control over. In other words, the execution in between run() and put_item(obj) is a black box to me, and this execution produces the value of obj that I'm interested in.
obj is consumed in put_item(obj) in moduleX.py, which is also a 3rd-party module. put_item(obj) originally contains GUI code that displays obj in a fancy way. However, I want to modify its original behavior so that I can add obj to mylist in class Start and use mylist later in my own way.
Therefore, I cannot pass a Start reference along the call chain to put_item, since I don't know the call chain and simply cannot modify it. Also, I cannot change the method signatures in moduleX.py, or I'll break the original API. What I can change is the body of put_item(obj) and start.py.

Simply make put_item return the item you want to put in your instance:
def put_item():
    # some code
    return 42
class Start:
    def __init__(self):
        self.mylist = []

    def run(self):
        # some code
        self.mylist.append(put_item())

s = Start()
s.run()
print(s.mylist)
Prints:
[42]

Yes, you can, but you will have to propagate a reference to your Start object's list down the call stack to put_item(), which can then append items to it. put_item() does not have to know or care that the list it is passed belongs to a Start instance; it can just blindly append.
For example:
class Start:
    def __init__(self):
        self.mylist = []

    def run(self):
        foo(self.mylist)
        print(self.mylist)

def foo(listRef):
    bar(listRef)

def bar(listRef):
    someItem = "Hello, World!"
    put_item(listRef, someItem)

def put_item(listRef, obj):
    listRef.append(obj)

x = Start()
x.run()
Of course, you'll get the appropriate runtime error if the thing you pass to foo turns out not to be a list.
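Given the constraints in the update (only the body of put_item and start.py can be changed, and the call chain in between is a black box), a variant of the same idea is to route the list through a module-level hook instead of the call stack. This is just a sketch of that pattern, not part of the original answers; the _collector name is made up for illustration:
# moduleX.py
_collector = None  # start.py installs a list here before running

def put_item(obj):
    # ... the original GUI code can stay here ...
    if _collector is not None:
        _collector.append(obj)  # route obj back to whoever registered

# start.py
import moduleX

class Start:
    def __init__(self):
        self.mylist = []

    def run(self):
        moduleX._collector = self.mylist  # register before the black box runs
        # ... original run() logic, which eventually calls put_item(obj) ...
This keeps put_item's signature intact, so the original API is not broken.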

Related

How to use mock in function run in multiprocessing.Pool

In my code, I use multiprocessing.Pool to run some code concurrently. Simplified code looks somewhat like this:
class Wrapper():
    session: Session

    def __init__(self):
        self.session = requests.Session()
        # Session initialization

    def upload_documents(self, docs):
        with Pool(4) as pool:
            upload_file = partial(self.upload_document)
            pool.starmap(upload_file, docs)
        summary = create_summary(docs)
        self.upload_document(summary)

    def upload_document(self, doc):
        self.post(doc)

    def post(self, data):
        self.session.post(self.url, data, other_params)
So basically sending documents via HTTP is parallelized. Now I want to test this code, and can't do it. This is my test:
@patch.object(Session, 'post')
def test_study_upload(self, post_mock):
    response_mock = Mock()
    post_mock.return_value = response_mock
    response_mock.ok = True
    with Wrapper() as wrapper:
        wrapper.upload_documents(documents)
    mc = post_mock.mock_calls
And in the debugger I can inspect the mock calls. There is one that looks valid, which is the one uploading the summary, plus a bunch of calls like call.json(), call.__len__(), call.__str__(), etc.
There are no calls uploading the documents. When I set a breakpoint in the upload_document method, I can see it is called once for each document, so it works as expected. However, I can't verify this behavior through the mock. I assume it's because many processes are calling the same mock, but still: how can I solve this?
I use Python 3.6
The approach I would take here is to keep your test as granular as possible and mock out the other calls. In this case you'd want to mock your Pool object and verify that it's called with what you're expecting, rather than actually relying on it to spin up child processes during your test. Here's what I'm thinking:
@patch('yourmodule.Pool')
def test_study_upload(self, mock_pool_init):
    mock_pool_instance = mock_pool_init.return_value.__enter__.return_value
    with Wrapper() as wrapper:
        wrapper.upload_documents(documents)
    # To get the upload_file arg here, you'll need to either mock the partial call,
    # or actually call it and use the return value
    mock_pool_instance.starmap.assert_called_once_with(upload_file, documents)
Then you'd want to take your existing logic and test your upload_document function separately:
@patch.object(Session, 'post')
def test_upload_file(self, post_mock):
    response_mock = Mock()
    post_mock.return_value = response_mock
    response_mock.ok = True
    with Wrapper() as wrapper:
        wrapper.upload_document(document)
    mc = post_mock.mock_calls
This gives you coverage both of the function that creates and controls your pool, and of the function being called by the pool instance. One caveat: I didn't test this, and I'm leaving some of it for you to fill in, since it looks like an abbreviated version of the actual module in your original question.
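One small addition to that second test: rather than only inspecting mock_calls, you could finish with an explicit assertion. A sketch (assert_called_once is available on Mock as of Python 3.6):
    post_mock.assert_called_once()  # Session.post hit exactly once for the document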
EDIT:
Try this:
def test_study_upload(self):
    # the replacement gets bound to the Pool instance, hence pool_self
    def call_direct(pool_self, func, iterable):
        # stand-in for starmap: run synchronously, unpacking each args tuple
        return [func(*args) for args in iterable]
    with patch('yourmodule.Pool.starmap', new=call_direct):
        with Wrapper() as wrapper:
            wrapper.upload_documents(documents)
This patches out the starmap call so that it invokes the function you pass in directly and synchronously. It circumvents the Pool entirely; the bottom line is that you can't really dive into those subprocesses created by multiprocessing.
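Another workaround worth knowing (my own suggestion, not part of the original answer): patch the Pool class with multiprocessing.dummy.Pool, which exposes the same API but uses threads, so the patched Session is shared and the mock records every call:
from multiprocessing.dummy import Pool as ThreadPool  # thread-based, same API as Pool

@patch('yourmodule.Pool', new=ThreadPool)
@patch.object(Session, 'post')
def test_study_upload(self, post_mock):
    post_mock.return_value = Mock(ok=True)
    with Wrapper() as wrapper:
        wrapper.upload_documents(documents)
    # all uploads now run in threads of this process, so the mock sees them
    assert post_mock.call_count == len(documents) + 1  # the docs plus the summary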

Python method decorator that modifies the class

Essentially, I need to keep track of the methods that I wrap with this decorator to use later by editing the original object. The code works if I call the method, but if I don't, the code in the wrapper never executes. The wrapper is the only place where I receive the object where I can modify it. So, I need some other way to modify the object without calling a method that I'm decorating in it.
I've been trying so many different ways but I just can't get this to work.
import functools

def decorator3(**kwargs):
    print(1)
    def decorator(function):
        print(2)
        @functools.wraps(function)
        def wrapper(self, *args):
            print(3)
            self.__test__ = "test worked"
            function(self, *args)
        return wrapper
    return decorator

class Test:
    def __init__(self):
        self.a = "test"

    @decorator3()
    def test(self):
        print(self.a)

t = Test()
# t.test()
print(t.__test__)
The parts of your code execute at different times.
The outer decorator3 function, where you print "1", is called at the line reading @decorator3(). The result of calling this function, decorator, is then immediately applied to the decorated function (resp. method), which is where you see the "2".
In your case, the inner decorator replaces the method with a different function that calls the original method. Only when you call this replacement do you reach "3".
So:
at 1, you don't know anything about your function.
at 2, you know your function, but not which class it lives in.
at 3, you know your object, but only if your function is called.
You have to decide what exactly you want to do: according to your description, you want to flag the functions. That is exactly what you can do at "2".
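For instance, a minimal sketch of flagging at "2", i.e. at decoration time, with a made-up module-level registry:
flagged = []  # hypothetical registry, filled while the class body is executed

def flag(function):
    flagged.append(function.__name__)  # runs at decoration time ("2")
    return function

class Test:
    @flag
    def test(self):
        pass

print(flagged)  # ['test'] -- no call to test() needed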
You say
I need to keep track of the methods that I decorate later on in the code. Basically, I decorate it to "flag" the method, and I have multiple classes with different methods that I need to flag. I will be calling decorator3 in other classes, so I don't see how putting the decorator in init will help. By editing the original class, I can later on put the method in a dictionary which I was hoping the decorator would do.
Maybe the following works:
import functools

def method_flagger(function):
    function._flag = "flag"
    return function

list_of_methods = []

def flag_reader(function):
    @functools.wraps(function)
    def wrapper(self, *args):
        for i in dir(self.__class__):
            # for i, j in self.__class__.__dict__.items():
            method = getattr(self, i)
            if hasattr(method, "_flag"):
                list_of_methods.append((self, i))
        return function(self, *args)
    return wrapper

class Test:
    @flag_reader
    def __init__(self):
        self.a = "test"

    @method_flagger
    def test(self):
        print(self.a)

    @method_flagger
    def test2(self):
        print(self.a)

t1 = Test()
t2 = Test()
# t.test()
# print(t.__test__)
print(list_of_methods)
It gives me the output
[(<__main__.Test object at 0x000001D56A435860>, 'test'), (<__main__.Test object at 0x000001D56A435860>, 'test2'), (<__main__.Test object at 0x000001D56A435940>, 'test'), (<__main__.Test object at 0x000001D56A435940>, 'test2')]
So for each affected object instance and each decorated method, we get a tuple identifying both.

python threading - multiple parameter and return

I have a method like this in Python :
def test(a, b):
    return a + b, a - b
How can I run this in a background thread and wait until the function returns?
The problem is that the method is pretty big and the project involves a GUI, so I can't just block until it returns.
In my opinion, you should run another thread alongside this one that checks whether there is a result yet, or implement a callback that is called at the end of the thread. However, since you have a GUI, which as far as I know is simply a class, you can store the result in an object or class variable and check whether the result has arrived.
I would use a mutable variable, which is a common trick. Let's create a special class that will be used to store results from thread functions.
import threading
import time

class ResultContainer:
    results = []  # mutable - anything inside this list is accessible anywhere in your program

# Let's use a decorator with an argument.
# This way it won't break your function.
def save_result(cls):
    def decorator(func):
        def wrapper(*args, **kwargs):
            # get the result from the function
            func_result = func(*args, **kwargs)
            # pass the result into the mutable list in our ResultContainer class
            cls.results.append(func_result)
            # return the result from the function
            return func_result
        return wrapper
    return decorator

# as an argument to the decorator, pass the class with the mutable list
@save_result(ResultContainer)
def func(a, b):
    time.sleep(3)
    return a, b

th = threading.Thread(target=func, args=(1, 2))
th.daemon = True
th.start()

while not ResultContainer.results:
    time.sleep(1)

print(ResultContainer.results)
So, in this code, we have the class ResultContainer with a list. Whatever you put in it, you can easily access from anywhere in the code (between threads and so on; the exception is between processes, which do not share memory). I made a decorator, so you can store the result of any function without touching the function's body. This is just an example of how you can run threads and let them store their results without you having to take care of it. All you have to do is check whether the result has arrived.
You can use global variables to do the same thing, but I don't advise it. They are ugly, and you have to be very careful when using them.
For even more simplicity, if you don't mind modifying your function, you can skip the decorator and push the result into the class's list directly in the function, like this:
def func(a, b):
    time.sleep(3)
    ResultContainer.results.append((a, b))
    return a, b
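For completeness, the standard library also covers this use case directly: concurrent.futures.ThreadPoolExecutor returns a Future whose result() blocks until the function finishes. A sketch, not part of the original answer; in a GUI you would poll future.done() instead of blocking:
from concurrent.futures import ThreadPoolExecutor

def test(a, b):
    return a + b, a - b

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(test, 5, 3)  # runs in a background thread
    # ... do other work here; from a GUI event loop, check future.done() ...
    print(future.result())  # blocks until done -> (8, 2)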

Inner class methods vs instance methods in python

I wrote a simple program that accepts input from the user and capitalizes it. Obviously, this can be done in different ways:
class example():
    def say_it(self):
        result = input('what do you wanna say to the world')
        return result

    def get_result(self):
        res = self.say_it()
        return res

    def capitalize(self):
        res = self.get_result()
        res = res.upper()
        print(res)

def main():
    Ex = example()
    res = Ex.capitalize()

if __name__ == '__main__':
    main()
This program has 3 methods in the class body. A new instance is created in the main function, and only the capitalize method is called; the class does the whole magic and prints a capitalized version of the user's input, leaving the main function looking very clean.
class example():
    def say_it(self):
        result = input('what do you wanna say to the world')
        return result

    def capitalize(self, words):
        words = words.upper()
        return words

def main():
    Ex = example()
    res = Ex.say_it()
    final_result = Ex.capitalize(res)
    print(final_result)

if __name__ == '__main__':
    main()
The second program does the same thing, but it has fewer methods in the class body and more work in the main function: it calls the methods on the class, works with the returned results, and issues the final print statement in the main function itself, unlike the first program. It looks like the main function could get very confusing as the program expands and grows.
My question is: which approach scales better in real-life situations (i.e., is more readable and easier to debug) where there might be, say, 15 methods? Is it better to call a single method that does all the magic and gets the result, or to call the methods one by one in the main function? I sometimes find myself writing programs the first way, where I just call one method and the class handles everything else. Also, is there any difference in speed between these two programs; which one will be faster?
Functions should do what they say they do. It is confusing to have a function called capitalize() that goes off and calls a function that prints, prompts, and gathers input.
A function shouldn't just call another function and add no value. The get_result() function serves no purpose; calling say_it() directly produces the same result.
Your class should keep the data; that's the whole point of the object. main can call the functions, but it shouldn't hold the data. The words should be stored in the class.
There is no perceptible performance difference based on who calls the functions.
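A sketch of what that advice might look like in code (my own illustration, not the answerer's; the names are made up):
class Example:
    def __init__(self):
        self.words = ""

    def ask(self):
        # the data lives on the object, not in main
        self.words = input('what do you wanna say to the world')

    def capitalize(self):
        # does exactly what its name says
        self.words = self.words.upper()

def main():
    ex = Example()
    ex.ask()
    ex.capitalize()
    print(ex.words)

if __name__ == '__main__':
    main()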

Parallel processing loop using multiprocessing Pool

I want to process a large for loop in parallel, and from what I have read the best way to do this is to use the multiprocessing library that comes standard with Python.
I have a list of around 40,000 objects, and I want to process them in parallel in a separate class. The reason for doing this in a separate class is mainly because of what I read here.
In one class I have all the objects in a list, and via multiprocessing.Pool and its Pool.map function I want to carry out parallel computations for each object by running it through another class and returning a value.
# ... some class that generates the list_objects
pool = multiprocessing.Pool(4)
results = pool.map(Parallel, self.list_objects)
And then I have a class in which I want to process each object passed in by the pool.map call:
class Parallel(object):
    def __init__(self, args):
        self.some_variable = args[0]
        self.some_other_variable = args[1]
        self.yet_another_variable = args[2]
        self.result = None

    def __call__(self):
        self.result = self.calculate(self.some_variable)
The reason I have a __call__ method is the post I linked above, yet I'm not sure I'm using it correctly, as it seems to have no effect: the self.result value never gets generated.
Any suggestions?
Thanks!
Use a plain function, not a class, when possible. Use a class only when there is a clear advantage to doing so.
If you really need to use a class, then given your setup, pass an instance of Parallel:
results = pool.map(Parallel(args), self.list_objects)
Since the instance has a __call__ method, the instance itself is callable, like a function.
By the way, the __call__ needs to accept an additional argument:
def __call__(self, val):
since pool.map is essentially going to do, in parallel:
p = Parallel(args)
result = []
for val in self.list_objects:
    result.append(p(val))
Pool.map simply applies a function (actually, any callable) in parallel. It has no notion of objects or classes. Since you pass it a class, it simply calls __init__; __call__ is never executed. You need to either call it explicitly from __init__ or use pool.map(Parallel.__call__, preinitialized_objects).
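To illustrate the "plain function" advice from the first answer, here is a minimal sketch (the tuple layout and the doubling computation are made-up placeholders):
import multiprocessing

def process_object(args):
    some_variable, some_other_variable, yet_another_variable = args
    # placeholder for the real calculation
    return some_variable * 2

if __name__ == '__main__':
    list_objects = [(1, 'a', True), (2, 'b', False), (3, 'c', True)]
    with multiprocessing.Pool(4) as pool:
        results = pool.map(process_object, list_objects)
    print(results)  # [2, 4, 6]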
