I have a requirement to load some data into memory. To do this, I need to ensure that the function doing the loading runs only once at runtime, no matter how many times it is called.
I'm using a decorator to do this in a thread-safe manner.
Here's the code I'm using:
import threading

# Instantiating a lock object
# This will be used to ensure that multiple parallel threads will not be able
# to run the same function at the same time in the run_once decorator written below
__lock = threading.Lock()

def run_once(f):
    """
    Decorator to run a function only once.
    :param f: function to be run only once during execution time despite the number of calls
    :return: The original function with the params passed to it if it hasn't already been run before
    """
    def wrapper(*args, **kwargs):
        """
        The actual wrapper where the business logic to call the original function resides
        :param args:
        :param kwargs:
        :return: The original function unless the wrapper has been run already
        """
        if not wrapper.has_run:
            with __lock:
                if not wrapper.has_run:
                    wrapper.has_run = True
                    return f(*args, **kwargs)
    wrapper.has_run = False
    return wrapper
Do I need to double-check the has_run flag, once outside and once inside the lock, so that I'm not reading a stale value?
For example:
def get_val(n):
    return n

def check_args(func):
    # gets the arguments of a function at runtime
    ...

get_val(1)
get_val(2)
Note: This is probably bad practice, but I want to understand more about how Python works.
Python decorators allow you to do this with minimal effort:
import functools

def check_args(func):
    # functools.wraps copies documentation and other attributes from the wrapped
    # func to the wrapper, making it look as much like the wrapped func as possible
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # do stuff to inspect arguments
        return func(*args, **kwargs)
    return wrapper

@check_args
def get_val(n):
    return n
Using @check_args is equivalent to fully defining get_val, then doing:
get_val = check_args(get_val)
which means get_val gets replaced with the wrapper function, which now gets called first, can perform checks, and then delegate to the wrapped function (in this case, get_val). Obviously, for just one function, it's kind of pointless (you could just put the checking in get_val), but if you want to check other functions, prefixing their definition with @check_args is a single line of code that doesn't get intermingled with the rest of the code, and keeps boilerplate down.
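For instance, one possible way to fill in the "do stuff to inspect arguments" part (my sketch, not from the original post) is to bind the call against the wrapped function's signature with inspect:

import functools
import inspect

def check_args(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Bind the actual call arguments to func's signature; this raises
        # TypeError early if the call would not match the parameters.
        bound = inspect.signature(func).bind(*args, **kwargs)
        print(f"{func.__name__} called with {dict(bound.arguments)}")
        return func(*args, **kwargs)
    return wrapper

@check_args
def get_val(n):
    return n

get_val(1)  # prints: get_val called with {'n': 1}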
In my code, I use multiprocessing.Pool to run some code concurrently. Simplified code looks somewhat like this:
import requests
from functools import partial
from multiprocessing import Pool
from requests import Session

class Wrapper:
    session: Session

    def __init__(self):
        self.session = requests.Session()
        # Session initialization

    def upload_documents(self, documents):
        with Pool(4) as pool:
            upload_file = partial(self.upload_document)
            pool.starmap(upload_file, documents)
        summary = create_summary(documents)
        self.upload_document(summary)

    def upload_document(self, doc):
        self.post(doc)

    def post(self, data):
        self.session.post(self.url, data, other_params)
So basically sending documents via HTTP is parallelized. Now I want to test this code, and can't do it. This is my test:
@patch.object(Session, 'post')
def test_study_upload(self, post_mock):
    response_mock = Mock()
    post_mock.return_value = response_mock
    response_mock.ok = True

    with Wrapper() as wrapper:
        wrapper.upload_documents(documents)

    mc = post_mock.mock_calls
And in debug I can check the mock calls. There is one that looks valid (the one uploading the summary), plus a bunch of calls like call.json(), call.__len__(), call.__str__(), etc.
There are no calls uploading the documents. When I set a breakpoint in the upload_document method, I can see it is called once for each document and works as expected. However, I can't verify this behavior via the mock. I assume it's because multiple processes are calling the same mock, but still - how can I solve this?
I use Python 3.6
The approach I would take here is to keep your test as granular as possible and mock out other calls. In this case you'd want to mock your Pool object and verify that it's calling what you're expecting, not actually rely on it to spin up child processes during your test. Here's what I'm thinking:
@patch('yourmodule.Pool')
def test_study_upload(self, mock_pool_init):
    mock_pool_instance = mock_pool_init.return_value.__enter__.return_value

    with Wrapper() as wrapper:
        wrapper.upload_documents(documents)

    # To get the upload_file arg here, you'll need to either mock the partial
    # call, or actually call it and get the return value.
    mock_pool_instance.starmap.assert_called_once_with(upload_file, documents)
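If reconstructing the exact partial object is awkward, a hedged alternative (not part of the original answer) is to leave the callable unpinned with unittest.mock.ANY and only assert on the documents:

from unittest.mock import ANY

# Accept any callable as the first starmap argument, but still check the documents.
mock_pool_instance.starmap.assert_called_once_with(ANY, documents)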
Then you'd want to take your existing logic and test your upload_document function separately:
@patch.object(Session, 'post')
def test_upload_file(self, post_mock):
    response_mock = Mock()
    post_mock.return_value = response_mock
    response_mock.ok = True

    with Wrapper() as wrapper:
        wrapper.upload_document(document)

    mc = post_mock.mock_calls
This gives you coverage both on the function that creates and controls your pool, and on the function that is called by the pool instance. One caveat: I didn't test this, and I'm leaving some of it for you to fill in, since it looks like an abbreviated version of the actual module in your original question.
EDIT:
Try this:
def test_study_upload(self):
    def call_direct(func_var, documents):
        return func_var(documents)

    with patch('yourmodule.Pool.starmap', new=call_direct):
        with Wrapper() as wrapper:
            wrapper.upload_documents(documents)
This is patching out the starmap call so that it calls the function you pass in directly. It circumvents the Pool entirely; the bottom line being that you can't really dive into those subprocesses created by multiprocessing.
I have a method like this in Python:
def test(a, b):
    return a + b, a - b
How can I run this in a background thread and wait until the function returns?
The problem is that the method is pretty big and the project involves a GUI, so I can't block until it returns.
In my opinion, you should run another thread alongside this one that checks whether the result is there yet, or implement a callback that is called at the end of the thread. However, since you have a GUI, which as far as I know is simply a class, you can store the result in an object/class variable and check whether the result has arrived.
I would use a mutable variable, which is a technique that is sometimes used. Let's create a special class that will be used for storing results from thread functions.
import threading
import time

class ResultContainer:
    results = []  # Mutable - anything inside this list will be accessible anywhere in your program

# Let's use a decorator with an argument.
# This way it won't break your function.
def save_result(cls):
    def decorator(func):
        def wrapper(*args, **kwargs):
            # get the result from the function
            func_result = func(*args, **kwargs)
            # pass the result into the mutable list in our ResultContainer class
            cls.results.append(func_result)
            # return the result from the function
            return func_result
        return wrapper
    return decorator

# as an argument to the decorator, pass the class with the mutable list
@save_result(ResultContainer)
def func(a, b):
    time.sleep(3)
    return a, b

th = threading.Thread(target=func, args=(1, 2))
th.daemon = True
th.start()

while not ResultContainer.results:
    time.sleep(1)

print(ResultContainer.results)
So, in this code, we have the class ResultContainer with a list. Whatever you put in it, you can easily access from anywhere in the code, including from other threads (the exception is separate processes, which don't share memory). I made the decorator so you can store the result from any function without touching the function body. This is just an example of how you can run threads and let them store the result themselves without you taking care of it. All you have to do is check whether the result has arrived.
You can use global variables to do the same thing, but I don't advise it; they are ugly and you have to be very careful when using them.
For even more simplicity, if you don't mind touching your function, you can skip the decorator and push the result into the class's list directly from the function, like this:
def func(a, b):
    time.sleep(3)
    ResultContainer.results.append((a, b))
    return a, b
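A minimal usage sketch for this direct-append variant (my addition, not part of the answer above): start the thread, block on join() instead of polling, then read the container.

th = threading.Thread(target=func, args=(1, 2))
th.start()
th.join()  # blocks until func has finished and appended its result
print(ResultContainer.results)  # [(1, 2)]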
I have a decorator @newthread which wraps functions to run in a separate thread (using wraps from functools and Thread from threading). However, there are some functions for which I only want this to happen some of the time.
At the moment, I have @newthread check the keyword arguments of the function to be wrapped, and if it finds a bool new_thread equal to True, it runs the function in a separate thread; otherwise it runs the function normally. For example,
@newthread
def foo(new_thread=False):
    # Do stuff...
    ...

foo()                 # Runs normally
foo(new_thread=True)  # Runs in new thread
Is this the canonical way of doing this, or am I missing something?
Don't use newthread as a decorator, then. A decorator is just a function that takes a function and returns a function.
If you want it to run in the current thread, call
foo(some, params)
If you want to run foo in a new thread, call
newthread(foo)(some, params)
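As a rough illustration of what such a newthread helper might look like (an assumption on my part; the question doesn't show its implementation), a fire-and-forget version could be:

import threading
from functools import wraps

def newthread(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Start func in a daemon thread and return the Thread object
        # so the caller can join() it if needed.
        t = threading.Thread(target=func, args=args, kwargs=kwargs, daemon=True)
        t.start()
        return t
    return wrapper

With that, foo(some, params) runs synchronously (foo itself is left undecorated), and newthread(foo)(some, params) runs it in a new thread, exactly as above.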
@newthread
def foo(new_thread=False):
    # Do stuff...
    ...

foo()                 # Runs normally
foo(new_thread=True)  # Runs in new thread
That is good, but I for one would prefer to have the decorator consume the "new_thread" argument, instead of having it show up in the parameter list of the decorated functions.
Also, you could use a "default" value so that the decision to use a different thread can be picked up from somewhere else (like an environment variable):
import os

MARKER = object()

def newthread(func):
    def wrapper(*args, newthread=MARKER, **kwargs):
        if newthread is MARKER:
            newthread = os.environ.get("force_threads", True)
        if newthread:
            ...
            # create a new thread and return a future-like object
        else:
            return func(*args, **kwargs)
    return wrapper
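One hedged way to fill in the threaded branch above (my sketch, not the answer author's code) is to hand the call to a concurrent.futures.ThreadPoolExecutor, which already returns a real Future:

import os
from concurrent.futures import ThreadPoolExecutor

MARKER = object()
_executor = ThreadPoolExecutor(max_workers=4)  # pool size chosen arbitrarily

def newthread(func):
    def wrapper(*args, newthread=MARKER, **kwargs):
        if newthread is MARKER:
            newthread = os.environ.get("force_threads", True)
        if newthread:
            # Returns a Future; call .result() later to get the return value.
            return _executor.submit(func, *args, **kwargs)
        return func(*args, **kwargs)
    return wrapper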
I have a module which decorates some key functions with custom decorators.
Debugging these functions with pdb often is a bit of a pain, because every time I step into a decorated function I first have to step through the decorator code itself.
I could of course just set the debugger to break within the function I'm interested in, but as key functions they are called many times from many places so I usually prefer to start debugging outside the function.
I tried to illustrate it with code, but I don't know if that helps:
import functools

def i_dont_care_about_this(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@i_dont_care_about_this
def i_only_care_about_this():
    # no use to set pdb here
    ...

def i_am_here():
    import pdb; pdb.set_trace()
    i_only_care_about_this()
So, is there a way for me to step into i_only_care_about_this from i_am_here without going through i_dont_care_about_this?
Essentially I want to skip all decorator code when using s to (s)tep into a given decorated function.
If the decorator is purely for logging or other non-functional behavior, then make it a no-op for debugging - insert this code right after the definition of i_dont_care_about_this:
DEBUG = False
# uncomment this line when pdb'ing
# DEBUG = True

if DEBUG:
    i_dont_care_about_this = lambda fn: fn
But if it contains actual active code, then you will have to do the work using pdb methods, such as a conditional call to pdb.set_trace inside the function you want to debug:
BREAK_FLAG = False
...

# (inside the function you want to debug)
if BREAK_FLAG:
    import pdb; pdb.set_trace()
...

# at your critical calling point
BREAK_FLAG = True
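Put together as a minimal, self-contained sketch of the same idea (the arrangement and names are mine, not the answer's):

import pdb

BREAK_FLAG = False

def i_dont_care_about_this(fn):
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@i_dont_care_about_this
def i_only_care_about_this():
    if BREAK_FLAG:
        pdb.set_trace()  # only breaks for the invocation we flagged
    # ... the code you actually want to debug ...

def i_am_here():
    global BREAK_FLAG
    BREAK_FLAG = True   # flag just the call we care about
    i_only_care_about_this()
    BREAK_FLAG = False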
I don't think you can do that. It would change the meaning of step to be something very different.
However, there is a way to achieve something similar to what you want. Set a breakpoint in your decorated function and one just before the decorated function is called. Now, disable the breakpoint inside the function.
Now, when you run the code, it will only break when you reach the specific invocation you care about. Once that break happens, re-enable the breakpoint in the function and continue the execution. This will execute all the decorated code and break on the first line of the decorated function.
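For example, an illustrative pdb session might look like this (the file name and line numbers are hypothetical; breakpoint 1 is the first line inside the decorated function, breakpoint 2 sits just before the call you care about):

(Pdb) break mymodule.py:42
(Pdb) break mymodule.py:120
(Pdb) disable 1
(Pdb) continue
(Pdb) enable 1
(Pdb) continue

The first continue stops only at the call site; after enable 1, the second continue runs through the decorator and stops on the first line of the decorated function.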
TL;DR: Modify bdb.Bdb so that it adds the decorator's module name to the list of skipped code. This works with both pdb and ipdb, and possibly many others. Examples at the bottom.
From my own experiments with pdb.Pdb (the class that actually does the debugging in the case of pdb and ipdb), it seems like this is perfectly doable without modifying either the code of the function you want to debug or the decorator.
Python debuggers have facilities that make it possible to skip some predefined code. After all, the debugger has to skip its own code to be of any use.
In fact, the base class for Python debuggers has a skip argument: an argument to its __init__() that specifies what the debugger should ignore.
From Python Documentation:
The skip argument, if given, must be an iterable of glob-style module name patterns. The debugger will not step into frames that originate in a module that matches one of these patterns. Whether a frame is considered to originate in a certain module is determined by the __name__ in the frame globals.
The problem is that it is specified on the call to set_trace(), by which point we have already landed, on a break, in the decorator's frame. So there is no built-in feature that would let us add to that argument at runtime.
Fortunately, modifying existing code at runtime is easy in Python, and there are hacks we can use to add the decorator's module name whenever Bdb.__init__() is called. We can "decorate" the Bdb class so that our module is added to the skip list whenever someone creates a Bdb object.
Here is an example of just that. Please excuse the weird signature and the use of Bdb.__init__() instead of super(); in order to be compatible with pdb, we have to do it this way:
# magic_decorator.py
import bdb

old_bdb = bdb.Bdb

class DontDebugMeBdb(bdb.Bdb):
    @classmethod
    def __init__(cls, *args, **kwargs):
        if 'skip' not in kwargs or kwargs['skip'] is None:
            kwargs['skip'] = []
        kwargs['skip'].append(__name__)
        old_bdb.__init__(*args, **kwargs)

    @staticmethod
    def reset(*args, **kwargs):
        old_bdb.reset(*args, **kwargs)

bdb.Bdb = DontDebugMeBdb

def dont_debug_decorator(func):
    print("Decorating {}".format(func))

    def decorated():
        """IF YOU SEE THIS IN THE DEBUGGER - YOU LOST"""
        print("I'm decorated")
        return func()
    return decorated
# buged.py
from magic_decorator import dont_debug_decorator

@dont_debug_decorator
def debug_me():
    print("DEBUG ME")
Output of ipdb.runcall in IPython:
In [1]: import buged, ipdb
Decorating <function debug_me at 0x7f0edf80f9b0>
In [2]: ipdb.runcall(buged.debug_me)
I'm decorated
--Call--
> /home/mrmino/treewrite/buged.py(4)debug_me()
3
----> 4 @dont_debug_decorator
5 def debug_me():
ipdb>
With the following:
def my_decorator(fn):
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper

@my_decorator
def my_func():
    ...
I invoke pdb with import pdb; pdb.run('my_func()') which enters pdb here:
> <string>(1)<module>()
step to enter the call stack – we are now looking at the first line of the decorator function definition:
  def my_decorator(fn):
->    def wrapper(*args, **kwargs):
          return fn(*args, **kwargs)
      return wrapper
next until pdb is on (pointing at) the line where we return the original function (this may be one next or multiple – just depends on your code):
  def my_decorator(fn):
      def wrapper(*args, **kwargs):
->        return fn(*args, **kwargs)
      return wrapper
step into the original function and voila! We are now at the point where we can next through our original function.
-> @my_decorator
   def my_func():
       ...
Not entirely an answer to the question, but for newcomers: if you are debugging a decorated function in VS Code and want to skip the decorator and step into the function, do the following:
place a breakpoint inside the function you decorated (in the function body)
call that function and start debugging
instead of clicking step over or step into, click continue and you will end up inside the function
continue debugging as usual
For example:
@some_decorator
def say_hello(name):
    x = f"Hello {name}"
    return x

hello = say_hello(name="John")
Place one breakpoint at the hello line and a second breakpoint at the x line inside the function.