I'm using a mutex to block part of the code in the first function. Can I unlock the mutex in the second function?
For example:
import threading
mutex = threading.Lock()
def function1():
    mutex.acquire()
    # do something

def function2():
    # do something
    mutex.release()
    # do something
You certainly can do what you're asking, locking the mutex in one function and unlocking it in another one. But you probably shouldn't. It's bad design. If the code that uses those functions calls them in the wrong order, the mutex may be locked and never unlocked, or be unlocked when it isn't locked (or even worse, when it's locked by a different thread). If you can only ever call the functions in exactly one order, why are they even separate functions?
A better idea may be to move the lock-handling code out of the functions and make the caller responsible for locking and unlocking. Then you can use a with statement that ensures the lock and unlock are exactly paired up, even in the face of exceptions or other unexpected behavior.
with mutex:
    function1()
    function2()
Or if not all parts of the two functions are "hot" and need the lock held to ensure they run correctly, you might consider factoring out the parts that need the lock into a third function that runs in between the other two:
function1_cold_parts()
with mutex:
    hot_parts()
function2_cold_parts()
I have a button which calls a function. This function is called using a thread. If I click the button more than once, I get an error: RuntimeError: threads can only be started once. I found the solution on SO (create a new thread). But if I create a new thread every time the button is clicked, what happens to the previous thread? Should I worry about the previous threads?
import tkinter as tk
from tkinter import ttk
import threading
root = tk.Tk()
# create a new thread on each click
def start_rename():
    new_thread = threading.Thread(target=bulk_rename)
    new_thread.start()

def bulk_rename():
    print("renaming...")
rename_button = ttk.Button(root, text="Bulk Rename", command=start_rename)
rename_button.pack()
root.mainloop()
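For what it's worth, a thread whose target function has returned is simply finished; once nothing references the Thread object any more, it can be garbage collected like any other object, so old threads are not something you normally need to clean up yourself. A minimal sketch (the thread count and names here are my own, not from the question):

```python
import threading

def bulk_rename():
    print("renaming...")

# Each "click" creates a fresh Thread object. Once its target returns,
# the thread is finished; dropping all references to it lets it be
# garbage collected like any other Python object.
threads = [threading.Thread(target=bulk_rename) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for each target to return

print(all(not t.is_alive() for t in threads))  # -> True
```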
Here's a different way to say the same things:
A mutex (a.k.a. a "lock", or a "binary semaphore") is an object with two methods: lock() and unlock(), or acquire() and release(), or decrement() and increment() (or P() and V() in really old programs). The lock() method does not return until the calling thread "owns" the mutex, and a subsequent call to the unlock() method from the same thread relinquishes ownership.
No two threads will be allowed to own the same mutex at the same time. If two or more threads simultaneously try to lock a mutex, one will immediately "win" ownership, and the others will wait for it to be unlocked again.
Assuming each of the competing threads eventually unlocks the mutex, then all of them eventually will be allowed to own the mutex, but only one-by-one. The lock() function cannot fail. The only thing it can do is wait for the mutex to become available, and then take ownership. If some thread in a buggy program keeps ownership of some mutex forever, then a subsequent attempt by some other thread to lock() that same mutex will wait forever.
We sometimes call the part of the program that comes between a lock() call and the subsequent unlock() call a critical section.
We can use a mutex as an advisory lock.
Imagine a program with three variables, A, B, and C, that are shared by several threads. The program has an important rule: A+B+C must always equal zero. Computer scientists call a rule like that an invariant—it's a statement about the program that always is true.
But, what happens if one thread needs to perform some operation, mut(), that changes the values of A, B, and C? It cannot change all three variables simultaneously, so it must temporarily break the invariant. In that moment, some other thread could see the variables in that broken state, and the program could fail.
We fix that problem by having every thread lock() the same advisory lock (i.e., the same mutex) before accessing A, B, or C. And we make sure that A+B+C=0 again before any thread unlocks the mutex. If the thread calling mut() obeys this rule, then no other thread that also obeys the same rule will ever see A, B, and C in the "broken" state.
If none of the threads in the program ever accesses A, or B, or C without owning the mutex, then we can say that we have effectively made mut() an atomic operation.
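The invariant example above can be sketched directly in Python (the state dict, state_lock, and mut() names here are my own illustrative choices):

```python
import threading

# Shared state with the invariant A + B + C == 0.
state = {"A": 0, "B": 0, "C": 0}
state_lock = threading.Lock()

def mut(delta):
    # The invariant is temporarily broken between the assignments below,
    # so every access to A, B, or C must happen while owning the mutex.
    with state_lock:
        state["A"] += delta
        state["B"] += delta          # invariant broken here...
        state["C"] -= 2 * delta      # ...and restored here
        assert state["A"] + state["B"] + state["C"] == 0

threads = [threading.Thread(target=mut, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["A"] + state["B"] + state["C"])  # -> 0
```

Because mut() holds the lock for the whole read-modify-write, no cooperating thread can ever observe the broken intermediate state, which is what makes mut() effectively atomic.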
You actually should lock a mutex when accessing shared variables regardless of any invariant—do it even if accessing just a single, primitive flag or integer—because using mutexes on a multi-CPU machine enables the different CPUs to see a consistent view of memory. In modern systems, access to the same variable by more than one thread with no locking can lead to undefined behavior.
A program with more than one invariant may use more than one mutex object to protect them: One mutex for each. But, programmers are strongly advised to learn about deadlock before writing any program in which a single thread locks more than one mutex at the same time.
"Deadlock" is the answer to a whole other question, but TLDR, it's what happens when Thread1 owns mutexA, and it's waiting to acquire mutexB; while at the same time, Thread2 owns mutexB, and is waiting to acquire mutexA. It's a thing that actually happens sometimes in real, commercial software, that was written by smart programmers. Usually it's because there were a lot of smart programmers, who weren't always talking to each other.
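The classic defense is to always acquire multiple locks in one fixed global order. A minimal sketch (the ordered() helper is my own): the two threads below name the locks in opposite order, which is exactly the Thread1/Thread2 scenario above, yet they cannot deadlock.

```python
import threading

mutexA = threading.Lock()
mutexB = threading.Lock()

def ordered(*locks):
    # Impose one global acquisition order (here: by id()), so no two
    # threads can ever hold the same pair of locks in opposite orders.
    return sorted(locks, key=id)

def worker(first, second, results):
    locks = ordered(first, second)
    for lock in locks:
        lock.acquire()
    results.append("ok")          # critical section
    for lock in reversed(locks):
        lock.release()

results = []
t1 = threading.Thread(target=worker, args=(mutexA, mutexB, results))
t2 = threading.Thread(target=worker, args=(mutexB, mutexA, results))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # -> ['ok', 'ok']
```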
My goal is to be able to recognize at runtime, directly, that certain lock.acquire() and lock.release() calls were made, and to be able to get info on the lock object.
So, in theory I do not know where or if they are called - therefore, I don't want to use decorators or subclass the lock class.
My approach:
The problem is that when using sys.settrace(), the calls to acquire() and release() do not cause me to enter a new scope, and thus I'm still in the same scope without knowing these functions were even called.
This leads me to think that for a lock from the _thread module, the C code is executed directly (is that correct?)
So, is there a way within this approach to do that?
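One avenue within this approach: sys.settrace() indeed only reports Python-level frames, but sys.setprofile() additionally receives c_call/c_return events for built-in (C-implemented) functions, which is how _thread.lock.acquire and release surface. A minimal sketch, assuming you only care about acquire/release calls on lock objects:

```python
import sys
import threading

lock = threading.Lock()
seen = []

def profiler(frame, event, arg):
    # C-level calls (such as _thread.lock.acquire) surface as 'c_call'
    # events in a profile function; a trace function never sees them.
    if event == "c_call" and getattr(arg, "__name__", "") in ("acquire", "release"):
        # arg.__self__ is the bound object, i.e. the lock instance itself
        if isinstance(getattr(arg, "__self__", None), type(lock)):
            seen.append(arg.__name__)

sys.setprofile(profiler)
lock.acquire()
lock.release()
sys.setprofile(None)

print(seen)  # -> ['acquire', 'release']
```

Note that profiling, like tracing, is per-thread, so you would need to install the profiler in every thread you want to observe (e.g. via threading.setprofile()).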
Regarding the code below of the process class MyProcessClass, sometimes I want to rerun all of the self.run tasks.
self.run(retry=True) is what I use to rerun the run(self) tasks within the class. It allows me to rerun the process class's run(self) tasks whenever I want to, from wherever I want to, from any class function.
import sys
from multiprocessing import Process

class MyProcessClass(Process):
    def __init__(self):
        Process.__init__(self)
        # run() gets called automatically by process.start();
        # it also gets called again whenever a class function calls
        # self.run(retry=True)

    def run(self, end=False, retry=False):
        if end:
            sys.exit()
        elif retry:
            redo_prep()
        self.do_stuff()

    # represents class functions doing stuff
    def do_stuff(self):
        # stuff happens well
        return
        # stuff happens and everything needs to be redone
        self.run(retry=True)
I don't want the thread/process to end, but I want everything to rerun. Could this cause problems, given that the run function is being called recursively-ish and I am running hundreds of these process class objects at one time? The box hits about 32GB of memory when all are running. Only objects that need to will be rerun.
My goal is to rerun the self.run tasks if needed or end the thread if needed from anywhere in the class, be it 16 functions deep or 2. In a sense, I am resetting the thread's tasks, since I know resetting the thread from within doesn't work. I have seen other ideas regarding "resetting" threads from How to close a thread from within?. I am looking for the most pythonic way of dealing with rerunning class self.run tasks.
I usually use try-catch throughout the class:
def function():
    while True:
        try:
            ...  # something bad
        except Exception as e:
            # if throttled, just wait
            # otherwise, raise
            ...
        else:
            return
Additional Question: If I were to raise a custom exception to trigger a #retry for the retries module, would I have to re-raise? Is that more or less pythonic than the example above?
My script had crapped out in a way I hadn't seen before and I worried that calling the self.run(retry=True) had caused it to do this. I am trying to see if there is anything crazy about the way I am calling the self.run() within the process class.
It looks like you're implementing a rudimentary retrying scenario. You should consider delegating this to a library built for the purpose, like retrying. That will probably be a better approach than the logic you're trying to implement within the thread to 'reset' it.
By raising/retrying on specific exceptions, you should be able to implement the proper error-handling logic cleanly with retrying. As a best-practice, you should avoid broad excepts and catch specific exceptions whenever possible.
Consider a pattern whereby the thread itself does not need to know if it will need to be 'reset' or restarted. Instead, if possible, try to have your thread return some value or exception info so the main thread can decide whether to re-queue a task.
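To illustrate the pattern such libraries implement, here is a minimal hand-rolled retry decorator (the names retry, ThrottleError, and flaky_task are mine, not the retrying API): it retries on a specific exception and re-raises only once the attempts are exhausted.

```python
import time

def retry(exceptions, attempts=3, delay=0.01):
    # Minimal stand-in for what libraries like `retrying` provide.
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise              # attempts exhausted: re-raise
                    time.sleep(delay)      # back off before the next try
        return wrapper
    return decorator

class ThrottleError(Exception):
    pass

calls = []

@retry(ThrottleError, attempts=3)
def flaky_task():
    calls.append(1)
    if len(calls) < 3:
        raise ThrottleError("throttled; try again")
    return "done"

print(flaky_task())  # -> done
```

Regarding the additional question above: with this shape, raising the specific exception from inside the task is itself what triggers the retry, so the task doesn't need its own re-raise; the decorator re-raises only when it gives up.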
I am trying to understand the with statement in Python. Everywhere I look it talks of opening and closing a file, and says it is meant to replace the try-finally block. Could someone post some other examples too? I am just trying out Flask, and there are with statements galore in it. I'd really appreciate it if someone could provide some clarity on this.
There's a very nice explanation here. Basically, the with statement calls two special methods on the associated object: __enter__ and __exit__. The __enter__ method's return value is bound to the variable named in the with statement's as clause, while the __exit__ method is called after the block executes to handle any cleanup (such as closing a file pointer).
The idea of the with statement is to make "doing the right thing" the path of least resistance. While the file example is the simplest, threading locks actually provide a more classic example of non-obviously buggy code:
try:
    lock.acquire()
    # do stuff
finally:
    lock.release()
This code is broken - if the lock acquisition ever fails, either the wrong exception will be thrown (since the code will attempt to release a lock that it never acquired), or, worse, if this is a recursive lock, it will be released early. The correct code looks like this:
lock.acquire()
try:
    # do stuff
finally:
    # If lock.acquire() fails, this *doesn't* run
    lock.release()
By using a with statement, it becomes impossible to get this wrong, since it is built into the context manager:
with lock:  # The lock *knows* how to correctly handle acquisition and release
    # do stuff
The other place where the with statement helps greatly is similar to the major benefit of function and class decorators: it takes "two piece" code, which may be separated by an arbitrary number of lines of code (the function definition for decorators, the try block in the current case) and turns it into "one piece" code where the programmer simply declares up front what they're trying to do.
For short examples, this doesn't look like a big gain, but it actually makes a huge difference when reviewing code. When I see lock.acquire() in a piece of code, I need to scroll down and check for a corresponding lock.release(). When I see with lock:, though, no such check is needed - I can see immediately that the lock will be released correctly.
There are twelve examples of using with in PEP 343, including the file-open example:

1. A template for ensuring that a lock, acquired at the start of a block, is released when the block is left
2. A template for opening a file that ensures the file is closed when the block is left
3. A template for committing or rolling back a database transaction
4. Example 1 rewritten without a generator
5. Redirect stdout temporarily
6. A variant on opened() that also returns an error condition
7. An operation that blocks signals
8. Another use for this feature is the Decimal context
9. A simple context manager for the decimal module
10. A generic "object-closing" context manager
11. A released() context to temporarily release a previously acquired lock by swapping the acquire() and release() calls
12. A "nested" context manager that automatically nests the supplied contexts from left-to-right to avoid excessive indentation
I have a bunch of different methods that are not supposed to run concurrently, so I use a single lock to synchronize them. Looks something like this:
selected_method = choose_method()
with lock:
    selected_method()
In some of these methods, I sometimes call a helper function that does some slow network IO. (Let's call that one network_method()). I would like to release the lock while this function is running, to allow other threads to continue their processing.
One way to achieve this would be by calling lock.release() and lock.acquire() before and after calling the network method. However, I would prefer to keep the methods oblivious to the lock, since there are many of them and they change all the time.
I would much prefer to rewrite network_method() so that it checks to see whether the lock is held, and if so release it before starting and acquire it again at the end.
Note that network_method() sometimes gets called from other places, so it shouldn't release the lock if it's not on the thread that holds it.
I tried using the locked() method on the Lock object, but that method only tells me whether the lock is held, not if it is held by the current thread.
By the way, lock is a global object and I'm fine with that.
I would much prefer to rewrite network_method() so that it checks to see whether the lock is held, and if so release it before starting and acquire it again at the end.
Note that network_method() sometimes gets called from other places, so it shouldn't release the lock if it's not on the thread that holds it.
This just sounds like entirely the wrong thing to do :(
For a start, it's bad to have a function that sometimes has some other magical side-effect depending on where you call it from. That's the sort of thing that is a nightmare to debug.
Secondly, a lock should have clear acquire and release semantics. If I look at code that says "lock(); do_something(); unlock();" then I expect it to be locked for the duration of do_something(). In fact, it is also telling me that do_something() requires a lock. If I find out that someone has written a particular do_something() which actually unlocks the lock that I just saw to be locked, I will either (a) fire them or (b) hunt them down with weapons, depending on whether I am in a position of seniority relative to them or not.
By the way, lock is a global object and I'm fine with that.
Incidentally, this is also why globals are bad. If I modify a value, call a function, and then modify a value again, I don't want that function in the middle being able to reach back out and modify this value in an unpredictable way.
My suggestion to you is this: your lock is in the wrong place, or doing the wrong thing, or both. You say these methods aren't supposed to run concurrently, but you actually want some of them to run concurrently. The fact that one of them is "slow" can't possibly make it acceptable to remove the lock - either you need the mutual exclusion during this type of operation for it to be correct, or you do not. If the slower operation is indeed inherently safe when the others are not, then maybe it doesn't need the lock - but that implies the lock should go inside each of the faster operations, not outside them. But all of this is dependent on what exactly the lock is for.
Why not just do this?
with lock:
    before_network()
do_network_stuff()
with lock:
    after_network()