Is resetting Python thread/process tasks by calling self.run() pythonic? - python

In the process class MyProcessClass below, I sometimes want to rerun all of the tasks in self.run.
I use self.run(retry=True) to do this. It lets me rerun the tasks in run(self) whenever I want, from any function in the class.
import sys
from multiprocessing import Process

class MyProcessClass(Process):
    def __init__(self):
        Process.__init__(self)

    # gets called automatically on class initialization
    # (process.start()).
    # it also gets called when a class function calls
    # self.run(retry=True)
    def run(self, end=False, retry=False):
        if end:
            sys.exit()
        elif retry:
            redo_prep()
        self.do_stuff()

    # represents class functions doing stuff
    def do_stuff(self):
        # stuff happens well
        return
        # stuff happens and need to redo everything
        self.run(retry=True)
I don't want the thread/process to end, but I want everything to rerun. Could this cause problems, given that the run function is being called recursively-ish and I am running hundreds of these process class objects at one time? The box hits about 32GB of memory when all are running. Only objects that need to will be rerun.
My goal is to rerun the self.run tasks if needed or end the thread if needed from anywhere in the class, be it 16 functions deep or 2. In a sense, I am resetting the thread's tasks, since I know resetting the thread from within doesn't work. I have seen other ideas regarding "resetting" threads from How to close a thread from within?. I am looking for the most pythonic way of dealing with rerunning class self.run tasks.
I usually use try-catch throughout the class:
def function():
    while True:
        try:
            # something bad
        except Exception as e:
            # if throttle just wait
            # otherwise, raise
        else:
            return
Additional Question: If I were to raise a custom exception to trigger a #retry for the retries module, would I have to re-raise? Is that more or less pythonic than the example above?
My script had crapped out in a way I hadn't seen before and I worried that calling the self.run(retry=True) had caused it to do this. I am trying to see if there is anything crazy about the way I am calling the self.run() within the process class.

It looks like you're implementing a rudimentary retrying scenario. You should consider delegating this to a library for this purpose, like retrying. This will probably be a better approach compared to the logic you're trying to implement within the thread to 'reset' it.
By raising/retrying on specific exceptions, you should be able to implement the proper error-handling logic cleanly with retrying. As a best-practice, you should avoid broad excepts and catch specific exceptions whenever possible.
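As a rough, library-free sketch of that idea (this is not the retrying library's API, and ThrottledError, call_with_retries, and flaky_task are names made up for illustration), a loop can retry only on one specific exception and let everything else propagate. Unlike calling self.run() again from deep inside the class, a loop does not add a new stack frame for every retry:

```python
import time


class ThrottledError(Exception):
    """Hypothetical exception meaning 'slow down and try again'."""


def call_with_retries(task, max_attempts=3, wait_seconds=0.01):
    """Run task(); retry only on ThrottledError, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(wait_seconds)  # back off before the next attempt


calls = {"count": 0}


def flaky_task():
    """Hypothetical task that is throttled twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ThrottledError("throttled")
    return "done"


result = call_with_retries(flaky_task)
```

Any exception other than ThrottledError leaves call_with_retries immediately, which keeps real bugs visible instead of retrying them.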
Consider a pattern whereby the thread itself does not need to know if it will need to be 'reset' or restarted. Instead, if possible, try to have your thread return some value or exception info so the main thread can decide whether to re-queue a task.
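A minimal sketch of that pattern, assuming the work can be expressed as plain functions submitted to an executor (NeedsRedo, task, and run_all are hypothetical names, and threads are used here only to keep the example small):

```python
from concurrent.futures import ThreadPoolExecutor


class NeedsRedo(Exception):
    """Hypothetical signal that a task should be run again."""


attempts = {}


def task(name):
    """Hypothetical task: 'b' fails on its first attempt only."""
    attempts[name] = attempts.get(name, 0) + 1
    if name == "b" and attempts[name] == 1:
        raise NeedsRedo(name)
    return name.upper()


def run_all(names, max_rounds=3):
    """Submit tasks; the main thread re-queues any that raise NeedsRedo."""
    results = {}
    pending = list(names)
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(max_rounds):
            if not pending:
                break
            futures = {name: pool.submit(task, name) for name in pending}
            pending = []
            for name, future in futures.items():
                try:
                    # result() re-raises whatever the worker raised
                    results[name] = future.result()
                except NeedsRedo:
                    pending.append(name)  # decide here, not in the worker
    return results


results = run_all(["a", "b"])
```

The worker never restarts itself; it either returns a value or raises, and the decision to run it again is made in one place.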

Related

Mutex lock in python3

I'm using a mutex to guard part of the code in the first function. Can I unlock the mutex in the second function?
For example:
import threading

mutex = threading.Lock()

def function1():
    mutex.acquire()
    # do something

def function2():
    # do something
    mutex.release()
    # do something
You certainly can do what you're asking, locking the mutex in one function and unlocking it in another one. But you probably shouldn't. It's bad design. If the code that uses those functions calls them in the wrong order, the mutex may be locked and never unlocked, or be unlocked when it isn't locked (or even worse, when it's locked by a different thread). If you can only ever call the functions in exactly one order, why are they even separate functions?
A better idea may be to move the lock-handling code out of the functions and make the caller responsible for locking and unlocking. Then you can use a with statement that ensures the lock and unlock are exactly paired up, even in the face of exceptions or other unexpected behavior.
with mutex:
    function1()
    function2()
Or if not all parts of the two functions are "hot" and need the lock held to ensure they run correctly, you might consider factoring out the parts that need the lock into a third function that runs in between the other two:
function1_cold_parts()
with mutex:
    hot_parts()
function2_cold_parts()
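To illustrate the claim about exceptions, here is a small sketch in which the code holding the lock fails part of the time (hot_parts, guarded, and shared are stand-in names, not anything from the question). The lock is released in both cases:

```python
import threading

mutex = threading.Lock()
shared = []


def hot_parts(value):
    """Hypothetical critical section; fails for negative input."""
    if value < 0:
        raise ValueError("negative value")
    shared.append(value)


def guarded(value):
    # The with statement acquires the lock and always releases it,
    # even when hot_parts() raises.
    with mutex:
        hot_parts(value)


guarded(1)
try:
    guarded(-1)
except ValueError:
    pass

lock_is_free = not mutex.locked()
```

With separate acquire() and release() calls spread over two functions, the failing call would have left the mutex locked.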

How to get exception occurring in a polling thread?

I am using the concurrent.futures module and my code looks like this
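from concurrent.futures import ThreadPoolExecutor

class SomeClass:
    def __init__(self):
        executor = ThreadPoolExecutor(max_workers=1)
        # pass the method itself; calling it here would block __init__
        executor.submit(self._start_polling)

    def _start_polling(self):
        while True:
            # polling some stuff

    def get_stuff(self):
        # returns some stuff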
I will be calling get_stuff() on this class multiple times in my code, and I need the polling to make sure I always get the latest stuff (the stuff changes from time to time, driven by some other program). Now, if an exception occurs in the polling thread, how do I raise it in the main thread and stop the entire program? Currently, if there's an exception, the polling thread dies and get_stuff() returns stale data.
I tried getting the future object but if I use it in any way like future.exception() it just blocks the execution of the main thread on it. Any advice would be much appreciated.
Edit:
I first looked into asyncio for this, but after reading about it a bunch, it looks like asyncio is not really a good fit for running background tasks like this. Correct me if I am wrong.
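One possible direction, offered as a sketch and not a definitive answer: keep the Future returned by submit() and check it with done(), which never blocks. Since the polling loop is endless, the future can only be done if the loop raised, and result() then re-raises that exception in the calling thread. The Poller class and the fetch callable below are hypothetical stand-ins for the class in the question:

```python
import time
from concurrent.futures import ThreadPoolExecutor


class Poller:
    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical callable that polls once
        self._latest = None
        self._executor = ThreadPoolExecutor(max_workers=1)
        # Pass the method itself; do not call it here.
        self._future = self._executor.submit(self._poll_forever)

    def _poll_forever(self):
        while True:
            self._latest = self._fetch()
            time.sleep(0.01)

    def get_stuff(self):
        # done() never blocks; an endless loop is only "done" if it raised.
        if self._future.done():
            self._future.result()  # re-raises the polling thread's exception
        return self._latest


state = {"calls": 0}


def fetch():
    """Hypothetical data source that breaks on the third poll."""
    state["calls"] += 1
    if state["calls"] > 2:
        raise RuntimeError("source went away")
    return state["calls"]


poller = Poller(fetch)
outcome = None
deadline = time.monotonic() + 5
while outcome is None and time.monotonic() < deadline:
    try:
        poller.get_stuff()
        time.sleep(0.01)
    except RuntimeError as exc:
        outcome = str(exc)
```

In a real program the exception would simply be left to propagate out of get_stuff(), which stops the main thread instead of handing back stale data.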

How to reference a thread in Python 3?

I am trying to call a thread I define in a function from another function. Here is the first function, its purpose is to create and start a thread:
def startThread(func):
    listen = threading.Thread(target = func)
    listen.start()
I am trying to implement a function that will close the thread created in that first function, how should I go about it? I don't know how to successfully pass the thread.
def endThread(thread):
    thread.exit()
Thank you!
This problem is almost FAQ material.
To summarise, there is no way to kill a thread from the outside. You can of course pass the thread object to any function you want, but the threading library has no kill or exit calls.
There are more or less two distinct ways around this, depending on what your thread does.
The first method is to make it so that your thread co-operates. This approach is discussed here: Is there any way to kill a Thread in Python? This method adds a check to your thread loop and a way to raise a "stop signal", which will then cause the thread to exit from the inside when detected.
This method works fine if your thread is a relatively busy loop. If it is something that is blocking in IO wait, not so much, as your thread could be blocking in a read call for days or weeks before receiving something and executing the signal check part. Many IO calls accept a timeout value, and if it is acceptable to wait a couple of seconds before your thread exits, you can use this to force the exit check every N seconds without making your thread a busy loop.
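A minimal sketch of the co-operative approach with a timeout, assuming the thread's blocking call is a queue read (stop_event, inbox, and worker are illustrative names):

```python
import queue
import threading

stop_event = threading.Event()
inbox = queue.Queue()
processed = []


def worker():
    # Loop until someone raises the stop signal.
    while not stop_event.is_set():
        try:
            # Block for at most 0.1 s so the stop check runs regularly.
            item = inbox.get(timeout=0.1)
        except queue.Empty:
            continue
        processed.append(item)
        inbox.task_done()


thread = threading.Thread(target=worker)
thread.start()

inbox.put("job-1")
inbox.join()          # wait until the item has been processed

stop_event.set()      # ask the thread to exit from the inside
thread.join(timeout=2)
stopped = not thread.is_alive()
```

The timeout bounds how long the thread can take to notice the stop signal; here that is roughly 0.1 seconds.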
The other approach is to replace threads with processes. You can force kill a subprocess. If you can communicate with your main program with queues instead of shared variables, this is not too complicated, either. If your program relies heavily on sharing global variables, this would require a major redesign.
If your program is waiting in IO loops, you need instantaneous termination and you are using shared global variables, then you are somewhat out of luck, as you either need to accept your threads not behaving nicely or you need to redesign some parts of your code to untangle either the IO wait or shared variables.

Python; best practices for killing other threads

I want to know: what is the best practice for killing threads started by a main Python application in the case the main application receives a SIGINT?
I am doing the following, but since needing to stop other threads is such a common problem, I strongly suspect there is a better way to do it:
class Handler(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.keep_go = True

    def run(self):
        while self.keep_go:
            # do something

    def stop(self):  # seems like i shouldn't have to do this myself
        self.keep_go = False

try:
    h = Handler()
    h.start()
    while True:  # ENTER SOME OTHER LOOP HERE
        # do something else
except KeyboardInterrupt:  # seems like i shouldn't have to do this myself
    pass
finally:
    h.stop()
The following post is related, but it is not clear to me what the actual recommended practice is, because the answers are more of a "here's some possibly hackish way you can do this". Also, I do not need to kill something "abruptly"; I am OK with doing it "the right way": Is there any way to kill a Thread in Python?
Edit: I guess one minor flaw with my approach is that it does not kill the current processing in the while loop. It does not receive a "kill event" that "rolls" back this loop as a transaction, nor does it halt the remainder of the loop.
I usually just set each thread's daemon attribute to True. That way, when the main thread terminates, so does everything else.
The documentation has a little more to say on the matter:
Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
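For the graceful, non-daemonic route that the documentation mentions, a sketch along these lines may help. It reworks the Handler class from the question to use an Event; the attribute name _stop_requested, the iterations counter, and the short sleep standing in for the main loop are all assumptions made for the example:

```python
import threading
import time


class Handler(threading.Thread):
    def __init__(self):
        super().__init__()
        self._stop_requested = threading.Event()
        self.iterations = 0

    def run(self):
        # wait() doubles as an interruptible sleep: it returns True as
        # soon as stop() is called, otherwise False after the timeout.
        while not self._stop_requested.wait(timeout=0.01):
            self.iterations += 1  # placeholder for real work

    def stop(self):
        self._stop_requested.set()


handler = Handler()
handler.start()
try:
    # Stand-in for the main loop; Ctrl-C would raise KeyboardInterrupt here.
    time.sleep(0.05)
except KeyboardInterrupt:
    pass
finally:
    handler.stop()
    handler.join()  # let the current iteration finish before exiting
```

The thread still only stops between iterations, so work inside one pass of the loop is never cut off halfway.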

How to know if a particular task inside a queue is complete?

I have a question about Python queues.
I have written a threaded class whose run() method processes the queue.
import threading
import Queue

class AThread(threading.Thread):
    def __init__(self, arg1):
        self.file_resource = arg1
        threading.Thread.__init__(self)
        self.queue = Queue.Queue()

    def __myTask(self):
        ''' Method that will access a common resource
        Needs to be synchronized.
        Returns a Boolean based on the outcome
        '''
        self.file_resource.write()

    def run(self):
        while True:
            cmd = self.queue.get()
            # cmd is actually a call to method
            exec("self.__" + cmd)
            self.queue.task_done()

# The problem i have here is while invoking the thread
a = AThread()
a.queue.put("myTask()")
print "Hai"
The same instance of AThread (a=AThread()) will load tasks to the queue from different locations.
Hence the print statement at the bottom should wait, for a definite period, until the task added to the queue by the statement above has run, and it should also receive the value returned by the task.
Is there a simple way to achieve this? I have searched a lot for this; kindly review the code and provide suggestions.
Also, why are Python's lock acquire and release not tied to the instances of the class? In the scenario mentioned, instances a and b of AThread need not be synchronized with each other, but myTask runs synchronized across both a and b when acquire and release are applied.
Kindly provide suggestions.
There are lots of approaches you could take, depending on the particular contours of your problem.
If your print "Hai" just needs to happen after myTask completes, you could put it into a task and have myTask put that task on the queue when it finishes. (if you're a CS theory sort of person, you can think of this as being analogous to continuation-passing style).
If your print "Hai" has a more elaborate dependency on multiple tasks, you might look into futures or promises.
You could take a step into the world of Actor-based concurrency, in which case there would probably be a synchronous message send method that does more or less what you want.
If you don't want to use futures or promises, you can achieve a similar thing manually, by introducing a condition variable. Set the condition variable before myTask starts and pass it to myTask, then wait for it to be cleared. You'll have to be very careful as your program grows and constantly rethink your locking strategy to make sure it stays simple and comprehensible - this is the stuff of which difficult concurrency bugs are made.
The smallest sensible step to get what you want is probably to provide a blocking version of Queue.put() which does the condition variable thing. Make sure you think about whether you want to block until the queue is empty, or until the thing you put on the queue is removed from the queue, or until the thing you put on the queue has finished processing. And then make sure you implement the thing you decided to implement when you were thinking about it.
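A sketch of the last variant, blocking until the thing you put on the queue has finished processing. It is written for Python 3 (so the module is queue, not Queue), uses a per-task Event where the answer says condition variable, and the Task and Worker classes and the my_task function are hypothetical:

```python
import queue
import threading


class Task:
    """Hypothetical wrapper pairing a callable with a completion signal."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.done = threading.Event()
        self.result = None
        self.error = None

    def wait(self, timeout=None):
        """Block until the task has finished; True if it did so in time."""
        return self.done.wait(timeout)


class Worker(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self.tasks = queue.Queue()

    def run(self):
        while True:
            task = self.tasks.get()
            if task is None:  # sentinel: shut the worker down
                break
            try:
                task.result = task.func(*task.args)
            except Exception as exc:
                task.error = exc  # keep the failure for the caller
            finally:
                task.done.set()  # signal completion either way

    def submit(self, func, *args):
        task = Task(func, *args)
        self.tasks.put(task)
        return task


def my_task(text):
    """Stand-in for the synchronized work; returns a Boolean outcome."""
    return len(text) > 0


worker = Worker()
worker.start()
task = worker.submit(my_task, "payload")
finished = task.wait(timeout=2)  # wait a bounded time for this one task
worker.tasks.put(None)
worker.join(timeout=2)
```

Passing a callable and its arguments also avoids building method calls from strings with exec. If wait() returns False, the task did not finish within the timeout and result should not be read yet.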
