Behaviour of class members when threading inside a class - python

I have some code which sets up an interrupt handler in the main thread and runs a loop in a side thread. This is so I can Ctrl-C the main thread to signal to the loop to gracefully shutdown, and this all happens inside one class, which looks like:
import concurrent.futures

class MyClass:
    # non-relevant stuff omitted for brevity
    def run(self):
        with concurrent.futures.ThreadPoolExecutor() as executor:
            future = executor.submit(self.my_loop, self.arg_1, self.arg_2)
            try:
                future.result()
            except KeyboardInterrupt:
                self.exit_event.set()  # read in my_loop(), exits after finishing an iteration
                future.result()
This works fine. My question is: are there any special types of objects, or characteristics of objects, I should be aware of with this approach, specifically regarding self. members on MyClass? I think it's fine because my_loop is spawned inside MyClass, so no copies of the self. properties are made - initial testing indicates this is the case. I'm really wondering if there are any more exotic objects (e.g. non-pickleable ones, which do work fine here) I should consider?

As this uses threads rather than inter-process communication, pickleability does not matter: nothing is transmitted through queues. The objects within your class (or outside it) can be anything.
The only thing you need to keep in mind with class variables is that you need a lock to protect access to them. If several threads modify a class variable without one, your results will eventually be something unexpected.
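A minimal sketch of that kind of protection; the _counter attribute is just an illustrative stand-in for whatever shared state your threads write:

import threading

class MyClass:
    def __init__(self):
        self._lock = threading.Lock()
        self._counter = 0  # hypothetical shared state, written by several threads

    def increment(self):
        # += is a read followed by a write; holding the lock makes the pair
        # atomic, so concurrent increments cannot lose updates.
        with self._lock:
            self._counter += 1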

Related

Python thread run() blocking

I was attempting to create a thread class that could be terminated by an exception (since I am trying to have the thread wait on an event) when I created the following:
import sys
import threading

class testThread(threading.Thread):
    def __init__(self):
        super(testThread, self).__init__()
        self.daemon = True
    def run(self):
        try:
            print('Running')
            while 1:
                pass
        except:
            print('Being forced to exit')

test1 = testThread()
test2 = testThread()
print(test1.daemon)
test1.run()
test2.run()
sys.exit()
However, running the program only prints one Running message; the second doesn't appear until the first is terminated. Why is that?
The problem is that you're calling the run method.
This is just a plain old method that you implement, which does whatever you put in its body. In this case, the body is an infinite loop, so calling run just loops forever.
The way to start a thread is the start method. This method is part of the Thread class, and what it does is:
Start the thread’s activity.
It must be called at most once per thread object. It arranges for the object’s run() method to be invoked in a separate thread of control.
So, if you call this, it will start a new thread, make that new thread run your run() method, and return immediately, so the main thread can keep doing other stuff.¹ That's what you want here.
1. As pointed out by Jean-François Fabre, you're still not going to get any real parallelism here. Busy loops are never a great idea in multithreaded code, and if you're running this in CPython or PyPy, almost all of that busy looping is executing Python bytecode while holding the GIL, and only one thread can hold the GIL at a time. So, from a coarse view, things look concurrent—three threads are running, and all making progress. But if you zoom in, there's almost no overlap where two threads progress at once, usually not even enough to make up for the small scheduler overhead.
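Applied to the snippet above, the fix is just to replace the run() calls with start():

test1 = testThread()
test2 = testThread()
test1.start()  # spawns a real thread that invokes run()
test2.start()  # both threads now print 'Running'
sys.exit()     # both threads are daemons, so exiting doesn't wait for them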

On Ctrl-D, call Close() like what happens with file objects

I've written a class that inherits from object and has instances of sub-objects that use some threads for tasks. There are two socket listeners that create other threads for each accepted connection. They do what they have to do. To finish them, they watch a threading.Event object to know that they have to finish.
I've noticed that when exiting the Python console they are not notified (or don't catch the notification), and the exit doesn't return control to the bash console unless Close() is called first.
My first idea to fix it was to implement the __del__ method, so the garbage collector cleans things up on exit.
class ServiceProvider(object):
    def __init__(self):
        super(ServiceProvider, self).__init__()
        # ...
        self.Open()
    def Open(self):
        pass  # ... some threads are created
    def Close(self):
        pass  # ... set the threading.Event to tell the threads to finish
    def __del__(self):
        self.Close()
But the behaviour is the same. If I place a print in those methods, neither the one in __del__ nor the one in Close is written, unless it is closed beforehand; then the print in __del__ is written.
Then I implemented the __enter__ and __exit__ methods to support the with statement, and the exit behaves as expected: when the with block ends, things are released. But what I really want is something like file objects, where even if file.close() is not called, it is executed when the program exits.
class ServiceProvider(object):
    # ...
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        self.Close()
Searching for more solutions, I tried atexit, but with similar results: it doesn't fix the issue. Even though I collect all the objects created from this class, doOnExit only writes its print if the objects in the list are already Closed.
import atexit

objects2Close = []

@atexit.register
def doOnExit():
    for obj in objects2Close:
        obj.Close()

class ServiceProvider(object):
    def __init__(self):
        super(ServiceProvider, self).__init__()
        objects2Close.append(self)
It's usually a good idea to use with when you have resources that you don't want to leak (files, connections, whatever else you care about).
Somewhere, just outside your main loop you should have something like:
with ServiceProvider(some_params) as service_provider:
    rest_of_the_code()
What this does is that, regardless of how you exit rest_of_the_code() (except for kill -9), it will call service_provider.Close() at the end. This works for exceptions and interrupts as well. kill -9 doesn't work because the process is killed at the OS level and doesn't get a chance to attempt recovery.
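A small self-contained sketch of that guarantee, with a print standing in for Close():

class Demo(object):
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        print('Close() runs here')  # runs even when an exception escapes
        return False                # don't swallow the exception

try:
    with Demo():
        raise RuntimeError('simulated failure in rest_of_the_code()')
except RuntimeError:
    pass  # __exit__ already ran before we got here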
I've got a solution for this issue. The information posted in this question was not related to the real problem.
It is as simple as daemon threading.
As the implementation uses some threads to listen for remote connections, they have to finish their execution when the program exits. But the program only ends when every non-daemon thread has finished.
Mistakenly, those listeners and talkers were not set to be daemons, and that's why the exit waits for them.
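A minimal illustration, with a hypothetical listener loop standing in for the socket code:

import threading
import time

def listener():
    while True:
        time.sleep(1)  # stand-in for blocking on accept()

t = threading.Thread(target=listener)
t.daemon = True  # must be set before start(); without it, exit would hang
t.start()
# when the main thread returns, the daemon thread is killed and the process exits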

Correct way to update attributes on one thread from another

I can best explain this with example code first:
import threading

class reciever(threading.Thread, simple_server):
    def __init__(self, callback):
        threading.Thread.__init__(self)
        self.callback = callback
    def run(self):
        self.serve_forever(self.callback)

class sender(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.parameter = 50
    def run(self):
        while True:
            # do some processing in general
            # ....
            # send some udp messages derived from self.parameter
            send_message(self.parameter)

if __name__ == '__main__':
    osc_send = sender()
    osc_send.start()

    def update_parameter(val):
        osc_send.parameter = val

    osc_recv = reciever(update_parameter)
    osc_recv.start()
The pieces I have left out are hopefully self-explanatory from the code that's there.
My question is, is this a safe way to use a server running in a thread to update the attributes on a separate thread that could be reading the value at any time?
The way you're updating that parameter is actually thread-safe already, because of the Global Interpreter Lock (GIL). The GIL means that Python only allows one thread to execute bytecode at a time, so it is impossible for one thread to be reading from parameter at the same time another thread is writing to it. Reading and setting an attribute are each a single, atomic bytecode operation; one will always start and complete before the other can happen. You would only need to introduce synchronization primitives if you needed to do operations that span more than one bytecode operation from more than one thread (e.g. incrementing parameter from multiple threads).
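To make that last point concrete: an increment is a read followed by a write, and another thread can slip in between the two steps and lose an update. A lock restores atomicity; a sketch, reusing the sender class above:

import threading

parameter_lock = threading.Lock()

def bump_parameter(obj, delta):
    # Without the lock, two threads could both read the same old value of
    # obj.parameter, and one of the increments would be lost.
    with parameter_lock:
        obj.parameter += delta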

Communication between threads in Python (without using Global Variables)

Let's say we have a main thread which launches two threads for test modules, "test_a" and "test_b".
Both test module threads maintain their state: whether they are done performing the test, whether they encountered any error or warning, or whether they want to update some other information.
How can the main thread get access to this information and act accordingly?
For example, if "test_a" raised an error flag, how will "main" know, and stop the rest of the tests before exiting with an error?
One way to do this is using global variables, but that gets very ugly very soon.
The obvious solution is to share some kind of mutable variable, by passing it in to the thread objects/functions at constructor/start.
The clean way to do this is to build a class with appropriate instance attributes. If you're using a threading.Thread subclass, instead of just a thread function, you can usually use the subclass itself as the place to stick those attributes. But I'll show it with a list just because it's shorter:
def test_a_func(thread_state):
    # ...
    thread_state[0] = my_error_state
    # ...

def main_thread():
    test_states = [None]
    test_a = threading.Thread(target=test_a_func, args=(test_states,))
    test_a.start()
You can (and usually want to) also pack a Lock or Condition into the mutable state object, so you can properly synchronize between main_thread and test_a.
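For instance, a sketch of packing a Lock into a small state class instead of a bare list (the attribute names are illustrative):

import threading

class TestState(object):
    def __init__(self):
        self.lock = threading.Lock()
        self.error = None

def test_a_func(state):
    # hold the lock while writing, so the main thread never sees a torn update
    with state.lock:
        state.error = 'hypothetical error state'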
(Another option is to use a queue.Queue, an os.pipe, etc. to pass information around, but you still need to get that queue or pipe to the child thread—which you do in the exact same way as above.)
However, it's worth considering whether you really need to do this. If you think of test_a and test_b as "jobs", rather than "thread functions", you can just execute those jobs on a pool, and let the pool handle passing results or errors back.
For example:
try:
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        tests = [executor.submit(job) for job in (test_a, test_b)]
        for test in concurrent.futures.as_completed(tests):
            result = test.result()
except Exception as e:
    pass  # do stuff
Now, if the test_a function raises an exception, the main thread will get that exception; and because that means exiting the with block, all of the other jobs get cancelled and thrown away, and the worker threads shut down.
If you're using Python 2.5-3.1, you don't have concurrent.futures built in, but you can install the backport off PyPI, or you can rewrite things around multiprocessing.dummy.Pool. (It's slightly more complicated that way, because you have to create a sequence of jobs and call map_async to get back an iterator over AsyncResult objects… but really it's still pretty simple.)
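A sketch of that alternative; this version uses apply_async per job rather than map_async, which keeps the error handling obvious:

from multiprocessing.dummy import Pool  # a thread pool, despite the module name

def run_jobs(jobs):
    pool = Pool(2)
    results = [pool.apply_async(job) for job in jobs]
    pool.close()
    for r in results:
        r.get()  # re-raises in the main thread any exception a job raised
    pool.join()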

Why does the instance need to be recreated when restarting a thread?

Imagine the following classes:
import threading
from time import sleep

class Object(threading.Thread):
    # some initialisation blabla
    def run(self):
        while True:
            # do something
            sleep(1)

class Checker():
    def check_if_thread_is_alive(self):
        o = Object()
        o.start()
        while True:
            if not o.is_alive():
                o.start()
I want to restart the thread in case it is dead. This doesn't work, because a thread can only be started once. First question: why is this?
As far as I know, I have to recreate the instance of Object and call start() to start the thread again. For complex Objects this is not very practical: I have to read the current values of the old Object, create a new one, and set the parameters in the new object with the old values. Second question: can this be done in a smarter, easier way?
The reason threading.Thread is implemented that way is to keep the correspondence between a thread object and the operating system's thread. In major OSes threads cannot be restarted, but you may create another thread with another thread id.
If recreation is a problem, there is no need to inherit your class from threading.Thread, just pass a target parameter to Thread's constructor like this:
class MyObj(object):
    def __init__(self):
        self.thread = threading.Thread(target=self.run)
    def run(self):
        ...
Then you may access the thread member to control your thread's execution, and recreate it as needed. No MyObj recreation is required.
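A sketch of how a restart then looks; the start() helper and the trivial run() body are illustrative:

import threading

class MyObj(object):
    def __init__(self):
        self.thread = None

    def run(self):
        pass  # the actual work goes here; all state lives on self

    def start(self):
        # A Thread object can't be restarted, so build a fresh one each time.
        self.thread = threading.Thread(target=self.run)
        self.thread.start()

obj = MyObj()
obj.start()
obj.thread.join()
obj.start()  # restart: same MyObj, new Thread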
See here:
http://docs.python.org/2/library/threading.html#threading.Thread.start
It must be called at most once per thread object. It arranges for the object's run() method to be invoked in a separate thread of control. This method will raise a RuntimeError if called more than once on the same thread object.
A thread isn't intended to run more than once. You might want to use a Thread Pool.
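For instance, with concurrent.futures you can submit the same callable again instead of restarting a thread (a sketch; work() is a stand-in for your run() body):

import concurrent.futures

def work():
    pass  # whatever run() used to do

with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    executor.submit(work).result()  # first run
    executor.submit(work).result()  # "restart": just submit again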
I believe that has to do with how the Thread class is implemented. It wraps a real OS thread, so restarting the thread would actually change its identity, which might be confusing.
A better way to deal with threads is actually through target functions/callables:
from threading import Thread

class Worker(object):
    """Implements the logic to be run in separate threads"""
    def __call__(self):
        pass  # do useful stuff and change the state

class Supervisor():
    def run(self, worker):
        thr = None
        while True:
            if not thr or not thr.is_alive():
                thr = Thread(target=worker)
                thr.daemon = True
                thr.start()
            thr.join(1)  # give it some time
