In my application, I have a custom QThread responsible for communicating with the backend, and call a utility function with a url and data from the run() method:
class SomeThread(QtCore.QThread):
    def __init__(self, parent=None...):
        QtCore.QThread.__init__(self, parent)
        ...

    def run(self):
        final_desired_content = some_utility_method(url, data ...)
        # emitting success with final_desired_content
In the utility method(s), I'm making an http POST, getting a response back, parsing the response, and eventually passing the desired information to the above thread variable final_desired_content. Before I pass the information back, I am parsing some more information which I don't want to return, and would like to store in a SomeClass singleton instance:
def some_utility_method( ... ):
    ...
    return response_parsing(response)

def response_parsing(response):
    ...
    some_file.SomeClass.instance().setNewData(otherData)
    return mainParsedData
Because there may be multiple threads contacting the BE within a few seconds (specifically during application start), I would like to prevent data from being written again before <some_time> has passed since the last update (it is OK for the ignored data to be thrown away):
class SomeClass(QtCore.QObject):
    _instance = None

    @classmethod
    def instance(klass):
        if not klass._instance:
            klass._instance = SomeClass()
        return klass._instance

    def __init__(self):
        QtCore.QObject.__init__(self)
        self._recentlyUpdatedTimer = QtCore.QTimer()
        self._recentlyUpdatedTimer.setSingleShot(True)
        self._recentlyUpdatedTimer.timeout.connect(self._setOkToUpdateCB)
        self._storedData = None
        self._allowUpdate = True

    def _setOkToUpdateCB(self):
        self._allowUpdate = True

    def setNewData(self, newData):
        if self._allowUpdate:
            print "UPDATING!"
            self._allowUpdate = False
            self._storedData = newData
            self._recentlyUpdatedTimer.start(<some_time>)
        else:
            print "BLOCKED!" # ok to ignore newData
The problem is that this successfully updates once, then after the second update succeeds, I am getting this error: QObject::startTimer: Timers cannot be started from another thread
From what I know and read about threads, the run() in the QThread is another thread, that might not know what has been happening in the main thread.
Debugging, it appears that the timer is still running, even though it is set to singleShot.
I will appreciate any suggestions :)
I resolved this by not using a timer.
What I ended up doing was:
class SomeClass(QtCore.QObject):
    ...
    DATA_EXPIRE_THRESHOLD = 3  # seconds
    ...
    def __init__(self):
        ...
        self._counter = 0 # Just for debugging
        self._dataExpireTime = None
        self._data = None

    def setNewData(self, new_data):
        self._counter += 1
        if self._dataExpireTime is None or self._isExpired():
            print "self._isExpired(): [%s], counter: [%s]" % (self._isExpired(), self._counter)
            self._data = new_data
            self._dataExpireTime = time.time() + self.DATA_EXPIRE_THRESHOLD

    def _isExpired(self):
        return time.time() >= self._dataExpireTime
With the threshold set to three seconds, the output is:
self._isExpired(): [True], counter: [1]
self._isExpired(): [True], counter: [2]
self._isExpired(): [True], counter: [6]
self._isExpired(): [True], counter: [15]
self._isExpired(): [True], counter: [17]
self._isExpired(): [True], counter: [18]
I also tried using a mutex, but it had a similar issue to the timer.
I would still appreciate an explanation, or advice on dealing with such issues.
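For what it is worth, the Qt message is most likely down to thread affinity: a QTimer belongs to the thread in which it was created and can only be started from that thread, while setNewData() here is called from the worker's run(). Comparing timestamps avoids the timer altogether. Since several threads may call setNewData() at nearly the same moment, the check and the write probably want a lock around them as well. A minimal sketch of that idea in Python 3, using only the standard library (the class name ThrottledStore, the method names and the injectable clock parameter are made up for illustration and are not part of the original code):

```python
import threading
import time


class ThrottledStore:
    """Keeps the first value offered and ignores later ones until it expires."""

    def __init__(self, expire_after=3.0, clock=time.monotonic):
        self._expire_after = expire_after  # seconds; hypothetical default
        self._clock = clock                # injectable so it can be faked in tests
        self._lock = threading.Lock()
        self._expires_at = None
        self._data = None

    def set_new_data(self, new_data):
        """Store new_data if the previous value has expired; return True if stored."""
        with self._lock:
            now = self._clock()
            if self._expires_at is not None and now < self._expires_at:
                return False  # still fresh, so new_data is dropped
            self._data = new_data
            self._expires_at = now + self._expire_after
            return True

    def data(self):
        with self._lock:
            return self._data
```

Because the object never touches Qt, it can be called from any thread; the lock only guarantees that two simultaneous callers cannot both pass the expiry check.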
I have a class (MyClass) which contains a queue (self.msg_queue) of actions that need to be run and I have multiple sources of input that can add tasks to the queue.
Right now I have three functions that I want to run concurrently:
MyClass.get_input_from_user()
Creates a window in tkinter that has the user fill out information and when the user presses submit it pushes that message onto the queue.
MyClass.get_input_from_server()
Checks the server for a message, reads the message, and then puts it onto the queue. This method uses functions from MyClass's parent class.
MyClass.execute_next_item_on_the_queue()
Pops a message off of the queue and then acts upon it. It is dependent on what the message is, but each message corresponds to some method in MyClass or its parent which gets run according to a big decision tree.
Process description:
After the class has joined the network, I have it spawn three threads (one for each of the above functions). Each threaded function adds items to the queue with the syntax "self.msg_queue.put(message)" and removes items from the queue with "self.msg_queue.get_nowait()".
Problem description:
The issue I am having is that it seems that each thread is modifying its own queue object (they are not sharing the queue, msg_queue, of the class of which they, the functions, are all members).
I am not familiar enough with multiprocessing to know which error messages are the important ones; however, it states that it cannot pickle a weakref object (it gives no indication of which object is the weakref object), and that within the queue.put() call the line "self._sem.acquire(block, timeout)" yields a "[WinError 5] Access is denied" error. Would it be safe to assume that this failure is caused by the queue's reference not being copied over properly?
[I am using Python 3.7.2 and the Multiprocessing package's Process and Queue]
[I have seen multiple Q/As about having threads shuttle information between classes--create a master harness that generates a queue and then pass that queue as an argument to each thread. If the functions didn't have to use other functions from MyClass I could see adapting this strategy by having those functions take in a queue and use a local variable rather than class variables.]
[I am fairly confident that this error is not the result of passing my queue to the tkinter object as my unit tests on how my GUI modifies its caller's queue work fine]
Below is a minimal reproducible example for the queue's error:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    def input_function_A(self):
        while True:
            self.my_q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self):
        while True:
            self.counter = 0
            self.my_q.put(self.counter)
            time.sleep(1)

    def output_function(self):
        while True:
            try:
                var = self.my_q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        process_A = Process(target=self.input_function_A)
        process_B = Process(target=self.input_function_B)
        process_C = Process(target=self.output_function)
        process_A.start()
        process_B.start()
        process_C.start()
        # without this it generates the WinError:
        # with this it still behaves as if the two input functions do not modify the queue
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
Indeed - these are not "threads", they are "processes". If you were using multithreading rather than multiprocessing, the self.my_q instance would be the same object, at the same place in memory, for every worker.
multiprocessing, however, runs each worker in a separate process (a fork on Unix, a freshly started interpreter on Windows), and any data in the original process (the one executing the "run" call) is duplicated when it is used there - so each subprocess sees its own "Queue" instance, unrelated to the others.
The correct way to have several processes share a multiprocessing.Queue object is to pass it as a parameter to the target methods. The simplest way to reorganize your code so that it works is thus:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    # The parameter is named "q" so that it does not shadow the "queue" module,
    # which is still needed for queue.Empty below.
    def input_function_A(self, q):
        while True:
            q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self, q):
        while True:
            self.counter = 0
            q.put(self.counter)
            time.sleep(1)

    def output_function(self, q):
        while True:
            try:
                var = q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        # Pass the Queue instance itself to every worker
        process_A = Process(target=self.input_function_A, args=(self.my_q,))
        process_B = Process(target=self.input_function_B, args=(self.my_q,))
        process_C = Process(target=self.output_function, args=(self.my_q,))
        process_A.start()
        process_B.start()
        process_C.start()
        # wait for the consumer process
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
As you can see, since your class is not actually sharing any data through the instance's attributes, this "class" design does not make much sense for your application, other than for grouping the different workers in the same code block.
It would be possible to have a magic multiprocess class with an internal method that actually starts the worker methods and shares the Queue instance - so if you have a lot of those in a project, there would be a lot less boilerplate.
Something along these lines:
from multiprocessing import Queue
from multiprocessing import Process
import queue  # only needed for the queue.Empty exception
import time

class MPWorkerBase:
    def __init__(self, *args, **kw):
        self.queue = None
        self.is_parent_process = False
        self.is_child_process = False
        self.processes = []
        # ensure this can be used as a collaborative mixin
        super().__init__(*args, **kw)

    def run(self):
        if self.is_parent_process or self.is_child_process:
            # workers already initialized
            return
        self.queue = Queue()
        processes = []
        cls = self.__class__
        for name in dir(cls):
            method = getattr(cls, name)
            if callable(method) and getattr(method, "_MP_worker", False):
                process = Process(target=self._start_worker, args=(self.queue, name))
                processes.append(process)
                process.start()
        # Setting these attributes here ensures the child processes have the initial values for them.
        self.is_parent_process = True
        self.processes = processes

    def _start_worker(self, queue, method_name):
        # this method is called in a new spawned process - attribute
        # changes here no longer reflect attributes on the
        # object in the initial process
        # overwrite queue in this process with the queue object sent over the wire:
        self.queue = queue
        self.is_child_process = True
        # call the worker method
        getattr(self, method_name)()

    def __del__(self):
        for process in self.processes:
            process.join()

def worker(func):
    """decorator to mark a method as a worker that should
    run in its own subprocess
    """
    func._MP_worker = True
    return func

class MyTest(MPWorkerBase):
    def __init__(self):
        super().__init__()
        self.counter = 0

    @worker
    def input_function_A(self):
        while True:
            self.queue.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    @worker
    def input_function_B(self):
        while True:
            self.counter = 0
            self.queue.put(self.counter)
            time.sleep(1)

    @worker
    def output_function(self):
        while True:
            try:
                var = self.queue.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

if __name__ == '__main__':
    test = MyTest()
    test.run()
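For comparison, if the workers only need to share a queue inside a single program, threads sidestep the copying issue, because every thread sees the same instance attributes. A rough Python 3 sketch under that assumption (the ThreadedTest class, the done event and the bounded producer loops are illustrative additions so that the example terminates; they are not from the question):

```python
import queue
import threading


class ThreadedTest:
    def __init__(self):
        self.my_q = queue.Queue()        # one object, seen by every thread
        self.done = threading.Event()    # set once both producers have finished

    def producer(self, start, count):
        for value in range(start, start + count):
            self.my_q.put(value)

    def consumer(self, results):
        # Keep draining until the producers are done and the queue is empty.
        while not (self.done.is_set() and self.my_q.empty()):
            try:
                results.append(self.my_q.get(timeout=0.05))
            except queue.Empty:
                pass

    def run(self):
        results = []
        producers = [
            threading.Thread(target=self.producer, args=(0, 5)),
            threading.Thread(target=self.producer, args=(100, 5)),
        ]
        consumer = threading.Thread(target=self.consumer, args=(results,))
        consumer.start()
        for t in producers:
            t.start()
        for t in producers:
            t.join()
        self.done.set()
        consumer.join()
        return results
```

The trade-off is that CPython threads do not execute Python bytecode in parallel, so this suits I/O-bound workers such as a GUI and a network poller better than CPU-heavy ones.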
I am trying to write a class to handle signals using the signal python module. The reason for having a class is to avoid the use of globals. This is the code I came up with, but unfortunately it is not working:
import signal
import constants

class SignalHandler (object):
    def __init__(self):
        self.counter = 0
        self.break_ = False
        self.vmeHandlerInstalled = False

    def setVmeHandler(self):
        self.vmeBufferFile = open('/dev/vme_shared_memory0', 'rb')
        self.vmeHandlerInstalled = True
        signal.signal(signal.SIGUSR1, self.traceHandler)
        signal.siginterrupt(signal.SIGUSR1, False)
        #...some other stuff...

    def setBreakHandler(self):
        signal.signal(signal.SIGINT, self.newBreakHandler)
        signal.siginterrupt(signal.SIGINT, False)

    def newBreakHandler(self, signum, frame):
        self.removeVMEHandler()
        self.break_ = True

    def traceHandler(self, signum, frame):
        self.counter += constants.Count

    def removeVMEHandler(self):
        if not self.vmeHandlerInstalled: return
        if self.vmeBufferFile is None: return
        signal.signal(signal.SIGUSR1, signal.SIG_DFL)
        self.vmeHandlerInstalled = False
In the main program I use this class in the following way:
def run():
    sigHandler = SignalHandler()
    sigHandler.setBreakHandler()
    sigHandler.setVmeHandler()
    while not sigHandler.break_:
        #....do some stuff
        if sigHandler.counter >= constants.Count:
            #...do some stuff
This solution is not working, as it appears that the handler for the signal.SIGUSR1 installed in the setVmeHandler method never gets called.
So my question is: is it possible to handle signal inside a class or shall I use globals?
To answer your question, I created the following simple code:
import signal
import time

class ABC(object):
    def setup(self):
        signal.signal(signal.SIGUSR1, self.catch)
        signal.siginterrupt(signal.SIGUSR1, False)

    def catch(self, signum, frame):
        print("xxxx", self, signum, frame)

abc = ABC()
abc.setup()
time.sleep(20)
If I run it:
python ./test.py
Then in another window send a USR1 signal:
kill -USR1 4357
The process prints the expected message:
('xxxx', <__main__.ABC object at 0x7fada09c6190>, 10, <frame object at 0x7fada0aaf050>)
So I think the answer is yes, it is possible to handle signals inside a class.
As for why your code doesn't work, sorry, I have no idea.
I had a similar problem to toti08's, concerning setVmeHandler(self), and found out that the handler must have matching parameters, i.e. (self, signum, frame).
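To make the point about bound methods concrete, here is a small self-contained Python 3 sketch that installs a method as the SIGUSR1 handler and then signals its own process, so no second terminal is needed. It assumes a POSIX system and that it runs in the main thread; SignalCounter and send_and_wait are names invented for this example.

```python
import os
import signal
import time


class SignalCounter:
    """Counts SIGUSR1 deliveries using a bound method as the handler."""

    def __init__(self):
        self.count = 0
        self._previous = None

    def install(self):
        # signal.signal() may only be called from the main thread.
        self._previous = signal.signal(signal.SIGUSR1, self._on_signal)

    def uninstall(self):
        previous = self._previous if self._previous is not None else signal.SIG_DFL
        signal.signal(signal.SIGUSR1, previous)

    def _on_signal(self, signum, frame):
        # self is already bound, so Python passes only (signum, frame).
        self.count += 1


def send_and_wait(counter, timeout=2.0):
    """Send SIGUSR1 to this process and wait until the handler has run."""
    before = counter.count
    os.kill(os.getpid(), signal.SIGUSR1)
    deadline = time.monotonic() + timeout
    while counter.count == before and time.monotonic() < deadline:
        time.sleep(0.01)
    return counter.count
```

The state lives on the instance, so no globals are needed; the handler itself still runs in the main thread between bytecode instructions, which is why the helper polls briefly rather than assuming the count has already changed.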
I'm trying to create my own threading class in Python 2.7. I want to be able to stop the thread with a method of my own class. Currently I have something like this:
import threading
import time

class loop(threading.Thread):
    def __init__(self, myvar):
        super(loop, self).__init__()
        self.terminate = False
        self.myvar = myvar

    def run(self):
        while not self.terminate:
            do.smthng.useful(self.myvar)

    def change(self, newvar):
        self.myvar = newvar #Doesnt work, in run() my old var is still being used

    def stoploop(self):
        self.terminate = True #Also not working

l = loop(1)
l.start()
time.sleep(1)
l.change(2) #thread still using "1"
time.sleep(1)
l.stoploop() #doesnt stop
I've read some posts here about this, but they weren't what I needed.
Any help would be appreciated.
Thank you.
EDIT:
As some of the commenters already stated, this part of the code does appear to work! The problem is in another place in my project. I've found it, but can't solve it. Maybe some of you could help.
So, my project uses the Apache Thrift library and the server is in Python.
Server.py:
loo = loop(0)
handler = ServHandler(loo)
processor = serv.Processor(handler)
transport = TSocket.TServerSocket('0.0.0.0', port=9090)
tfactory = TTransport.TBufferedTransportFactory()
pfactory = TBinaryProtocol.TBinaryProtocolFactory()
server = TProcessPoolServer.TProcessPoolServer(processor, transport, tfactory, pfactory)
print 'Starting the server...'
server.serve()
ServHandler.py:
class ServHandler:
    def __init__(self, loo):
        self.loo = loo

    def terminate(self): #Function that can be called remotely
        self.loo.stoploop() #Doesn't work
In the above case the thread isn't terminated and I don't know why. There's no error and the object exists, but self.terminate seems to be set somewhere else. The object id and the memory address appear to be the same, yet it behaves like a different object, although the loop __init__ function is called only once...
Below is an example in which the loop is terminated successfully.
ServHandler.py:
class ServHandler:
    def __init__(self, loo):
        self.loo = None

    def terminate(self): #Function that can be called remotely
        self.loo.stoploop() #Does work!!!!!!

    def create(self): #Function that can be called remotely
        self.loo = loop(0)
When I create the loop object remotely, I can terminate it remotely. But that doesn't suit my needs. The thread should be created before the Thrift server starts serving, and multiple users have to be able to change its variables, terminate it, etc. How can I achieve this?
Thank you!
Not an answer per se, but some useful debug code for the OP:
from time import sleep
from threading import Thread

class loop(Thread):
    def __init__(self, myvar):
        Thread.__init__(self)
        self.terminate = False
        self.myvar = myvar

    def run(self):
        while self.terminate is False:
            print('Run says myvar is:',self.myvar)
            sleep(0.5)

    def change(self, newvar):
        self.myvar = newvar

    def stoploop(self):
        self.terminate = True

l = loop(1)
l.start()
sleep(1)
l.change(2)
sleep(1)
l.stoploop()
print('Final product:',l.myvar)
sleep(2)
print('Is the thread alive:',l.is_alive())
I tried your code with some debugging prints, and it works.
The code above produced:
[torxed@archie ~]$ python test.py
Run says myvar is: 1
Run says myvar is: 1
Run says myvar is: 2 <-- Proves that change() does change `myvar`
Run says myvar is: 2
Final product: 2 <-- Also the global scope knows about the change
Is the thread alive: False <-- And the thread got terminated as intended
However, these are not bulletproof ideas when fetching data or dealing with thread return values, for a number of reasons (even though I use this method myself from time to time). You should consider using Thread.join, in combination with l.stoploop(), like so:
l = loop(1)
l.start()
l.change(2)
l.stoploop()
l.join()  # blocks until run() has returned
Also, when updating data you should acquire locks on it so collisions don't occur; have a look at semaphore objects.
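A minimal Python 3 sketch of the join-and-lock advice above. It swaps the plain boolean for a threading.Event and guards the shared variable with a threading.Lock; the class name StoppableLoop, the attribute names and the seen list are assumptions made for illustration.

```python
import threading
import time


class StoppableLoop(threading.Thread):
    def __init__(self, myvar):
        super().__init__()
        self.stop_event = threading.Event()  # replaces the bare "terminate" flag
        self.var_lock = threading.Lock()     # guards reads and writes of myvar
        self.myvar = myvar
        self.seen = []                       # values observed by run(), for inspection

    def run(self):
        while not self.stop_event.is_set():
            with self.var_lock:
                current = self.myvar
            self.seen.append(current)
            # wait() doubles as a sleep that ends early when stop is requested
            self.stop_event.wait(0.01)

    def change(self, newvar):
        with self.var_lock:
            self.myvar = newvar

    def stoploop(self):
        self.stop_event.set()
```

Typical use is start(), then change() or stoploop() from any other thread, followed by join() to wait until run() has actually returned.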
Is this what you need?
import threading
import time

class Worker(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.state = threading.Condition()
        self.variable = 10
        self.paused = False

    def run(self):
        while True:
            with self.state:
                if self.paused:
                    self.state.wait()  # block until resume() notifies
            self.do_stuff()

    def do_stuff(self):
        time.sleep(.1)
        print self.variable

    def resume(self):
        with self.state:
            self.paused = False
            self.state.notify()

    def pause(self):
        with self.state:
            self.paused = True

loop = Worker()
loop.start()
time.sleep(1)
loop.pause()
loop.variable = 11
print 'CHANGED!'
loop.resume()
time.sleep(1)
I have a generator that takes a long time for each iteration to run. Is there a standard way to have it yield a value, then generate the next value while waiting to be called again?
The generator would be called each time a button is pressed in a gui and the user would be expected to consider the result after each button press.
EDIT: a workaround might be:
def initialize():
    global res
    res = gen.next()

def btn_callback():
    global res
    display(res)
    res = gen.next()
    if not res:
        return
If I wanted to do something like your workaround, I'd write a class like this:
class PrefetchedGenerator(object):
    def __init__(self, generator):
        self._data = generator.next()
        self._generator = generator
        self._ready = True

    def next(self):
        if not self._ready:
            self.prefetch()
        self._ready = False
        return self._data

    def prefetch(self):
        if not self._ready:
            self._data = self._generator.next()
            self._ready = True
It is more complicated than your version, because I made it so that it handles not calling prefetch or calling prefetch too many times. The basic idea is that you call .next() when you want the next item. You call prefetch when you have "time" to kill.
Your other option is a thread:
import Queue
import threading

class BackgroundGenerator(threading.Thread):
    def __init__(self, generator):
        threading.Thread.__init__(self)
        self.queue = Queue.Queue(1)
        self.generator = generator
        self.daemon = True
        self.start()

    def run(self):
        for item in self.generator:
            self.queue.put(item)
        self.queue.put(None)

    def next(self):
        next_item = self.queue.get()
        if next_item is None:
            raise StopIteration
        return next_item
This will run separately from your main application. Your GUI should remain responsive no matter how long it takes to fetch each iteration.
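A Python 3 take on the same thread-based idea, offered as a sketch rather than a drop-in replacement. It uses a private sentinel object instead of None, so that None can itself be a generated value, and implements __next__ so the object works with for loops and next(). The class name BackgroundIterator and the ahead parameter are made up for this example.

```python
import queue
import threading


class BackgroundIterator:
    """Runs an iterable in a daemon thread, keeping up to `ahead` items ready."""

    _DONE = object()  # sentinel, so None remains a legal item

    def __init__(self, iterable, ahead=1):
        self._queue = queue.Queue(maxsize=ahead)
        self._thread = threading.Thread(
            target=self._fill, args=(iterable,), daemon=True
        )
        self._thread.start()

    def _fill(self, iterable):
        for item in iterable:
            self._queue.put(item)   # blocks while the buffer is full
        self._queue.put(self._DONE)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()
        if item is self._DONE:
            # Re-queue the sentinel so repeated next() calls keep stopping.
            self._queue.put(self._DONE)
            raise StopIteration
        return item
```

Note that an exception raised inside the wrapped generator is not forwarded to the consumer in this sketch; the consumer would simply block, so real code would want to catch it in _fill and pass it along.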
No. A generator is not asynchronous. This isn't multiprocessing.
If you want to avoid waiting for the calculation, you should use the multiprocessing package so that an independent process can do your expensive calculation.
You want a separate process which is calculating and enqueueing results.
Your "generator" can then simply dequeue the available results.
You can definitely do this with generators. Just write your generator so that each next call alternates between computing the next value and returning it, by putting in multiple yield statements. Here is an example:
import itertools, time

def quick_gen():
    counter = itertools.count().next
    def long_running_func():
        time.sleep(2)
        return counter()
    while True:
        x = long_running_func()
        yield
        yield x
>>> itr = quick_gen()
>>> itr.next() # setup call, takes two seconds
>>> itr.next() # returns immediately
0
>>> itr.next() # setup call, takes two seconds
>>> itr.next() # returns immediately
1
Note that the generator does not automatically do the processing to get the next value; it is up to the caller to call next twice for each value. For your use case you would call next once as a setup step, and then each time the user clicks the button you would display the next value generated, then call next again for the pre-fetch.
I was after something similar. I wanted yield to quickly return a value (if it could) while a background thread processed the next one.
import Queue
import time
import threading

class MyGen():
    def __init__(self):
        self.queue = Queue.Queue()
        # Put a first element into the queue, and initialize our thread
        self.i = 1
        self.t = threading.Thread(target=self.worker, args=(self.queue, self.i))
        self.t.start()

    def __iter__(self):
        return self

    def worker(self, queue, i):
        time.sleep(1) # Take a while to process
        queue.put(i**2)

    def __del__(self):
        self.stop()

    def stop(self):
        while True: # Flush the queue
            try:
                self.queue.get(False)
            except Queue.Empty:
                break
        self.t.join()

    def next(self):
        # Start a thread to compute the next next.
        self.t.join()
        self.i += 1
        self.t = threading.Thread(target=self.worker, args=(self.queue, self.i))
        self.t.start()
        # Now deliver the already-queued element
        while True:
            try:
                print "request at", time.time()
                obj = self.queue.get(False)
                self.queue.task_done()
                return obj
            except Queue.Empty:
                pass
            time.sleep(.001)

if __name__ == '__main__':
    f = MyGen()
    for i in range(5):
        # time.sleep(2) # Comment out to get items as they are ready
        print "*********"
        print f.next()
        print "returned at", time.time()
The code above gave the following results:
*********
request at 1342462505.96
1
returned at 1342462505.96
*********
request at 1342462506.96
4
returned at 1342462506.96
*********
request at 1342462507.96
9
returned at 1342462507.96
*********
request at 1342462508.96
16
returned at 1342462508.96
*********
request at 1342462509.96
25
returned at 1342462509.96
I was reading this question (which you do not have to read because I will copy what is there... I just wanted to show you my inspiration)...
So, if I have a class that counts how many instances were created:
class Foo(object):
    instance_count = 0

    def __init__(self):
        Foo.instance_count += 1
My question is, if I create Foo objects in multiple threads, is instance_count going to be correct? Are class variables safe to modify from multiple threads?
It's not threadsafe even on CPython. Try this to see for yourself:
import threading

class Foo(object):
    instance_count = 0

def inc_by(n):
    for i in xrange(n):
        Foo.instance_count += 1

threads = [threading.Thread(target=inc_by, args=(100000,)) for thread_nr in xrange(100)]
for thread in threads: thread.start()
for thread in threads: thread.join()

print(Foo.instance_count) # Expected 10M for threadsafe ops, I get around 5M
The reason is that while INPLACE_ADD is atomic under the GIL, the attribute is still loaded and stored in separate steps (see dis.dis(Foo.__init__)). Use a lock to serialize access to the class variable:
Foo.lock = threading.Lock()

def interlocked_inc(n):
    for i in xrange(n):
        with Foo.lock:
            Foo.instance_count += 1

threads = [threading.Thread(target=interlocked_inc, args=(100000,)) for thread_nr in xrange(100)]
for thread in threads: thread.start()
for thread in threads: thread.join()

print(Foo.instance_count)
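To see the separate load and store steps mentioned above without reading a full disassembly, something like the following Python 3 sketch can help. The helper name attribute_ops is invented here, and the exact instruction listing differs between CPython versions, so treat the output as illustrative.

```python
import dis


class Foo(object):
    instance_count = 0

    def __init__(self):
        Foo.instance_count += 1


def attribute_ops(func):
    """Return the attribute load/store opcode names used by func, in order."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname in ("LOAD_ATTR", "STORE_ATTR")
    ]


print(attribute_ops(Foo.__init__))
```

The attribute is read by one instruction and written back by a later one; a thread switch between the two is what lets another thread's increment get overwritten.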
No, it is not thread safe. I faced a similar problem a few days ago, and chose to implement the lock with a decorator. The benefit is that it keeps the code readable:
import threading

def threadsafe_function(fn):
    """decorator making sure that the decorated function is thread safe"""
    lock = threading.Lock()
    def new(*args, **kwargs):
        lock.acquire()
        try:
            r = fn(*args, **kwargs)
        finally:
            lock.release()
        return r
    return new

class X:
    var = 0

    @threadsafe_function
    def inc_var(self):
        X.var += 1
        return X.var
Following on from luc's answer, here's a simplified decorator using a with context manager and a little __main__ code to spin up the test. Try it with and without the @synchronized decorator to see the difference.
import concurrent.futures
import functools
import logging
import threading

def synchronized(function):
    lock = threading.Lock()
    @functools.wraps(function)
    def wrapper(self, *args, **kwargs):
        with lock:
            return function(self, *args, **kwargs)
    return wrapper

class Foo:
    counter = 0

    @synchronized
    def increase(self):
        Foo.counter += 1

if __name__ == "__main__":
    foo = Foo()
    print(f"Start value is {foo.counter}")
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for index in range(200000):
            executor.submit(foo.increase)
    print(f"End value is {foo.counter}")
Without @synchronized
End value is 198124
End value is 196827
End value is 197968
With @synchronized
End value is 200000
End value is 200000
End value is 200000
Is modifying a class variable in python threadsafe?
It depends on the operation.
While the Python GIL (Global Interpreter Lock) lets only one thread run at a time for each atomic operation, some operations are not atomic; that is, they are implemented with more than one operation. Examples (L, L1, L2 are lists; D, D1, D2 are dicts; x, y are objects; i, j are ints):
i = i+1
L.append(L[-1])
L[i] = L[j]
D[x] = D[x] + 1
See What kinds of global value mutation are thread-safe?
Your example is among the non-safe operations, as i += 1 is shorthand for i = i + 1.
Other posters have shown how to make the operation thread-safe. An alternative thread-safe way to implement your operation, without using a thread locking mechanism would be to reference a different variable, only set via an atomic operation. For example
max_reached = False
# in one thread
count = 0
maximum = 100
count += 1
if count >= maximum:
    max_reached = True
# in another thread
while not max_reached:
    time.sleep(1)
# do something
This would be thread safe, as long as only one thread increments the count.
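A runnable Python 3 sketch of the same single-writer idea, with threading.Event swapped in for the bare boolean so that the waiting thread can block instead of polling with sleep. The function names count_up and wait_for_max and the result list are invented for the example.

```python
import threading

max_reached = threading.Event()
result = []


def count_up(maximum):
    # The only thread that ever touches `count`, so no lock is needed for it.
    count = 0
    while count < maximum:
        count += 1
    max_reached.set()


def wait_for_max():
    # Blocks until the counting thread signals that the maximum was reached.
    max_reached.wait()
    result.append("done")


waiter = threading.Thread(target=wait_for_max)
counter = threading.Thread(target=count_up, args=(100,))
waiter.start()
counter.start()
counter.join()
waiter.join()
```

As in the original snippet, this only stays safe while a single thread increments the count; with several incrementing threads a lock would be needed again.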
I would say it is thread-safe, at least on the CPython implementation. The GIL will make all your "threads" run sequentially, so they will not be able to mess with your reference count.