Okay, suppose I've got a working class that inherits from Thread:
from threading import Thread
import time

class DoStuffClass(Thread):
    def __init__(self, queue):
        self.queue = queue
        self.isstart = False

    def startthread(self, isstart):
        self.isstart = isstart
        if isstart:
            Thread.__init__(self)
        else:
            print 'Thread not started!'

    def run(self):
        while self.isstart:
            time.sleep(1)
            if self.queue.full():
                y = self.queue.get()  # y goes nowhere; it's just to free up the queue
            self.queue.put('stream data')
I've tried calling it from another file and it works:
from Queue import Queue
import dostuff

q = Queue(maxsize=1)
letsdostuff = dostuff.DoStuffClass(q)
letsdostuff.startthread(True)
letsdostuff.start()

val = ''
i = 0
while True:
    val = q.get()
    print "Outputting: %s" % val
Right now, I can get the class's output through the queue.
My question: suppose I want to create another class (ProcessStuff) that inherits from DoStuffClass so that I can grab the output of DoStuffClass through a queue object (or any other method), process it, and pass it to ProcessStuff's queue, so that code calling ProcessStuff can get the value through its queue. How do I do that?
It sounds like you don't really want ProcessStuff to inherit from DoStuffClass; instead, you want ProcessStuff to consume from the DoStuffClass queue internally. So rather than using inheritance, have ProcessStuff keep a reference to a DoStuffClass instance internally, along with an internal Queue object to receive the values that DoStuffClass produces:
from threading import Thread
from Queue import Queue
import dostuff

class ProcessStuff(Thread):
    def __init__(self, queue):
        super(ProcessStuff, self).__init__()
        self.queue = queue
        self._do_queue = Queue()  # internal Queue for DoStuffClass
        self._do_stuff = dostuff.DoStuffClass(self._do_queue)

    def run(self):
        self._do_stuff.startthread(True)
        self._do_stuff.start()
        while True:
            val = self._do_queue.get()  # grab value from DoStuffClass
            # process it
            processed_val = "processed {}".format(val)
            self.queue.put(processed_val)
q = Queue(maxsize=1)
letsprocessstuff = ProcessStuff(q)
letsprocessstuff.start()

while True:
    val = q.get()
    print "Outputting: %s" % val
Output:
Outputting: processed stream data
Outputting: processed stream data
Outputting: processed stream data
Outputting: processed stream data
I have a class (MyClass) which contains a queue (self.msg_queue) of actions that need to be run, and I have multiple sources of input that can add tasks to the queue.
Right now I have three functions that I want to run concurrently:
MyClass.get_input_from_user()
Creates a window in tkinter that has the user fill out information; when the user presses submit, it pushes that message onto the queue.
MyClass.get_input_from_server()
Checks the server for a message, reads the message, and then puts it onto the queue. This method uses functions from MyClass's parent class.
MyClass.execute_next_item_on_the_queue()
Pops a message off of the queue and then acts upon it. What it does depends on the message, but each message corresponds to some method in MyClass or its parent, which gets run according to a big decision tree.
Process description:
After the class has joined the network, I have it spawn three threads (one for each of the above functions). Each threaded function adds items to the queue with the syntax "self.msg_queue.put(message)" and removes items from the queue with "self.msg_queue.get_nowait()".
Problem description:
The issue I am having is that each thread seems to be modifying its own queue object (they are not sharing msg_queue, the queue of the class of which the functions are all members).
I am not familiar enough with multiprocessing to know which error messages matter; however, it states that it cannot pickle a weakref object (with no indication of which object is the weakref), and that within the queue.put() call the line "self._sem.acquire(block, timeout)" yields a "[WinError 5] Access is denied" error. Would it be safe to assume that this failure is the result of the queue's reference not copying over properly?
[I am using Python 3.7.2 and the multiprocessing package's Process and Queue.]
[I have seen multiple Q/As about having threads shuttle information between classes--create a master harness that generates a queue and then pass that queue as an argument to each thread. If the functions didn't have to use other functions from MyClass I could see adapting this strategy by having those functions take in a queue and use a local variable rather than class variables.]
[I am fairly confident that this error is not the result of passing my queue to the tkinter object as my unit tests on how my GUI modifies its caller's queue work fine]
Below is a minimal reproducible example for the queue's error:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    def input_function_A(self):
        while True:
            self.my_q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self):
        while True:
            self.counter = 0
            self.my_q.put(self.counter)
            time.sleep(1)

    def output_function(self):
        while True:
            try:
                var = self.my_q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        process_A = Process(target=self.input_function_A)
        process_B = Process(target=self.input_function_B)
        process_C = Process(target=self.output_function)

        process_A.start()
        process_B.start()
        process_C.start()

        # without this it generates the WinError;
        # with this it still behaves as if the two input functions do not modify the queue
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
Indeed - these are not "threads", these are "processes". If you were using multithreading rather than multiprocessing, the self.my_q instance would be the same object, placed at the same memory location on the computer.
Multiprocessing instead spawns a new process (or forks it, on POSIX systems), and any data in the original process (the one executing the "run" call) is duplicated or re-created when it is used - so each subprocess sees its own "Queue" instance, unrelated to the others.
The correct way to have various processes share a multiprocessing.Queue object is to pass it as a parameter to the target methods. The simplest way to reorganize your code so that it works is thus:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    # note: the parameter is named q so it does not shadow the
    # queue module, which is still needed for queue.Empty
    def input_function_A(self, q):
        while True:
            q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self, q):
        while True:
            self.counter = 0
            q.put(self.counter)
            time.sleep(1)

    def output_function(self, q):
        while True:
            try:
                var = q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        # pass the one Queue instance to every child process
        process_A = Process(target=self.input_function_A, args=(self.my_q,))
        process_B = Process(target=self.input_function_B, args=(self.my_q,))
        process_C = Process(target=self.output_function, args=(self.my_q,))

        process_A.start()
        process_B.start()
        process_C.start()

        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
As you can see, since your class is not actually sharing any data through the instance's attributes, this "class" design does not make much sense for your application - except for grouping the different workers in the same code block.
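For comparison, here is a minimal sketch (my own, not from the original answer) of the same three workers written as plain module-level functions sharing one explicitly passed Queue; the counter simply becomes a local variable, since it was never actually shared between processes anyway, and module-level functions also sidestep the pickling issues on Windows because only the Queue has to be transferred:

from multiprocessing import Process, Queue
import queue
import time

def input_function_A(q):
    counter = 0  # local state; each process had its own copy anyway
    while True:
        q.put(counter)
        counter += 1
        time.sleep(0.2)

def input_function_B(q):
    while True:
        q.put(0)
        time.sleep(1)

def output_function(q):
    while True:
        try:
            var = q.get_nowait()
        except queue.Empty:
            var = -1
        print(var)
        time.sleep(1)

if __name__ == '__main__':
    my_q = Queue()
    workers = [Process(target=f, args=(my_q,))
               for f in (input_function_A, input_function_B, output_function)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()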
It would be possible to have a magic multiprocess class with an internal method that actually starts the worker methods and shares the Queue instance - so if you have a lot of these in a project, there would be a lot less boilerplate.
Something along these lines:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MPWorkerBase:
    def __init__(self, *args, **kw):
        self.queue = None
        self.is_parent_process = False
        self.is_child_process = False
        self.processes = []
        # ensure this can be used as a collaborative mixin
        super().__init__(*args, **kw)

    def run(self):
        if self.is_parent_process or self.is_child_process:
            # workers already initialized
            return
        self.queue = Queue()
        processes = []
        cls = self.__class__
        for name in dir(cls):
            method = getattr(cls, name)
            if callable(method) and getattr(method, "_MP_worker", False):
                process = Process(target=self._start_worker, args=(self.queue, name))
                processes.append(process)
                process.start()
        # Setting these attributes only after the loop ensures the child
        # processes were spawned with the initial values for them.
        self.is_parent_process = True
        self.processes = processes

    def _start_worker(self, queue, method_name):
        # this method is called in a new spawned process - attribute
        # changes here no longer reflect attributes on the
        # object in the initial process.
        # overwrite queue in this process with the queue object sent over the wire:
        self.queue = queue
        self.is_child_process = True
        # call the worker method
        getattr(self, method_name)()

    def __del__(self):
        for process in self.processes:
            process.join()

def worker(func):
    """decorator to mark a method as a worker that should
    run in its own subprocess
    """
    func._MP_worker = True
    return func

class MyTest(MPWorkerBase):
    def __init__(self):
        super().__init__()
        self.counter = 0

    @worker
    def input_function_A(self):
        while True:
            self.queue.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    @worker
    def input_function_B(self):
        while True:
            self.counter = 0
            self.queue.put(self.counter)
            time.sleep(1)

    @worker
    def output_function(self):
        while True:
            try:
                var = self.queue.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

if __name__ == '__main__':
    test = MyTest()
    test.run()
I am currently working on making my program use multiprocessing.Process, and I want to get an object back from my Process subclass.
Inside main.py:
p = DataProcessor()
p.start()
# later:
obj = p.x
Inside data_processor.py:
from multiprocessing import Process

class DataProcessor(Process):
    def __init__(self):
        # call to super etc.
        self.x = None

    def run(self):
        while True:
            if self.x is None:
                self.x = 5  # normally I set this to an object
When I then want to use x in my main, it is always None.
How can I get this to work without having to use a multiprocessing.Queue?
(In my opinion, queues are neither readable nor useful when dealing with only a single object, used once.)
You can use multiprocessing Pipe ;)
But seriously, you have to use inter-process communication to share data between processes. Here is a simple example that retrieves x from the child process.
from multiprocessing import Process, Queue
import time

class DataProcessor(Process):
    def __init__(self, queue):
        # call to super etc.
        Process.__init__(self)
        self.queue = queue
        self.x = None

    def run(self):
        while True:
            if self.x is None:
                self.x = 5  # normally I set this to an object
                self.queue.put(self.x)

if __name__ == '__main__':
    queue = Queue()
    p = DataProcessor(queue)
    p.start()

    # later:
    while queue.empty():
        time.sleep(.1)
    x = queue.get()
    print(x)
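Since Pipe was mentioned: here is a minimal sketch (my own, not part of the original answer) of the same exchange using multiprocessing.Pipe, which fits the "one object, once" case without a queue:

from multiprocessing import Process, Pipe

class DataProcessor(Process):
    def __init__(self, conn):
        Process.__init__(self)
        self.conn = conn  # child's end of the pipe
        self.x = None

    def run(self):
        self.x = 5  # normally an object
        self.conn.send(self.x)  # hand the value back to the parent
        self.conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = DataProcessor(child_conn)
    p.start()
    x = parent_conn.recv()  # blocks until the child sends
    print(x)
    p.join()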
I have a list of strings in a Queue:
queue = ["First","Second","Third","Fourth","",etc]
Snippet of the code:
import threading
import Queue
import time

class MainThread(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        Details = self.queue.get()
        Trail = Details
        #..........#
        # Now mark if Trail can be used again or not
        self.queue.put(Trail)  # Now put them back no matter what
        self.queue.task_done()

queue = Queue.Queue(maxsize=0)
while not #...........#
    for i in range(TotalThreads):
        try:
            t = MainThread(queue)
            t.setDaemon(False)
            t.start()
        except:
            time.sleep(5)
My question: if the string "First" from the queue can't be used again, how can I mark it for a specific time, so the thread can continue grabbing others and ignore it until that time is over?
Edit:
Currently, the solution I have is:
queue = [["First",1],["Second",1],["Third",1],["Fourth",1],etc]

class MainThread(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        Details = self.queue.get()
        Trail, State = Details
        if State == 1:
            #................#
            if #.............#:
                State = 0
        self.queue.put((Trail, State))  # Now put them back no matter what
        self.queue.task_done()

queue = Queue.Queue(maxsize=0)
while not #...........#
    for i in range(TotalThreads):
        try:
            t = MainThread(queue)
            t.setDaemon(False)
            t.start()
        except:
            time.sleep(5)
However, this only works if I want to disable them permanently; I need a solution where I can disable them for a specific time.
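One way to get a timed disable (a sketch of my own; do_work and the 60-second delay are illustrative placeholders, not from the thread above): replace the 0/1 state with a "retry at" timestamp, and have workers re-queue items whose time hasn't come yet without touching them:

import time
import Queue
import threading

RETRY_DELAY = 60  # seconds an item stays disabled (illustrative value)

def do_work(trail):
    # placeholder for the elided #...# work; returns True if the item failed
    return trail == "First"

class MainThread(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        Trail, retry_at = self.queue.get()
        if time.time() < retry_at:
            # still disabled: put it back untouched and move on
            self.queue.put((Trail, retry_at))
        else:
            failed = do_work(Trail)
            # re-enable immediately on success, or after RETRY_DELAY on failure
            retry_at = time.time() + RETRY_DELAY if failed else 0
            self.queue.put((Trail, retry_at))
        self.queue.task_done()

queue = Queue.Queue(maxsize=0)
for trail in ["First", "Second", "Third", "Fourth"]:
    queue.put((trail, 0))  # 0 means usable immediately
for i in range(4):
    MainThread(queue).start()

If every remaining item happens to be disabled, the threads will spin re-queuing them, so in practice you might sleep briefly before putting an item back, or keep a separate time-ordered structure such as a heapq.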
Below is the code that I have that downloads various URLs, each in a separate thread. I was attempting to make some changes before implementing the thread pool, but with this change the queue comes up empty and the download does not begin.
import Queue
import urllib2
import os
import utils as _fdUtils
import signal
import sys
import time
import threading

class ThreadedFetch(threading.Thread):
    """ docstring for ThreadedFetch
    """
    def __init__(self, queue, out_queue):
        super(ThreadedFetch, self).__init__()
        self.queueItems = queue.get()
        self.__url = self.queueItems[0]
        self.__saveTo = self.queueItems[1]
        self.outQueue = out_queue

    def run(self):
        fileName = self.__url.split('/')[-1]
        path = os.path.join(DESKTOP_PATH, fileName)
        file_size = int(_fdUtils.getUrlSizeInBytes(self.__url))
        while not STOP_REQUEST.isSet():
            urlFh = urllib2.urlopen(self.__url)
            _log.info("Download: %s", fileName)
            with open(path, 'wb') as fh:
                file_size_dl = 0
                block_sz = 8192
                while True:
                    buffer = urlFh.read(block_sz)
                    if not buffer:
                        break
                    file_size_dl += len(buffer)
                    fh.write(buffer)
                    status = r"%10d [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
                    status = status + chr(8) * (len(status) + 1)
                    sys.stdout.write('%s\r' % status)
                    time.sleep(.05)
                    sys.stdout.flush()
            if file_size_dl == file_size:
                _log.info("Download Completed %s%% for file %s, saved to %s",
                          file_size_dl * 100. / file_size, fileName, DESKTOP_PATH)
Below is the main function that creates the threads and populates the queue.
def main(appName):
    args = _fdUtils.getParser()
    urls_saveTo = {}
    # spawn a pool of threads, and pass them queue instance
    # each url will be downloaded concurrently
    for i in range(len(args.urls)):
        t = ThreadedFetch(queue, out_queue)
        t.daemon = True
        t.start()
    try:
        for url in args.urls:
            urls_saveTo[url] = args.saveTo
        # urls_saveTo = {urls[0]: args.saveTo, urls[1]: args.saveTo, urls[2]: args.saveTo}
        # populate queue with data
        for item, value in urls_saveTo.iteritems():
            queue.put([item, value])
        # wait on the queue until everything has been processed
        queue.join()
        print '*** Done'
    except (KeyboardInterrupt, SystemExit):
        lgr.critical('! Received keyboard interrupt, quitting threads.')
You create the queue and then the first thread, which immediately tries to fetch an item from the still-empty queue. The ThreadedFetch.__init__() method isn't run asynchronously - only the run() method is run in a new thread, when you call start() on the thread object.
Store the queue in __init__() and move the get() into the run() method. That way you can create all the threads and they block in their own threads, giving you the chance to put items into the queue from the main thread.
class ThreadedFetch(threading.Thread):
    def __init__(self, queue, out_queue):
        super(ThreadedFetch, self).__init__()
        self.queue = queue
        self.outQueue = out_queue

    def run(self):
        url, save_to = self.queue.get()
        # ...
For this example the queue is unnecessary, by the way, as every thread gets exactly one item from it. You could pass that item directly to the thread when creating the thread object:
class ThreadedFetch(threading.Thread):
    def __init__(self, url, save_to, out_queue):
        super(ThreadedFetch, self).__init__()
        self.url = url
        self.save_to = save_to
        self.outQueue = out_queue

    def run(self):
        # ...
And when the ThreadedFetch class really just consists of the __init__() and run() methods, you may consider moving the run() method into a function and starting that asynchronously:
def fetch(url, save_to, out_queue):
    # ...

# ...

def main():
    # ...
    thread = Thread(target=fetch, args=(url, save_to, out_queue))
    thread.daemon = True
    thread.start()
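For completeness, here is a self-contained sketch of that function-based variant (my own filling-in of the elided bodies; the progress reporting and error handling of the original are omitted):

import os
import Queue
import urllib2
import threading

def fetch(url, save_to, out_queue):
    # download one URL to disk and report the result on out_queue
    data = urllib2.urlopen(url).read()
    with open(save_to, 'wb') as fh:
        fh.write(data)
    out_queue.put((url, len(data)))

def main(urls, save_dir):
    out_queue = Queue.Queue()
    threads = []
    for url in urls:
        save_to = os.path.join(save_dir, url.split('/')[-1])
        t = threading.Thread(target=fetch, args=(url, save_to, out_queue))
        t.daemon = True
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    while not out_queue.empty():
        print out_queue.get()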
I'm having a bit of trouble with this queue:
import Queue
import threading

class test(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.request_queue = Queue.Queue()

    def addtoqueue(self, item):
        self.request_queue.put(item)

    def run(self):
        while True:
            item = self.request_queue.get(True)
            print item
This simple class implements a threaded queue. Calling test.addtoqueue will append an item to the queue. The thread waits for an item to be added to the queue, immediately prints it, and then waits for the next thing.
My problem is application shutdown. What is the best way to terminate the thread? I could use a Condition, but how could I wait for either a notification from the Condition or a new item in the queue?
You can send some poison to the thread to kill it:
poison = None  # something you wouldn't normally put in the Queue

class test(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.request_queue = Queue.Queue()

    def kill(self):
        self.addtoqueue(poison)

    def addtoqueue(self, item):
        self.request_queue.put(item)

    def run(self):
        while True:
            item = self.request_queue.get(True)
            if item is poison:
                # do stuff
                return  # end thread
            print item
I'd alter the condition in your while loop so that it checks a local variable, and add a kill switch to allow an external process to shut the thread down. You should probably extend kill_me to dispose of the object and its Queue in a nice way (e.g. if you want to store the Queue for the next time it's run).
Edit: I've also added a has_finished variable so that kill_me blocks the main process thread. This should allow the thread to exit before handing back to the main flow.
I may have overcomplicated things ;)
class test(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.request_queue = Queue.Queue()
        self.is_running = True
        self.has_finished = False

    def addtoqueue(self, item):
        self.request_queue.put(item)

    def kill_me(self):
        self.is_running = False
        while not self.has_finished:
            pass

    def run(self):
        while self.is_running:
            try:
                # use a timeout so the loop notices is_running changing
                # even when the queue stays empty
                item = self.request_queue.get(True, 1)
            except Queue.Empty:
                continue
            print item
        self.has_finished = True
Do The Simplest Thing That Could Possibly Work - which, in this case, might be a sentinel. And although the threading module was inspired by Java's threading library, in Python the simplest thing is not to do things Java-style and inherit from threading.Thread, but to pass a function and its arguments to threading.Thread():
from threading import Thread
import Queue

DONE = object()  # sentinel

def run(queue):
    while True:
        item = queue.get()
        queue.task_done()
        if item is DONE:
            break
        print item

request_queue = Queue.Queue()
some_thread = Thread(target=run, args=(request_queue,))
some_thread.start()

request_queue.put('hey')
request_queue.put('joe')
request_queue.put(DONE)
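If the caller then needs to block until the worker has drained the queue and exited, the natural follow-up (a usage sketch continuing the snippet above) is:

request_queue.join()  # returns once task_done() has run for every item
some_thread.join()    # returns once run() breaks out on the sentinel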