Create a new log thread or use a daemon thread?

I am new to multi-threaded programming. I want to use a thread to write to a log file every 2 seconds. I have two solutions, but I don't know which one is better.
First:
def logger(msg):
    if msg is not None:
        logging.info(msg)

def main():
    last = time.time()
    while True:
        msg = get_msg_from_somewhere()
        current = time.time()
        if current - last > 2:
            t1 = threading.Thread(target=logger, args=(msg,))
            t1.start()
            last = current
Second:
msg = None

def logger():
    global msg
    while True:
        if msg is not None:
            logging.info(msg)
            msg = None
        time.sleep(2)

def main():
    t1 = threading.Thread(target=logger)
    t1.setDaemon(True)
    t1.start()
    while True:
        update_msg_from_somewhere()
My thoughts:
I prefer the second solution, because it doesn't have to compare timestamps all the time or create an endless stream of new threads (though they are destroyed after they finish, right?). But I think the way I pass the msg is not ideal (through a global variable).
Do you have any ideas on how to pass variables to the daemon thread while it's running? And which solution do you prefer, and why?
Thanks a lot!

There are two questions here.
The first is whether or not to use a daemon thread. That depends on your requirements: if you can accept the thread terminating abruptly, i.e. there is no cleanup to do, then a daemon thread is convenient.
The second is how to pass messages in. This is a classic producer/consumer problem, and the better structure is a queue:
import logging
import threading
import time
from queue import Queue

def logger(q):
    for msg in iter(q.get, None):
        logging.info(msg)

def main():
    q = Queue()
    t1 = threading.Thread(target=logger, args=(q,))
    t1.setDaemon(True)
    t1.start()
    while True:
        q.put(get_msg_from_somewhere())
        time.sleep(2)
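A nice property of the iter(q.get, None) idiom is that it also gives the logger a clean shutdown path: the loop ends as soon as it sees the None sentinel. So if you ever want to stop it deliberately rather than relying on the daemon flag, a two-line sketch:

q.put(None)  # the sentinel ends the for-loop in logger()
t1.join()    # wait for the logger to drain and finish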

Why does my multiprocess queue not appear to be thread safe?

I am building a watchdog timer that runs another Python program, and if it fails to find a check-in from any of the threads, shuts down the whole program. This is so it will, eventually, be able to take control of needed communication ports. The code for the timer is as follows:
from multiprocessing import Process, Queue
from time import sleep
from copy import deepcopy

PATH_TO_FILE = r'.\test_program.py'
WATCHDOG_TIMEOUT = 2

class Watchdog:
    def __init__(self, filepath, timeout):
        self.filepath = filepath
        self.timeout = timeout
        self.threadIdQ = Queue()
        self.knownThreads = {}

    def start(self):
        threadIdQ = self.threadIdQ
        process = Process(target=self._executeFile)
        process.start()
        try:
            while True:
                unaccountedThreads = deepcopy(self.knownThreads)
                # Empty queue since last wake. Add new thread IDs to knownThreads,
                # and account for all known thread IDs in the queue
                while not threadIdQ.empty():
                    threadId = threadIdQ.get()
                    if threadId in self.knownThreads:
                        unaccountedThreads.pop(threadId, None)
                    else:
                        print('New threadId < {} > discovered'.format(threadId))
                        self.knownThreads[threadId] = False
                # If there is a known thread that is unaccounted for, then it has
                # either hung or crashed. Shut everything down.
                if len(unaccountedThreads) > 0:
                    print('The following threads are unaccounted for:\n')
                    for threadId in unaccountedThreads:
                        print(threadId)
                    print('\nShutting down!!!')
                    break
                else:
                    print('No unaccounted threads...')
                sleep(self.timeout)
        # Account for any exceptions thrown in the watchdog timer itself
        except:
            process.terminate()
            raise
        process.terminate()

    def _executeFile(self):
        with open(self.filepath, 'r') as f:
            exec(f.read(), {'wdQueue': self.threadIdQ})

if __name__ == '__main__':
    wd = Watchdog(PATH_TO_FILE, WATCHDOG_TIMEOUT)
    wd.start()
I also have a small program to test the watchdog functionality
from time import sleep
from threading import Thread
from queue import SimpleQueue

Q_TO_Q_DELAY = 0.013

class QToQ:
    def __init__(self, processQueue, threadQueue):
        self.processQueue = processQueue
        self.threadQueue = threadQueue
        Thread(name='queueToQueue', target=self._run).start()

    def _run(self):
        pQ = self.processQueue
        tQ = self.threadQueue
        while True:
            while not tQ.empty():
                sleep(Q_TO_Q_DELAY)
                pQ.put(tQ.get())

def fastThread(q):
    while True:
        print('Fast thread, checking in!')
        q.put('fastID')
        sleep(0.5)

def slowThread(q):
    while True:
        print('Slow thread, checking in...')
        q.put('slowID')
        sleep(1.5)

def hangThread(q):
    print('Hanging thread, checked in')
    q.put('hangID')
    while True:
        pass

print('Hello! I am a program that spawns threads!\n\n')

threadQ = SimpleQueue()

Thread(name='fastThread', target=fastThread, args=(threadQ,)).start()
Thread(name='slowThread', target=slowThread, args=(threadQ,)).start()
Thread(name='hangThread', target=hangThread, args=(threadQ,)).start()

QToQ(wdQueue, threadQ)
As you can see, I need to have the threads put into a queue.Queue, while a separate object slowly feeds the output of the queue.Queue into the multiprocessing queue. If I instead have the threads put directly into the multiprocessing queue, or do not have the QToQ object sleep between puts, the multiprocessing queue locks up and always appears to be empty on the watchdog side.
Now, since the multiprocessing queue is supposed to be thread- and process-safe, I can only assume I have messed something up in the implementation. My solution seems to work, but it also feels hacky enough that I feel I should fix it.
I am using Python 3.7.2, if it matters.
I suspect that test_program.py exits.
I changed the last few lines to this:
tq = threadQ
# tq = wdQueue # option to send messages direct to WD
t1 = Thread(name='fastThread', target=fastThread, args=(tq,))
t2 = Thread(name='slowThread', target=slowThread, args=(tq,))
t3 = Thread(name='hangThread', target=hangThread, args=(tq,))
t1.start()
t2.start()
t3.start()
QToQ(wdQueue, threadQ)
print('Joining with threads...')
t1.join()
t2.join()
t3.join()
print('test_program exit')
The calls to join() mean that the test program never exits by itself, since none of the threads ever exit.
So, as is, t3 hangs, the watchdog program detects the unaccounted-for thread, and stops the test program.
If t3 is removed from the above program, then the other two threads are well behaved and the watchdog program allows the test program to continue indefinitely.

Basic multiprocessing with an infinite loop and a queue

import random
import queue as Queue
import _thread as Thread

a = Queue.Queue()

def af():
    while True:
        a.put(random.randint(0, 1000))

def bf():
    while True:
        if not a.empty():
            print(a.get())

def main():
    Thread.start_new_thread(af, ())
    Thread.start_new_thread(bf, ())
    return

if __name__ == "__main__":
    main()
The above code works, but with extremely high CPU usage. I tried to use multiprocessing, to no avail. I have tried:
def main():
    multiprocessing.Process(target=af).run()
    multiprocessing.Process(target=bf).run()
and
def main():
    manager = multiprocessing.Manager()
    a = manager.Queue()
    pool = multiprocessing.Pool()
    pool.apply_async(af)
    pool.apply_async(bf)
Neither works. Can anyone please help me? Thanks a bunch ^_^
def main():
    multiprocessing.Process(target=af).run()  # will not return
    multiprocessing.Process(target=bf).run()
The above code does not work because af never returns, so there is no chance to call bf. You need to replace the run() calls with start(), followed by join(), so that both can run in parallel (and make them share a manager.Queue).
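For illustration, a minimal start()/join() version might look like this (a sketch; it assumes af and bf have been modified to take the queue as an argument, as in the corrected code below):

import multiprocessing

def main():
    manager = multiprocessing.Manager()
    a = manager.Queue()
    p1 = multiprocessing.Process(target=af, args=(a,))
    p2 = multiprocessing.Process(target=bf, args=(a,))
    p1.start()  # start() spawns a new process; run() would block here
    p2.start()
    p1.join()   # both loop forever in this example, so join() waits indefinitely
    p2.join()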
To make the second version work, you need to pass a (the manager.Queue object) to the functions. Otherwise they will use the global Queue.Queue object, which is not shared between processes; you need to modify af and bf to accept a, and main to pass it:
def af(a):
    while True:
        a.put(random.randint(0, 1000))

def bf(a):
    while True:
        print(a.get())

def main():
    manager = multiprocessing.Manager()
    a = manager.Queue()
    pool = multiprocessing.Pool()
    proc1 = pool.apply_async(af, [a])
    proc2 = pool.apply_async(bf, [a])
    # Wait until the processes end. Uncomment the following lines
    # if there is no other code waiting.
    # proc1.get()
    # proc2.get()
In the first alternative main, you use Process, but the method you should call to start the activity is not run(), as one might think, but rather start(). You will want to follow that up with appropriate join() calls. Following the information in the multiprocessing documentation (available here: https://docs.python.org/2/library/multiprocessing.html), here is a working sample:
import random
from multiprocessing import Process, Queue

def af(q):
    while True:
        q.put(random.randint(0, 1000))

def bf(q):
    while True:
        if not q.empty():
            print(q.get())

def main():
    a = Queue()
    p = Process(target=af, args=(a,))
    c = Process(target=bf, args=(a,))
    p.start()
    c.start()
    p.join()
    c.join()

if __name__ == "__main__":
    main()
To add to the accepted answer: in the original code,

while True:
    if not q.empty():
        print(q.get())

q.empty() is called on every iteration, which is unnecessary, since q.get() will simply wait until something is available if the queue is empty (see the documentation).
I assume this could affect performance, since calling .empty() on every iteration consumes extra resources (it should be more noticeable if Thread were used instead of Process, because of Python's Global Interpreter Lock (GIL)).
I know it's an old question, but I hope it helps!
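If you also want the consumer to be able to shut down cleanly, a blocking get() with a timeout avoids both busy-waiting and blocking forever. A minimal sketch (the stop_event parameter is a hypothetical multiprocessing.Event, not part of the original code):

import queue

def bf(q, stop_event):
    while not stop_event.is_set():
        try:
            item = q.get(timeout=1)  # blocks up to 1 second, no spinning
        except queue.Empty:
            continue                 # nothing arrived; re-check the stop flag
        print(item)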

Can I somehow avoid using time.sleep() in this script?

I have the following python script:
#! /usr/bin/python

import os
from gps import *
from time import *
import time
import threading
import sys

gpsd = None  # setting the global variable

class GpsPoller(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        global gpsd  # bring it in scope
        gpsd = gps(mode=WATCH_ENABLE)  # starting the stream of info
        self.current_value = None
        self.running = True  # setting the thread running to true

    def run(self):
        global gpsd
        while gpsp.running:
            gpsd.next()  # this will continue to loop and grab EACH set of gpsd info to clear the buffer

if __name__ == '__main__':
    gpsp = GpsPoller()  # create the thread
    try:
        gpsp.start()  # start it up
        while True:
            print gpsd.fix.speed
            time.sleep(1)  ## <<<< THIS LINE HERE
    except (KeyboardInterrupt, SystemExit):  # when you press ctrl+c
        print "\nKilling Thread..."
        gpsp.running = False
        gpsp.join()  # wait for the thread to finish what it's doing
    print "Done.\nExiting."
I'm not very good with Python, unfortunately. The script should be multi-threaded somehow (but that probably doesn't matter in the scope of this question).
What baffles me is the gpsd.next() line. If I understand it correctly, it is supposed to tell the script that new gps data has been acquired and is ready to be read.
However, I read the data using the infinite while True loop with a 1 second pause via time.sleep(1).
What this does, however, is sometimes echo the same data twice (the sensor hasn't updated the data in the last second). I figure it also somehow skips some sensor data.
Can I change the script to print the current speed not every second, but every time the sensor reports new data? According to the data sheet it should be every second (a 1 Hz sensor), but obviously it isn't exactly 1 second; it varies by milliseconds.
As a generic design rule, you should have one thread for each input channel or, more generically, for each "loop over a blocking call". Blocking means that execution stops at that call until data arrives; gpsd.next() is such a call.
To synchronize multiple input channels, use a Queue and one extra thread. Each input thread puts its "events" on the (same) queue; the extra thread loops over queue.get() and reacts appropriately.
From this point of view, your script need not be multithreaded, since there is only one input channel, namely the gpsd.next() loop.
Example code:
from gps import *

class GpsPoller(object):
    def __init__(self, action):
        self.gpsd = gps(mode=WATCH_ENABLE)  # starting the stream of info
        self.action = action

    def run(self):
        while True:
            self.gpsd.next()
            self.action(self.gpsd)

def myaction(gpsd):
    print gpsd.fix.speed

if __name__ == '__main__':
    gpsp = GpsPoller(myaction)
    gpsp.run()  # runs until killed by Ctrl-C
Note how the use of the action callback separates the plumbing from the data evaluation.
To embed the poller into a script doing other stuff (i.e. handling other threads as well), use the queue approach. Example code, building on the GpsPoller class:
from threading import Thread
from Queue import Queue

class GpsThread(object):
    def __init__(self, valuefunc, queue):
        self.valuefunc = valuefunc
        self.queue = queue
        self.poller = GpsPoller(self.on_value)

    def start(self):
        self.t = Thread(target=self.poller.run)
        self.t.daemon = True  # kill thread when main thread exits
        self.t.start()

    def on_value(self, gpsd):
        # note that we extract the value right here.
        # Otherwise it could change while the event is in the queue.
        self.queue.put(('gps', self.valuefunc(gpsd)))

def main():
    q = Queue()
    gt = GpsThread(
        valuefunc=lambda gpsd: gpsd.fix.speed,
        queue=q
    )
    print 'press Ctrl-C to stop.'
    gt.start()
    while True:
        # blocks while q is empty.
        source, data = q.get()
        if source == 'gps':
            print data
The "action" we give to the GpsPoller says "calculate a value by valuefunc and put it in the queue". The mainloop sits there until a value pops out, then prints it and continues.
It is also straightforward to put other Thread's events on the queue and add the appropriate handling code.
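For example, a second input thread could be wired in like this (a sketch; timer_events and the 'timer' tag are made up for illustration):

import time
from threading import Thread

def timer_events(q, interval=5):
    # hypothetical second input channel: a periodic tick
    while True:
        time.sleep(interval)
        q.put(('timer', time.time()))

Start it next to gt.start() as another daemon thread, and extend the dispatch in main():

t = Thread(target=timer_events, args=(q,))
t.daemon = True
t.start()

if source == 'gps':
    print data
elif source == 'timer':
    print data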
I see two options here:
1. GpsPoller will check if the data changed and raise a flag.
2. GpsPoller will check if the data changed and put the new data in a queue.
Option #1:
is_speed_changed = False

def run(self):
    global gpsd, is_speed_changed
    while gpsp.running:
        prev_speed = gpsd.fix.speed
        gpsd.next()
        if prev_speed != gpsd.fix.speed:
            is_speed_changed = True  # raising the flag

while True:
    if is_speed_changed:
        print gpsd.fix.speed
        is_speed_changed = False
Option #2 (I prefer this one, since it protects us from race conditions):
gpsd_queue = Queue.Queue()

def run(self):
    global gpsd
    while gpsp.running:
        prev_speed = gpsd.fix.speed
        gpsd.next()
        curr_speed = gpsd.fix.speed
        if prev_speed != curr_speed:
            gpsd_queue.put(curr_speed)  # putting the new speed on the queue

while True:
    # get() will block if the queue is empty
    print gpsd_queue.get()

Execute blocking calls in parallel in Python

I need to make blocking XML-RPC calls from my Python script to several physical servers simultaneously and perform actions based on the response from each server independently.
To explain in detail, let us assume the following pseudocode:
while True:
    response = call_to_server1()  # blocking and takes a very long time
    if response == this:
        do that
I want to do this for all the servers simultaneously and independently, but from the same script.
Use the threading module.
Boilerplate threading code (I can tailor this if you give me a little more detail on what you are trying to accomplish):
def run_me(func):
    while not stop_event.isSet():
        response = func()  # blocking and takes a very long time
        if response == this:
            do that

def call_to_server1():
    # code to call server 1...
    return magic_server1_call()

def call_to_server2():
    # code to call server 2...
    return magic_server2_call()

# used to stop your loop.
stop_event = threading.Event()

t = threading.Thread(target=run_me, args=(call_to_server1,))
t.start()

t2 = threading.Thread(target=run_me, args=(call_to_server2,))
t2.start()

# wait for threads to return.
t.join()
t2.join()

# we are done....
You can use the multiprocessing module:
import multiprocessing

def call_to_server(ip, port):
    ....
    ....

process = []
for i in xrange(server_count):
    process.append(multiprocessing.Process(target=call_to_server, args=(ip, port)))
    process[i].start()

# waiting for the processes to stop
for p in process:
    p.join()
You can use multiprocessing plus queues. Here is an example with one single sub-process:
import multiprocessing
import time

def processWorker(input, result):
    def remoteRequest(params):
        ## this is my remote request
        return True
    while True:
        work = input.get()
        if 'STOP' in work:
            break
        result.put(remoteRequest(work))

input = multiprocessing.Queue()
result = multiprocessing.Queue()

p = multiprocessing.Process(target=processWorker, args=(input, result))
p.start()

requestlist = ['1', '2']
for req in requestlist:
    input.put(req)

for i in xrange(len(requestlist)):
    res = result.get(block=True)
    print 'retrieved ', res

input.put('STOP')
time.sleep(1)
print 'done'
To have more than one sub-process, simply use a list object to store all the sub-processes you start.
The multiprocessing queue is a thread- and process-safe object.
Then you may keep track of which request is being executed by each sub-process simply by storing the request associated with a workid (the workid can be a counter incremented each time the queue is filled with new work). Using multiprocessing.Queue is robust, since you do not need to rely on stdout/stderr parsing and you also avoid the related limitations.
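A minimal sketch of that bookkeeping (the names here are illustrative, and processWorker would need to be adjusted to unpack and echo back the workid):

pending = {}
for workid, req in enumerate(requestlist):
    input.put((workid, req))  # tag each request with its workid
    pending[workid] = req

# inside processWorker, unpack the pair and echo the id back:
#     workid, params = work
#     result.put((workid, remoteRequest(params)))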
Then, you can also set a timeout on how long you want a get call to wait at most, e.g.:

import Queue

try:
    res = result.get(block=True, timeout=10)
except Queue.Empty:
    print 'timed out waiting for a result'
Use Twisted.
It has a lot of useful tools for working with the network, and it is also very good at working asynchronously.
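A minimal sketch of that approach, assuming Twisted is installed (deferToThread runs a blocking function in the reactor's thread pool; call_to_server1 and the response check stand in for the pseudocode from the question):

from twisted.internet import reactor
from twisted.internet.threads import deferToThread

def handle(response):
    if response == this:  # 'this' is the placeholder from the question
        do_that()         # hypothetical action

# run each blocking call in a pool thread; handle() fires when it returns
deferToThread(call_to_server1).addCallback(handle)
deferToThread(call_to_server2).addCallback(handle)
reactor.run()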

ideal thread structure question (involves multiple thread communication)

I'm writing an application that listens for sound events (using messages passed in with Open Sound Control) and, based on those events, pauses or resumes program execution. My structure works most of the time, but it always bombs out in the main loop, so I'm guessing it's a thread issue. Here's a generic, simplified version of what I'm talking about:
import time, threading

class Loop():
    aborted = False

    def __init__(self):
        message = threading.Thread(target=self.message, args=((0),))
        message.start()
        loop = threading.Thread(target=self.loop)
        loop.start()

    def message(self, val):
        if val > 1:
            if not self.aborted:
                self.aborted = True
                # do some socket communication
            else:
                self.aborted = False
                # do some socket communication

    def loop(self):
        cnt = 0
        while True:
            print cnt
            if self.aborted:
                while self.aborted:
                    print "waiting"
                    time.sleep(.1)
            cnt += 1

class FakeListener():
    def __init__(self, loop):
        self.loop = loop
        listener = threading.Thread(target=self.listener)
        listener.start()

    def listener(self):
        while True:
            loop.message(2)
            time.sleep(1)

if __name__ == '__main__':
    loop = Loop()
    # fake listener standing in for the real OSC event listener
    listener = FakeListener(loop)
Of course, this simple code seems to work great, so it clearly isn't fully illustrating my real code, but you get the idea. What isn't included here is the fact that each loop pause and resume (by setting aborted=True/False) results in some socket communication, which also involves threads.
What always happens in my code is that the main loop doesn't always pick up where it left off after a sound event. It will work for a number of events, but eventually it just doesn't answer.
Any suggestions for how to structure this kind of communication amongst threads?
UPDATE:
OK, I think I've got it. Here's a modification that seems to work: there's a listener thread that periodically puts a value into a Queue object, and a checker thread that keeps checking the queue looking for that value; once it sees it, it flips a boolean to its opposite state. That boolean controls whether the loop thread continues or waits.
I'm not entirely sure what the q.task_done() function is doing here, though.
import time, threading
import Queue

q = Queue.Queue(maxsize=0)

class Loop():
    aborted = False

    def __init__(self):
        checker = threading.Thread(target=self.checker)
        checker.setDaemon(True)
        checker.start()
        loop = threading.Thread(target=self.loop)
        loop.start()

    def checker(self):
        while True:
            if q.get() == 2:
                q.task_done()
                if not self.aborted:
                    self.aborted = True
                else:
                    self.aborted = False

    def loop(self):
        cnt = 0
        while cnt < 40:
            if self.aborted:
                while self.aborted:
                    print "waiting"
                    time.sleep(.1)
            print cnt
            cnt += 1
            time.sleep(.1)

class fakeListener():
    def __init__(self):
        listener = threading.Thread(target=self.listener)
        listener.setDaemon(True)
        listener.start()

    def listener(self):
        while True:
            q.put(2)
            time.sleep(1)

if __name__ == '__main__':
    # fake listener standing in for the real OSC event listener
    listener = fakeListener()
    loop = Loop()
Umm... I don't completely understand your question, but I'll do my best to explain what I think you need to fix your problems.
1) The thread running your Loop.loop function should be set as a daemon thread so that it exits with your main thread (and you don't have to kill the Python process every time you want to shut down your program). To do this, just put loop.setDaemon(True) before you call the thread's start function.
2) The simplest and most fail-proof way to communicate between threads is with a Queue. One thread will put an item in the Queue, and another thread will take an item out, do something with it, and then terminate (or go get another job); a minimal sketch of this pattern follows below.
In Python a queue can be anything from a global list to Python's built-in Queue object. I recommend the Python Queue because it is thread-safe and easy to use.
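A minimal sketch of that pattern, matching the names above (process_item is a hypothetical stand-in for whatever the consumer does with a job). It also shows what q.task_done() is for: it pairs with q.join(), which blocks until every item that was put on the queue has been marked done:

import threading
import Queue

q = Queue.Queue()

def worker():
    while True:
        item = q.get()      # blocks until a job arrives
        process_item(item)  # hypothetical handler for the job
        q.task_done()       # tell the queue this job is finished

t = threading.Thread(target=worker)
t.setDaemon(True)           # per point 1: exits with the main thread
t.start()

q.put(2)   # producer side: hand the worker a job
q.join()   # optional: block until all queued jobs are marked done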
