I'm trying to make a threaded cgi webserver similar to this; however, I'm stuck on how to set local data in the handler for a different thread. Is it possible to set threading.local data, such as a dict, for a thread other than the handler? To be more specific, I want to have the request parameters, headers, etc. available from a cgi file that was started with subprocess.run. The bottom of do_GET in this file on GitHub is what I use now, but that can only serve one client at a time. I want to replace this part because I want multiple connections/threads at once, and I need different data in each connection/thread.
Is there a way to edit/set threading.local data from a different thread? Or if there is a better way to achieve what I am trying to do, please let me know. If you know that this is definitely impossible, say so.
Thanks in advance!
Without seeing what test code you have, or knowing what you've tried so far, I can't tell you exactly what you need to succeed. That said, I can tell you that trying to edit information in a threading.local() object from another thread is not the cleanest path to take.
Generally, the best way to send calls to other threads is through threading.Event() objects. Usually, a thread listens to an Event() object and does an action based on that. In this case, I could see having a handler set an event in the case of a GET request.
Then, in the thread that is writing the cgi file, have a function that, when the Event() object is set, records the data you need and unsets the Event() object.
So, in pseudo-code:
import threading

evt = threading.Event()

def noteTaker(evt):
    while True:
        evt.wait()                        # block until the handler sets the event
        modifyDataYouNeed()               # placeholder: record the request data you need
        with open("notes.txt", "a") as f: # placeholder file for whatever you record
            f.write("...")
        evt.clear()                       # reset so the next wait() blocks again

def do_GET(evt):
    print("so, a query hit your webserver")
    evt.set()                             # wake up noteTaker
    print("and noteTaker was just called")
So, while I couldn't answer your question directly, I hope this helps some on how threads communicate and will help you infer what you need :)
threading information (as I'm sure you've read already, but for the sake of diligence) is here
Related
I have two classes that I use to automate gdb from the command line. The main class creates a process to execute gdb commands from the command line, then sends commands to the GDB_Engine class (in which I override the run() method; it also holds the gdb-related functions) depending on the user request. The two separate processes communicate through the queue which holds the jobs that should be done. To do this task, I thought of this simple plan:
1. Check the queue
2. Wait if the queue is empty; if not, execute the first function in the queue
3. Rewrite the queue
4. Return to 1
But I couldn't find any function in the multiprocessing documentation to make the spawned process stop/sleep if the queue is empty. I'm sure there's a way to do this, but since I'm still a beginner at Python, I can't find my way easily. Things are still a bit confusing at this point.
Thanks in advance, have a good day! (I use python3.4 btw, if that matters)
EDIT: I don't have much going on right now, but I'm still posting my code at grzgrzgrz3's request. The codebase is somewhat large, so I'm only copy/pasting the multiprocessing-related parts.
GDB_Engine class, where I control gdb with pexpect:
from multiprocessing import Process, Queue

class GDB_Engine(Process):
    jobqueue = Queue()

    def __init__(self):
        super(GDB_Engine, self).__init__()
        self.jobqueue = GDB_Engine.jobqueue

    def run(self):
        # empty since I still don't know how to implement that algorithm
        pass
Main of the program
if __name__ == "__main__":
    gdbprocess = GDB_Engine()
    gdbprocess.start()
I simply put items in the queue whenever I need to do a job, like this (from the middle of the code that attaches gdb to the target):
gdbprocess.jobqueue.put("attachgdb")
My main idea is that the spawned process will compare the string in the queue and run the specified function in the GDB_Engine class. To show an example, here's the attach code:
def attachgdb(self, str):
    global p
    p = pexpect.spawnu('sudo gdb')
    p.expect_exact("(gdb) ")
    p.sendline("attach " + str)
    p.expect_exact("(gdb) ")
    p.sendline("c")
    p.expect_exact("Continuing")
I just found out that the get() method blocks the process automatically if the queue is empty, so the answer to my question was very simple. I should have tried the methods more before asking; looks like it was just another silly and unnecessary question of mine.
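So the run() loop can be as simple as something like this (a rough sketch, assuming I put (name, argument) tuples on the queue instead of plain strings; the pid here is just an example):
def run(self):
    while True:
        job, arg = self.jobqueue.get()   # get() blocks here while the queue is empty
        if job == "attachgdb":
            self.attachgdb(arg)

# and from the main process:
# gdbprocess.jobqueue.put(("attachgdb", "1234"))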
I have already read
http://wiki.wxpython.org/LongRunningTasks
http://wiki.wxpython.org/CallAfter
and searched a lot on Google but found no answer to my problem. Because in my opinion it would be too much code, and it is more a theoretical problem, I hope it is OK without code.
What I want to do, with an example: I have a grid (wx.grid) with checkboxes in the main thread. Then I start a new thread (thread.start_new_thread) in which I go through all rows (1 second per row) and check whether the checkbox is set. If it is set, some job is done.
This works if I read out all rows before I start the thread. But I need to read them out while the thread is running, because the user should be able to uncheck or check another checkbox! If I read them out in the new thread, however, sometimes a "NoneType object is not callable" error is raised. I think this is because wx.CallAfter should be used to interact with the grid from another thread, but I cannot use CallAfter to get the return value.
I have no idea how to solve this issue. Perhaps some people with more thread experience have some idea? If you need additional data please ask, but I think that my example contains all necessary information.
A common approach to this type of thing is to use a Queue.Queue object to pass commands to one or more worker threads. The worker thread(s) will wait on a pull from the queue until there are items in the queue ready to be pulled. Part of the command object could be a target in the GUI thread to send a message to (in a thread-safe way, like with wx.CallAfter) when the command is completed.
You should also take a look at the wx.lib.delayedresult module. It is similar to the above but a little more capable and robust.
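For example, a rough sketch of that pattern (doJob and onRowDone are hypothetical stand-ins for your per-row work and your GUI-side callback):
import Queue
import threading
import wx

command_queue = Queue.Queue()

def worker(queue):
    # Background thread: wait for commands and do the long-running work.
    while True:
        command = queue.get()           # blocks while the queue is empty
        if command is None:             # sentinel used to shut the worker down
            break
        row, on_done = command
        doJob(row)                      # hypothetical: the 1-second-per-row job
        wx.CallAfter(on_done, row)      # notify the GUI thread safely when done

threading.Thread(target=worker, args=(command_queue,)).start()

# In the GUI thread, e.g. when the user checks a box:
#   command_queue.put((row, frame.onRowDone))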
I have a question about Python queues.
I have written a threaded class whose run() method executes jobs from the queue.
import threading
import Queue

class AThread(threading.Thread):
    def __init__(self, arg1):
        self.file_resource = arg1
        threading.Thread.__init__(self)
        self.queue = Queue.Queue()

    def _myTask(self):   # single leading underscore so exec("self._" + cmd) can find it
        ''' Method that will access a common resource.
            Needs to be synchronized.
            Returns a Boolean based on the outcome.
        '''
        self.file_resource.write()

    def run(self):
        while True:
            cmd = self.queue.get()
            # cmd is actually a call to a method
            exec("self._" + cmd)
            self.queue.task_done()

# The problem I have here is while invoking the thread
a = AThread(file_resource)   # file_resource: the shared object the thread writes to
a.start()
a.queue.put("myTask()")
print "Hai"
The same instance of AThread (a above) will have tasks loaded onto its queue from different places in the code.
Hence the print statement at the bottom should wait for the task added to the queue by the statement above it, wait only for a bounded period, and also receive the value returned after the task executes.
Is there a simple way to achieve this? I have searched a lot regarding this; kindly review this code and provide suggestions.
And why are Python's lock acquire and release not on instances of the class? In the scenario mentioned, instances a and b of AThread need not be synchronized with each other, yet myTask runs synchronized across both a and b when acquire and release are applied.
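To illustrate what I mean, here is a rough sketch of the two lock placements (the class names are just for illustration):
import threading

class SharedLock(threading.Thread):
    lock = threading.Lock()              # class attribute: one lock shared by all instances

    def myTask(self):
        with SharedLock.lock:            # a.myTask() and b.myTask() serialize here
            pass                         # ... access the common resource ...

class PerInstanceLock(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.lock = threading.Lock()     # instance attribute: each object gets its own lock

    def myTask(self):
        with self.lock:                  # a and b no longer block each other
            pass                         # ... access the common resource ...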
Kindly provide suggestions.
There are lots of approaches you could take, depending on the particular contours of your problem.
If your print "Hai" just needs to happen after myTask completes, you could put it into a task and have myTask put that task on the queue when it finishes. (if you're a CS theory sort of person, you can think of this as being analogous to continuation-passing style).
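A rough sketch of that against your AThread class, assuming the queue carries callables rather than the strings your version uses:
def printHai():
    print "Hai"

# inside AThread:
def myTask(self):
    self.file_resource.write()
    self.queue.put(printHai)        # continuation: run this once myTask is done

def run(self):
    while True:
        cmd = self.queue.get()      # cmd is now a callable
        cmd()
        self.queue.task_done()

# caller side: put the bound method itself, e.g. a.queue.put(a.myTask)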
If your print "Hai" has a more elaborate dependency on multiple tasks, you might look into futures or promises.
You could take a step into the world of Actor-based concurrency, in which case there would probably be a synchronous message send method that does more or less what you want.
If you don't want to use futures or promises, you can achieve a similar thing manually, by introducing a condition variable. Set the condition variable before myTask starts and pass it to myTask, then wait for it to be cleared. You'll have to be very careful as your program grows and constantly rethink your locking strategy to make sure it stays simple and comprehensible - this is the stuff of which difficult concurrency bugs are made.
The smallest sensible step to get what you want is probably to provide a blocking version of Queue.put() which does the condition variable thing. Make sure you think about whether you want to block until the queue is empty, or until the thing you put on the queue is removed from the queue, or until the thing you put on the queue has finished processing. And then make sure you implement the thing you decided to implement when you were thinking about it.
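As a rough sketch of the variant that blocks until the item you put has finished processing (each queued item carries its own threading.Event):
import threading

def put_and_wait(queue, cmd, timeout=None):
    # Pair the command with an event and block until the worker signals completion.
    done = threading.Event()
    queue.put((cmd, done))
    done.wait(timeout)

# Worker side (inside run()):
#   cmd, done = self.queue.get()
#   ... execute cmd ...
#   done.set()                     # wakes up whoever called put_and_wait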
I'd like to do something like this:
twistedServer.start() # This would be a nonblocking call
while True:
    while twistedServer.haveMessage():
        message = twistedServer.getMessage()
        response = handleMessage(message)
        twistedServer.sendResponse(response)
    doSomeOtherLogic()
The key thing I want to do is run the server in a background thread. I'm hoping to do this with a thread instead of through multiprocessing/queue because I already have one layer of messaging for my app and I'd like to avoid two. I'm bringing this up because I can already see how to do this in a separate process, but what I'd like to know is how to do it in a thread, or if I can. Or if perhaps there is some other pattern I can use that accomplishes this same thing, like perhaps writing my own reactor.run method. Thanks for any help.
:)
The key thing I want to do is run the server in a background thread.
You don't explain why this is key, though. Generally, things like "use threads" are implementation details. Perhaps threads are appropriate, perhaps not, but the actual goal is agnostic on the point. What is your goal? To handle multiple clients concurrently? To handle messages of this sort simultaneously with events from another source (for example, a web server)? Without knowing the ultimate goal, there's no way to know if an implementation strategy I suggest will work or not.
With that in mind, here are two possibilities.
First, you could forget about threads. This would entail defining your event handling logic above as only the event handling parts. The part that tries to get an event would be delegated to another part of the application, probably something ultimately based on one of the reactor APIs (for example, you might set up a TCP server which accepts messages and turns them into the events you're processing, in which case you would start off with a call to reactor.listenTCP of some sort).
So your example might turn into something like this (with some added specificity to try to increase the instructive value):
from twisted.internet import reactor

class MessageReverser(object):
    """
    Accept messages, reverse them, and send them onwards.
    """
    def __init__(self, server):
        self.server = server

    def messageReceived(self, message):
        """
        Callback invoked whenever a message is received. This implementation
        will reverse and re-send the message.
        """
        self.server.sendMessage(message[::-1])
        doSomeOtherLogic()

def main():
    twistedServer = ...
    twistedServer.start(MessageReverser(twistedServer))
    reactor.run()

main()
Several points to note about this example:
I'm not sure how your twistedServer is defined. I'm imagining that it interfaces with the network in some way. Your version of the code would have had it receiving messages and buffering them until they were removed from the buffer by your loop for processing. This version would probably have no buffer, but instead just call the messageReceived method of the object passed to start as soon as a message arrives. You could still add buffering of some sort if you want, by putting it into the messageReceived method.
There is now a call to reactor.run which will block. You might instead write this code as a twistd plugin or a .tac file, in which case you wouldn't be directly responsible for starting the reactor. However, someone must start the reactor, or most APIs from Twisted won't do anything. reactor.run blocks, of course, until someone calls reactor.stop.
There are no threads used by this approach. Twisted's cooperative multitasking approach to concurrency means you can still do multiple things at once, as long as you're mindful to cooperate (which usually means returning to the reactor once in a while).
The exact times at which the doSomeOtherLogic function is called are changed slightly, because there's no notion of "the buffer is empty for now" separate from "I just handled a message". You could change this so that the function is instead called once a second, or after every N messages, or whatever is appropriate (see the small sketch below).
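For the once-a-second option, twisted.internet.task.LoopingCall does the scheduling without any threads (a small sketch, assuming doSomeOtherLogic takes no arguments):
from twisted.internet import task

# Call doSomeOtherLogic every second, cooperating with the reactor.
loop = task.LoopingCall(doSomeOtherLogic)
loop.start(1.0)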
The second possibility would be to really use threads. This might look very similar to the previous example, but you would call reactor.run in another thread, rather than the main thread. For example,
from Queue import Queue
from threading import Thread

from twisted.internet import reactor

class MessageQueuer(object):
    def __init__(self, queue):
        self.queue = queue

    def messageReceived(self, message):
        self.queue.put(message)

def main():
    queue = Queue()
    twistedServer = ...
    twistedServer.start(MessageQueuer(queue))

    Thread(target=reactor.run, args=(False,)).start()

    while True:
        message = queue.get()
        response = handleMessage(message)
        reactor.callFromThread(twistedServer.sendResponse, response)

main()
This version assumes a twistedServer which works similarly, but uses a thread to let you have the while True: loop. Note:
You must invoke reactor.run(False) if you use a thread, to prevent Twisted from trying to install any signal handlers, which Python only allows to be installed in the main thread. This means the Ctrl-C handling will be disabled and reactor.spawnProcess won't work reliably.
MessageQueuer has the same interface as MessageReverser, only its implementation of messageReceived is different. It uses the threadsafe Queue object to communicate between the reactor thread (in which it will be called) and your main thread where the while True: loop is running.
You must use reactor.callFromThread to send the message back to the reactor thread (assuming twistedServer.sendResponse is actually based on Twisted APIs). Twisted APIs are typically not threadsafe and must be called in the reactor thread. This is what reactor.callFromThread does for you.
You'll want to implement some way to stop the loop and the reactor, one supposes. The python process won't exit cleanly until after you call reactor.stop.
Note that while the threaded version gives you the familiar, desired while True loop, it doesn't actually do anything much better than the non-threaded version. It's just more complicated. So, consider whether you actually need threads, or if they're merely an implementation technique that can be exchanged for something else.
I have a python webapp which accepts some data via POST. The method which is called can take a while to complete (30-60s), so I would like to "background" the method so I can respond to the user with a "processing" message.
The data is quite sensitive, so I'd prefer not to use any queue-based solutions. I also want to ensure that the backgrounded method doesn't get interrupted should the webapp fail in any way.
My first thought is to fork a process; however, I'm unsure how I can pass variables to a process.
I've used Gevent before, which has a handy method: gevent.spawn(function, *args, **kwargs). Is there anything like this that I could use at the process-level?
Any other advice?
The simplest approach would be to use a thread. Pass data to and from a thread with a Queue.
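A minimal sketch of that, assuming a process_data function standing in for your 30-60 second job and a handle_post function standing in for your framework's POST entry point:
import threading
import Queue

work_queue = Queue.Queue()

def worker():
    # Background thread: pick up posted data and run the slow job on it.
    while True:
        data = work_queue.get()      # blocks until the handler puts something here
        process_data(data)           # placeholder for your 30-60s method
        work_queue.task_done()

threading.Thread(target=worker).start()

def handle_post(request_data):
    work_queue.put(request_data)     # hand the data straight to the background thread
    return "processing"              # respond to the user right away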