I have a Python script that can run for a long time in the background, and I am trying to find a way of getting a status update from it. Basically we're considering sending it a SIGUSR1 signal, and then having it report back a status update.
Catching the signal in Python is not the issue; there is plenty of information about that.
But how to get information back to the process that initiated the signal? It seems that there is no way for the receiving process to figure out the pid of the initiating process, which could otherwise provide a way to send information back. A single reply message is enough here (along the lines of 'busy uploading; at 55% now; will finish at such a time'); a continuing update would be fantastic but is not necessary.
What I've come up with is to write this data to a temporary file with a predetermined name - but that has the issue of leaving stale files behind, and then needs some kind of clean-up routine. It sounds like a hack. Is there anything better available?
The way the running process is signalled doesn't matter; it doesn't have to be kill -SIGUSR1 pid. Any way to communicate with it would do, as long as the communication can be initiated from a new process that's started after the main process is already running, possibly running as a different user.
Signals are not designed to be general inter-process communication mechanisms that allow for passing data. They can't do much more than provide a notification. What the target process does in response can be fairly general (generating output to a particular file that the sender then knows to go look at, for example), but passing data directly back to the sender would require a different mechanism like a pipe, shared memory, message queue, etc. Also note that, in general, a process receiving a signal can't really determine who sent the signal, so it wouldn't know where to send a response anyway.
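As an illustration of the "output to a particular file" idea, here is a minimal sketch; the status file path and the progress variable are assumptions, not part of the question:

import os
import signal
import tempfile

# Hypothetical, predetermined location the sender knows to read after signalling.
STATUS_FILE = os.path.join(tempfile.gettempdir(), "uploader.status")

progress = {"pct": 0}  # updated by the main work loop

def report_status(signum, frame):
    # Runs whenever SIGUSR1 arrives; the sender then reads STATUS_FILE.
    with open(STATUS_FILE, "w") as f:
        f.write("busy uploading; at %d%% now\n" % progress["pct"])

signal.signal(signal.SIGUSR1, report_status)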
So I have been getting my feet wet with Python, attempting to build a reminder system that ties into the GNOME notification UI. The basic idea is you type a command into your shell like remind me to check on dinner in 20 min and then in 20 minutes you get a desktop notification saying "check on dinner". The way I am doing this is by having a script parse the message and write to a log file the time the notification should be sent along with the message to send.
The notifications are triggered by a Python daemon. I am using a daemon design I found online. The issue I am seeing is that when this daemon is running it takes 100% of my CPU! I stripped down all the code the daemon was doing and I still have this problem when all the daemon is doing is:
while True:
    last_modified = os.path.getmtime(self.logfile)
I presume that this is a bad approach and I should instead be notifying the daemon when there is a new reminder; most of the time the reminder daemon should be sleeping. Now this is just an idea, but I am having a hard time finding resources on 'how to notify a process' when all I know is the daemon's pid. So if I suspend the daemon with something like time.sleep(time_to_next_notification), would there be a way for me to send a signal to the daemon letting it know that there is a new reminder?
Though I believe you're better off using a server-client type solution that listens on a port, what you are asking is 100% possible using the signal and os libraries. This approach will not work well with multi-threaded programs, however, as signals are only handled by the main thread in Python. Additionally, Windows doesn't implement signals in the same way, so the options are more limited there.
Signals
The "client" process can send arbitrary signals using os.kill(pid, signal). You will have to go through the available signals and determine which one you want to use (signal.NSIG may be a good option because it shouldn't stomp on any other default behavior).
The "daemon" process on startup must register a handler for what to do when it receives your chosen signal. The handler is a function you must define that receives the signal itself that was received as well as the current stack frame of execuiton (def handler(signum, frame):). If you're only doing one thing with this handler, and it doesn't need to know what was happening when it was called, you can probably ignore both these parameters. Then you must register the handler with signal.signal ex: signal.signal(signal.NSIG, handler).
From there you will want to find some appropriate way to wait until the next signal without consuming too many resources. This could be as simple as looping on a time.sleep call, or you could try to get fancy. I'm not 100% sure how execution resumes on returning from a signal handler, so you may need to concern yourself with recursion depth (i.e., make sure you don't recurse every time a signal is handled, or you'll only ever be able to handle a limited number of signals before needing to restart).
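To make the above concrete, here is a rough sketch of the daemon side, assuming SIGUSR1 as the chosen signal and a hypothetical check_reminders() helper that re-reads the log file:

import signal
import time

def check_reminders():
    # Hypothetical: re-read the reminder log for new entries.
    pass

def handler(signum, frame):
    # Both parameters can be ignored if the handler only does one thing.
    check_reminders()

signal.signal(signal.SIGUSR1, handler)

while True:
    # Sleep instead of busy-polling; the handler still runs as soon as
    # a signal arrives, even mid-sleep.
    time.sleep(60)

The "client" script would then notify the daemon with os.kill(daemon_pid, signal.SIGUSR1) after appending the new reminder to the log file.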
Server
Having a process listen on a port (generally referred to as a server, but functionally the same as your 'daemon' description) instead of listening for operating system signals has several main advantages (a minimal sketch of such a listener follows the list below):
Ports are able to send data, whereas signals can only trigger events
Ports behave more consistently across platforms
Ports play nicer with multi-threading
Ports make it easy to send messages across a network (e.g. create a reminder from your phone and execute it on your PC)
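A minimal sketch of such a listener, assuming a hypothetical localhost port and a very simple message format:

import socket

HOST, PORT = "127.0.0.1", 50007  # hypothetical port

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.bind((HOST, PORT))
    srv.listen()
    while True:
        conn, _ = srv.accept()
        with conn:
            data = conn.recv(4096)  # e.g. b"1200|check on dinner"
            print("new reminder:", data.decode())

The shell command would then just open a connection to that port (for example with socket.create_connection((HOST, PORT))) and send the reminder, rather than signalling a pid.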
Waiting for multiple things at once
In order to address the need to wait on multiple things at once (listening for input as well as waiting to deliver the next notification), you have quite a few options:
Signals actually may be a good use case, as signal.SIGALRM can be used as a conveniently re-settable alarm clock (if you're using UNIX). You would set up the handler in the same way as before, and simply set an alarm for the next notification. After setting the alarm, you could simply resume listening on the port for new tasks. If a new task comes in, setting the alarm again will override the existing one, so the handler would need to retrieve the next queued notification and re-set the alarm once done with the first task (a rough sketch follows these options).
Threads could either be used to poll a queue of notification tasks, or an individual thread could be created to wait for each task. This is not a particularly elegant solution, however it would be effective and easy to implement.
The most elegant solution would likely be to use asyncio co-routines, however I am not as well versed in asyncio, and will admit they're a bit more confusing than threads.
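As an illustration of the first option, a rough SIGALRM sketch (UNIX only); next_delay() and the 20-minute first alarm are assumptions standing in for the task queue:

import signal

def next_delay():
    # Hypothetical: return seconds until the next queued notification, or None.
    return None

def deliver(signum, frame):
    print("notification due")   # deliver the current notification here
    delay = next_delay()
    if delay is not None:
        signal.alarm(delay)     # re-set the alarm for the next task

signal.signal(signal.SIGALRM, deliver)
signal.alarm(20 * 60)           # e.g. first reminder in 20 minutes
# ...then go back to listening on the port for new tasks.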
In my (python) code I have a thread listening for changes from a couchdb feed (continuous changes). The changes request has a timeout parameter which is too big in certain circumstances (for example when a user wants to interrupt the program manually with ^C).
How can I abort a long-running blocking http request?
Is this possible, or do I need to reduce the timeout to make my program more responsive?
This would be unfortunate, because having a timeout small enough to make the program really responsive (say, 1 s) means that lots of connections are created (one per second!), which defeats the purpose of listening for changes and makes it very difficult to ensure that we are not missing any (in the re-connection window we can indeed miss changes, so special code is needed to handle that case).
The other option is to forcefully abort the thread, but that is not really an option in python.
If I understand correctly, it looks like you are waiting too long between requests before deciding whether to respond to the user or not. You are right that continuously closing and creating new connections will defeat the purpose of the changes feed.
A solution could be to use the heartbeat query parameter, with which CouchDB will keep sending newlines to tell the client that the connection is still alive.
http://localhost:5984/hello/_changes?feed=continuous&heartbeat=1000&include_docs=true
As long as you are getting heartbeats (newlines) you can be sure that the connection is alive and you are not missing any changes. A bare newline indicates that no changes have occurred, whereas an actual change will be reported back as data. There is no need to close the connection; respond to your clients whenever resp != "\n".
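A minimal sketch of consuming the continuous feed this way with the requests library; the URL is the one from the example above, and handle_change() is a hypothetical stand-in for your own processing:

import requests

def handle_change(raw_line):
    # Hypothetical: parse and react to one change (one JSON document per line).
    print(raw_line)

url = "http://localhost:5984/hello/_changes"
params = {"feed": "continuous", "heartbeat": 1000, "include_docs": "true"}

with requests.get(url, params=params, stream=True) as resp:
    for line in resp.iter_lines():
        if not line:        # empty line = heartbeat; connection is still alive
            continue
        handle_change(line)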
Blocking the thread's execution in general prevents the thread from being terminated. You need to wait until the request times out. But this is already clear to you.
Using a library that supports non-blocking requests might be a solution, but I don't know if there is one.
Anyway, you've mentioned that reducing the timeout will lead to more connections. I'd suggest implementing a waiting loop between requests that can be interrupted by an external signal to terminate the thread. With this loop you can control the number of requests independently of the timeout.
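A sketch of such an interruptible waiting loop, using a threading.Event as the "external signal"; fetch_changes() and the 30-second pause are assumptions:

import threading

stop_event = threading.Event()

def fetch_changes():
    # Hypothetical: one changes request with a short timeout.
    pass

def poll_changes():
    while not stop_event.is_set():
        fetch_changes()
        # Wait between requests, but wake up immediately if asked to stop.
        if stop_event.wait(timeout=30):
            break

# From the main thread, e.g. on KeyboardInterrupt: stop_event.set()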
I have several scripts that I use to do some web crawling. They are always running, and should never stop. However, after about a week, they systematically "freeze": there is no output anymore, no response to Ctrl+C or anything. The only way is to kill the process and restart it.
I suspect that these issues come from the library I use for retrieving the data (urllib2), but the issue is very hard to reproduce.
I am thus wondering how I could check the state of the process and kill/restart it automatically if it is frozen. I was thinking of creating a PID file and updating it regularly. Another script could then periodically check the last modification date of this PID file and restart the process if it's too old. I could use something like Monit to do the monitoring.
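A minimal sketch of what I mean, with a hypothetical heartbeat file that an external monitor such as Monit could watch:

import os
import time

HEARTBEAT = "/tmp/crawler.heartbeat"  # hypothetical path

def crawl_once():
    # Hypothetical: one unit of crawling work.
    pass

while True:
    crawl_once()
    # "Touch" the file; the monitor restarts the process if the mtime gets too old.
    with open(HEARTBEAT, "a"):
        pass
    os.utime(HEARTBEAT, None)
    time.sleep(1)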
Is this how I should do it? Is there another best practice/common way for checking the responsiveness of a process?
If you have a process that is always running, has no connected terminal, and is the process group leader - that is a daemon. You undoubtedly know all that.
There are some de facto practices in coding programs like that. One is to have a signal handler which takes SIGHUP and forces the program to reinitialize itself. This means closing all of the open log files, rereading config scripts, etc. I do not know how applicable that is to your problem, but it sometimes solves issues like frozen daemons at my work.
You can customize the idea by employing the SIGUSR1 and SIGUSR2 signals to do special things, like write status to a file, or anything else. Since signals arrive on an interrupt, the trap statement in shell scripts and signal handlers in Python itself will push program state onto the interrupt stack and do "stuff".
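A short sketch of both ideas together; reinitialize() and the status file path are assumptions about what your daemon would actually do:

import signal

def reinitialize():
    # Hypothetical: close log files, reread config, reopen connections, etc.
    pass

def on_hup(signum, frame):
    reinitialize()

def on_usr1(signum, frame):
    with open("/tmp/crawler.status", "w") as f:  # hypothetical status file
        f.write("still alive\n")

signal.signal(signal.SIGHUP, on_hup)
signal.signal(signal.SIGUSR1, on_usr1)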
In your case you may want the program to fork/exec itself and then kill the parent.
Can I "move" response object somehow from one process to another?
The first process is a non-blocking server which does some other IO. It needs to be done in a non-blocking environment like Tornado or Twisted or something like this.
Another process (actually, a pool of "worker" processes) is needed to process images with PIL. I can't do it in threads because of the GIL. However, either the worker needs to get a file handle of the response object to write the result to, or it has to return the result back to the first process, and since the result can be pretty large (~1 MB), that does not seem like a good idea. (It's probably going to be a separate pool of processes, not a fork for every request - the latter seems like a bad strategy.)
So, can I somehow allow the worker process to write to the response directly?
You can't. Only one process can have access to a given port at a time, and you cannot respond directly without accessing the port.
But you don't need that. What you need is a proxy! You can add a thread to your app which listens on a different port. Then you fire off your image process, and when that process finishes its work it can send the result to that port. Your thread will then read it and send the response.
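A rough sketch of that proxy thread, assuming a hypothetical internal port and a hypothetical deliver_to_waiting_response() hand-off into your non-blocking app:

import socket
import threading

RESULT_PORT = 50080  # hypothetical internal port

def deliver_to_waiting_response(data):
    # Hypothetical: hand the processed image bytes to the pending response.
    pass

def result_listener():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", RESULT_PORT))
        srv.listen()
        while True:
            conn, _ = srv.accept()
            with conn:
                chunks = []
                while True:
                    chunk = conn.recv(65536)
                    if not chunk:
                        break
                    chunks.append(chunk)
                deliver_to_waiting_response(b"".join(chunks))

threading.Thread(target=result_listener, daemon=True).start()

The worker process just connects to RESULT_PORT when it is done and writes the image bytes.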
I've set up a cluster using
ipcluster start --n=8
then accessed it using
from IPython.parallel import Client
c=Client()
dview=c[:]
e=[i for i in c]
I'm running processes on the slave nodes (e[0]-e[7]) which take a lot of time and I'd like them to send progress reports to the master so I can keep an eye on how far through they are.
There are two ways I can think to do this but so far I haven't been able to implement either of them, despite hours of trawling through question pages.
Either I want the nodes to push some data back to the master without being prompted, i.e. within the long process that runs on the nodes I implement a function which passes its progress to the master at regular intervals.
Or I could redirect the stdout of the nodes to that of the master and then just keep track of the progress using print. This is what I've been working on so far. Each node has its own stdout, so print doesn't do anything if run remotely. I've tried pushing sys.stdout to the nodes, but this just closes it.
I can't believe I'm the only person who wants to do this so maybe I'm missing something very simple. How can I keep track of long processes happening remotely using ipython?
stdout is already captured, logged, and tracked, and arrives at Clients as it comes, before the result is complete.
IPython ships with an example script that monitors stdout/err of all engines, which can easily be tweaked to only monitor a subset of this information, etc.
In the Client itself, you can check the metadata dict for stdout/err (Client.metadata[msg_id].stdout) before results are done. Use Client.spin() to flush any incoming messages off of the zeromq sockets, to ensure this data is up-to-date.
If you want stdout to update frequently, make sure you call sys.stdout.flush() to guarantee that the stream is actually published at that point, rather than relying on implicit flushes, which may not happen until the work completes.
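A sketch of what monitoring from the Client could look like under those assumptions (the work() function and the one-second polling interval are made up for illustration):

import time
from IPython.parallel import Client

c = Client()
dview = c[:]

def work():
    import sys, time
    for i in range(10):
        print("step %d" % i)
        sys.stdout.flush()   # publish the output now rather than at the end
        time.sleep(1)
    return "done"

ar = dview.apply_async(work)
while not ar.ready():
    c.spin()                 # flush incoming messages from the engines
    for msg_id in ar.msg_ids:
        print(c.metadata[msg_id].stdout)   # cumulative stdout so far
    time.sleep(1)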