Spawn a subprocess but kill it if main process gets killed - python

I am creating a program in Python that listens to various user interactions and logs them. I have these requirements/restrictions:
I need a separate process that sends those logs to a remote database every hour
I can't do it in the current process because it blocks the UI.
If the main process stops, the background process should also stop.
I've been reading about subprocess but I can't seem to find anything on how to stop both simultaneously. I need the equivalent of spawn_link, if anybody knows some Erlang/Elixir.
Thanks!

To answer the question in the title (for visitors from Google): there are robust solutions on Linux and Windows that use OS-specific APIs, and less robust but more portable psutil-based solutions.
To fix your specific problem (it is an XY problem): use a daemon thread instead of a process.
A thread lets you perform I/O without blocking the GUI, even if the GUI framework you've chosen doesn't provide an async I/O API such as tkinter's createfilehandler() or gtk's io_add_watch().
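A minimal sketch of the daemon-thread approach, assuming a hypothetical `send_logs()` callable that uploads the buffered logs. The function name, the one-hour default, and the error handling are illustrative choices, not part of any library.

```python
import threading

def start_periodic_uploader(send_logs, interval_seconds=3600.0):
    """Call send_logs() every interval_seconds on a daemon thread.

    send_logs is a hypothetical callable supplied by the application.
    Returns (thread, stop_event); set the event to stop the loop early.
    """
    stop_event = threading.Event()

    def worker():
        # Event.wait() returns False on timeout and True once the event is
        # set, so the loop body runs once per interval until stop is requested.
        while not stop_event.wait(interval_seconds):
            try:
                send_logs()
            except Exception as exc:  # keep the thread alive on upload errors
                print("log upload failed: %r" % (exc,))

    thread = threading.Thread(target=worker, name="log-uploader")
    thread.daemon = True  # a daemon thread does not keep the process alive
    thread.start()
    return thread, stop_event
```

Because the thread lives inside the main process, it stops whenever that process stops, however it is killed. Daemon threads are stopped abruptly at interpreter shutdown, so an upload in progress at that moment may be cut off.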

Related

Communication between UI and Core on windows machine

I am working on a GUI-based application that is developed using Python and Go. I am using Python (+Kivy) to implement the UI and Go to implement the middleware/core on Windows.
My problem statement is:
1) I want to run the exe of the core on launching the application, and it should remain in the background until my application is closed.
2) When an event is triggered from the application, a command is sent to the core, which in turn executes the command on the remote device and returns the result of the command execution.
I want to know how I can control the lifetime of the exe and how I can establish communication between the UI and the core.
Any ideas?
There are many ways you can tackle this but what I would recommend is having one of the parts (GUI/Core) as the main application that does all of the initializations and starts the other part. I would recommend using the core for this.
Here's a sample architecture you can use, though the architecture you choose is highly dependent on the application and your goals.
Core runs first, performing initialization actions including starting the GUI, sets up the communication with the GUI (using pipes, sockets, etc.), then waits for commands from the GUI. If the GUI signals to close, the core can perform whatever cleanup is necessary and then exit. In this scenario, the lifetime of the exe is controlled by the GUI: when the user hits the exit button, the GUI sends a signal to the core to let it know it should exit.
If the core starts the GUI, it can set up the STDIN/STDOUT pipes for it, listening for commands on the GUI's STDOUT and sending results to its STDIN. You can also take the server approach: the core listens on a socket, and the GUI sends requests to it and waits for a response. The server approach allows some concurrency, unlike the serial pipes, but I think it might be slower than the pipes (the difference might be negligible, but it's hard to say without knowing what exactly you're doing).
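The pipe-based variant can be sketched as follows. This is an illustration in Python only (the real core is written in Go), with a tiny stand-in child playing the role of the GUI and a made-up line-per-message protocol; `run_core`, `handle_command`, and `CHILD_SOURCE` are hypothetical names.

```python
import subprocess
import sys

# Stand-in for the GUI process: writes one command per line on stdout and
# reads one reply per line from stdin. A real GUI would do this from its
# event handlers.
CHILD_SOURCE = r"""
import sys
for command in ("status", "exit"):
    sys.stdout.write(command + "\n")
    sys.stdout.flush()
    sys.stdin.readline()  # wait for the core's reply
"""

def run_core(handle_command):
    """Start the child, answer each command it sends, return the commands seen."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text-mode pipes
    )
    seen = []
    # readline() returns "" at end of file, which ends the loop.
    for line in iter(child.stdout.readline, ""):
        command = line.strip()
        seen.append(command)
        child.stdin.write(handle_command(command) + "\n")
        child.stdin.flush()       # without a flush the child may wait forever
        if command == "exit":     # the GUI asked the core to shut down
            break
    child.stdin.close()
    child.wait()
    child.stdout.close()
    return seen
```

For example, `run_core(lambda c: "ok:" + c)` answers every command with an `ok:` reply. Both sides must flush after each message; otherwise each end can block waiting for data still sitting in the other's buffer.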

Best practice: Monitor processes

I was wondering what the best practice solution would be to constantly monitor and restart processes, because there are multiple ways of doing it.
Additional info:
I have a unix program which uses multiple processes to work. There's a main process, it always starts first and is not likely to die or terminate without stopping the program.
Then I spawn multiple "module" processes, which take care of some work and communicate through the main process. Those modules sometimes die because of exceptions, and because it's an external program I can't resolve the issues, so I have to restart them if they die.
I've made a program to check if any of the modules died and restart them, but I need to run it manually. My program checks if the pid files of the modules exist and if they listen on a specific tcp port. If the pid file doesn't exist or the socket can't establish connection, it restarts the module.
My thoughts so far:
Cron job to run the checks every minute and restart any dead modules. (kind of an overkill, because they don't die that frequently)
Daemon running in the background, which starts the modules and receives notifications if they die, so it doesn't have to check them constantly. (SIGCHLD signal, os.wait)
If I use the daemon method, how should I communicate with the daemon through my interface? (socket, or maybe a file which gets read if the daemon receives a specific signal)
Usually I would just go with the daemon because it seems to be the best practice method to restart the modules asap(cron only runs once a minute), but I've wanted to get some opinions from more experienced users. (I've never done something like this before, and asking doesn't hurt anyone :D)
I apologize if these questions are answered somewhere else, but I couldn't find any related question.
P.S. If I forgot something or you need more info, please feel free to ask. :)
I would investigate running the monitoring process as part of a dedicated monitoring framework. Monit is one example, however there are of course others.
This has the advantage of providing additional features which might be useful, such as email alerts and analytics. In my experience, you should be able to use your existing program without too much modification, and Monit itself uses few system resources if that is a concern.
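For comparison, the daemon idea from the question (start the modules, notice when one dies, restart it) can be sketched in a few lines of Python. This is only an illustration: it polls with `Popen.poll()` instead of handling SIGCHLD, the `supervise` function and its parameters are hypothetical, and a framework such as Monit does considerably more.

```python
import subprocess
import time

def supervise(commands, max_restarts=3, poll_interval=0.5):
    """Start each command and restart it when it exits, up to max_restarts.

    commands maps a module name to its argv list (both hypothetical).
    Returns a dict with the number of restarts performed per module.
    """
    procs = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    restarts = {name: 0 for name in commands}
    while procs:
        time.sleep(poll_interval)
        for name, proc in list(procs.items()):
            if proc.poll() is None:
                continue                      # still running
            if restarts[name] < max_restarts:
                restarts[name] += 1
                procs[name] = subprocess.Popen(commands[name])
            else:
                del procs[name]               # give up on this module
    return restarts
```

The loop only returns once every module has used up its restarts, so with long-running modules it effectively runs for as long as the modules do.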

How to detect unresponsive/frozen processes?

I have several scripts that I use to do some web crawling. They are always running, and should never stop. However, after about a week, they systematically "freeze": there is no output anymore, no response to Ctrl+C or anything. The only way is to kill the process and restart it.
I suspect that these issues come from the library I use for retrieving the data (urllib2), but the issue is very hard to reproduce.
I am thus wondering how I could check the state of the process and kill/restart it automatically if it is frozen. I was thinking of creating a PID file, and update it regularly. Another script could then periodically check the last modification date of this PID file, and restart the process if it's too old. I could use something like Monit to do the monitoring.
Is this how I should do it? Is there another best practice/common way for checking the responsiveness of a process?
If you have a process that is always running, has no connected terminal, and is the process group leader - that is a daemon. You undoubtedly know all that.
There are some de facto practices in coding programs like that. One is to have a signal handler which takes SIGHUP and forces the program to reinitialize itself. This means closing all of the open log files, rereading config scripts, etc. I do not know how applicable that is to your problem, but it sometimes solves issues like frozen daemons at my work.
You can customize the idea by employing the SIGUSR1 and SIGUSR2 signals to do special things, like writing status to a file, or anything else. Since signals come in on an interrupt, the trap statement in scripts and signal handlers in Python itself will push program state onto the interrupt stack and do "stuff".
In your case you may want the program to fork/exec itself and then kill the parent.
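A minimal sketch of the signal-handler idea in Python, Unix only. The status file path, the `state` dictionary, and the handler names are assumptions made for illustration; what "reinitialize" means depends on the program.

```python
import os
import signal
import tempfile
import time

# Hypothetical status file location; pick a path that suits the deployment.
STATUS_PATH = os.path.join(tempfile.gettempdir(), "crawler.status")
state = {"pages_fetched": 0, "reloads": 0}

def reinitialize(signum, frame):
    """SIGHUP: placeholder for reopening logs and rereading config."""
    state["reloads"] += 1

def write_status(signum, frame):
    """SIGUSR1: dump the current state so an operator can inspect it."""
    with open(STATUS_PATH, "w") as fh:
        fh.write("time=%d pages_fetched=%d reloads=%d\n"
                 % (time.time(), state["pages_fetched"], state["reloads"]))

def install_handlers():
    # SIGHUP and SIGUSR1 exist on Unix only, not on Windows.
    signal.signal(signal.SIGHUP, reinitialize)
    signal.signal(signal.SIGUSR1, write_status)
```

After `install_handlers()`, running `kill -USR1 <pid>` from a shell writes the status file. Python runs these handlers only when the interpreter regains control in the main thread, so a process stuck inside a blocking C call will not respond to them either; in that case an external watchdog is still needed.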

How do I run long term (infinite) Python processes?

I've recently started experimenting with using Python for web development. So far I've had some success using Apache with mod_wsgi and the Django web framework for Python 2.7. However I have run into some issues with having processes constantly running, updating information and such.
I have written a script I call "daemonManager.py" that can start and stop all or individual python update loops (Should I call them Daemons?). It does that by forking, then loading the module for the specific functions it should run and starting an infinite loop. It saves a PID file in /var/run to keep track of the process. So far so good. The problems I've encountered are:
Now and then one of the processes will just quit. I check ps in the morning and the process is just gone. No errors were logged (I'm using the logging module), and I'm covering every exception I can think of and logging them. Also, I don't think these quitting processes have anything to do with my code, because all my processes run completely different code and exit at pretty similar intervals. I could be wrong of course. Is it normal for Python processes to just die after they've run for days/weeks? How should I tackle this problem? Should I write another daemon that periodically checks if the other daemons are still running? What if that daemon stops? I'm at a loss on how to handle this.
How can I programmatically know if a process is still running or not? I'm saving the PID files in /var/run and checking if the PID file is there to determine whether or not the process is running. But if the process just dies of unexpected causes, the PID file will remain. I therefore have to delete these files every time a process crashes (a couple of times per week), which sort of defeats the purpose. I guess I could check if a process is running at the PID in the file, but what if another process has started and was assigned the PID of the dead process? My daemon would think that the process is running fine even if it's long dead. Again I'm at a loss just how to deal with this.
I will accept any useful answer on how best to run infinite Python processes, hopefully one that also sheds some light on the above problems.
I'm using Apache 2.2.14 on an Ubuntu machine.
My Python version is 2.7.2
I'll open by stating that this is one way to manage a long running process (LRP) -- not de facto by any stretch.
In my experience, the best possible product comes from concentrating on the specific problem you're dealing with, while delegating supporting tech to other libraries. In this case, I'm referring to the act of backgrounding processes (the art of the double fork), monitoring, and log redirection.
My favorite solution is http://supervisord.org/
Using a system like supervisord, you basically write a conventional python script that performs a task while stuck in an "infinite" loop.
#!/usr/bin/python
import sys
import time

def main_loop():
    while 1:
        # do your stuff...
        time.sleep(0.1)

if __name__ == '__main__':
    try:
        main_loop()
    except KeyboardInterrupt:
        # sys.stderr.write works on both Python 2 and Python 3
        sys.stderr.write('\nExiting by user request.\n')
        sys.exit(0)
Writing your script this way makes it simple and convenient to develop and debug (you can easily start/stop it in a terminal, watching the log output as events unfold). When it comes time to throw it into production, you simply define a supervisor config that calls your script (here's the full example for defining a "program", much of which is optional: http://supervisord.org/configuration.html#program-x-section-example).
Supervisor has a bunch of configuration options so I won't enumerate them, but I will say that it specifically solves the problems you describe:
Backgrounding/Daemonizing
PID tracking (can be configured to restart a process should it terminate unexpectedly)
Log normally in your script (stream handler if using logging module rather than printing) but let supervisor redirect to a file for you.
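The logging point above can be sketched like this: the script logs to a stream (stderr by default) with the standard logging module and leaves file handling to the supervisor. The logger name, the `do_work` callable, and the `iterations` parameter are hypothetical, added only to make the sketch testable.

```python
import logging
import sys
import time

def configure_logging(stream=None):
    """Log to a stream (stderr by default); the supervisor redirects it to a file."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logger = logging.getLogger("worker")   # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger

def main_loop(logger, do_work, iterations=None, delay=0.1):
    """Run do_work() repeatedly; iterations=None means loop forever."""
    count = 0
    while iterations is None or count < iterations:
        try:
            do_work()
        except Exception:
            # Log the traceback and keep going instead of dying silently.
            logger.exception("work item failed")
        count += 1
        time.sleep(delay)
    return count
```

Catching and logging exceptions inside the loop means an unexpected error leaves a traceback in the log rather than a process that has simply vanished.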
You should consider Python processes as able to run "forever" assuming you don't have any memory leaks in your program, the Python interpreter, or any of the Python libraries / modules that you are using. (Even in the face of memory leaks, you might be able to run forever if you have sufficient swap space on a 64-bit machine. Decades, if not centuries, should be doable. I've had Python processes survive just fine for nearly two years on limited hardware -- before the hardware needed to be moved.)
Ensuring programs restart when they die used to be very simple back when Linux distributions used SysV-style init -- you just add a new line to the /etc/inittab and init(8) would spawn your program at boot and re-spawn it if it dies. (I know of no mechanism to replicate this functionality with the new upstart init-replacement that many distributions are using these days. I'm not saying it is impossible, I just don't know how to do it.)
But even the init(8) mechanism of years gone by wasn't as flexible as some would have liked. The daemontools package by DJB is one example of process control-and-monitoring tools intended to keep daemons living forever. The Linux-HA suite provides another similar tool, though it might provide too much "extra" functionality to be justified for this task. monit is another option.
I assume you are running Unix/Linux but you don't really say. I have no direct advice on your issue. So I don't expect to be the "right" answer to this question. But there is something to explore here.
First, if your daemons are crashing, you should fix that. Only programs with bugs should crash. Perhaps you should launch them under a debugger and see what happens when they crash (if that's possible). Do you have any trace logging in these processes? If not, add them. That might help diagnose your crash.
Second, are your daemons providing services (opening pipes and waiting for requests) or are they performing periodic cleanup? If they are periodic cleanup processes, you should use cron to launch them periodically rather than have them run in an infinite loop. Cron processes should be preferred over daemon processes. Similarly, if they are services that open ports and service requests, have you considered making them work with INETD? Again, a single daemon (inetd) should be preferred to a bunch of daemon processes.
Third, saving a PID in a file is not very effective, as you've discovered. Perhaps a shared IPC, like a semaphore, would work better. I don't have any details here though.
Fourth, sometimes I need stuff to run in the context of the website. I use a cron process that calls wget with a maintenance URL. You set a special cookie and include the cookie info in the wget command line. If the special cookie doesn't exist, return 403 rather than performing the maintenance process. The other benefit here is that logging in to the database and other environmental concerns are avoided, since the code that serves normal web pages is also serving the maintenance process.
Hope that gives you ideas. I think avoiding daemons if you can is the best place to start. If you can run your python within mod_wsgi that saves you having to support multiple "environments". Debugging a process that fails after running for days at a time is just brutal.
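On the PID-file point above: a stale file can at least be detected by probing the recorded PID with signal 0, which checks for existence without delivering a signal. A minimal sketch, Unix only, with hypothetical function names. It does not solve PID reuse: if another process has been given the same PID, the check still reports it as running.

```python
import errno
import os

def pid_is_running(pid):
    """Return True if a process with this PID exists (Unix only)."""
    if pid <= 0:
        return False             # 0 and negatives address process groups
    try:
        os.kill(pid, 0)          # signal 0: existence/permission check only
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False         # no such process
        if exc.errno == errno.EPERM:
            return True          # exists, but owned by another user
        raise
    return True

def read_live_pid(pidfile):
    """Return the PID stored in pidfile if that process exists, else None."""
    try:
        with open(pidfile) as fh:
            pid = int(fh.read().strip())
    except (IOError, OSError, ValueError):
        return None              # missing or malformed PID file
    return pid if pid_is_running(pid) else None
```

A manager script could call `read_live_pid()` before starting a daemon and delete the PID file when it returns `None`.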

How do I execute two programs from python at the same time?

This post explains how to launch a single external program from Python
How should I launch multiple programs (or threads) at the same time?
My intended application is a video slide show. I want to launch an image sequence player and a music player at the same time.
Thanks in advance
subprocess.Popen doesn't block unless you explicitly ask it to by calling communicate on the returned object, so you can call it more than once to start more than one process.
If you do need to communicate with both sub-processes simultaneously (read their STDOUT, for instance), then invoke subprocess.Popen in separate threads. Each thread can manage a sub-process and communicate with it. Naturally, this leaves you to do all the synchronization but that highly depends on your specific application.
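Both points can be sketched together: start all processes first, then let one thread per process collect its output. The `run_all` helper is a hypothetical name; the actual player commands would replace the argv lists passed in.

```python
import subprocess
import threading

def run_all(commands):
    """Start every command at once, then collect (returncode, stdout) for each."""
    # Popen returns immediately, so every process is already running once
    # this list has been built.
    procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE,
                              universal_newlines=True) for cmd in commands]
    results = [None] * len(procs)

    def collect(index, proc):
        out, _ = proc.communicate()      # blocks only this thread
        results[index] = (proc.returncode, out)

    threads = [threading.Thread(target=collect, args=(i, p))
               for i, p in enumerate(procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                         # wait until every process has finished
    return results
```

If the output of the programs is not needed, as with two media players, calling `subprocess.Popen(cmd)` once per program without the threads is enough.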
