I'm working on a project that spins off several long-running workers as processes. Child workers catch SIGINT and clean up after themselves - based on my research, this is considered a best practice, and works as expected when terminating scripts.
I am actively developing this project, which means that I am regularly testing changes in the interpreter. When I'm working in an interpreter, I often hit CTRL+C to clear currently written text and get a fresh prompt. Unfortunately, if I do this while a subprocess is running, SIGINT is sent to that worker, causing it to terminate.
Is there a solution to this problem other than "never hit CTRL+C in your interpreter"?
One option is to set a variable (e.g. environment variable, commandline option) when debugging.
Related
I am working on a basic crawler which crawls 5 websites concurrently using threads.
For each site it creates a new thread. When I run the program from the shell then the output log indicates that all the 5 threads run as expected.
But when I run this program as a supervisord program then the log indicates that only 2 threads are being run everytime! The log indicates that the all the 5 threads have started but only the same two of them are being executed and the rest get stuck.
I cannot understand why this inconsistency is happening when it is run from a shell and when it run from supervisor. Is there something I am not taking into account?
Here is the code which creates the threads:
for sid in entries:
url = entries[sid]
threading.Thread(target=self.crawl_loop, \
args=(sid, url)).start()
UPDATES:
As suggested by tdelaney in the comments, I changed the working directory in the supervisord configuration and now all the threads are being run as expected. Though I still don't understand that why setting the working directory to the crawler file directory rectifies the issue. Perhaps some one who knows about how supervisor manages processes can explain?
AFAIK python threads can't do threads properly because it is not thread safe. It just gives you a facility to simulate simultaneous run of the code. Your code will still use 1 core only.
https://wiki.python.org/moin/GlobalInterpreterLock
https://en.wikibooks.org/wiki/Python_Programming/Threading
Therefore it is possible that it does not spawn more processes/threads.
You should use multiprocessing I think?
https://docs.python.org/2/library/multiprocessing.html
I was having the same silent problem, but then realised that I was setting daemon to true, which was causing supervisor problems.
https://docs.python.org/2/library/threading.html#threading.Thread.daemon
So the answer is, daemon = true when running the script yourself, false when running under supervisor.
Just to say, I was just experiencing a very similar problem.
In my case, I was working on a low powered machine (RaspberryPi), with threads that were dedicated to listening to a serial device (an Arduino nano on /dev/ttyUSB0). Code worked perfectly on the command line - but the serial reading thread stalled under supervisor.
After a bit of hacking around (and trying all of the options here), I tried running python in unbuffered mode and managed to solve the issue! I got the idea from https://stackoverflow.com/a/17961520/741316.
In essence, I simply invoked python with the -u flag.
I have a python script I use at work that checks the contents of a webpage and reports to me the changes. It used to run fine checking the site every 5 minutes and then our company switched some security software. The script will still run but will stop after an hour or so. Its not consistent, it will sometimes be a few hours but about an hour seems average. There are no errors raised that are reported in the shell. Is there a way to have this re-started automatically? The code is below, it used to just call the function and then a sleep command, but I added the for loop and the print line for debugging to see what time it is stopping.
import time
import datetime
import txtWip
while True:
txtWip.main()
for i in range(1, 300,100):
current_time = time.strftime(r"%d.%m.%Y %H:%M:%S", time.localtime())
print(current_time)
time.sleep(100)
What you want is for your program to run as a daemon[1]: a background process that no longer responds to ordinary SIGKILL or even SIGHUP. You also want this daemon to restart itself on termination - effectively making it run forever.
Rather than write your own daemon script, it's far easier to do one of the following:
If on Linux, use Upstart.
This is a replacement for the init.d daemon that supervises all processes while your machine running. It is capable of respawning a process in the event of an unexpected crash - see an example here, and a Python-specific example here. It is the gold standard for such tasks on this platform.
For certain Ubuntu releases, systemd is the prodigal son and should be used instead.
An alternative with an external bash script that doesn't require messing around with upstart is also a possibility.
If on Windows, use ReStartMe
If on Mac, install and configure runit appropriately
[1] The following is not cross-platform advice. You may or may not actually need to run this as a true daemon. A simple background job (i.e. invoked by appending an & when you run python <file_name>.py to the end) should be sufficient - this will quit when the terminal you ran it in quits, but you can get around this by using the Linux utility screen.
Sometimes I run many instances of a python script simultaneously. To do it anagrammatically I use tmux (a terminal multiplexer), and when I fill I'm done, or I when I have to fix something, then I kill the tmux session instead of exiting each of the (up to 100) script manually.
Killing the tmux session actually kills the bash processes which are parents of the python processes that were executed from them. If I understand correctly, it means a SIGHUP signal is sent to all of the python processes.
It cleans everything quite quickly - memory is freed (it seems), cpu is freed, sockets are closed and apparently ports are freed. The advantage is that it is a much quicker and simpler task than exiting each of the scripts.
My question is: are there any possible consequences to such a habit? If I don't care about the output of the script itself - may it cause any other damage, such as making the OS dirtier, heavier, etc? Is there a better practice?
The SIGHUP handler is called. If no SIGHUP handler is installed, then the default action as shown by the signal(7) man page is invoked.
To be certain that your scripts close all files, release all resources, etc., install a SIGHUP handler that performs the appropriate actions.
I have a python script that continuously process new data and writes to a mongodb. In the script, its a while loop and a sleep that runs the code continuously.
What is the recommended way to run the Python script forever, logging errors when they occur, and restarting when it crashes?
Will node.js's forever be suitable? I'm also running node/meteor on the same Ubuntu server.
supervisord is perfect for this sort of thing. While I used to check that programs were still running every couple of minutes with a cron job, supervisord runs all programs in an in-process thread, so in the event your program terminates, supervisord will automatically restart the process. I no longer need to parse the output of ps to see if a program crashed.
It has a simple declaritive config file and configurable logging. By default it creates a log file for your-program-name-stderr.log your-program-name-stdout.log which are automatically handled by logrotate when supervisord is installed from an OS package manager (Debian for me).
If you don't want to configure supervisord's logging, you should look at logging in python so you can control what goes into those files.
if you're on a debian derivative you should be able to install and start the daemon simply by executing apt-get install supervisord as root.
The config file is very straightforward too:
[program:myprogram]
command=/path/to/my/program/script
directory=/path/to/my/program/base
user=myuser
autostart=true
autorestart=true
redirect_stderr=True
supervisorctl also allows you to see what your program is doing interactively and can start and stop multiple programs with supervisorctl start myprogram etc
Recently wrote something similar. The basic pattern I follow is
while True:
try:
#functionality
except SpecificError:
#log exception
except: #catch everything else
finally:
time.sleep(600)
to handle reboots you can use init.d or cron jobs.
If you are writing a daemon, you should probably do it with this command:
http://manpages.ubuntu.com/manpages/lucid/man8/start-stop-daemon.8.html
You can spawn this from a System V /etc/init.d/ script, or use Upstart which is slowly replacing it.
Upstart: http://upstart.ubuntu.com/getting-started.html
System V: http://www.cyberciti.biz/tips/linux-write-sys-v-init-script-to-start-stop-service.html
I find System V easier to write, but if this will ever be packaged and distributed in a debian file, I recommend writing an Upstart conf.
Definitely keep the sleep so it won't keep a grip on CPU load.
I don't know if this is still relevant to you, but I have been reading forever about how to do this and want to share somewhere what I did.
For me, the goal was to have a python script running always (on my Linux computer). The python script also has a "while True " loop in it which should theoretically run forever, but if it for any reason I cannot think of would crash, I want the script to restart. Also, when I restart the computer it should run the script.
I am not an expert but for me the best and most understandable was to use systemd (assuming you use Linux).
There are two nice examples of how to do this given here and here, showing how to write your .service files in either /etc/systemd/system or /lib/systemd/system. If you want to be completely correct you should take the former:
" /etc/systemd/system/: units installed by the system administrator" 1
The documentation of systemd here is actually nice to read, even if you are not an expert.
Hope this helps someone!
I have a multiprocessed python application which is being run as an EXE on windows. Upon selecting to shutdown the operating system the applications throws a number of exceptions as a result of the processes being shutdown.
Is there a way to capture the system shutdown request by windows so I may handle the closure of the multiprocesses myself?
A nabble.com page suggests using win32api.SetConsoleCtrlHandler:
“I need to do something when windows shuts down, as when someone presses the power button. I believe this is a window message, WM_QUERYENDSESSION or WM_ENDSESSION. I can't find any way to trap this in python. atexit() does not work. Using the signal module to trap SIGBREAK or SIGTERM does not work either.”
You might be able to use win32api.SetConsoleCtrlHandler and catch the CTRL_SHUTDOWN_EVENT that's sent to the console.
Also see Python windows shutdown events, which says, “When using win32api.setConsoleCtrlHandler() I'm able to receive shutdown/logoff/etc events from Windows, and cleanly shut down my app” etc.