Efficient Python Daemon - python

I was curious how you can run a python script in the background, repeating a task every 60 seconds. I know you can put something in the background using &, but is that effective for this case?
I was thinking of doing a loop, having it wait 60 s and then loop again, but something feels off about that.

Rather than writing your own daemon, use python-daemon instead! python-daemon implements the well-behaved daemon specification of PEP 3143, "Standard daemon process library".
I have included example code based on the accepted answer to this question, and even though the code looks almost identical, it has an important fundamental difference. Without python-daemon you would have to use & to put your process in the background and nohup to keep your process from getting killed when you exit your shell. Instead, this will automatically detach from your terminal when you run the program.
For example:
import daemon
import time

def do_something():
    while True:
        with open("/tmp/current_time.txt", "w") as f:
            f.write("The time is now " + time.ctime())
        time.sleep(5)

def run():
    with daemon.DaemonContext():
        do_something()

if __name__ == "__main__":
    run()
To actually run it:
python background_test.py
And note the absence of & here.
Also, this other stackoverflow answer explains in detail the many benefits of using python-daemon.

I think your idea is pretty much exactly what you want. For example:
import time

def do_something():
    with open("/tmp/current_time.txt", "w") as f:
        f.write("The time is now " + time.ctime())

def run():
    while True:
        time.sleep(60)
        do_something()

if __name__ == "__main__":
    run()
The call to time.sleep(60) will put your program to sleep for 60 seconds. When that time is up, the OS will wake your program up and run the do_something() function, then put it back to sleep. While your program is sleeping, it is doing nothing, very efficiently. This is a general pattern for writing background services.
To actually run this from the command line, you can use &:
$ python background_test.py &
When doing this, any output from the script will go to the same terminal as the one you started it from. You can redirect output to avoid this:
$ python background_test.py >stdout.txt 2>stderr.txt &
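If do_something() takes a noticeable amount of time and you want the runs to stay roughly 60 seconds apart, one variation (just a sketch, not part of the original answer; it reuses the do_something() defined above) is to subtract the elapsed time from the sleep:
import time

INTERVAL = 60

def run():
    while True:
        started = time.time()
        do_something()
        # Sleep only for whatever is left of the interval, never a negative amount.
        time.sleep(max(0.0, INTERVAL - (time.time() - started)))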

Using & in the shell is probably the simplest way, as Greg described.
If you really want to create a powerful Daemon though, you will need to look into the os.fork() command.
The example from Wikipedia:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os, time

def createDaemon():
    """
    This function creates a service/daemon that will execute a given task.
    """
    try:
        # Store the fork PID
        pid = os.fork()
        if pid > 0:
            print 'PID: %d' % pid
            os._exit(0)
    except OSError, error:
        print 'Unable to fork. Error: %d (%s)' % (error.errno, error.strerror)
        os._exit(1)
    doTask()

def doTask():
    """
    This function runs the task that becomes the daemon.
    """
    # Open the file in write mode
    file = open('/tmp/tarefa.log', 'w')
    # Start writing
    while True:
        print >> file, time.ctime()
        file.flush()
        time.sleep(2)
    # Close the file (never reached, since the loop above runs forever)
    file.close()

if __name__ == '__main__':
    # Create the daemon
    createDaemon()
And then you could put whatever task you needed inside the doTask() block.
You wouldn't need to launch this using &, and it would allow you to customize the execution a little further.

Related

How to restart a Python script?

In a program I am writing in Python I need to completely restart the program if a variable becomes true. After looking around for a while, I found this command:
while True:
    if reboot == True:
        os.execv(sys.argv[0], sys.argv)
When executed it returns the error [Errno 8] Exec format error. I searched for further documentation on os.execv but didn't find anything relevant, so my question is whether anyone knows what I did wrong or knows a better way to restart a script (by restarting I mean completely re-running the script, as if it were being opened for the first time, so with all variables unassigned and no threads running).
There are multiple ways to achieve the same thing. Start by modifying the program to exit whenever the flag turns True. Then there are various options, each one with its advantages and disadvantages.
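A minimal sketch of that first step (needs_restart() is just a placeholder for whatever condition sets your flag):
import sys

def needs_restart():
    # Placeholder: replace with the real check that decides when to restart.
    return False

def main():
    while True:
        # ... the program's normal work goes here ...
        if needs_restart():
            # Exit cleanly and let the wrapper (one of the options below) start the script again.
            sys.exit(0)

if __name__ == '__main__':
    main()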
Wrap it using a bash script.
The script should handle exits and restart your program. A really basic version could be:
#!/bin/bash
while :
do
    python program.py
    sleep 1
done
Start the program as a sub-process of another program.
Start by wrapping your program's code in a function. Then your __main__ could look like this:
from multiprocessing import Process

def program():
    ### Here is the code of your program
    ...

if __name__ == '__main__':
    # The guard keeps the restart loop from re-running when multiprocessing
    # spawns the child process (e.g. on Windows).
    while True:
        process = Process(target=program)
        process.start()
        process.join()
        print("Restarting...")
This code is relatively basic, and it requires error handling to be implemented.
Use a process manager
There are a lot of tools available that can monitor the process, run multiple processes in parallel and automatically restart stopped processes. It's worth having a look at PM2 or similar.
IMHO the third option (a process manager) looks like the safest approach. The other approaches have edge cases that you would have to handle yourself.
This has worked for me. Add the shebang at the top of your code and use os.execv() as shown below:
#!/usr/bin/env python3
import os
import sys

if __name__ == '__main__':
    while True:
        reboot = input('Enter:')
        if reboot == '1':
            sys.stdout.flush()
            os.execv(sys.executable, [sys.executable, __file__] + [sys.argv[0]])
        else:
            print('OLD')
I got the same "Exec format error", and I believe it is basically the same error you get when you simply type a python script name at the command prompt and expect it to execute. On Linux it won't work because a path is required, and the execv method is encountering essentially the same error.
You could add the path of your Python interpreter, and that error goes away, except that the name of your script then becomes a parameter and must be added to the argv list. To avoid that, make your script independently executable by adding "#!/usr/bin/python3" to the top of the script AND chmod 755.
This works for me:
#!/usr/bin/python3
# this script is called foo.py
import os
import sys
import time

if len(sys.argv) >= 2:
    Arg1 = int(sys.argv[1])
else:
    sys.argv.append(None)
    Arg1 = 1

print(f"Arg1: {Arg1}")
sys.argv[1] = str(Arg1 + 1)
time.sleep(3)
os.execv("./foo.py", sys.argv)
Output:
Arg1: 1
Arg1: 2
Arg1: 3
.
.
.

Run Flask app inside of another flask app [duplicate]

I'm trying to make a non blocking subprocess call to run a slave.py script from my main.py program. I need to pass args from main.py to slave.py once when it(slave.py) is first started via subprocess.call after this slave.py runs for a period of time then exits.
main.py
for insert, (list) in enumerate(list, start=1):
    sys.args = [list]
    subprocess.call(["python", "slave.py", sys.args], shell=True)
{loop through program and do more stuff..}
And my slave script
slave.py
print sys.args
while True:
    {do stuff with args in loop till finished}
    time.sleep(30)
Currently, slave.py blocks main.py from running the rest of its tasks, I simply want slave.py to be independent of main.py, once I've passed args to it. The two scripts no longer need to communicate.
I've found a few posts on the net about non blocking subprocess.call but most of them are centered on requiring communication with slave.py at some-point which I currently do not need. Would anyone know how to implement this in a simple fashion...?
You should use subprocess.Popen instead of subprocess.call.
Something like:
subprocess.Popen(["python", "slave.py"] + sys.argv[1:])
From the docs on subprocess.call:
Run the command described by args. Wait for command to complete, then return the returncode attribute.
(Also don't use a list to pass in the arguments if you're going to use shell = True).
Here's an MCVE1 example that demonstrates a non-blocking subprocess call:
import subprocess
import time

p = subprocess.Popen(['sleep', '5'])
while p.poll() is None:
    print('Still sleeping')
    time.sleep(1)
print('Not sleeping any longer. Exited with returncode %d' % p.returncode)
An alternative approach that relies on more recent changes to the python language to allow for co-routine based parallelism is:
# python3.5 required but could be modified to work with python3.4.
import asyncio

async def do_subprocess():
    print('Subprocess sleeping')
    proc = await asyncio.create_subprocess_exec('sleep', '5')
    returncode = await proc.wait()
    print('Subprocess done sleeping. Return code = %d' % returncode)

async def sleep_report(number):
    for i in range(number + 1):
        print('Slept for %d seconds' % i)
        await asyncio.sleep(1)

loop = asyncio.get_event_loop()
tasks = [
    asyncio.ensure_future(do_subprocess()),
    asyncio.ensure_future(sleep_report(5)),
]
loop.run_until_complete(asyncio.gather(*tasks))
loop.close()
1 Tested on OS X using python2.7 & python3.6
There are three levels of thoroughness here.
As mgilson says, if you just swap out subprocess.call for subprocess.Popen, keeping everything else the same, then main.py will not wait for slave.py to finish before it continues. That may be enough by itself.
If you care about zombie processes hanging around, you should save the object returned from subprocess.Popen and at some later point call its wait method. (The zombies will automatically go away when main.py exits, so this is only a serious problem if main.py runs for a very long time and/or might create many subprocesses.)
And finally, if you don't want a zombie but you also don't want to decide where to do the waiting (this might be appropriate if both processes run for a long and unpredictable time afterward), use the python-daemon library to have the slave disassociate itself from the master -- in that case you can continue using subprocess.call in the master.
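A minimal sketch of the middle option (the script name and argument are placeholders; the key point is keeping the Popen handle so you can reap the child later):
import sys
import subprocess

# Launch slave.py without waiting for it; keep the Popen handle around.
slave = subprocess.Popen([sys.executable, "slave.py", "some-arg"])

# ... main.py carries on with the rest of its work here ...

# Later, reap the child so it doesn't linger as a zombie.
slave.wait()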
For Python 3.8.x
import shlex
import subprocess
cmd = "<full filepath plus arguments of child process>"
cmds = shlex.split(cmd)
p = subprocess.Popen(cmds, start_new_session=True)
This will allow the parent process to exit while the child process continues to run. Not sure about zombies.
Tested on Python 3.8.1 on macOS 10.15.5
The easiest solution for your non-blocking situation would be to add & at the end of the Popen like this:
subprocess.Popen(["python", "slave.py", " &"])
This does not block the execution of the rest of the program.
If you want to start a function several times with different arguments in a non-blocking way, you can use the ThreadPoolExecutor.
You submit your function calls to the executor like this:
from concurrent.futures import ThreadPoolExecutor

def threadmap(fun, xs):
    with ThreadPoolExecutor(max_workers=8) as executor:
        return list(executor.map(fun, xs))
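Using the threadmap() helper above, a small self-contained usage example (slow_square is just a stand-in for the real work):
import time

def slow_square(x):
    # Stand-in for whatever each call should actually do.
    time.sleep(1)
    return x * x

# All ten calls are spread over up to 8 worker threads; results come back in input order.
print(threadmap(slow_square, range(10)))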

Self Restarting a Python Script

I have created a watchdog timer for my script (Python 3), which allows me to halt execution if anything goes wrong (not shown in code below). However, I would like to have the ability to restart the script automatically using only Python (no external scripts). The code needs to be cross platform compatible.
I have tried subprocess and execv (os.execv(sys.executable, ['python'] + sys.argv)), however I am seeing very weird functionality on Windows. I open the command line, and run the script ("python myscript.py"). The script stops but does not exit (verified through Task Manager), and it will not restart itself unless I press enter twice. I would like it to work automatically.
Any suggestions? Thanks for your help!
import threading
import time
import subprocess
import os
import sys

if __name__ == '__main__':
    print("Starting thread list: " + str(threading.enumerate()))
    for _ in range(3):
        time.sleep(1)
        print("Sleeping")

    ''' Attempt 1 with subprocess.Popen '''
    # child = subprocess.Popen(['python', __file__], shell=True)

    ''' Attempt 2 with os.execv '''
    args = sys.argv[:]
    args.insert(0, sys.executable)
    if sys.platform == 'win32':
        args = ['"%s"' % arg for arg in args]
    os.execv(sys.executable, args)
    sys.exit()
Sounds like you are using threading in your original script, which explains why you can't break out of your original script by simply pressing Ctrl+C. In that case, you might want to add a KeyboardInterrupt exception handler to your script, like this:
from time import sleep

def interrupt_this():
    try:
        while True:
            sleep(0.02)
    except KeyboardInterrupt as ex:
        # handle all exit procedures and data cleaning
        print("[*] Handling all exit procedures...")
After this, you should be able to automatically restart your relevant procedure (even from within the script itself, without any external scripts). Anyway, it's a bit hard to know without seeing the relevant script, so maybe I can be of more help if you share some of it.
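For what it's worth, a restart from within the script itself could then be a small wrapper around the os.execv call from the question (a sketch, not the answerer's exact code):
import os
import sys

def restart_program():
    # Flush pending output, then replace the current process with a fresh
    # interpreter running the same script and arguments.
    sys.stdout.flush()
    os.execv(sys.executable, [sys.executable] + sys.argv)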

How to know if a running script dies?

So I'm somewhat new to programming and mostly self-taught, so sorry if this question is a bit on the novice side.
I have a python script that runs over long periods (e.g. it downloads pages every few seconds for days at a time.) Sort of a monitoring script for a web app.
Every so often, something will disrupt it, and it'll need to be restarted. I've gotten these events down to a bare minimum, but it still happens every few days, and when it does get killed it could be bad news if I don't notice for a few hours.
Right now it's running in a screen session on a VPS.
Could someone point me in the right direction as far as knowing when the script dies / and having it automatically restart?
Would this be something to write in Bash? Or something else? I've never done anything like it before and don't know where to start or even look for information.
You could try supervisord; it's a tool for controlling daemon processes.
You should daemonize your program.
As described in Efficient Python Daemon, you can install and use the python-daemon which implements the well-behaved daemon specification of PEP 3143, "Standard daemon process library".
Create a file mydaemon.py with contents like this:
#!/usr/bin/env python
import daemon
import time
import logging

def do_something():
    name = 'mydaemon'
    logger = logging.getLogger(name)
    handler = logging.FileHandler('/tmp/%s.log' % (name))
    formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    logger.setLevel(logging.WARNING)

    while True:
        try:
            time.sleep(5)
            # This deliberately fails, just to produce an error to log.
            with open("/tmp/file-does-not-exist", "r") as f:
                f.write("The time is now " + time.ctime())
        except Exception, ex:
            logger.error(ex)

def run():
    with daemon.DaemonContext():
        do_something()

if __name__ == "__main__":
    run()
To actually run it use:
python mydaemon.py
Which will spawn do_something() within the DaemonContext and then the script mydaemon.py will exit. You can see the running daemon with: pgrep -fl mydaemon.py. This short example will simply log errors to a log file in /tmp/mydaemon.log. You'll need to kill the daemon manually or it will run indefinitely.
To run your own program, just replace the contents of the try block with a call to your code.
I believe a wrapper bash script that executes the python script inside a loop should do the trick.
while true; do
    # Execute python script here
    echo "Web app monitoring script disrupted ... Restarting script."
done
Hope this helps.
That depends on the kind of failure you want to guard against. If it's just the script crashing, the simplest thing to do would be to wrap your main function in a try/except:
import logging as log

while True:
    try:
        main()
    except:
        log.exception("main() crashed")
If something is killing the Python process, it might be simplest to run it in a shell loop:
while sleep 1; do python checker.py; done
And if it's crashing because the machine is going down… well… Quis custodiet ipsos custodes?
However, to answer your question directly: the absolute simplest way to check if it's running from the shell would be to grep the output of ps:
ps | grep "python checker.py" 2>&1 > /dev/null
running=$?
Of course, this isn't fool-proof, but it's generally Good Enough.

Start background process/daemon from CGI script

I'm trying to launch a background process from a CGI script. Basically, when a form is submitted the CGI script will indicate to the user that his or her request is being processed, while the background script does the actual processing (because the processing tends to take a long time). The problem I'm facing is that Apache won't send the output of the parent CGI script to the browser until the child script terminates.
I've been told by a colleague that what I want to do is impossible because there is no way to prevent Apache from waiting for the entire process tree of a CGI script to die. However, I've also seen numerous references around the web to a "double fork" trick which is supposed to do the job. The trick is described succinctly in this Stack Overflow answer, but I've seen similar code elsewhere.
Here's a short script I wrote to test the double-fork trick in Python:
import os
import sys

if os.fork():
    print 'Content-type: text/html\n\n Done'
    sys.exit(0)
if os.fork():
    os.setsid()
    sys.exit(0)

# Second child
os.chdir("/")
sys.stdout.close()
sys.stderr.close()
sys.stdin.close()
f = open('/tmp/lol.txt', 'w')
while 1:
    f.write('test\n')
If I run this from the shell, it does exactly what I'd expect: the original script and first descendant die, and the second descendant keeps running until it's killed manually. But if I access it through CGI, the page won't load until I kill the second descendant or Apache kills it because of the CGI timeout. I've also tried replacing the second sys.exit(0) with os._exit(0), but there is no difference.
What am I doing wrong?
Don't fork - run batch separately
This double-forking approach is something of a hack, which to me is an indication that it shouldn't be done :). For CGI, anyway. Under the general principle that if something is too hard to accomplish, you are probably approaching it the wrong way.
Luckily you give the background info on what you need - a CGI call to initiate some processing that happens independently, and to return back to the caller. Well sure - there are unix commands that do just that: schedule a command to run at a specific time (at) or whenever the CPU is free (batch). So do this instead:
import os
os.system("batch <<< '/home/some_user/do_the_due.py'")
# or if you don't want to wait for system idle,
# os.system("at now <<< '/home/some_user/do_the_due.py'")
print 'Content-type: text/html\n'
print 'Done!'
And there you have it. Keep in mind that if there is some output to stdout/stderr, it will be mailed to the user (which is good for debugging, but otherwise the script should probably keep quiet).
PS. I just remembered that Windows also has a version of at, so with a minor modification of the invocation you can have this work under Apache on Windows too (vs. the fork trick, which won't work on Windows).
PPS. Make sure the user running the CGI process isn't excluded from scheduling batch jobs in /etc/at.deny.
I think there are two issues: setsid is in the wrong place, and buffered IO operations are being performed in one of the transient children:
if os.fork():
    print "success"
    sys.exit(0)
if os.fork():
    os.setsid()
    sys.exit()
You've got the original process (grandparent, prints "success"), the middle parent, and the grandchild ("lol.txt").
The os.setsid() call is being performed in the middle parent after the grandchild has been spawned. The middle parent can't influence the grandchild's session after the grandchild has been created. Try this:
print "success"
sys.stdout.flush()
if os.fork():
sys.exit(0)
os.setsid()
if os.fork():
sys.exit(0)
This creates a new session before spawning the grandchild. Then the middle parent dies, leaving the session without a process group leader, ensuring that any calls to open a terminal will fail, making sure there's never any blocking on terminal input or output, or sending unexpected signals to the child.
Note that I've also moved the success to the grandparent; there's no guarantee of which child will run first after calling fork(2), and you run the risk that the child would be spawned, and potentially try to write output to standard out or standard error, before the middle parent could have had a chance to write success to the remote client.
In this case, the streams are closed quickly, but still, mixing standard IO streams among multiple processes is bound to give difficulty: keep it all in one process, if you can.
Edit: I've found a strange behavior I can't explain:
#!/usr/bin/python
import os
import sys
import time

print "Content-type: text/plain\r\n\r\npid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
sys.stdout.flush()

if os.fork():
    print "\nfirst fork pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
    sys.exit(0)

os.setsid()
print "\nafter setsid pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
sys.stdout.flush()

if os.fork():
    print "\nsecond fork pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
    sys.exit(0)

#time.sleep(1) # comment me out, uncomment me, notice the following line appear and disappear
print "\nafter second fork pid: " + str(os.getpid()) + "\nppid: " + str(os.getppid())
The last line, "after second fork pid", only appears when the time.sleep(1) call is commented out. When the call is left in place, the last line never appears in the browser. (But otherwise all the content is printed to the browser.)
I wouldn't suggest going about the problem this way. If you need to execute some task asynchronously, why not use a work queue like beanstalkd instead of trying to fork off the tasks from the request? There are client libraries for beanstalkd available for Python.
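As a rough sketch of that idea, assuming the beanstalkc client library (one of several Python clients) and a beanstalkd server on its default host and port:
import beanstalkc

# In the CGI handler: enqueue the work instead of forking it off.
queue = beanstalkc.Connection(host='localhost', port=11300)
queue.put('some description of the job to run')

# In a separate long-running worker process:
# job = queue.reserve()   # blocks until a job is available
# handle(job.body)        # handle() is a placeholder for the real processing
# job.delete()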
I needed to break the stdout as well as the stderr like this:
sys.stdout.flush()
os.close(sys.stdout.fileno())  # Break web pipe
sys.stderr.flush()
os.close(sys.stderr.fileno())  # Break web pipe
if os.fork():  # Get out of the parent process
    sys.exit()
# background processing follows here
OK, I'm adding a simpler solution for the case where you don't need to start another script but want to continue in the same one to do the long process in the background. This will let you give a waiting message instantly seen by the client and continue your server processing even if the client kills the browser session:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys
import time
import datetime

print "Content-Type: text/html;charset=ISO-8859-1\n\n"
print "<html>Please wait...</html>\n"
sys.stdout.flush()
os.close(sys.stdout.fileno())  # Break web pipe

if os.fork():  # Get out of the parent process
    sys.exit()

# Continue with the new child process
time.sleep(1)  # Make sure the parent process reaches its exit command.
os.setsid()  # Become process group leader

# From here on I cannot print to the web server,
# but I can write to other files or run any long process.
f = open('long_process.log', 'a+')
f.write("Starting {0} ...\n".format(datetime.datetime.now()))
f.flush()
time.sleep(15)
f.write("Still working {0} ...\n".format(datetime.datetime.now()))
f.flush()
time.sleep(300)
f.write("Still alive - Apache didn't scalp me!\n")
f.flush()
time.sleep(150)
f.write("Finishing {0} ...\n".format(datetime.datetime.now()))
f.flush()
f.close()
I have read half the Internet for one week without success on this one; finally I tried to test whether there is a difference between sys.stdout.close() and os.close(sys.stdout.fileno()), and there is a huge one: the first didn't do anything, while the second closed the pipe from the web server and completely disconnected from the client. The fork is only necessary because the web server will kill its processes after a while, and your long process probably needs more time to complete.
As other answers have noted, it is tricky to start a persistent process from your CGI script because the process must cleanly dissociate itself from the CGI program. I have found that a great general-purpose program for this is daemon. It takes care of the messy details involving open file handles, process groups, root directory, etc. for you. So the pattern of such a CGI program is:
#!/bin/sh
foo-service-ping || daemon --restart foo-service
# ... followed below by some CGI handler that uses the "foo" service
The original post describes the case where you want your CGI program to return quickly, while spawning off a background process to finish handling that one request. But there is also the case where your web application depends on a running service which must be kept alive. (Other people have talked about using beanstalkd to handle jobs. But how do you ensure that beanstalkd itself is alive?) One way to do this is to restart the service (if it's down) from within the CGI script. This approach makes sense in an environment where you have limited control over the server and can't rely on things like cron or an init.d mechanism.
There are situations where passing work off to a daemon or cron is not appropriate. Sometimes you really DO need to fork, let the parent exit (to keep Apache happy) and let something slow happen in the child.
What worked for me: When done generating web output, and before the fork:
fflush(stdout), close(0), close(1), close(2); // in the process BEFORE YOU FORK
Then fork() and have the parent immediately exit(0);
The child then AGAIN does
close(0), close(1), close(2);
and also a
setsid();
...and then gets on with whatever it needs to do.
Why you need to close them in the child even though they were closed in the primordial process in advance is confusing to me, but this is what worked. It didn't work without the second set of closes. This was on Linux (on a Raspberry Pi).
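Translated into Python, that sequence looks roughly like this (a sketch under the same assumptions, not the poster's exact code):
import os
import sys

def finish_request_and_detach():
    # Flush and close the web server's pipes before forking.
    sys.stdout.flush()
    for fd in (0, 1, 2):
        try:
            os.close(fd)
        except OSError:
            pass

    if os.fork():
        os._exit(0)  # parent: exit immediately so Apache considers the request done

    # child: close the descriptors again (as described above) and start a new session
    for fd in (0, 1, 2):
        try:
            os.close(fd)
        except OSError:
            pass
    os.setsid()
    # ... the slow work goes here ...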
I haven't tried using fork but I have accomplished what you're asking by executing a sys.stdout.flush() after the original message, before calling the background process.
i.e.
print "Please wait..."
sys.stdout.flush()
output = some_processing() # put what you want to accomplish here
print output # in my case output was a redirect to a results page
My head is still hurting from that one. I tried every possible way to use your code with fork and stdout closing, nulling, or anything else, but nothing worked. Whether the output of an uncompleted process is displayed depends on the web server (Apache or other) configuration, and in my case it wasn't an option to change it, so attempts with "Transfer-Encoding: chunked;chunk=CRLF" and sys.stdout.flush() didn't work either. Here is the solution that finally worked.
In short, use something like:
if len(sys.argv) == 1:  # I'm in the parent process
    childProcess = subprocess.Popen('./myScript.py X', bufsize=0, stdin=open("/dev/null", "r"), stdout=open("/dev/null", "w"), stderr=open("/dev/null", "w"), shell=True)
    print "My HTML message that says to wait a long time"
else:  # Here comes the child and its long process
    # From here I cannot print to the web server, but I can write to files that will be refreshed in my web page.
    time.sleep(15)  # To verify that the parent completes rapidly.
I use the "X" parameter to make the distinction between parent and child because I call the same script for both, but you could do it simpler by calling another script. If a complete example would be useful, please ask.
For those who get "sh: 1: Syntax error: redirection unexpected" with the at/batch solution, try something like this:
Make sure that the at command is installed and that the user running the application isn't in /etc/at.deny
os.system("echo sudo /srv/scripts/myapp.py | /usr/bin/at now")
