I am working on a daemon in which I need to embed an HTTP server. I am attempting to do it with BaseHTTPServer; when I run it in the foreground, it works fine, but when I fork the daemon into the background, it stops working. My main application continues to work, but BaseHTTPServer does not.
I believe this has something to do with the fact that BaseHTTPServer sends its log data to STDOUT and STDERR, which I am redirecting to files. Here is the code snippet:
# Start the HTTP Server
server = HTTPServer((config['HTTPServer']['listen'], config['HTTPServer']['port']), HTTPHandler)

# Fork our process to detach if not told to stay in foreground
if options.foreground is False:
    try:
        pid = os.fork()
        if pid > 0:
            logging.info('Parent process ending.')
            sys.exit(0)
    except OSError, e:
        sys.stderr.write("Could not fork: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    # Second fork to put into daemon mode
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent, print eventual PID before
            print 'Daemon has started - PID # %d.' % pid
            logging.info('Child forked as PID # %d' % pid)
            sys.exit(0)
    except OSError, e:
        sys.stderr.write("Could not fork: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    logging.debug('After child fork')

    # Detach from parent environment
    os.chdir('/')
    os.setsid()
    os.umask(0)

    # Close stdin
    sys.stdin.close()

    # Redirect stdout, stderr
    sys.stdout = open('http_access.log', 'w')
    sys.stderr = open('http_errors.log', 'w')

# Main Thread Object for Stats
threads = []

logging.debug('Kicking off threads')

while ...
    lots of code here
    ...

server.serve_forever()
Am I doing something wrong here, or is BaseHTTPServer somehow prevented from being daemonized?
Edit: I've updated the code to show the previously missing code flow; logging.debug output confirms that my forked, background daemon is reaching the code after the fork.
After a bit of googling I finally stumbled over this BaseHTTPServer documentation and after that I ended up with:
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
from SocketServer import ThreadingMixIn

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """Handle requests in a separate thread."""

server = ThreadedHTTPServer((config['HTTPServer']['listen'], config['HTTPServer']['port']), HTTPHandler)
server.serve_forever()
This code for the most part runs after I fork, and it ended up resolving my problem.
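For reference, a minimal sketch (reusing the server instance created above) of running the threaded server on a background thread so the daemon's main loop can keep going:

import threading

# Assumes `server` is the ThreadedHTTPServer instance from the snippet above.
server_thread = threading.Thread(target=server.serve_forever)
server_thread.daemon = True  # do not block process exit on this thread
server_thread.start()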
Here's how to do this with the python-daemon library:
from BaseHTTPServer import (HTTPServer, BaseHTTPRequestHandler)
import contextlib

import daemon

from my_app_config import config

# Make the HTTP Server instance.
server = HTTPServer(
    (config['HTTPServer']['listen'], config['HTTPServer']['port']),
    BaseHTTPRequestHandler)

# Make the context manager for becoming a daemon process.
daemon_context = daemon.DaemonContext()
daemon_context.files_preserve = [server.fileno()]

# Become a daemon process.
with daemon_context:
    server.serve_forever()
As usual for a daemon, you need to decide how you will interact with the program after it becomes a daemon. For example, you might register a systemd service, or write a PID file, etc. That's all outside the scope of the question though.
In particular, it's outside the scope of the question to ask: once it's become a daemon process (necessarily detached from any controlling terminal), how do I stop the daemon process? That's up to you to decide, as part of defining the program's behaviour.
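For the "write a PID file" route, here is a hedged sketch; it assumes python-daemon's daemon.pidfile module provides TimeoutPIDLockFile, and the path is only an example:

import daemon
from daemon.pidfile import TimeoutPIDLockFile

# Assumption: TimeoutPIDLockFile is available under daemon.pidfile in this
# python-daemon version; writing to /var/run requires appropriate permissions.
daemon_context = daemon.DaemonContext(
    pidfile=TimeoutPIDLockFile('/var/run/my_http_daemon.pid'))
daemon_context.files_preserve = [server.fileno()]

with daemon_context:
    server.serve_forever()

The daemon can then be stopped by reading the PID from that file and sending it SIGTERM.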
You start by instantiating an HTTPServer, but you never actually tell it to start serving in any of the supplied code. In your child process, try calling server.serve_forever().
See this for reference.
A simple solution that worked for me was to override the BaseHTTPRequestHandler method log_message(), so that nothing is written to stdout and we avoid problems when daemonizing.
class CustomRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):

    def log_message(self, format, *args):
        pass

    ...
    rest of custom class code
    ...
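A quick usage sketch (the bind address and port are just placeholders):

# Serve with the quiet handler; nothing is logged to stdout/stderr per request.
server = BaseHTTPServer.HTTPServer(('127.0.0.1', 8080), CustomRequestHandler)
server.serve_forever()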
Just use daemontools or some other similar tool instead of rolling your own daemonizing process. It is much better to keep this out of your script.
Also, your best option: don't use BaseHTTPServer. It is really bad. There are many good HTTP servers for Python, e.g. CherryPy or Paste. Both include ready-to-use daemonizing scripts.
Since this has solicited answers since I originally posted, I thought that I'd share a little info.
The issue with the output has to do with the fact that the logging module's default handler is a StreamHandler, which writes to the standard streams. The best way to handle this is to create your own handlers. If you want to keep using the default logging module, you can do something like this:
# Get the root logger
default_logger = logging.getLogger('')

# Add the handler
default_logger.addHandler(myotherhandler)

# Remove the default stream handler (iterate over a copy, since we mutate the list)
for handler in default_logger.handlers[:]:
    if isinstance(handler, logging.StreamHandler):
        default_logger.removeHandler(handler)
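Here myotherhandler stands for whatever non-stream handler you prefer; a file-based sketch (the path is just an example) could be:

import logging

# A file handler keeps working after daemonization because it does not depend
# on the process's original stdout/stderr.
myotherhandler = logging.FileHandler('/var/log/mydaemon.log')
myotherhandler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))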
Also at this point I have moved to using the very nice Tornado project for my embedded http servers.
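For completeness, a minimal Tornado sketch of an embedded HTTP server (the handler name and port are assumptions):

import tornado.ioloop
import tornado.web

class PingHandler(tornado.web.RequestHandler):
    def get(self):
        self.write('ok')

# Build the application, bind a port, and start the IOLoop.
app = tornado.web.Application([(r'/ping', PingHandler)])
app.listen(8888)
tornado.ioloop.IOLoop.current().start()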
I created a service to run a Python program and added some lines of code to create a lock to avoid launching it twice.
Unfortunately I don't know how to configure the service to stop the running program correctly. When running the stop command it doesn't delete the lock, so I can't start the service anymore. If I execute the program myself via the CLI and exit with Ctrl+C, the lock is deleted.
I've read the manual about KillMode, ExecStop and Signal. My understanding is that the default configuration was the one I needed.
Any help please?
Main program
if __name__ == '__main__':

    # Creating lock to avoid launching the program twice
    lock = pathlib.Path("program.lock")
    if not lock.exists():
        lock_acquired_on = datetime.now()
        with open('program.lock', 'w') as lock:
            lock.write(f'Lock acquired on {lock_acquired_on}')
        logger.info('Added lock file to avoid running the program twice.')

        try:
            while True:
                # Doing stuff here
        except KeyboardInterrupt:
            close_program()  # Close other threads
            # Removing the lock file
            os.remove(pathlib.Path("program.lock"))
    else:
        with open('program.lock', 'r') as lock:
            lock_acquisition_time = str(lock.readlines()[0])
        logger.info('Programme Maquette Status is already running.')
        logger.info(lock_acquisition_time)
Service
[Unit]
Description=Programme Maquette IoT End-to-End
After=multi-user.target
Conflicts=getty@tty1.service

[Service]
WorkingDirectory=/home/pi/Documents/ProductionMaquette
Type=simple
ExecStart=/usr/local/bin/python3.8 /home/pi/Documents/ProductionMaquette/Lo_main.py
StandardInput=tty-force

[Install]
WantedBy=multi-user.target
Systemd sends SIGTERM to the process, so you need to handle that.
The following little example uses a signal handler for SIGTERM to initiate a normal shutdown of the process, and atexit to clean up the lock file; atexit also covers standard exit conditions.
import atexit
import os
import signal
import time

locking_file = "/var/lock/my_service.lock"

if __name__ == '__main__':

    def clean_lock():
        # atexit handler to clean up the lock file
        os.remove(locking_file)

    def signal_term_handler(sigNum, frame):
        # on receiving the signal, initiate a normal exit (which runs atexit handlers)
        raise SystemExit('terminating')

    # register the cleanup handler
    atexit.register(clean_lock)
    # register the signal handler for SIGTERM
    signal.signal(signal.SIGTERM, signal_term_handler)

    with open(locking_file, "w") as lock:
        while True:
            lock.write("x")
            time.sleep(10)
As a note: there is a file-locking library you might want to look at, https://pypi.org/project/filelock/, as it should handle this use case as well.
It does not only test for the presence of a file but uses the OS file-locking mechanism, i.e. not only the existence of the file is tested, but also whether it can be locked. In effect that means that even if the file still exists but the previous process died, it is not a problem, as the file is no longer locked.
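A minimal sketch with filelock (the file name and run_main_loop() are hypothetical; a zero timeout makes a second instance bail out immediately):

from filelock import FileLock, Timeout

lock = FileLock("program.lock")
try:
    # Fail fast if another instance already holds the lock.
    with lock.acquire(timeout=0):
        run_main_loop()  # hypothetical main loop of the service
except Timeout:
    print("Another instance is already running.")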
I have been told that logging cannot be used in multiprocessing: you have to do the concurrency control yourself in case multiprocessing messes up the log.
But I did some tests, and it seems there is no problem using logging in multiprocessing:
import time
import logging
from multiprocessing import Process, current_process, pool

# setup log
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%a, %d %b %Y %H:%M:%S',
                    filename='/tmp/test.log',
                    filemode='w')

def func(the_time, logger):
    proc = current_process()
    while True:
        if time.time() >= the_time:
            logger.info('proc name %s id %s' % (proc.name, proc.pid))
            return

if __name__ == '__main__':

    the_time = time.time() + 5

    for x in xrange(1, 10):
        proc = Process(target=func, name=x, args=(the_time, logger))
        proc.start()
As you can see from the code, I deliberately let the subprocesses write log entries at the same moment (5 s after start) to increase the chance of a conflict. But there is no conflict at all.
So my question is: can we use logging in multiprocessing? Why do so many posts say we cannot?
As Matino correctly explained: logging in a multiprocessing setup is not safe, as multiple processes (which do not know anything about the others existing) are writing into the same file, potentially interfering with each other.
Now what happens is that every process holds an open file handle and does an "append write" into that file. The question is under what circumstances the append write is "atomic" (that is, cannot be interrupted by, e.g., another process writing to the same file and intermingling its output). This problem applies to every programming language, as in the end they'll do a syscall to the kernel. This answer explains under which circumstances a shared log file is OK.
It comes down to checking your pipe buffer size, which on Linux is defined in /usr/include/linux/limits.h and is 4096 bytes. For other OSes you can find a good list here.
That means: if your log line is less than 4096 bytes (on Linux), then the append is safe, provided the disk is directly attached (i.e. no network in between). But for more details please check the first link in my answer. To test this you can do logger.info('proc name %s id %s %s' % (proc.name, proc.pid, str(proc.name)*5000)) with different lengths. With 5000, for instance, I already got mixed-up log lines in /tmp/test.log.
In this question there are already quite a few solutions to this, so I won't add my own solution here.
Update: Flask and multiprocessing
Web frameworks like flask will be run in multiple workers if hosted by uwsgi or nginx. In that case, multiple processes may write into one log file. Will it have problems?
Error handling in Flask is done via stdout/stderr, which is then caught by the web server (uwsgi, nginx, etc.), which needs to take care that logs are written in the correct fashion (see e.g. this flask+nginx example), probably also adding process information so you can associate error lines with processes. From Flask's docs:
By default as of Flask 0.11, errors are logged to your webserver’s log automatically. Warnings however are not.
So you'd still have this issue of intermingled log files if you use warn and the message exceeds the pipe buffer size.
It is not safe to write to a single file from multiple processes.
According to https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes
Although logging is thread-safe, and logging to a single file from
multiple threads in a single process is supported, logging to a single
file from multiple processes is not supported, because there is no
standard way to serialize access to a single file across multiple
processes in Python.
One possible solution would be to have each process write to its own file. You can achieve this by writing your own handler that appends the process pid to the end of the file name:
import logging.handlers
import os

class PIDFileHandler(logging.handlers.WatchedFileHandler):

    def __init__(self, filename, mode='a', encoding=None, delay=0):
        filename = self._append_pid_to_filename(filename)
        super(PIDFileHandler, self).__init__(filename, mode, encoding, delay)

    def _append_pid_to_filename(self, filename):
        pid = os.getpid()
        path, extension = os.path.splitext(filename)
        return '{0}-{1}{2}'.format(path, pid, extension)
Then you just need to call addHandler:
logger = logging.getLogger('foo')
fh = PIDFileHandler('bar.log')
logger.addHandler(fh)
Use a queue to handle concurrency correctly and simultaneously recover from errors, by feeding everything to the parent process via a pipe.
from logging.handlers import RotatingFileHandler
import multiprocessing, threading, logging, sys, traceback

class MultiProcessingLog(logging.Handler):
    def __init__(self, name, mode, maxsize, rotate):
        logging.Handler.__init__(self)

        self._handler = RotatingFileHandler(name, mode, maxsize, rotate)
        self.queue = multiprocessing.Queue(-1)

        t = threading.Thread(target=self.receive)
        t.daemon = True
        t.start()

    def setFormatter(self, fmt):
        logging.Handler.setFormatter(self, fmt)
        self._handler.setFormatter(fmt)

    def receive(self):
        while True:
            try:
                record = self.queue.get()
                self._handler.emit(record)
            except (KeyboardInterrupt, SystemExit):
                raise
            except EOFError:
                break
            except:
                traceback.print_exc(file=sys.stderr)

    def send(self, s):
        self.queue.put_nowait(s)

    def _format_record(self, record):
        # ensure that exc_info and args
        # have been stringified. Removes any chance of
        # unpickleable things inside and possibly reduces
        # message size sent over the pipe
        if record.args:
            record.msg = record.msg % record.args
            record.args = None
        if record.exc_info:
            dummy = self.format(record)
            record.exc_info = None

        return record

    def emit(self, record):
        try:
            s = self._format_record(record)
            self.send(s)
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)

    def close(self):
        self._handler.close()
        logging.Handler.close(self)
The handler does all the file writing in the parent process and uses just one thread to receive messages passed from child processes.
QueueHandler is native in Python 3.2+, and safely handles multiprocessing logging.
Python docs have two complete examples: Logging to a single file from multiple processes
For those using Python < 3.2, just copy QueueHandler into your own code from: https://gist.github.com/vsajip/591589 or alternatively import logutils.
Each process (including the parent process) puts its logging on the Queue, and then a listener thread or process (one example is provided for each) picks those up and writes them all to a file - no risk of corruption or garbling.
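A minimal sketch of that stdlib approach (Python 3.2+; the log file name is just an example): workers log through a QueueHandler, and a QueueListener in the parent writes everything to one file.

import logging
import logging.handlers
import multiprocessing

def worker(queue):
    # Each worker only talks to the queue, never to the file directly.
    logger = logging.getLogger('worker')
    logger.addHandler(logging.handlers.QueueHandler(queue))
    logger.setLevel(logging.INFO)
    logger.info('hello from pid %s', multiprocessing.current_process().pid)

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    file_handler = logging.FileHandler('combined.log')
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()

    procs = [multiprocessing.Process(target=worker, args=(queue,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    listener.stop()  # flush remaining records and stop the listener thread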
Note: this question is basically a duplicate of How should I log while using multiprocessing in Python? so I've copied my answer from that question as I'm pretty sure it's currently the best solution.
I've been having an odd issue with PyCharm and subprocesses created by the multiprocessing library locking up forever. I'm using Windows with Python 3.5. What I'm trying to do is:
Start a background thread to block on stdin (waiting for input)
Have the main thread check occasionally for input from stdin and then delegate the work to Python processes created using multiprocessing
However, I've found that newly created multiprocessing Processes lock up forever if and only if the following conditions are met:
I'm running the code via Pycharm (both the latest and older versions)
The background thread is blocking on stdin
Here's the simplest example I can create that reproduces the problem:
import multiprocessing
import threading
import sys

def noop():
    pass

def consume():
    while True:
        sys.stdin.readline()

if __name__ == '__main__':
    # create a daemon thread to block on stdin
    thread = threading.Thread(target=consume, daemon=True)
    thread.start()

    # create a background process
    process = multiprocessing.Process(target=noop)
    process.start()
I've Googled various combinations of "PyCharm stdin multiprocessing hang ..." and had no luck finding an explanation, and I can't figure out why a thread of the main process blocking on stdin should ever cause a subprocess to also block/hang, let alone why it would only happen when running the script in PyCharm. The only thing I can guess is that there might be some monkey-patching of either stdin or the multiprocessing library going on.
Has anyone else encountered this problem? Can anyone explain to me why this only occurs in PyCharm, and how I can make it work regardless of the Python editor I'm using?
I faced the same problem when I was trying to make multiple API calls to fetch data from a remote server. I replaced multiprocessing.dummy with ThreadPoolExecutor; it works in the same way as dummy.
Following is a short snippet of running code that writes the responses to a JSON file:
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor

# get_header(), fetch_response_from_api() and chunk_index are defined elsewhere
# in my code.
uids = []  # an array of the requisite parameters used in requests

with open('flight_config.json', 'w') as f:
    futures = []
    for i in range(chunk_index, len(uids)):
        print('For uid[{}], fetching started:'.format(i))
        chunk_index += 1
        auth_token = get_header()

        with ThreadPoolExecutor(max_workers=7) as executor:
            future_to_url = {executor.submit(fetch_response_from_api, uid=uid, auth_token=auth_token): uid
                             for uid in uids[i]}
            for future in concurrent.futures.as_completed(future_to_url):
                result = future_to_url[future]
                try:
                    data = future.result()
                    print(data)
                except Exception as exc:
                    print('%r generated an exception: %s' % (result, exc))
                else:
                    print('%r page is %d bytes' % (result, len(data)))
I am using autobahn[twisted] to achieve some WAMP communication. While subscribed to a topic and receiving its feed, I print it. When I do, I get something like this:
2016-09-25T21:13:29+0200 (u'USDT_ETH', u'12.94669009', u'12.99998074', u'12.90000334', u'0.00035594', u'18396.86929477', u'1422.19525455', 0, u'13.14200000', u'12.80000000')
I have sacrificed too many hours trying to get rid of it. And yes, I tested printing other things; they print without this timestamp. This is my code:
from twisted.internet.defer import inlineCallbacks
from autobahn.twisted.wamp import ApplicationSession, ApplicationRunner

class PushReactor(ApplicationSession):

    @inlineCallbacks
    def onJoin(self, details):
        print "subscribed"
        yield self.subscribe(self.onTick, u'ticker')

    def onTick(self, *args):
        print args

if __name__ == '__main__':
    runner = ApplicationRunner(u'wss://api.poloniex.com', u'realm1')
    runner.run(PushReactor)
How can I remove this timestamp?
Well, sys.stderr and sys.stdout are redirected to a twisted logger.
You need to change the logging format before running your app.
See: https://twistedmatrix.com/documents/15.2.1/core/howto/logger.html
How to reproduce
You can reproduce your problem with this simple application:
from autobahn.twisted.wamp import ApplicationRunner

if __name__ == '__main__':
    print("hello1")
    runner = ApplicationRunner(u'wss://api.poloniex.com', u'realm1')
    print("hello2")
    runner.run(None)
    print("hello3")
When the process is killed, you'll see:
hello1
hello2
2016-09-26T14:08:13+0200 Received SIGINT, shutting down.
2016-09-26T14:08:13+0200 Main loop terminated.
2016-09-26T14:08:13+0200 hello3
During application launch, stdout (and stderr) are redirected to a file-like object (of class twisted.logger._io.LoggingFile).
Every call to print or write is turned into twisted log messages (one for each line).
The redirection is done in the class twisted.logger._global.LogBeginner; look at the beginLoggingTo method.
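One possible workaround, as a sketch only: begin logging yourself before ApplicationRunner.run(), with a text observer whose timeFormat is empty (I am assuming here that an empty timeFormat suppresses the timestamp column and that an explicitly begun observer stays in charge):

import sys
from twisted.logger import globalLogBeginner, textFileLogObserver

# Assumption: an empty timeFormat leaves the timestamp out of each line.
observer = textFileLogObserver(sys.__stdout__, timeFormat="")
globalLogBeginner.beginLoggingTo([observer])

# ... then create and run the ApplicationRunner as before.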
Given this code:
from time import sleep

class TemporaryFileCreator(object):
    def __init__(self):
        print 'create temporary file'
        # create_temp_file('temp.txt')

    def watch(self):
        try:
            print 'watching temporary file'
            while True:
                # add_a_line_in_temp_file('temp.txt', 'new line')
                sleep(4)
        except (KeyboardInterrupt, SystemExit), e:
            print 'deleting the temporary file..'
            # delete_temporary_file('temp.txt')
            sleep(3)
            print str(e)

t = TemporaryFileCreator()
t.watch()
During t.watch(), I want to be able to close this application from the console.
I tried using Ctrl+C and it works. However, if I click the console window's exit button, it doesn't work. I checked many related questions about this, but it seems that I cannot find the right answer.
What I want to do:
The console can be exited while the program is still running. To handle that, when the exit button is pressed, I want to perform a cleanup of the objects (deleting created temporary files), a rollback of temporary changes, etc.
Questions:
How can I handle console exit?
How can I integrate it into object destructors (__exit__())?
Is it even possible? (How about py2exe?)
Note: the code will be compiled with py2exe; I hope the effect is the same.
You may want to have a look at signals. When a *nix terminal is closed with a running process, this process receives a couple of signals. For instance, this code waits for the SIGHUP hangup signal and writes a final message. It works under OS X and Linux. I know you are specifically asking about Windows, but you might want to give it a shot or investigate which signals a Windows command prompt emits during shutdown.
import signal
import sys
from time import sleep

def signal_handler(signal, frame):
    with open('./log.log', 'w') as f:
        f.write('event received!')

signal.signal(signal.SIGHUP, signal_handler)

print('Waiting for the final blow...')
# signal.pause()  # does not work under windows
sleep(10)  # so let us just wait here
Quote from the documentation:
On Windows, signal() can only be called with SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, or SIGTERM. A ValueError will be raised in any other case.
Update:
Actually, the closest thing in Windows is win32api.setConsoleCtrlHandler (doc). This was already discussed here:
When using win32api.setConsoleCtrlHandler(), I'm able to receive shutdown/logoff/etc events from Windows, and cleanly shut down my app.
And if Daniel's code still works, this might be a nice way to use both (signals and CtrlHandler) for cross-platform purposes:
import os, sys

def set_exit_handler(func):
    if os.name == "nt":
        try:
            import win32api
            win32api.SetConsoleCtrlHandler(func, True)
        except ImportError:
            version = ".".join(map(str, sys.version_info[:2]))
            raise Exception("pywin32 not installed for Python " + version)
    else:
        import signal
        signal.signal(signal.SIGTERM, func)

if __name__ == "__main__":
    def on_exit(sig, func=None):
        print "exit handler triggered"
        import time
        time.sleep(5)

    set_exit_handler(on_exit)
    print "Press <enter> to quit"
    raw_input()
    print "quit!"
If you use tempfile to create your temporary file, it will be automatically deleted when the Python process is killed.
Try it with:
>>> foo = tempfile.NamedTemporaryFile()
>>> foo.name
'c:\\users\\blah\\appdata\\local\\temp\\tmpxxxxxx'
Now check that the named file is there. You can write to and read from this file like any other.
Now kill the Python window and check that file is gone (it should be)
You can simply call foo.close() to delete it manually in your code.
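A minimal sketch of how the watcher above could rely on tempfile, so the file's lifetime is tied to a with-block (the suffix and loop body are just placeholders):

import tempfile
from time import sleep

# NamedTemporaryFile removes the file when it is closed; the with-block closes
# it even if an exception propagates out of the loop.
with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=True) as tmp:
    for _ in range(3):
        tmp.write('new line\n')
        tmp.flush()
        sleep(4)
# At this point the temporary file has been deleted.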