Is there a way to log the stdout output from a given Process when using the multiprocessing.Process class in Python?
The easiest way might be to just override sys.stdout. Slightly modifying an example from the multiprocessing manual:
from multiprocessing import Process
import os
import sys
def info(title):
    print title
    print 'module name:', __name__
    print 'parent process:', os.getppid()
    print 'process id:', os.getpid()

def f(name):
    sys.stdout = open(str(os.getpid()) + ".out", "w")
    info('function f')
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    q = Process(target=f, args=('fred',))
    q.start()
    p.join()
    q.join()
And running it:
$ ls
m.py
$ python m.py
$ ls
27493.out 27494.out m.py
$ cat 27493.out
function f
module name: __main__
parent process: 27492
process id: 27493
hello bob
$ cat 27494.out
function f
module name: __main__
parent process: 27492
process id: 27494
hello fred
There are only two things I would add to @Mark Rushakoff's answer. When debugging, I found it really useful to change the buffering parameter of my open() calls to 0.
sys.stdout = open(str(os.getpid()) + ".out", "a", buffering=0)
Otherwise it's madness, because when tail -fing the output file the results can be very intermittent. With buffering=0, tail -fing works great.
And for completeness, do yourself a favor and redirect sys.stderr as well.
sys.stderr = open(str(os.getpid()) + "_error.out", "a", buffering=0)
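(A side note, not from the original answer: on Python 3, buffering=0 is only valid in binary mode, so for a text-mode file the closest equivalent is line buffering.)

import os
import sys

# Python 3 sketch: unbuffered text mode raises ValueError, so use line buffering instead
sys.stdout = open(str(os.getpid()) + ".out", "a", buffering=1)
sys.stderr = open(str(os.getpid()) + "_error.out", "a", buffering=1)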
Also, for convenience you might dump that into a separate process class if you wish:

class MyProc(Process):
    def run(self):
        # Define the logging in run(), MyProc's entry function when it is .start()-ed
        #     p = MyProc()
        #     p.start()
        self.initialize_logging()
        print 'Now output is captured.'
        # Now do stuff...

    def initialize_logging(self):
        sys.stdout = open(str(os.getpid()) + ".out", "a", buffering=0)
        sys.stderr = open(str(os.getpid()) + "_error.out", "a", buffering=0)
        print 'stdout initialized'
Here's a corresponding gist.
You can set sys.stdout = Logger() where Logger is a class whose write method (immediately, or accumulating until a \n is detected) calls logging.info (or any other way you want to log); a sketch of this in action follows.
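A minimal sketch of such a Logger (the accumulate-until-newline variant; the details are illustrative, not the original linked example):

import logging
import sys

logging.basicConfig(level=logging.INFO)

class Logger(object):
    """Collect writes and emit one log record per completed line."""
    def __init__(self, level=logging.INFO):
        self.level = level
        self.buffer = ""

    def write(self, text):
        self.buffer += text
        while "\n" in self.buffer:
            line, self.buffer = self.buffer.split("\n", 1)
            if line:
                logging.log(self.level, line)

    def flush(self):
        if self.buffer:
            logging.log(self.level, self.buffer)
            self.buffer = ""

sys.stdout = Logger()
print("this line becomes a log record")  # the handler writes to stderr, so no recursion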
I'm not sure what you mean by "a given" process (who's given it, what distinguishes it from all others...?), but if you mean you know what process you want to single out that way at the time you instantiate it, then you could wrap its target function (and that only) -- or the run method you're overriding in a Process subclass -- into a wrapper that performs this sys.stdout "redirection" -- and leave other processes alone.
Maybe if you nail down the specs a bit I can help in more detail...?
Here is a simple and straightforward way of capturing stdout for a multiprocessing.Process, using io.TextIOWrapper:
import app
import io
import sys
from multiprocessing import Process

def run_app(some_param):
    out_file = open(sys.stdout.fileno(), 'wb', 0)
    sys.stdout = io.TextIOWrapper(out_file, write_through=True)
    app.run()

app_process = Process(target=run_app, args=('some_param',))
app_process.start()
# Use app_process.terminate() on Python < 3.7; Process.kill() was added in 3.7.
app_process.kill()
The log_to_stderr() function is the simplest solution.
From PYMOTW:
multiprocessing has a convenient module-level function to enable logging called log_to_stderr(). It sets up a logger object using logging and adds a handler so that log messages are sent to the standard error channel. By default, the logging level is set to NOTSET so no messages are produced. Pass a different level to initialize the logger to the level of detail desired.
import logging
from multiprocessing import Process, log_to_stderr

print("Running main script...")

def my_process(my_var):
    print(f"Running my_process with {my_var}...")

# Initialize logging for multiprocessing.
log_to_stderr(logging.DEBUG)

# Start the process.
my_var = 100
process = Process(target=my_process, args=(my_var,))
process.start()
process.kill()
This code will output both print() statements to stderr.
Related
I am using batch processing and call the function below in parallel. I need to create a new log file for each process.
Below is sample code:
import logging

def processDocument(inputfilename):
    logfile = inputfilename + '.log'
    logging.basicConfig(
        filename=logfile,
        level=logging.INFO)
    # performing some function
    logging.info("process completed for file")
    logging.shutdown()
It creates log files, but when I pass this function to a batch that calls it 20 times, only 16 log files get created.
These issues can happen with race conditions between threads. Note also that logging.basicConfig() does nothing once the root logger already has handlers, so later calls from the same process silently fail to attach their files.
If the documents are processed independently of each other, I would suggest using multiprocessing via the high-level class concurrent.futures.ProcessPoolExecutor (see the sketch below).
If you want to stick to threads because the document processing is more I/O bound, there is concurrent.futures.ThreadPoolExecutor, which provides the same interface but with threads.
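A minimal sketch of the ProcessPoolExecutor route (file names are illustrative; each call attaches its own handler, since pool workers are reused and logging.basicConfig() would only take effect once per worker):

from concurrent.futures import ProcessPoolExecutor
import logging

def processDocument(inputfilename):
    # attach a fresh file handler per document, then detach it
    logger = logging.getLogger(inputfilename)
    handler = logging.FileHandler(inputfilename + '.log')
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.info("process completed for file")
    handler.close()
    logger.removeHandler(handler)

if __name__ == '__main__':
    filenames = ['doc1', 'doc2', 'doc3']
    with ProcessPoolExecutor() as executor:
        list(executor.map(processDocument, filenames))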
Last but not least, configure your logging properly, as in this toy example (which uses the standard threading library):
import logging
from sys import stdout
import time
import threading

def processDocument(inputfilename: str):
    logfile = inputfilename + '.log'
    this_thread = threading.current_thread().name
    this_thread_logger = logging.getLogger(this_thread)
    this_thread_logger.setLevel(logging.INFO)  # without this, INFO records are dropped
    file_handler = logging.FileHandler(filename=logfile)
    out_handler = logging.StreamHandler(stdout)
    file_handler.level = logging.INFO
    out_handler.level = logging.INFO
    this_thread_logger.addHandler(file_handler)
    this_thread_logger.addHandler(out_handler)
    this_thread_logger.info(f'Processing {inputfilename} from {this_thread}...')
    time.sleep(1)  # processing
    this_thread_logger.info(f'Processing {inputfilename} from {this_thread}... Done')

def main():
    filenames = ['hello', 'hello2', 'hello3']
    threads = [threading.Thread(target=processDocument, args=(name,)) for name in filenames]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    logging.shutdown()

main()
I have a python-daemon process that logs to a file via a ThreadedTCPServer (inspired by the cookbook example: https://docs.python.org/2/howto/logging-cookbook.html#sending-and-receiving-logging-events-across-a-network, as I will have many such processes writing to the same file). I am controlling the spawning of the daemon process using subprocess.Popen from an ipython console, and this is how the application will be run. I am able to successfully write to the log file from both the main ipython process, as well as the daemon process, but I am unable to change the level of both by just simply setting the level of the root logger in ipython. Is this something that should be possible? Or will it require custom functionality to set the logging.level of the daemon separately?
Edit: As requested, here is an attempt to provide a pseudo-code example of what I am trying to achieve. I hope that this is a sufficient description.
daemon_script.py
import logging
import daemon
from other_module import function_to_run_as_daemon

class Daemon(object):  # renamed so it doesn't shadow the imported daemon module
    def __init__(self):
        self.daemon_name = __name__
        logging.basicConfig()  # <--- required, or I don't get any log messages
        self.logger = logging.getLogger(self.daemon_name)
        self.logger.debug("Created logger successfully")

    def run(self):
        with daemon.DaemonContext(files_preserve=[self.logger.handlers[0].stream]):
            self.logger.debug("Daemonised successfully - about to enter function")
            function_to_run_as_daemon()

if __name__ == "__main__":
    d = Daemon()
    d.run()
Then in ipython I would run something like:
>>> import logging
>>> rootlogger = logging.getLogger()
>>> rootlogger.info( "test" )
INFO:root:"test"
>>> subprocess.Popen( ["python" , "daemon_script.py"] )
DEBUG:__main__:"Created logger successfully"
DEBUG:__main__:"Daemonised successfully - about to enter function"
# now i'm finished debugging and testing, i want to reduce the level for all the loggers by changing the level of the handler
# Note that I also tried changing the level of the root handler, but saw no change
>>> rootlogger.handlers[0].setLevel(logging.INFO)
>>> rootlogger.info( "test" )
INFO:root:"test"
>>> print( rootlogger.debug("test") )
None
>>> subprocess.Popen( ["python" , "daemon_script.py"] )
DEBUG:__main__:"Created logger successfully"
DEBUG:__main__:"Daemonised successfully - about to enter function"
I think that I may not be approaching this correctly, but it's not clear to me what would work better. Any advice would be appreciated.
The logger you create in your daemon won't be the same as the logger you made in ipython. You could test this to be sure, by just printing out both logger objects themselves, which will show you their memory addresses.
I think a better pattern would be to pass whether or not you want to be in "debug" mode when you run the daemon. In other words, call Popen like this:
subprocess.Popen( ["python" , "daemon_script.py", "debug"] )
It's up to you, you could pass a string meaning "debug mode is on" as above, or you could pass the log level constant that means "debug", e.g.:
subprocess.Popen( ["python" , "daemon_script.py", "10"] )
(https://docs.python.org/2/library/logging.html#levels)
Then in the daemon's init function use argv for example, to get that argument and use it:
...
import sys
def __init__(self):
self.daemon_name = __name__
logging.basicConfig() # <--- required, or I don't get any log messages
log_level = int(sys.argv[1]) # Probably don't actually just blindly convert it without error handling
self.logger = logging.getLogger(self.daemon_name)
self.logger.setLevel(log_level)
...
I have a complex Python pipeline (whose code I can't change), calling multiple other scripts and other executables. The point is that it takes ages to run over 8000 directories, doing some scientific analyses. So, I wrote a simple wrapper (it might not be the most efficient, but it seems to work) using the multiprocessing module.
from os import path, listdir, mkdir, system
from os.path import join as osjoin, exists, isfile
from GffTools import Gene, Element, Transcript
from GffTools import read as gread, write as gwrite, sort as gsort
from re import match
from multiprocessing import JoinableQueue, Process
from sys import argv, exit

# some absolute paths
inbase = "/.../abfgp_in"
outbase = "/.../abfgp_out"
abfgp_cmd = "python /.../abfgp-2.rev/abfgp.py"
refGff = "/.../B0510_manual_reindexed_noSeq.gff"

# the Queue
Q = JoinableQueue()
i = 0

# define number of processes
try: num_p = int(argv[1])
except ValueError: exit("Wrong CPU argument")

# This is the function calling the abfgp.py script, which in its turn calls a lot of third-party software
def abfgp(id_, pid):
    out = osjoin(outbase, id_)
    if not exists(out): mkdir(out)
    # logfile
    log = osjoin(outbase, "log_process_%s" % (pid))
    try:
        # call the script
        system("%s --dna %s --multifasta %s --target %s -o %s -q >>%s" % (abfgp_cmd, osjoin(inbase, id_, id_ + ".dna.fa"), osjoin(inbase, id_, "informants.mfa"), id_, out, log))
    except:
        print "ABFGP FAILED"
        return

# parse the output
def extractGff(id_):
    pass  # code not relevant

# function called by multiple processes, using the Queue
def run(Q, pid):
    while not Q.empty():
        try:
            d = Q.get()
            print "%s\t=>>\t%s" % (str(i - Q.qsize()), d)
            abfgp(d, pid)
            Q.task_done()
        except KeyboardInterrupt:
            exit("Interrupted Child")

# list of directories
genedirs = [d for d in listdir(inbase)]
genes = gread(refGff)
for d in genedirs:
    i += 1
    indir = osjoin(inbase, d)
    outdir = osjoin(outbase, d)
    Q.put(d)

# this loop creates the multiple processes
procs = []
for pid in range(num_p):
    try:
        p = Process(target=run, args=(Q, pid + 1))
        p.daemon = True
        procs.append(p)
        p.start()
    except KeyboardInterrupt:
        print "Aborting start of child processes"
        for x in procs:
            x.terminate()
        exit("Interrupted")

try:
    for p in procs:
        p.join()
except:
    print "Terminating child processes"
    for x in procs:
        x.terminate()
    exit("Interrupted")

print "Parsing output..."
for d in genedirs: extractGff(d)
Now the problem is that abfgp.py uses the os.chdir function, which seems to disrupt the parallel processing. I get a lot of errors stating that some (input/output) files/directories cannot be found for reading/writing, even though I call the script through os.system(), which I thought would prevent this by spawning separate processes.
How can I work around this chdir interference?
Edit: I might change os.system() to subprocess.Popen(cwd="...") with the right directory. I hope this makes a difference.
Thanks.
Edit 2
Do not use os.system(); use subprocess.call():
system("%s --dna %s --multifasta %s --target %s -o %s -q >>%s" %(abfgp_cmd, osjoin(inbase, id_, id_ +".dna.fa"), osjoin(inbase, id_, "informants.mfa"), id_, out, log))
would translate to
subprocess.call((abfgp_cmd, '--dna', osjoin(inbase, id_, id_ +".dna.fa"), '--multifasta', osjoin(inbase, id_, "informants.mfa"), '--target', id_, '-o', out, '-q')) # without log.
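For the cwd idea from the question's edit, a hedged sketch combining it with the log redirection (the helper name is illustrative, not part of the original script):

import subprocess

def call_in_dir(cmd_args, workdir, logpath):
    # run the child with its own working directory, appending its output to a log
    with open(logpath, "ab") as logfile:
        return subprocess.call(cmd_args, cwd=workdir,
                               stdout=logfile, stderr=subprocess.STDOUT)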
Edit 1
I think the problem is that multiprocessing uses the module names to serialize functions and classes.
This means if you do import module where module is in ./module.py, and then you do something like os.chdir('./dir'), you would now need to from .. import module.
The child processes inherit the folder of the parent process. This may be a problem.
Solutions
Make sure that all modules are imported (in the child processes) and only after that change the directory.
Insert the original os.getcwd() into sys.path to enable imports from the original directory. This must be done before any functions are called from the local directory (see the sketch after this list).
Put all functions that you use inside a directory that can always be imported. The site-packages could be such a directory. Then you can do something like import module; module.main() to start what you do.
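A minimal sketch of the sys.path idea (module is a placeholder name; only the ordering matters):

import os
import sys

# remember the starting directory before anything calls os.chdir()
sys.path.insert(0, os.getcwd())

os.chdir('./dir')  # later chdir calls no longer break imports

import module  # still resolved against the saved path entry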
This is a hack that I do because I know how pickle works. Only use this if other attempts fail.
The script prints:
serialized # the function runD is serialized
string executed # before the function is loaded the code is executed
loaded # now the function run is deserialized
run # run is called
In your case you would do something like this:
runD = evalBeforeDeserialize('__import__("sys").path.append({})'.format(repr(os.getcwd())), run)
p = Process(target=runD, args=(Q, pid+1))
This is the script:
# functions that you need
class R(object):
    def __init__(self, call, *args):
        self.ret = (call, args)
    def __reduce__(self):
        return self.ret
    def __call__(self, *args, **kw):
        raise NotImplementedError('this should never be called')

class evalBeforeDeserialize(object):
    def __init__(self, string, function):
        self.function = function
        self.string = string
    def __reduce__(self):
        return R(getattr, tuple, '__getitem__'), \
               ((R(eval, self.string), self.function), -1)

# code to show how it works
def printing():
    print('string executed')

def run():
    print('run')

runD = evalBeforeDeserialize('__import__("__main__").printing()', run)

import pickle
s = pickle.dumps(runD)
print('serialized')
run2 = pickle.loads(s)
print('loaded')
run2()
Please report back if these do not work.
You could determine which instance of the os library the unalterable program is using; then create a tailored version of chdir in that library that does what you need -- prevent the directory change, log it, whatever. If the tailored behavior needs to be just for the single program, you can use the inspect module to identify the caller and tailor the behavior in a specific way for just that caller.
Your options are limited if you truly can't alter the existing program; but if you have the option of altering libraries it imports, something like this could be a least-invasive way to skirt the undesired behavior.
Usual caveats apply when altering a standard library.
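A minimal sketch of that idea (purely illustrative; it patches os.chdir process-wide, so scope it carefully):

import os

_original_chdir = os.chdir

def _tailored_chdir(path):
    # tailor the behavior here: log the change, block it, or remap the path
    print("chdir intercepted: %s" % path)
    _original_chdir(path)

os.chdir = _tailored_chdir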
I have a large project consisting of a sufficiently large number of modules, each printing something to the standard output. Now as the project has grown in size, the large number of print statements printing a lot to stdout has made the program considerably slower.
So, I now want to decide at runtime whether or not to print anything to stdout. I cannot make changes in the modules, as there are plenty of them. (I know I can redirect stdout to a file, but even this is considerably slow.)
So my question is: how do I redirect stdout to nothing, i.e. how do I make the print statement do nothing?
# I want to do something like this.
sys.stdout = None  # this obviously will give an error, as NoneType objects do not have a write method.
Currently the only idea I have is to make a class which has a write method (which does nothing) and redirect the stdout to an instance of this class.
class DontPrint(object):
    def write(*args): pass

dp = DontPrint()
sys.stdout = dp
Is there an inbuilt mechanism in python for this? Or is there something better than this?
Cross-platform:
import os
import sys
f = open(os.devnull, 'w')
sys.stdout = f
On Windows:
f = open('nul', 'w')
sys.stdout = f
On Linux:
f = open('/dev/null', 'w')
sys.stdout = f
A nice way to do this is to create a small context manager that you wrap your prints in. You then just use it in a with-statement to silence all output.
Python 2:
import os
import sys
from contextlib import contextmanager
@contextmanager
def silence_stdout():
    old_target = sys.stdout
    try:
        with open(os.devnull, "w") as new_target:
            sys.stdout = new_target
            yield new_target
    finally:
        sys.stdout = old_target

with silence_stdout():
    print("will not print")

print("this will print")
Python 3.4+:
Python 3.4 has a context manager like this built in, so you can simply use contextlib like this:
import contextlib
with contextlib.redirect_stdout(None):
    print("will not print")

print("this will print")
If the code you want to suppress writes directly to sys.stdout, using None as the redirect target won't work. Instead you can use:
import contextlib
import sys
import os
with contextlib.redirect_stdout(open(os.devnull, 'w')):
    sys.stdout.write("will not print")

sys.stdout.write("this will print")
If your code writes to stderr instead of stdout, you can use contextlib.redirect_stderr instead of redirect_stdout.
Running this code only prints the second line of output, not the first:
$ python test.py
this will print
This works cross-platform (Windows + Linux + Mac OSX), and is cleaner than the other answers, imho.
If you're in python 3.4 or higher, there's a simple and safe solution using the standard library:
import contextlib
with contextlib.redirect_stdout(None):
    print("This won't print!")
(At least on my system) it appears that writing to os.devnull is about 5x faster than writing to a DontPrint class, i.e.
#!/usr/bin/python
import os
import sys
import datetime

ITER = 10000000

def printlots(out, it, st="abcdefghijklmnopqrstuvwxyz1234567890"):
    temp = sys.stdout
    sys.stdout = out
    i = 0
    start_t = datetime.datetime.now()
    while i < it:
        print st
        i = i + 1
    end_t = datetime.datetime.now()
    sys.stdout = temp
    print out, "\n took", end_t - start_t, "for", it, "iterations"

class devnull():
    def write(*args):
        pass

printlots(open(os.devnull, 'wb'), ITER)
printlots(devnull(), ITER)
gave the following output:
<open file '/dev/null', mode 'wb' at 0x7f2b747044b0>
took 0:00:02.074853 for 10000000 iterations
<__main__.devnull instance at 0x7f2b746bae18>
took 0:00:09.933056 for 10000000 iterations
If you're in a Unix environment (Linux included), you can redirect output to /dev/null:
python myprogram.py > /dev/null
And for Windows:
python myprogram.py > nul
You can just mock it.
import mock
sys.stdout = mock.MagicMock()
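On Python 3 the same idea works without the third-party mock package, via the standard library's unittest.mock (a small addition, not part of the original answer):

import sys
from unittest import mock

sys.stdout = mock.MagicMock()  # swallows all writes; the calls are recorded for inspection
print("this goes nowhere")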
Your class will work just fine (with the exception of the write() method name -- it needs to be called write(), lowercase). Just make sure you save a copy of sys.stdout in another variable.
If you're on a *NIX, you can do sys.stdout = open('/dev/null'), but this is less portable than rolling your own class.
How about this:
from contextlib import ExitStack, redirect_stdout
import os
with ExitStack() as stack:
    if should_hide_output():
        null_stream = open(os.devnull, "w")
        stack.enter_context(null_stream)
        stack.enter_context(redirect_stdout(null_stream))
    noisy_function()
This uses the features in the contextlib module to hide the output of whatever command you are trying to run, depending on the result of should_hide_output(), and then restores the output behavior after that function is done running.
If you want to hide standard error output, then import redirect_stderr from contextlib and add a line saying stack.enter_context(redirect_stderr(null_stream)).
The main downside is that this only works in Python 3.4 and later versions.
sys.stdout = None
It is OK for the print() case, but it can cause an error if you call any method of sys.stdout, e.g. sys.stdout.write().
There is a note in the docs:
Under some conditions stdin, stdout and stderr as well as the original
values __stdin__, __stdout__ and __stderr__ can be None. It is usually
the case for Windows GUI apps that aren't connected to a console and
Python apps started with pythonw.
Supplement to iFreilicht's answer -- it works for both Python 2 & 3.
import sys

class NonWritable:
    def write(self, *args, **kwargs):
        pass

class StdoutIgnore:
    def __enter__(self):
        self.stdout_saved = sys.stdout
        sys.stdout = NonWritable()
        return self

    def __exit__(self, *args):
        sys.stdout = self.stdout_saved

with StdoutIgnore():
    print("This won't print!")
If you don't want to deal with resource allocation or rolling your own class, you may want to use TextIO from Python's typing module. It has all the required methods stubbed for you by default.
import sys
from typing import TextIO
sys.stdout = TextIO()
There are a number of good answers in the thread, but here is my Python 3 answer (for when sys.stdout.fileno() isn't supported anymore):
import os
import sys

oldstdout = os.dup(1)
oldstderr = os.dup(2)
oldsysstdout = sys.stdout
oldsysstderr = sys.stderr

# Cancel all stdout outputs (will be lost) - optionally also cancel stderr
def cancel_stdout(stderr=False):
    sys.stdout.flush()
    devnull = open('/dev/null', 'w')
    os.dup2(devnull.fileno(), 1)
    sys.stdout = devnull
    if stderr:
        os.dup2(devnull.fileno(), 2)
        sys.stderr = devnull

# Redirect all stdout outputs to a file - optionally also redirect stderr
def reroute_stdout(filepath, stderr=False):
    sys.stdout.flush()
    file = open(filepath, 'w')
    os.dup2(file.fileno(), 1)
    sys.stdout = file
    if stderr:
        os.dup2(file.fileno(), 2)
        sys.stderr = file

# Restores stdout to default - and stderr
def restore_stdout():
    sys.stdout.flush()
    sys.stdout.close()
    os.dup2(oldstdout, 1)
    os.dup2(oldstderr, 2)
    sys.stdout = oldsysstdout
    sys.stderr = oldsysstderr
To use it:
Cancel all stdout and stderr outputs with:
cancel_stdout(stderr=True)
Route all stdout (but not stderr) to a file:
reroute_stdout('output.txt')
To restore stdout and stderr:
restore_stdout()
Why don't you try this?
sys.stdout.close()
sys.stderr.close()
I'll add one more example to the numerous answers here:
import argparse
import contextlib

class NonWritable:
    def write(self, *args, **kwargs):
        pass

parser = argparse.ArgumentParser(description='my program')
parser.add_argument("-p", "--param", help="my parameter", type=str, required=True)

# with contextlib.redirect_stdout(None):  # No effect, as argparse outputs to stderr
# with contextlib.redirect_stderr(None):  # AttributeError: 'NoneType' object has no attribute 'write'
with contextlib.redirect_stderr(NonWritable()):  # this works!
    args = parser.parse_args()
The normal output would be:
>python TEST.py
usage: TEST.py [-h] -p PARAM
TEST.py: error: the following arguments are required: -p/--param
I use this. Redirect stdout to a string, which you subsequently ignore. I use a context manager to save and restore the original setting for stdout.
from io import StringIO
...
with StringIO() as out:
    with stdout_redirected(out):
        # Do your thing
        ...
where stdout_redirected is defined as:
import sys
from contextlib import contextmanager

@contextmanager
def stdout_redirected(new_stdout):
    save_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield None
    finally:
        sys.stdout = save_stdout
How do I redirect stdout to an arbitrary file in Python?
When a long-running Python script (e.g., a web application) is started from within an ssh session and backgrounded, and the ssh session is closed, the application will raise IOError and fail the moment it tries to write to stdout. I needed to find a way to make the application and modules output to a file rather than stdout to prevent failure due to IOError. Currently, I employ nohup to redirect output to a file, and that gets the job done, but I was wondering if there was a way to do it without using nohup, out of curiosity.
I have already tried sys.stdout = open('somefile', 'w'), but this does not seem to prevent some external modules from still outputting to the terminal (or maybe the sys.stdout = ... line did not fire at all). I know it should work from simpler scripts I've tested on, but I haven't had time to test it on a web application yet.
If you want to do the redirection within the Python script, setting sys.stdout to a file object does the trick:
# for python3
import sys

with open('file', 'w') as sys.stdout:
    print('test')
A far more common method is to use shell redirection when executing (same on Windows and Linux):
$ python3 foo.py > file
There is the contextlib.redirect_stdout() function in Python 3.4+:

from contextlib import redirect_stdout

with open('help.txt', 'w') as f:
    with redirect_stdout(f):
        print('it now prints to help.txt')
It is similar to:
import sys
from contextlib import contextmanager

@contextmanager
def redirect_stdout(new_target):
    old_target, sys.stdout = sys.stdout, new_target  # replace sys.stdout
    try:
        yield new_target  # run some code with the replaced stdout
    finally:
        sys.stdout = old_target  # restore to the previous value
that can be used on earlier Python versions. The latter version is not reusable; it can be made reusable if desired (a class-based sketch follows).
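A minimal sketch of a reusable, class-based variant (not from the original answer; still not thread-safe):

import sys

class RedirectStdout:
    """Reusable context manager: can be entered multiple times, unlike a
    @contextmanager generator, which is single-use."""
    def __init__(self, new_target):
        self.new_target = new_target

    def __enter__(self):
        self.old_target = sys.stdout
        sys.stdout = self.new_target
        return self.new_target

    def __exit__(self, *exc_info):
        sys.stdout = self.old_target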
It doesn't redirect stdout at the file-descriptor level, e.g.:

import os
import sys
from contextlib import redirect_stdout

stdout_fd = sys.stdout.fileno()
with open('output.txt', 'w') as f, redirect_stdout(f):
    print('redirected to a file')
    os.write(stdout_fd, b'not redirected')
    os.system('echo this also is not redirected')
b'not redirected' and 'echo this also is not redirected' are not redirected to the output.txt file.
To redirect at the file descriptor level, os.dup2() could be used:
import os
import sys
from contextlib import contextmanager

def fileno(file_or_fd):
    fd = getattr(file_or_fd, 'fileno', lambda: file_or_fd)()
    if not isinstance(fd, int):
        raise ValueError("Expected a file (`.fileno()`) or a file descriptor")
    return fd

@contextmanager
def stdout_redirected(to=os.devnull, stdout=None):
    if stdout is None:
        stdout = sys.stdout
    stdout_fd = fileno(stdout)
    # copy stdout_fd before it is overwritten
    # NOTE: `copied` is inheritable on Windows when duplicating a standard stream
    with os.fdopen(os.dup(stdout_fd), 'wb') as copied:
        stdout.flush()  # flush library buffers that dup2 knows nothing about
        try:
            os.dup2(fileno(to), stdout_fd)  # $ exec >&to
        except ValueError:  # filename
            with open(to, 'wb') as to_file:
                os.dup2(to_file.fileno(), stdout_fd)  # $ exec > to
        try:
            yield stdout  # allow code to be run with the redirected stdout
        finally:
            # restore stdout to its previous value
            # NOTE: dup2 makes stdout_fd inheritable unconditionally
            stdout.flush()
            os.dup2(copied.fileno(), stdout_fd)  # $ exec >&copied
The same example works now if stdout_redirected() is used instead of redirect_stdout():
import os
import sys

stdout_fd = sys.stdout.fileno()
with open('output.txt', 'w') as f, stdout_redirected(f):
    print('redirected to a file')
    os.write(stdout_fd, b'it is redirected now\n')
    os.system('echo this is also redirected')
print('this goes back to stdout')
The output that previously went to stdout now goes to output.txt for as long as the stdout_redirected() context manager is active.
Note: stdout.flush() does not flush C stdio buffers on Python 3, where I/O is implemented directly on read()/write() system calls. To flush all open C stdio output streams, you could call libc.fflush(None) explicitly if some C extension uses stdio-based I/O:
try:
    import ctypes
    from ctypes.util import find_library
except ImportError:
    libc = None
else:
    try:
        libc = ctypes.cdll.msvcrt  # Windows
    except OSError:
        libc = ctypes.cdll.LoadLibrary(find_library('c'))

def flush(stream):
    try:
        libc.fflush(None)
        stream.flush()
    except (AttributeError, ValueError, IOError):
        pass  # unsupported
You could use the stdout parameter to redirect other streams, not only sys.stdout, e.g. to merge sys.stderr and sys.stdout:

def merged_stderr_stdout():  # $ exec 2>&1
    return stdout_redirected(to=sys.stdout, stdout=sys.stderr)
Example:
from __future__ import print_function
import sys

with merged_stderr_stdout():
    print('this is printed on stdout')
    print('this is also printed on stdout', file=sys.stderr)
Note: stdout_redirected() mixes buffered I/O (sys.stdout usually) and unbuffered I/O (operations on file descriptors directly). Beware, there could be buffering issues.
To answer your edit: you could use python-daemon to daemonize your script and use the logging module (as @erikb85 suggested) instead of print statements and merely redirecting stdout for your long-running Python script that you run using nohup now.
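A minimal sketch of that combination (assumes the python-daemon package; the file name is illustrative):

import logging
import daemon

with daemon.DaemonContext():
    # configure logging after daemonizing, so the handler's stream stays open
    logging.basicConfig(filename="app.log", level=logging.INFO)
    logging.info("running detached from the terminal; no stdout needed")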
You can try this too; it's much better:
import sys

class Logger(object):
    def __init__(self, filename="Default.log"):
        self.terminal = sys.stdout
        self.log = open(filename, "a")

    def write(self, message):
        self.terminal.write(message)
        self.log.write(message)

sys.stdout = Logger("yourlogfilename.txt")
print "Hello world !"  # this should be saved in yourlogfilename.txt
The other answers didn't cover the case where you want forked processes to share your new stdout.
To do that:
from os import open, close, dup, O_WRONLY

old = dup(1)
close(1)
open("file", O_WRONLY)  # should open on 1

# ..... do stuff and then restore

close(1)
dup(old)    # should dup to 1
close(old)  # get rid of left overs
Quoted from PEP 343 -- The "with" Statement (added import statement):
Redirect stdout temporarily:
import sys
from contextlib import contextmanager

@contextmanager
def stdout_redirected(new_stdout):
    save_stdout = sys.stdout
    sys.stdout = new_stdout
    try:
        yield None
    finally:
        sys.stdout = save_stdout
Used as follows:
with open(filename, "w") as f:
    with stdout_redirected(f):
        print "Hello world"
This isn't thread-safe, of course, but neither is doing this same dance manually. In single-threaded programs (for example in scripts) it is a popular way of doing things.
import sys
sys.stdout = open('stdout.txt', 'w')
Here is a variation of Yuda Prawira's answer:

- implement flush() and all the file attributes
- write it as a contextmanager
- capture stderr also
import contextlib, sys

@contextlib.contextmanager
def log_print(file):
    # capture all outputs to a log file while still printing it
    class Logger:
        def __init__(self, file):
            self.terminal = sys.stdout
            self.log = file

        def write(self, message):
            self.terminal.write(message)
            self.log.write(message)

        def __getattr__(self, attr):
            return getattr(self.terminal, attr)

    logger = Logger(file)
    _stdout = sys.stdout
    _stderr = sys.stderr
    sys.stdout = logger
    sys.stderr = logger
    try:
        yield logger.log
    finally:
        sys.stdout = _stdout
        sys.stderr = _stderr

with log_print(open('mylogfile.log', 'w')):
    print('hello world')
    print('hello world on stderr', file=sys.stderr)

# you can capture the output to a string with:
# with log_print(io.StringIO()) as log:
#     ....
#     print('[captured output]', log.getvalue())
You need a terminal multiplexer like either tmux or GNU screen
I'm surprised that a small comment by Ryan Amos on the original question is the only mention of a solution far preferable to all the others on offer, no matter how clever the Python trickery may be and how many upvotes they've received. Further to Ryan's comment, tmux is a nice alternative to GNU screen.
But the principle is the same: if you ever find yourself wanting to leave a terminal job running while you log out, head to the cafe for a sandwich, pop to the bathroom, go home (etc.), and then later reconnect to your terminal session from anywhere or any computer as though you'd never been away, terminal multiplexers are the answer. Think of them as VNC or remote desktop for terminal sessions. Anything else is a workaround. As a bonus, when the boss and/or partner comes in and you inadvertently ctrl-w / cmd-w your terminal window instead of your browser window with its dodgy content, you won't have lost the last 18 hours' worth of processing!
Based on this answer: https://stackoverflow.com/a/5916874/1060344, here is another way I figured out, which I use in one of my projects. Whatever you replace sys.stderr or sys.stdout with, you have to make sure that the replacement complies with the file interface, especially if this is something you are doing because stderr/stdout are used in some other library that is not under your control. That library may be using other methods of the file object.
Check out this way, where I still let everything go to stderr/stdout (or any file for that matter) and also send the message to a log file using Python's logging facility (but you can really do anything with this):
class FileToLogInterface(file):
    '''
    Interface to make sure that everytime anything is written to stderr, it is
    also forwarded to a file.
    '''

    def __init__(self, *args, **kwargs):
        if 'cfg' not in kwargs:
            raise TypeError('argument cfg is required.')
        else:
            if not isinstance(kwargs['cfg'], config.Config):
                raise TypeError(
                    'argument cfg should be a valid '
                    'PostSegmentation configuration object i.e. '
                    'postsegmentation.config.Config')
        self._cfg = kwargs['cfg']
        kwargs.pop('cfg')
        self._logger = logging.getLogger('access_log')
        super(FileToLogInterface, self).__init__(*args, **kwargs)

    def write(self, msg):
        super(FileToLogInterface, self).write(msg)
        self._logger.info(msg)
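Note that the built-in file type subclassed here is Python 2 only; on Python 3 a delegating wrapper gives the same effect (a sketch, not the author's code):

import logging

class FileToLog:
    """Wrap an open file and mirror every write to a logger as well."""
    def __init__(self, fileobj, logger_name='access_log'):
        self._file = fileobj
        self._logger = logging.getLogger(logger_name)

    def write(self, msg):
        self._file.write(msg)
        self._logger.info(msg)

    def __getattr__(self, attr):
        # delegate flush(), fileno(), etc. to the wrapped file
        return getattr(self._file, attr)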
Programs written in other languages (e.g. C) have to do special magic (called double-forking) expressly to detach from the terminal (and to prevent zombie processes). So, I think the best solution is to emulate them.
A plus of re-executing your program is that you can choose redirections on the command line, e.g. /usr/bin/python mycoolscript.py 2>&1 1>/dev/null
See this post for more info: What is the reason for performing a double fork when creating a daemon?
I know this question is answered (using python abc.py > output.log 2>&1 ), but I still have to say:
When writing your program, don't write to stdout. Always use logging to output whatever you want. That gives you a lot of freedom in the future when you want to redirect, filter, or rotate the output files (a sketch follows).
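For instance, a minimal sketch of the rotate-and-filter freedom this buys you (names are illustrative):

import logging
from logging.handlers import RotatingFileHandler

# a rotating log file instead of bare print() calls
logger = logging.getLogger("myapp")
handler = RotatingFileHandler("myapp.log", maxBytes=1_000_000, backupCount=3)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("use this instead of print()")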
As mentioned by @jfs, most solutions will not properly handle some types of stdout output, such as that from C extensions. There is a module on PyPI that takes care of all this, called wurlitzer. You just need its sys_pipes context manager. It's as easy as:
from contextlib import redirect_stdout
import os
from wurlitzer import sys_pipes

log = open("test.log", "a")
with redirect_stdout(log), sys_pipes():
    print("print statement")
    os.system("echo echo call")
Based on previous answers in this post, I wrote this class for myself as a more compact and flexible way of redirecting the output of pieces of code (here, just to a list) and ensuring that the output is normalized afterwards.
class out_to_lt():
    def __init__(self, lt):
        if type(lt) == list:
            self.lt = lt
        else:
            raise Exception("Need to pass a list")

    def __enter__(self):
        import sys
        self._sys = sys
        self._stdout = sys.stdout
        sys.stdout = self
        return self

    def write(self, txt):
        self.lt.append(txt)

    def __exit__(self, type, value, traceback):
        self._sys.stdout = self._stdout
Used as:
lt = []
with out_to_lt(lt) as o:
    print("Test 123\n\n")
    print(help(str))
Updating: I just found a scenario where I had to add two extra methods, but it was easy to adapt:
class out_to_lt():
    ...
    def isatty(self):
        return True  # True: you're running in a real terminal; False: you're being piped, redirected, or run from cron

    def flush(self):
        pass
There are other versions using context managers, but nothing this simple. I actually just googled to double-check that it would work, and was surprised not to see it. So, for other people looking for a quick solution that is safe and directed at only the code within the context block, here it is:
import sys

with open('test_file', 'w') as sys.stdout:
    print('Testing 1 2 3')
Tested like so:
$ cat redirect_stdout.py
import sys

with open('test_file', 'w') as sys.stdout:
    print('Testing 1 2 3')
$ python redirect_stdout.py
$ cat test_file
Testing 1 2 3