Do Python threads wait for standard output?

If you run a couple of threads that all have to print to the same stdout, does that mean they have to wait on each other? Say all 4 threads have something to write: do they have to pause and wait for stdout to be free before they can get on with their work?

Deep deep (deep deep deep...) down in the OS's system calls, yes. Modern OSes have thread-safe terminal printing routines which usually just lock around the critical sections that do the actual device access (or buffer, depending on what you're writing into and what its settings are). These waits are very short, however. Keep in mind that this is IO you're dealing with here, so the wait times are likely to be negligible relative to actual IO execution.

It depends. If stdout is a pipe, writes land in the pipe's kernel buffer (traditionally 4 KB; 64 KB on modern Linux, adjustable via fcntl(F_SETPIPE_SZ)). Buffers are flushed when they fill up or on an explicit call to flush().
If stdout is a terminal, output is usually line buffered. So until you print a newline, all threads can write to their buffers. When the newline is written, the whole buffer is dumped on the console and all other threads that are writing newlines at the same time have to wait.
Since threads do other things besides writing newlines, each thread gets some CPU. So even in the worst case, the congestion should be pretty small.
There is one exception, though: when you write a lot of data, or when the console is slow (like the Linux kernel debug console, which uses the serial port). When the console can't cope with the amount of data, more and more threads will hang in the newline write, waiting for the buffers to flush.
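If un-interleaved lines matter regardless of what the layers below do, you can serialize whole-line writes yourself. A minimal sketch (the log_line helper and the StringIO stand-in for stdout are ours, not from the question):

```python
import io
import threading

print_lock = threading.Lock()  # shared by every printing thread

def log_line(stream, text):
    # holding the lock for the whole write keeps lines intact,
    # whatever the buffering underneath is doing
    with print_lock:
        stream.write(text + "\n")

out = io.StringIO()  # stands in for sys.stdout so we can inspect the result
threads = [threading.Thread(target=log_line, args=(out, f"thread {i} done"))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the real sys.stdout in place of out, the behaviour is the same: at most one thread is inside the write at a time, so contention is bounded by the duration of the write itself.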


Confused about buffered and unbuffered stdout/stderr in C and Python

I am confused about a couple of things when it comes to the issue of stdout and stderr being buffered/unbuffered:
1)
Is the statement "stdout/err is buffered/unbuffered" decided by my operating system or by the programming language library functions (particularly the write() or print() functions) that I am working with?
While programming in C, I have always gone by the rule that stdout is buffered while stderr is unbuffered. I have seen this in action by calling sleep() after putchar() statements within a while loop to see the individual characters being placed on stderr one by one, while only complete lines appeared in stdout. When I tried to replicate this program in python, both stderr and stdout had the same behaviour: produced complete lines - so I looked this up and found a post that said:
sys.stderr is line-buffered by default since Python 3.9.
Hence the question - because I was under the impression that the behaviour of stderr being buffered/unbuffered was decided and fixed by the OS, but apparently code libraries are free to implement their own behaviour? Can I hypothetically write a routine that writes to stdout without a buffer?
The relevant code snippets for reference:
/* C */
while ((c = fgetc(file)) != EOF) {
    fputc(c, stdout /* or stderr */);
    usleep(800);
}
# Python
for line in file:
    for ch in line:
        print(ch, end='', file=sys.stdout)  # or sys.stderr
        time.sleep(0.08)
2)
Secondly, my understanding of the need for buffering is this: since disk access is slower than RAM access, writing individual bytes would be inefficient, so bytes are written in blocks. But is writing to a device file like /dev/stdout and /dev/stdin the same as writing to disk? (Isn't disk supposed to be permanent? Stuff written to stdout or stderr only appears in the terminal, if one is connected, and is then lost, right?)
3)
Finally, is there really a need for stderr to be unbuffered in C if it is less efficient?
Is the statement "stdout/err is buffered/unbuffered" decided by my operating system or by the programming language library functions (particularly the write() or print() functions) that I am working with?
Mostly it is decided by the programming language implementation, and programming languages standardize this. For example, the C language specification says:
At program startup, three text streams are predefined and need not be
opened explicitly — standard input (for reading conventional input),
standard output (for writing conventional output), and standard error (for writing diagnostic output). As initially opened, the
standard error stream is not fully buffered; the standard input and
standard output streams are fully buffered if and only if the stream
can be determined not to refer to an interactive device.
(C2017, paragraph 7.21.3/7)
Similarly, the Python docs for sys.stdin, sys.stdout, and sys.stderr say:
When interactive, the stdout stream is line-buffered. Otherwise, it is
block-buffered like regular text files. The stderr stream is
line-buffered in both cases. You can make both streams unbuffered by
passing the -u command-line option or setting the PYTHONUNBUFFERED
environment variable.
Be aware, however, that both of those particular languages provide mechanisms to change the buffering of the standard streams (or in the Python case, at least stdout and stderr).
Moreover, the above is relevant only if you are using streams (C) or file objects (Python). In C, this is what all of the stdio functions use -- printf(), fgets(), fwrite(), etc. -- but it is not what (say) the POSIX raw I/O functions such as read() and write() use. If you use raw I/O interfaces such as the latter, then there is only whatever buffering you perform manually.
Hence the question - because I was under the impression that the behaviour of stderr being buffered/unbuffered was decided and fixed by the OS
No. The OS (at least Unixes (including Mac) and Windows) does not perform I/O buffering on behalf of programs. Programming language implementations do, under some circumstances, and they are then in control of the details.
but apparently code libraries are free to implement their own behaviour?
It's a bit more nuanced than that, but basically yes.
Can I hypothetically write a routine that writes to stdout without a buffer?
Maybe. In C or Python, at least, you can exert some control over the buffering mode of the stdout stream. In C you can adjust it dynamically at runtime with setvbuf(); in Python the mode is largely decided when the interpreter starts, though since Python 3.7 you can call sys.stdout.reconfigure() to switch line buffering on or off.
You may also be able to bypass the buffer of a buffered stream by performing (raw) I/O on the underlying file descriptor, but this is extremely poor form, and depending on the details, it may produce undefined behavior.
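The ordering surprise that makes this poor form is easy to reproduce with a pipe of our own standing in for stdout; here io.BufferedWriter plays the role of the stream's buffer (a sketch, not anyone's real setup):

```python
import io
import os

r, w = os.pipe()
buffered = io.BufferedWriter(io.FileIO(w, "w"))     # the "stream" layer over fd w

buffered.write(b"written first, via the buffer\n")  # sits in the library buffer
os.write(w, b"written second, via raw write(2)\n")  # bypasses the buffer entirely

buffered.close()         # flushes the pending bytes and closes fd w
data = os.read(r, 1024)  # the raw bytes arrive first, despite being written later
os.close(r)
```

The same inversion happens if you mix print() with os.write() on stdout's file descriptor while the stream is block-buffered, which is exactly why bypassing the buffer is discouraged.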
Secondly, my understanding of the need for buffering is that: since disk access is slower than RAM access, writing individual bytes would be inefficient and thus bytes are written in blocks.
All I/O is slow, even I/O to a terminal. Disk I/O tends to be especially slow, but program performance generally benefits from buffering I/O to all devices.
But is writing to a device file like /dev/stdout and /dev/stdin the same as writing to disk?
Sometimes it is exactly writing to disk (look up I/O redirection). Different devices do have different performance characteristics, so buffering may improve performance more with some than with others, but again, all I/O is slow.
Finally, is there really a need for stderr to be unbuffered in C if it is less efficient?
The point of stderr being unbuffered (by default) in C is so that messages directed there are written to the underlying device (often a terminal) as soon as possible. Efficiency is not really a concern for the kinds of messages that this policy is most intended to serve.
https://linux.die.net/man/3/stderr, https://linux.die.net/man/3/setbuf, and https://linux.die.net/man/2/write are helpful resources here
If you use the raw syscall write, there won't be buffering. I'd imagine the same is true for WinAPI but I don't know.
Python and C want to make it easier to write things, so they wrap the raw syscalls with a file pointer (in C) / file object (in Python). This wrapper, in addition to storing the raw file descriptor used to make the syscalls, can optionally do things like buffer to reduce the number of syscalls you're making.
You can change the buffering settings of a file or stream. (In C that's setbuf or setvbuf; I'm not sure for Python.)
C and Python just happen to have different default configurations of stderr's wrapper.
For 2), writing to a pipe is usually much faster than writing to disk, but it's still a relatively slow operation compared to memcpy or the like, which is what buffering essentially is. The processor has to jump into kernel mode and back.
For 3), I'd guess that C developers decided it was more important to get errors on-time than to get performance. In general, if your program is spitting out lots of data to stderr you have bigger problems than performance.

Interprocess Communication between two python scripts without STDOUT

I am trying to create a Monitor script that monitors all the threads of a huge Python script which has several loggers and several threads running.
From Monitor.py I can run a subprocess and forward its STDOUT, which might contain the status of the threads, but since several loggers are running I see other logging mixed in.
Question: How can I run the main script as a separate process and get custom messages and thread status without interfering with the logging? (Passing a PIPE as an argument?)
Main_Script.py: runs several threads; each thread has a separate logger.
Monitor.py: spins up Main_Script.py and monitors each of the threads in it (and may obtain other messages from Main_Script.py in the future).
So far, I have tried subprocess and Process from multiprocessing.
subprocess lets me start Main_Script.py and forward the stdout back to the monitor, but I see the logging of the threads coming through the same STDOUT. I am using the logging library to log the data from each thread to separate files.
I also tried Process from multiprocessing. I had to call the main function of Main_Script.py as a process and send a PIPE argument to it from Monitor.py, but then I can't see Main_Script.py as a separate process when I run the top command.
Normally, you want to change the child process to work like a typical Unix userland tool: the logging and other side-band information goes to stderr (or to a file, or syslog, etc.), and only the actual output goes to stdout.
Then, the problem is easy: just capture stdout to a PIPE that you process, and either capture stderr to a different PIPE, or pass it through to real stderr.
If that's not appropriate for some reason, you need to come up with some other mechanism for IPC: Unix or Windows named pipes, anonymous pipes that you pass by leaking the file descriptor across the fork/exec and then pass the fd as an argument, Unix-domain sockets, TCP or UDP localhost sockets, a higher-level protocol like a web service on top of TCP sockets, mmapped files, anonymous mmaps or pipes that you pass between processes via a Unix-domain socket or Windows API calls, …
As you can see, there are a huge number of options. Without knowing anything about your problem other than that you want "custom messages", it's impossible to tell you which one you want.
While we're at it: If you can rewrite your code around multiprocessing rather than subprocess, there are nice high-level abstractions built in to that module. For example, you can use a Queue that automatically manages synchronization and blocking, and also manages pickling/unpickling so you can just pass any (picklable) object rather than having to worry about serializing to text and parsing the text. Or you can create shared memory holding arrays of int32 objects, or NumPy arrays, or arbitrary structures that you define with ctypes. And so on. Of course you could build the same abstractions yourself, without needing to use multiprocessing, but it's a lot easier when they're there out of the box.
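A sketch of that Queue abstraction (the child function and its payload are stand-ins for Main_Script.py's reporting; the example pins the Unix fork start method so it stays self-contained):

```python
import multiprocessing as mp

def child(q):
    # any picklable object can go on the queue; no text protocol to invent,
    # and stdout stays free for the loggers
    q.put({"thread": "worker-1", "status": "alive"})

def run():
    ctx = mp.get_context("fork")  # fork keeps the sketch self-contained on Unix
    q = ctx.Queue()
    p = ctx.Process(target=child, args=(q,))
    p.start()
    msg = q.get()  # blocks until the child reports in
    p.join()
    return msg

if __name__ == "__main__":
    print(run())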
Finally, while your question is tagged ipc and pipe, and titled "Interprocess Communication", your description refers to threads, not processes. If you actually are using a bunch of threads in a single process, you don't need any of this.
You can just stick your results on a queue.Queue, or store them in a list or deque with a Lock around it, or pass in a callback to be called with each new result, or use a higher-level abstraction like concurrent.futures.ThreadPoolExecutor and return a Future object or an iterator of Futures, etc.
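The queue.Queue version of that is only a few lines; a sketch (worker and the squaring payload are made up for illustration):

```python
import queue
import threading

results = queue.Queue()  # thread-safe: put()/get() do their own locking

def worker(n):
    results.put((n, n * n))  # report a result; no explicit Lock needed

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

collected = dict(results.get() for _ in range(4))  # every thread has reported
```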

In h5py, do I need to call flush() before I close a file?

In the Python HDF5 library h5py, do I need to flush() a file before I close() it?
Or does closing the file already make sure that any data that might still be in the buffers will be written to disk?
What exactly is the point of flushing? When would flushing be necessary?
No, you do not need to flush the file before closing. Flushing is done automatically by the underlying HDF5 C library when you close the file.
As to the point of flushing. File I/O is slow compared to things like memory or cache access. If programs had to wait before data was actually on the disk each time a write was performed, that would slow things down a lot. So the actual writing to disk is buffered by at least the OS, but in many cases by the I/O library being used (e.g., the C standard I/O library). When you ask to write data to a file, it usually just means that the OS has copied your data to its own internal buffer, and will actually put it on the disk when it's convenient to do so.
Flushing overrides this buffering, at whatever level the call is made. So calling h5py.File.flush() will flush the HDF5 library buffers, but not necessarily the OS buffers. The point of this is to give the program some control over when data actually leaves a buffer.
For example, writing to the standard output is usually line-buffered. But if you really want to see the output before a newline, you can call fflush(stdout). This might make sense if you are piping the standard output of one process into another: that downstream process can start consuming the input right away, without waiting for the OS to decide it's a good time.
Another good example is making a call to fork(2). This usually copies the entire address space of a process, which means the I/O buffers as well. That may result in duplicated output, unnecessary copying, etc. Flushing a stream guarantees that the buffer is empty before forking.
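In Python terms the levels line up like this: flush() moves data from the library buffer down to the OS, and os.fsync() asks the OS to push it on to the physical disk. A small sketch with a temp file (names ours):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

f = open(path, "w")          # buffered at the library level
f.write("important result")  # may still be sitting in the library buffer
f.flush()                    # now it is at least in the OS's buffers
os.fsync(f.fileno())         # now the OS has been asked to put it on disk

with open(path) as g:        # another reader sees it after the flush
    data = g.read()

f.close()
os.remove(path)
```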

Twisted, how can ProcessProtocol receive stdout w/o buffering?

I'm using an external process which writes a short line of output for each chunk of data it processes. I would like to react to each of these lines without any additional delay. However, it seems that .outReceived() of ProcessProtocol is buffered. The docs state:
.outReceived(data): This is called with data that was received from
the process' stdout pipe. Pipes tend to provide data in larger
chunks than sockets (one kilobyte is a common buffer size), so you
may not experience the "random dribs and drabs" behavior typical of
network sockets, but regardless you should be prepared to deal if you
don't get all your data in a single call. To do it properly,
outReceived ought to simply accumulate the data and put off doing
anything with it until the process has finished.
The result is that I get the output in one chunk after the whole processing is done. How can I force ProcessProtocol not to buffer stdout?
I'm using an external process which writes a short line of output for each chunk of data it processes. I would like to react to each of these lines without any additional delay.
The result is that I get the output in one chunk after the whole processing is done. How can I force ProcessProtocol not to buffer stdout?
The buffering is happening in the producer process, not the consumer. Standard C library stdout is line-buffered only when connected to a terminal, otherwise it is fully-buffered. This is what causes the producer process to output data in large chunks rather than line by line when it is not connected to a terminal.
Use the stdbuf utility to force the producer process's stdout to be line-buffered.
If the producer process is a Python script, use the -u interpreter switch to turn off buffering of the standard streams completely. The stdbuf utility is the better option, though.
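A sketch of the Python-producer case: with -u, the consumer sees each line as soon as print() emits it (the inline child script is invented for the demo):

```python
import subprocess
import sys

# -u disables the child's stream buffering, so each line hits the pipe
# as soon as print() runs instead of sitting in a block buffer
child = subprocess.Popen(
    [sys.executable, "-u", "-c", "for i in range(3): print('chunk', i)"],
    stdout=subprocess.PIPE,
    text=True,
)

lines = [line.rstrip("\n") for line in child.stdout]  # arrives line by line
child.wait()
```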

dealing with subprocess that floods stdout

I'm dealing with a subprocess that occasionally goes into an infinite loop and floods stdout with garbage. I generally need to capture stdout, except in those cases.
This discussion gives a way to limit the amount of time a subprocess takes, but the problem is that with a reasonable timeout it can produce GBs of output before being killed.
Is there a way to limit the amount of output that's captured from the process?
If you can't detect when the flooding occurs, chances are nobody else can. Since you do the capturing, you are of course free to limit it, but that requires that you know when the looping has occurred.
Perhaps you can use rate-limiting, if the "regular" rate is lower than the one observed when the spamming occurs?
You can connect the subprocess's stdout to a file-like object that limits the amount of data it will pass on to the real stdout when you call Popen. The file-like object could be a FIFO or an in-memory buffer such as io.BytesIO.
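One robust variant of that idea is to read the pipe in chunks yourself and kill the child once a cap is reached; a sketch (the flooding child here is a stand-in for the misbehaving process):

```python
import subprocess
import sys

LIMIT = 64 * 1024  # stop capturing after 64 KiB

# stand-in for the misbehaving process: floods stdout forever
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys\nwhile True: sys.stdout.write('x' * 1024)"],
    stdout=subprocess.PIPE,
)

captured = bytearray()
while len(captured) < LIMIT:
    chunk = child.stdout.read(4096)  # blocks until data is available
    if not chunk:                    # child exited and the pipe drained
        break
    captured.extend(chunk)

child.kill()   # past the limit: give up on the flooder
child.wait()
child.stdout.close()
```

The child eventually blocks in write() when the pipe fills, so it can never get more than a pipe-buffer's worth ahead of the reader, and nothing near GBs ever accumulates.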
