I have a memory-intensive Python application (its footprint ranges from hundreds of MB to several GB).
I have a couple of VERY SMALL Linux executables that the main application needs to run, e.g.
child = Popen("make html", cwd = r'../../docs', stdout = PIPE, shell = True)
child.wait()
When I run these external utilities (once, at the end of the long main process run) using subprocess.Popen I sometimes get OSError: [Errno 12] Cannot allocate memory.
I don't understand why... The requested process is tiny!
The system has enough memory for many more shells.
I'm using Linux (Ubuntu 12.10, 64-bit), so I guess subprocess calls fork().
And fork() duplicates my existing process, momentarily doubling the amount of memory consumed, and fails??
What happened to "copy on write"?
Can I spawn a new process without fork (or at least without copying memory - starting fresh)?
Related:
The difference between fork(), vfork(), exec() and clone()
fork () & memory allocation behavior
Python subprocess.Popen erroring with OSError: [Errno 12] Cannot allocate memory after period of time
Python memory allocation error using subprocess.Popen
It doesn't appear that a real solution will be forthcoming (i.e. an alternate implementation of subprocess that uses vfork). So how about a cute hack? At the beginning of your process, spawn a small helper that hangs around with a small memory footprint, ready to spawn your subprocesses for you, and keep a communication channel open to it throughout the life of the main process.
Here's an example using rfoo (http://code.google.com/p/rfoo/) with a named unix socket called rfoosocket (you could obviously use other connection types rfoo supports, or another RPC library):
Server:
import rfoo
import subprocess
class MyHandler(rfoo.BaseHandler):
    def RPopen(self, cmd):
        # communicate() reads stdout while waiting, avoiding a deadlock
        # if the child produces more output than the pipe buffer holds
        c = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
        out, _ = c.communicate()
        return out

rfoo.UnixServer(MyHandler).start('rfoosocket')
Client:
import rfoo
# Waste a bunch of memory before spawning the child. Swap out the RPC below
# for a straight popen to show it otherwise fails. Tweak to suit your
# available system memory.
mem = [x for x in range(100000000)]
c = rfoo.UnixConnection().connect('rfoosocket')
print rfoo.Proxy(c).RPopen('ls -l')
If you need real-time back-and-forth interaction with your spawned subprocesses, this model probably won't work, but you might be able to hack it in. You'll presumably want to clean up the available args that can be passed to Popen based on your specific needs, but that should all be relatively straightforward.
You should also find it straightforward to launch the server at the start of the client, and to manage the socket file (or port) to be cleaned up on exit.
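If pulling in rfoo isn't an option, the same trick can be roughed out with nothing but the standard library: fork a helper with multiprocessing while the parent is still small and feed it commands over a Pipe. This is only a sketch of the idea, not a drop-in solution:
import subprocess
from multiprocessing import Process, Pipe

def spawner(conn):
    # Runs in the helper process forked at startup, while memory is still small.
    for cmd in iter(conn.recv, None):      # None acts as the shutdown sentinel
        out = subprocess.check_output(cmd, shell=True)
        conn.send(out)
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    helper = Process(target=spawner, args=(child_conn,))
    helper.daemon = True
    helper.start()                         # start BEFORE allocating the big data

    # ... main program allocates its hundreds of MB here ...

    parent_conn.send('ls -l')              # executed by the tiny helper, not here
    print(parent_conn.recv())
    parent_conn.send(None)                 # tell the helper to exit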
Related
I currently have a worker subprocess that does a lot of processing for my main application. I have stdin and stdout already connected between the two, but now I need more than just these two channels between the main application and the subprocess worker, since the subprocess worker is heavily threaded and should be able to run multiple different workloads at the same time.
So for each workload, I want to dynamically create a separate pipe between the main process and the subprocess. I don't want to run more than one subprocess worker in the background; everything should run in a single subprocess.
The problem I ran into is creating named pipes between the main process and the subprocess that will work on both Unix and Windows. Unix has os.mkfifo(), which can be used with temp files, but that does not work on Windows. os.pipe() will not work either, because what it returns only exists in the main application, and I have no way to link that with the subprocess.
So basically,
tmp_read, tmp_write = os.pipe()
both tmp_read and tmp_write are just integers (file descriptors) that, as far as I can tell, only have meaning inside the main application. I can't simply send these integers to the subprocess and connect, as the subprocess has no idea what they refer to. Am I missing something, or is it not possible to share an arbitrary number of pipes between the processes using IPC? I can't use sockets for IPC either, since the computers this has to run on are heavily restricted and I don't want to deal with blocked ports.
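For what it's worth, on Unix the integers returned by os.pipe() can be inherited by the child and passed to it as plain text on the command line, so the child can reopen them; this doesn't help on Windows, which is the real sticking point here. A rough Unix-only sketch (child.py is hypothetical, and pass_fds needs Python 3.2+):
import os
import subprocess
import sys

r, w = os.pipe()

# Keep the read end open across exec and tell the child its number.
proc = subprocess.Popen([sys.executable, 'child.py', str(r)], pass_fds=(r,))
os.close(r)                      # the parent only writes
os.write(w, b'hello child\n')
os.close(w)
proc.wait()

# child.py would do roughly:
#   import os, sys
#   fd = int(sys.argv[1])
#   print(os.read(fd, 1024))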
I am trying to create a Monitor script that monitors all the threads of a huge Python script which has several loggers and several threads running.
From Monitor.py I could run a subprocess and forward its stdout, which might contain the status of the threads, but since several loggers are running I see their log output mixed in as well.
Question: How can I run the main script as a separate process and get custom messages and thread status without interfering with the logging? (Passing a PIPE as an argument?)
Main_Script.py: runs several threads; each thread has its own logger.
Monitor.py: spins up Main_script.py and monitors each of the threads in MainScript.py (and may obtain other messages from Main_script in the future).
So far, I have tried subprocess and Process from multiprocessing.
subprocess lets me start Main_script and forward its stdout back to the monitor, but the threads' log output comes through that same stdout. I am using the logging library to log the data from each thread to a separate file.
I also tried Process from multiprocessing. I had to call the main function of main_script.py as a process and send a Pipe argument to it from monitor.py, but now I can't see Main_script.py as a separate process when I run the top command.
Normally, you want to change the child process to work like a typical Unix userland tool: the logging and other side-band information goes to stderr (or to a file, or syslog, etc.), and only the actual output goes to stdout.
Then, the problem is easy: just capture stdout to a PIPE that you process, and either capture stderr to a different PIPE, or pass it through to real stderr.
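A minimal sketch of that separation (assuming the child is the asker's main_script.py and writes its status lines to stdout, while logging stays on stderr):
import subprocess
import sys

# Status messages arrive on the child's stdout; its stderr is simply
# inherited, so the logging ends up on the terminal as usual.
proc = subprocess.Popen(
    [sys.executable, 'main_script.py'],   # hypothetical child script
    stdout=subprocess.PIPE,
)
for line in proc.stdout:
    print('STATUS:', line.decode().rstrip())
proc.wait()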
If that's not appropriate for some reason, you need to come up with some other mechanism for IPC: Unix or Windows named pipes, anonymous pipes that you pass by leaking the file descriptor across the fork/exec and then pass the fd as an argument, Unix-domain sockets, TCP or UDP localhost sockets, a higher-level protocol like a web service on top of TCP sockets, mmapped files, anonymous mmaps or pipes that you pass between processes via a Unix-domain socket or Windows API calls, …
As you can see, there are a huge number of options. Without knowing anything about your problem other than that you want "custom messages", it's impossible to tell you which one you want.
While we're at it: If you can rewrite your code around multiprocessing rather than subprocess, there are nice high-level abstractions built in to that module. For example, you can use a Queue that automatically manages synchronization and blocking, and also manages pickling/unpickling so you can just pass any (picklable) object rather than having to worry about serializing to text and parsing the text. Or you can create shared memory holding arrays of int32 objects, or NumPy arrays, or arbitrary structures that you define with ctypes. And so on. Of course you could build the same abstractions yourself, without needing to use multiprocessing, but it's a lot easier when they're there out of the box.
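For instance, a bare-bones sketch of the Queue approach (the worker function is made up for illustration):
import multiprocessing

def worker(q):
    # Any picklable object can be put on the queue; no manual serialization.
    q.put({'thread': 'TCP_loop', 'status': 'alive'})

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    print(q.get())      # blocks until the worker puts something
    p.join()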
Finally, while your question is tagged ipc and pipe, and titled "Interprocess Communication", your description refers to threads, not processes. If you actually are using a bunch of threads in a single process, you don't need any of this.
You can just stick your results on a queue.Queue, or store them in a list or deque with a Lock around it, or pass in a callback to be called with each new result, or use a higher-level abstraction like concurrent.futures.ThreadPoolExecutor and return a Future object or an iterator of Futures, etc.
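For example, with concurrent.futures (a generic sketch, not tied to the asker's code):
import concurrent.futures

def workload(name):
    # ... do the per-workload processing ...
    return 'status of %s: done' % name

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(workload, n) for n in ('a', 'b', 'c')]
    for fut in concurrent.futures.as_completed(futures):
        print(fut.result())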
The setup
I have written a pretty complex piece of software in Python (on a Windows PC). My software starts basically two Python interpreter shells. The first shell starts up (I suppose) when you double click the main.py file. Within that shell, other threads are started in the following way:
# Start TCP_thread
TCP_thread = threading.Thread(name = 'TCP_loop', target = TCP_loop, args = (TCPsock,))
TCP_thread.start()
# Start UDP_thread
UDP_thread = threading.Thread(name = 'UDP_loop', target = UDP_loop, args = (UDPsock,))
UDP_thread.start()
The Main_thread starts a TCP_thread and a UDP_thread. Although these are separate threads, they all run within one single Python shell.
The Main_thread also starts a subprocess. This is done in the following way:
p = subprocess.Popen(['python', mySubprocessPath], shell=True)
From the Python documentation, I understand that this subprocess runs simultaneously (!) in a separate Python interpreter session/shell. The Main_thread in this subprocess is completely dedicated to my GUI. The GUI starts a TCP_thread for all its communications.
I know that things get a bit complicated. Therefore I have summarized the whole setup in this figure:
I have several questions concerning this setup. I will list them down here:
Question 1 [Solved]
Is it true that a Python interpreter uses only one CPU core at a time to run all the threads? In other words, will the Python interpreter session 1 (from the figure) run all 3 threads (Main_thread, TCP_thread and UDP_thread) on one CPU core?
Answer: yes, this is effectively true. The GIL (Global Interpreter Lock) ensures that only one thread executes Python bytecode at a time, so pure-Python threads cannot run on more than one CPU core at once.
Question 2 [Not yet solved]
Do I have a way to track which CPU core it is?
Question 3 [Partly solved]
For this question we forget about threads, but we focus on the subprocess mechanism in Python. Starting a new subprocess implies starting up a new Python interpreter instance. Is this correct?
Answer: Yes this is correct. At first there was some confusion about whether the following code would create a new Python interpreter instance:
p = subprocess.Popen(['python', mySubprocessPath], shell = True)
The issue has been clarified. This code indeed starts a new Python interpreter instance.
Will Python be smart enough to make that separate Python interpreter instance run on a different CPU core? Is there a way to track which one, perhaps with some sporadic print statements as well?
Question 4 [New question]
The community discussion raised a new question. There are apparently two approaches when spawning a new process (within a new Python interpreter instance):
# Approach 1(a)
p = subprocess.Popen(['python', mySubprocessPath], shell = True)
# Approach 1(b) (J.F. Sebastian)
p = subprocess.Popen([sys.executable, mySubprocessPath])
# Approach 2
p = multiprocessing.Process(target=foo, args=(q,))
The second approach has the obvious downside that it targets just a function - whereas I need to open up a new Python script. Anyway, are both approaches similar in what they achieve?
Q: Is it true that a Python interpreter uses only one CPU core at a time to run all the threads?
No. The GIL and CPU affinity are unrelated concepts. The GIL can be released during blocking I/O operations and during long CPU-intensive computations inside a C extension.
If a thread is blocked waiting for the GIL, it is not running on any CPU core, and therefore it is fair to say that pure Python multithreading code may use only one CPU core at a time on the CPython implementation.
Q: In other words, will the Python interpreter session 1 (from the figure) run all 3 threads (Main_thread, TCP_thread and UDP_thread) on one CPU core?
I don't think CPython manages CPU affinity implicitly. It likely relies on the OS scheduler to choose where to run a thread. Python threads are implemented on top of real OS threads.
Q: Or is the Python interpreter able to spread them over multiple cores?
To find out the number of usable CPUs:
>>> import os
>>> len(os.sched_getaffinity(0))
16
Again, whether or not threads are scheduled on different CPUs does not depend on the Python interpreter.
Q: Suppose that the answer to Question 1 is 'multiple cores', do I have a way to track on which core each thread is running, perhaps with some sporadic print statements? If the answer to Question 1 is 'only one core', do I have a way to track which one it is?
I imagine the specific CPU may change from one time slot to another. You could look at something like /proc/<pid>/task/<tid>/status on old Linux kernels. On my machine, task_cpu can be read from /proc/<pid>/stat or /proc/<pid>/task/<tid>/stat:
>>> open("/proc/{pid}/stat".format(pid=os.getpid()), 'rb').read().split()[-14]
'4'
For a current portable solution, see whether psutil exposes such info.
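If psutil is installed, something along these lines should work (Process.cpu_num() is available in recent psutil releases on Linux; treat this as a sketch rather than a guaranteed cross-platform API):
import psutil

p = psutil.Process()      # the current process
print(p.cpu_num())        # index of the CPU this process last ran on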
You could restrict the current process to a set of CPUs:
os.sched_setaffinity(0, {0}) # current process on 0-th core
Q: For this question we forget about threads, but we focus on the subprocess mechanism in Python. Starting a new subprocess implies starting up a new Python interpreter session/shell. Is this correct?
Yes. The subprocess module creates new OS processes. If you run the python executable, it starts a new Python interpreter. If you run a bash script, no new Python interpreter is created, i.e., running the bash executable does not start a new Python interpreter/session/etc.
Q: Supposing that it is correct, will Python be smart enough to make that separate interpreter session run on a different CPU core? Is there a way to track this, perhaps with some sporadic print statements as well?
See above (i.e., OS decides where to run your thread and there could be OS API that exposes where the thread is run).
multiprocessing.Process(target=foo, args=(q,)).start()
multiprocessing.Process also creates a new OS process (that runs a new Python interpreter).
In reality, my subprocess is another file. So this example won't work for me.
Python uses modules to organize the code. If your code is in another_file.py then import another_file in your main module and pass another_file.foo to multiprocessing.Process.
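A minimal sketch of that (another_file and foo are the placeholder names from this discussion):
# main.py
import multiprocessing
import another_file        # the code that used to be run as a separate script

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=another_file.foo, args=(q,))
    p.start()
    print(q.get())         # assuming foo() puts a result on the queue
    p.join()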
Nevertheless, how would you compare it to p = subprocess.Popen(..)? Does it matter if I start the new process (or should I say 'Python interpreter instance') with subprocess.Popen(..) versus multiprocessing.Process(..)?
multiprocessing.Process() also creates a new OS process under the hood. multiprocessing provides an API that is similar to the threading API, and it abstracts away the details of communication between Python processes (how Python objects are serialized to be sent between processes).
If there are no CPU-intensive tasks then you could run your GUI and I/O threads in a single process. If you have a series of CPU-intensive tasks then, to utilize multiple CPUs at once, either use multiple threads with C extensions such as lxml, regex, numpy (or your own, created using Cython) that can release the GIL during long computations, or offload them into separate processes (a simple way is to use a process pool such as the one provided by concurrent.futures).
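For example, a tiny process-pool sketch (cpu_task is made up for illustration):
import concurrent.futures

def cpu_task(n):
    # Placeholder for a CPU-bound computation
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    with concurrent.futures.ProcessPoolExecutor() as pool:
        results = list(pool.map(cpu_task, [10**6, 10**7, 10**7]))
    print(results)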
Q: The community discussion raised a new question. There are apparently two approaches when spawning a new process (within a new Python interpreter instance):
# Approach 1(a)
p = subprocess.Popen(['python', mySubprocessPath], shell = True)
# Approach 1(b) (J.F. Sebastian)
p = subprocess.Popen([sys.executable, mySubprocessPath])
# Approach 2
p = multiprocessing.Process(target=foo, args=(q,))
"Approach 1(a)" is wrong on POSIX (though it may work on Windows). For portability, use "Approach 1(b)" unless you know you need cmd.exe (pass a string in this case, to make sure that the correct command-line escaping is used).
The second approach has the obvious downside that it targets just a function - whereas I need to open up a new Python script. Anyway, are both approaches similar in what they achieve?
subprocess creates new processes of any kind, e.g., you could run a bash script. multiprocessing is used to run Python code in another process. It is more flexible to import a Python module and run one of its functions than to run it as a script. See Call python script with input with in a python script using subprocess.
You are using the threading module, which is built on top of the low-level thread module. As the documentation suggests, it uses the POSIX thread implementation (pthread) of your OS.
The threads are managed by the OS instead of the Python interpreter, so the answer depends on the pthread library in your system. However, CPython uses the GIL to prevent multiple threads from executing Python bytecode simultaneously, so their execution is serialized. They can still be scheduled on different cores, depending on your pthread library.
Simply use a debugger and attach it to your python.exe, for example GDB's thread command.
Similar to question 1, the new process is managed by your OS and is probably running on a different core. Use a debugger or any process monitor to see it. For more details, see the CreateProcess() documentation page.
1, 2: You have three real threads, but in CPython they're limited by the GIL, so, assuming they're running pure Python code, you'll see CPU usage as if only one core were used.
3: As gdlmx said, it's up to the OS to choose a core to run a thread on, but if you really need control, you can set process or thread affinity using the native API via ctypes. Since you are on Windows, it would be like this:
import ctypes
import subprocess

# This will run your subprocess on core #0 only
p = subprocess.Popen(['python', mySubprocessPath], shell=True)
cpu_mask = 1  # bit 0 set -> CPU 0
ctypes.windll.kernel32.SetProcessAffinityMask(p._handle, cpu_mask)
I use the private Popen._handle here for simplicity. The clean way would be OpenProcess(p.pid) etc.
And yes, subprocess runs python, like anything else it runs, in another new process.
I have a process that connects to a pipe with Python 2.7's multiprocessing.Listener() and waits for a message with recv(). I run it on both Windows 7 and Ubuntu 11.
On Windows, the pipe is called \\.\pipe\some_unique_id. On Ubuntu, the pipe is called /temp/some_unique_id. Other than that, the code is the same.
All works well, until, in an unrelated bug, monit starts a SECOND copy of the same program. It tries to listen to the exact same pipe.
I had naively* expected that the second connection attempt would fail, leaving the first connection unscathed.
Instead, I find the behaviour is officially undefined.
Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time.
On Ubuntu, the earlier copies seem to be ignored, and are left without any messages, while the latest version wins.
On Windows, there is some more complex behaviour. Sometimes the original pipe raises an EOFError exception on the recv() call. Sometimes, both listeners are allowed to co-exist and each message is distributed arbitrarily.
Is there a way to open a pipe exclusively, so the second process cannot open the pipe while the first process hasn't closed it or exited?
* I could have sworn I manually tested this exact scenario, but clearly I did not.
Other SO questions I looked at:
several TCP-servers on the same port - I don't (knowingly) set SO_REUSEADDR
Can two applications listen to the same port?
accept() with sockets shared between multiple processes (based on Apache preforking) - there's no forking involved.
Named pipes have the same access semantics as regular files. Any process with read or write permission can open the pipe for reading or writing.
If you had a way to guarantee that the two instances of the Python script were invoked by processes with differing UIDs or GIDs, then you could implement exclusive access control using file permissions.
If both instances of the script have the same UID and GID, you can try file locking implemented in Skip Montanaro's FileLock hosted on github. YMMV.
A simpler way to implement this might be to create a lock file in /var/lock that contains the PID of the process creating the lock file and then check for the existence of the lock file before opening the pipe. This scheme is used by most long-running daemons but has problems when the processes that create the lock files terminate in situations that prevent them from removing the lock file.
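A rough sketch of that lock-file idea (the path and error handling are simplified, and the stale-lock problem mentioned above is not handled):
import os
import sys

LOCKFILE = '/var/lock/some_unique_id.pid'   # assumed writable location

try:
    # O_EXCL makes creation atomic: the open fails if the file already exists.
    fd = os.open(LOCKFILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
except OSError:
    sys.exit('another instance appears to be running')

os.write(fd, str(os.getpid()).encode())
os.close(fd)
try:
    pass  # ... open the pipe and run the listener here ...
finally:
    os.unlink(LOCKFILE)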
You could also try a Python System V semaphore to prevent simultaneous access.
I wish to have a python script that runs an external program in a loop sequentially. I also want to limit each execution of the program to a max running time. If it is exceeded, then kill the program. What is the best way to accomplish this?
Thanks!
To run an external program from Python you'll normally want to use the subprocess module.
You could "roll your own" subprocess handling using os.fork() and os.execve() (or one of its exec* cousins) ... with any file descriptor plumbing and signal handling magic you like. However, the subprocess.Popen() function has implemented and exposed most of the features for what you'd want to do for you.
To arrange for the program to die after a given period of time, you can have your Python script kill it after the timeout. Naturally you'll want to check whether the process has already completed before then. Here's a dirt stupid example (using the split function from the shlex module for additional readability):
from shlex import split as splitsh
import subprocess
import time

TIMEOUT = 10
cmd = splitsh('/usr/bin/sleep 60')
proc = subprocess.Popen(cmd)
time.sleep(TIMEOUT)
pstatus = proc.poll()
if pstatus is None:
    proc.kill()
    # Could use os.kill() to send a specific signal
    # such as HUP or TERM, check status again and
    # then resort to proc.kill() or os.kill() for
    # SIGKILL only if necessary
As noted there are a few ways to kill your subprocess. Note that I check for "is None" rather than testing pstatus for truth. If your process completed with an exit value of zero (conventionally indicating that no error occurred) then a naïve test of the proc.poll() results would conflate that completion with the still running process status.
There are also a few ways to determine if sufficient time has passed. In this example we sleep, which is somewhat silly if there's anything else we could be doing. That just leaves our Python process (the parent of your external program) laying about idle.
You could capture the start time using time.time() then launch your subprocess, then do other work (launch other subprocesses, for example) and check the time (perhaps in a loop of other activity) until your desired timeout has been exceeded.
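A sketch of that polling approach (the timeout and command are arbitrary):
import shlex
import subprocess
import time

TIMEOUT = 10
proc = subprocess.Popen(shlex.split('/usr/bin/sleep 60'))
deadline = time.time() + TIMEOUT

while proc.poll() is None:
    if time.time() > deadline:
        proc.kill()
        proc.wait()
        break
    # ... launch or service other subprocesses here instead of just sleeping ...
    time.sleep(0.1)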
If any of your other activity involves file or socket (network) operations then you'd want to consider using the select module as a way to return a list of file descriptors which are readable, writable or ready with "exceptional" events. The select.select() function also takes an optional "timeout" value. A call to select.select([],[],[],x) is essentially the same as time.sleep(x) (in the case where we aren't providing any file descriptors for it to select among).
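For example, waiting on a child's stdout with a timeout might look roughly like this (a Unix-only sketch; select() on pipe descriptors does not work on Windows):
import select
import subprocess

proc = subprocess.Popen(['/usr/bin/sleep', '5'], stdout=subprocess.PIPE)

# Wait up to 2 seconds for the child to produce output (or close its stdout).
readable, _, _ = select.select([proc.stdout], [], [], 2.0)
if readable:
    print(proc.stdout.read())
else:
    print('nothing readable within the timeout')
proc.terminate()
proc.wait()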
In lieu of select.select() it's also possible to use the fcntl module to set your file descriptor into non-blocking mode and then use os.read() (NOT the normal file object .read() methods, but the lower-level functionality from the os module). Again, it's better to use the higher-level interfaces where possible and only resort to the lower-level functions when you must. (If you use non-blocking I/O then all your os.read() or similar operations must be done within exception-handling blocks, since Python represents the EWOULDBLOCK condition as an OSError, like "OSError: [Errno 11] Resource temporarily unavailable" on Linux. The precise error number might vary from one OS to another; however, it should be portable, at least for POSIX systems, to compare against the EWOULDBLOCK value from the errno module.)
(I realize I'm going down a rathole here, but information on how your program can do something useful while your child processes are running external programs is a natural extension of how to manage the timeouts for them).
Ugly details about non-blocking file I/O (including portability issues with MS Windows) have been discussed here in the past: Stackoverflow: non-blocking read on a stream in Python
As others have commented, it's better to provide more detailed questions and include short, focused snippets of code which show what effort you've already undertaken. Usually you won't find people here inclined to write tutorials rather than answers.
If you are able to use Python 3.3
From the docs:
subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False, timeout=None)
subprocess.call(["ls", "-l"])
0
subprocess.call("exit 1", shell=True)
1
Should do the trick.
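A quick sketch of actually using the timeout parameter (the command is arbitrary):
import subprocess

try:
    # Kills the child and raises TimeoutExpired if it runs longer than 5 seconds.
    subprocess.call(['/usr/bin/sleep', '60'], timeout=5)
except subprocess.TimeoutExpired:
    print('command exceeded the time limit and was killed')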