OSError: [Errno 12] Not enough space on os.execl call - Python

So I was messing around with a script that is supposed to restart itself using os.execl.
It works a few times, but after roughly 30 or 40 calls it crashes:
Traceback (most recent call last):
  File "C:\Users\Admin#\PycharmProjects\DiscordBot_Fred_the_Financier\test_suit.py", line 9, in <module>
    os.execl(sys.executable, sys.executable, *(*sys.argv, code))
  File "C:\Users\Admin#\AppData\Local\Programs\Python\Python37\lib\os.py", line 540, in execl
    execv(file, args)
OSError: [Errno 12] Not enough space
This is the whole code I'm actually running:
import sys
import os
print(sys.argv) # print args
code = "" # placeholder for mutable args
os.execl(sys.executable, sys.executable, *(*sys.argv, code)) # passing new args and replacing process
I have literally no idea why or how this error occurs.
All my drives have more than 200 GB of free storage, and I have more than 17 GB of free RAM as well.
I'm running this code via the terminal on Windows 10 64-bit with Python 3.7.
Thank you very much for your help!
P.S. I apologize if there is already an answer to this problem but I could not find one.

Are you opening any huge files in your script? Most likely you are not closing those file handles and they keep accumulating. Once the script crashes, all the handles are released and you see the 200 GB again.
While the script is running, can you keep an eye on disk usage? Do you see it rising continuously (at least after the subsequent exec calls)?
EDIT: I see now that the question was asked on Mar 28 '20 at 18:58. I do not know why I saw it in the main list. The question can be closed if the OP does not reply or give more info.

I haven't found out why this error happens, but I assume it has something to do with the parent process not being able to fully close (maybe due to leftover references or something similar). I did find a workaround, though, which I am leaving here in case it helps you or someone else with the same problem.
The workaround is using subprocess.Popen instead of os.execl, immediately followed by os._exit(1). The latter shuts the parent process down at once and thus frees all its resources, so if you do this you won't get the 'Not enough space' error, no matter how many times you restart the process. To verify this, I ran these two lines in an infinite loop (in retrospect the loop is unnecessary, since a second iteration is impossible after the os._exit(1) statement), had each replacement process write a counter to a .txt file, and left it running for a while. After a considerable amount of time I stopped the final process with Ctrl+C, and when I opened the file I found the value 15127, which means the process had been replaced 15127 times before I interrupted it, and no exception was ever raised due to space issues.
Additionally, as with os.execl, you can pass command-line arguments by passing sys.argv, for example by typing:
import os
import sys
import subprocess
print(sys.argv)
code = ""
subprocess.Popen([sys.executable, *(*sys.argv, code)])
os._exit(1)
I omitted the second sys.executable because I am not sure what you are trying to do or whether it would work in subprocess.Popen, but given the similarity of these two commands I'd say it would work the same way if you typed subprocess.Popen(sys.executable, sys.executable, *(*sys.argv, code)) instead.
EDIT: Actually, with subprocess.Popen you need to pass all arguments as a list, as otherwise the second argument you pass will be treated as bufsize according to the method signature (see https://docs.python.org/3/library/subprocess.html#popen-constructor for further details); alternatively you can pass a single concatenated string, if possible (for example, subprocess.Popen(sys.executable + " " + code)). Also, I added a missing parenthesis.
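For reference, here is a minimal sketch of the counter test described above (the counter.txt filename and the read/parse details are my own assumptions, not from the original test):
import os
import sys
import subprocess

COUNTER_FILE = "counter.txt"  # hypothetical name for the restart counter

# Read the previous count (0 on the very first run) and record this run.
count = 0
if os.path.exists(COUNTER_FILE):
    with open(COUNTER_FILE) as f:
        count = int(f.read() or 0)
with open(COUNTER_FILE, "w") as f:
    f.write(str(count + 1))

# Spawn a fresh copy of this script, then kill the current process
# immediately so all of its resources are freed.
subprocess.Popen([sys.executable, *sys.argv])
os._exit(1)
Each replacement process bumps the counter, so after interrupting the chain with Ctrl+C the file holds the total number of restarts.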

Related

External command fails with return code 0xC0000005 when called from Python but works in console

I have a Python 3.5 script running under Windows that is calling an external program (tblastn from the BLAST+ suite, to be precise) on a number of files. With most of these files it runs fine, but on some it fails with return code 0xC0000005. If I take the exact same command-line call and run it from the console in the same current working directory, it executes fine.
I am currently running the command with subprocess.Popen, like this:
childProcess = subprocess.Popen(blast_cmd, stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                                universal_newlines=True, shell=True)
and then calling childProcess.poll() until it completes. I am multi-threading this by running four processes simultaneously, but it still happens if I force it to run one at a time. The same thing happens with os.system, subprocess.run(), subprocess.call() and subprocess.check_call(), and it happens whether I set shell to True or False.
The files it fails on are the same each time I run the code, but the same file will work if put into a different list of files to process. Changing the calling method sometimes changes which files fail, so using os.system can cause different files to fail compared to subprocess.Popen. Thus it doesn't appear to come down to which file I am invoking tblastn on.
Does anyone have any idea what might be causing this behaviour?
Or if anyone knows what could be different between running in created process (the documentation says it uses CreateProcess()) as compared to running from the command line then at least I'd have somewhere to start?
The error code is most likely "Access Denied" (there are four matching code constants in the Windows header files, and Access Denied is the most likely):
# for hex 0xC0000005 / decimal -1073741819:
FILE_LOG_INFORMATION_FAILED (iasmsg.h)
# Information for the %1 log could not be logged to the text
# file %2 in the path %3. Error code: %0
STATUS_ACCESS_VIOLATION (ntstatus.h)
# The instruction at 0x%08lx referenced memory at 0x%08lx.
# The memory could not be %s.
USBD_STATUS_DEV_NOT_RESPONDING (usb.h)
# as an HRESULT: Severity: FAILURE (1), FACILITY_NULL (0x0), Code 0x5
# for hex 0x5 / decimal 5:
ERROR_ACCESS_DENIED (winerror.h)
# Access is denied.
I would start by looking at the user privileges/credentials used to run the original (launching/parent) script, from which the child process/subprocess inherits its credentials, and then compare these to the credentials used when you run the command at the cmd prompt, as you described.
HTH,
Edwin.
Subprocesses that are launched programmatically often get different memory settings (heap size, etc.) than interactively launched processes, so try putting some heap/memory-checking wrapper around tblastn.exe.
Your description of "failing on one file but processing it in another file list works" shows that the error is not related to the failing call itself but to some condition caused by the prior activity.
Output is buffered in memory, so if tblastn produces a lot of output, use communicate() to consume (or discard) the output of the subprocesses.
shell=True is not needed for calling executables; it is meant for executing shell built-ins. Using it, you needlessly wrap tblastn in cmd.exe.
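A minimal sketch of both suggestions combined (the tblastn arguments here are placeholders, not the OP's real command line):
import subprocess

# Pass the argument list directly: no shell=True, so no cmd.exe wrapper.
blast_cmd = ["tblastn", "-query", "input.fasta", "-db", "somedb"]  # hypothetical args

childProcess = subprocess.Popen(blast_cmd,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)

# communicate() drains both pipes, so a chatty tblastn cannot fill a
# pipe buffer and stall; it also waits for the child to finish.
stdout, stderr = childProcess.communicate()
print(childProcess.returncode, len(stdout), len(stderr))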

why doesn't my process join and run?

I have a simple problem to solve (more or less).
If I watch Python multiprocessing tutorials, I see that a process should be started more or less like this:
from multiprocessing import *

def u(m):
    print(m)
    return

A = Process(target=u, args=(0,))
A.start()
A.join()
It should print a 0, but nothing gets printed; instead it hangs forever at the A.join().
If I manually start the function u by doing this:
A.run()
it actually prints 0 in the shell, but it doesn't run concurrently;
for example, the output of the following code:
from multiprocessing import *
from time import sleep

def u(m):
    sleep(1)
    print(m)
    return

A = Process(target=u, args=(1,))
A.start()
print(0)
should be
0
1
but actually is
0
and if I add
A.run()
before the last line, then the output becomes
1
0
This seems confusing to me...
And if I try to join the process, it waits forever.
However, if it can help in giving me an answer:
my OS is Mac OS X 10.6.8,
the Python versions used are 3.1 and 3.3,
and my computer has one Intel Core i3 processor.
--Update--
I have noticed that this strange behaviour is present only when launching the program from IDLE; if I run the program from the terminal everything works as it is supposed to, so this problem must be connected to some IDLE bug.
But running programs from the terminal is even weirder: using something like range(100000000) eats all my computer's RAM until the end of the program; if I remember correctly this shouldn't happen in Python 3, only in older Python versions.
I hope this new information will help you give an answer.
--Update 2--
the bug occurs even if I don't produce any output from my process: setting this
def u():
    return
as the target of the process and then starting it, if I try to join the process, IDLE waits forever.
As suggested here and here, the problem is that IDLE overrides sys.stdin and sys.stdout in some weird ways, which do not propagate cleanly to processes you spawn from it (they are not real filehandles).
The first link also indicates it's unlikely to be fixed any time soon ("may be a 'cannot fix' issue", they say).
So unfortunately the only solution I can suggest is not to use IDLE for this script...
Have you tried adding A.join() to your program? I am guessing that your main process is exiting before the child process prints which is causing the output to be hidden. If you tell the main process to wait for the child process (A.join()), I bet you'll see the output you expect.
Given that it only happens with IDLE, I suspect the problem has to do with the stdout used by both processes. Perhaps it's some file-like object that's not safe to use from two different processes.
If you don't have the child process write to stdout, I suspect it will complete and join properly. For example, you could have it write to a file, instead. Or you could set up a pipe between the parent and child.
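For example, a minimal sketch of the pipe approach, assuming the parent just wants the child's text back:
from multiprocessing import Process, Pipe

def u(m, conn):
    # Send the result through the pipe instead of printing to IDLE's
    # pseudo-stdout, then close this end.
    conn.send(str(m))
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    A = Process(target=u, args=(1, child_conn))
    A.start()
    print(parent_conn.recv())  # the parent does the actual printing
    A.join()  # should join cleanly, since the child never touches stdout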
Have you tried unbuffered output? Try importing the sys module and changing the print call to write to stderr:
print(m, file=sys.stderr)
How does this affect the behavior? I'm with the others who suspect that IDLE is mucking with the stdio . . .

why would jsmin quit early when run in a python subprocess?

I have downloaded and compiled jsmin.c, and run it on a javascript file from the terminal, and it seems to work great. But when I call it from an os.system() or os.popen() call in a python script (on the same input file), the output file (the minified version of the js file) gets truncated, as though stdout were not getting flushed before the subprocess exits, or as if jsmin were terminating early, or as if maybe something were going on with disk buffering or something.
But none of those things seem to be the case. The exit value is whatever I return from main() even when the output is getting truncated, and adding a call to fflush(stdout) doesn't make any difference, and calling sync from within the same subshell after calling jsmin doesn't make any difference.
I tried replacing calls to putc() with calls to fputc(), and at first it seemed like that fixed the problem (for some unfathomable reason?), but then, inexplicably, the problem started happening again, and now happens reliably. Bizarre?
I would say it's some problem with jsmin.c, but the program works fine on the same input file when run from the command line, so it has something to do with running it from a python subprocess.
Here's my subprocess call:
result = os.system('jsmin < ' + tpathname + ' > ' + tpathname + '.min')
(I have put jsmin in the path, and it is running, I get most of the expected results in the .min file.)
Can anyone imagine what might be causing this problem?
Stack Overflow wouldn't let me answer my own question for 5 hours or something like that, so this 'answer' was originally added as an edit. (It also wouldn't let me chat, so the comments stretched on for a bit.)
I found the problem. The problem was with the Python program that was creating the input file for jsmin and then calling jsmin on it. It was creating a file, failing to close it (yet), and then calling jsmin on it. So jsmin was not terminating early, nor was its output being truncated; rather, it was operating on an (as yet) incomplete input file. (Duh.)
I would have realized this a lot earlier than I did, except that the python program was eventually closing jsmin's input file (by exiting), so by the time I was examining it, it appeared complete. It's just that it was not complete by the time jsmin was processing it.
This very thing is one of the motivations for the 'with' idiom:
with open(targetpath, 'w+') as targetfile:
    # ...code that writes to targetfile
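A minimal sketch of the corrected flow, assuming the same shell-redirection call as above (the tpathname value and the generated content are placeholders):
import os

tpathname = 'script.js'      # hypothetical input path
generated_js = 'var x = 1;'  # placeholder for the real generated code

# The with-block guarantees the file is flushed and closed before jsmin
# ever sees it, so jsmin operates on the complete input.
with open(tpathname, 'w') as targetfile:
    targetfile.write(generated_js)

result = os.system('jsmin < ' + tpathname + ' > ' + tpathname + '.min')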

How to use os.spawnv to send email copy using Python?

First let me say that I know it's better to use the subprocess module, but I'm editing other people's code and I'm trying to make as few changes as possible, which includes avoiding importing any new modules. So I'd like to stick to the currently-imported modules (os, sys, and paths) if at all possible.
The code is currently (in a file called postfix-to-mailman.py that some of you may be familiar with):
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail", 'first@place.com'))
    sys.exit(0)
This works fine (though I think sys.exit(0) might never be called and thus be unnecessary).
I believe this replaces the current process with a call to /usr/sbin/sendmail, passing it the arguments /usr/sbin/sendmail (for argv[0], i.e. itself) and 'someaddress@someplace.com', then passes the environment of the current process - including the email message on sys.stdin - to the new process.
What I'd like to do is essentially send another copy of the message before doing this. I can't use execv again because then execution will stop. So I've tried the following:
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    os.spawnv(os.P_WAIT, "/usr/sbin/sendmail", ("/usr/sbin/sendmail", 'other@place.com'))
    os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail", 'first@place.com'))
    sys.exit(0)
However, while it sends the message to other@place.com, it never sends it to first@place.com.
This surprised me because I thought using spawn would start a child process and then continue execution in the current process when it returns (or without waiting, if P_NOWAIT is used).
Incidentally, I tried os.P_NOWAIT first, but the message I got at other@place.com was empty, so at least when I used P_WAIT the message came through intact. But it still never got sent to first@place.com, which is a problem.
I'd rather not use os.system if I can avoid it because I'd rather not go out to a shell environment if it can be avoided (security issues, possible performance? I admit I'm being paranoid here, but if I can avoid os.system I'd still like to).
The only thing I can think of is that the call to os.spawnv is somehow consuming/emptying the contents of sys.stdin, but that doesn't really make sense either. Ideas?
While it might not make sense, that does appear to be the case:
import os
os.spawnv(os.P_WAIT, "/usr/bin/wc", ("/usr/bin/wc",))
os.execv("/usr/bin/wc", ("/usr/bin/wc",))
$ cat j.py | python j.py
      4       6     106
      0       0       0
In which case you might do something like this:
import os
import sys

buf = sys.stdin.read()
wc = os.popen("/usr/sbin/sendmail other@place.com", "w")
wc.write(buf)
wc.close()
wc = os.popen("/usr/sbin/sendmail first@place.com", "w")
wc.write(buf)
wc.close()
sys.exit(0)
sys.stdin is a pipe, and pipes aren't seekable, so you can never rewind that file-like object to read its contents again. To actually invoke sendmail(1) twice, you need to save the contents of stdin, preferably in a temporary file, but if the data is guaranteed to have a limited size you could save it in memory instead.
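A minimal sketch of the temporary-file variant, assuming sendmail reads the message from stdin (the tempfile usage is my own illustration, and note that it does import a new module, which the OP wanted to avoid):
import os
import sys
import tempfile

# Save stdin once; the pipe itself cannot be rewound.
tmp = tempfile.NamedTemporaryFile()
tmp.write(sys.stdin.read())
tmp.flush()

for rcpt in ('other@place.com', 'first@place.com'):
    tmp.seek(0)  # rewind the temp file for each sendmail invocation
    mail = os.popen('/usr/sbin/sendmail ' + rcpt, 'w')
    mail.write(tmp.read())
    mail.close()

sys.exit(0)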
But why go through the trouble? Do you specifically need the email copy to be a separately queued email (and if so, why)? Just add the wanted recipient in your original invocation of sendmail(1). The additional recipient will not be seen in the email headers.
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail",
                                    'first@place.com',
                                    'otheruser@example.com'))
    sys.exit(0)
Oh, and the sys.exit(0) line will be executed if os.execv() for some reason fails. This will happen if /usr/sbin/sendmail cannot be executed, e.g. if the executable file doesn't exist or isn't actually executable. In other words, this is an error condition that you should take care of.
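For instance, a sketch of handling that failure explicitly (the non-zero exit code and the error message are my own choices, using Python 2 syntax to match the era of this code):
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    try:
        os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail",
                                        'first@place.com',
                                        'otheruser@example.com'))
    except OSError, e:
        # execv only returns if it fails, e.g. a missing or
        # non-executable /usr/sbin/sendmail
        sys.stderr.write('could not exec sendmail: %s\n' % e)
        sys.exit(1)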

Python subprocess.Popen erroring with OSError: [Errno 12] Cannot allocate memory after period of time

Note: This question has been re-asked with a summary of all debugging attempts here.
I have a Python script that is running as a background process executing every 60 seconds. Part of that is a call to subprocess.Popen to get the output of ps.
ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]
After running for a few days, the call is erroring with:
File "/home/admin/sd-agent/checks.py", line 436, in getProcesses
File "/usr/lib/python2.4/subprocess.py", line 533, in __init__
File "/usr/lib/python2.4/subprocess.py", line 835, in _get_handles
OSError: [Errno 12] Cannot allocate memory
However the output of free on the server is:
$ free -m
             total       used       free     shared    buffers     cached
Mem:           894        345        549          0          0          0
-/+ buffers/cache:         345        549
Swap:            0          0          0
I have searched around for the problem and found this article which says:
Solution is to add more swap space to your server. When the kernel is forking to start the modeler or discovery process, it first ensures there's enough space available on the swap to store the new process if needed.
I note that there is no available swap from the free output above. Is this likely to be the problem and/or what other solutions might there be?
Update 13th Aug 09: The code above is called every 60 seconds as part of a series of monitoring functions. The process is daemonized and the check is scheduled using sched. The specific code for the above function is:
def getProcesses(self):
    self.checksLogger.debug('getProcesses: start')

    # Memory logging (case 27152)
    if self.agentConfig['debugMode'] and sys.platform == 'linux2':
        mem = subprocess.Popen(['free', '-m'], stdout=subprocess.PIPE).communicate()[0]
        self.checksLogger.debug('getProcesses: memory before Popen - ' + str(mem))

    # Get output from ps
    try:
        self.checksLogger.debug('getProcesses: attempting Popen')
        ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]
    except Exception, e:
        import traceback
        self.checksLogger.error('getProcesses: exception = ' + traceback.format_exc())
        return False

    self.checksLogger.debug('getProcesses: Popen success, parsing')

    # Memory logging (case 27152)
    if self.agentConfig['debugMode'] and sys.platform == 'linux2':
        mem = subprocess.Popen(['free', '-m'], stdout=subprocess.PIPE).communicate()[0]
        self.checksLogger.debug('getProcesses: memory after Popen - ' + str(mem))

    # Split out each process
    processLines = ps.split('\n')
    del processLines[0]  # Removes the headers
    processLines.pop()   # Removes a trailing empty line

    processes = []
    self.checksLogger.debug('getProcesses: Popen success, parsing, looping')

    for line in processLines:
        line = line.split(None, 10)
        processes.append(line)

    self.checksLogger.debug('getProcesses: completed, returning')
    return processes
This is part of a bigger class called checks which is initialised once when the daemon is started.
The entire checks class can be found at http://github.com/dmytton/sd-agent/blob/82f5ff9203e54d2adeee8cfed704d09e3f00e8eb/checks.py with the getProcesses function defined from line 442. This is called by doChecks() starting at line 520.
You've perhaps got a memory leak bounded by some resource limit (RLIMIT_DATA, RLIMIT_AS?) inherited by your Python script. Check your ulimit(1) settings before you run your script, and profile the script's memory usage, as others have suggested.
What do you do with the variable ps after the code snippet you show us? Do you keep a reference to it, never to be freed? Quoting the subprocess module docs:
Note: The data read is buffered in memory, so do not use this
method if the data size is large or unlimited.
... and ps aux can be verbose on a busy system...
Update
You can check rlimits from within your Python script using the resource module:
import resource
print resource.getrlimit(resource.RLIMIT_DATA) # => (soft_lim, hard_lim)
print resource.getrlimit(resource.RLIMIT_AS)
If these return "unlimited" -- (-1, -1) -- then my hypothesis is incorrect and you may move on!
See also resource.getrusage, esp. the ru_??rss fields, which can help you instrument memory consumption from within the Python script, without shelling out to an external program.
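A small sketch of that in-process instrumentation (Python 2 syntax to match the question; the print format is my own):
import resource

usage = resource.getrusage(resource.RUSAGE_SELF)
# ru_maxrss is the peak resident set size (kilobytes on Linux)
print 'peak RSS: %s kB' % usage.ru_maxrss
print 'RLIMIT_DATA (soft, hard):', resource.getrlimit(resource.RLIMIT_DATA)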
When you use Popen you need to pass close_fds=True if you want it to close extra file descriptors.
Creating a new pipe, which occurs in the _get_handles function from the backtrace, creates 2 file descriptors, but your current code never closes them and you're eventually hitting your system's max fd limit.
I'm not sure why the error you're getting indicates an out-of-memory condition: it should be a file descriptor error, as the return value of pipe() has an error code for this problem.
That swap space answer is bogus. Historically Unix systems wanted swap space available like that, but they don't work that way anymore (and Linux never worked that way). You're not even close to running out of memory, so that's not likely the actual problem - you're running out of some other limited resource.
Given where the error is occurring (_get_handles calls os.pipe() to create pipes to the child), the only real problem you could be running into is not enough free file descriptors. I would instead look for unclosed files (lsof -p on the PID of the process doing the popen). If your program really needs to keep a lot of files open at one time, then increase the user limit and/or the system limit for open file descriptors.
If you're running a background process, chances are that you've redirected your process's stdin/stdout/stderr.
In that case, add close_fds=True to your Popen call, which will prevent the child process from inheriting your redirected output. This may be the limit you're bumping into.
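Applied to the OP's call, that suggestion looks roughly like this (a sketch; on Python 2.4, close_fds defaults to False, so it must be passed explicitly):
import subprocess

# close_fds=True keeps the child from inheriting every descriptor the
# daemonized parent has open (redirected stdio, leftover pipes, ...).
ps = subprocess.Popen(['ps', 'aux'],
                      stdout=subprocess.PIPE,
                      close_fds=True).communicate()[0]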
You might want to actually wait for all of those ps processes to finish before adding swap space.
It's not at all clear what "running as a background process executing every 60 seconds" means.
But your call to subprocess.Popen is forking a new process each time.
Update.
I'd guess that you're somehow leaving all those processes running or hung in a zombie state. However, the communicate method should clean up the spawned subprocesses.
Have you watched your process over time?
lsof
ps -aux | grep -i pname
top
All should give interesting information. I am thinking that the process is tying up resources that should be freed up. Is there a chance that it is tying up resource handles (memory blocks, streams, file handles, thread or process handles)? stdin, stdout and stderr from the spawned ps; memory handles from many small incremental allocations. I would be very interested in seeing what the above commands display for your process when it has just finished launching and running for the first time, and after 24 hours of "sitting" there launching the sub-process regularly.
Since it dies after a few days, you could have it run for only a few loops and then restart it once a day as a workaround. That would help you in the meantime.
Jacob
You need to
import os
import subprocess

ps = subprocess.Popen(["sleep", "1000"])
os.waitpid(ps.pid, 0)
to free resources.
Note: this does not work on Windows.
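Equivalently, and portably (including Windows), you can let the Popen object reap its own child; a sketch:
import subprocess

ps = subprocess.Popen(["sleep", "1000"])
ps.wait()  # waits for the child and releases its process-table entry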
I don't think that the circumstances given in the Zenoss article you linked to are the only cause of this message, so it's not clear yet that swap space is definitely the problem. I would advise logging some more information even around successful calls, so that you can see the state of free memory every time just before you do the ps call.
One more thing - if you specify shell=True in the Popen call, do you see different behaviour?
Update: If not memory, the next possible culprit is indeed file handles. I would advise running the failing command under strace to see exactly which system calls are failing.
Virtual memory matters!!!
I encountered the same issue before I added swap to my OS. The formula for available virtual memory is usually something like: SwapSize + 50% * PhysicalMemorySize. I finally got this resolved by either adding more physical memory or adding a swap disk. close_fds=True didn't work in my case.
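As a worked example of that formula (the 894 MB figure comes from the free output above; the 50% ratio is this poster's rule of thumb, not a universal kernel constant):
available virtual memory ~= SwapSize + 0.5 * PhysicalMemorySize
                         ~= 0 MB + 0.5 * 894 MB
                         ~= 447 MB
So with zero swap, forking a process whose committed address space exceeds roughly 447 MB could fail with ENOMEM even while free still shows physical memory available.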
