Why would jsmin quit early when run in a Python subprocess?

I have downloaded and compiled jsmin.c and run it on a JavaScript file from the terminal, and it seems to work great. But when I call it from an os.system() or os.popen() call in a Python script (on the same input file), the output file (the minified version of the JS file) gets truncated, as though stdout were not getting flushed before the subprocess exits, as if jsmin were terminating early, or as if something were going on with disk buffering.
But none of those things seem to be the case. The exit value is whatever I return from main() even when the output is getting truncated, adding a call to fflush(stdout) makes no difference, and neither does calling sync from within the same subshell after calling jsmin.
I tried replacing calls to putc() with calls to fputc(), and at first it seemed like that fixed the problem (for some unfathomable reason?), but then, inexplicably, the problem started happening again, and now happens reliably. Bizarre.
I would say it's some problem with jsmin.c, but the program works fine on the same input file when run from the command line, so it must have something to do with running it from a Python subprocess.
Here's my subprocess call:
result = os.system('jsmin < ' + tpathname + ' > ' + tpathname + '.min')
(I have put jsmin on the path, and it is running; I get most of the expected results in the .min file.)
Can anyone imagine what might be causing this problem?

Stack Overflow wouldn't let me answer my own question for 5 hours or something like that, so this 'answer' was originally added as an edit. (It also wouldn't let me chat, so the comments stretched on for a bit.)
I found the problem. It was with the Python program that was creating the input file for jsmin and then calling jsmin on it: it created the file, failed to close it (yet), and then called jsmin on it. So jsmin was not terminating early, nor was its output being truncated; rather, it was operating on an (as yet) incomplete input file. (Duh.)
I would have realized this a lot earlier than I did, except that the Python program was eventually closing jsmin's input file (by exiting), so by the time I examined it, it appeared complete. It just was not complete by the time jsmin was processing it.
This very thing is one of the motivations for the 'with' idiom:
with open(targetpath, 'w+') as targetfile:
    ...  # code that writes to targetfile
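For completeness, here is a minimal sketch of both the pattern that bit me and the fix (the file names are made up, and jsmin is assumed to be on the PATH):
import os

js_source = "var x = 1;"  # stand-in for the real JavaScript

# Buggy pattern: the file is still open, so part of it may be sitting
# in Python's write buffer when jsmin reads it.
f = open('script.js', 'w')
f.write(js_source)
os.system('jsmin < script.js > script.js.min')  # may see a truncated file
f.close()

# Fixed pattern: the 'with' block closes (and flushes) the file before
# jsmin ever runs.
with open('script.js', 'w') as f:
    f.write(js_source)
os.system('jsmin < script.js > script.js.min')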

Related

How to end a Python subprocess with no return?

I'm working on a BCP wrapper method in Python, but have run into an issue invoking the command with subprocess.
As far as I can tell, the BCP command doesn't return any value or indication that it has completed outside of what it prints to the terminal window, which causes subprocess.call or subprocess.run to hang while they wait for a return.
subprocess.Popen allows a manual .terminate() method, but I'm having issues getting the table to write afterwards.
The bcp command works from the command line with no issues: it loads data from a source CSV according to a .fmt file and writes an error log file. My script is able to dismount the file from the log path, so I would consider the command itself irrelevant; the question is really about the behavior of the subprocess module.
This is what I'm trying at the moment:
process = subprocess.Popen(bcp_command)
try:
    path = Path(log_path)
    sleep_counter = 0
    while not path.is_file() and sleep_counter < 16:
        sleep(1)
        sleep_counter += 1
finally:
    process.terminate()
self.datacommand = datacommand
My idea was to check that the error log file had been written by the bcp command as a way to tell that the process had finished. With this change my script no longer freezes, and the files are apparently being written and dismounted successfully later on in the script. The script also terminates in less than the 15 seconds the sleep loop would otherwise take to end it.
When the process froze my Spyder shell (and IDLE too, so it's not the IDE), I could force-terminate it by closing the console itself, and at least it would then write to the server.
However, it seems that with .terminate() the command isn't actually writing anything to the server.
I also checked whether a dumb 15-second time-out (the BCP takes about 2 seconds with this data) would work, in case it was writing an error log before the load finished.
That still resulted in an empty table on the SQL server.
How can I get subprocess to execute a command without hanging?
Well, it seems to be a more general issue about calling helper functions with Popen, as seen here:
https://github.com/dropbox/pyannotate/issues/67
I was able to fix the hanging issue by changing it to:
subprocess.Popen(bcp_command, close_fds=True)
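For what it's worth, here is a sketch of how the pieces might fit together (bcp_command as defined in the question). Popen.wait(timeout=...) bounds the wait without an unconditional terminate(), which is what seemed to cut the write short:
import subprocess

process = subprocess.Popen(bcp_command, close_fds=True)
try:
    # Give bcp a bounded amount of time to finish on its own.
    process.wait(timeout=60)
except subprocess.TimeoutExpired:
    # Only kill the process if it genuinely hangs; terminating it too
    # early is what left the SQL Server table empty.
    process.terminate()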

OSError: [Errno 12] Not enough space on os.execl call

So I was messing around with a script that is supposed to restart itself using os.execl.
It works a few times, but after roughly 30 or 40 calls it crashes:
Traceback (most recent call last):
File "C:\Users\Admin#\PycharmProjects\DiscordBot_Fred_the_Financier\test_suit.py", line 9, in <module>
os.execl(sys.executable, sys.executable, *(*sys.argv, code))
File "C:\Users\Admin#\AppData\Local\Programs\Python\Python37\lib\os.py", line 540, in execl
execv(file, args)
OSError: [Errno 12] Not enough space
This is actually the whole code I'm running:
import sys
import os
print(sys.argv) # print args
code = "" # placeholder for mutable args
os.execl(sys.executable, sys.executable, *(*sys.argv, code)) # passing new args and replacing process
I have literally no idea why or how this error occurs. All my drives have >200 GB of free storage, and there is more than 17 GB of free RAM as well.
I'm running this code via the terminal on Win10 64-bit, Python 3.7.
Thank you very much for your help!
P.S. I apologize if there is already an answer to this problem but I could not find one.
Are you opening any huge files in your script? Most likely you are not closing those file handles and they keep accumulating. Once the script crashes, all the handles are released, which is why you then see the 200 GB free.
While running the script, can you keep an eye on disk usage? Do you see it rising continuously? (at least after the subsequent calls of exec)
I haven't found why this error happens, but I am assuming it has something to do with the parent process not being able to fully close (maybe due to lingering references or something similar). For what it's worth, errno 12 is ENOMEM, so despite the wording it refers to memory rather than disk space. I did find a workaround, which I am leaving here in case it helps you or someone else with the same problem.
The workaround is to use subprocess.Popen instead of os.execl and to call os._exit(1) immediately afterwards. The latter shuts the parent process down at once and thus frees all of its resources, so you won't get the 'Not enough space' error no matter how many times you restart the process. To verify this, I ran these two lines in an infinite loop (in retrospect the loop was unnecessary, since os._exit(1) makes a second iteration impossible), writing to a .txt file on each process replacement, and left it running for a while. When I eventually interrupted the final process with Ctrl+C, the file contained the value 15127, meaning the process had been replaced 15127 times without ever raising a space-related exception.
Additionally, as with os.execl, you can pass command-line arguments by passing sys.argv, for example:
import os
import sys
import subprocess
print(sys.argv)
code = ""
subprocess.Popen([sys.executable, *(*sys.argv, code)])
os._exit(1)
I omitted the second sys.executable because I am not sure what you are trying to do and whether it would work with subprocess.Popen, but given the similarity of these two commands, I'd say it would work the same way if you typed subprocess.Popen(sys.executable, sys.executable, *(*sys.argv, code)) instead.
EDIT: Actually, with subprocess.Popen you need to pass all arguments as a list, as otherwise the second argument you pass will be interpreted as bufsize according to the method signature (see https://docs.python.org/3/library/subprocess.html#popen-constructor for further details); alternatively you can pass a single concatenated string, e.g. subprocess.Popen(sys.executable + " " + code). Also, I added a missing parenthesis above.
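For reference, a minimal sketch of the counter test described above (the file name restart_count.txt is made up; interrupt with Ctrl+C to stop the restart chain):
import os
import subprocess
import sys

COUNTER = "restart_count.txt"

# Read the current restart count, defaulting to 0 on the first run.
try:
    with open(COUNTER) as f:
        count = int(f.read())
except FileNotFoundError:
    count = 0

with open(COUNTER, "w") as f:
    f.write(str(count + 1))

# Replace this process: launch a fresh copy, then exit the parent
# immediately so all of its resources are freed.
subprocess.Popen([sys.executable, *sys.argv])
os._exit(1)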

Python's check_output method doesn't return output sometimes

I have a Python script which is supposed to run a large number of other scripts, each located within a subdirectory of the script's working directory. Each of these other scripts is supposed to connect to a game client and run an AI for that game. To make this run, I had to run each script over two separate threads (one for each player). The problem I'm having is that sometimes the scripts' output isn't captured. My run-code looks like this:
def run(command, name, count):
    chdir(name)
    output = check_output(" ".join(command), stderr=STDOUT, shell=True).split('\r')
    chdir('..')
    with open("results_" + str(count) + ".txt", "w") as f:
        for line in output:
            f.write(line)
The strange part is that it does manage to capture longer streams, but the short ones go unnoticed. How can I change my code to fix this problem?
UPDATE: I don't think it's a buffering issue because check_output("ls ..", shell = True).split('\n')[:-1] returns the expected result and that command should take much less time than the scripts I'm trying to run.
UPDATE 2: I have discovered that the output is being cut off for the longer runs. It turns out that the end of the output is being missed for every process I run, for some reason. This also explains why the shorter runs don't produce any output at all.
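One detail that may matter here, since run() is called from two threads: os.chdir changes the working directory of the whole process, so concurrent calls can race and a command can end up running in the wrong directory. Whether or not that is the cause of the truncation, subprocess can run a command in a given directory without chdir via the cwd parameter; a sketch of that variant:
from subprocess import check_output, STDOUT

def run(command, name, count):
    # cwd= scopes the working directory to this one subprocess, so
    # threads cannot race on the process-wide current directory.
    output = check_output(" ".join(command), stderr=STDOUT,
                          shell=True, cwd=name).split('\r')
    with open("results_" + str(count) + ".txt", "w") as f:
        for line in output:
            f.write(line)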

How to use os.spawnv to send email copy using Python?

First let me say that I know it's better to use the subprocess module, but I'm editing other people's code and I'm trying to make as few changes as possible, which includes avoiding importing any new modules. So I'd like to stick to the currently imported modules (os, sys, and paths) if at all possible.
The code is currently (in a file called postfix-to-mailman.py that some of you may be familiar with):
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail", 'first#place.com'))
    sys.exit(0)
This works fine (though I think sys.exit(0) might never be called and is thus unnecessary).
I believe this replaces the current process with a call to /usr/sbin/sendmail, passing it the arguments /usr/sbin/sendmail (for argv[0], i.e. itself) and 'someaddress#someplace.com', and passing the environment of the current process - including the email message on sys.stdin - to the new process.
What I'd like to do is essentially send another copy of the message before doing this. I can't use execv again because then execution will stop. So I've tried the following:
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    os.spawnv(os.P_WAIT, "/usr/sbin/sendmail", ("/usr/sbin/sendmail", 'other#place.com'))
    os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail", 'first#place.com'))
    sys.exit(0)
However, while it sends the message to other#place.com, it never sends it to first#place.com.
This surprised me because I thought using spawn would start a child process and then continue execution in the current process when it returns (or without waiting, if P_NOWAIT is used).
Incidentally, I tried os.P_NOWAIT first, but the message I got at other#place.com was empty, so at least when I used P_WAIT the message came through intact. But it still never got sent to first#place.com, which is a problem.
I'd rather not use os.system if I can avoid it because I'd rather not go out to a shell environment if it can be avoided (security issues, possible performance? I admit I'm being paranoid here, but if I can avoid os.system I'd still like to).
The only thing I can think of is that the call to os.spawnv is somehow consuming/emptying the contents of sys.stdin, but that doesn't really make sense either. Ideas?
While it might not make sense, that does appear to be the case:
import os
os.spawnv(os.P_WAIT,"/usr/bin/wc", ("/usr/bin/wc",))
os.execv("/usr/bin/wc", ("/usr/bin/wc",))
$ cat j.py | python j.py
4 6 106
0 0 0
In which case you might do something like this:
import os
import sys
buf = sys.stdin.read()
wc = os.popen("/usr/sbin/sendmail other#place.com", "w")
wc.write(buf)
wc.close()
wc = os.popen("/usr/sbin/sendmail first#place.com", "w")
wc.write(buf)
wc.close()
sys.exit(0)
sys.stdin is a pipe, and pipes aren't seekable, so you can never rewind that file-like object to read its contents again. To actually invoke sendmail(1) twice, you need to save the contents of stdin, preferably in a temporary file, but if the data is guaranteed to have a limited size you could save it in memory instead.
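A sketch of the temporary-file variant (it does need one extra import, tempfile, beyond the modules the question wants to stick to; the addresses are the question's placeholders):
import os
import sys
import tempfile

# Spool stdin to a temporary file so the message can be replayed once
# per recipient; the stdin pipe itself cannot be rewound.
with tempfile.TemporaryFile(mode='w+') as spool:
    spool.write(sys.stdin.read())
    for recipient in ('other#place.com', 'first#place.com'):
        spool.seek(0)
        mail = os.popen("/usr/sbin/sendmail " + recipient, "w")
        mail.write(spool.read())
        mail.close()
sys.exit(0)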
But why go through the trouble? Do you specifically need the email copy to be a separately queued email (and if so, why)? Just add the wanted recipient in your original invocation of sendmail(1). The additional recipient will not be seen in the email headers.
if local in ('postmaster', 'abuse', 'mailer-daemon'):
    os.execv("/usr/sbin/sendmail", ("/usr/sbin/sendmail",
                                    'first#place.com',
                                    'otheruser#example.com'))
    sys.exit(0)
Oh, and the sys.exit(0) line will be executed only if os.execv() fails for some reason. That happens if /usr/sbin/sendmail cannot be executed, e.g. if the executable file doesn't exist or isn't actually executable. In other words, this is an error condition that you should take care of.

Asynchronously read stdout from subprocess.Popen

I am running a sub-program using subprocess.Popen. When I start my Python program from the command window (cmd.exe), the program writes some info and dates in the window as it runs.
When I run my Python code not in a command window, it opens a new command window for this sub-program's output, and I want to avoid that. When I used the following code, it doesn't show the cmd window, but it also doesn't print the status:
p = subprocess.Popen("c:/flow/flow.exe", shell=True, stdout=subprocess.PIPE)
print p.stdout.read()
How can I show the sub-program's output in my program's output as it occurs?
Use this:
cmd = subprocess.Popen(["c:/flow/flow.exe"], stdout=subprocess.PIPE)
for line in cmd.stdout:
    print line.rstrip("\n")
cmd.wait()  # you may already be handling this in your current code
Note that you will still have to wait for the sub-program to flush its stdout buffer (which is commonly buffered differently when not writing to a terminal window), so you may not see each line instantaneously as the sub-program prints it (this depends on various OS details and details of the sub-program).
Also notice how I've removed the shell=True and replaced the string argument with a list, which is generally recommended.
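For reference, the answer's snippet is Python 2; under Python 3 the same pattern would look like this (text=True, available in Python 3.7+, makes cmd.stdout yield str lines instead of bytes):
import subprocess

cmd = subprocess.Popen(["c:/flow/flow.exe"], stdout=subprocess.PIPE, text=True)
for line in cmd.stdout:
    print(line.rstrip("\n"))
cmd.wait()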
Looking for a recipe to process Popen data asynchronously, I stumbled upon http://code.activestate.com/recipes/576759-subprocess-with-async-io-pipes-class/
This looks quite promising; however, I got the impression that there might be some typos in it. I have not tried it yet.
It is an old post, but this is a common problem with a hard-to-find solution. Try this: http://code.activestate.com/recipes/440554-module-to-allow-asynchronous-subprocess-use-on-win/
