Add a subprocess to another one - python

I would like to add a new subprocess to an existing subprocess.
I do not want both processes to run in parallel; I just want to run the new subprocess (qpdf, which is supposed to linearize the output PDF from pdfjam) after the existing one (pdfjam, which joins several PDFs together); see the code below. However, the newly added subprocess does not seem to be executed, only the existing one.
How can I fix this?
This is the relevant part of the code:
tmp_pdf_file = tmpdir + "/" + basename + ".pdf"
if os.path.exists(tmp_pdf_file):
    subprocess.call(["pdfjam", "--keepinfo", "--noautoscale", "true", "--frame", "true", tmp_pdf_file, "-o", target_file_a4], cwd=tmpdir)  # existing subprocess
    self.progress.step()
    subprocess.call(["qpdf", "--linearize", target_file_a4, target_file_a4], cwd=tmpdir)  # new subprocess
    self.progress.step()
    shutil.move(tmp_pdf_file, target_file)
else:
    print("No pdf file generated!")
Edit: I think the second subprocess has difficulty finding target_file_a4, even though there is no error message.
Edit2: The code runs through without any error message (I am not sure I can implement the suggested communicate() call, given my limited Python skills). I check whether the output PDF is linearized with pdfinfo foobar.pdf, which gives me Optimized: no (that is, linearized: no; see the man page).
Edit3: The output is: Extract init b"----\n pdfjam: This is pdfjam version 2.08.\n pdfjam: Reading any site-wide or user-specific defaults...\n (none found)\n pdfjam: Effective call for this run of pdfjam:\n /usr/bin/pdfjam --keepinfo --noautoscale 'true' --frame 'true' --outfile /path/to/foobar.pdf -- /tmp/extract-5j70hk9o/foobar.pdf - \n pdfjam: Calling pdfinfo...\n pdfjam: Calling pdflatex...\n pdfjam: Finished. Output was to '/path/to/foobar.pdf'.\n" b'/path/to/foobar.pdf (object 579 0, file position 7769200): EOF while reading token\n'
I might add that I use python 3.5.0.
Edit4:
I did some research this morning and should add that I applied cmidi's code snippet to both subprocesses, the one with pdfjam and the one with qpdf. When I applied it only to the first subprocess, I received the same message as posted in Edit3, without the b'/path/to/foobar.pdf (object 579 0, file position 7769200): EOF while reading token\n' line.
Pdfjam works fine (the output PDF is a single complete file, whereas pdflatex creates several separate ones). The problem clearly has to do with qpdf, see here.
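One possible explanation for the EOF while reading token message is that qpdf is asked to read and write the same file (target_file_a4 is both input and output). A minimal sketch, assuming that is the cause: let qpdf write to a separate file and swap it in afterwards. The helper names and the .linearized suffix are made up for illustration. subprocess.call already waits for each command to finish, so the two steps run one after the other.

```python
import os
import subprocess
import sys


def build_commands(tmp_pdf_file, target_file_a4):
    """Return the pdfjam and qpdf argument lists plus qpdf's temporary output path."""
    # Hypothetical name: qpdf writes here so it never overwrites its own input.
    linearized = target_file_a4 + ".linearized"
    pdfjam_cmd = ["pdfjam", "--keepinfo", "--noautoscale", "true",
                  "--frame", "true", tmp_pdf_file, "-o", target_file_a4]
    qpdf_cmd = ["qpdf", "--linearize", target_file_a4, linearized]
    return [pdfjam_cmd, qpdf_cmd], linearized


def run_sequentially(commands, cwd=None):
    """Run each command in order; stop and return False at the first failure."""
    for cmd in commands:
        returncode = subprocess.call(cmd, cwd=cwd)
        if returncode != 0:
            print("{} exited with status {}".format(cmd[0], returncode),
                  file=sys.stderr)
            return False
    return True


def join_and_linearize(tmp_pdf_file, target_file_a4, cwd=None):
    """Run pdfjam, then qpdf, then move the linearized file over the pdfjam output."""
    commands, linearized = build_commands(tmp_pdf_file, target_file_a4)
    if not run_sequentially(commands, cwd=cwd):
        return False
    os.replace(linearized, target_file_a4)
    return True
```

With the variables from the question, join_and_linearize(tmp_pdf_file, target_file_a4, cwd=tmpdir) would take the place of the two subprocess.call lines. Note that qpdf exits with status 3 when it only emitted warnings, which this sketch also treats as a failure.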

Related

Executing subprocess cannot find specified file on Windows

I'm working inside a system that has Jython 2.5, but I need to be able to call some of Google's APIs, so I wrote an offline script that I want to call from my Jython environment and have it return small pieces of data to me, like a JobID or a sheet URL or something else from Google.
I've tried a number of things but I always get an error back from Windows, saying that it cannot find the file specified.
I build the path in two ways.
The first way uses a string:
stringPath = r"‪C:\GooglePipes\Scripts\filetobq.py C:\GooglePipes\Keys\DEV-BigQueryKey.json nofile C:\GooglePipes\BQ_Downtime\TESTFILE.CSV dataset1 table1"
And the second way, as a sequence (per the docs, when using shell=False you supply a sequence):
seqPath = [r"‪C:\GooglePipes\Scripts\filetobq.py",r"C:\GooglePipes\Keys\DEV-BigQueryKey.json","nofile",r"C:\GooglePipes\BQ_Downtime\TESTFILE.CSV","dataset1","table1"]
Called with
data, err = Popen(seqPath, shell=True, stderr=PIPE, stdout=PIPE).communicate()
#Read values back in
print data
print err
I replace seqPath with stringPath to try it either way.
I've been at this all weekend. Every time I run it, I get this from Windows:
The system cannot find the path specified.
from the err print. I've been unable to debug much further than this. I'm not really sure what's happening. When I paste the stringPath variable directly into my computer's command window it executes.
I've also called subprocess.list2cmdline(seqPath) to see what it's outputting. It's giving me a ? in front of the string, but I haven't been able to figure out what that means. I can paste the rest of the string, starting after the question mark into the command window and it executes.
?C:\GooglePipes\Scripts\filetobq.py C:\GooglePipes...
I've tried a number of different combinations of True and False for shell, passing different args into Popen, and double slashes, and I have no fewer than 30 tabs open from Stack Overflow and other help forums. I just have no idea what to do at this point, and any help is appreciated.
Edit
After some additional logging, it turned out that the ? at the start of the string is actually a NULL character. This seems to be the root of my problem. I can't figure out why it shows up, but it was present in my copy-pastes. I started typing the path manually, and I got it working. When I feed the path in with my Jython program, it is present again.
Ultimately the error was the ?/NULL character.
I went back to the source value where the program was grabbing the path, and it was present there. After I re-keyed it by hand, everything started working.
If you copy and paste what I put in the question, you can see the NULL character in the string if you run it through a string->ASCII converter.
>C:
>NULL 67 58
What a bunch of bullsh***.
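A quick way to spot a stray character like this is to list every character that falls outside printable ASCII, together with its position. This is only a sketch (written for Python 3, not Jython 2.5), and find_hidden_chars and strip_hidden_chars are hypothetical helper names. Be aware that the stripping helper also drops legitimate non-ASCII characters, so it only suits paths known to be plain ASCII.

```python
def find_hidden_chars(text):
    """Return (index, code point) pairs for characters outside printable ASCII."""
    return [(i, "U+{:04X}".format(ord(ch)))
            for i, ch in enumerate(text)
            if not 32 <= ord(ch) <= 126]


def strip_hidden_chars(text):
    """Return text with every character outside printable ASCII removed."""
    return "".join(ch for ch in text if 32 <= ord(ch) <= 126)
```

Running find_hidden_chars on the path right before handing it to Popen shows whether the value picked up anything invisible along the way.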

Python subprocess.call with multiline string EOF

I've hit an issue that I don't really understand how to overcome. I'm trying to create a subprocess in Python to run another Python script. Not too difficult. The issue I'm unable to get around is an EOF error when a Python file includes a very long string.
Here's an example of what my files look like.
Subprocess.py:
### call longstr.py from the primary pyfile
subprocess.call(['python longstr.py'], shell = True)
Longstr.py
### called from subprocess.py
### the actual string is a lot longer; this is an example to illustrate how the string is formatted
lngstr = """here is a really long
string (which has many n3w line$ and "characters")
that are causing the python file to state the file is ending early
"""
print lngstr
Printed error in terminal
SyntaxError: EOF while scanning triple-quoted string literal
As a work around, I tried to remove all linebreaks as well as all spaces to see if it was due to it being multi-line. That still returned the same result.
My assumption is that when the subprocess is running and the shell is doing something with the file contents, when the new line is reached the shell itself is freaking out and that's what's terminating the process; not the file.
What is the correct workaround for having subprocess run a file like this?
Thank you for your help.
Answering my own question here: my problem was that I didn't call file.close() before trying to execute subprocess.call.
If you encounter this problem and are working with recently written files, this could be your issue too. Thank you to everyone who read or responded to this thread.
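A sketch of that fix, assuming the script is generated by the same program that runs it: writing inside a with block guarantees the file is flushed and closed before the child process reads it. write_and_run is a hypothetical helper, and sys.executable stands in for the python command.

```python
import os
import subprocess
import sys
import tempfile


def write_and_run(source):
    """Write source to a temporary script, close it, then run it and return its output."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        # Leaving the with block flushes and closes the file before the child starts.
        with os.fdopen(fd, "w") as script:
            script.write(source)
        return subprocess.check_output([sys.executable, path])
    finally:
        # Clean up the temporary script whether or not the child succeeded.
        os.remove(path)
```

If the file is still open and unflushed when the child starts, the child may see only part of the script, which would match the EOF while scanning triple-quoted string literal error.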

python: unable to find files in recently changed directory (OSx)

I'm automating some tedious shell tasks, mostly file conversions, in a kind of blunt force way with os.system calls (Python 2.7). For some bizarre reason, however, my running interpreter doesn't seem to be able to find the files that I just created.
Example code:
import os, time, glob
# call a node script to template a word document
os.system('node wordcv.js')
# print the resulting document to pdf
os.system('launch -p gowdercv.docx')
# move to the directory that pdfwriter prints to
os.chdir('/users/shared/PDFwriter/pauliglot')
print glob.glob('*.pdf')
I expect to have a length 1 list with the resulting filename, instead I get an empty list.
The same occurs with
pdfs = [file for file in os.listdir('/users/shared/PDFwriter/pauliglot') if file.endswith(".pdf")]
print pdfs
I've checked by hand, and the expected files are actually where they're supposed to be.
Also, I was under the impression that os.system blocked, but just in case it doesn't, I also stuck a time.sleep(1) in there before looking for the files. (That's more than enough time for the other tasks to finish.) Still nothing.
Hmm. Help? Thanks!
You should add a wait after the call to launch. Launch will spawn the task in the background and return before the document is finished printing. You can either put in some arbitrary sleep statements or if you want you can also check for file existence if you know what the expected filename will be.
import time
# print the resulting document to pdf
os.system('launch -p gowdercv.docx')
# give word about 30 seconds to finish printing the document
time.sleep(30)
Alternative:
import time
# print the resulting document to pdf
os.system('launch -p gowdercv.docx')
# wait for a maximum of 90 seconds
for x in xrange(0, 90):
    time.sleep(1)
    if os.path.exists('/path/to/expected/filename'):
        break
Reference for potentially needing a longer than 1 second wait here
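If the output filename is not known in advance, a variation on the polling loop above is to record which PDFs exist before printing and then wait for a new one to show up. A rough sketch in Python 3 (the answer above targets 2.7); snapshot and wait_for_new_files are illustrative names, not part of any library. A file can appear before the writer has finished with it, so a short extra pause may still be needed.

```python
import glob
import os
import time


def snapshot(directory, pattern="*.pdf"):
    """Return the set of files in directory that currently match pattern."""
    return set(glob.glob(os.path.join(directory, pattern)))


def wait_for_new_files(directory, before, pattern="*.pdf",
                       timeout=90.0, interval=1.0):
    """Poll until files appear that are not in before; return them sorted, or [] on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        new_files = snapshot(directory, pattern) - before
        if new_files:
            return sorted(new_files)
        if time.monotonic() >= deadline:
            return []
        time.sleep(interval)
```

The snapshot has to be taken before the os.system('launch ...') call and passed in as before, otherwise a file that shows up quickly would be counted as already present.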

Testing 7-Zip archives from a python script

So I've got a Python script that, at its core, makes .7z archives of selected directories for the purpose of backing up data. For simplicity's sake I've simply invoked 7-Zip through the Windows command line, like so:
def runcompressor(target, contents):
    print("Compressing {}...".format(contents))
    archive = currentmodule
    archive += "{}\\{}.7z".format(target, target)
    os.system('7z u "{}" "{}" -mx=9 -mmt=on -ssw -up1q0r2x2y2z1w2'.format(archive, contents))
    print("Done!")
Which creates a new archive if one doesn't exist and updates the old one if it does, but if something goes wrong the archive will be corrupted, and if this command hits an existing, corrupted archive, it just gives up. Now 7zip has a command for testing the integrity of an archive, but the documentation says nothing about giving an output, and then comes the trouble of capturing that output in python.
Is there a way I can test the archives first, to determine if they've been corrupted?
The 7z executable returns a value of two or greater if it encounters a problem. In a batch script, you would generally use errorlevel to detect this. Unfortunately, os.system() under Windows gives the return value of the command interpreter used to run your program, not the exit value of your program itself.
If you want the latter, you're probably going to have to get your hands a little dirtier with the subprocess module, rather than using the os.system() call.
If you have version 3.5 (or better), this is as simple as:
import subprocess as sp
x = sp.run(['7z', 'a', 'junk.7z', 'junk.txt'], stdout=sp.PIPE, stderr=sp.STDOUT)
print(x.returncode)
That junk.txt in my case is a real file but junk.7z is just a copy of one of my text files, hence an invalid archive. The output from the program is 2 so it's easily detectable if something went wrong.
If you print out x rather than just x.returncode, you'll see something like (reformatted and with \r\n sequences removed for readability):
CompletedProcess(
args=['7z', 'a', 'junk.7z', 'junk.txt'],
returncode=2,
stdout=b'
7-Zip [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
Error: junk.7z is not supported archive
System error:
Incorrect function.
'
)
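To test an existing archive rather than add to it, the same approach should work with 7-Zip's t (test) command. A minimal sketch, assuming 7z is on the PATH and treating any nonzero exit status as a failed test; archive_is_intact is a hypothetical helper name.

```python
import subprocess


def archive_is_intact(archive, exe="7z"):
    """Return True if running `7z t` on the archive exits with status 0."""
    result = subprocess.run([exe, "t", archive],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return result.returncode == 0
```

In runcompressor, a check such as if os.path.exists(archive) and not archive_is_intact(archive): could decide whether to delete or rename the damaged archive before running 7z u.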

Interfacing Python With Fortran through Command-Line Using Pexpect

I am using pexpect with Python to create a program that allows a user to interact with a FORTRAN program through a website. I receive the following error from the FORTRAN program:
open: Permission denied apparent state: unit 4 named subsat.out.55 last format: list io lately writing sequential formatted external IO 55
when I attempt to:
p = pexpect.spawn(myFortranProgram,[],5)
p.logfile_read = sys.stdout
p.expect("(.*)")
p.sendline("55")
From what I understand, I am likely sending the 55 to the wrong input unit. How do I correctly send input to a FORTRAN program using pexpect in Python?
Thank You.
Edit: When p.sendline's parameter is empty (e.g. p.sendline()) or only contains spaces, the program proceeds as expected. In sending non-space values to a FORTRAN program, do I need to specify the input format somehow?
The pexpect module is something I'd not used before, but could be useful to me, so I tried this.
Edit:
I've not been able to duplicate the error you're reporting. Looking at this error leads me to believe that it has something to do with reading from a file, which may be a result of other issues. From what I've seen, this isn't what pexpect is designed to handle directly; however, you may be able to make it work with a pipe, like the example in my original answer, below.
I'm having no problem sending data to Fortran's I/O stream 5 (stdin). I created a Fortran program called regurgitate which issues a " Your entry? " prompt, then gets a line of input from the user on I/O stream 5, then prints it back out. The following code works with that program:
import pexpect
child = pexpect.spawn('./regurgitate')
child.setecho(False)
ndx = child.expect('.*Your entry?.*')
child.sendline('42')
child.expect([pexpect.EOF])
print child.before
child.close()
The output is simply:
42
Exactly what I expected. However, if my Fortran program prints a different prompt (such as "Your input?"), pexpect just hangs or times out.
Original suggestion:
Maybe this pexpect.run() sample will help you. At least it seems to run my regurgitate program (a simple Fortran program that accepts an input and then prints it out):
import pexpect
out = pexpect.run('/bin/bash -c "/bin/cat forty-two | ./regurgitate"')
print out
The output was:
Your entry?
42
Where regurgitate prints out a "Your entry?" prompt and the forty-two file contains "42" (without quotes in both cases).
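For a program that only needs its input up front, the shell pipe can probably be avoided altogether by passing the text through subprocess.run (Python 3.5+). A sketch, where run_with_input is a made-up helper and ./regurgitate is the test program described above:

```python
import subprocess


def run_with_input(command, text):
    """Run command, feed text to its standard input, and return its decoded output."""
    result = subprocess.run(command, input=text.encode(),
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return result.stdout.decode()
```

Calling print(run_with_input(['./regurgitate'], '42\n')) should then show the prompt followed by 42. This does not help when the input depends on earlier output, which is where pexpect is the better fit.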
