python: unable to find files in recently changed directory (OSx)


I'm automating some tedious shell tasks, mostly file conversions, in a kind of blunt force way with os.system calls (Python 2.7). For some bizarre reason, however, my running interpreter doesn't seem to be able to find the files that I just created.
Example code:
import os, time, glob
# call a node script to template a word document
os.system('node wordcv.js')
# print the resulting document to pdf
os.system('launch -p gowdercv.docx')
# move to the directory that pdfwriter prints to
os.chdir('/users/shared/PDFwriter/pauliglot')
print glob.glob('*.pdf')
I expect to have a length 1 list with the resulting filename, instead I get an empty list.
The same occurs with
pdfs = [file for file in os.listdir('/users/shared/PDFwriter/pauliglot') if file.endswith(".pdf")]
print pdfs
I've checked by hand, and the expected files are actually where they're supposed to be.
Also, I was under the impression that os.system blocked, but just in case it doesn't, I also stuck a time.sleep(1) in there before looking for the files. (That's more than enough time for the other tasks to finish.) Still nothing.
Hmm. Help? Thanks!

You should add a wait after the call to launch. Launch will spawn the task in the background and return before the document is finished printing. You can either put in some arbitrary sleep statements or if you want you can also check for file existence if you know what the expected filename will be.
import time
# print the resulting document to pdf
os.system('launch -p gowdercv.docx')
# give word about 30 seconds to finish printing the document
time.sleep(30)
Alternative:
import os, time
# print the resulting document to pdf
os.system('launch -p gowdercv.docx')
# wait for a maximum of 90 seconds
for x in xrange(0, 90):
    time.sleep(1)
    # stop waiting as soon as the expected file shows up
    if os.path.exists('/path/to/expected/filename'):
        break
Reference for potentially needing a longer than 1 second wait here
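When the output filename is not known in advance, the same polling idea can be applied to a glob pattern. The following is a minimal sketch, assuming Python 3 (the calls used also exist in 2.7); `wait_for_files` and its parameters are hypothetical names, not part of any library.

```python
import glob
import time

def wait_for_files(pattern, timeout=90.0, interval=1.0):
    """Poll until glob `pattern` matches at least one file or `timeout` seconds pass.

    Returns the sorted list of matches, which is empty on timeout.
    """
    deadline = time.time() + timeout
    while True:
        matches = sorted(glob.glob(pattern))
        if matches or time.time() >= deadline:
            return matches
        time.sleep(interval)
```

In the question's setting this might be called as wait_for_files('/users/shared/PDFwriter/pauliglot/*.pdf'). Note that a file showing up only means it exists, not that the printer driver has finished writing it, so a short extra pause or a size check may still be needed.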

Related

Executing subprocess cannot find specified file on Windows

I'm working inside a system that has Jython 2.5, but I need to be able to call some of Google's APIs, so I wrote an offline script that I wanted to call from my Jython environment and have it return small pieces of data to me, like a JobID or a sheet URL or something from Google.
I've tried a number of things but I always get an error back from Windows, saying that it cannot find the file specified.
Path is done in two ways.
The first way using a string
stringPath = r"‪C:\GooglePipes\Scripts\filetobq.py C:\GooglePipes\Keys\DEV-BigQueryKey.json nofile C:\GooglePipes\BQ_Downtime\TESTFILE.CSV dataset1 table1"
And the second way, as a sequence (per the docs, with shell=False you supply a sequence)
seqPath = [r"‪C:\GooglePipes\Scripts\filetobq.py",r"C:\GooglePipes\Keys\DEV-BigQueryKey.json","nofile",r"C:\GooglePipes\BQ_Downtime\TESTFILE.CSV","dataset1","table1"]
Called with
data, err = Popen(seqPath, shell=True, stderr=PIPE, stdout=PIPE).communicate()
#Read values back in
print data
print err
Replacing seqPath with stringPath to try it either way.
I've been at this all weekend, every time I run it I get from Windows
The system cannot find the path specified.
from the err print. I've been unable to debug much further than this. I'm not really sure what's happening. When I paste the stringPath variable directly into my computer's command window it executes.
I've also called subprocess.list2cmdline(seqPath) to see what it's outputting. It's giving me a ? in front of the string, but I haven't been able to figure out what that means. I can paste the rest of the string, starting after the question mark into the command window and it executes.
?C:\GooglePipes\Scripts\filetobq.py C:\GooglePipes...
I've tried a number of different combinations of true and false on shell, passing different args into Popen, double slashes, and I have no less than 30 tabs open from stack overflow and other help forums. I just have no idea what to do at this point and any help is appreciated.
Edit
The ? at the start of the string turned out to be a NULL character when I did some additional logging. This seems to be the root of my problem. I can't figure out why it shows up, but it was present in my copy-pastes. When I typed the path manually, I got it working. When I feed the path in from my Jython program, the character is present again.

Ultimately the error was the ?/NULL character.
I went back to the source value where the program was grabbing the path, and it was present there. After I re-keyed it by hand, everything started working.
If you copy and paste what I put in the question, you can see the NULL character in the string if you run it through a string->ASCII converter.
>C:
>NULL 67 58
What a bunch of bullsh***.
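A stray invisible character like this can be located programmatically instead of by retyping. Below is a minimal sketch, assuming Python 3; `find_suspect_chars` and `strip_suspect_chars` are hypothetical helper names. It reports every character outside printable ASCII. Which character was actually in the poster's source value cannot be confirmed from here; U+202A, used in the usage note, is just one example of an invisible character that can tag along when a path is copied from some Windows dialogs.

```python
import unicodedata

def find_suspect_chars(text):
    """Return (index, code point, Unicode name) for each character outside printable ASCII."""
    suspects = []
    for index, char in enumerate(text):
        if not (32 <= ord(char) < 127):
            # Control characters have no Unicode name, so fall back to a placeholder.
            name = unicodedata.name(char, "UNNAMED")
            suspects.append((index, "U+%04X" % ord(char), name))
    return suspects

def strip_suspect_chars(text):
    """Drop every character outside printable ASCII (blunt: also removes legitimate non-ASCII)."""
    return "".join(char for char in text if 32 <= ord(char) < 127)
```

For a string such as "\u202aC:\\GooglePipes\\Scripts\\filetobq.py", find_suspect_chars returns [(0, 'U+202A', 'LEFT-TO-RIGHT EMBEDDING')]. Stripping is deliberately crude here; for paths that may legitimately contain non-ASCII names, removing only the reported positions would be safer.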

Writing to a text file does not occur in real-time. How to fix this

I have a python script that takes a long time to run.
I placed print-outs throughout the script to observe its progress.
As this script calls different programs, some of which print many messages, it is infeasible to print directly to the screen.
Therefore, I am using a report file
f_report = open(os.path.join("//shared_directory/projects/work_area/", 'report.txt'), 'w')
To which I print my messages:
f_report.write(" "+current_image+"\n")
However, when I look at the file while the script is running, I do not see the messages. They appear only when the program finishes and closes the file, making my approach useless for monitoring on-going progress.
What should I do in order to make python output the messages to the report file in real time?
Many thanks.

You should call the file's flush() method after writing, so the data is written to the file immediately.
f_report.write(" "+current_image+"\n")
f_report.flush()

try this:
newbuffer = 0
f_report = open(os.path.join("//shared_directory/projects/work_area/", 'report.txt'), 'w', newbuffer)
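Passing 0 as the third argument disables Python's own buffering, so each write is handed to the OS immediately. Different operating systems may behave differently, but in general the content is flushed out right away. (In Python 3, a buffering value of 0 is only allowed in binary mode; for a text file, 1 selects line buffering.)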
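The two suggestions above can be combined into a small helper. This is a sketch under stated assumptions: Python 3, a text-mode report, and hypothetical helper names `open_report` and `log_line`. Line buffering covers complete lines, the explicit flush() covers everything else, and os.fsync additionally asks the OS to commit the data to disk.

```python
import os

def open_report(path):
    """Open a text report with line buffering (buffering=1)."""
    return open(path, "w", buffering=1)

def log_line(report, message, sync=False):
    """Write one line, flush Python's buffer, and optionally ask the OS to commit it."""
    report.write(message + "\n")
    report.flush()
    if sync:
        os.fsync(report.fileno())
```

On a network share like the one in the question, a reader on another machine may still see some delay from client-side caching even after a flush; that part is outside Python's control.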

Add a subprocess to another one

I would like to add a new subprocess to an existing subprocess.
I do not want both processes to run in parallel. I just want to run the new subprocess (qpdf, which is supposed to linearize the output pdf from pdfjam) after the existing one (pdfjam, which joins several pdfs together); see the code below. However, the newly added subprocess does not seem to be executed, only the existing one.
How can I fix this?
This is the concerning part of the code:
tmp_pdf_file = tmpdir + "/" + basename + ".pdf"
if os.path.exists(tmp_pdf_file):
    subprocess.call(["pdfjam", "--keepinfo", "--noautoscale", "true", "--frame", "true", tmp_pdf_file, "-o", target_file_a4], cwd=tmpdir) # existing subprocess
    self.progress.step()
    subprocess.call(["qpdf", "--linearize", target_file_a4, target_file_a4], cwd=tmpdir) # new subprocess
    self.progress.step()
    shutil.move(tmp_pdf_file, target_file)
else:
    print("No pdf file generated!")
Edit: I think the second subprocess has difficulty finding target_file_a4, even though there is no error message.
Edit2: The code runs through without any error message (I am not sure I can implement the suggested communicate() call, given my limited Python skills). I check whether the output pdf is linearized with pdfinfo foobar.pdf, which gives me Optimized: no (i.e. linearized: no, see the manpage).
Edit3: The output is: Extract init b"----\n pdfjam: This is pdfjam version 2.08.\n pdfjam: Reading any site-wide or user-specific defaults...\n (none found)\n pdfjam: Effective call for this run of pdfjam:\n /usr/bin/pdfjam --keepinfo --noautoscale 'true' --frame 'true' --outfile /path/to/foobar.pdf -- /tmp/extract-5j70hk9o/foobar.pdf - \n pdfjam: Calling pdfinfo...\n pdfjam: Calling pdflatex...\n pdfjam: Finished. Output was to '/path/to/foobar.pdf'.\n" b'/path/to/foobar.pdf (object 579 0, file position 7769200): EOF while reading token\n'
I might add that I use python 3.5.0.
Edit4:
I did some research this morning and should add that I have applied cmidi's code snippet to both subprocesses, the one with pdfjam and the one with qpdf. When I applied it only to the first subprocess, I received the same message as posted in Edit3, but without the b'/path/to/foobar.pdf (object 579 0, file position 7769200): EOF while reading token\n' line.
Pdfjam works fine (the output pdf is a single complete one; pdflatex creates several separate ones). It clearly has to do with qpdf, see here.
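One plausible cause, going by the EOF error in Edit3, is that qpdf is given target_file_a4 as both input and output, so it may be overwriting the file while still reading it. This is an assumption, not something the thread confirms. A minimal sketch of a workaround, assuming Python 3.3+ and a hypothetical helper name `linearize_in_place`, writes to a temporary file next to the original and swaps it in afterwards. The `run` parameter exists only so the command can be replaced in tests.

```python
import os
import subprocess
import tempfile

def linearize_in_place(pdf_path, run=subprocess.check_call):
    """Linearize pdf_path by writing to a temporary file, then replacing the original."""
    directory = os.path.dirname(os.path.abspath(pdf_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".pdf", dir=directory)
    os.close(fd)
    try:
        # Same command shape as in the question, but with distinct input and output paths.
        run(["qpdf", "--linearize", pdf_path, tmp_path])
        os.replace(tmp_path, pdf_path)
    except Exception:
        # Clean up the temporary file if qpdf failed, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Using check_call rather than call also means a non-zero exit status from qpdf raises an exception instead of passing silently, which would have made the original failure easier to spot.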

Blocking until a file is closed in python

I have Python set up to create and open a txt file [see Open document with default application in Python ], which I then manually make some changes to and close. Immediately after this is complete, I want Python to open the next txt file. I currently have this set up so that Python waits for a key command that I type after I have closed the file, and on that key, it opens the next one for me to edit.
Is there a way of getting Python to open the next document as soon as the prior one is closed (i.e to skip out having python wait for a key to be clicked). ... I will be repeating this task approximately 100,000 times, and thus every fraction of a second of clicking mounts up very quickly. I basically want to get rid of having to interface with python, and simply to have the next txt file automatically appear as soon as prior one is closed.
I couldn't work out how to do it, but was thinking along the lines of a wait until the prior file is closed (wasn't sure if there was a way for python to be able to tell if a file is open/closed).
For reference, I am using python2.7 and Windows.

Use the subprocess module's Popen Constructor to open the file. It will return an object with a wait() method which will block until the file is closed.
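A minimal sketch of that approach, assuming Python 3 and that the files are opened in a specific editor executable (notepad.exe is an assumption here, as is the helper name `edit_files_in_turn`). Popen.wait() blocks until the editor process exits, so the next file opens as soon as the previous window is closed.

```python
import subprocess

def edit_files_in_turn(paths, editor=("notepad.exe",)):
    """Open each path in `editor` and block until that editor process exits."""
    exit_codes = []
    for path in paths:
        process = subprocess.Popen(list(editor) + [path])
        exit_codes.append(process.wait())  # returns only once the editor is closed
    return exit_codes
```

This only works when the launched process is the one that stays alive while the file is open. Editors that hand the file to an already running instance and exit immediately would make wait() return too early.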

How about something like:
for fname in list_of_files:
    with open(fname, mode) as f:
        pass  # do stuff

In case of interest, the following code using the modified time method worked:
import os, sys, time
os.startfile(text_file_name)
modified = time.ctime(os.path.getmtime(text_file_name))
created = time.ctime(os.path.getctime(text_file_name))
# poll until the modification time differs from the creation time
while modified == created:
    time.sleep(0.5)
    modified = time.ctime(os.path.getmtime(text_file_name))
print modified
print "moving on to next item"
time.sleep(0.5)
sys.stdout.flush()
Although I think I will use the Popen constructor in the future, since that seems a much more elegant way of doing it (and it also covers the case where the file is closed without an edit being needed).

Print a PDF and delete the file when printing has finished

I have a Python application that will be executed repeatedly. It saves a PDF as a file and then prints it. When printing ends, it deletes the file.
My current solution (for the print and delete part) is this:
win32api.ShellExecute(0, "print", file_path, None, ".", 0)
time.sleep(10)
os.remove(self.options.dest_name)
time.sleep(10) is a trick to give the printing process time to run before the file is deleted. Without it, Acrobat Reader opens (it opens anyway) and complains that it can't find the file, because the file has already been removed.
The question is:
How can I do it without this unreliable trick? The best thing would be to have a handle on the printing process and query it for the printing state: I would wait for it to report completion and then delete the file.
It would be even better if Acrobat Reader didn't open at all, but this is not a big problem.
EDIT: I tried switching to Foxit Reader as the default PDF reader and now it doesn't open when I don't want. ;)
OTHER POSSIBLE SOLUTION:
Cyclically check whether the file is available (not in use by another process) and delete it once it is available again. How could I do this in Python?
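The polling idea raised above could be sketched as follows. This assumes Python 3 and relies on the fact that, on Windows, deleting a file that another process holds open typically raises an OSError; `remove_when_free` is a hypothetical helper name, and the `remove` parameter exists only so the deletion can be stubbed in tests.

```python
import os
import time

def remove_when_free(path, timeout=60.0, interval=0.5, remove=os.remove):
    """Retry deleting `path` until it succeeds or `timeout` seconds pass.

    Returns True if the file was deleted, False if deletion kept failing until the deadline.
    """
    deadline = time.time() + timeout
    while True:
        try:
            remove(path)
            return True
        except OSError:
            # Deletion failed (for example, the file is still held open); retry until the deadline.
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```

This does not detect that printing has finished. It only detects that nothing is blocking the deletion at that moment, so if the viewer has not opened the file yet, the file can still be removed too early. A short initial delay before the first attempt would reduce that risk without eliminating it.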

At last I've found a good solution, thanks to this answer (@Lennart also mentioned it in a comment):
install Ghostscript
install GSview (which includes gsprint.exe)
write this code:
import os, subprocess
file_path = "C:\\temp\\test.pdf"
# printer_name holds the name of the target printer and is defined elsewhere
p = subprocess.Popen(["C:\\Ghostgum\\gsview\\gsprint.exe", "-printer", printer_name, "-colour", file_path],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate() # waits for the gs process to end
os.remove(file_path) # now the file can be removed
No Acrobat windows opening, no file removed before printing... The annoyance: installing GS.
See also: gsprint reference

Rather than hard-coding a filename and printing that, you should use the tempfile module to create a temporary file with a unique name.
import tempfile
# NamedTemporaryFile returns a file object; its .name attribute holds the path
tmp = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False)
file_name = tmp.name
If you want, you can run a regular tidy-up script using Windows' scheduling tools to delete the files created.

Adobe Acrobat has (or at least used to have) a parameter "/t", which makes it open, print and exit. By using it, you can call Acrobat Reader, wait for it to exit, and then delete the file.
Untested code:
>>> import os, subprocess
>>> # You will have to figure out where your Acrobat Reader is located; it can be found in the registry:
>>> acrobatexe = r"C:\Program Files\Adobe\Acrobat 4.0\Reader\AcroRd32.exe"
>>> subprocess.call([acrobatexe, "/t", tempfilename, "My Windows Printer Name"])
>>> os.unlink(tempfilename)
Something like that.
If you don't want Acrobat to open, there is open source software that will print pdfs from the command line. You could include one such tool with your software.

Why not use os.system, which will wait until the process is finished?
