I am trying to convert .doc documents to .docx documents using Python. Taking inspiration from this post, I have tried the following code:
import subprocess
import glob
import os
root = "//PARADFS101/7folder/LIAGREV/Documents/RFP/"
data_path = root + '/data2/'
os.chdir(data_path)
for doc in glob.iglob("*.doc"):
    print(doc)
    subprocess.call(['soffice', '--headless', '--convert-to', 'docx', doc], shell = True)
But unfortunately literally nothing happens: I get no error message, the code runs, and the docs are detected (which I check with print), but I don't get any result. Any idea how I may troubleshoot this?
EDITS :
I am running on Windows, hence shell = True
I have tried double quotes : '"
I have tried without spaces in the names
When I execute the subprocess command on one file alone, I get 1 as output, which I don't know how to interpret...
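One way to troubleshoot a silent failure like this is to capture what the command prints and its exit status instead of discarding them. The sketch below assumes Python 3.7+ and that soffice is on the PATH; the helper names build_convert_command and run_and_report are made up for illustration. A non-zero return code (such as the 1 mentioned above) generally signals that the command failed, and stderr usually says why.

```python
import subprocess
import sys


def build_convert_command(doc_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line as a list (no shell needed)."""
    return [soffice, "--headless", "--convert-to", "docx",
            "--outdir", out_dir, doc_path]


def run_and_report(cmd):
    """Run cmd and return (returncode, stdout, stderr) as text."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Demo with a stand-in command (the Python interpreter itself) so the
# sketch runs anywhere: the child process exits with status 1.
code, out, err = run_and_report(
    [sys.executable, "-c", "import sys; print('ok'); sys.exit(1)"])
print("exit code:", code)
print("stdout:", out.strip())
```

With LibreOffice installed, run_and_report(build_convert_command(doc, ".")) would be called once per file. If the executable cannot be found, subprocess.run raises FileNotFoundError, which is itself a useful hint.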
I am trying to run a list of files in a directory through a UNIX executable using Python. I would like the output of the executable for each file to be written to a different directory while retaining the original filename.
I am using Python 2.7, so I am using the subprocess.call method. I am getting an error that says "'bool' object is not iterable", which I am guessing is due to the part where I try to write the output files, because when I run the following script through the console I get the expected output of the executable in the console window:
import subprocess
import os
for inp in os.listdir('/path/to/input/directory/'):
    subprocess.call(['/path/to/UNIX/executable', inp])
My code is currently this:
import subprocess
import os
for inp in os.listdir('/path/to/input/directory/'):
    out = ['/path/to/output/directory/%s' % inp]
    subprocess.call(['/path/to/UNIX/executable', inp] > out)
However, this second lot of code returns the "'bool' is not iterable" error.
I'm guessing the solution is pretty trivial as it is not a complicated task however, as a beginner, I do not know where to start!
SOLVED: following #barak-itkin's answer, for those who may stumble across this issue in the future, the code ran successfully using the following:
import subprocess
import os
for inp in os.listdir('/path/to/input/directory/'):
    with open('/path/to/output/directory/%s' % inp, 'w') as out_file:
        subprocess.call(['/path/to/UNIX/executable', inp], stdout=out_file)
To write the output of a subprocess.call to a file, you would need to either use the > path/to/out as part of the command itself, or to do it "properly" by specifying the file to which the output should go:
# Option 1:
# Specify that the command is using a "shell" syntax, meaning that
# things like output redirection (such as with ">") should be handled
# by the shell that will evaluate the command
subprocess.call('my_command arg1 arg2 > /path/to/out', shell=True)
# Option 2:
# Open the file to which you want to write the output, and then specify
# the `stdout` parameter to be that file
with open('/path/to/out', 'w') as out_file:
    subprocess.call(['my_command', 'arg1', 'arg2'], stdout=out_file)
Does this work for you?
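Putting Option 2 in a loop over a directory might look like the sketch below. It assumes Python 3; run_on_directory is a made-up helper name, and the demo uses the Python interpreter as a stand-in for the real executable. Note that os.listdir returns bare filenames, so the sketch joins them with the input directory before passing them on.

```python
import os
import subprocess
import sys
import tempfile


def run_on_directory(command, in_dir, out_dir):
    """Run command + [input_file] for each file in in_dir and save stdout
    to a file of the same name in out_dir. Returns {filename: returncode}."""
    os.makedirs(out_dir, exist_ok=True)
    codes = {}
    for name in sorted(os.listdir(in_dir)):
        in_path = os.path.join(in_dir, name)
        if not os.path.isfile(in_path):
            continue  # skip sub-directories
        with open(os.path.join(out_dir, name), "w") as out_file:
            codes[name] = subprocess.call(command + [in_path], stdout=out_file)
    return codes


# Demo: the stand-in "executable" prints its input file in upper case.
in_dir = tempfile.mkdtemp()
out_dir = tempfile.mkdtemp()
with open(os.path.join(in_dir, "a.txt"), "w") as f:
    f.write("hello")
upper = [sys.executable, "-c",
         "import sys; print(open(sys.argv[1]).read().upper())"]
codes = run_on_directory(upper, in_dir, out_dir)
print(codes)
```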
I need to check the lag of GoldenGate processes. In order to do this, I start GoldenGate and then try to run GoldenGate's own command "info all".
import subprocess as sub
import re
import os
location = str(sub.check_output(['ps -ef | grep mgr'], shell = True)).split()
pattern = re.compile(r'mgr\.prm$')
print(type(location))
for index in location:
    if pattern.search(index)!=None:
        gg_location = index[:-14] + "ggsci"
exec_ggate = sub.call(str(gg_location))
os.system('info all')
Yet when I execute GoldenGate, it opens GoldenGate's own shell. I think that because of that, Python is not able to run the "info all" command. How can I solve this problem? If any information is missing, please let me know.
Thank you in advance,
For command automation in GoldenGate, the Oracle docs provide the following information: https://docs.oracle.com/goldengate/1212/gg-winux/GWUAD/wu_gettingstarted.htm#GWUAD1096
To input a script
Use the following syntax from the command line of the operating system.
ggsci < input_file
Where:
The angle bracket (<) character pipes the file into the GGSCI program.
input_file is a text file, known as an OBEY file, containing the commands that you want to issue, in the order they are to be issued.
Taking your script (keep in mind that I don't know how to code in Python), you can simply execute a shell command in Python in the following way:
import os
os.system("command")
So try doing this:
import os
os.system("ggsci < input_file")
Changing the input_file as indicated by the docs.
I think you will have an easier time doing it this way.
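The same idea can also be expressed without a shell by sending the commands to the program's standard input through subprocess. This is only a sketch, assuming Python 3.7+ and that the program reads its commands from stdin (which the `ggsci < input_file` form above suggests); run_with_commands is a made-up helper, and the demo uses a small Python one-liner as a stand-in for ggsci.

```python
import subprocess
import sys


def run_with_commands(program, commands):
    """Start program, send commands on stdin (one per line), return stdout."""
    script = "\n".join(commands) + "\n"
    result = subprocess.run(program, input=script,
                            capture_output=True, text=True)
    return result.stdout


# Stand-in for ggsci so the sketch runs anywhere: it echoes each command.
fake_ggsci = [sys.executable, "-c",
              "import sys\nfor line in sys.stdin: print('got:', line.strip())"]
output = run_with_commands(fake_ggsci, ["info all", "exit"])
print(output)
```

With a real installation, the first argument would be the list containing the path to ggsci (for example the gg_location value computed in the question), and the returned text could then be parsed for the lag values.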
I am trying to convert doc file into docx. I found this code online.
subprocess.call(['soffice', '--headless', '--convert-to', 'docx', filename])
document = docx.Document(path[:-4] + ".docx")
docText = ''.join([
    paragraph.text.encode('ascii', 'ignore') for paragraph in
    document.paragraphs
])
It works perfectly fine when I use it on my own machine, but I am trying to put this on AWS and it doesn't work there. I get an error saying "No such file or directory".
What could be the reason that it works on my computer but not on AWS?
You must have LibreOffice installed on the machine where you run this code, and you must close any open instances of LibreOffice before running it, or it will exit silently without doing anything.
You can also try
unoconv -d document --format=docx *.doc
But it is also dependent on LibreOffice: it converts the files through LibreOffice. It is imperfect and some formatting is lost, but it will convert all the .doc files to .docx.
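A "No such file or directory" error from subprocess.call is often raised because the executable itself cannot be found, so a quick check on the AWS machine is to ask whether soffice is on the PATH at all. The sketch below assumes Python 3.3+ for shutil.which; find_converter and the candidate names are assumptions for illustration, since the executable name can differ between installations.

```python
import shutil


def find_converter(candidates=("soffice", "libreoffice")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


converter = find_converter()
if converter is None:
    print("LibreOffice executable not found on PATH")
else:
    print("Using", converter)
```

If nothing is found, installing LibreOffice on that machine (or passing the full path of the executable to subprocess.call) is the likely fix.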
On a Windows machine, I am trying to get a file's mode using the os module in Python, like this (short snippet):
import os
from stat import *
file_stat = os.stat(path)
mode = file_stat[ST_MODE]
An example for the mode I got for a file is 33206.
My question is, how can I convert it to the Linux file-mode notation (for example, 666)?
Thanks to all repliers!
Edit:
found my answer down here :) for all who want to understand this topic further:
understanding and decoding the file mode value from stat function output
Check if this translates properly:
import os
import stat
file_stat = os.stat(path)
mode = file_stat[stat.ST_MODE]
print oct(stat.S_IMODE(mode))
For your example:
>>>print oct(stat.S_IMODE(33206))
0666
Took it from here. Read for more explanation
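For readers on Python 3, the same decoding can be sketched as follows, using the value 33206 from the question. The st_mode number combines the file type with the permission bits; stat.S_IMODE keeps only the permission bits, and stat.filemode (available from Python 3.3) renders the whole value in ls style.

```python
import stat

mode = 33206  # the st_mode value from the question

permissions = stat.S_IMODE(mode)   # keep only the permission bits
print(format(permissions, "o"))    # 666
print(stat.filemode(mode))         # -rw-rw-rw-
print(stat.S_ISREG(mode))          # True: the type bits say "regular file"
```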
One workaround would be to use os.system(r'attrib -h -s d:\your_file.txt'), where you can use the following attribute switches:
R – This command will assign the “Read-Only” attribute to your selected files or folders.
H – This command will assign the “Hidden” attribute to your selected files or folders.
A – This command will prepare your selected files or folders for “Archiving.”
S – This command will change your selected files or folders by assigning the “System” attribute.
I am working on a python script that installs an 802.1x certificate on a Windows 8.1 machine. This script works fine on Windows 8 and Windows XP (haven't tried it on other machines).
I have isolated the issue. It has to do with clearing out the folder
"C:\Windows\system32\config\systemprofile\AppData\LocalLow\Microsoft\CryptURLCache\Content"
The problem is that I am using the module os and the command listdir on this folder to delete each file in it. However, listdir errors, saying the folder does not exist, when it does indeed exist.
The issue seems to be that os.listdir cannot see the LocalLow folder. If I make a two-line script:
import os
os.listdir("C:\Windows\System32\config\systemprofile\AppData")
It shows the following result:
['Local', 'Roaming']
As you can see, LocalLow is missing.
I thought it might be a permissions issue, but I am having serious trouble figuring out what a next step might be. I am running the process as an administrator from the command line, and it simply doesn't see the folder.
Thanks in advance!
Edit: changing the string to r"C:\Windows\System32\config\systemprofile\AppData", "C:\Windows\System32\config\systemprofile\AppData", or "C:/Windows/System32/config/systemprofile/AppData" all produce identical results
Edit: Another unusual wrinkle in this issue: If I manually create a new directory in that location I am unable to see it through os.listdir either. In addition, I cannot browse to the LocalLow or my New Folder through the "Save As.." command in Notepad++
I'm starting to think this is a bug in Windows 8.1 preview.
I encountered this issue recently.
I found that it is caused by the Windows file system redirector.
You can check out the following Python snippet:
import ctypes
class disable_file_system_redirection:
    _disable = ctypes.windll.kernel32.Wow64DisableWow64FsRedirection
    _revert = ctypes.windll.kernel32.Wow64RevertWow64FsRedirection
    def __enter__(self):
        self.old_value = ctypes.c_long()
        self.success = self._disable(ctypes.byref(self.old_value))
    def __exit__(self, type, value, traceback):
        if self.success:
            self._revert(self.old_value)
#Example usage
import os
path = 'C:\\Windows\\System32\\config\\systemprofile\\AppData'
print os.listdir(path)
with disable_file_system_redirection():
    print os.listdir(path)
print os.listdir(path)
ref : http://code.activestate.com/recipes/578035-disable-file-system-redirector/
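The redirector only affects 32-bit processes running on 64-bit Windows, so a useful first check is whether the interpreter itself is 32-bit. The sketch below is an assumption-laden illustration in Python 3: is_32bit_python and native_system_path are made-up helpers, and the second one relies on the Sysnative alias that Windows offers to 32-bit processes as a way to reach the real System32 directory.

```python
import os
import struct


def is_32bit_python():
    """True when the running interpreter uses 32-bit pointers."""
    return struct.calcsize("P") * 8 == 32


def native_system_path(path):
    """For a 32-bit interpreter on Windows, replace the System32 component
    with Sysnative so the path is not redirected; otherwise return path."""
    if os.name != "nt" or not is_32bit_python():
        return path
    parts = path.split("\\")
    return "\\".join("Sysnative" if p.lower() == "system32" else p
                     for p in parts)


print("32-bit interpreter:", is_32bit_python())
print(native_system_path(r"C:\Windows\System32\config\systemprofile\AppData"))
```

Running the script with a 64-bit Python is another way to avoid the redirection altogether, since it is not applied to 64-bit processes.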
You must have escape sequences in your path. You should use a raw string for file/directory paths:
# By putting the 'r' at the start, I make this string a raw string
# Raw strings do not process escape sequences
r"C:\path\to\file"
or put the slashes the other way:
"C:/path/to/file"
or escape the slashes:
# You probably won't want this method because it makes your paths huge
# I just listed it because it *does* work
"C:\\path\\to\\file"
I'm curious as to how you are able to list the contents with those two lines. Your path contains the backslash sequences \W, \S, \c, \s and \A. Try escaping the backslashes like this:
import os
os.listdir('C:\\Windows\\System32\\config\\systemprofile\\AppData')
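To see the difference between the three spellings, here is a small sketch using a made-up path whose segments happen to start with t and n, so that the unescaped version really is altered. As far as I know, Python leaves unrecognized sequences such as \W or \S unchanged (newer versions warn about them), which may be why the two-line script in the question ran at all.

```python
plain = "C:\temp\new"      # \t becomes a tab, \n becomes a newline
raw = r"C:\temp\new"       # raw string: backslashes are kept literally
escaped = "C:\\temp\\new"  # each \\ stands for one backslash

print(len(plain), len(raw), len(escaped))  # 9 11 11
print(raw == escaped)                      # True
print("\t" in plain)                       # True: the path was altered
```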