I have hundreds of XML files and I would like to parse them into CSV files. I have already coded this program.
To execute the Python program I use this command (in VS Code on Windows):
python ConvertXMLtoCSV.py -i Alarm120.xml -o Alarm120.csv
My question is: how can I change this script to add some sort of for loop so that the program runs for each XML file?
UPDATE
If my files and folders are organized as in the picture:
I tried this and executed the .bat file on Windows 10, but it does nothing:
#!/bin/bash
for xml_file in XML_Files/*.xml
do
csv_file=${xml_file/.xml/.csv}
python ConvertXMLtoCSV.py -i XML_Files/$xml_file -o CSV_Files/$csv_file
done
Ideally the for loop would be included inside your ConvertXMLtoCSV.py itself. You can use this to find all xml files in a given directory:
for file in os.listdir(directory_path):
    if file.endswith(".xml"):
        # And here you can do your conversion
You could change the arguments given to the script to be the path of the directory the xml files are located in and the path for an output folder for the .csv files. For renaming, you can leave the files with the same name but give the .csv extension. i.e.
csv_name = file.replace(".xml", ".csv")
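For example, a minimal sketch of what the integrated script could look like (the argparse argument names and the convert_one_file() helper are just assumptions to illustrate the structure, not your actual code):

import argparse
import os

# Hypothetical arguments: an input folder of .xml files and an output folder for .csv files
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--input_dir", help="folder containing the .xml files")
parser.add_argument("-o", "--output_dir", help="folder to write the .csv files to")
args = parser.parse_args()

for file in os.listdir(args.input_dir):
    if file.endswith(".xml"):
        csv_name = file.replace(".xml", ".csv")
        xml_path = os.path.join(args.input_dir, file)
        csv_path = os.path.join(args.output_dir, csv_name)
        # convert_one_file(xml_path, csv_path)  # placeholder for your existing conversion logic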
If you want to keep your Python script as-is (process one file), and add the looping externally in bash, you could do:
#!/bin/bash
for xml_file in *.xml
do
csv_file=${xml_file/.xml/.csv}
python ConvertXMLtoCSV.py -i "$xml_file" -o "$csv_file"
done
After discussion, it appears that you want to use an external script so that the original ConvertXMLtoCSV.py stays unmodified (as required by other projects), and that, although you tagged the question with bash, you were not in fact able to invoke Python from bash when you tried it in your setup.
This being the case, it is possible to adapt Rolv Apneseth's answer so that you do the looping in Python, but inside a separate script (let's suppose it is called convert_all.py), which then runs the unmodified ConvertXMLtoCSV.py as an external process. This way, ConvertXMLtoCSV.py remains set up to process only one file each time it is run.
To call an external process, you could either use os.system or subprocess.Popen, so here are two options.
Using os.system:
import os
import sys

directory_path = sys.argv[1]

for file in os.listdir(directory_path):
    if file.endswith(".xml"):
        csv_name = file.replace(".xml", ".csv")
        # os.listdir() returns bare file names, so join them with the directory
        # so the command works no matter where convert_all.py is run from
        xml_path = os.path.join(directory_path, file)
        csv_path = os.path.join(directory_path, csv_name)
        os.system(f'python ConvertXMLtoCSV.py -i {xml_path} -o {csv_path}')
Note: for versions of Python too old to support f-strings, that last line could be changed to
os.system('python ConvertXMLtoCSV.py -i {} -o {}'.format(xml_path, csv_path))
Using subprocess.Popen:
import os
import subprocess
import sys

directory_path = sys.argv[1]

for file in os.listdir(directory_path):
    if file.endswith(".xml"):
        csv_name = file.replace(".xml", ".csv")
        xml_path = os.path.join(directory_path, file)
        csv_path = os.path.join(directory_path, csv_name)
        p = subprocess.Popen(['python', 'ConvertXMLtoCSV.py',
                              '-i', xml_path,
                              '-o', csv_path])
        p.wait()
You could then run it using some command such as:
python convert_all.py C:/Users/myuser/Desktop/myfolder
or whatever the folder is where you have the XML files.
Related
I have a script in a different folder: a/b/c/my_script.py
I need to execute it in x/y/z/ so that my_script.py will take the path of x/y/z/my_file.txt, modify it and open the same file in another folder: x/y/q/w/e/my_file.txt
How can I do that in Python?
I can do
# my_script.py
import sys
argv = sys.argv[1]
# change the path
# open the file in the other path
then call my_script.py this way:
pwd
x/y/z/
python a/b/c/my_script.py $(pwd) my_file.txt
but I wonder if there's a better way.
Note: I've tried using a bash function, but parsing the strings there seemed more complicated.
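One possible sketch (just an assumption based on the folders in your example: the script resolves paths from its current working directory, so you only pass the file name, not $(pwd)):

# my_script.py (sketch)
import sys
from pathlib import Path

filename = sys.argv[1]                                 # e.g. my_file.txt
src = Path.cwd() / filename                            # x/y/z/my_file.txt when run from x/y/z/
dst = Path.cwd().parent / "q" / "w" / "e" / filename   # x/y/q/w/e/my_file.txt

text = src.read_text()
# ... modify the text here ...
dst.parent.mkdir(parents=True, exist_ok=True)
dst.write_text(text)

You would then call it from x/y/z/ as python a/b/c/my_script.py my_file.txt.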
I'm kind of new to the Python world and I'm having some issues running a bash file that is created automatically by my Python script (using Linux).
I set up my Python script to create both a text file (.geo) and a bash file (.sh) in a directory somewhere on my Desktop, like this:
basedirectory="/home/pst2/Desktop/";
*//Writing the .geo file*
file = open(basedirectory+nomdossier+"/"+nomfichier+".geo", 'w');
file.write
..blabla
..blabla
file.close();
//Writing the .sh file
file = open(basedirectory+nomdossier+"/"+nomfichier+".sh", 'w');
file.write
..blabla
..blabla
file.close();
Now at this point my script works perfectly, with all the variables set up and working fine, and both of the files I created end up in this directory (for example, after running the Python script and entering the variables):
/home/pst2/Desktop/test/
(and in there you will find the new test.geo and test.sh that were created via the Python script).
Basically, when test.sh is executed "manually" with bash test.sh (whenever I am in its directory on Ubuntu), it creates another file called test.msh in the same directory,
and I can't seem to find the right code, using the subprocess module, to execute the newly created test.sh file automatically from the script.
Is there a way to do so, for example by indicating the absolute path to the .sh file
(in our case basedirectory+nomdossier+"/"+nomfichier+".sh")?
Take a look at the os module.
I believe
os.system("command_line_with_args")
could be what you're looking for.
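For example (a rough, untested sketch using the example path from the question; adjust it to the real script location):

import os

# Run the generated script through bash; the path is the example one from the question
os.system("bash /home/pst2/Desktop/test/test.sh")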
Not sure what you are writing into the .sh file.
But to start with:
Have you started your .sh-file with the hashbang? #!/bin/sh
Have you made your file executable? chmod +x
After you have done this, you should be able to use subprocess module and do something like the example from the manual for subprocess:
subprocess.call([path_to_script+'/script.sh'])
I might have to update this answer if & when new information comes to my attention
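For instance, a rough sketch that combines those steps, reusing the basedirectory, nomdossier and nomfichier variables from the question (untested, and the shebang/contents are placeholders):

import os
import stat
import subprocess

script_path = basedirectory + nomdossier + "/" + nomfichier + ".sh"

# Write the script, starting with the shebang line
with open(script_path, "w") as f:
    f.write("#!/bin/sh\n")
    # ... write the rest of the script here ...

# Mark it executable (equivalent of chmod +x)
st = os.stat(script_path)
os.chmod(script_path, st.st_mode | stat.S_IEXEC)

# Run it, with the working directory set to the folder that contains it
subprocess.call([script_path], cwd=basedirectory + nomdossier)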
Roughly equivalent to "manually" executing bash test.sh with the current directory being the one where test.sh has been written by your posted code is:
from subprocess import call
call(['bash', 'test.sh'], cwd=basedirectory+nomdossier)
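If you would rather point at the absolute path you built in your script (as you asked), something like this should be equivalent (an untested variant using your variables):

from subprocess import call

script_path = basedirectory + nomdossier + "/" + nomfichier + ".sh"
call(['bash', script_path], cwd=basedirectory + nomdossier)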
I have been trying to use the output of a system command as part of the command in the next portion. However, I cannot seem to join it up properly and hence cannot run the second command properly. The OS used is Kali Linux with Python 2.7.
#IMPORTS
import commands, os, subprocess

os.system('mkdir -p ~/Desktop/TOOLS')
checkdir = commands.getoutput('ls ~/Desktop')

if 'TOOLS' in checkdir:
    currentwd = subprocess.check_output('pwd', shell=True)
    cmd = 'cp -R {}/RAW ~/Desktop/TOOLS/'.format(currentwd)
    os.system(cmd)
    os.system('cd ~/Desktop/TOOLS')
    os.system('pwd')
The errors are:
cp: missing destination file operand after ‘/media/root/ARSENAL’
Try 'cp --help' for more information.
sh: 2: /RAW: not found
/media/root/ARSENAL
It seems that the reading of the first command is alright but it can't join with the RAW portion. I have read many other solutions, but they seem to be for shell scripting instead.
Assuming you haven't called os.chdir() anywhere prior to the cp -R, then you can use a relative path. Changing the code to...
if 'TOOLS' in checkdir:
    cmd = 'cp -R RAW ~/Desktop/TOOLS'
    os.system(cmd)
...should do the trick.
Note that the line...
os.system('cd ~/Desktop/TOOLS')
...will not do what you expect. os.system() spawns a subshell, so it will just change the working directory for that process and then exit. The calling process's working directory will remain unchanged.
If you want to change the working directory for the calling process, use...
os.chdir(os.path.expanduser('~/Desktop/TOOLS'))
However, Python has all this functionality built-in, so you can do it without spawning any subshells...
import os, shutil
# Specify your path constants once only, so it's easier to change
# them later
SOURCE_PATH = 'RAW'
DEST_PATH = os.path.expanduser('~/Desktop/TOOLS/RAW')
# Recursively copy the files, creating the destination path if necessary.
shutil.copytree(SOURCE_PATH, DEST_PATH)
# Change to the new directory
os.chdir(DEST_PATH)
# Print the current working directory
print os.getcwd()
I have a Python script which takes a filename as a command-line argument and processes that file. However, I have thousands of files to process, and I would like to run the script on every file without having to add the filename as the argument each time.
for example:
process.py file1 will do exactly what I want
however, I want to run process.py on a folder containing thousands of files (file1, file2, file3, etc.)
I have found out that it can be done simply in Bash:
for f in *; do python myscript.py $f; done
However, I am on windows and don't want to install something like Cygwin.
What would a piece of code for the Windows command line look like that would emulate what the above Bash code accomplishes?
for %%f in (*.py) do (
start %%f
)
I think that'll work -- I don't have a Windows box handy at the moment to try it
How to loop through files matching wildcard in batch file
That link might help
import os, subprocess

for f in os.listdir('.'):
    if os.path.isfile(f):
        subprocess.call(["python", "myscript.py", f])
This solution will work on every platform, provided the python executable is in the PATH.
Also, if you want to recursively process files in nested subdirectories, you can use os.walk() instead of os.listdir()+os.path.isfile().
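For example, a sketch of the os.walk() variant (untested; same assumptions as the snippet above):

import os, subprocess

# Walk the whole directory tree and run myscript.py on every regular file found
for dirpath, dirnames, filenames in os.walk('.'):
    for name in filenames:
        subprocess.call(["python", "myscript.py", os.path.join(dirpath, name)])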
Since you have python, why not use that?
import subprocess
import glob
import sys
import os.path
for fname in glob.iglob(os.path.join('some-directory-name', '*')):
    proc = subprocess.Popen([sys.executable, 'myscript.py', fname])
    proc.wait()
What's more, it's portable.
For each file in the current dir:
for %f in (*) do C:\Python34\python.exe "%f"
Update:
Note the quotes on the %f. You need them if your files contain spaces in the name. You can also put any path+executable after the do.
If we imagine your files look like:
./process.py
./myScripts/file1.py
./myScripts/file2.py
./myScripts/file3.py
...
In your example, would simply be:
for %f in (.\myScripts\*) do process.py "%f"
This would invoke:
process.py ".\myScripts\file1.py"
process.py ".\myScripts\file2.py"
process.py ".\myScripts\file3.py"
I have to use cURL on Windows from a Python script. My goal is: using a Python script, get all files from a remote directory, preferably into a local directory. After that I will compare each file with the files stored locally. I am able to get one file at a time, but I need to get all of the files from the remote directory.
Could someone please advise how to get multiple files?
I use this command:
curl.exe -o file1.txt sftp:///dir1/file1.txt -k -u user:password
thanks
I haven't tested this, but I think you could just try launching each shell command as a separate process to run them simultaneously. Obviously, this might be a bad idea if you have a large set of files, so you might need to manage that more carefully. Here's some untested code, and you'd need to edit the 'cmd' variable in the get_file function, of course.
from multiprocessing import Process
import subprocess

def get_file(filename):
    cmd = '''curl.exe -o {} sftp:///dir1/{} -k -u user:password'''.format(filename, filename)
    subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)  # run the shell command

files = ['file1.txt', 'file2.txt', 'file3.txt']

if __name__ == '__main__':  # required on Windows so spawned children don't re-run this block
    for filename in files:
        p = Process(target=get_file, args=(filename,))  # create a process which passes filename to get_file()
        p.start()
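If the file list grows large, one way to keep the number of simultaneous transfers bounded is a multiprocessing.Pool (again untested; the pool size of 4 is an arbitrary choice), reusing the get_file() function above:

from multiprocessing import Pool

files = ['file1.txt', 'file2.txt', 'file3.txt']

if __name__ == '__main__':
    # Run at most 4 downloads at a time instead of one process per file
    with Pool(processes=4) as pool:
        pool.map(get_file, files)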