Passing wildcard arguments from bash into Python

I'm practicing Python by writing a simple script that takes a large series of files named A_B and writes them to the location B\A. I was passing the arguments to the script like this:
python script.py *
and my program looks like
from sys import argv
import os
import ntpath
import shutil
script, filename = argv
target = open(filename)
outfilename = target.name.split('_')
outpath=outfilename[1]
outpath+="/"
outpath+=outfilename[0]
if not os.path.exists(outfilename[1]):
    os.makedirs(outfilename[1])
shutil.copyfile(target.name, outpath)
target.close()
The problem is that the script, as currently written, accepts only one file at a time. I had originally hoped the wildcard would pass one file at a time to the script and run the script once per file.
My question covers both cases:
How could I instead pass the wildcard files one at a time to a script?
and
How do I modify this script to accept all the arguments? (I can handle list-ifying everything, but argv is what I'm having problems with, and I'm a bit unsure how to create a list of files.)

You have two options, both of which involve a loop.
To pass the files one by one, use a shell loop:
for file in *; do python script.py "$file"; done
This will invoke your script once for every file matching the glob *.
To process multiple files in your script, use a loop there instead:
from sys import argv
for filename in argv[1:]:
    # rest of script
Then call your script from bash as python script.py * to pass all the files as arguments. argv[1:] is a list slice: it returns a new list containing the elements of argv from index 1 to the end, which skips the script name stored in argv[0].
I would suggest the latter approach as it means that you are only invoking one instance of your script.
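As a rough sketch of the second option applied to the script from the question, the loop body could be wrapped in a function. The function name copy_into_suffix_dir is made up for illustration, and two choices here are assumptions rather than part of the original script: names are split on the first underscore only, and anything that is not a regular file or has no underscore is skipped.

```python
import os
import shutil
import sys


def copy_into_suffix_dir(filename):
    """Copy a file named A_B to B/A and return the destination path."""
    directory, name = os.path.split(filename)
    # Assumption: split on the first underscore only, so "A_B_C" -> ("A", "B_C").
    prefix, sep, suffix = name.partition("_")
    if not sep or not prefix or not suffix:
        return None  # the name does not follow the A_B pattern; skip it
    target_dir = os.path.join(directory, suffix)
    os.makedirs(target_dir, exist_ok=True)
    destination = os.path.join(target_dir, prefix)
    shutil.copyfile(filename, destination)
    return destination


if __name__ == "__main__":
    # argv[0] is the script name; everything after it is a file to process.
    for filename in sys.argv[1:]:
        # The glob * also matches directories (including ones created by an
        # earlier run), so only regular files are copied.
        if os.path.isfile(filename):
            copy_into_suffix_dir(filename)
```

Called as python script.py *, this handles every matching file in a single run of the interpreter.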

Related

Passing a value to a python variable from batch script

I have the below python script which takes a user input that is eventually used by the program to read a particular file.
I want to execute the python program from batch script and pass the file_name in the batch script. Can someone please help?
file_name = input("Input File Name to Compare: ")
path = ("outward\\" + file_name)
import sys
file_name = sys.argv[1]
path = "outward\\" + file_name
and you pass it to your script like:
$ python script.py filename.ext
To run the Python program from the batch script and pass the filename on the command line, you need the sys module, so:
import sys
sys.argv is a list of strings holding the arguments given on the command line, which you can use as input for your program. sys.argv[0] is the name of the script itself, and sys.argv[1] is the first command-line argument (a string) given to the script.
Lists are indexed from zero, so you get individual items with the syntax [0], [1], and so on. To get the filename, take the first argument after the script name:
file_name = sys.argv[1]
path = "outward\\" + file_name
Next, pass the filename to your script like this:
$ python your_script.py filename.ext
(do not confuse .ext with .text)
Your complete code for the solution to the question, quite simply, will be:
import sys
file_name = sys.argv[1]
path = "outward\\" + file_name
and
$ python your_script.py filename.ext
You could use sys.argv, Python's list of command-line arguments.
The idea would be to get the input from the user in the batch file, then execute the Python program from the batch file while passing the user input as a command line argument to the python file. Example:
batchfile.bat
@echo off
set /p file="Enter Filename: "
python /path/to/program/pythonprogram.py "%file%"
pythonprogram.py
import sys
file_name = sys.argv[1]
path = ("outward\\" + file_name)
Now, when you execute the batch file, it prompts the user for a filename and then runs the Python program, passing the filename as a command-line argument. The Python program reads that argument from the sys.argv list.
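One thing the snippets above leave out is what happens when the script is started without an argument: sys.argv[1] then raises an IndexError. A minimal sketch of a guarded version follows. The helper name build_path and the usage message are illustrative, and os.path.join is used here in place of the hard-coded "outward\\" prefix.

```python
import os
import sys


def build_path(argv):
    """Return the path under 'outward' for the filename in argv[1], or None."""
    # argv[0] is the script itself; the first real argument is argv[1].
    if len(argv) < 2:
        return None
    return os.path.join("outward", argv[1])


if __name__ == "__main__":
    path = build_path(sys.argv)
    if path is None:
        print("usage: python pythonprogram.py FILENAME")
    else:
        print(path)
```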

apply command to list of files in python

I've a tricky problem. I need to apply a specific command called xRITDecompress to a list of files with extension -C_ and I should do this with Python.
Unfortunately, this command doesn't work with wildcards and I can't do something like:
os.system("xRITDecompress *-C_")
In principle, I could write an auxiliary bash script with a for cycle and call it inside my python program. However, I'd like not to rely on auxiliary files...
What would be the best way to do this within a python program?
You can use glob.glob() to get the list of files on which you want to run the command and then for each file in that list, run the command -
import glob
import os
for f in glob.glob('*-C_'):
    os.system('xRITDecompress {}'.format(f))
From documentation -
The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell.
If by _ (underscore) you meant to match any single character, use ? instead, like this:
glob.glob('*-C?')
Note that this pattern only searches the current directory, but judging from your original attempt, that may be what you want.
You may also want to look at the subprocess module, a more powerful module for running commands (spawning processes). Example:
import subprocess
import glob
for f in glob.glob('*-C_'):
    subprocess.call(['xRITDecompress', f])
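Building on the subprocess example, the sketch below also records which files the command failed on. It assumes Python 3.5 or later (for subprocess.run) and assumes that the command signals failure through a non-zero exit status, which has not been checked for xRITDecompress. The function name run_on_matches is made up for illustration.

```python
import glob
import subprocess


def run_on_matches(command, pattern):
    """Run `command <file>` for each file matching pattern; return failures."""
    failures = []
    for path in sorted(glob.glob(pattern)):
        # Passing a list bypasses the shell, so spaces in file names are safe.
        result = subprocess.run(list(command) + [path])
        if result.returncode != 0:
            failures.append(path)
    return failures
```

For the case in the question this would be called as run_on_matches(['xRITDecompress'], '*-C_').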
You can use glob.glob or glob.iglob to get files that match the given pattern:
import glob
import os
files = glob.iglob('*-C_')
for f in files:
    os.system("xRITDecompress %s" % f)
Just use glob.glob to search and os.system to execute
import os
from glob import glob
for file in glob('*-C_'):
    os.system("xRITDecompress %s" % file)
I hope it satisfies your question

Apply function to all files in directory

I have written a function for data processing and need to apply it to a large amount of files in a directory.
The function works when applied to individual files.
def getfwhm(x):
    import numpy as np
    st=np.std(x[:,7])
    fwhm=2*np.sqrt(2*np.log(2))*st
    file=open('all_fwhm2.txt', 'at')
    file.write("fwhm = %.6f\n" % (fwhm))
    file.close()
    file=open('all_fwhm2.txt', 'rt')
    print file.read()
    file.close()
I now want to use this in a larger scale. So far I have written this code
import os
import fwhmfunction
files=os.listdir(".")
fwhmfunction.getfwhm(files)
But I get the following error
File "fwhmfunction.py", line 11, in getfwhm
st=np.std(x[:,7])
TypeError: list indices must be integers, not tuple
I am writing in python using spyder.
Thanks for your help!
In the spirit of Unix, you should split the program in two:
1. the program which acts on a given file
2. the program which applies a given script to a given list of files (glob, or whatever)
So here's a sample of 1:
# name: script.py
import sys
File = sys.argv[1]
# do something here
print File
(it is better to use argparse to parse the args, but we used argv to keep it simple)
As to the second part, there is an excellent unix tool already:
$ find . -maxdepth 1 -mindepth 1 -name "*.txt" | parallel python2.7 script.py {}
Here you get an extra bonus: parallel task execution.
If you are on Windows, you can write something simple (sequential) in Python:
# name: apply_script_to_glob.py
import sys, os
from glob import glob
Script = sys.argv[1]
Glob = sys.argv[2]
Files = glob(Glob)
for File in Files:
    os.system("python2.7 " + Script + " " + File)
(again, we skipped argparse and any error checking to keep it simple). You'd call the script with
$ python2.7 apply_script_to_glob.py "script.py" "*.txt"
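As for the TypeError itself: os.listdir returns a list of file names, while getfwhm indexes its argument with x[:,7], which only works on a two-dimensional array. Each file therefore has to be read first and processed on its own. The following is a standard-library sketch for Python 3, not the asker's NumPy function. It assumes whitespace-separated numeric files without header lines, and it uses statistics.pstdev, which is the population standard deviation like the default of np.std. The names fwhm_of_column and fwhm_for_files are made up for illustration.

```python
import glob
import math
import statistics


def fwhm_of_column(path, column=7):
    """Return the FWHM of one column of a whitespace-separated text file."""
    values = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank or short lines that lack the requested column.
            if len(fields) > column:
                values.append(float(fields[column]))
    # Population standard deviation, matching np.std's default (ddof=0).
    sigma = statistics.pstdev(values)
    return 2 * math.sqrt(2 * math.log(2)) * sigma


def fwhm_for_files(pattern):
    """Map each file matching the glob pattern to its FWHM."""
    return {path: fwhm_of_column(path) for path in sorted(glob.glob(pattern))}
```

With NumPy, the equivalent would be to load each file in the loop (for example with np.loadtxt) and pass the resulting array to the existing getfwhm.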

UNIX shell script to call python

I have a python script that runs on three files in the following way
align.py *.wav *.txt *.TextGrid
However, I have a directory full of files that I want to loop through. The original author suggests creating a shell script to loop through the files.
The tricky part about the loop is that I need to match three files at a time with three different extensions for the script to run correctly.
Can anyone help me figure out how to create a shell script to loop through a directory of files, match three of them according to name (with three different extensions) and run the python script on each triplet?
Thanks!
Assuming you're using bash, here is a one-liner:
for f in *.wav; do align.py "$f" "${f%.*}.txt" "${f%.*}.TextGrid"; done
You could use glob.glob to list only the wav files, then construct the subprocess.Popen call like so:
import glob
import os
import subprocess
for wav_name in glob.glob('*.wav'):
    basename, ext = os.path.splitext(wav_name)
    txt_name = basename + '.txt'
    grid_name = basename + '.TextGrid'
    proc = subprocess.Popen(['align.py', wav_name, txt_name, grid_name])
    proc.communicate()
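Neither version above checks that the .txt and .TextGrid companions actually exist. A possible variation, assuming Python 3 with pathlib, collects only the complete triplets before running anything. The names find_triplets and align_all are illustrative, and invoking align.py directly assumes it is executable and on the PATH, as in the examples above.

```python
import subprocess
from pathlib import Path


def find_triplets(directory):
    """Return (wav, txt, TextGrid) path triples that share a base name."""
    triplets = []
    for wav in sorted(Path(directory).glob("*.wav")):
        txt = wav.with_suffix(".txt")
        grid = wav.with_suffix(".TextGrid")
        # Keep the triple only when both companion files exist.
        if txt.is_file() and grid.is_file():
            triplets.append((wav, txt, grid))
    return triplets


def align_all(directory, script="align.py"):
    """Run the alignment script once per complete triplet."""
    for wav, txt, grid in find_triplets(directory):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run([script, str(wav), str(txt), str(grid)], check=True)
```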

Passing arguments with wildcards to a Python script

I want to do something like this:
c:\data\> python myscript.py *.csv
and pass all of the .csv files in the directory to my python script (such that sys.argv contains ["file1.csv", "file2.csv"], etc.)
But sys.argv just receives ["*.csv"] indicating that the wildcard was not expanded, so this doesn't work.
I feel like there is a simple way to do this, but can't find it on Google. Any ideas?
You can use the glob module, that way you won't depend on the behavior of a particular shell (well, you still depend on the shell not expanding the arguments, but at least you can get this to happen in Unix by escaping the wildcards :-) ).
from glob import glob
filelist = glob('*.csv') #You can pass the sys.argv argument
In Unix, the shell expands wildcards, so programs get the expanded list of filenames. Windows doesn't do this: the shell passes the wildcards directly to the program, which has to expand them itself.
Vinko is right: the glob module does the job:
import glob, sys
for arg in glob.glob(sys.argv[1]):
    print "Arg:", arg
If your script is a utility, I suggest defining a function like this in your .bashrc so you can call it from any directory:
myscript() {
    python /path/myscript.py "$@"
}
Then the whole list is passed to your python and you can process them like:
import sys
for _file in sys.argv[1:]:
    print(_file)  # do something with the file
If you have multiple wildcard items passed in (e.g. python myscript.py *.csv *.txt), then glob(sys.argv[1]) may not cut it. You may need something like the following.
import sys
from glob import glob
args = [f for l in sys.argv[1:] for f in glob(l)]
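This also works when some arguments contain no wildcard characters (for example, python myscript.py abc.txt *.csv anotherfile.dat), as long as those files exist: glob returns an empty list for a name that matches no file, so a missing file is silently dropped.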
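Because glob returns an empty list for any argument that matches no existing file, the list comprehension above quietly discards such arguments. If the script should instead see them and report the missing file itself, a small variation keeps the literal argument. The function name expand_args is made up for illustration.

```python
import glob
import sys


def expand_args(args):
    """Expand wildcard arguments; keep arguments that match nothing as-is."""
    expanded = []
    for arg in args:
        matches = sorted(glob.glob(arg))
        # glob returns [] for a pattern or plain name that matches no file,
        # so fall back to the literal argument instead of dropping it.
        expanded.extend(matches if matches else [arg])
    return expanded


if __name__ == "__main__":
    for name in expand_args(sys.argv[1:]):
        print(name)
```

This behaves the same whether or not the shell has already expanded the wildcards, since expanding an existing plain file name returns that name unchanged.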
