bash program with argument from terminal - python

I am trying to write a bash script that runs a Python program which takes a file name and prints values to the terminal. My bash script should take three arguments from the terminal: first the Python program name, second the folder name, and third the file name where I want to store the output of my Python program.
#!/bin/bash
directoryname = "$1"
programname = "$2"
newfilename ="$3"
for file in directoryname
do
python3 programname "$file" >> newfilename
done
and I am executing the program as follows:
./myscript.sh mypython.py /home/data myfile.txt
but it gives the following errors:
./myscript.sh: line 2: directoryname: command not found
./myscript.sh: line 3: programname: command not found
./myscript.sh: line 4: newfilename: command not found
Please help me with this. I am pretty new to bash scripting.

Change to:
#!/bin/bash
directoryname="$1"
programname="$2"
newfilename="$3"
for file in $directoryname
do
python3 "$programname" "$file" >> "$newfilename"
done
No spaces around the =. A variable is referenced with a $ before its name, and in general a variable expansion is better quoted ("$var" rather than $var). Also double-check the argument order: your example invocation passes the program name first, so you would want programname="$1" and directoryname="$2".
Also, I assume you want the list of files inside the directory, but $directoryname expands to only the directory path itself (e.g. /home/user/). If so, you will need:
for file in "$directoryname"/*
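A quick self-contained sketch of both rules (the directory path here is made up for the demo):

```shell
#!/bin/bash
# No spaces around "=": writing `directoryname = value` would make bash
# run `directoryname` as a command with "=" and the value as arguments.
directoryname="$(mktemp -d)/demo dir"   # a value containing a space
mkdir -p "$directoryname"
touch "$directoryname/a.txt"

# Quoting the expansion keeps the path together as a single word,
# and the /* glob lists the files inside the directory.
for file in "$directoryname"/*; do
    echo "processing $file"
    found="$file"
done
```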

The unnecessary spaces between the =, the variable names, and the values are causing the errors; remove them, and add a $ (with quotes) where the variables are used:
#!/bin/bash
directoryname="$1"
programname="$2"
newfilename="$3"
for file in "$directoryname"/*
do
python3 "$programname" "$file" >> "$newfilename"
done

Related

How to create a script to read a file line by line and concatenate them into a string? (Python or Bash)

I am trying to create a script currently that automates a command multiple times. I have a text file containing links to directories/files that is formatted line by line vertically. An example would be:
mv (X) /home/me
The X variable would change for every line in the directory/file text document. The script would execute the same command but change X each time. How would I go about doing this? Can someone point me in the right direction?
I appreciate the help!
Thanks a bunch!
That's a job for xargs:
xargs -d '\n' -I{} mv {} /path < file
xargs reads standard input and, for each newline-delimited element, substitutes it for the {} placeholder and executes mv.
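A self-contained demo of that pattern (GNU xargs, since -d is a GNU extension; the paths are scratch files created just for the demo):

```shell
tmp=$(mktemp -d)
mkdir "$tmp/dest"
touch "$tmp/a b.txt" "$tmp/c.txt"                            # note the embedded space
printf '%s\n' "$tmp/a b.txt" "$tmp/c.txt" > "$tmp/list.txt"

# -d '\n' splits the input on newlines only, so paths containing spaces
# survive intact; -I{} substitutes each line for the {} placeholder.
xargs -d '\n' -I{} mv {} "$tmp/dest" < "$tmp/list.txt"
```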
import os

command = "mv {path} /home/me"  # your command example; the {path} will be replaced with the path
with open("path_to_file_list.txt", "r") as file:
    # assuming each line in the file is a path to a target file; .strip() clears the newlines
    paths = [s.strip() for s in file.readlines()]
for path in paths:
    os.system(command.format(path=path))  # run each command, replacing {path} with each file path
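A variant of the same idea that avoids handing each path to a shell; the list file and destination below are stand-ins created just for the demo:

```python
import os
import subprocess
import tempfile

# Build a throwaway layout: two files plus a text file listing their paths.
tmp = tempfile.mkdtemp()
dest = os.path.join(tmp, "dest")
os.mkdir(dest)
listing = os.path.join(tmp, "files.txt")
with open(listing, "w") as f:
    for name in ("a.txt", "b.txt"):
        path = os.path.join(tmp, name)
        open(path, "w").close()
        f.write(path + "\n")

# An argument list (no shell) means paths with spaces or metacharacters
# are passed through literally instead of being parsed as shell code.
with open(listing) as f:
    for line in f:
        path = line.strip()
        if path:
            subprocess.call(["mv", path, dest])
```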
while IFS= read -r x; do
mv "$x" /home/me/
done < file.txt

Ignoring a specific file in a shell script for loop

My shell script loops through a given directory and runs a python script on each file. I want it to ignore one specific file, export_config.yaml, which is currently picked up by the for loop and prints an error to the terminal; I want it skipped entirely.
yamldir=$1
for yaml in ${yamldir}/*.yaml; do
if [ "$yaml" != "export_config.yaml" ]; then
echo "Running export for $yaml file...";
python jsonvalid.py -p ${yamldir}/export_config.yaml -e $yaml -d ${endDate}
wait
fi
done
The string comparison in your if statement never matches export_config.yaml because each iteration of the for loop assigns the entire relative file path (i.e. "$yamldir/export_config.yaml", not just "export_config.yaml") to $yaml.
First Option: Changing your if statement to reflect that should correct your issue:
if [ "$yaml" != "${yamldir}/export_config.yaml" ]; then
#etc...
Another option: Use basename to grab only the terminal path string (ie the filename).
for yaml in ${yamldir}/*.yaml; do
yaml=$(basename "$yaml")
if [ "$yaml" != "export_config.yaml" ]; then
#etc...
Third option: You can do away with the if statement entirely by doing your for loop like this instead:
for yaml in $(ls ${yamldir}/*.yaml | grep -v "export_config.yaml"); do
By piping the output of ls to grep -v, you exclude any line containing export_config.yaml from the directory listing in the first place. (Be aware that parsing ls output breaks on file names containing whitespace, so prefer one of the first two options if that can happen.)
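For comparison, the basename approach can be exercised end to end; the directory and files below are scratch stand-ins for $yamldir:

```shell
yamldir=$(mktemp -d)
touch "$yamldir/a.yaml" "$yamldir/export_config.yaml"

processed=""
for yaml in "$yamldir"/*.yaml; do
    # Compare only the final path component, so the directory prefix
    # never prevents the match.
    [ "$(basename "$yaml")" = "export_config.yaml" ] && continue
    echo "Running export for $yaml file..."
    processed="$processed $yaml"
done
```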

Chain of UNIX commands within Python

I'd like to execute the following UNIX command in Python:
cd 2017-02-10; pwd; echo missing > 123.txt
The date directory DATE = 2017-02-10 and OUT = 123.txt are already variables in Python, so I have tried variations of
call("cd", DATE, "; pwd; echo missing > ", OUT)
using the subprocess.call function, but I’m struggling to find documentation on running multiple UNIX commands at once, which are normally separated by ; or combined with redirection such as >.
Doing the commands on separate lines in Python doesn’t work either, because each call “forgets” what was executed on the previous line and essentially resets.
You can pass a shell script as a single argument, with strings to be substituted as out-of-band arguments, as follows:
import subprocess

date = '2017-02-10'
out = '123.txt'
subprocess.call(
    ['cd "$1"; pwd; echo missing >"$2"',  # shell script to run
     '_',    # $0 for that script
     date,   # $1 for that script
     out,    # $2 for that script
    ], shell=True)
This is much more secure than substituting your date and out values into a string which is evaluated by the shell as code, because these values are treated as literals: A date of $(rm -rf ~) will not in fact try to delete your home directory. :)
Doing the commands on separate lines in Python doesn’t work either
because it “forgets” what was executed on the previous line and
essentially resets.
This is because if you have separate calls to subprocess.call it will run each command in its own shell, and the cd call has no effect on the later shells.
One way around that would be to change the directory in the Python script itself before doing the rest. Whether or not this is a good idea depends on what the rest of the script does. Do you really need to change directory? Why not just write "missing" to 2017-02-10/123.txt from Python directly? Why do you need the pwd call?
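If the directory change really is needed, subprocess can handle it through its cwd argument; here is a sketch with a temporary directory standing in for 2017-02-10:

```python
import os
import subprocess
import tempfile

date_dir = tempfile.mkdtemp()  # stand-in for the "2017-02-10" directory

# cwd= runs the whole shell snippet with that working directory,
# so no separate cd command is needed.
subprocess.call("pwd; echo missing > 123.txt", shell=True, cwd=date_dir)
```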
Assuming you're looping through a list of directories and want to output the full path of each and also create files with "missing" in them, you could perhaps do this instead:
import os

base = "/path/to/parent"
for DATE, OUT in [["2017-02-10", "123.txt"], ["2017-02-11", "456.txt"]]:
    date_dir = os.path.join(base, DATE)
    print(date_dir)
    out_path = os.path.join(date_dir, OUT)
    out = open(out_path, "w")
    out.write("missing\n")
    out.flush()
    out.close()
The above could use some error handling in case you don't have permission to write to the file or the directory doesn't exist, but your shell commands don't have any error handling either.
>>> date = "2017-02-10"
>>> command = "cd " + date + "; pwd; echo missing > 123.txt"
>>> import os
>>> os.system(command)

How can I get the variable that a python file with a path prints when I call that python file from a shell script?

For some reason when I run this in my shell script only $exec_file and $_dir get passed into module_test.py, but neither $filename nor $modified_file get passed.
mod_test=$( /path/module_test.py $modified_file $_dir $filename )
(path is just a normal path that I decided to shorten for the sake of this example)
Am I typing this wrong? I am trying to get the output (an integer) of my module_test.py to be put into the variable mod_test.
My variables are:
modified_file = _File
_dir = /path to directory/
file = _File.py
Based on your example, you need to surround $_dir with double quotes because it contains spaces; single quotes would pass the literal string $_dir instead of its value. Quoting the other expansions is a good habit too:
mod_test=$( /path/module_test.py "$modified_file" "$_dir" "$filename" )
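A minimal sketch of the difference the quote style makes for a value containing spaces:

```shell
_dir="/path to directory/"

single=$(printf '%s' '$_dir')   # single quotes: the literal string $_dir
double=$(printf '%s' "$_dir")   # double quotes: the value, kept as one word
```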

feed a command a comma separated list of file names in a directory, extract a variable motif from file names for labels

I have a directory containing files that look like this:
1_reads.fastq
2_reads.fastq
89_reads.fastq
42_reads.fastq
I would like to feed a comma-separated list of these file names to a command in a python program, so the input to the python command would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq
Furthermore, I'd like to use the numbers in the file names for a labeling function within the python command such that the input would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq -t s1,s2,s89,s42
It's important that the file names and the label IDs are in the same order.
First: This is a very poorly-thought-out calling convention. Don't use it.
However, if you're using software someone else wrote that already has that convention baked in...
#!/bin/bash
IFS=, # use comma as separator
files=( [[:digit:]]*_* )
[[ -e $files || -L $files ]] || { echo "ERROR: No files matching glob exist" >&2; exit 1; }
prefixes=( )
for file in "${files[@]}"; do
prefixes+=( "s${file%%_*}" )
done
# "exec" only if this is the last command in the script; remove otherwise
exec program.py -i "${files[*]}" -t "${prefixes[*]}"
How this works:
IFS=, causes ${array[*]} to put a comma between each expanded element. Thus, expanding ${files[*]} and ${prefixes[*]} creates comma-separated strings with the contents of each array.
${file%%_*} removes everything after the first _ in a filename, allowing the numbers alone to be extracted.
[[ -e $files || -L $files ]] actually only tests whether the first element in that array exists (as a symlink or otherwise); however, this will always be true if the glob being expanded to form the array matched any files (unless files have been deleted between the two lines' invocation).
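The two expansions at the heart of this can be sketched in isolation (file names invented for the demo):

```shell
files=( 1_reads.fastq 89_reads.fastq 42_reads.fastq )

# "${files[*]}" joins the elements with the first character of IFS.
IFS=,
joined="${files[*]}"

# ${file%%_*} strips everything from the first "_" onward.
prefixes=( )
for file in "${files[@]}"; do
    prefixes+=( "s${file%%_*}" )
done
labels="${prefixes[*]}"
```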
Try this:
program.py $(cd DIR && var=$(ls) && echo $var | tr ' ' ',')
That will pass program.py the string produced by the command line inside the $(..).
That command line will: enter your directory, run ls and store the output in a variable (which replaces the newlines with single spaces and adds no trailing space), then echo that variable to tr, which translates the spaces to commas.
It can be done easily in pure Bash. Make sure you run from within the directory that contains the files.
#!/bin/bash
shopt -s extglob nullglob
# Create an array of files
f=( +([[:digit:]])_reads.fastq )
# Check that there are some files...
if ((${#f[@]}==0)); then
echo "No files found. Exiting."
exit
fi
# Create an array of labels, directly from the array f:
# Remove trailing _reads.fastq
l=( "${f[@]%_reads.fastq}" )
# And prepend the letter s
l=( "${l[@]/#/s}" )
# Now the arrays f and l are good: check them:
declare -p f l
# To join the arrays, we'll use eval. Safe because the code is single-quoted!
IFS=, eval 'program.py -i "${f[*]}" -t "${l[*]}"'
Note. The use of eval here is perfectly safe as we're passing a constant string (and it's actually an idiomatic way to join an array without using a subshell or a loop). Don't modify the command, in particular the single quotes.
Thanks to Charles Duffy who convinced me to add healthy comments about the use of eval
