I have a bash script that calls a python script with parameters.
In the bash script, I read a file that contains one row of parameters, each wrapped in double quotes, and then call the Python script with the line I read.
My problem is that Python receives the parameters split on spaces.
The line looks like this: "param_a" "Param B" "Param C"
Code Example:
Bash Script:
LINE=`cat $tmp_file`
id=`python /full_path/script.py $LINE`
Python Script:
print sys.argv[1]
print sys.argv[2]
print sys.argv[3]
Received output:
"param_a"
"Param
B"
Wanted output:
param_a
Param B
Param C
How can I send the parameters to the Python script the way I need?
Thanks!
What about
id=`python /full_path/script.py $tmp_file`
and
import sys
for line in open(sys.argv[1]):
    print(line)
?
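If the file's quoting follows shell rules, one way to make this approach produce the wanted output is to let Python split the line itself. This is only a minimal sketch, assuming Python 3 and the standard-library shlex module; the function names split_params and read_params are made up for illustration.

```python
import shlex


def split_params(line):
    """Split a line the way a POSIX shell would, honouring double quotes."""
    # '"param_a" "Param B"' becomes ['param_a', 'Param B']
    return shlex.split(line)


def read_params(path):
    """Read the parameter file at path and return its quoted fields as a list."""
    with open(path) as handle:
        return split_params(handle.read())
```

The script could then call read_params(sys.argv[1]) and use the returned list in place of sys.argv[1:].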
The issue is in how bash passes the arguments. Python has nothing to do with it.
So you have to solve all of this before sending it to Python. I decided to use awk and xargs for this (but xargs is the actual MVP here).
LINE=$(cat $tmp_file)
awk -v ORS="\0" -v FPAT='"[^"]+"' '{for (i=1;i<=NF;i++){print substr($i,2,length($i)-2)}}' <<<$LINE |
xargs -0 python ./script.py
First, $(..) is preferred over backticks because it is more readable. You are making a variable after all.
awk only reads from stdin or a file, but you can force it to read from a variable with <<<, also called a "here string".
With awk I loop over all fields (as defined by the regex in the FPAT variable) and print them without the surrounding "".
The output record separator I chose is the NUL character (-v ORS="\0"); xargs will split on this character.
xargs then parses the piped input, separating the arguments on NUL characters (set with -0), and executes the given command with the parsed arguments.
Note: while awk is found on most UNIX systems, I make use of FPAT, which is a GNU awk extension, and GNU awk might not be your default awk (for example on Ubuntu). It is usually just an install of gawk away.
Also, the next command would be a quick and easy solution, but it is generally considered unsafe, since eval will execute everything it receives.
eval "python ./script.py $LINE"
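For comparison, the same field extraction can be sketched in Python without any shell evaluation. This is an illustration under assumptions, not part of the answer above: it needs Python 3.7 or later, the regex mirrors the FPAT pattern, and the names quoted_fields, run_with_fields and ECHO_ARGS are hypothetical. ECHO_ARGS is a stand-in for script.py that just prints its arguments.

```python
import re
import subprocess
import sys

# Stand-in for "python ./script.py": a child process that prints each argument.
ECHO_ARGS = [sys.executable, "-c", "import sys; print('\\n'.join(sys.argv[1:]))"]


def quoted_fields(line):
    """Return the text between each pair of double quotes."""
    return re.findall(r'"([^"]+)"', line)


def run_with_fields(command, line):
    """Run command with one argument per quoted field; no shell parses the line."""
    result = subprocess.run(
        command + quoted_fields(line),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Because the arguments are passed as a list, nothing in the line is ever executed, which is the risk that eval carries.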
This can be done using bash arrays:
tmp_file='gash.txt'
# Set IFS to " which splits on double quotes and removes them
# Using read is preferable to using the external program cat
# read -a reads into the array called "line"
# UPPERCASE variable names are discouraged because of collisions with bash variables
IFS=\" read -ra line < "$tmp_file"
# That leaves blank and space elements in "line",
# we create a new array called "params" without those elements
declare -a params
for((i=0; i < ${#line[@]}; i++))
do
    p="${line[i]}"
    if [[ -n "$p" && "$p" != " " ]]
    then
        params+=("$p")
    fi
done
# `backticks` are frowned upon because of poor readability
# I've called the python script "gash.py"
id=$(python ./gash.py "${params[@]}")
echo "$id"
gash.py:
import sys
print "1",sys.argv[1]
print "2",sys.argv[2]
print "3",sys.argv[3]
Gives:
1 param_a
2 Param B
3 Param C
Related
I'm using Git Bash on Windows inside Windows Terminal and I'm writing a python script which needs to output colored text. As an example, I have the following one-line script named example.py:
print('\033[35m\033[K' + 'hello world' + '\033[m\033[K')
When I run the command python example.py, I expect to see colored output, but instead I get this:
←[35m←[Khello world←[m←[K
However, if I run python example.py | cat, I get the colored output I expect. How weird. I also get nice colored output if I run the script from cmd instead of bash, or if I run the line from the live interpreter (but not if it is a child of bash).
Any ideas? If possible I prefer to solve this without bringing in dependencies like Colorama.
EDIT: I resigned myself to using Colorama after seeing the replies. All it took to fix it was a call to the aptly named colorama.just_fix_windows_console(). Dependency-less solutions still welcome.
EDIT 2: Interestingly, this problem does not occur on my laptop which has what I thought was the exact same setup.
For the record, I tried the following basic script,
#!/usr/bin/python3.8
print('\033[35m\033[K' + 'hello world' + '\033[m\033[K')
and I got the result you were likely looking for, namely "hello world" printed in magenta.
Since you mentioned bash: to get the same result from the shell itself, read the printf section of the bash man page more closely.
Parentheses around the arguments are used in AWK, C, and other languages, but bash's printf syntax does not use them. It does need a format specifier (not the same as AWK's), as described in the man page.
Your statement would be reworked like this:
printf "%b\n" "\033[35m\033[K" "hello world" "\033[m\033[K"
OR ... you could just replace the printf "%b\n" with simply echo -e to get the same result.
Of possible interest, I also created for myself what I call a "Bourne Header" file, in which I defined some preset combinations with variable strings that can be re-used in other scripts by simply source'ing the *.bh file. I include it here for general usage.
#!/bin/sh
##########################################################################################################
### $Id: INCLUDES__TerminalEscape_SGR.bh,v 1.2 2022/09/03 01:57:31 root Exp root $
###
### This includes string variables defined to perform various substitutions for the ANSI Terminal Escape Sequences, i.e. SGR (Select Graphic Rendition subset)
##########################################################################################################
### REFERENCE: https://en.wikipedia.org/wiki/ANSI_escape_code
### https://www.ecma-international.org/publications-and-standards/standards/ecma-48/
### "\e" is same as "\033"
style_e()
{
boldON="\e[1m"
boldOFF="\e[0m"
italicON="\e[3m"
italicOFF="\e[0m"
underlineON="\e[4m"
underlineOFF="\e[0m"
blinkON="\e[5m"
blinkOFF="\e[0m"
cyanON="\e[96;1m"
cyanOFF="\e[0m"
cyanDarkON="\e[36;1m"
cyanDarkOFF="\e[0m"
greenON="\e[92;1m"
greenOFF="\e[0m"
yellowON="\e[93;1m"
yellowOFF="\e[0m"
redON="\e[91;1m"
redOFF="\e[0m"
orangeON="\e[33;1m"
orangeOFF="\e[0m"
blueON="\e[94;1m"
blueOFF="\e[0m"
blueSteelON="\e[34;1m"
blueSteelOFF="\e[0m"
darkBlueON="\e[38;2;95;95;255m"
darkBlueOFF="\e[0m"
magentaON="\e[95;1m"
magentaOFF="\e[0m"
}
style_o()
{
boldON="\033[1m"
boldOFF="\033[0m"
italicON="\033[3m"
italicOFF="\033[0m"
underlineON="\033[4m"
underlineOFF="\033[0m"
blinkON="\033[5m"
blinkOFF="\033[0m"
cyanON="\033[96;1m"
cyanOFF="\033[0m"
cyanDarkON="\033[36;1m"
cyanDarkOFF="\033[0m"
greenON="\033[92;1m"
greenOFF="\033[0m"
yellowON="\033[93;1m"
yellowOFF="\033[0m"
redON="\033[91;1m"
redOFF="\033[0m"
orangeON="\033[33;1m"
orangeOFF="\033[0m"
blueON="\033[94;1m"
blueOFF="\033[0m"
blueSteelON="\033[34;1m"
blueSteelOFF="\033[0m"
darkBlueON="\033[38;2;95;95;255m"
darkBlueOFF="\033[0m"
magentaON="\033[95;1m"
magentaOFF="\033[0m"
}
#style_e
style_o
##########################################################################################################
### Usage Examples:
##########################################################################################################
# echo "\t RSYNC process is ${redON}not${redOFF} running (or has already ${greenON}terminated${greenOFF}).\n"
# echo "\t ${PID} is ${cyanON}${italicON}${descr}${italicOFF}${cyanOFF} process ..."
# echo "\t ${PID} is ${yellowON}${descr}${yellowOFF} process ..."
# echo "\n\n\t RSYNC process (# ${pid}) has ${greenON}completed${greenOFF}.\n"
##########################################################################################################
### Example of scenario where escape codes are hard-coded; \e was not accepted by awk
##########################################################################################################
# echo "\n\t ${testor}\n" | sed 's+--+\n\t\t\t\t\t\t\t\t--+g' | awk '{
# rLOC=index($0,"rsync") ;
# if( rLOC != 0 ){
# sBeg=sprintf("%s", substr($0,1,rLOC-1) ) ;
# sEnd=sprintf("%s", substr($0,rLOC+5) ) ;
# sMid="\033[91;1mrsync\033[0m" ;
# printf("%s%s%s\n", sBeg, sMid, sEnd) ;
# }else{
# print $0 ;
# } ;
# }'
##########################################################################################################
echo "\n\t Imported LIBRARY: INCLUDES__TerminalEscape_SGR.bh ..."
##########################################################################################################
You could likewise define your own set of preset escape sequences as formatting variables.
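Since the original question was about a Python script, the same idea of preset escape sequences can be sketched there as well. This is an assumption-laden illustration: the dictionary SGR, the constant RESET and the function style are invented names, and the codes are copied from the header file above. It does not address whether the terminal interprets the sequences.

```python
# A few SGR parameter strings, taken from the presets in the header file above.
SGR = {
    "bold": "1",
    "italic": "3",
    "underline": "4",
    "red": "91;1",
    "green": "92;1",
    "magenta": "95;1",
}
RESET = "\033[0m"


def style(text, *names):
    """Wrap text in the escape sequence for the named presets, then reset."""
    codes = ";".join(SGR[name] for name in names)
    return "\033[" + codes + "m" + text + RESET
```

A call such as print("RSYNC process is " + style("not", "red") + " running") mirrors the first usage example in the header.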
I am trying to execute the following command in python using plumbum:
sort -u -f -t$'\t' -k1,1 file1 > file2
However, I am having issues passing the -t$'\t' argument. Here is my code:
from plumbum.cmd import sort
separator = r"-t$'\t'"
print separator
cmd = (sort["-u", "-f", separator, "-k1,1", "file1"]) > "file2"
print cmd
print cmd()
I can see problems right away after print separator and print cmd() executes:
-t$'\t'
/usr/bin/sort -u -f "-t\$'\\t'" -k1,1 file1 > file2
The argument is wrapped in double quotes.
An extra \ before $ and \t is inserted.
How should I pass this argument to plumbum?
You may have stumbled into limitations of the command line escaping.
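I could make it work using the subprocess module, passing a real tab character literally:
import subprocess
# pass the arguments as a list and redirect stdout from Python;
# a list combined with shell=True would run "sort" without its arguments
with open("file2", "w") as out:
    subprocess.check_call(["sort", "-u", "-f", "-t\t", "-k1,1", "file1"], stdout=out)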
Also, a short pure-Python solution that does what you want:
with open("file1") as fr, open("file2","w") as fw:
    fw.writelines(sorted(set(fr),key=lambda x : x.split("\t")[0]))
The pure-Python solution doesn't work exactly the same way sort does when dealing with uniqueness. If 2 lines have the same first field but not the same second field, sort keeps one of them, whereas the set will keep both.
EDIT: I had not checked this, but you confirmed that it works: just tweak your plumbum code with:
separator = "-t\t"
Of the three options, I'd recommend the pure-Python solution, since it doesn't involve an external process and is therefore more Pythonic and portable.
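The uniqueness difference mentioned above can be narrowed by deduplicating on the first field rather than on the whole line. The following is a rough sketch, not a verified equivalent of sort: it assumes that keeping the first line seen for each key is acceptable, it approximates -f with str.upper(), and it ignores locale collation. The function name unique_by_first_field is made up.

```python
def unique_by_first_field(lines, sep="\t"):
    """Keep one line per case-folded first field, ordered by that field."""
    first_seen = {}
    for line in lines:
        key = line.rstrip("\n").split(sep, 1)[0].upper()
        # setdefault keeps the earliest line for each key
        first_seen.setdefault(key, line)
    return [first_seen[key] for key in sorted(first_seen)]
```

With files, this would be used as fw.writelines(unique_by_first_field(fr)).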
I have a directory containing files that look like this:
1_reads.fastq
2_reads.fastq
89_reads.fastq
42_reads.fastq
I would like to feed a comma-separated list of these file names to a command from a python program, so the input to the python command would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq
Furthermore, I'd like to use the numbers in the file names for a labeling function within the python command such that the input would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq -t s1,s2,s89,s42
It's important that the file names and the label IDs are in the same order.
First: This is a very poorly-thought-out calling convention. Don't use it.
However, if you're using software someone else wrote that already has that convention baked in...
#!/bin/bash
IFS=, # use comma as separator
files=( [[:digit:]]*_* )
[[ -e $files || -L $files ]] || { echo "ERROR: No files matching glob exist" >&2; exit 1; }
prefixes=( )
for file in "${files[@]}"; do
    prefixes+=( "s${file%%_*}" )
done
# "exec" only if this is the last command in the script; remove otherwise
exec program.py -i "${files[*]}" -t "${prefixes[*]}"
How this works:
IFS=, causes ${array[*]} to put a comma between each expanded element. Thus, expanding ${files[*]} and ${prefixes[*]} creates comma-separated strings with the contents of each array.
${file%%_*} removes everything after the first _ in a filename, allowing the numbers alone to be extracted.
[[ -e $files || -L $files ]] actually only tests whether the first element in that array exists (as a symlink or otherwise); however, this will always be true if the glob being expanded to form the array matched any files (unless files have been deleted between the two lines' invocation).
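The joining logic described above can also be sketched in Python, for readers who would rather build the argument list from a wrapper script. This is only an illustration: the function name build_args is invented, and the pattern assumes names that start with digits followed by an underscore.

```python
import re


def build_args(filenames):
    """Build the -i and -t arguments with files and labels in the same order."""
    files = [name for name in filenames if re.match(r"\d+_", name)]
    # the label is "s" plus everything before the first underscore
    labels = ["s" + name.split("_", 1)[0] for name in files]
    return ["-i", ",".join(files), "-t", ",".join(labels)]
```

The list could be built from sorted(os.listdir(".")) and appended to the program.py command line, for example with subprocess.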
Try this:
program.py $(cd DIR && var=$(ls) && echo $var | tr ' ' ',')
That will pass to program.py the string returned by the command line inside the $(..).
That command line enters your directory and runs ls, storing the output in a variable. Echoing the unquoted variable replaces the newline characters with spaces and adds no trailing space. The result is piped to tr, which translates spaces to commas.
It can be done easily in pure Bash. Make sure you run from within the directory that contains the files.
#!/bin/bash
shopt -s extglob nullglob
# Create an array of files
f=( +([[:digit:]])_reads.fastq )
# Check that there are some files...
if ((${#f[@]}==0)); then
    echo "No files found. Exiting."
    exit
fi
# Create an array of labels, directly from the array f:
# Remove trailing _reads.fastq
l=( "${f[@]%_reads.fastq}" )
# And prepend the letter s
l=( "${l[@]/#/s}" )
# Now the arrays f and l are good: check them:
declare -p f l
# To join the arrays, we'll use eval. Safe because the code is single-quoted!
IFS=, eval 'program.py -i "${f[*]}" -t "${l[*]}"'
Note. The use of eval here is perfectly safe as we're passing a constant string (and it's actually an idiomatic way to join an array without using a subshell or a loop). Don't modify the command, in particular the single quotes.
Thanks to Charles Duffy who convinced me to add healthy comments about the use of eval
I have a text file with content like this:
honda motor co of japan doesn't expect output at its car manufacturing plant in thailand
When I run wc -l textfile.txt, I receive 0.
The problem is I am running a python script that needs to count the number of lines in this text file and run accordingly. I have tried two ways of computing the number of lines but they both keep giving me 0 and my code refuses to run.
Python code:
#Way 1
with open(sys.argv[1]) as myfile:
    row=sum(1 for line in myfile)
print(row)
#Way 2
row = run("cat %s | wc -l" % sys.argv[1]).split()[0]
I receive an error that says: with open(sys.argv[1]) as myfile IndexError: list index out of range
I am calling this script from PHP:
exec('python testthis.py $file 2>&1', $output);
I suspect that sys.argv[1] is giving me an error.
There's nothing wrong with the first example of your Python code (way 1).
The problem is the PHP calling code; the string being passed to exec() uses single quotes which prevents the expansion of the $file variable into the command string. The resulting call therefore passes the literal string $file as the argument to exec(), which in turn runs the command in a shell. That shell treats $file as a shell variable and tries to expand it, but it is not defined, and so it expands to an empty string. The resulting call is:
python testthis.py 2>&1
to which Python raises IndexError: list index out of range because it is missing an argument.
To fix this, use double quotes around the command when calling exec() in PHP:
$file = 'test.txt';
exec("python testthis.py $file 2>&1", $output);
Now $file can be expanded into the string as required.
This does assume that you actually want to expand a PHP variable into the string. Because exec() runs the command in a shell, it is also possible to have the variable defined in the shell's environment, and it will be expanded by the shell into the final command. To do this you would use single quotes around the command passed to exec().
Note that the Python code of "way 1" will return a line count of 1, not 0 as does wc -l.
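The difference between the two counts can be shown with a small sketch. It assumes the file has no trailing newline, which is what a wc -l result of 0 suggests; the function names are made up for illustration.

```python
def count_like_wc(data):
    """Count newline characters, which is what wc -l reports."""
    return data.count(b"\n")


def count_like_way_1(data):
    """Count lines the way iterating over a file does: a final unterminated line counts."""
    return len(data.splitlines())


sample = b"honda motor co of japan"  # one line, no trailing newline
print(count_like_wc(sample), count_like_way_1(sample))
```

For this sample the first count is 0 and the second is 1.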
my python file has these 2 variables:
week_date = "01/03/16-01/09/16"
cust_id = "12345"
how can i read this into a shell script that takes in these 2 variables?
my current shell script requires manual editing of "dt" and "id". I want to read the python variables into the shell script so i can just edit my python parameter file and not so many files.
shell file:
#!/bin/sh
dt="01/03/16-01/09/16"
cust_id="12345"
In a new python file i could just import the parameter python file.
Consider something akin to the following:
#!/bin/bash
# ^^^^ NOT /bin/sh, which doesn't have process substitution available.
python_script='
import sys
d = {} # create a context for variables
exec(open(sys.argv[1], "r").read(), d) # execute the Python code in that context
for k in sys.argv[2:]:
    # ...and extract your strings NUL-delimited; write() adds no trailing newline
    sys.stdout.write("%s\0" % str(d[k]).split("\0")[0])
'
read_python_vars() {
    local python_file=$1; shift
    local varname
    for varname; do
        IFS= read -r -d '' "${varname#*:}"
    done < <(python -c "$python_script" "$python_file" "${@%%:*}")
}
You might then use this as:
read_python_vars config.py week_date:dt cust_id:id
echo "Customer id is $id; date range is $dt"
...or, if you didn't want to rename the variables as they were read, simply:
read_python_vars config.py week_date cust_id
echo "Customer id is $cust_id; date range is $week_date"
Advantages:
Unlike a naive regex-based solution (which would have trouble with some of the details of Python parsing -- try teaching sed to handle both raw and regular strings, and both single and triple quotes without making it into a hairball!) or a similar approach that used newline-delimited output from the Python subprocess, this will correctly handle any object for which str() gives a representation with no NUL characters that your shell script can use.
Running content through the Python interpreter also means you can determine values programmatically -- for instance, you could have some Python code that asks your version control system for the last-change-date of relevant content.
Think about scenarios such as this one:
start_date = '01/03/16'
end_date = '01/09/16'
week_date = '%s-%s' % (start_date, end_date)
...using a Python interpreter to parse Python means you aren't restricting how people can update/modify your Python config file in the future.
Now, let's talk caveats:
If your Python code has side effects, those side effects will obviously take effect (just as they would if you chose to import the file as a module in Python). Don't use this to extract configuration from a file whose contents you don't trust.
Python strings are Pascal-style: They can contain literal NULs. Strings in shell languages are C-style: They're terminated by the first NUL character. Thus, some variables can exist in Python that cannot be represented in shell without nonliteral escaping. To prevent an object whose str() representation contains NULs from spilling forward into other assignments, this code terminates strings at their first NUL.
Now, let's talk about implementation details.
${@%%:*} is an expansion of $@ which trims all content after and including the first : in each argument, thus passing only the Python variable names to the interpreter. Similarly, ${varname#*:} is an expansion which trims everything up to and including the first : from the variable name passed to read. See the bash-hackers page on parameter expansion.
Using <(python ...) is process substitution syntax: The <(...) expression evaluates to a filename which, when read, will provide output of that command. Using < <(...) redirects output from that file, and thus that command (the first < is a redirection, whereas the second is part of the <( token that starts a process substitution). Using this form to get output into a while read loop avoids the bug mentioned in BashFAQ #24 ("I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?").
The IFS= read -r -d '' construct has a series of components, each of which makes the behavior of read more true to the original content:
Clearing IFS for the duration of the command prevents whitespace from being trimmed from the end of the variable's content.
Using -r prevents literal backslashes from being consumed by read itself rather than represented in the output.
Using -d '' sets the first character of the empty string '' to be the record delimiter. Since C strings are NUL-terminated and the shell uses C strings, that character is a NUL. This ensures that variables' content can contain any non-NUL value, including literal newlines.
See BashFAQ #001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?") for more on the process of reading record-oriented data from a string in bash.
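The Python half of this technique can be read in isolation as the following sketch. It is a Python 3 rendering written for illustration, with the invented function name extract_vars, and it carries the same caveat as above: exec runs whatever the source contains, so it should only be given trusted input.

```python
def extract_vars(source, names):
    """Execute config source and return the named values, each NUL-terminated."""
    context = {}
    exec(source, context)  # runs the config code; only use with trusted files
    # cut each value at its first NUL so it cannot spill into the next field
    return "".join(str(context[name]).split("\0")[0] + "\0" for name in names)
```

On the shell side, each NUL-terminated field is what one call to read -r -d '' consumes.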
Other answers give a way to do exactly what you ask for, but I think the idea is a bit crazy. There's a simpler way to satisfy both scripts - move those variables into a config file. You can even preserve the simple assignment format.
Create the config itself: (ini-style)
dt="01/03/16-01/09/16"
cust_id="12345"
In python:
config_vars = {}
with open('the/file/path', 'r') as f:
    for line in f:
        if '=' in line:
            k, v = line.split('=', 1)
            # drop the trailing newline and the surrounding double quotes
            config_vars[k.strip()] = v.strip().strip('"')
week_date = config_vars['dt']
cust_id = config_vars['cust_id']
In bash:
source "the/file/path"
And you don't need to do crazy source parsing anymore. Alternatively you can just use json for the config file and then use json module in python and jq in shell for parsing.
I would do something like this. You may want to modify it a little to include or exclude quotes, as I haven't really tested it for your scenario:
#!/bin/sh
exec <$python_filename
while read line
do
    match=`echo $line|grep "week_date ="`
    if [ $? -eq 0 ]; then
        dt=`echo $line|cut -d '"' -f 2`
    fi
    match=`echo $line|grep "cust_id ="`
    if [ $? -eq 0 ]; then
        cust_id=`echo $line|cut -d '"' -f 2`
    fi
done