I have a text file with content like this:
honda motor co of japan doesn't expect output at its car manufacturing plant in thailand
When I run wc -l textfile.txt, I receive 0.
The problem is that I am running a Python script that needs to count the number of lines in this text file and act accordingly. I have tried two ways of computing the number of lines, but both keep giving me 0 and my code refuses to run.
Python code:
#Way 1
with open(sys.argv[1]) as myfile:
    row = sum(1 for line in myfile)
print(row)
#Way 2
row = run("cat %s | wc -l" % sys.argv[1]).split()[0]
I receive an error that says: with open(sys.argv[1]) as myfile IndexError: list index out of range
I am calling this script from PHP:
exec('python testthis.py $file 2>&1', $output);
I suspect that sys.argv[1] is giving me an error.
There's nothing wrong with the first example of your Python code (way 1).
The problem is the PHP calling code: the string passed to exec() uses single quotes, which prevents PHP from expanding the $file variable into the command string. The command string therefore contains the literal text $file, and exec() runs that command in a shell. The shell treats $file as a shell variable and tries to expand it; because it is not defined, it expands to an empty string. The resulting call is:
python testthis.py 2>&1
to which Python raises IndexError: list index out of range because it is missing an argument.
To fix this, use double quotes around the command when calling exec() in PHP:
$file = 'test.txt';
exec("python testthis.py $file 2>&1", $output);
Now $file can be expanded into the string as required.
This does assume that you actually want to expand a PHP variable into the string. Because exec() runs the command in a shell, it is also possible to have the variable defined in the shell's environment, and it will be expanded by the shell into the final command. To do this you would use single quotes around the command passed to exec().
Note that the Python code of "way 1" will return a line count of 1, not 0 as wc -l does, because wc -l counts newline characters and the file evidently has no trailing newline.
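To see the two counting rules side by side, here is a small sketch that uses an in-memory string in place of the file. The helper names count_like_wc and count_like_python are made up for illustration.

```python
import io

def count_like_wc(text):
    # wc -l counts newline characters, so an unterminated last line is not counted
    return text.count("\n")

def count_like_python(text):
    # Iterating a file object yields every line, including a final one without "\n"
    return sum(1 for _ in io.StringIO(text))

sample = "honda motor co of japan"  # no trailing newline, as in the question
print(count_like_wc(sample), count_like_python(sample))
```

If the file ended with a newline, both approaches would agree.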
My function run_deinterleave() is meant to copy code from the file deinterleave.sh then replace the placeholder (sra_data) with a file name which has been input by the user and then run it on the command line.
def run_deinterleave():
    codes = open('Project/CODE/deinterleave.sh')
    codex = codes.read()
    print(inp_address)
    codex = codex.replace('sra_data', inp_address)
    #is opening this twice creating another pipeline?
    stream = os.popen(codex)
    codes.close()
    self.txtarea.insert(END,codex)
    #stuff
However, I keep getting this error:
/bin/sh: 5: Syntax error: "(" unexpected
The code in deinterleave.sh works fine and produces two individual files given an interleaved paired end sra_file (an output file from genetic sequencing machines, I think :P)
#1deinterleave paired end fastq file
paste - - - - - - - - < sra_data \
| tee >(cut -f 1-4 | tr "\t" "\n" > /home/lols/Project/reads-1.fq) \
| cut -f 5-8 | tr "\t" "\n" > /home/lols/Project/reads-2.fq
As the error message shows, the code was interpreted by /bin/sh; if you executed
/bin/sh Project/CODE/deinterleave.sh, you'd get the same error, because the process substitution >(…) is a Bash extension not understood by /bin/sh.
Besides, since you don't communicate with the shell code, you don't need a pipe at all. So instead of os.popen I'd use subprocess.run, which allows you to specify Bash as the shell.
subprocess.run(codex, shell=True, executable="bash")
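As a minimal sketch of that call, assuming bash is installed and on PATH, the following runs a command that uses process substitution, which plain sh may reject. The helper name run_in_bash is hypothetical.

```python
import shutil
import subprocess

def run_in_bash(command):
    # executable= replaces the default /bin/sh that shell=True would use
    bash = shutil.which("bash")
    if bash is None:
        raise RuntimeError("bash not found on PATH")
    result = subprocess.run(command, shell=True, executable=bash,
                            capture_output=True, text=True, check=True)
    return result.stdout

if shutil.which("bash"):
    # <(...) is process substitution, a Bash extension
    print(run_in_bash("cat <(echo hello)"))
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError instead of being silently ignored.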
The absolutely best fix is probably to replace the shell script with native Python code; but without a specification and/or sample input, I don't think we can tell you exactly how to do that.
An immediate and trivial fix is to change deinterleave.sh so that it accepts an input file parameter.
#!/usr/bin/env bash
paste - - - - - - - - < "${1-sra_data}" |
tee >(cut -f 1-4 | tr "\t" "\n" > "${2-/home/lols/Project/reads-1.fq}") |
cut -f 5-8 | tr "\t" "\n" > "${3-/home/lols/Project/reads-2.fq}"
This refactoring also allows you to specify the names of the output files as the second and third command-line arguments.
Also, a Bash script really should not have a .sh extension, so probably take that out.
Explicitly naming Bash in the shebang line should solve the error message you got when running Bash code in sh; perhaps see also Difference between sh and bash.
With that, your Python code can be reduced to something like
subprocess.run(
    ['Project/CODE/deinterleave', inp_address],
    # probably a good idea
    check=True)
though I don't exactly understand the rest of the surrounding function, so it's not clear how exactly to rewrite it.
I think the shell script could be reimplemented something like
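with open(inp_address, 'r') as sra_data, \
        open('/home/lols/Project/reads-1.fq', 'w') as first, \
        open('/home/lols/Project/reads-2.fq', 'w') as second:
    while True:
        # each interleaved pair is 8 lines: 4 for read 1, then 4 for read 2
        chunk = [sra_data.readline() for _ in range(8)]
        if not chunk[0]:
            break  # readline() returns '' at end of file
        first.writelines(chunk[:4])
        second.writelines(chunk[4:])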
I have a bash script that calls a python script with parameters.
In the bash script, I'm reading a file that contains one row of parameters, each enclosed in " characters, and then calling the python script with the line I read.
My problem is that the python gets the parameters separated by the space.
The line looks like this: "param_a" "Param B" "Param C"
Code Example:
Bash Script:
LINE=`cat $tmp_file`
id=`python /full_path/script.py $LINE`
Python Script:
print sys.argv[1]
print sys.argv[2]
print sys.argv[3]
Received output:
"param_a"
"Param
B"
Wanted output:
param_a
Param B
Param C
How can I send the parameters to the Python script the way I need?
Thanks!
What about
id=`python /full_path/script.py $tmp_file`
and
import sys
for line in open(sys.argv[1]):
    print(line)
?
The issue is in how bash passes the arguments. Python has nothing to do with it.
So you have to sort all of this out before sending it to Python. I decided to use awk and xargs for this (but xargs is the actual MVP here).
LINE=$(cat $tmp_file)
awk -v ORS="\0" -v FPAT='"[^"]+"' '{for (i=1;i<=NF;i++){print substr($i,2,length($i)-2)}}' <<<$LINE |
xargs -0 python ./script.py
First $(..) is preferred over backticks, because it is more readable. You are making a variable after all.
awk only reads from stdin or a file, but you can force it to read from a variable with the <<<, also called "here string".
With awk I loop over all fields (as defined by the regex in the FPAT variable), and print them without the "".
The output record separator I chose is the NUL character (-v ORS="\0"); xargs will split on this character.
xargs will now parse the piped input by separating the arguments on NUL characters (set with -0) and execute the given command with the parsed arguments.
Note: while awk is found on most UNIX systems, I make use of FPAT, which is a GNU awk extension, and GNU awk might not be your default awk (for example on Ubuntu); it is usually just an install of gawk away.
The following command would also be a quick and easy solution, but it is generally considered unsafe, since eval will execute everything it receives.
eval "python ./script "$LINE
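If changing the Python script is an option, the quoted line could also be split on the Python side. This is a sketch of a different technique from the awk/xargs pipeline above: it assumes the script receives the raw line (for example by reading the file itself) and uses the standard library's shlex.split, which applies shell-like quoting rules. The function name split_params is made up for illustration.

```python
import shlex

def split_params(line):
    # shlex.split groups words inside quotes and removes the quote characters
    return shlex.split(line)

print(split_params('"param_a" "Param B" "Param C"'))
```

Unlike eval, this only tokenizes the text; nothing in the line is executed.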
This can be done using bash arrays:
tmp_file='gash.txt'
# Set IFS to " which splits on double quotes and removes them
# Using read is preferable to using the external program cat
# read -a reads into the array called "line"
# UPPERCASE variable names are discouraged because of collisions with bash variables
IFS=\" read -ra line < "$tmp_file"
# That leaves blank and space elements in "line",
# we create a new array called "params" without those elements
declare -a params
for((i=0; i < ${#line[@]}; i++))
do
    p="${line[i]}"
    if [[ -n "$p" && "$p" != " " ]]
    then
        params+=("$p")
    fi
done
# `backticks` are frowned upon because of poor readability
# I've called the python script "gash.py"
id=$(python ./gash.py "${params[@]}")
echo "$id"
gash.py:
import sys
print "1",sys.argv[1]
print "2",sys.argv[2]
print "3",sys.argv[3]
Gives:
1 param_a
2 Param B
3 Param C
I'm trying to write some functions in Python so that I can connect to a Linux terminal and do things (in this case, create a file). The code I have works partially. The only thing that doesn't work is running anything after the file contents have been entered. For instance, if you create the file and then want to navigate somewhere else (cd /tmp), the next command is not executed; it is just added to the file that was created.
def create_file(self, name, contents, location):
    try:
        log.info("Creating a file...")
        self.device.execute("mkdir -p {}".format(location))
        self.cd_path(location)
        self.device.sendline("cat > {}".format(name))
        self.device.sendline("{}".format(contents))
        self.device.sendline("EOF") # send the CTRL + D command to save and exit I tried here with ^D as well
    except:
        log.info("Failed to create the file!")
The contents of the file is:
cat test.txt
#!/bin/bash
echo "Fail Method Requested"
exit 1
EOF
ls -d /tmp/asdasd
The order of commands executed is:
execute.create_file(test.txt, the_message, the_location)
execute.check_path("/tmp/adsasd") #this function just checks with ls -d if the directory exists.
I have tried with sendline the following combinations:
^D, EOF, <<EOF
I don't really understand how I could make this happen. I just want to create a file with a specific message. (When researching on how to do this with VI I got the same problem, but there the command I needed was the one for ESC)
If anyone could help with some input that would be great!!
Edit: As Rob mentioned below, sending the character "\x04" actually works. For anyone else having this issue, you can also consult this chart for other combinations if needed:
http://donsnotes.com/tech/charsets/ascii.html
You probably need to send the EOF character, which is typically CONTROL-D, not the three characters E, O, and F.
self.device.sendline("\x04")
http://wiki.bash-hackers.org/syntax/redirection#here_documents
Here docs allow you to use any file input termination string you like to represent end of file ( such as the literal EOF you're attempting to use now). Quoting that string tells the shell not to interpret expansions inside the heredoc content, ensuring that said content is treated as literal.
Using pipes.quote() here ensures that filenames with literal quotes, $s, spaces, or other surprising characters won't break your script. (Of course, you'll need to import pipes; on Python 3, by contrast, this has moved to shlex.quote()).
self.device.sendline("cat > {} <<'EOF'".format(pipes.quote(name)))
Then you can write the EOF as is, having told bash to interpret it as the end of file input.
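Putting the here-document pieces together, a sketch of the lines to send might look like this. It uses shlex.quote (Python 3); the helper name heredoc_lines is hypothetical, and it assumes the contents never include a line consisting solely of the marker.

```python
import shlex

def heredoc_lines(name, contents, marker="EOF"):
    # Quote the file name for the shell; quoting the marker disables
    # expansion inside the here-document body.
    lines = ["cat > {} <<'{}'".format(shlex.quote(name), marker)]
    lines.extend(contents.splitlines())
    lines.append(marker)  # the terminator must sit alone on its own line
    return lines

script = '#!/bin/bash\necho "Fail Method Requested"\nexit 1'
for line in heredoc_lines("test.txt", script):
    print(line)
```

Each returned line would then be passed to self.device.sendline() in turn.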
I have a directory containing files that look like this:
1_reads.fastq
2_reads.fastq
89_reads.fastq
42_reads.fastq
I would like to feed a comma-separated list of these file names to a command from a python program, so the input to the python command would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq
Furthermore, I'd like to use the numbers in the file names for a labeling function within the python command such that the input would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq -t s1,s2,s89,s42
It's important that the file names and the label IDs are in the same order.
First: This is a very poorly-thought-out calling convention. Don't use it.
However, if you're using software someone else wrote that already has that convention baked in...
#!/bin/bash
IFS=, # use comma as separator
files=( [[:digit:]]*_* )
[[ -e $files || -L $files ]] || { echo "ERROR: No files matching glob exist" >&2; exit 1; }
prefixes=( )
for file in "${files[@]}"; do
  prefixes+=( "s${file%%_*}" )
done
# "exec" only if this is the last command in the script; remove otherwise
exec program.py -i "${files[*]}" -t "${prefixes[*]}"
How this works:
IFS=, causes ${array[*]} to put a comma between each expanded element. Thus, expanding ${files[*]} and ${prefixes[*]} creates comma-separated strings with the contents of each array.
${file%%_*} removes everything after the first _ in a filename, allowing the numbers alone to be extracted.
[[ -e $files || -L $files ]] actually only tests whether the first element in that array exists (as a symlink or otherwise); however, this will always be true if the glob being expanded to form the array matched any files (unless files have been deleted between the two lines' invocation).
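For comparison, the same joining and labeling logic can be sketched in Python, in case the argument list is assembled there instead of in Bash. The function name build_args is made up for illustration.

```python
def build_args(filenames):
    # Label each file with "s" plus everything before the first underscore
    labels = ["s" + name.split("_", 1)[0] for name in filenames]
    # Both strings are built from the same sequence, so the order stays aligned
    return ["-i", ",".join(filenames), "-t", ",".join(labels)]

files = ["1_reads.fastq", "2_reads.fastq", "89_reads.fastq", "42_reads.fastq"]
print(build_args(files))
```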
Try this:
program.py $(cd DIR && var=$(ls) && echo $var | tr ' ' ',')
That will pass to program.py the string returned by the command line inside the $(..).
That command line enters your directory and runs ls, storing the output in a variable. Echoing the unquoted variable replaces the newline characters with spaces and does not add a trailing space; tr then translates the spaces to commas.
It can be done easily in pure Bash. Make sure you run from within the directory that contains the files.
#!/bin/bash
shopt -s extglob nullglob
# Create an array of files
f=( +([[:digit:]])_reads.fastq )
# Check that there are some files...
if ((${#f[@]}==0)); then
    echo "No files found. Exiting."
    exit
fi
# Create an array of labels, directly from the array f:
# Remove trailing _reads.fastq
l=( "${f[@]%_reads.fastq}" )
# And prepend the letter s
l=( "${l[@]/#/s}" )
# Now the arrays f and l are good: check them:
declare -p f l
# To join the arrays, we'll use eval. Safe because the code is single-quoted!
IFS=, eval 'program.py -i "${f[*]}" -t "${l[*]}"'
Note. The use of eval here is perfectly safe as we're passing a constant string (and it's actually an idiomatic way to join an array without using a subshell or a loop). Don't modify the command, in particular the single quotes.
Thanks to Charles Duffy who convinced me to add healthy comments about the use of eval
my python file has these 2 variables:
week_date = "01/03/16-01/09/16"
cust_id = "12345"
how can i read this into a shell script that takes in these 2 variables?
my current shell script requires manual editing of "dt" and "id". I want to read the python variables into the shell script so i can just edit my python parameter file and not so many files.
shell file:
#!/bin/sh
dt="01/03/16-01/09/16"
cust_id="12345"
In a new python file i could just import the parameter python file.
Consider something akin to the following:
#!/bin/bash
# ^^^^ NOT /bin/sh, which doesn't have process substitution available.
python_script='
import sys
d = {}                                    # create a context for variables
exec(open(sys.argv[1], "r").read(), d)    # execute the Python code in that context
for k in sys.argv[2:]:
    # ...and emit each value NUL-terminated, with no trailing newline
    sys.stdout.write("%s\0" % str(d[k]).split("\0")[0])
'
read_python_vars() {
  local python_file=$1; shift
  local varname
  for varname; do
    IFS= read -r -d '' "${varname#*:}"
  done < <(python -c "$python_script" "$python_file" "${@%%:*}")
}
You might then use this as:
read_python_vars config.py week_date:dt cust_id:id
echo "Customer id is $id; date range is $dt"
...or, if you didn't want to rename the variables as they were read, simply:
read_python_vars config.py week_date cust_id
echo "Customer id is $cust_id; date range is $week_date"
Advantages:
Unlike a naive regex-based solution (which would have trouble with some of the details of Python parsing -- try teaching sed to handle both raw and regular strings, and both single and triple quotes without making it into a hairball!) or a similar approach that used newline-delimited output from the Python subprocess, this will correctly handle any object for which str() gives a representation with no NUL characters that your shell script can use.
Running content through the Python interpreter also means you can determine values programmatically -- for instance, you could have some Python code that asks your version control system for the last-change-date of relevant content.
Think about scenarios such as this one:
start_date = '01/03/16'
end_date = '01/09/16'
week_date = '%s-%s' % (start_date, end_date)
...using a Python interpreter to parse Python means you aren't restricting how people can update/modify your Python config file in the future.
Now, let's talk caveats:
If your Python code has side effects, those side effects will obviously take effect (just as they would if you chose to import the file as a module in Python). Don't use this to extract configuration from a file whose contents you don't trust.
Python strings are Pascal-style: They can contain literal NULs. Strings in shell languages are C-style: They're terminated by the first NUL character. Thus, some values can exist in Python that cannot be represented in shell without nonliteral escaping. To prevent an object whose str() representation contains NULs from spilling forward into other assignments, this code terminates strings at their first NUL.
Now, let's talk about implementation details.
${@%%:*} is an expansion of $@ which trims all content after and including the first : in each argument, thus passing only the Python variable names to the interpreter. Similarly, ${varname#*:} is an expansion which trims everything up to and including the first : from the variable name passed to read. See the bash-hackers page on parameter expansion.
Using <(python ...) is process substitution syntax: The <(...) expression evaluates to a filename which, when read, will provide output of that command. Using < <(...) redirects output from that file, and thus that command (the first < is a redirection, whereas the second is part of the <( token that starts a process substitution). Using this form to get output into a while read loop avoids the bug mentioned in BashFAQ #24 ("I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?").
The IFS= read -r -d '' construct has a series of components, each of which makes the behavior of read more true to the original content:
Clearing IFS for the duration of the command prevents whitespace from being trimmed from the end of the variable's content.
Using -r prevents literal backslashes from being consumed by read itself rather than represented in the output.
Using -d '' sets the first character of the empty string '' to be the record delimiter. Since C strings are NUL-terminated and the shell uses C strings, that character is a NUL. This ensures that variables' content can contain any non-NUL value, including literal newlines.
See BashFAQ #001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?") for more on the process of reading record-oriented data from a string in bash.
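The Python half of this approach can also be sketched as a standalone function. This is an illustrative variant, not the embedded script above: it uses runpy.run_path from the standard library to execute the file, and the name dump_vars and the temporary config.py are assumptions made for the example.

```python
import os
import runpy
import tempfile

def dump_vars(config_path, names):
    # run_path executes the file and returns its global namespace as a dict
    namespace = runpy.run_path(config_path)
    # Cut each value at its first NUL, then terminate each record with a NUL
    return "".join(str(namespace[n]).split("\0")[0] + "\0" for n in names)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.py")  # hypothetical config file
    with open(path, "w") as f:
        f.write('week_date = "01/03/16-01/09/16"\ncust_id = "12345"\n')
    print(repr(dump_vars(path, ["week_date", "cust_id"])))
```

As with the shell version, running the config file executes whatever code it contains, so the same trust caveat applies.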
Other answers give a way to do exactly what you ask for, but I think the idea is a bit crazy. There's a simpler way to satisfy both scripts - move those variables into a config file. You can even preserve the simple assignment format.
Create the config itself: (ini-style)
dt="01/03/16-01/09/16"
cust_id="12345"
In python:
config_vars = {}
with open('the/file/path', 'r') as f:
    for line in f:
        if '=' in line:
            k, v = line.split('=', 1)
            # strip whitespace, the trailing newline, and the surrounding quotes
            config_vars[k.strip()] = v.strip().strip('"')
week_date = config_vars['dt']
cust_id = config_vars['cust_id']
In bash:
source "the/file/path"
And you don't need to do crazy source parsing anymore. Alternatively you can just use json for the config file and then use json module in python and jq in shell for parsing.
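A minimal sketch of the JSON variant on the Python side, assuming the config holds the same two keys (dt and cust_id). The function name load_config and the inline sample string are made up for illustration; in practice the text would come from the config file.

```python
import json

def load_config(text):
    # json.loads returns plain dicts and strings, so no quote stripping is needed
    config = json.loads(text)
    return config["dt"], config["cust_id"]

sample = '{"dt": "01/03/16-01/09/16", "cust_id": "12345"}'
week_date, cust_id = load_config(sample)
print(week_date, cust_id)
```

On the shell side, jq with the -r flag can print a single field as a raw string from the same file.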
I would do something like this. You may want to modify it a little to include or exclude quotes, as I haven't really tested it for your scenario:
#!/bin/sh
exec <$python_filename
while read line
do
    match=`echo $line|grep "week_date ="`
    if [ $? -eq 0 ]; then
        dt=`echo $line|cut -d '"' -f 2`
    fi
    match=`echo $line|grep "cust_id ="`
    if [ $? -eq 0 ]; then
        cust_id=`echo $line|cut -d '"' -f 2`
    fi
done