I need to write a Python script that calls a few awk commands.
#!/usr/bin/python
import os, sys
input_dir = '/home/abc/data'
os.chdir(input_dir)
#wd=os.getcwd()
#print wd
os.system ("tail -n+2 ./*/*.tsv|cat|awk 'BEGIN{FS="\t"};{split($10,arr,"-")}{print arr[1]}'|sort|uniq -c")
It gives an error in line 8: SyntaxError: unexpected character after line continuation character
Is there a way to get the awk command to work within the Python script?
Thanks
You have both types of quotes in that string, so use triple quotes around the whole thing
>>> x = '''tail -n+2 ./*/*.tsv|cat|awk 'BEGIN{FS="\t"};{split($10,arr,"-")}{print arr[1]}'|sort|uniq -c'''
>>> x
'tail -n+2 ./*/*.tsv|cat|awk \'BEGIN{FS="\t"};{split($10,arr,"-")}{print arr[1]}\'|sort|uniq -c'
You should use subprocess instead of os.system:
import subprocess
COMMAND = "tail -n+2 ./*/*.tsv|cat|awk 'BEGIN{FS=\"\t\"};{split($10,arr,\"-\")}{print arr[1]}'|sort|uniq -c"
subprocess.call(COMMAND, shell=True)
As TehTris has pointed out, the arrangement of quotes in the question breaks the command string into multiple strings. Pre-formatting the command and escaping the double-quotes fixes this.
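One way to avoid hand-escaping altogether, sketched here under a few assumptions: keep the awk program in its own raw triple-quoted string and let shlex.quote add the shell quoting. The names AWK_PROGRAM and build_command are made up for this illustration, the redundant cat is left out, and because the string is raw it is awk (not Python) that turns \t into a tab.

```python
import shlex

# Raw string: Python leaves \t alone, and awk itself reads "\t" as a tab.
AWK_PROGRAM = r'''BEGIN{FS="\t"};{split($10,arr,"-")}{print arr[1]}'''


def build_command(file_glob="./*/*.tsv"):
    """Return the shell pipeline from the question as one string.

    file_glob is deliberately left unquoted so the shell still expands it.
    """
    return "tail -n+2 {} | awk {} | sort | uniq -c".format(
        file_glob, shlex.quote(AWK_PROGRAM)
    )


command = build_command()
print(command)
# To run it: subprocess.call(command, shell=True)
```

Printing the command before running it makes it easy to paste into a terminal and check that the shell sees what you expect.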
Related
I am running the sed command inside python using os.system. Below is the code.
os.system("sed -i /solid/s/Visualization Toolkit generated SLA File/chestwall/g mesh1.stl")
The name to be changed has spaces in it. Also, at the end of the command, in mesh1.stl, the 1 needs to be a variable. How can I do this?
First, this code gives the following error:
sed: -e expression #1, char 22: unterminated s command
I tried putting / at the end.
Second, I need mesh1 to come from a variable set on a previous line, say a, whose value changes every time. How do I write that?
Make sure that the sed expression is in either double or single quotes, and then use + to concatenate the strings before passing them to os.system. A number has to be converted with str() first:
import os
var = 1
os.system("sed -i '/solid/s/Visualization Toolkit generated SLA File/chestwall/g' mesh" + str(var) + ".stl")
The function os.system() is now considered to be superseded by subprocess.call().
Would you please try the following:
import subprocess
a = 'mesh1'
cmd = ['sed', '-i', '/solid/s/Visualization Toolkit generated SLA File/chestwall/g', '{0}.stl'.format(a)]
subprocess.call(cmd)
You can pass the command as a list, not a string, and you can explicitly divide the arguments.
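As a rough sketch of how the changing file name could be handled with the list form: the helper sed_command and the stems mesh1 to mesh3 are hypothetical, chosen only to show the variable part. The block builds the argument lists and prints them; it does not run sed.

```python
def sed_command(mesh_name):
    """Build the argument list for one mesh file; nothing is executed here."""
    expression = "/solid/s/Visualization Toolkit generated SLA File/chestwall/g"
    return ["sed", "-i", expression, "{0}.stl".format(mesh_name)]


# Hypothetical file stems mesh1..mesh3, only to show the variable part changing.
commands = [sed_command("mesh{0}".format(i)) for i in range(1, 4)]
for cmd in commands:
    print(cmd)
    # To run it: subprocess.call(cmd)
```

Because each list element reaches sed as exactly one argument, the spaces in the search text need no extra quoting.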
My goal is to execute the following bash command in Python and store its output:
echo 'sudo ./run_script.sh -dates \\{\\'2017-11-16\\',\\'2017-11-29\\'\\}'|sed 's;\\\\;\\;'
When I run this command in bash, the output is: sudo ./run_script.sh -dates \{\'2019-10-05\',\'2019-10-04\'\}
My initial idea was to replace the double backslash with a single backslash in Python. As ridiculous as it seems, I couldn't do it in Python (only with print() is the output what I want, but I can't store the output of print(), and str() doesn't convert \\ to \). So I decided to do it in bash.
import subprocess
t= 'some \\ here'
cmd = "echo \'"+ t+"\'|sed 's;\\\\;\\;'"
ps = subprocess.run(cmd,shell=True,stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
ps.stdout
Out[6]: b"sed: -e expression #1, char 7: unterminated `s' command\n"
Running Python 3.6.8 on Ubuntu 18
Try using subprocess.check_output instead. You're also forgetting an extra backslash for every backslash in your command.
import subprocess
command = "echo 'some \\\\here'|sed 's;\\\\\\\\;\\\\;'"
output = subprocess.check_output(command, shell=True).decode()
print(output)  # prints your expected result: some \here
After re-reading your question I kinda understood what you wanted.
a = r'some \here'
print(a) #some \here
Again, raw string literals...
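A small sketch of what is probably going on with the "double" backslash: in Python source, '\\' is one backslash character, and repr() (which the interactive prompt uses) shows it doubled again. If a string really does contain two backslashes, str.replace can collapse them without any call to sed. The variable names are illustrative only.

```python
one = '\\'            # escaped in source, but a single character in memory
print(len(one))       # 1

t = 'some \\\\ here'  # this string really contains two backslashes
fixed = t.replace('\\\\', '\\')

print(repr(fixed))    # repr() re-escapes the backslash for display
print(fixed)          # print() shows the actual characters
```

The value stored in fixed is the same whichever way it is displayed; only the representation differs.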
I'm using subprocess to call a program within python and I'm passing a string to it, which can contain quotation marks.
This is the piece of code that is giving me troubles
import subprocess
text = subprocess.Popen("""awk 'BEGIN { print "%s"}' | my_program """ % sentence, stdout=subprocess.PIPE, shell=True)
When sentence = "I'm doing this" I get the following error message
/bin/sh: -c: line 0: unexpected EOF while looking for matching `"'
/bin/sh: -c: line 1: syntax error: unexpected end of file
I guess this has to do with the way quotes are escaped in python and linux. Is there a way to fix it?
You're confusing awk and the underlying shell because there's a single quote inside your single-quoted awk expression. The first part is equivalent to:
awk 'BEGIN { print "I'm doing this"}'
Which is incorrect, even in pure shell.
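Quick fix: escape the single quotes in your sentence. Inside a single-quoted shell string a backslash does not escape a quote, so each ' has to be written as '\'' (close the quoted string, add an escaped quote, reopen it):
text = subprocess.Popen("""awk 'BEGIN { print "%s"}' | my_program """ % sentence.replace("'", "'\\''"), stdout=subprocess.PIPE, shell=True)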
Proper fix: don't use awk at all just to print something, just feed input to your subprocess:
text = subprocess.Popen(my_program, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
output,error = text.communicate(sentence.encode())
(and you can get rid of the shell=True in the process)
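A minimal runnable sketch of the stdin approach. Since my_program is not shown in the question, a child Python process that upper-cases its input stands in for it here; that stand-in and the helper name run_with_input are assumptions for illustration.

```python
import subprocess
import sys

# Stand-in for my_program (hypothetical): a child Python that upper-cases stdin.
my_program = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def run_with_input(sentence):
    """Send sentence to the child's stdin and return its decoded stdout."""
    proc = subprocess.Popen(my_program, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    output, _ = proc.communicate(sentence.encode())
    return output.decode()


# Both kinds of quotes pass through untouched because no shell parses them.
result = run_with_input("I'm doing \"this\"")
print(result)
```

No escaping is needed at all in this form, which is the main reason to prefer it over building a shell string.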
Last point: you seem to have trouble because my_program is some program plus arguments. To pass a command such as aspell -a you can use a single string (on POSIX this form needs shell=True):
my_program = "aspell -a"
or:
my_program = ['aspell','-a']
but not
my_program = ['aspell -a']
which is probably what you've done here, so Python tries to literally execute the program "aspell -a" instead of splitting into program + argument.
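If the command starts out as a single string, the standard library's shlex.split can turn it into the list form using shell-like rules. A short sketch (the grep line is just an illustrative input, nothing is executed):

```python
import shlex

command_line = "aspell -a"
args = shlex.split(command_line)
print(args)

# Quoted parts stay together as one argument, as a shell would treat them.
quoted = shlex.split("grep 'two words' notes.txt")
print(quoted)
```

The resulting list can be passed straight to subprocess.Popen without shell=True.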
I am trying to execute the following command in python using plumbum:
sort -u -f -t$'\t' -k1,1 file1 > file2
However, I am having issues passing the -t$'\t' argument. Here is my code:
from plumbum.cmd import sort
separator = r"-t$'\t'"
print separator
cmd = (sort["-u", "-f", separator, "-k1,1", "file1"]) > "file2"
print cmd
print cmd()
I can see problems right away after print separator and print cmd() executes:
-t$'\t'
/usr/bin/sort -u -f "-t\$'\\t'" -k1,1 file1 > file2
The argument is wrapped in double quotes.
An extra \ before $ and \t is inserted.
How should I pass this argument to plumbum?
You may have stumbled into limitations of the command line escaping.
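I could make it work using the subprocess module, passing a real tab character literally. The > redirection belongs to the shell, so the output file is opened in Python and passed as stdout instead:
import subprocess
with open("file2", "w") as fw:
    p = subprocess.Popen(["sort", "-u", "-f", "-t\t", "-k1,1", "file1"], stdout=fw)
    p.wait()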
Also, a short pure-Python solution that does what you want:
with open("file1") as fr, open("file2", "w") as fw:
    fw.writelines(sorted(set(fr), key=lambda x: x.split("\t")[0]))
The pure-Python solution doesn't handle uniqueness exactly the way sort does. If two lines have the same first field but different second fields, sort keeps only one of them, whereas the set keeps both.
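A sketch of a variant that is closer to sort -u -f -k1,1 in this respect: it keeps one line per first field, compared case-insensitively. It assumes the first line seen for each key is the one to keep and uses plain Python string ordering, which may not match sort's locale-dependent order. The function name is made up for this example.

```python
def unique_by_first_field(lines, sep="\t"):
    """Keep the first line seen for each case-folded first field, sorted by that field."""
    first_seen = {}
    for line in lines:
        key = line.rstrip("\n").split(sep)[0].upper()
        first_seen.setdefault(key, line)
    return [first_seen[key] for key in sorted(first_seen)]


sample = ["b\t1\n", "a\t1\n", "A\t2\n", "b\t2\n"]
result = unique_by_first_field(sample)
print(result)
```

With real files the call would be fw.writelines(unique_by_first_field(fr)) inside the same with block as before.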
EDIT (I had not checked this myself, but you just confirmed that it works): just tweak your plumbum code with:
separator = "-t\t"
Of the three options, I'd still recommend the pure-Python solution, since it doesn't involve an external process and is therefore more Pythonic and portable.
I'm trying to write an svn pre-commit hook in python. Part of this involves checking the diff file to see if there are any actual file changes (as opposed to just property changes).
I have a working grep command which I can execute fine on the shell
grep "^\(Added: \|Modified: \|Deleted: \)" diff filename | grep -v 'svn:'
However, when I put it through subprocess.Popen, it escapes all my backslashes, which knackers the regexp.
Executing command: ['grep', '"^\\Added: \\|Modified: \\|Deleted: \\)", ...]
How do I avoid this?
NB: I'm aware that I can pipe results between subprocesses and I can do the two greps that way. I need help getting the first one working first though :/
NB2: I also tried using filterdiff --clean instead and couldn't get it to work. Searching for Added, Modified or Deleted lines, removing those with 'svn:' in and checking I had some results seemed to work though.
Python code:
command = ['grep', '"^\(Added: \|Modified: \|Deleted: \)"', filename]
sys.stdout.write('Executing command: %s\n' % (command))
p = subprocess.Popen(command,
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     shell=True)
data = p.stdout.read()
if len(data) == 0:
    sys.stdout.write("Diff does not contain any file modifications.\n")
    exit(0)
You need to consider what you want grep to see in its command line arguments.
The first argument needs to be the literal string "^\(Added: \|Modified: \|Deleted: \)", so that means that it shouldn't include the double quotes but should include the backslashes.
The way to express this kind of string is to use Python raw strings:
command = ['grep', r'^\(Added: \|Modified: \|Deleted: \)', filename]
A good way to check what you're actually running is to replace grep by echo so you can at least see what you're passing to the command.
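The same check can be sketched without relying on echo: a tiny child Python process reports the arguments it actually received. The helper show_argv and the file name diff.txt are hypothetical. The doubled backslashes in the printed list from the question are only how Python displays a string; the child still receives single backslashes.

```python
import json
import subprocess
import sys


def show_argv(args):
    """Run a child Python that reports the arguments it received, as a list."""
    child = [sys.executable, "-c", "import json, sys; print(json.dumps(sys.argv[1:]))"]
    out = subprocess.check_output(child + list(args))
    return json.loads(out.decode())


pattern = r'^\(Added: \|Modified: \|Deleted: \)'
received = show_argv([pattern, "diff.txt"])

print(repr(pattern))   # repr() doubles the backslashes for display only
print(received[0])     # the child got the pattern with single backslashes
```

Note that the list is passed without shell=True, so no shell gets a chance to reinterpret the pattern.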