subprocess call of sed command giving error

I have a text file that contains the following line:
PIXEL_SCALE 1.0 # size of pixel in arc
To replace 1.0 with 0.3, I tried to run sed via subprocess.call from a Python script.
The following sed command works perfectly from the shell:
sed -i 's/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/' filename.txt
But the equivalent subprocess.call command gives me the following error.
subprocess.call(['sed','-i',"'s/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/'",'filename.txt'])
sed: -e expression #1, char 1: unknown command: `''
I tried converting the string to a raw string by prefixing it with r, and I also tried .encode("UTF-8"), but neither had any effect.
What could be going wrong here?
Thanks

' quotes are delimiters used by the shell. As you do not use a shell, you don't need them around your regular expression:
subprocess.call(['sed','-i',r"s/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/",'filename.txt'])
Two things changed compared with your call: the inner single quotes are gone, and the string now has an r prefix. The raw string (r"....") prevents Python from interpreting the backslash-escaped sequences.
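To see what the shell does with those quotes, here is a minimal sketch using the standard shlex module, which splits a command line roughly the way a POSIX shell would. The variable names shell_line and argv are illustrative only.

```python
import shlex

# The command exactly as it would be typed at the shell prompt.
shell_line = r"sed -i 's/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/' filename.txt"

# shlex.split mimics POSIX word splitting and quote removal.
argv = shlex.split(shell_line)

# Each element is what sed receives as one argument; the quotes are gone.
for arg in argv:
    print(repr(arg))
```

The third element is the sed script without surrounding single quotes, which is the form subprocess.call expects when it is given a list.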

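subprocess.call(r"sed -i 's/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/' filename.txt", shell=True)
This works because, with shell=True, the shell strips the single quotes before sed sees the script. The r prefix is needed so that Python does not turn \10 into a backspace character.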
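If calling sed is not a requirement, the same substitution can be sketched with Python's re module. This is an alternative under the assumption that the line looks like the one in the question; the names line, pattern and updated are made up for the example. Note that the replacement uses \g<1> because r"\10.3" would be read by re as a reference to group 10.

```python
import re

# Sample line taken from the question.
line = "PIXEL_SCALE 1.0 # size of pixel in arc\n"

# Bare parentheses form the capture group in Python's re syntax.
pattern = re.compile(r"^(PIXEL_SCALE\s*)[0-9]*\.[0-9]*", flags=re.M)

# \g<1> keeps the group reference separate from the digits that follow it.
updated = pattern.sub(r"\g<1>0.3", line)
print(updated, end="")
```

To update a real file this way, the text would have to be read, substituted, and written back, which is left out here.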

's/(PIXEL_SCALE\s*)[0-9]+[0-9]+/\10.3/'

Related

Sed command in python

My input is as
Type combinational function (A B)
Want output to be
Type combinational
function (A B)
I used this command and it works:
sed 's/\([^ ]* [^ ]*\) \(function.*\)/\1\n\2/' Input_file
When I run the same command inside a Python script using os.system or subprocess, it gives me an error.
How can I execute this sed command inside a Python script? Or how can I write Python code that does the same thing?
Python code used:
cmd='''
sed 's/\([^ ]* [^ ]*\) \(function.*\)/\1\n\2/' Input_file
'''
subprocess.check_output(cmd, shell=True)
Error is
sed: -e expression #1, char 34: unterminated `s' command

The \n in the string is being substituted by Python into a literal newline. As suggested by @bereal in a comment, you can avoid that by using r'''...''' instead of '''...''' around the script. A much better solution, though, is to avoid doing in sed what Python already does very well all by itself.
with open('Input_file') as inputfile:
    lines = inputfile.read()
lines = lines.replace(' function', '\nfunction')
This is slightly less strict than your current sed script, in that it doesn't require exactly two space-separated tokens before the function marker. If you want to be strict, try re.sub() instead.
import re
# ...
# flags must be passed by keyword; the fourth positional argument of re.sub is count
lines = re.sub(r'^(\S+\s+\S+)\s+(function)', r'\1\n\2', lines, flags=re.M)
(Tangentially, you also want to avoid the unnecessary shell=True; perhaps see Actual meaning of 'shell=True' in subprocess)
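As a self-contained sketch of the re.sub() approach, the snippet below runs on an in-memory string instead of Input_file. The second input line is invented for illustration, to show the effect of re.M on more than one line.

```python
import re

# First line is from the question; the second line is a made-up extra sample.
text = "Type combinational function (A B)\nType sequential function (C D)\n"

# flags is passed by keyword because the fourth positional argument is count.
result = re.sub(r'^(\S+\s+\S+)\s+(function)', r'\1\n\2', text, flags=re.M)
print(result, end="")
```

Each input line is split in front of the word function, matching the output the question asks for.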

Although solutions 1 and 2 are the shortest valid ways to get your code running (on Unix), I'd like to add some remarks:
a. os.system() has some issues and is better replaced by subprocess.call() with shell=False, passing the command as a list of arguments rather than as a single string. Whether you use os.system or subprocess.call, going through a shell (shell=True) implies a security risk.
b. Since sed (and awk) rely heavily on regular expressions, it is recommended, for maintainability, to use native Python code instead. In this case, use re, the regular expression module, which has an optimized regular expression implementation.

Trying to convert Sed command to Python re.sub

I am trying to convert the below Sed command to python re.sub. The Sed command is basically extracting the access_token value from the json string.
finalString=$(echo $initialString | sed -e 's/^.*"access_token":"\([^"]*\)".*$/\1/')
In my Python code, I am stuck on the \1 part. I need to replace the whole string with the captured value:
access_token = re.sub('^.*"access_token":"\([^"]*\)".*$',r'\1',initialString)
print access_token
My working echo statement is as follows. When I run it, I get the access_token value. For example, if initialString='{"access_token":"xyz"}', the output will be xyz.
echo $initialString | sed -e 's/^.*"access_token":"\([^"]*\)".*$/\1/'

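In general, you should make it a rule to always use raw strings for regular expressions in Python. (In this specific case it doesn't matter, but it's a good rule of thumb.) Also note that sed's basic regular expressions write capture groups as \( \), whereas Python's re module uses bare parentheses; \( in Python matches a literal parenthesis.
Try this: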
access_token = re.sub(r'^.*"access_token":"([^"]*)".*$', r'\1', initialString)
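A quick way to check this is the sketch below, which runs the substitution on a sample string. The expires_in field is invented for the example. Since the input is JSON, the snippet also shows json.loads as an alternative, assuming the string is a complete, valid JSON object.

```python
import json
import re

# Sample input; the expires_in field is a made-up addition.
initial_string = '{"access_token":"xyz","expires_in":3600}'

# Regex route: bare parentheses capture the token value.
token_via_regex = re.sub(r'^.*"access_token":"([^"]*)".*$', r'\1', initial_string)

# JSON route: parse the object and read the key directly.
token_via_json = json.loads(initial_string)["access_token"]

print(token_via_regex, token_via_json)
```

The JSON route does not depend on the exact spacing or key order of the input, which the regular expression does.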

I'm working on the assumption that your initialString is something along the lines of: "other":"json","access_token":"(1234)","more":"json"
access_token = re.sub(r'^.*"access_token":"\(([^\)]*)\)".*$',r'\1',initialString)
The problem I noticed was that you were never actually capturing any characters to reference with \1.

python multiline command running from bash

I'm trying to run this:
python -c "for i in range(10):\n print i"
but I get an error:
File "<string>", line 1
for i in range(10):\n print i
^
SyntaxError: unexpected character after line continuation character
According to this, I assumed that bash would process the command-line arguments (namely, the newline symbol), but the error shows the opposite. Where am I wrong, and why does this happen?
P.S. python-2.7
EDIT
Let me explain my motivation a bit.
This code example is definitely pretty silly. Since the doc says that "command can be one or more statements separated by newlines, with significant leading whitespace as in normal module code", I was interested in how I should pass those newlines to the command properly.
The proposed solutions here are:
Use ; to separate several commands inside the loop. Yes, that works, but it is still a one-liner, and I cannot use it if I want to run some commands after the loop. ; is not a replacement for a newline.
Type ^M where a newline is needed. This hits the goal more precisely, but in my view it ruins the whole idea of running Python code from the command line because it requires interactive mode. As I understand it, it is the same as entering a command and hitting the Enter key, so it is no different from typing python and working in its shell. That said, I cannot write this in a bash script. Or can I?
Probably the question really should have been split into two:
Bash escaping:
Enclosing characters in double quotes (‘"’) preserves the literal value of all characters within the quotes, with the exception of ‘$’, ‘`’, ‘\’, and, when history expansion is enabled, ‘!’. The characters ‘$’ and ‘`’ retain their special meaning within double quotes (see Shell Expansions). The backslash retains its special meaning only when followed by one of the following characters: ‘$’, ‘`’, ‘"’, ‘\’, or newline.
How does this correspond to the case described? How does bash handle newlines? I found that putting the command into single quotes makes no difference.
How to pass a newline to python in a non-interactive way. (You may say -- why don't you write an ordinary python file with all newlines you want -- you are right but I'm interested in what is exactly meant in the documentation since it quotes newline)

You actually would need to transform the \n part into an actual newline. That can be done with the $'' syntax:
python -c $'for i in range(10):\n print i'
0
1
2
3
4
5
6
7
8
9
You can also reach that result with echo -e or printf
$ python -c "$(echo -e "for i in range(10):\n print i")"
You could also use a here string:
$ python <<< "$(echo -e "for i in range(10):\n print i")"
See section 3.1.2.4 ANSI-C Quoting of the Bash Manpage for more information.
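What all of these variants have in common is that the argument reaching python -c contains a real newline character rather than a backslash followed by n. As a rough illustration from the Python side (not a bash technique), the sketch below builds such an argument and passes it to a child interpreter. It assumes Python 3.7 or later, hence print(i) with parentheses, and uses a statement after the loop to show that a newline can do what ; cannot.

```python
import subprocess
import sys

# In a normal (non-raw) Python literal, \n is already a real newline.
code = "for i in range(3):\n    print(i)\nprint('done')"

# No shell is involved, so the argument reaches the child unchanged.
completed = subprocess.run(
    [sys.executable, "-c", code],
    capture_output=True,
    text=True,
    check=True,
)
print(completed.stdout, end="")
```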

Remove the \n:
python -c "for i in range(10): print i"
Or
You can use ; to put several statements in the body of the for loop:
python -c "for i in range(10): print '1st newline';print '2nd newline';print i"

You can run a multi-line python -c statement by adding CR characters in your line:
python -c "for i in range(10):^M print (i)^M print ('Hello:' + str(i*i))"
where ^M is not actually ^ followed by M, it is actually the character you get when you type [CTRL-v][CTRL-m]. Notice the space after this character, which means there are two print statements in the for loop, and it should print:
0
Hello:0
1
Hello:1
....
9
Hello:81
You can do this in a bash script too:
#!/bin/bash
A="python -c \"for i in range(10):^M print (i)^M print ('Hello:' + str(i*i))\""
eval $A

call awk from inside python generate error

I have to run awk from Python. When I run the command from the terminal it gives the desired output, but it shows an error when executed from inside Python.
runAwk = '''awk '{printf $1}{for(i=2;i<=NF;i++)printf "|"$i}{printf "\n"}' final.txt'''
os.system(runAwk)
gives the error:
awk: line 1: runaway string constant " ...
When I searched the web, I read that awk cannot be used with the os module, and there is not much material on this. I am confused about how to proceed.

The \n in your runAwk string is being interpreted by Python as a literal newline character, rather than being passed through to awk as the two characters \ and n. If you use a raw string instead, by preceding the opening triple-quotes with an r:
runAwk = r'''awk '{printf $1}{for(i=2;i<=NF;i++)printf "|"$i}{printf "\n"}' final.txt'''
... then Python won't treat \n as meaning "newline", and awk will see the string you intended.
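The difference is easy to inspect with repr(). The sketch below uses only the fragment of the awk program that contains the escape; the names plain and raw are illustrative.

```python
# Non-raw literal: Python converts \n into a single newline character.
plain = '''{printf "\n"}'''

# Raw literal: the backslash and the n are kept as two characters.
raw = r'''{printf "\n"}'''

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

The raw version is one character longer, and it is the one awk can parse, because the string constant no longer contains a line break.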

Sed with Python/subprocess.call: How can I make python execute this call to sed

The working sed I run from the shell is:
sed -re 's/(::\s+ni\s+=)[^=]*$/\1 512/' test.dat
However, I cannot get it to run with Python's subprocess.call:
I have the following:
infile = 'test.dat'
cmd= [
"sed",
"-re",
"s/(::\s+ni\s+=)[^=]*$/\1 512/",
infile
]
subprocess.call(cmd, stdout=open('out_test.dat','w'))
I tried many different ways but I always get a non-zero exit status.

The problem is that the Python string "s/(::\s+ni\s+=)[^=]*$/\1 512/" contains a control-A where you wanted a backslash and a 1. Whenever you're writing regular expressions as string literals, you want to use raw strings if possible, or escape the backslashes if not. So, just change that line to:
r"s/(::\s+ni\s+=)[^=]*$/\1 512/",
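The effect of the missing r prefix can be checked directly, as in the sketch below. It also shows the same substitution done with re.sub instead of sed, on a sample line that is invented here since the contents of test.dat are not shown.

```python
import re

# Without the r prefix, \1 becomes the control character with code 1.
plain = "\1 512"
raw = r"\1 512"
print(repr(plain))
print(repr(raw))

# Hypothetical sample line; the real test.dat format is an assumption.
sample = "  :: ni = 128"

# Same pattern as the sed script, applied with Python's re module.
updated = re.sub(r"(::\s+ni\s+=)[^=]*$", r"\1 512", sample)
print(updated)
```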
