Trying to convert Sed command to Python re.sub - python

I am trying to convert the below Sed command to python re.sub. The Sed command is basically extracting the access_token value from the json string.
finalString=$(echo $initialString | sed -e 's/^.*"access_token":"\([^"]*\)".*$/\1/')
In my Python code, I am stuck on the \1 part: I need to replace the whole string with the captured value.
access_token = re.sub('^.*"access_token":"\([^"]*\)".*$',r'\1',initialString)
print access_token
My working echo statement is as follows. When I run it, I get the access_token value. For example, if initialString='{"access_token":"xyz"}', the output is xyz.
echo $initialString | sed -e 's/^.*"access_token":"\([^"]*\)".*$/\1/'

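In general, make it a rule to always write regular expressions as raw strings in Python. (In this specific case it doesn't matter, but it's a good rule of thumb.)
Try this, with the group parentheses unescaped (in Python's re, \( matches a literal parenthesis, whereas in sed's basic syntax \( opens a group):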
access_token = re.sub(r'^.*"access_token":"([^"]*)".*$', r'\1', initialString)
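To compare the options side by side, here is a minimal sketch. The sample string is an assumption (only the access_token key comes from the question; the other keys are made up), and re.search and json.loads are alternatives I am suggesting, not something the question asked for.

```python
import json
import re

# Hypothetical input; only "access_token" is taken from the question.
initial_string = '{"token_type":"bearer","access_token":"xyz","expires_in":3600}'

# Same idea as the sed command: match the whole string, keep only group 1.
via_sub = re.sub(r'^.*"access_token":"([^"]*)".*$', r'\1', initial_string)

# re.search does not rewrite the string and makes "no match" explicit.
match = re.search(r'"access_token":"([^"]*)"', initial_string)
via_search = match.group(1) if match else None

# If the input is valid JSON, parsing it is more robust than a regex.
via_json = json.loads(initial_string).get("access_token")

print(via_sub, via_search, via_json)
```

Note that re.sub returns the input unchanged when the pattern does not match, so the re.search or json variants are easier to check for failure.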

I'm working on the assumption that your initialString is something along the lines of: "other":"json","access_token":"(1234)","more":"json"
access_token = re.sub(r'^.*"access_token":"\(([^\)]*)\)".*$',r'\1',initialString)
The problem I noticed was that you were never actually capturing any characters to reference with \1.

Related

Why do I get an empty last line when using os.popen

I am using this Python (Python 3) code to get the list of all topics:
more test.py
list = os.popen(" kafka-topics.sh --zookeeper zoo_server:2181 --list | sed '/^[[:space:]]*$/d' ").read().split('\n')
print (list)
When I run the Python script, I notice that the end of the output looks like this:
..., 'topic32', 'topic33', 'topic34', '']
So the last element is an empty string: there is no topic name between the final pair of single quotes.
This is strange, because I pipe the output through sed '/^[[:space:]]*$/d' to delete empty lines, and there are indeed no empty lines when I run the command directly:
kafka-topics.sh --zookeeper zoo_server:2181 --list | sed '/^[[:space:]]*$/d'
Any hint as to what is wrong with my Python line?
For example, when I run the command below, I get:
kafka-topics.sh --zookeeper zoo_server:2181 --list | sed '/^[[:space:]]*$/d'
topic1
topic2
topic3
.
.
.
This doesn't really have anything to do with popen. The string that read() returns ends with a linefeed, and splitting on '\n' leaves an empty string after that final linefeed.
>>> "foo\nbar\n".split('\n')
['foo', 'bar', '']
If you don't want the empty string that follows the last linefeed, strip the final linefeed first.
list = os.popen(...).read().rstrip('\n').split('\n')
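A small sketch of the behaviour described in this answer, using a hard-coded string as a stand-in for what os.popen(...).read() returns (the topic names here are placeholders, and splitlines() is an alternative I am adding, not part of the original answer):

```python
# Stand-in for os.popen(...).read(); the real command output is assumed
# to be newline-terminated, like most command-line tools.
output = "topic1\ntopic2\ntopic3\n"

# Splitting on '\n' keeps the empty string after the final newline.
naive = output.split("\n")

# Stripping the trailing newline first removes that empty element.
stripped = output.rstrip("\n").split("\n")

# str.splitlines() does not produce a trailing empty element at all.
by_lines = output.splitlines()

# Filtering also drops blank lines, which makes the sed step unnecessary.
non_blank = [line for line in output.splitlines() if line.strip()]

print(naive)
print(stripped)
```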
Regarding "I am using this Python (Python 3) code, in order to get the list of all topics":
Why don't you use a Python Kafka client and ask it to list the topics?

Sed command in python

My input is as
Type combinational function (A B)
Want output to be
Type combinational
function (A B)
I used this command, and it works:
sed 's/\([^ ]* [^ ]*\) \(function.*\)/\1\n\2/' Input_file
When I use this command inside a Python script, via os.system or subprocess, it gives me an error.
How can I execute this sed command inside a Python script? Or how can I write Python code that does the same thing?
The Python code I used:
cmd='''
sed 's/\([^ ]* [^ ]*\) \(function.*\)/\1\n\2/' Input_file
'''
subprocess.check_output(cmd, shell=True)
Error is
sed: -e expression #1, char 34: unterminated `s' command
The \n in the string is being turned by Python into a literal newline before sed ever sees it. As suggested by @bereal in a comment, you can avoid that by using r'''...''' instead of '''...''' around the script; but a much better solution is to avoid doing in sed what Python already does very well all by itself.
with open('Input_file') as inputfile:
    lines = inputfile.read()
lines = lines.replace(' function', '\nfunction')
This is slightly less strict than your current sed script, in that it doesn't require exactly two space-separated tokens before the function marker. If you want to be strict, try re.sub() instead.
import re
# ...
# Pass flags by keyword: the fourth positional argument of re.sub() is count.
lines = re.sub(r'^(\S+\s+\S+)\s+(function)', r'\1\n\2', lines, flags=re.M)
(Tangentially, you also want to avoid the unnecessary shell=True; perhaps see Actual meaning of 'shell=True' in subprocess)
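The stricter re.sub() variant can be tried on the sample input without touching any file. This is only a sketch: the function name split_before_function is made up for illustration, and the second input line is a hypothetical addition to show that re.M makes ^ match at the start of every line.

```python
import re

def split_before_function(text):
    """Break each line before 'function' when exactly two tokens precede it."""
    # flags=re.M lets ^ match at the start of every line, not just the first.
    return re.sub(r'^(\S+\s+\S+)\s+(function)', r'\1\n\2', text, flags=re.M)

# First line is from the question; the second is a hypothetical extra line.
sample = "Type combinational function (A B)\nType sequential function (C D)\n"
result = split_before_function(sample)
print(result)
```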
Although the solutions above are the shortest way to get your code running (on Unix), I'd like to add some remarks:
a. os.system() always goes through the shell and should be replaced by subprocess.call() with an argument list and shell=False. Whether you use os.system or subprocess.call, running the command through a shell (shell=True) is a security risk if any part of it comes from untrusted input.
b. Since sed (and awk) rely heavily on regular expressions, it is recommended, for maintainability, to use native Python code instead. In this case use re, Python's regular expression module.

subprocess call of sed command giving error

I have a text file which contains the following line
PIXEL_SCALE 1.0 # size of pixel in arc
To replace 1.0 in it with 0.3,
I tried to use sed via subprocess.call from python script.
Following sed regex command works perfectly from shell.
sed -i 's/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/' filename.txt
But the equivalent subprocess.call command gives me the following error.
subprocess.call(['sed','-i',"'s/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/'",'filename.txt'])
sed: -e expression #1, char 1: unknown command: `''
I tried converting the string to raw string by prefixing string with r and also tried .encode("UTF-8"). But they didn't have any effect.
What could be going wrong here?
Thanks
Single quotes (') are delimiters interpreted by the shell. As you do not use a shell, you don't need them around your regular expression:
subprocess.call(['sed','-i',r"s/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/",'filename.txt'])
#                           ^^                                              ^
In addition, I used a raw string (r"....") to prevent interpretation of the backslash-escaped sequences by python.
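subprocess.call(r"sed -i 's/^\(PIXEL_SCALE\s*\)\([0-9]*\.[0-9]*\)/\10.3/' filename.txt", shell=True)
That works, provided the command is a raw string; otherwise Python turns \10 into a backspace character before the shell sees it.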
's/(PIXEL_SCALE\s*)[0-9]+[0-9]+/\10.3/'
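For comparison, the same substitution can be sketched in pure Python, with no sed and no shell involved. The function name set_pixel_scale is hypothetical, and the sketch works on a string rather than editing filename.txt in place; reading and writing the file is left out.

```python
import re

def set_pixel_scale(text, new_value):
    """Replace the number after PIXEL_SCALE with new_value (a plain numeric string)."""
    # In a Python replacement, r'\10.3' would mean "group 10" followed by ".3",
    # so the unambiguous form \g<1> is used for group 1.
    return re.sub(r'^(PIXEL_SCALE\s*)[0-9]*\.[0-9]*',
                  r'\g<1>' + new_value, text, flags=re.M)

# Sample line taken from the question.
sample = "PIXEL_SCALE 1.0 # size of pixel in arc\n"
updated = set_pixel_scale(sample, "0.3")
print(updated, end="")
```

Unlike sed, where \10 is read as \1 followed by a literal 0, Python's re treats \10 in a replacement as a reference to group 10, which is why \g<1> is needed here.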

Sed with Python/subprocess.call: How can I make python execute this call to sed

The working sed I run from the shell is:
sed -re 's/(::\s+ni\s+=)[^=]*$/\1 512/' test.dat
However, I cannot get it to run with Python's subprocess.call:
I have the following:
infile = 'test.dat'
cmd= [
"sed",
"-re",
"s/(::\s+ni\s+=)[^=]*$/\1 512/",
infile
]
subprocess.call(cmd, stdout=open('out_test.dat','w'))
I tried many different ways but I always get a non-zero exit status.
The problem is that the Python string "s/(::\s+ni\s+=)[^=]*$/\1 512/" contains a control-A where you wanted a backslash and a 1. Whenever you're writing regular expressions as string literals, you want to use raw strings if possible, or escape the backslashes if not. So, just change that line to:
r"s/(::\s+ni\s+=)[^=]*$/\1 512/",
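The difference between the two literals is easy to check directly. The sketch below also applies the same pattern with Python's own re.sub; the input line is a made-up example of what test.dat might contain, not something from the question.

```python
import re

plain = "\1 512"    # "\1" is an octal escape: a single control-A character
raw = r"\1 512"     # raw string: a backslash followed by the digit 1

print(len(plain), len(raw))
print(repr(plain[0]), repr(raw[:2]))

# Hypothetical line from test.dat (no trailing newline, as sed would see it).
line = "integer :: ni = 128"
patched = re.sub(r'(::\s+ni\s+=)[^=]*$', r'\1 512', line)
print(patched)
```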

symbols in command line argument.. python, bash

I am writing a Python script on Linux that posts to Twitter using the API. Is it possible to pass symbols like "(" and ")" in clear text, without quotes?
% ./twitterupdate this is me #works fine
% ./twitterupdate this is bad :(( #this gives an error in bash.
Is the only alternative to enclose the text in "" like this?
% ./twitterupdate "this is bad :((" #this will reduce the ease of use for the script
Is there any workaround?
Yes, quoting the string is the only way. Bash has its own syntax, and some characters have special meaning. By the way, double quotes are not always enough; use single quotes instead. Some characters are still interpreted inside double quotes:
$ echo "lots of $$"
lots of 15570
$ echo 'lots of $$'
lots of $$
http://www.gnu.org/software/bash/manual/bashref.html#Quoting
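On the Python side, the script only ever sees the arguments after bash has parsed them, so nothing in the script can remove the need for quoting. A minimal sketch of the two relevant pieces follows; the helper name message_from_argv is hypothetical, and shlex is used here only to illustrate shell-style quoting, it is not something the question mentions.

```python
import shlex

def message_from_argv(argv):
    """Join all arguments after the script name into one message."""
    return " ".join(argv[1:])

# Unquoted words arrive as separate arguments and can simply be joined.
joined = message_from_argv(["./twitterupdate", "this", "is", "me"])

# shlex.quote shows how a string must be quoted to survive shell parsing.
text = "this is bad :(("
quoted = shlex.quote(text)

# shlex.split mimics the shell's word splitting on the quoted command line.
parsed = shlex.split("./twitterupdate " + quoted)

print(joined)
print(quoted)
print(parsed)
```

In a real script, message_from_argv(sys.argv) would let users skip the quotes whenever the text contains no characters that are special to the shell.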
