passing text from python to shell | unicode | applying cut on it - python

I have a python script that essentially parses an xml file, uses the package re and prints text as follows:
string = str(search_compiled.groups(0)[0].encode('utf-8')) + "%" + str(text.encode('utf-8'))
print string
I receive the text in the shell script as follows:
string="$($file.py $arg1 $arg2 $arg3)"
varA="$(echo "$string" | cut -d'%' -f1)"
varB="$(echo "$string" | cut -d'%' -f2)"
echo "$string"
So, in summary, I need the passed string to be cut into two by the delimiter '%' and store the results in varA and varB.
The splitting does not happen.
string shows the entire thingy: part A plus the part B. Here's the catch, the '%' I added in the python script does not get printed though.
Could anyone please help me in understanding what is going wrong?

You can use the pipe and cut commands as you have in the question, but without the quotes around the delimiter character: use -d% instead of -d'%'.
varA=$(echo $string | cut -f1 -d%)
varB=$(echo $string | cut -f2 -d%)

[root@test /tmp]$ eval `echo "aaa%bbb%ccc" | awk -F '%' '{print "a="$1" b="$2}'`
[root@test /tmp]$ echo $a
aaa
[root@test /tmp]$ echo $b
bbb
Explanation
awk -F '%' '{print "a="$1" b="$2}' splits the input on '%' and prints a=aaa b=bbb.
eval a=aaa b=bbb is then equivalent to typing the following at the terminal:
$ a=aaa
$ b=bbb

I re-read this for a 3rd time, and I think this is the basic problem (from your description):
string shows the entire thingy: part A plus the part B. Here's the catch, the '%' I added in the python script does not get printed though.
The conversion of data to utf-8 then back to string seems suspect to me. Can you change the string creation line in your python program to this:
string = u'{}%{}'.format(search_compiled.groups(0)[0].encode('utf-8'), text.encode('utf-8'))
You might be double encoding, so this could be what you need:
string = u'{}%{}'.format(search_compiled.groups(0)[0], text)
Add this in the shell script before it calls the python script:
export PYTHONIOENCODING=UTF-8
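As a minimal sketch of the double-encoding concern, assuming Python 3 (the question's code is Python 2) and using the hypothetical names group_text and body_text as stand-ins for search_compiled.groups(0)[0] and text: build the line from text values and let the output layer do the encoding. It also shows what wrapping encoded bytes in str() produces on Python 3.

```python
# Python 3 sketch. group_text and body_text are hypothetical stand-ins for
# search_compiled.groups(0)[0] and text in the question.

def build_line(group_text: str, body_text: str, sep: str = "%") -> str:
    """Join two text values with a delimiter, without encoding them first."""
    return f"{group_text}{sep}{body_text}"

# On Python 3, str() applied to bytes returns the repr of the bytes object,
# not the decoded text, so the shell would receive b'...' wrappers.
wrapped = str("é".encode("utf-8"))

line = build_line("partA é", "partB")
# partition() splits on the first delimiter only, like cut -f1 / the rest.
first, _, second = line.partition("%")

if __name__ == "__main__":
    # With PYTHONIOENCODING=UTF-8 exported, print() encodes this as UTF-8.
    print(line)
```

If the delimiter can also occur inside the data, a character that cannot appear there (a tab, for example) is a safer choice than '%'.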

Related

echo not printing \033 correctly in pipeline started by os.system()

In bash (as started by Python), I want to print this string \033[31m so that I can use a pipe | operator after it, followed by a command to copy that string to the clipboard. This means that in practice, I'm trying to run something like:
os.system('echo \\033[31m | xsel -ib')
...but the xsel -ib part is working fine, so this question is focused specifically on the behavior of echo.
Most of my attempts have been similar to:
echo -e \\033[31m
I have tried it with single quotes, double quotes, no quotes, removing the -e flag, etc. The closest I got was:
echo -n "\\ 033[31m"
which prints this string \ 033[31m
I don't want that space between \ and 0
-n flag is used to not append a new line after the printed string
I use Ubuntu 20.04, and xsel is a selection and clipboard manipulation tool for the X11 Window System (which Ubuntu 20.04 uses).
echo is the wrong tool for the job. It's a shell builtin, and one for which the POSIX sh standard explicitly does not guarantee portable behavior for when escape sequences (such as \033) are present. system() starts /bin/sh instead of bash, so POSIX behavior -- not that of your regular interactive shell -- is expected.
Use subprocess.run() instead of os.system(), and you don't need echo in the first place.
If you want to put an escape sequence into the clipboard (so not \033 but instead the ESC key that this gets converted to by an echo with XSI extensions to POSIX):
# to store \033 as a single escape character, use a regular Python bytestring
subprocess.run(['xsel', '-ib'], input=b'\033[31m')
If you want to put the literal text without being interpreted (so there's an actual backslash and an actual zero), use a raw bytestring instead:
# to store \033 as four separate characters, use a raw string
subprocess.run(['xsel', '-ib'], input=rb'\033[31m')
For a more detailed description of why echo causes problems in this context, see the excellent answer by Stephane to the Unix & Linux Stack Exchange question Why is printf better than echo?.
If you for some reason do want to keep using a shell pipeline, switch to printf instead:
# to store \033 as four separate characters, use %s
subprocess.run(r''' printf '%s\n' '\033[31m' | xsel -ib ''', shell=True)
# to store \033 as a single escape character, use %b
subprocess.run(r''' printf '%b\n' '\033[31m' | xsel -ib ''', shell=True)
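The difference between the regular and the raw bytestring used above can be checked without xsel. This is only an illustrative sketch; the variable names are made up for the example.

```python
# Regular bytes literal: Python processes the octal escape \033 itself,
# so the value starts with a single ESC byte (0x1B).
escaped = b'\033[31m'

# Raw bytes literal: the backslash is kept, so the value starts with the
# four separate characters \ 0 3 3.
literal = rb'\033[31m'

escaped_length = len(escaped)   # ESC [ 3 1 m
literal_length = len(literal)   # \ 0 3 3 [ 3 1 m

if __name__ == "__main__":
    print(escaped_length, literal_length)
```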

Python subprocess using perl for formatting is giving incomplete output

I'm having an issue reading output from a python subprocess command.
The bash command from whose output I want to read:
pacmd list-sink-inputs | tr '\n' '\r' | perl -pe 's/ *index: ([0-9]+).+?application\.process\.id = "([^\r]+)"\r.+?(?=index:|$)/\2:\1\r/g' | tr '\r' '\n'
When I run this via bash I get the intended output:
4 sink input(s) available.
6249:72
20341:84
20344:86
20350:87
When I try to get its output via Python's subprocess, using any one of these:
subprocess.Popen(cmnd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()[0].decode('UTF-8')
check_output(cmnd,shell=True).decode('UTF-8')
subprocess.run(cmnd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE).stdout.decode('utf-8')
where cmnd = """pacmd list-sink-inputs | tr '\n' '\r' | perl -pe 's/ *index: ([0-9]+).+?application\.process\.id = "([^\r]+)"\r.+?(?=index:|$)/\2:\1\r/g' | tr '\r' '\n'"""
It gives the following output:
'4 sink input(s) available.\n\x02:\x01\n\x02:\x01\n\x02:\x01\n\x02:\x01\n'
This is not what I want, as it doesn't contain the 6249:72, etc. pairs. stderr is blank and the returncode is 0, as expected.
The only workaround, I could find was to redirect the bash output to a text file and then read the text file via python which I don't want to use because that's unnecessary file IO.
I've already gone through Missing output from subprocess command, Python Subprocess Grep, Python subprocess run() is giving abnormal output [duplicate] and many others but can't wrap my head around what's going wrong.
You have a quoting issue. """\1""" means chr(0o1). To produce the string \1, you could use """\\1""". The other instances of \ should be \\ as well.
Since all instances of \ need to be escaped, you could also use r"""\1""".
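The three spellings can be compared directly in Python; a small sketch with illustrative variable names:

```python
plain = """\1"""      # octal escape: a single character, chr(0o1)
doubled = """\\1"""   # escaped backslash: the two characters \ and 1
raw = r"""\1"""       # raw string: also the two characters \ and 1

if __name__ == "__main__":
    # repr() makes the control character visible as '\x01'.
    print(repr(plain), repr(doubled), repr(raw))
```

This matches the \x02:\x01 seen in the question's output: Python replaced \2 and \1 with control characters before the shell or perl ever saw them.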
Other issues:
\1 and \2 outside of a regular expression is wrong anyways. You should be using $1 and $2.
There's no use for a multiline literal here. "..." or r"..." would suffice.
The whole tr business can be avoided by using -0777 to cause perl to treat the entire file as one line.
This gives us:
cmnd = "pacmd list-sink-inputs | perl -0777pe's/ *index: (\\d+).+?application\\.process\\.id = \"([^\\n]+)\"\\n.+?(?=index:|$)/$2:$1\\n/sag'"
or
cmnd = r"pacmd list-sink-inputs | perl -0777pe's/ *index: (\d+).+?application\.process\.id = \"([^\n]+)\"\n.+?(?=index:|$)/$2:$1\n/sag'"
But why is Perl being used at all here? You could easily do the same thing in Python!
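A rough sketch of what the pure-Python version could look like, using the re module. The SAMPLE text is a hypothetical, heavily shortened stand-in for real pacmd list-sink-inputs output, and pid_index_pairs is a name invented for this example.

```python
import re

# Hypothetical excerpt; real pacmd output has many more fields per input.
SAMPLE = '''2 sink input(s) available.
    index: 72
        driver: <protocol-native.c>
        properties:
                application.process.id = "6249"
    index: 84
        driver: <protocol-native.c>
        properties:
                application.process.id = "20341"
'''

# re.S lets "." match newlines, like the /s flag in the Perl version.
PATTERN = re.compile(
    r'index: (\d+).+?application\.process\.id = "([^"\n]+)"', re.S
)

def pid_index_pairs(text):
    """Return 'pid:index' strings for each sink input found in text."""
    return [f"{pid}:{index}" for index, pid in PATTERN.findall(text)]

if __name__ == "__main__":
    # In real use, text would come from something like
    # subprocess.run(["pacmd", "list-sink-inputs"],
    #                capture_output=True, text=True).stdout
    print("\n".join(pid_index_pairs(SAMPLE)))
```

Like the Perl regex, this assumes every sink input has an application.process.id property; an input without one would be paired with the next input's process id.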

Convert python list to bash array

I have a python list as a string with the following structure:
var="["127.0.0.1:14550","127.0.0.1:14551"]"
I would like to turn the string into a bash array to be able to loop through it with bash:
for ip in ${var[@]}; do
something
done
Use Perl to parse the Python output, like so (note single quotes around the string, which contains double quotes inside):
array=( $( echo '["127.0.0.1:14550","127.0.0.1:14551"]' | perl -F'[^\d.:]' -lane 'print for grep /./, @F;' ) )
echo ${array[*]}
Output:
127.0.0.1:14550 127.0.0.1:14551
Alternatively, use jq as in the answer by 0stone0, or pipe its output through xargs, which removes quotes, like so:
array=( $( echo '["127.0.0.1:14550","127.0.0.1:14551"]' | jq -c '.[]' | xargs ) )
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.
-F'[^\d.:]' : Split into @F on any chars other than digit, period, or colon, rather than on whitespace.
print for grep /./, @F; : take the line split into array of strings @F, select with grep only non-empty strings, print one per line.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
One option is to treat the string as json, and use jq to parse it:
jq -rc '.[]' <<< '["127.0.0.1:14550","127.0.0.1:14551"]' | while read i; do
echo $i
done
127.0.0.1:14550
127.0.0.1:14551
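If the conversion can happen on the Python side before the value reaches bash, a sketch along these lines may be simpler. It assumes the string is valid JSON (double-quoted items, as in the question); the variable names are illustrative.

```python
import json
import shlex

var = '["127.0.0.1:14550","127.0.0.1:14551"]'

# Parse the list literal as JSON to get a real Python list.
addresses = json.loads(var)

# Quote each item for the shell and join with spaces, so that bash can
# word-split the result into array elements.
shell_words = " ".join(shlex.quote(address) for address in addresses)

if __name__ == "__main__":
    print(shell_words)
```

The printed line could then be captured in bash with something like array=( $(python3 script.py) ), where script.py is a hypothetical file name.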

Execute command set results as variable

Can anyone tell me how to set the output of a command to a variable?
Basically, I'm looking for the Python equivalent to this bash example:
blah="ajsdlk akajl <ajksd@ajksldf.com>"
blah=$(echo "$blah" | cut -d '<' -f 2 | cut -d '>' -f 1)
echo "$blah"
ajksd@ajksldf.com
You may use string.split
>>> blah="ajsdlk akajl <ajksd@ajksldf.com>"
>>> blah.split('<')[1].split('>')[0]
'ajksd@ajksldf.com'
If a function returns a string, just capture its return value. If you're looking to capture the standard output from a function, wrap it with a StringIO wrapper.
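Both kinds of capture can be sketched with the standard library, assuming Python 3.7 or later. noisy() is a hypothetical function that prints instead of returning, and the child command is just the current interpreter so that the example does not depend on any external program.

```python
import contextlib
import io
import subprocess
import sys

def noisy():
    """Hypothetical function that prints instead of returning a value."""
    print("hello from a function")

# Capture what a function prints by redirecting stdout into a StringIO.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    noisy()
function_output = buffer.getvalue().strip()

# Capture the output of an external command, like blah=$(...) in bash.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a command')"],
    capture_output=True,
    text=True,
    check=True,
)
command_output = result.stdout.strip()

if __name__ == "__main__":
    print(function_output)
    print(command_output)
```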

call awk from inside python generate error

I have to run awk from Python. When I run the command from the terminal it gives the desired output, but it shows an error when executed from inside Python.
runAwk = '''awk '{printf $1}{for(i=2;i<=NF;i++)printf "|"$i}{printf "\n"}' final.txt'''
os.system(runAwk)
gives the error:
awk: line 1: runaway string constant " ...
When I searched the web, I found claims that awk cannot be used with the os module, but there is not much material on this. I am confused about how to proceed.
The \n in your runAwk string is being interpreted by Python as a literal newline character, rather than being passed through to awk as the two characters \ and n. If you use a raw string instead, by preceding the opening triple-quotes with an r:
runAwk = r'''awk '{printf $1}{for(i=2;i<=NF;i++)printf "|"$i}{printf "\n"}' final.txt'''
... then Python won't treat \n as meaning "newline", and awk will see the string you intended.
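As an alternative that the answer above does not propose, the same field joining can be sketched in Python itself, which avoids the quoting layers altogether. This assumes fields are separated by whitespace (awk's default); pipe_join and convert are names invented for the example.

```python
import io

def pipe_join(line: str) -> str:
    """Join the whitespace-separated fields of one line with '|'."""
    return "|".join(line.split())

def convert(lines):
    """Apply pipe_join to every line of an iterable such as an open file."""
    return [pipe_join(line) for line in lines]

if __name__ == "__main__":
    # io.StringIO stands in for open("final.txt") in this sketch.
    sample = io.StringIO("a b  c\nd e\n")
    print("\n".join(convert(sample)))
```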
