How to pass collection data from Python to bash?

I am writing a dev-ops kind of bash script that is used for running an application in a local development environment under configuration as similar to production as possible. To eliminate duplicating code/data that already exists in a Python script, I would like my bash script to invoke Python to retrieve data that is hard-coded in that Python script. The data structure in Python is a dict, but I really only care about the keys, so I can just return an array of keys. The Python script is used in production, and I want to use it rather than duplicate its data in my shell script, to avoid having to follow every modification of the production script with a parallel change in the local-environment shell script.
Is there any way I can invoke a Python function from bash and retrieve this collection of values? If not, should I just have the Python function print to STDOUT and have the shell script parse the result?

Yes, that is the best and almost the only way to pass data from Python to bash.
Alternatively, your function can write to a file, which the bash script would then read.
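For instance, a minimal sketch of the print-to-STDOUT approach (yourmodule and its dict data are placeholder names, and this assumes no key contains a newline):

# bash: read the newline-delimited keys emitted by Python into an array
readarray -t keys < <(python -c 'import yourmodule; print("\n".join(yourmodule.data.keys()))')
printf 'Got key: %s\n' "${keys[@]}"

readarray requires bash 4.0 or newer; if keys may contain newlines, the NUL-delimited approach in the next answer is more robust.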

To write a Python dictionary from a module out to a NUL-delimited key/value stream (which is the preferred serialization format if you want to represent the full range of values bash is capable of handling):
#!/usr/bin/env python
import sys, yourmodule

saw_errors = 0
for k, v in yourmodule.data.iteritems():
    if '\0' in k or '\0' in v:
        saw_errors = 1  # setting exit status is nice-to-have but not essential
        continue        # ...but skipping invalid content is important; otherwise,
                        # we'd corrupt the output stream.
    sys.stdout.write('%s\0%s\0' % (k, v))
sys.exit(saw_errors)
...and to read that stream into an associative array:
# this is bash 4.x's equivalent to a Python dict
declare -A items=()
while IFS= read -r -d '' key && IFS= read -r -d '' value; do
items[$key]=$value
done < <(python_script) # where 'python_script' behaves as given above
...whereafter you can access the items from your Python script:
echo "Value for hello is: ${items[hello]}"
...or iterate over the keys:
printf 'Received key: %q\n' "${!items[@]}"
...or iterate over the values:
printf 'Received value: %q\n' "${items[@]}"
Caveat: Python bytestrings (regular strings, in Python 2.x) are Pascal-style: they store an explicit length, so they can contain any raw binary data whatsoever. (Python 3.x character strings are also Pascal-style and can also contain NULs, though they hold decoded text rather than raw binary content.) Bash strings are C strings: they're NUL-terminated, so they can't contain raw NUL characters.
Thus, some data which can be represented in Python cannot be represented in bash.
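As a quick illustration of that caveat, bash simply cannot store a NUL in a variable; command substitution, for example, drops NUL bytes outright (and bash 4.4+ warns about it):

x=$(printf 'abc\0def')   # bash: warning: command substitution: ignored null byte in input
echo "${#x}"             # prints 6 -- the NUL byte itself was discarded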

As an alternative, you could make a python script that prints out a bash array.
bashify.py
#! /usr/bin/python
from sys import argv
from importlib import import_module

def as_bash_array(mapping):
    return " ".join("[{!r}]={!r}".format(*item) for item in mapping.items())

def get_mapping(name):
    module, var = name.rsplit(".", 1)
    return getattr(import_module(module), var)

executable, mapping_name = argv
mapping = get_mapping(mapping_name)
print "(", as_bash_array(mapping), ")"
usage:
declare -A my_arr="`./bashify.py my_module.my_dict`"
Using !r in the format string means non-printing characters such as NUL will be escaped ("\x00" for NUL). It also means that string values will be quoted -- allowing characters that would otherwise break the array declaration syntax.
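For example, given a hypothetical my_module.py containing my_dict = {"hello": "wor ld"}, the output would look roughly like:

$ ./bashify.py my_module.my_dict
( ['hello']='wor ld' )
$ declare -A my_arr="`./bashify.py my_module.my_dict`"
$ echo "${my_arr[hello]}"
wor ld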

Related

Calling python script from bash script and getting its return value

I have a bash script doing some tasks, but I need to manipulate a string obtained from configuration (for simplicity, it's hardcoded in this test). This manipulation can be done easily in Python but is not simple in bash, so I've written a Python script that does the task and returns a string (or, ideally, an array of strings).
I'm calling this Python script from my bash script. Both scripts are in the same directory, and this directory is added to the environment variables. I'm testing it on Ubuntu 22.04.
My python script below:
#!/usr/bin/python
def Get(input: str) -> list:
    # Doing tasks - arr is an output array
    return ' '.join(arr)  # or ideally return arr
My bash script used to call the above python script
#!/bin/bash
ARR=("$(python -c "from test import Get; Get('val1, val2,val3')")")
echo $ARR
for ELEMENT in "${ARR[@]}"; do
    echo "$ELEMENT"
done
When I added a print to the Python script for test purposes I got proper results, so the Python script works correctly. But in the bash script I got simply an empty line. I've also tried something like ARR=("$(python -c "from test import Get; RES=Get('val1, val2,val3')")") and then iterated over RES, and got the same response.
It seems like the bash script cannot handle the data returned by Python.
How can I rewrite these scripts to properly get the Python script's response in bash?
Is it possible to get the whole array, or only the string?
How can I rewrite these scripts to properly get the Python script's response in bash?
Serialize the data on the Python side and deserialize it on the bash side. Decide on a proper protocol between the processes, one that preserves any characters.
The best choice looks like newline- or zero-separated strings (the protocol). Output delimiter-separated elements from Python (serialize) and read them with readarray on the bash side (deserialize).
$ tmp=$(python -c 'arr=[1,2,3]; print(*arr, sep="\n")')
$ readarray -t array <<<"$tmp"
$ declare -p array
declare -a array=([0]="1" [1]="2" [2]="3")
Or with a zero-separated stream. Note that Bash can't store zero bytes in variables, so we use a redirection with process substitution:
$ readarray -d '' -t array < <(python -c 'arr=[1,2,3]; print(*arr, sep="\0", end="")')
$ declare -p array
declare -a array=([0]="1" [1]="2" [2]="3")
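Applied to the question's Get function (assuming the ideal variant that returns a list rather than a joined string), the same pattern might look like:

$ readarray -d '' -t ARR < <(python -c 'from test import Get; print(*Get("val1, val2,val3"), sep="\0", end="")')
$ declare -p ARR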
I've solved my problem by outputting a string with elements separated by spaces.
I've also rewritten the Python code to be a script rather than a function:
import sys

if len(sys.argv) > 1:
    input = sys.argv[1]
    # Doing tasks - arr is an output array
    for element in arr:
        print(element)
ARRAY=$(python script.py 'val1, val2,val3')
for ELEMENT in $ARRAY; do
    echo "$ELEMENT"
done
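Note that the unquoted $ARRAY relies on word splitting and also glob-expands each element; a slightly more robust sketch of the same idea reads the lines into a real array first:

readarray -t ARRAY < <(python script.py 'val1, val2,val3')
for ELEMENT in "${ARRAY[@]}"; do
    echo "$ELEMENT"
done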

Send parameters to python from bash

I have a bash script that calls a python script with parameters.
In the bash script, I'm reading a file that contains one row of parameters, each wrapped in double quotes, and then calling the python script with the line I read.
My problem is that Python receives the parameters split on spaces.
The line looks like this: "param_a" "Param B" "Param C"
Code Example:
Bash Script:
LINE=`cat $tmp_file`
id=`python /full_path/script.py $LINE`
Python Script:
print sys.argv[1]
print sys.argv[2]
print sys.argv[3]
Received output:
"param_a"
"Param
B"
Wanted output:
param_a
Param B
Param C
How can I send the parameters to the Python script the way I need?
Thanks!
What about
id=`python /full_path/script.py $tmp_file`
and
import sys
for line in open(sys.argv[1]):
print(line)
?
The issue is in how bash passes the arguments. Python has nothing do to with it.
So, you have to solve all of this before sending it to Python. I decided to use awk and xargs for the job (but xargs is the actual MVP here):
LINE=$(cat $tmp_file)
awk -v ORS="\0" -v FPAT='"[^"]+"' '{for (i=1;i<=NF;i++){print substr($i,2,length($i)-2)}}' <<<$LINE |
xargs -0 python ./script.py
First $(..) is preferred over backticks, because it is more readable. You are making a variable after all.
awk only reads from stdin or a file, but you can force it to read from a variable with the <<<, also called "here string".
With awk I loop over all fields (as defined by the regex in the FPAT variable), and print them without the "".
The output record separator I chose is the NULL character (-v ORS='\0'); xargs will split on this character.
xargs will now parse the piped input by separating the arguments on NULL characters (set with -0) and execute the command given with the parsed arguments.
Note that while awk is found on most UNIX systems, I make use of FPAT, which is a GNU awk extension, so you might not have GNU awk by default (for example, on Ubuntu); however, GNU awk is usually just an install of gawk away.
Also, the next command would be a quick and easy solution, but it is generally considered unsafe, since eval will execute everything it receives:
eval "python ./script "$LINE
This can be done using bash arrays:
tmp_file='gash.txt'
# Set IFS to " which splits on double quotes and removes them
# Using read is preferable to using the external program cat
# read -a reads into the array called "line"
# UPPERCASE variable names are discouraged because of collisions with bash variables
IFS=\" read -ra line < "$tmp_file"
# That leaves blank and space elements in "line",
# we create a new array called "params" without those elements
declare -a params
for ((i=0; i < ${#line[@]}; i++))
do
    p="${line[i]}"
    if [[ -n "$p" && "$p" != " " ]]
    then
        params+=("$p")
    fi
done
# `backticks` are frowned upon because of poor readability
# I've called the python script "gash.py"
id=$(python ./gash.py "${params[@]}")
echo "$id"
gash.py:
import sys
print "1",sys.argv[1]
print "2",sys.argv[2]
print "3",sys.argv[3]
Gives:
1 param_a
2 Param B
3 Param C

Yaml file for python and bash parsing

I have a simple list of hostnames such as:
hostname1,hostname2,hostname3
(Note: the format of the YAML file can be changed for easier parsing if necessary)
I need to be able to loop over this list of hostnames in both python and bash.
A simple indexed array will work for this purpose since the only property is hostname and the list will never be more than 1 level deep.
I know I can easily parse this with Python with the code I started below.
How can I also parse this as a Bash array?
Python 2.7
import yaml

with open('emc_hosts.yaml', 'r') as f:
    doc = yaml.load(f)

doc = doc.split(",")
for v in doc:
    print(v)
UPDATE: The hostnames are all on the same line, but they don't have to be. I can create the hostname file in any format that I want including separated by return or other characters.
UPDATE 2: The file will only contain a list of hostnames.
UPDATE 3: As per a suggestion in the comments, I could easily change this from a YAML file to a simple text file with hostnames separated by new lines.
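Building on update 3, if the file is plain text with one hostname per line, both sides become trivial; a minimal sketch, assuming a file named emc_hosts.txt:

# bash: one array element per line
readarray -t hosts < emc_hosts.txt
printf 'host: %s\n' "${hosts[@]}"

# python: same file, skipping blank lines
with open('emc_hosts.txt') as f:
    hosts = [line.strip() for line in f if line.strip()]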
You can parse a YAML/JSON file directly in your shell/bash with niet.
Easy to install and easy to use:
$ pip install -U niet
Consider the following example:
$ cat dump.yaml
foo:
  bar:
    key: value
  baz:
    key: value
  tags:
    - one
    - two
You can parse this example file like this:
$ niet dump.yaml foo.bar.key
value
$ for el in $(niet dump.yaml foo.tags); do echo ${el}; done
one
two
Niet has good integration with shell and other bash-like shells.
See the Niet yaml parser documentation, source code, and samples.
Niet is itself developed in Python.
If you already have working Python code, might as well use it.
# Embed your Python script into your bash code
_get_hosts_python=$(cat <<'EOF'
import yaml, sys

with open(sys.argv[1], "r") as f:
    doc = yaml.load(f)

doc = doc.split(",")
for v in doc:
    print(v)
EOF
)
# provide a function to wrap its invocation
get_hosts() {
    IFS=$'\n' read -r -d '' -a "${1:-hosts}" \
        < <(python -c "$_get_hosts_python" "emc_hosts.yml" && printf '\0')
}
# and demonstrate how to actually use that function
get_hosts hostlist
echo "Got ${#hostlist[#]} hosts:"
printf '- %s\n' "${hostlist[#]}"
Note that bash 4.0 and newer have the readarray and mapfile commands, which can be used to read a stream of lines into an array. I didn't use them here because it requires bash 4.4 to check the exit status of a process substitution (the <( ... ) syntax), so with that approach we wouldn't be able to detect errors; here, the && printf '\0' ensures that the stream has a trailing NUL -- and thus that the read has a successful exit status -- only if the Python code exited with a successful exit status.

Using Makefile bash to save the contents of a python file

For those who are curious as to why I'm doing this: I need specific files in a tarball - no more, no less. I have to write unit tests for make check, but since I'm constrained to having "no more" files, I have to write the check within make check. In this way, I have to write bash (but I don't want to).
I dislike using bash for unit testing (sorry to all those who like bash; I just dislike it so much that I would rather go with an extremely hacky approach than write many lines of bash code), so I wrote a Python file. I later learned that I have to use bash because of some unknown strict rule. I figured there was a way to cache the entire content of the Python file into a single string in the bash file, so I could take the string literal in bash, write it to a Python file, and then execute it.
I tried the following attempt (in the following script and result, I used another python file that's not unit_test.py, so don't worry if it doesn't actually look like a unit test):
toStr.py:
import re

with open("unit_test.py", 'r+') as f:
    s = f.read()
s = s.replace("\n", "\\n")
print(s)
And then I piped the results out using:
python toStr.py > temp.txt
It looked something like:
#!/usr/bin/env python\n\nimport os\nimport sys\n\n#create number of bytes as specified in the args:\nif len(sys.argv) != 3:\n print("We need a correct number of args : 2 [NUM_BYTES][FILE_NAME].")\n exit(1)\nn = -1\ntry:\n n = int(sys.argv[1])\nexcept:\n print("Error casting number : " + sys.argv[1])\n exit(1)\n\nrand_string = os.urandom(n)\n\nwith open(sys.argv[2], 'wb+') as f:\n f.write(rand_string)\n f.flush()\n f.close()\n\n
I tried taking this as a string literal, echoing it into a new file, and seeing whether I could run it as a Python file, but it failed.
echo '{insert that giant string above here}' > new_unit_test.py
I wanted to take this statement above and copy it into my "bash unit test" file so I can just execute the python file within the bash script.
The resulting file looked exactly like {insert giant string here}. What am I doing wrong in my attempt? Are there other, much easier ways to hold a Python file as a string literal in a bash script?
The easiest way is to use only double quotes in your Python code; then, in your bash script, wrap all of your Python code in one pair of single quotes, e.g.:
#!/bin/bash
python -c 'import os
import sys

# create number of bytes as specified in the args:
if len(sys.argv) != 3:
    print("We need a correct number of args : 2 [NUM_BYTES][FILE_NAME].")
    exit(1)
n = -1
try:
    n = int(sys.argv[1])
except:
    print("Error casting number : " + sys.argv[1])
    exit(1)

rand_string = os.urandom(n)

# i changed the quotes below from single to double, so they survive the outer single-quote wrapper -webb
with open(sys.argv[2], "wb+") as f:
    f.write(rand_string)
    f.flush()
    f.close()'
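One detail worth noting: as written, the embedded program sees an empty sys.argv[1:]. With python -c, any arguments after the code string become sys.argv[1:], so you can forward the bash script's own arguments by appending "$@" after the closing single quote. A sketch, assuming the script above is saved as script.sh:

#!/bin/bash
python -c 'import sys
print(sys.argv[1:])  # with python -c, these are the args after the code string
' "$@"

$ ./script.sh 16 random.bin
['16', 'random.bin']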

Read a python variable in a shell script?

my python file has these 2 variables:
week_date = "01/03/16-01/09/16"
cust_id = "12345"
How can I read these into a shell script that takes in these 2 variables?
My current shell script requires manual editing of "dt" and "id". I want to read the Python variables into the shell script so I can just edit my Python parameter file and not so many files.
shell file:
#!/bin/sh
dt="01/03/16-01/09/16"
cust_id="12345"
In a new Python file I could just import the parameter Python file.
Consider something akin to the following:
#!/bin/bash
# ^^^^ NOT /bin/sh, which doesn't have process substitution available.
python_script='
import sys
d = {}                                       # create a context for variables
exec(open(sys.argv[1], "r").read()) in d     # execute the Python code in that context
for k in sys.argv[2:]:
    print "%s\0" % str(d[k]).split("\0")[0]  # ...and extract your strings NUL-delimited
'

read_python_vars() {
    local python_file=$1; shift
    local varname
    for varname; do
        IFS= read -r -d '' "${varname#*:}"
    done < <(python -c "$python_script" "$python_file" "${@%%:*}")
}
You might then use this as:
read_python_vars config.py week_date:dt cust_id:id
echo "Customer id is $id; date range is $dt"
...or, if you didn't want to rename the variables as they were read, simply:
read_python_vars config.py week_date cust_id
echo "Customer id is $cust_id; date range is $week_date"
Advantages:
Unlike a naive regex-based solution (which would have trouble with some of the details of Python parsing -- try teaching sed to handle both raw and regular strings, and both single and triple quotes, without making it into a hairball!) or a similar approach that used newline-delimited output from the Python subprocess, this will correctly handle any object for which str() gives a NUL-free representation that your shell script can use.
Running content through the Python interpreter also means you can determine values programmatically -- for instance, you could have some Python code that asks your version control system for the last-change-date of relevant content.
Think about scenarios such as this one:
start_date = '01/03/16'
end_date = '01/09/16'
week_date = '%s-%s' % (start_date, end_date)
...using a Python interpreter to parse Python means you aren't restricting how people can update/modify your Python config file in the future.
Now, let's talk caveats:
If your Python code has side effects, those side effects will obviously take effect (just as they would if you chose to import the file as a module in Python). Don't use this to extract configuration from a file whose contents you don't trust.
Python strings are Pascal-style: they can contain literal NULs. Strings in shell languages are C-style: they're terminated by the first NUL character. Thus, some variables can exist in Python that cannot be represented in shell without non-literal escaping. To prevent an object whose str() representation contains NULs from spilling forward into other assignments, this code terminates strings at their first NUL.
Now, let's talk about implementation details.
${@%%:*} is an expansion of $@ which trims all content after and including the first : in each argument, thus passing only the Python variable names to the interpreter. Similarly, ${varname#*:} is an expansion which trims everything up to and including the first : from the variable name passed to read. See the bash-hackers page on parameter expansion.
Using <(python ...) is process substitution syntax: The <(...) expression evaluates to a filename which, when read, will provide output of that command. Using < <(...) redirects output from that file, and thus that command (the first < is a redirection, whereas the second is part of the <( token that starts a process substitution). Using this form to get output into a while read loop avoids the bug mentioned in BashFAQ #24 ("I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?").
The IFS= read -r -d '' construct has a series of components, each of which makes the behavior of read more true to the original content:
Clearing IFS for the duration of the command prevents whitespace from being trimmed from the end of the variable's content.
Using -r prevents literal backslashes from being consumed by read itself rather than represented in the output.
Using -d '' sets the first character of the empty string '' to be the record delimiter. Since C strings are NUL-terminated and the shell uses C strings, that character is a NUL. This ensures that variables' content can contain any non-NUL value, including literal newlines.
See BashFAQ #001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?") for more on the process of reading record-oriented data from a string in bash.
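As an illustrative aside (not part of the answer above), the following shows what -d '' buys you: NUL-delimited records pass through read intact even when they contain embedded newlines or padding whitespace:

while IFS= read -r -d '' item; do
    printf '[%s]\n' "$item"
done < <(printf '  padded  \0two\nlines\0')

[  padded  ]
[two
lines]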
Other answers give a way to do exactly what you ask for, but I think the idea is a bit crazy. There's a simpler way to satisfy both scripts - move those variables into a config file. You can even preserve the simple assignment format.
Create the config itself: (ini-style)
dt="01/03/16-01/09/16"
cust_id="12345"
In python:
config_vars = {}
with open('the/file/path', 'r') as f:
    for line in f:
        if '=' in line:
            k, v = line.split('=', 1)
            config_vars[k] = v.strip().strip('"')  # drop the trailing newline and surrounding quotes

week_date = config_vars['dt']
cust_id = config_vars['cust_id']
In bash:
source "the/file/path"
And you don't need to do any crazy source parsing anymore. Alternatively, you can just use JSON for the config file, and then use the json module in Python and jq in shell for parsing.
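A minimal sketch of that JSON variant (the file name config.json is an assumption):

$ cat config.json
{"dt": "01/03/16-01/09/16", "cust_id": "12345"}

In Python:

import json
with open('config.json') as f:
    cfg = json.load(f)
week_date = cfg['dt']
cust_id = cfg['cust_id']

In bash (jq's -r emits the raw string without JSON quoting):

dt=$(jq -r '.dt' config.json)
cust_id=$(jq -r '.cust_id' config.json)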
I would do something like this. You may want to modify it a little bit, for minor changes like including/excluding quotes, as I haven't really tested it for your scenario:
#!/bin/sh
exec <"$python_filename"
while read line
do
    match=`echo $line|grep "week_date ="`
    if [ $? -eq 0 ]; then
        dt=`echo $line|cut -d '"' -f 2`
    fi
    match=`echo $line|grep "cust_id ="`
    if [ $? -eq 0 ]; then
        cust_id=`echo $line|cut -d '"' -f 2`
    fi
done
