I am trying to capture output from Bash and work with it in Python as bytes. After formatting, the output of the command looks something like this - \x48\x83\xec\x08\x48\x8b\x05\xdd
Python then receives it through sys.argv, but for it to be recognised as bytes I need to call .encode(). When I encode it I get b'\\x48\\x83\\xec\\x08\\x48\\x8b\\x05\\xdd', which according to what I've read is just how a single backslash is represented, but I need the value with a single backslash and not two.
I've tried different solutions, such as encoding it and then decoding it again with 'unicode_escape' as suggested here - Remove double back slashes - to no avail.
Surely I am missing some knowledge here; any help would be really appreciated.
#!/bin/sh
echo Enter an executable name.
read varname
echo Enter a PID to search in memory.
read PID
byteString=$(objdump -d -j .text /bin/$varname | head -n100 | tail -n93 |
cut -c11-30 | sed 's/[a-z0-9]\{2\}/\\x&/g' | tr -d '[:space:]')
python3 /home/internship/Desktop/memory_analysis.py $PID $byteString
Above is the bash script with the command to get the bytes. And the following is how the bytes are received by Python.
#!/usr/bin/python3
import sys
if len(sys.argv) < 2:
    print("Please specify a PID")
    exit(1)
element = bytes(sys.argv[2].encode())
print(element)
output - b'\\xe8\\x5b\\xfd\\xff\\xff\\xe8\\x56\\xfd\\xff\\xff\\xe8\\x51\\xfd\\xff\\xff\\xe8\\x4c\\xfd\\xff\\xff\\xe8\\x47\\xfd\\xff\\xff'
When I hard-code it into a variable it works just fine, such as this - element =
b'\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff\xff\xe8\x51\xfd\xff\xff\xe8\x4c\xfd\xff\xff'
However, I need some automation.
Thank you in advance!
If you can pass this data as binary to a Python script then you can deal with it like this:
import os
import sys
if __name__ == '__main__':
    bytes_arg = os.fsencode(sys.argv[1])
    print(bytes_arg)
~$ python script.py $'\x48\x83\xec\x08\x48\x8b\x05\xdd'
b'H\x83\xec\x08H\x8b\x05\xdd'
But if you get it as a plain string argument, the shell strips the unquoted backslashes and it ends up being x48x83xecx08x48x8bx05xdd.
import os
import sys
if __name__ == '__main__':
    cleaned = ''.join(sys.argv[1].split('x'))
    bytes_arg = bytes.fromhex(cleaned)
    print(bytes_arg)
~$ python script.py \x48\x83\xec\x08\x48\x8b\x05\xdd
b'H\x83\xec\x08H\x8b\x05\xdd'
Hope this is what you expected:
python -c "import sys;print(bytes.fromhex(sys.argv[1].replace(r'\x','')))" '\x48\x83\xec\x08\x48\x8b\x05\xdd'
# Output : b'H\x83\xec\x08H\x8b\x05\xdd'
Based on your update :
test.sh
#!/bin/sh
byteString='\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff\xff\xe8\x51\xfd\xff\xff\xe8\x4c\xfd\xff\xff'
PID=999
python3 test.py $PID "$byteString"
test.py
#!/usr/bin/python3
import re
import sys
if len(sys.argv) < 2:
    print("Please specify a PID")
    exit(1)
element = bytes.fromhex(sys.argv[2].replace(r'\x',''))
print(element)
# output b'\xe8[\xfd\xff\xff\xe8V\xfd\xff\xff\xe8Q\xfd\xff\xff\xe8L\xfd\xff\xff'
print("b'"+re.sub('(..)', r'\\x\1',element.hex())+"'")
# output b'\xe8\x5b\xfd\xff\xff\xe8\x56\xfd\xff\xff\xe8\x51\xfd\xff\xff\xe8\x4c\xfd\xff\xff'
You actually don't have to use the codecs module; I just used it in that original answer in an attempt to make things more readable. Your question is practically identical to the one you referenced. The codecs.encode() function and the str.encode() method can both use the raw_unicode_escape text encoding.
In fact, you can just do as follows:
sys.argv[2].encode('raw_unicode_escape')
Just remember that raw_unicode_escape neither escapes nor un-escapes backslashes when encoding or decoding.
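For example, a quick interactive check of that behaviour (the backslashes pass through untouched in both directions):
>>> r'\x48\x83'.encode('raw_unicode_escape')
b'\\x48\\x83'
>>> b'\\x48\\x83'.decode('raw_unicode_escape')
'\\x48\\x83'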
All the current answers have given you what you wanted, but keep in mind that bytes objects can be rendered differently when printed: any byte that maps to a printable ASCII character is shown as that character rather than as a \x escape. Additionally, when you encode a string you don't need to wrap the result in bytes(), since str.encode() already returns a bytes object.
>>> b'\x48\x83\xec\x08\x48\x8b\x05\xdd' == b'H\x83\xec\x08\x48\x8b\x05\xdd'
True
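For instance, str.encode() already yields a bytes object, so the extra bytes() call is redundant:
>>> type('\x48'.encode())
<class 'bytes'>
>>> bytes('\x48'.encode()) == '\x48'.encode()
True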
Related
For those who are curious as to why I'm doing this: I need specific files in a tarball - no more, no less. I have to write unit tests for make check, but since I'm constrained to having "no more" files, I have to write the check within the make check target itself. That means I have to write bash (but I don't want to).
I dislike using bash for unit testing (sorry to all those who like bash; I just dislike it so much that I would rather go with an extremely hacky approach than write many lines of bash code), so I wrote a Python file. I later learned that I have to use bash because of some strict rule I don't fully understand. I figured there was a way to cache the entire content of the Python file in a single string in the bash file, so I could take that string literal in bash, write it to a Python file, and then execute it.
I tried the following attempt (in the following script and result, I used another python file that's not unit_test.py, so don't worry if it doesn't actually look like a unit test):
toStr.py:
import re
with open("unit_test.py", 'r+') as f:
    s = f.read()
    s = s.replace("\n", "\\n")
    print(s)
And then I piped the results out using:
python toStr.py > temp.txt
It looked something like:
#!/usr/bin/env python\n\nimport os\nimport sys\n\n#create number of bytes as specified in the args:\nif len(sys.argv) != 3:\n print("We need a correct number of args : 2 [NUM_BYTES][FILE_NAME].")\n exit(1)\nn = -1\ntry:\n n = int(sys.argv[1])\nexcept:\n print("Error casting number : " + sys.argv[1])\n exit(1)\n\nrand_string = os.urandom(n)\n\nwith open(sys.argv[2], 'wb+') as f:\n f.write(rand_string)\n f.flush()\n f.close()\n\n
I tried taking this as a string literal, echoing it into a new file, and seeing whether I could run it as a Python file, but it failed.
echo '{insert that giant string above here}' > new_unit_test.py
I wanted to take this statement above and copy it into my "bash unit test" file so I can just execute the python file within the bash script.
The resulting file looked exactly like {insert giant string here}. What am I doing wrong in my attempt? Are there other, much easier ways to hold a Python file as a string literal in a bash script?
The easiest way is to use only double quotes in your Python code; then, in your bash script, wrap all of your Python code in one pair of single quotes, e.g.:
#!/bin/bash
python -c 'import os
import sys
#create number of bytes as specified in the args:
if len(sys.argv) != 3:
    print("We need a correct number of args : 2 [NUM_BYTES][FILE_NAME].")
    exit(1)
n = -1
try:
    n = int(sys.argv[1])
except:
    print("Error casting number : " + sys.argv[1])
    exit(1)
rand_string = os.urandom(n)
# i changed the ''s to ""s below so they don't terminate the outer single-quoted bash string -webb
with open(sys.argv[2], "wb+") as f:
    f.write(rand_string)
    f.flush()
    f.close()'
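One thing to keep in mind with python -c: sys.argv[0] is set to '-c' and anything appended after the closing single quote shows up in sys.argv[1:], so to keep the len(sys.argv) != 3 check working you would pass the two arguments there (for example by putting "$@" right after the closing single quote to forward the bash script's own arguments).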
I am writing a dev-ops kind of bash script that is used for running an application in a local development environment under configuration as similar to production as possible. To avoid duplicating code/data that already lives in a Python script, I would like my bash script to invoke Python and retrieve data that is hard-coded in that Python script. The data structure in Python is a dict, but I really only care about the keys, so I can just return an array of keys. The Python script is used in production, and I want to reuse it rather than duplicate the data in my shell script, so that I don't have to mirror every modification to the production script with a parallel change in the local-environment shell script.
Is there any way I can invoke a Python function from bash and retrieve this collection of values? If not, should I just have the Python function print to STDOUT and have the shell script parse the result?
Yes, that is the best and pretty much the only way to pass data from Python to bash.
Alternatively, your function could write to a file, which the bash script would then read.
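A minimal sketch of the stdout approach (yourmodule, data and the dump_keys.py filename are placeholders, and this assumes the keys contain no newlines or spaces):
import yourmodule

# print one key per line so the shell can capture them,
# e.g. with keys=( $(python dump_keys.py) )
for key in yourmodule.data:
    print(key)
The answer below shows a more robust, NUL-delimited variant that copes with arbitrary keys and values.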
To write a Python dictionary from a module out to a NUL-delimited key/value stream (which is the preferred serialization format if you want to represent the full range of values bash is capable of handling):
#!/usr/bin/env python
import sys, yourmodule
saw_errors = 0
for k, v in yourmodule.data.iteritems():
    if '\0' in k or '\0' in v:
        saw_errors = 1  # setting exit status is nice-to-have but not essential
        continue        # ...but skipping invalid content is important; otherwise,
                        # we'd corrupt the output stream.
    sys.stdout.write('%s\0%s\0' % (k, v))
sys.exit(saw_errors)
...and to read that stream into an associative array:
# this is bash 4.x's equivalent to a Python dict
declare -A items=()
while IFS= read -r -d '' key && IFS= read -r -d '' value; do
    items[$key]=$value
done < <(python_script) # where 'python_script' behaves as given above
...after which you can access the items that came from your Python script:
echo "Value for hello is: ${items[hello]}"
...or iterate over the keys:
printf 'Received key: %q\n' "${!items[@]}"
...or iterate over the values:
printf 'Received value: %q\n' "${items[@]}"
Caveat: Python bytestrings (regular strings, in Python 2.x) are Pascal-style; they have an explicit length stored, so they can contain any raw binary data whatsoever. (Python 3.x character strings are also Pascal-style, and can also contain NULs, but the aforementioned sentence doesn't quite apply as they don't contain raw binary content -- while the next one still does). Bash strings are C strings; they're NUL-terminated, so they can't contain raw NUL characters.
Thus, some data which can be represented in Python cannot be represented in bash.
As an alternative, you could make a python script that prints out a bash array.
bashify.py
#! /usr/bin/python
from sys import argv
from importlib import import_module
def as_bash_array(mapping):
return " ".join("[{!r}]={!r}".format(*item) for item in mapping.items())
def get_mapping(name):
module, var = name.rsplit(".", 1)
return getattr(import_module(module), var)
executable, mapping_name = argv
mapping = get_mapping(mapping_name)
print "(", as_bash_array(mapping), ")"
usage:
declare -A my_arr="`./bashify.py my_module.my_dict`"
Using !r in the format string means non-printing characters such as NUL will be escaped ("\x00" for NUL). It also means that string values will be quoted -- allowing characters that would otherwise break the array declaration syntax.
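A quick interactive illustration of what as_bash_array produces, using a made-up single-entry dict (the key and value here are purely hypothetical):
>>> mapping = {"a key": "val\x00ue"}
>>> print " ".join("[{!r}]={!r}".format(*item) for item in mapping.items())
['a key']='val\x00ue'
The embedded NUL comes out as the escape sequence \x00, and the key and value arrive pre-quoted, which is what keeps the declare -A parse from breaking on spaces.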
I'm trying to write a zsh script that contains a python 1-liner which takes an argument.
#!/bin/zsh
foo_var="foo"
python -c "import sys; print sys.argv" $foo_var
(This isn't my actual code but this is the gist of what I was doing.)
That code outputs the following:
['-c', 'foo']
The one liner got a little longer than I wanted it to, so I put it in a heredoc, like this:
#!/bin/zsh
bar_var="bar"
python << EOF
import sys
print sys.argv
EOF
$bar_var
(Again, not my actual code but same idea.)
which outputs:
['']
./doctest.zsh:14: command not found: bar
I need $bar_var to be on the same line as python so it will get passed as an argument, but I can't have anything on the same line as the second 'EOF'. I also can't simply put it between python and the heredoc, because python will interpret it as a filename.
Is there a way to work around the mandatory newline after the second EOF, or better yet, is there generally a better way to do this?
(Also this is my first SO post, so please let me know if I've done something wrong in that sense)
This might do what you want:
python - $bar_var << EOF
import sys
print sys.argv
EOF
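The - makes python read the program from standard input, so the heredoc still supplies the code while $bar_var is passed through as an ordinary argument; the script should then print something like ['-', 'bar'].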
After a few days of dwelling over stackoverflow and python 2.7 doc, I have come to no conclusion about this.
Basically I'm running a python script on a windows server that must have as input a block of text. This block of text (unfortunately) has to be passed by a pipe. Something like:
PS > [something_that_outputs_text] | python .\my_script.py
So the problem is:
The server uses cp1252 encoding and I really cannot change it due to administrative regulations and whatnot. When I pipe the text to my Python script and read it, it already arrives with ? where characters like \xe1 should be.
What I have done so far:
Tested with UTF-8. Yep, chcp 65001 and $OutputEncoding = [Console]::OutputEncoding "solve it", in the sense that Python gets the text perfectly and I can then decode it to unicode etc. But apparently they won't let me do that on the server /sadface.
A little script to test what the hell is happening:
import codecs
import sys
def main(argv=None):
    if argv is None:
        argv = sys.argv
    if len(argv) > 1:
        for arg in argv[1:]:
            print arg.decode('cp1252')
    sys.stdin = codecs.getreader('cp1252')(sys.stdin)
    text = sys.stdin.read().strip()
    print text
    return 0

if __name__ == "__main__":
    sys.exit(main())
Tried it with both the codecs wrapping and without it.
My input & output:
PS > echo "Blá" | python .\testinput.py blé
blé
Bl?
--> So there's no problem with the argument (blé) but the piped text (Blá) is no good :(
I even converted the text string to hex and, yes, it gets flooded with 3f (AKA mr ?), so it's not a problem with the print.
[Also: it's my first question here... feel free to ask any more info about what I did]
EDIT
I don't know if this is relevant or not, but when I do sys.stdin.encoding it yields None
Update: So... I have no problems with cmd. Checked sys.stdin.encoding while running the program on cmd and everything went fine. I think my head just exploded.
How about saving the data to a file and piping it to Python in a CMD session? Invoke PowerShell and Python from CMD, like so:
c:\>powershell -command "c:\genrateDataForPython.ps1 -output c:\data.txt"
c:\>type c:\data.txt | python .\myscript.py
Edit
Another idea: convert the data to base64 in PowerShell and decode it in Python. Base64 is simple in PowerShell, and I guess it isn't hard in Python either. Like so:
# Convert some accent chars to base64
$s = [Text.Encoding]::UTF8.GetBytes("éêèë")
[System.Convert]::ToBase64String($s)
# Output:
w6nDqsOow6s=
# Decode:
$d = [System.Convert]::FromBase64String("w6nDqsOow6s=")
[Text.Encoding]::UTF8.GetString($d)
# Output
éêèë
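On the Python side, a minimal sketch of the decoding step might look like this (assuming the base64 text is piped in on stdin and the original data was UTF-8, as in the PowerShell example above):
import base64
import sys

# read the base64 text piped in by PowerShell and drop surrounding whitespace
encoded = sys.stdin.read().strip()
# recover the original bytes, then decode them as UTF-8 text
raw = base64.b64decode(encoded)
text = raw.decode('utf-8')
print repr(text)
The point of the base64 detour is that only plain ASCII crosses the pipe, so the cp1252 code page can no longer mangle the data.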
I'm trying to pass file list to my python script via argument:
python script.py -o aaa -s bbb "filename.txt" "filename2.txt" "file name3.txt"
Unfortunately ArgumentParser is ignoring the quotes and, instead of giving a list of 3 files, it gives me a list of 4 elements, as follows:
1) "filename.txt"
2) "filename2.txt"
3) "file
4) name3.txt"
It completely ignores quotes. How to make it work with quotes?
Hard to say without seeing what you're using or any code.
Your shell may be interfering; you may need to escape the spaces with \.
Example:
python script.py -o a -f "file1.txt" file\ 2.csv
It's hard without code, but assuming you are using sys.argv you can easily pass the file arguments with quotes whenever a file name (or any argument) contains blank spaces: python script.py "myFile.txt" "otherFile.jpeg"
Try this simple code to understand:
import sys
for n, p in enumerate(sys.argv):
    print("Parameter: %d = %s" % (n, p))
You can see that the first argv entry is the name of the script you are running.
It looks like this is not Python's fault. I'm calling the Python script from inside a bash script, and that is what makes a mess of the quoted parameters.