So I have a string like "abcd" and I want to convert it into bytes and print it.
I tried print(b'abcd') which prints exactly b'abcd' but I want '\x61\x62\x63\x64'.
Is there a single function for this purpose or do I have to use unhexlify with join?
Note: This is a simplified example of what I'm actually doing. I need the aforementioned representation for a regex search.
There's no single function to do it, so you would need to do the formatting manually:
s = 'abcd'
print(r'\x' + r'\x'.join(f'{b:02x}' for b in bytes(s, 'utf8')))
Output:
\x61\x62\x63\x64
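Since the question mentions a regex search, here is a minimal sketch that wraps the formatting above in a helper and feeds the result to re. The name to_hex_escapes is made up for illustration, and the example assumes ASCII input that is matched against a str:

```python
import re

def to_hex_escapes(text, encoding="utf-8"):
    """Return text as a string of \\xNN escapes, one per encoded byte."""
    return "".join(f"\\x{b:02x}" for b in text.encode(encoding))

pattern = to_hex_escapes("abcd")
print(pattern)  # \x61\x62\x63\x64

# The re module understands \xNN escapes, so the result works as a pattern.
match = re.search(pattern, "xxabcdxx")
print(match.group(0))  # abcd
```

For non-ASCII text the escapes describe encoded bytes rather than characters, so both the pattern and the searched data would probably need to be bytes objects.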
You can get hex values of a string like this:
string = "abcd"
print(".".join(hex(ord(c))[2:] for c in string))
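One caveat with hex(ord(c))[2:] is that it drops the leading zero for characters below 0x10 (a newline becomes "a" rather than "0a"). A small sketch of a zero-padded variant, plus the built-in separator support that bytes.hex() has in Python 3.8 and later, assuming ASCII input:

```python
string = "abcd"

# format(..., "02x") always yields two digits, unlike hex(...)[2:].
dotted = ".".join(format(ord(c), "02x") for c in string)
print(dotted)  # 61.62.63.64

# Python 3.8+ only: bytes.hex() accepts a separator argument.
dotted_builtin = string.encode("ascii").hex(".")
print(dotted_builtin)  # 61.62.63.64
```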
I have a number of strings from which I am aiming to remove characters using replace. However, this doesn't seem to work. To give a simplified example, this code:
row = "b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'"
row = row.replace("b'", "").replace("'", "").replace('b"', '').replace('"', '')
print(row.encode('ascii', errors='ignore'))
still outputs b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38' whereas I would like it to output James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38. How can I do this?
Edit: Updated the code with a better example.
You seem to be mistaking single quotes for double quotes. Simply replace 'b:
>>> row = "xyz'b"
>>> row.replace("'b", "")
'xyz'
As an alternative to str.replace, you can simply slice the string to remove the unwanted leading and trailing characters:
>>> row[2:-1]
'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'
In your first .replace, change b' to 'b. Hence your code should be:
>>> row = "xyz'b"
>>> row = row.replace("'b", "").replace("'", "").replace('b"', '').replace('"', '')
# the first .replace() above now removes 'b instead of b'
>>> print(row.encode('ascii', errors='ignore'))
xyz
I am assuming the rest of your replacements are part of other tasks/matches that you didn't mention here.
If all you want is to take the part of the string before the first ', then you may just do:
row.split("'")[0]
Your chain of replacements is missing the one that removes 'b:
.replace("'b", '')
import ast
row = "b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'"
b_string = ast.literal_eval(row)
print(b_string)
u_string = b_string.decode('utf-8')
print(u_string)
out:
b_string:b'James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38'
u_string: James Bray,/citations?user=8IqSrdIAAAAJ&hl=en&oe=ASCII,1985,6020,188.12,42,1.31,76,2.38
The real question is how to convert a string into a Python object.
You have a string that contains the repr of a bytes string. To convert it into an actual bytes object you could use eval(), but ast.literal_eval() is the safer way to do it.
Once you have the bytes object, you can convert it to a unicode string, which does not start with "b", by using decode().
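The two steps can be combined into one small helper. This is only a sketch: the name unwrap_bytes_repr is made up, the row is shortened for readability, and it assumes the data is UTF-8 and that anything not starting with b' or b" should be passed through untouched:

```python
import ast

def unwrap_bytes_repr(text, encoding="utf-8"):
    """Turn the text "b'...'" into the str it represents; pass other text through."""
    if text[:2] in ("b'", 'b"'):
        value = ast.literal_eval(text)  # only evaluates literals, never arbitrary code
        if isinstance(value, bytes):
            return value.decode(encoding)
    return text

row = "b'James Bray,1985,6020'"
print(unwrap_bytes_repr(row))      # James Bray,1985,6020
print(unwrap_bytes_repr("plain"))  # plain
```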
In this post: Print a string as hex bytes? I learned how to print a string as an "array" of hex bytes. Now I need something the other way around:
So for example the input would be: 73.69.67.6e.61.74.75.72.65 and the output would be a string.
You can use the built-in binascii module. Do note, however, that this function will only work on ASCII encoded characters.
binascii.unhexlify(hexstr)
Your input string will need to be dotless, but that is easy to arrange with a simple replace:
string = string.replace('.','')
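Put together for the input from the question, the two steps might look like the sketch below. It assumes Python 3.3 or later, where unhexlify() also accepts an ASCII-only str and returns bytes:

```python
import binascii

hexstr = "73.69.67.6e.61.74.75.72.65"

# Drop the dots, then turn each pair of hex digits into one byte.
raw = binascii.unhexlify(hexstr.replace(".", ""))
print(raw)                  # b'signature'
print(raw.decode("ascii"))  # signature
```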
Another (arguably safer) method would be to use base64 in the following way:
import base64
encoded = base64.b16encode(b'data to be encoded')
print (encoded)
data = base64.b16decode(encoded)
print (data)
or in your example:
data = base64.b16decode(b"7369676e6174757265", True)
print (data.decode("utf-8"))
The string can be sanitised before input into the b16decode method.
Note that I am using Python 3.2; in other versions you may not necessarily need the b prefix in front of the string to denote bytes.
Example was found here
Without binascii:
>>> a="73.69.67.6e.61.74.75.72.65"
>>> "".join(chr(int(e, 16)) for e in a.split('.'))
'signature'
>>>
or better (Python 2 only, where str has a 'hex' codec):
>>> a="73.69.67.6e.61.74.75.72.65"
>>> "".join(e.decode('hex') for e in a.split('.'))
PS: works with unicode:
>>> a='.'.join(x.encode('hex') for x in 'Hellö Wörld!')
>>> a
'48.65.6c.6c.94.20.57.94.72.6c.64.21'
>>> print "".join(e.decode('hex') for e in a.split('.'))
Hellö Wörld!
>>>
EDIT:
No need for a generator expression here (thanks to thg435):
a.replace('.', '').decode('hex')
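The 'hex' codec on str no longer exists in Python 3. A rough Python 3 equivalent of the round trip, assuming the text is encoded as UTF-8 (so the hex digits for ö differ from the ones shown above), could be:

```python
text = "Hellö Wörld!"

# Encode to bytes first, then render each byte as two hex digits.
dotted = ".".join(f"{b:02x}" for b in text.encode("utf-8"))

# bytes.fromhex() reverses it once the dots are removed.
restored = bytes.fromhex(dotted.replace(".", "")).decode("utf-8")
print(restored)  # Hellö Wörld!
```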
Use string split to get a list of strings, then base 16 for decoding the bytes.
>>> inp="73.69.67.6e.61.74.75.72.65"
>>> ''.join((chr(int(i,16)) for i in inp.split('.')))
'signature'
>>>
Does anyone know how to get a chr to hex conversion where the output is always two digits?
For example, if my conversion yields 0x1, I need to convert that to 0x01, since I am concatenating a long hex string.
The code that I am using is:
hexStr += hex(ord(byteStr[i]))[2:]
You can use string formatting for this purpose:
>>> "0x{:02x}".format(13)
'0x0d'
>>> "0x{:02x}".format(131)
'0x83'
Edit: Your code suggests that you are trying to convert a string to a hex string representation. There is a much easier way to do this (Python 2.x):
>>> "abcd".encode("hex")
'61626364'
An alternative (that also works in Python 3.x) is the function binascii.hexlify().
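In Python 3, hexlify() wants bytes and returns bytes, so a sketch of the same conversion (assuming ASCII text) needs an encode and a decode around it. bytes.hex(), available since Python 3.5, gives a str directly:

```python
import binascii

data = "abcd".encode("ascii")  # hexlify() needs bytes in Python 3

hex_bytes = binascii.hexlify(data)
print(hex_bytes)                  # b'61626364'
print(hex_bytes.decode("ascii"))  # 61626364
print(data.hex())                 # 61626364
```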
You can use the format function:
>>> format(10, '02x')
'0a'
You won't need to remove the 0x part with that (like you did with the [2:])
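Applied to the loop from the question, this might look like the following sketch. The helper name to_hex is made up, and it assumes every character has a code point below 256 (larger code points would produce more than two digits):

```python
def to_hex(byte_str):
    """Concatenate the two-digit hex code of every character in byte_str."""
    return "".join(format(ord(ch), "02x") for ch in byte_str)

print(to_hex("\x01ab"))  # 016162
```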
If you're using Python 3.6 or higher you can also use f-strings:
v = 10
s = f"0x{v:02x}"
print(s)
output:
0x0a
The syntax for the braces part is identical to str.format(), except you use the variable's name. See https://www.python.org/dev/peps/pep-0498/ for more.
htmlColor = "#%02X%02X%02X" % (red, green, blue)
The standard module binascii may also be the answer, especially when you need to convert a longer string:
>>> import binascii
>>> binascii.hexlify('abc\n')
'6162630a'
Use format instead of using the hex function:
>>> mychar = ord('a')
>>> hexstring = '%.2X' % mychar
You can also change the number "2" to the number of digits you want, and the "X" to "x" to choose between uppercase and lowercase hex digits.
Many consider this the old %-style of formatting in Python, but I like it because the format string syntax is the same as the one used by other languages, like C and Java.
The simplest way (I think) is:
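A quick sketch of how the digit count and the letter case change the result, using 255 as an arbitrary sample value:

```python
value = 255

two_upper = '%.2X' % value   # at least two digits, uppercase
four_upper = '%.4X' % value  # at least four digits, zero-padded
four_lower = '%.4x' % value  # same width, lowercase

print(two_upper, four_upper, four_lower)  # FF 00FF 00ff
```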
your_str = '0x%02X' % 10
print(your_str)
will print:
0x0A
The number after the % will be converted to hex inside the string. I think it's clear this way, and to people who come from a C background (like me) it feels more like home.
I am new to python and I have a string that looks like this
Temp = "', '/1412311.2121\n"
My desired output is just the number, including the decimal point, so I'm looking for
1412311.2121
as the output. I am trying to get rid of the ', '/ and \n in the string. I have tried Temp.strip("\n") and Temp.rstrip("\n") to remove the \n, but it still seems to remain in my string. Does anyone have any ideas? Thanks for your help.
Strings are immutable. string.strip() doesn't change string, it's a function that returns a value. You need to do:
Temp = Temp.strip()
Note also that calling strip() without any parameters causes it to remove all whitespace characters, including \n
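As stalk said, you can achieve your desired result by calling strip("', /\n") on Temp. Note that the set of characters to strip has to include the space, otherwise stripping stops at the space after the comma.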
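A short sketch of both points, using the string from the question (the lowercase variable names are only for illustration): the original is left untouched unless the return value is assigned, and the character set passed to strip() must cover every unwanted character, including the space:

```python
temp = "', '/1412311.2121\n"

temp.strip("\n")                    # return value is thrown away
still_has_newline = temp.endswith("\n")
print(still_has_newline)            # True

cleaned = temp.strip("', /\n")      # note the space in the character set
print(repr(cleaned))                # '1412311.2121'
```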
If the data are like you show, numbers that are wrapped from right and left with non-number data, you can use a very simple regular expression:
import re
g = re.search('[0-9.]+', s)  # capture the inner number only
print(g.group(0))
I would use a regular expression to do this:
In [8]: s = "', '/1412311.2121\n"
In [9]: re.findall(r'([+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?)', s)
Out[9]: ['1412311.2121']
This returns a list of all floating-point numbers found in the string.
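If actual numbers are needed rather than strings, the matches can be passed through float(). A minimal sketch, assuming the same pattern (without the outer capturing group, which findall does not need here) and a made-up constant name FLOAT_RE:

```python
import re

FLOAT_RE = re.compile(r'[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?')

s = "', '/1412311.2121\n"
numbers = [float(m) for m in FLOAT_RE.findall(s)]
print(numbers)  # [1412311.2121]
```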
I am using struct.pack method which takes variable number of arguments. I want to convert a string to bytes. If a string is short (e.g. 'name') I can do it like:
bytes = struct.pack('4c','n','a','m','e')
But what to do when my string is 80 characters long?
I have tried the format string 's', instead of '80c' for struct.pack, but the result is not the same as that of above call.
Use "80s", not just "s". The input is a single string, rather than a series of characters. For example:
bytes = struct.pack('4s','name')
Note that if you specify a length greater than that of the input, the output will be null-padded.
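A small sketch of the padding behaviour, written for Python 3, where the 's' format expects a bytes object rather than a str. It also shows that a length shorter than the input truncates it:

```python
import struct

name = b"name"  # Python 3: the 's' format expects bytes

exact = struct.pack("4s", name)
padded = struct.pack("8s", name)
truncated = struct.pack("2s", name)

print(exact)      # b'name'
print(padded)     # b'name\x00\x00\x00\x00'
print(truncated)  # b'na'
```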
That doesn't make much sense. Strings are already bytes in Python 2.x, so you could just do:
my_string = 'I am some big string'
my_bytes = my_string
On Python 3, strings are unicode objects by default. To get bytes you have to encode the string.
my_bytes = my_string.encode('utf-8')
If you really want to use struct.pack, you can use the * syntax as described in the tutorial:
my_bytes = struct.pack('20c', *my_string)
or
my_bytes = struct.pack('20s', my_string)
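On Python 3 both calls need bytes, and the 'c' format wants one length-1 bytes object per character, so unpacking the string directly will not work. A sketch of how the two forms could be written there, assuming UTF-8 encoding and computing the length instead of hard-coding 20:

```python
import struct

my_string = "I am some big string"
my_bytes = my_string.encode("utf-8")
n = len(my_bytes)

# One 's' field holding the whole byte string.
as_block = struct.pack(f"{n}s", my_bytes)

# n separate 'c' fields; iterating bytes yields ints, so wrap each one.
as_chars = struct.pack(f"{n}c", *(bytes([b]) for b in my_bytes))

print(as_block == as_chars)  # True
```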