Remove '\x' from bytes - python

I'm currently reading bytes from a file, and I want to put two of these bytes into a list and convert them into an integer. Say the two bytes I want to read are \x02 and \x00. I want to join these bytes together before converting them into an integer such as 0x0200, but I'm having difficulty doing so because I cannot remove the \x from the bytes.
I've tried using: .replace('\\x', '') though this doesn't work as python treats the bytes as one object rather than a list. I've also considered using struct although I'm unsure whether this would work in my situation.
It's also not possible to iterate through each byte and remove the first two items as python still treats the entire byte as one object.
Here is the list I have after appending it with both bytes:
While it looks like two strings, they do not behave as strings. I then iterated over the list using:
for x in a:
    print a
The two lines below the list are the outputs of 'print a' (a blank space & special character). As you can see they do not print as normal strings.
Below is a code snippet showing how I add the bytes to the array, nothing complicated (test being the array in this case).
for i in openFile.read(512):
    ....
    ....
    elif 10 < count < 13:
        test.insert(0, i[0:])

You could use ord to extract each character's numeric value, then combine them with simple arithmetic.
>>> a = '\x02'
>>> b = '\x00'
>>> c = ord(a)*256 + ord(b)
>>> c == 0x0200
True
>>> print hex(c)
0x200
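For readers on Python 3, where indexing a bytes object already yields integers, the same arithmetic can be written without ord. This is a minimal sketch, assuming the two bytes have been read into a bytes object; the variable names are illustrative and not from the question.

```python
# Assumed input: two bytes read from a file opened in binary mode.
data = b'\x02\x00'

# Indexing bytes in Python 3 gives ints, so ord() is unnecessary.
manual = data[0] * 256 + data[1]

# int.from_bytes does the same for any length and either byte order.
big = int.from_bytes(data, byteorder='big')
little = int.from_bytes(data, byteorder='little')

print(hex(manual), hex(big), hex(little))  # 0x200 0x200 0x2
```

The byteorder argument decides which byte is treated as the most significant one, so it should match how the file was written.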

An alternate way to do this for standard-length types is to use the struct module to convert from strings of bytes to Python types.
For example:
>>> import struct
>>> byte_arr = ['\x02', '\x00']
>>> byte_str = ''.join(byte_arr)
>>> byte_str
'\x02\x00'
>>> num, = struct.unpack('>H', byte_str)
>>> num
512
In this example, the format string '>H' indicates a big-endian unsigned 2-byte integer. Other format strings can be used to specify other sizes, endianness, and signed/unsigned status.
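As a rough Python 3 illustration of those other format strings, the sketch below unpacks the same two bytes with different endianness and shows one signed value. It assumes the chunks are bytes objects (as they would be when read from a binary file in Python 3); the names are made up for the example.

```python
import struct

byte_list = [b'\x02', b'\x00']   # hypothetical list of single-byte chunks
byte_str = b''.join(byte_list)   # b'\x02\x00'

(big_unsigned,) = struct.unpack('>H', byte_str)     # big-endian, unsigned
(little_unsigned,) = struct.unpack('<H', byte_str)  # little-endian, unsigned
(big_signed,) = struct.unpack('>h', b'\xff\xfe')    # big-endian, signed

print(big_unsigned, little_unsigned, big_signed)  # 512 2 -2
```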

new_str=str(your_byte_like_object).split('\\x')
print("".join(new_str))
You can convert the bytes object to a str, split it on the \x separator to get a list, and then join the list back together.
The output looks like this:
eigioer #text
b'0\x1e\xd7\xe8\xdf\xc1\xd7\x90o3`mD\x92U\xf5\xca\xe7l\xe5"TM' #raw byte
["b'0", '1e', 'd7', 'e8', 'df', 'c1', 'd7', '90o3`mD', '92U', 'f5', 'ca', 'e7l', 'e5"TM\''] #list
b'01ed7e8dfc1d790o3`mD92Uf5cae7le5"TM' #after joining

I had the same problem. I had a "bytes" object and I needed to remove the \xs to be able to upload my file to Cassandra, and all I needed to do was to use this:
my_bytes.hex()
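A short sketch of what that call returns on Python 3, and how the result can be turned into an integer or back into bytes. The sample value is invented for illustration.

```python
my_bytes = b'\x02\x00\xff'       # example value, not from the original post

hex_text = my_bytes.hex()        # plain str of hex digits, no \x
as_int = int(hex_text, 16)       # parse the digits as one integer
round_trip = bytes.fromhex(hex_text)  # back to the original bytes

print(hex_text)   # 0200ff
print(as_int)     # 131327
```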

We know that it always starts with \x, and the 'it' is a string. So we can just do...
>>> num = "\\x02"
>>> num = num[2:]
>>> print num
02
Update:
>>> num = [r"\x02", r"\x20"]
>>> num = [ n[2:] for n in num ]
>>> num
['02', '20']

Related

Python: Converting HEX string to bytes

I'm trying to make byte frame which I will send via UDP. I have class Frame which has attributes sync, frameSize, data, checksum etc. I'm using hex strings for value representation. Like this:
testFrame = Frame("AA01","0034","44853600","D43F")
Now I need to concatenate these hex values and convert them to a byte array, like this:
def convertToBits(self):
    stringMessage = self.sync + self.frameSize + self.data + self.chk
    return b16decode(stringMessage)
But when I print the converted value I don't get the same values, or I don't know how to read Python notation correctly:
This is sync: AA01
This is frame size: 0034
This is data:44853600
This is checksum: D43F
b'\xaa\x01\x004D\x856\x00\xd4?'
So the first word is converted correctly (AA01 -> \xaa\x01), but the second (0034 -> \x004D) is not the same. I tried bytearray.fromhex because it allows spaces between bytes, but I got the same result. Can you help me send the same hex words via UDP?
Python displays any byte that can represent a printable ASCII character as that character. 4 is the same as \x34, but Python opted to print the ASCII character in the representation.
So \x004 is really the same as \x00\x34, D\x856\x00 is the same as \x44\x85\x36\x00, and \xd4? is the same as \xd4\x3f, because:
>>> b'\x34'
b'4'
>>> b'\x44'
b'D'
>>> b'\x36'
b'6'
>>> b'\x3f'
b'?'
This is just the representation of the bytes value; the value is entirely correct and you don't need to do anything else.
If it helps, you can visualise the bytes values as hex again using binascii.hexlify():
>>> import binascii
>>> binascii.hexlify(b'\xaa\x01\x004D\x856\x00\xd4?')
b'aa01003444853600d43f'
and you'll see that 4, D, 6 and ? are once again represented by the correct hexadecimal characters.
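To check this without relying on how the value prints, one option is to compare against a fully escaped literal or look at the raw integer values. This sketch reuses the field values from the question and assumes Python 3.

```python
from base64 import b16decode

# Field values taken from the question's example frame.
sync, frame_size, data, chk = "AA01", "0034", "44853600", "D43F"

frame = b16decode(sync + frame_size + data + chk)

# The repr mixes \x escapes with printable ASCII, but the bytes are intact.
same = frame == b'\xaa\x01\x00\x34\x44\x85\x36\x00\xd4\x3f'
as_ints = list(frame)            # one integer per byte
hex_again = frame.hex().upper()  # back to the original hex text

print(same)        # True
print(hex_again)   # AA01003444853600D43F
```

The frame object can be handed to a socket as it is; only its printed form differs from the hex text.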

How do you decode an ascii string in python?

For example, in your python shell(IDLE):
>>> a = "\x3cdiv\x3e"
>>> print a
The result you get is:
<div>
but if a is an ascii encoded string:
>>> a = "\\x3cdiv\\x3e" ## it's the actual \x3cdiv\x3e string if you read it from a file
>>> print a
The result you get is:
\x3cdiv\x3e
Now what i really want from a is <div>, so I did this:
>>> b = a.decode("ascii")
>>> print b
BUT surprisingly I did NOT get the result I want, it's still:
\x3cdiv\x3e
So basically what do I do to convert a, which is \x3cdiv\x3e to b, which should be <div>?
Thanks
>>> a = rb"\x3cdiv\x3e"
>>> a.decode('unicode_escape')
'<div>'
Also check out some interesting codecs.
With Python 3.x, you would adapt Kabie's answer to
a = b"\x3cdiv\x3e"
a.decode('unicode_escape')
or
a = b"\x3cdiv\x3e"
a.decode('ascii')
both give
>>> a
b'<div>'
What is the b prefix for?
Bytes literals are always prefixed with 'b' or 'B'; they produce an
instance of the bytes type instead of the str type. They may only
contain ASCII characters; bytes with a numeric value of 128 or greater
must be expressed with escapes.
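For the case in the question, where the text read from a file really contains a backslash followed by x3c, a Python 3 sketch could look like the following. It assumes the input holds only ASCII characters; the variable names are illustrative.

```python
# The text as it would arrive from a file: literal backslashes,
# not the bytes 0x3c and 0x3e.
escaped = "\\x3cdiv\\x3e"

# str -> bytes (ASCII only), then interpret the escape sequences.
decoded = escaped.encode('ascii').decode('unicode_escape')

print(escaped)   # \x3cdiv\x3e
print(decoded)   # <div>
```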

Python: convert a dot separated hex values to string?

In this post: Print a string as hex bytes? I learned how to print a string as an "array" of hex bytes. Now I need something the other way around:
So for example the input would be: 73.69.67.6e.61.74.75.72.65 and the output would be a string.
You can use the built-in binascii module. Do note, however, that this function will only work on ASCII encoded characters.
binascii.unhexlify(hexstr)
Your input string will need to be dotless, however, but that is quite easy with a simple
string = string.replace('.','')
Another (arguably safer) method would be to use base64 in the following way:
import base64
encoded = base64.b16encode(b'data to be encoded')
print (encoded)
data = base64.b16decode(encoded)
print (data)
or in your example:
data = base64.b16decode(b"7369676e6174757265", True)
print (data.decode("utf-8"))
The string can be sanitised before input into the b16decode method.
Note that I am using Python 3.2; you may not necessarily need the b prefix in front of the string to denote bytes.
Example was found here
Without binascii:
>>> a="73.69.67.6e.61.74.75.72.65"
>>> "".join(chr(int(e, 16)) for e in a.split('.'))
'signature'
>>>
or better:
>>> a="73.69.67.6e.61.74.75.72.65"
>>> "".join(e.decode('hex') for e in a.split('.'))
PS: works with unicode:
>>> a='.'.join(x.encode('hex') for x in 'Hellö Wörld!')
>>> a
'48.65.6c.6c.94.20.57.94.72.6c.64.21'
>>> print "".join(e.decode('hex') for e in a.split('.'))
Hellö Wörld!
>>>
EDIT:
No need for a generator expression here (thx to thg435):
a.replace('.', '').decode('hex')
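The 'hex' codec and str.decode used above exist only on Python 2. On Python 3, a comparable sketch (assuming the decoded bytes are ASCII text, as in the example input) might be:

```python
dotted = "73.69.67.6e.61.74.75.72.65"

# Option 1: strip the dots and let bytes.fromhex parse the digit pairs.
text_a = bytes.fromhex(dotted.replace('.', '')).decode('ascii')

# Option 2: convert each field to an int and build the bytes object directly.
text_b = bytes(int(part, 16) for part in dotted.split('.')).decode('ascii')

print(text_a, text_b)  # signature signature
```

If the bytes are not ASCII, the decode call needs whichever encoding was used to produce them.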
Use string split to get a list of strings, then base 16 for decoding the bytes.
>>> inp="73.69.67.6e.61.74.75.72.65"
>>> ''.join((chr(int(i,16)) for i in inp.split('.')))
'signature'
>>>

passing large number of arguments to struct.pack

I am using struct.pack method which takes variable number of arguments. I want to convert a string to bytes. If a string is short (e.g. 'name') I can do it like:
bytes = struct.pack('4c','n','a','m','e')
But what to do when my string is 80 characters long?
I have tried the format string 's', instead of '80c' for struct.pack, but the result is not the same as that of above call.
Use "80s", not just "s". The input is a single string, rather than a series of characters. i.e.
bytes = struct.pack('4s','name')
Note that if you specify a length greater than that of the input, the output will be null-padded.
That doesn't make much sense. Strings are already bytes in Python 2.x, so you could just do:
my_string = 'I am some big string'
my_bytes = my_string
On Python 3, strings are unicode objects by default. To get bytes, you have to encode the string.
my_bytes = my_string.encode('utf-8')
If you really want to use struct.pack, you'd use the * syntax as described in the tutorial:
my_bytes = struct.pack('20c', *my_string)
or
my_bytes = struct.pack('20s', my_string)
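On Python 3 the 's' format expects a bytes object, so the string has to be encoded first. The sketch below also shows the null-padding mentioned earlier; the sample text and variable names are made up for illustration.

```python
import struct

name = 'name'                      # example text; any str works
raw = name.encode('utf-8')         # struct's 's' format needs bytes in Python 3

exact = struct.pack('4s', raw)     # b'name'
padded = struct.pack('8s', raw)    # b'name\x00\x00\x00\x00' (null-padded)
sized = struct.pack('%ds' % len(raw), raw)  # format built from the actual length

print(exact, padded, sized)
```

Building the format string from len(raw) avoids hard-coding 80 (or any other length) when the input size varies.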

How to convert an int to a hex string?

I want to take an integer (that will be <= 255), to a hex string representation
e.g.: I want to pass in 65 and get out '\x41', or 255 and get '\xff'.
I've tried doing this with the struct.pack('c',65), but that chokes on anything above 9 since it wants to take in a single character string.
You are looking for the chr function.
You seem to be mixing decimal representations of integers and hex representations of integers, so it's not entirely clear what you need. Based on the description you gave, I think one of these snippets shows what you want.
>>> chr(0x65) == '\x65'
True
>>> hex(65)
'0x41'
>>> chr(65) == '\x41'
True
Note that this is quite different from a string containing an integer as hex. If that is what you want, use the hex builtin.
This will convert an integer to a 2 digit hex string with the 0x prefix:
strHex = "0x%0.2X" % integerVariable
What about hex()?
hex(255) # 0xff
If you really want to have \ in front you can do:
print '\\' + hex(255)[1:]
Let me add this one, because sometimes you just want the single digit representation
( x can be lower, 'x', or uppercase, 'X', the choice determines if the output letters are upper or lower.):
'{:x}'.format(15)
> f
And now with the new f'' format strings you can do:
f'{15:x}'
> f
To add 0 padding you can use 0>n:
f'{2034:0>4X}'
> 07F2
NOTE: the initial 'f' in f'{15:x}' is to signify a format string
Try:
"0x%x" % 255 # => 0xff
or
"0x%X" % 255 # => 0xFF
Python Documentation says: "keep this under Your pillow: http://docs.python.org/library/index.html"
For Python >= 3.6, use f-string formatting:
>>> x = 114514
>>> f'{x:0x}'
'1bf52'
>>> f'{x:#x}'
'0x1bf52'
If you want to pack a struct with a value <= 255 (one byte unsigned, uint8_t) and end up with a string of one character, you're probably looking for the format B instead of c. The c format takes a single character (not too useful by itself), while B takes an integer.
struct.pack('B', 65)
(And yes, 65 is \x41, not \x65.)
The struct module will also conveniently handle endianness for communication or other uses.
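A brief Python 3 sketch of the B format and of the endianness prefixes, with bytes([n]) shown as a struct-free alternative for a single byte. The values are chosen only for illustration.

```python
import struct

one_byte = struct.pack('B', 65)        # b'A', i.e. b'\x41'
also_one_byte = bytes([255])           # b'\xff' without struct

# Multi-byte values: '>' is big-endian, '<' is little-endian.
big = struct.pack('>H', 0x0102)        # b'\x01\x02'
little = struct.pack('<H', 0x0102)     # b'\x02\x01'

print(one_byte, also_one_byte, big, little)
```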
With format(), as per format-examples, we can do:
>>> # format also supports binary numbers
>>> "int: {0:d}; hex: {0:x}; oct: {0:o}; bin: {0:b}".format(42)
'int: 42; hex: 2a; oct: 52; bin: 101010'
>>> # with 0x, 0o, or 0b as prefix:
>>> "int: {0:d}; hex: {0:#x}; oct: {0:#o}; bin: {0:#b}".format(42)
'int: 42; hex: 0x2a; oct: 0o52; bin: 0b101010'
Note that for large values, hex() still works (some other answers don't):
x = hex(349593196107334030177678842158399357)
print(x)
Python 2: 0x4354467b746f6f5f736d616c6c3f7dL
Python 3: 0x4354467b746f6f5f736d616c6c3f7d
For a decrypted RSA message, one could do the following:
import binascii
hexadecimals = hex(349593196107334030177678842158399357)
print(binascii.unhexlify(hexadecimals[2:-1])) # python 2
print(binascii.unhexlify(hexadecimals[2:])) # python 3
(int_variable).to_bytes(bytes_length, byteorder='big'|'little').hex()
For example:
>>> (434).to_bytes(4, byteorder='big').hex()
'000001b2'
>>> (434).to_bytes(4, byteorder='little').hex()
'b2010000'
This worked best for me
"0x%02X" % 5 # => 0x05
"0x%02X" % 17 # => 0x11
Change the 2 if you want a bigger width (2 means two printed hex characters), so 3 will give you the following:
"0x%03X" % 5 # => 0x005
"0x%03X" % 17 # => 0x011
You can also convert a number written in any base to hex with this one-liner:
hex(int(n,x)).replace("0x","")
Here n is a string holding your number and x is the base of that number. It is first converted to an integer and then to hex; since hex() puts 0x at the front, replace removes it.
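The same conversion can also be written with format, which never emits the 0x prefix in the first place. This is only a sketch; to_hex is a hypothetical helper name, not something from the answer.

```python
def to_hex(number_text, base):
    """Hypothetical helper: parse number_text in the given base, return hex digits."""
    return format(int(number_text, base), 'x')

print(to_hex('11111111', 2))   # ff
print(to_hex('777', 8))        # 1ff
print(to_hex('-255', 10))      # -ff
```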
I wanted a random integer converted into a six-digit hex string with a # at the beginning. To get this I used
"#%06x" % random.randint(0, 0xFFFFFF)
As an alternative representation you could use
[in] '%s' % hex(15)
[out]'0xf'
