How to transform a byte value "b'27 '" into int 27 - python

How do I get the number out of a bytes variable?
I need to transfer data from an Arduino to a Raspberry Pi over serial with Python. I managed to isolate the value, but its type is bytes. How do I convert it to an int?
The variable is
b'27'
but I want to get
27
I tried
print(int.from_bytes(b'\x27', "big", signed=True))
But I can't get the correct number, 27.

You can use decode to get it to a regular str and then use int:
x = b'27'
y = int(x.decode()) # decode is a method on the bytes class that returns a string
type(y)
# <class 'int'>
Alternatively:
y = int(b'27')
type(y)
# <class 'int'>
Per @chepner's comment, note that an unusual encoding can break the latter approach, and a non-UTF-8 encoding could break both.
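As a rough sketch of how this plays out with serial data, where a reading often arrives with trailing whitespace or a line ending (as in the title's b'27 '), the helper below decodes, strips, and converts. The name parse_reading and the choice of ASCII are assumptions for illustration, not part of the original answer.

```python
def parse_reading(raw: bytes) -> int:
    """Convert ASCII digits received as bytes to an int.

    Hypothetical helper: assumes the sender transmits the number as ASCII text.
    """
    text = raw.decode("ascii")   # raises UnicodeDecodeError on non-ASCII bytes
    return int(text.strip())     # raises ValueError if the text is not a number


print(parse_reading(b"27"))      # 27
print(parse_reading(b"27 "))     # 27
print(parse_reading(b"27\r\n"))  # 27
```

Catching UnicodeDecodeError and ValueError around the call is one way to skip a corrupted line instead of crashing the read loop.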

To supplement the helpful and practical answer given by C.Nivs, I would like to add that if you had wanted to use int.from_bytes() to retrieve the value 27, you would have needed to do:
int.from_bytes(b'\x1B', "big", signed=True)
because '\x27' is actually the hex value for the value 39. There are loads of conversion tables online that can be helpful for cross-referencing decimal against hex values. These two forms are only 1:1 for values less than 10.
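The difference between a raw byte and ASCII digits can be checked directly. This is only an illustrative sketch: int.from_bytes reads the numeric values of the bytes, while int() on decoded text reads the digits.

```python
# One raw byte: the byte's numeric value is the integer.
print(int.from_bytes(b"\x27", "big"))  # 39, because 0x27 == 2*16 + 7
print(int.from_bytes(b"\x1b", "big"))  # 27, because 0x1B == 1*16 + 11

# Two ASCII characters '2' and '7': their byte values are 0x32 and 0x37.
print(list(b"27"))                     # [50, 55]
print(int.from_bytes(b"27", "big"))    # 12855, i.e. 0x3237
print(int(b"27".decode("ascii")))      # 27
```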

Related

Pack into c types and obtain the binary value back

I'm using the following code to pack an integer into an unsigned short as follows,
raw_data = 40
# Pack into little endian
data_packed = struct.pack('<H', raw_data)
Now I'm trying to unpack the result as follows. I use utf-16-le since the data is encoded as little-endian.
def get_bin_str(data):
    bin_asc = binascii.hexlify(data)
    result = bin(int(bin_asc.decode("utf-16-le"), 16))
    trimmed_res = result[2:]
    return trimmed_res
print(get_bin_str(data_packed))
Unfortunately, it throws the following error,
result = bin(int(bin_asc.decode("utf-16-le"), 16))
ValueError: invalid literal for int() with base 16: '㠲〰'
How do I properly decode the little-endian bytes into binary data?
Use unpack to reverse what you packed. The data isn't UTF-encoded so there is no reason to use UTF encodings.
>>> import struct
>>> data_packed = struct.pack('<H', 40)
>>> data_packed.hex() # the two little-endian bytes are 0x28 (40) and 0x00 (0)
'2800'
>>> data = struct.unpack('<H',data_packed)
>>> data
(40,)
unpack returns a tuple, so index it to get the single value
>>> data = struct.unpack('<H',data_packed)[0]
>>> data
40
To print in binary, use string formatting. Either of these works well. bin() doesn't let you specify the number of binary digits to display, and the 0b prefix has to be removed if it is not wanted.
>>> format(data,'016b')
'0000000000101000'
>>> f'{data:016b}'
'0000000000101000'
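Put together, the unpack and formatting steps above could be wrapped as follows. This is a minimal sketch; the function name unsigned_short_bits is made up for illustration.

```python
import struct


def unsigned_short_bits(packed: bytes) -> str:
    """Return the 16 bits of a little-endian unsigned short as a string."""
    (value,) = struct.unpack("<H", packed)  # unpack always returns a tuple
    return format(value, "016b")


data_packed = struct.pack("<H", 40)
print(unsigned_short_bits(data_packed))  # 0000000000101000
```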
You have not said what you are trying to do, so let's assume your goal is to educate yourself. (If you are trying to pack data that will be passed to another program, the only reliable test is to check if the program reads your output correctly.)
Python does not have an "unsigned short" type, so the output of struct.pack() is a bytes object. To see what's in it, just print it:
>>> data_packed = struct.pack('<H', 40)
>>> print(data_packed)
b'(\x00'
What's that? Well, the character (, which is decimal 40 in the ascii table, followed by a null byte. If you had used a number that does not map to a printable ascii character, you'd see something less surprising:
>>> struct.pack("<H", 11)
b'\x0b\x00'
Where 0b is 11 in hex, of course. Wait, I specified "little-endian", so why is my number on the left? The answer is, it's not. Python prints the byte string left to right because that's how English is written, but that's irrelevant. If it helps, think of strings as growing upwards: From low memory locations to high memory. The least significant byte comes first, which makes this little-endian.
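To make the byte order visible, it may help to pack the same number both ways. The sketch below also shows int.to_bytes, which produces the same layouts without struct.

```python
import struct

little = struct.pack("<H", 11)  # least significant byte first
big = struct.pack(">H", 11)     # most significant byte first

print(little)  # b'\x0b\x00'
print(big)     # b'\x00\x0b'

# int.to_bytes produces the same layouts without struct.
assert little == (11).to_bytes(2, "little")
assert big == (11).to_bytes(2, "big")
```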
Anyway, you can also look at the bytes directly:
>>> print(data_packed[0])
40
Yup, it's still there. But what about the bits, you say? For this, use bin() on each of the bytes separately:
>>> bin(data_packed[0])
'0b101000'
>>> bin(data_packed[1])
'0b0'
The two high bits you see are worth 32 and 8. Your number was less than 256, so it fits entirely in the low byte of the short you constructed.
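Since bin() drops leading zeros, a fixed-width view of each byte can be easier to read. A small sketch using format() with a width of 8:

```python
import struct

data_packed = struct.pack("<H", 40)

# Indexing or iterating a bytes object yields ints in the range 0..255.
bits_per_byte = [format(byte, "08b") for byte in data_packed]
print(bits_per_byte)  # ['00101000', '00000000']
```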
What's wrong with your unpacking code?
Just for fun let's see what your sequence of transformations in get_bin_str was doing.
>>> binascii.hexlify(data_packed)
b'2800'
Um, all right. Not sure why you converted to hex digits, but now you have 4 bytes, not two. (28 is the number 40 written in hex, the 00 is for the null byte.) In the next step, you call decode and tell it that these 4 bytes are actually UTF-16; there's just enough for two unicode characters, let's take a look:
>>> b'2800'.decode("utf-16-le")
'㠲〰'
In the next step Python finally notices that something is wrong, but by then it does not make much difference because you are pretty far away from the number 40 you started with.
To correctly read your data as a UTF-16 character, call decode directly on the byte string you packed.
>>> data_packed.decode("utf-16-le")
'('
>>> ord('(')
40
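If the aim of get_bin_str was simply to get the binary digits of the packed value, one possible rewrite skips the hex and text-decoding steps and reads the bytes with int.from_bytes. This is a sketch under that assumption; the name get_bin_str_fixed is hypothetical.

```python
import struct


def get_bin_str_fixed(data: bytes) -> str:
    """Binary digits of a little-endian unsigned integer, without the '0b' prefix."""
    value = int.from_bytes(data, byteorder="little", signed=False)
    return format(value, "b")


data_packed = struct.pack("<H", 40)
print(get_bin_str_fixed(data_packed))  # 101000
```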

Convert binary to signed, little endian 16bit integer in Python

I'm trying to convert a list of binary strings into signed 16-bit little-endian integers.
input_data = [['1100110111111011','1101111011111111','0010101000000011'],['1100111111111011','1101100111111111','0010110100000011']]
Desired Output =[[-1074, -34, 810],[-1703, -39, 813]]
This is what I've got so far. It's been adapted from: Hex string to signed int in Python 3.2?,
Conversion from HEX to SIGNED DEC in python
results = []
for i in input_data:
    hex_convert = [hex(int(x,2)) for x in i]
    convert = [int(y[4:6] + y[2:4], 16) for y in hex_convert]
    results.append(convert)
print (results)
output: [[64461, 65502, 810], [64463, 65497, 813]]
This works fine, but the above are unsigned integers. I need signed integers that can represent negative values. I then tried a different approach:
results_2 = []
for i in input_data:
    hex_convert = [hex(int(x,2)) for x in i]
    to_bytes = [bytes(j, 'utf-8') for j in hex_convert]
    split_bits = [int(k, 16) for k in to_bytes]
    convert_2 = [int.from_bytes(b, byteorder = 'little', signed = True) for b in to_bytes]
    results_2.append(convert_2)
print (results_2)
Output: [[108191910426672, 112589973780528, 56282882144304], [108191943981104, 112589235583024, 56282932475952]]
This result is even more wild than the first. I know my approach is wrong, and it doesn't help that I've never been able to get my head around binary conversion, but I feel I'm on the right path with:
(b, byteorder = 'little', signed = True)
but I can't work out where I'm going wrong. Any help explaining this concept would be greatly appreciated.
This result is even more wild than the first. I know my approach is wrong... but can't work out where i'm wrong.
The problem is in the conversion to bytes. Let's look at it a step at a time:
int(x, 2)
Fine; we treat the string as a base-2 representation of the integer value, and get that integer. Only problem is it's a) unsigned and b) big-endian.
hex(int(x,2))
What this does is create a string representation of the integer, in base 16, with a 0x prefix. Notably, there are two text characters per byte that we want. This is already heading down the wrong path.
You might have thought of using hexadecimal because you've seen \xAB style escapes inside string representations. This is a completely different thing. The string '\xAB' contains one character. The string '0xAB' contains four.
From there, everything else is still nonsense. Converting to bytes with a text encoding just means that the text character 0 for example is replaced with the byte value 48 (since in UTF-8 it's encoded with a single byte with that value). For this data we get the same results with UTF-8 that we would by assuming plain ASCII (since UTF-8 is "ASCII transparent" and there are no non-ASCII characters in the text).
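The distinction between an escape and spelled-out hex text, and what encoding that text does, can be seen in a short sketch (the value 0xCDFB is the first input string from the question):

```python
escaped = "\xAB"   # one character, code point 0xAB
spelled = "0xAB"   # four characters: '0', 'x', 'A', 'B'
print(len(escaped), len(spelled))      # 1 4

# Encoding the hex *text* yields one byte per character, not per hex pair.
as_text_bytes = bytes(hex(0xCDFB), "utf-8")
print(as_text_bytes)                   # b'0xcdfb'
print(list(as_text_bytes))             # [48, 120, 99, 100, 102, 98]

# The two bytes the integer actually stands for:
print((0xCDFB).to_bytes(2, "big"))     # b'\xcd\xfb'
```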
So how do we do it?
We want to convert the integer from the first step into the bytes used to represent it. Just as there is a .from_bytes class method allowing us to create an integer from underlying bytes, there is an instance method allowing us to get the bytes that would represent the integer.
So, we use .to_bytes, specifying the length, signedness and endianness that was assumed when we created the int from the binary string - that gives us bytes that correspond to that string. Then, we re-create the integer from those bytes, except now specifying the proper signedness and endianness. The reason that .to_bytes makes us specify a length is because the integer doesn't have a particular length - there are a minimum number of bytes required to represent it, but you could use as many more as you like. (This is especially important if you want to handle signed values, since it will do sign-extension automatically.)
Thus:
results_2 = []
for i in input_data:
    values = [int(x,2) for x in i]
    as_bytes = [x.to_bytes(2, byteorder='big', signed=False) for x in values]
    reinterpreted = [int.from_bytes(x, byteorder='little', signed=True) for x in as_bytes]
    results_2.append(reinterpreted)
But let's improve the organization of the code a bit. I will first make a function to handle a single integer value, and then we can use comprehensions to process the list. In fact, we can use nested comprehensions for the nested list.
def as_signed_little(binary_str):
    # This time, taking advantage of positional args and default values.
    as_bytes = int(binary_str, 2).to_bytes(2, 'big')
    return int.from_bytes(as_bytes, 'little', signed=True)
# And now we can do:
results_2 = [[as_signed_little(x) for x in i] for i in input_data]
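For comparison, the same reinterpretation can be sketched with struct: pack the value as an unsigned big-endian short, then unpack the same two bytes as a signed little-endian short. The function name as_signed_little_struct is hypothetical. Run on the question's input, this gives -1075 and -1073 for the first entry of each row, which differs slightly from the desired output quoted in the question; the remaining values match it.

```python
import struct

input_data = [
    ['1100110111111011', '1101111011111111', '0010101000000011'],
    ['1100111111111011', '1101100111111111', '0010110100000011'],
]


def as_signed_little_struct(binary_str: str) -> int:
    raw = struct.pack(">H", int(binary_str, 2))  # bytes in the order written
    (value,) = struct.unpack("<h", raw)          # reread as signed little-endian
    return value


results = [[as_signed_little_struct(x) for x in row] for row in input_data]
print(results)  # [[-1075, -34, 810], [-1073, -39, 813]]
```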

How would you unpack a 32bit int in Python?

I'm fairly weak with structs but I have a feeling they're the best way to do this. I have a large string of binary data and need to pull 32 of those chars, starting at a specific index, and store them as an int. What is the best way to do this?
Since I need to start at an initial position I have been playing with struct.unpack_from(). Based on the format table here, I thought the 'i' formatting being 4 bytes is exactly what I needed but the code below executes and prints "(825307441,)" where I was expecting either the binary, decimal or hex form. Can anyone explain to me what 825307441 represents?
Also is there a method of extracting the data in a similar fashion but returning it in a list instead of a tuple? Thank you
st = "1111111111111111111111111111111"
test = struct.unpack_from('i',st,0)
print test
Just use int
>>> st = "1111111111111111111111111111111"
>>> int(st,2)
2147483647
>>> int(st[1:4],2)
7
You can slice the string any way you want to get the indices you need. Passing 2 as the second argument tells int that the string is written in binary.
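On the question of what 825307441 represents: it is the first four characters of the string read as raw bytes. Each '1' is the ASCII byte 0x31, so four of them form the 32-bit integer 0x31313131. The sketch below shows this in Python 3, where unpack_from needs a bytes object; the helper name bits_at is hypothetical.

```python
import struct

st = "1" * 31  # 31 ones, matching the string in the question

# Four ASCII '1' bytes (0x31 each) read as a 4-byte integer.
(raw_value,) = struct.unpack_from("<i", st.encode("ascii"), 0)
print(raw_value, hex(raw_value))  # 825307441 0x31313131


def bits_at(text: str, start: int, width: int = 32) -> int:
    """Interpret up to `width` '0'/'1' characters starting at `start` as base 2."""
    return int(text[start:start + width], 2)


print(bits_at(st, 0))     # 2147483647
print(bits_at(st, 1, 3))  # 7

# struct.unpack_from returns a tuple; list(...) converts it if a list is wanted.
print(list(struct.unpack_from("<i", st.encode("ascii"), 0)))  # [825307441]
```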

Convert hexadecimal notation literally to string or vice versa

The function I'm writing gets a checksum (format: '*76') as a string, isolated from an NMEA string. This checksum in string form is called 'Obs' (observed from the string). The function then computes the checksum from the rest of the string and gets the result as hex (terminal output: 0x76); this is called 'Com' (computed from the string). Now I need to convert one to the other so I can compare them against each other.
I've tried stuff like:
HexObs = hex(Obs) #with Obs as '0x76' and '0*76'
Which gives me an error.
and
StrCom = str(Com)
Which gives: '118'
There were no previous questions in which I recognised my question.
Does anyone know how to convert one to the other? Thanks in advance.
I think your problem is getting the original into an actual hex form:
tobs = '76'
obs = hex(int('0x' + tobs, 16))
That gives you a hex string ('0x76') to compare.
Alternatively, you could use:
tobs = '76'
com = '0x76'
tcom = com[2:]
Then compare tobs and tcom.
To go from a string hex representation, use:
>>> int('0x76', 16)
118
The second argument is the base.
To go from an integer to a string hex representation, use:
>>> hex(118)
'0x76'
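Comparing the two checksums as integers avoids worrying about prefixes or letter case in the text form. A minimal sketch, assuming the observed field is '*' followed by hex digits and the computed checksum is an int; the helper name checksums_match is made up for illustration.

```python
def checksums_match(observed: str, computed: int) -> bool:
    """Compare a checksum field such as '*76' with a computed integer.

    Hypothetical helper: assumes `observed` is '*' followed by hex digits.
    """
    return int(observed.lstrip("*"), 16) == computed


print(checksums_match("*76", 0x76))  # True
print(checksums_match("*76", 0x77))  # False

# Going the other way: format the integer as two uppercase hex digits.
print(format(0x76, "02X"))           # 76
```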

Changing string to byte type in Python 2.7

In Python 3.2, I can change the type of an object easily. For example:
x=0
print(type (x))
x=bytes(0)
print(type (x))
it will give me this :
<class 'int'>
<class 'bytes'>
But in Python 2.7, it seems that I can't do it the same way. If I run the same code, it gives me this:
<type 'int'>
<type 'str'>
What can I do to change the type into a bytes type?
You are not changing types, you are assigning a different value to a variable.
You are also hitting on one of the fundamental differences between Python 2.x and 3.x; grossly simplified, the 2.x unicode type has replaced the str type, which itself has been renamed to bytes. It happens to work in your code because more recent versions of Python 2 added bytes as an alias for str, to ease writing code that works under both versions.
In other words, your code is working as expected.
What can I do to change the type into a bytes type?
You can't; there is no distinct 'bytes' type in Python 2.7 (the name bytes is an alias for str).
From the Python 2.7 documentation (5.6 Sequence Types):
"There are seven sequence types: strings, Unicode strings, lists, tuples, bytearrays, buffers, and xrange objects."
From the Python 3.2 documentation (5.6 Sequence Types):
"There are six sequence types: strings, byte sequences (bytes objects), byte arrays (bytearray objects), lists, tuples, and range objects."
In Python 2.x, bytes is just an alias for str, so everything works as expected. Moreover, you are not changing the type of any objects here – you are merely rebinding the name x to a different object.
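A side note on the question's snippet, sketched for Python 3: bytes(0) does not convert the number 0 into bytes, it builds a bytes object of length 0. The lines below show a few ways to get bytes from an integer, depending on what is meant.

```python
print(bytes(0))                 # b''  (zero-length)
print(bytes(3))                 # b'\x00\x00\x00'  (three zero bytes)
print(bytes([0]))               # b'\x00'  (one byte with value 0)
print(str(0).encode("ascii"))   # b'0'  (the digit as text)
print((0).to_bytes(2, "big"))   # b'\x00\x00'  (fixed-width binary form)
```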
Maybe not exactly what you need, but when I needed the decimal value of the byte d8 (a byte giving an offset in a file), I did this in Python 2:
a = (data[-1:]) # the variable 'data' holds 60 bytes from a PE file, I needed the last byte
#so now a == '\xd8' , a string
b = str(a.encode('hex')) # which makes b == 'd8' , again a string
c = '0x' + b # c == '0xd8' , again a string
int_value = int(c,16) # giving me my desired offset in decimal: 216
#I hope this can help someone stuck in my situation
Just an example to illustrate turning a regular string into a binary string and back (Python 2):
sb = "a0" # just string with 2 characters representing a byte
ib = int(sb, 16) # integer value (160 decimal)
xsb = chr(ib) # a binary string (equals '\xa0')
Now backwards
back_sb = xsb.encode('hex')
back_sb == sb # returns True
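The .encode('hex') calls above exist only in Python 2. As a rough Python 3 counterpart, bytes.fromhex and bytes.hex cover the same round trip, and indexing a bytes object already yields an int. The sample data value here is a stand-in, not the PE file bytes from the answer.

```python
data = b"\x00\x01\xd8"          # stand-in for bytes read from a file

# Indexing bytes in Python 3 already yields an int.
print(data[-1])                 # 216

# Hex text <-> raw bytes.
raw = bytes.fromhex("a0")
print(raw)                      # b'\xa0'
print(raw.hex())                # a0
print(raw.hex() == "a0")        # True
```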
