I get a value (b'\xc8\x00') from a temperature sensor and want to convert it to a float value. Is it right that I need to decode it?
Here is my function:
def ToFloat(data):
    s = data.decode('utf-8')
    print(s)
    return s
But when I try to run it, I get the error:
'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte
You seem to have packed bytes, not unicode objects. Use struct.unpack():
In [3]: import struct
In [4]: struct.unpack('h', b'\xc8\x00')[0]
Out[4]: 200
Format h specifies a short value (2 bytes). If your temperature values will always be positive, you can use H for unsigned short:
import struct

def to_float(data):
    return float(struct.unpack('H', data)[0])
Note that ToFloat() is a bit misleading, as it returns a float but interprets the data as an integer value. If the bytes actually represented a float, you would need to know in which format the float is packed into these two bytes (a float usually takes more than two bytes).
data = b'\xc8\x00'

def ToFloat(data):
    byte0 = int(data[0])
    print(byte0)
    byte1 = int(data[1])
    print(byte1)
    number = byte0 + 256 * byte1
    print(number)
    return float(number)
returns 200.0, which seems reasonable. If not, check what the data means and process accordingly.
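In Python 3 there is also int.from_bytes(), which does the same two-byte little-endian conversion without struct (a sketch; whether a scale factor is needed depends on the sensor's datasheet):

```python
data = b'\xc8\x00'

# Interpret the two bytes as a little-endian unsigned integer
value = int.from_bytes(data, byteorder='little')
print(float(value))  # 200.0
```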
I am looking for a python equivalent of https://www.npmjs.com/package/bit-sequence.
That is to say, I need a function which takes some bytes, a 'start' value corresponding to an integer bit index at which to start extraction (not a byte index), and a 'length' value corresponding to the number of bits to extract from the bytes array.
Here's an answer, with the example from the page you linked:
def bitSequence(data, start, length):
    # Convert the byte array into one long bit string
    binstring = "".join(bin(byte)[2:].zfill(8) for byte in data)
    # Convert the required part of the bit string to a base-10 int
    ret = int("0b" + binstring[start:start+length], 2)
    return ret

example = [0b00010101, 0b10101000, 0b00000000, 0b00000000]
print(bitSequence(example, 7, 11))
1696
I think that solves your problem. For more information about bit manipulation in Python, take a look at https://realpython.com/python-bitwise-operators/#bit-strings-in-python
def bits_extraction(n, start, length):
    # bin() yields a '0b' prefix; strip it so 'start' indexes real bits.
    # Note: leading zero bits are lost, so indices are relative to the
    # highest set bit, not to a fixed bit width.
    binstring = bin(n)[2:]
    return int(binstring[start:start+length], 2)

bits = 0b101010
response = bits_extraction(n=bits, start=2, length=4)
print(response)
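A sketch of an alternative that avoids string round trips entirely: treat the whole byte sequence as one big-endian integer and shift/mask out the requested bits. Bit index 0 is the most significant bit of the first byte, matching the bit-sequence package example above:

```python
def bit_sequence(data, start, length):
    # View the bytes as one big integer, most significant byte first
    n = int.from_bytes(bytes(data), byteorder='big')
    total_bits = len(data) * 8
    # Shift the wanted bits down to the bottom, then mask off the rest
    return (n >> (total_bits - start - length)) & ((1 << length) - 1)

example = [0b00010101, 0b10101000, 0b00000000, 0b00000000]
print(bit_sequence(example, 7, 11))  # 1696
```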
I am trying to read bytes from an image and get all the 16-bit int values from it.
After I parsed the image header, I got to the pixel values. The value I get when a pair of bytes looks like b"\xd4\x00" is incorrect: in this case it should be 54272, not 3392.
These are the relevant parts of the code:
I use a generator to get the bytes:
import itertools

def osddef_generator(in_file):
    with open(in_file, mode='rb') as f:
        dat = f.read()
        for byte in dat:
            yield byte

def take_slice(in_generator, size):
    return ''.join(str(chr(i)) for i in itertools.islice(in_generator, size))

def take_single_pixel(in_generator):
    pix = itertools.islice(in_generator, 2)
    hex_list = [hex(i) for i in pix]
    hex_str = "".join(hex_list)[2:].replace("0x", '')
    intval = int(hex_str, 16)
    print("hex_list: ", hex_list)
    print("hex_str: ", hex_str)
    print("intval: ", intval)
After I get the header correctly using the take_slice method, I get to the part with the pixel values, where I use the take_single_pixel method.
Here, I get the bad results.
This is what I get:
hex_list: ['0xd4', '0x0']
hex_str: d40
intval: 3392
But the actual sequence of bytes that should be interpreted is \xd4\x00, which equals 54272, so my hex_list should be ['0xd4', '0x00'] and hex_str should be d400.
Something goes wrong whenever the second byte in a pair is \x00.
Got any ideas? Thanks!
There are much better ways of converting bytes to integers:
int.from_bytes() takes bytes input, and a byte order argument:
>>> int.from_bytes(b"\xd4\x00", 'big')
54272
>>> int.from_bytes(b"\xd4\x00", 'little')
212
The struct.unpack() function lets you convert a whole series of bytes to integers following a pattern:
>>> import struct
>>> struct.unpack('!4H', b'\xd4\x00\xd4\x00\xd4\x00\xd4\x00')
(54272, 54272, 54272, 54272)
The array module lets you read binary data representing homogeneous integer data into a memory structure efficiently:
>>> import array
>>> arr = array.array('H', fileobject.read())
However, array can't be told what byte order to use. You'd have to determine the current architecture's byte order and call arr.byteswap() to reverse the order if the machine order doesn't match the file order.
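For instance, assuming the file stores big-endian 16-bit values (as the question's expected 54272 implies), a minimal sketch of the swap logic using literal bytes in place of a file:

```python
import array
import sys

raw = b'\xd4\x00\xd4\x00'      # two big-endian 16-bit pixel values
arr = array.array('H', raw)    # interpreted in the machine's byte order
if sys.byteorder == 'little':  # the file is big-endian; swap if needed
    arr.byteswap()
print(list(arr))  # [54272, 54272]
```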
When reading image data, it is almost always preferable to use the struct module to do the parsing. You generally then use file.read() calls with specific sizes; if the header consists of 10 bytes, use:
headerinfo = struct.unpack('<expected header pattern for 10 bytes>', f.read(10))
and go from there. For examples, look at the Pillow / PIL image plugins source code; here is how the Blizzard Mipmap image format header is read:
def _read_blp_header(self):
    self._blp_compression, = struct.unpack("<i", self.fp.read(4))
    self._blp_encoding, = struct.unpack("<b", self.fp.read(1))
    self._blp_alpha_depth, = struct.unpack("<b", self.fp.read(1))
    self._blp_alpha_encoding, = struct.unpack("<b", self.fp.read(1))
    self._blp_mips, = struct.unpack("<b", self.fp.read(1))
    self._size = struct.unpack("<II", self.fp.read(8))
    if self.magic == b"BLP1":
        # Only present for BLP1
        self._blp_encoding, = struct.unpack("<i", self.fp.read(4))
        self._blp_subtype, = struct.unpack("<i", self.fp.read(4))
    self._blp_offsets = struct.unpack("<16I", self.fp.read(16 * 4))
    self._blp_lengths = struct.unpack("<16I", self.fp.read(16 * 4))
Because struct.unpack() always returns tuples, you can assign individual elements of a tuple to name1, name2, ... names on the left-hand side, including single_name, = assignments to extract a single result.
The separate set of read calls above could also be compressed into fewer calls:
comp, enc, adepth, aenc, mips, *size = struct.unpack("<i4b2I", self.fp.read(16))
if self.magic == b"BLP1":
    # Only present for BLP1
    enc, subtype = struct.unpack("<2i", self.fp.read(8))
followed by specific attribute assignments.
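Coming back to the question's code: the root cause of the wrong value is that hex(0) yields '0x0', not '0x00', so the assembled hex string drops a digit. A minimal rewrite of take_single_pixel() using int.from_bytes() (big-endian, as the expected 54272 implies):

```python
import itertools

def take_single_pixel(in_generator):
    # Pull the next two bytes and combine them big-endian; this avoids
    # building hex strings by hand (where hex(0) == '0x0' loses a digit)
    pix = bytes(itertools.islice(in_generator, 2))
    return int.from_bytes(pix, byteorder='big')

print(take_single_pixel(iter(b'\xd4\x00')))  # 54272
```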
I'm trying to make a file-sharing program, so I open the files in read-binary mode, read them, establish a connection, and try to send the data byte by byte.
How can I send b"\x" + (the encoded bytes of the int from dataread[i])?
It always gives me an error. Also, if that won't work, how can I read exactly one byte, so that I don't get an int? (Like dataread[0]: if the value is "\x01", I get 1.)
My code:
for g in range(len(datar)):
    esc = str(datar[g])
    if len(esc) == 1:
        esc = "0" + esc
    esc = "\x" + bytes(esc, "utf8")
    c.send(esc)
    c.recv(500)
    print(g, "Bytes von", len(datar), "gesendet")
The '\xhh' notation only works inside string or bytes literals. If you have an integer, just wrap it in a list and pass that to bytes():
bytes(dataread) # if dataread is a list of integers
or
bytes([dataread]) # if dataread is a single integer
bytes objects are sequences of integer values, each limited to the range 0-255.
To send individual bytes from datar, that translates to:
for g, byte in enumerate(datar):
    c.send(bytes([byte]))
    c.recv(500)
    print(g, "Bytes von", len(datar), "gesendet")
In Java, I can encode a BigInteger as:
java.math.BigInteger bi = new java.math.BigInteger("65537");
String encoded = Base64.encodeBytes(bi.toByteArray(), Base64.ENCODE | Base64.DONT_GUNZIP);
// result: 65537 encodes as "AQAB" in Base64
byte[] decoded = Base64.decode(encoded, Base64.DECODE | Base64.DONT_GUNZIP);
java.math.BigInteger back = new java.math.BigInteger(decoded);
In C#:
System.Numerics.BigInteger bi = System.Numerics.BigInteger.Parse("65537");
string encoded = Convert.ToBase64String(bi.ToByteArray());
byte[] decoded = Convert.FromBase64String(encoded);
System.Numerics.BigInteger back = new System.Numerics.BigInteger(decoded);
How can I encode long integers in Python as Base64-encoded strings? What I've tried so far produces results different from implementations in other languages (so far Java and C#); in particular, it produces longer Base64-encoded strings.
import struct
encoded = struct.pack('I', (1<<16)+1).encode('base64')[:-1]
# produces a longer string, 'AQABAA==' instead of the expected 'AQAB'
When using this Python code to produce a Base64-encoded string, the resulting decoded integer in Java (for example) produces instead 16777472 in place of the expected 65537. Firstly, what am I missing?
Secondly, I have to figure out by hand what is the length format to use in struct.pack; and if I'm trying to encode a long number (greater than (1<<64)-1) the 'Q' format specification is too short to hold the representation. Does that mean that I have to do the representation by hand, or is there an undocumented format specifier for the struct.pack function? (I'm not compelled to use struct, but at first glance it seemed to do what I needed.)
Check out this page on converting integer to base64.
import base64
import struct
def encode(n):
    data = struct.pack('<Q', n).rstrip(b'\x00')
    if len(data) == 0:
        data = b'\x00'
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s

def decode(s):
    data = base64.urlsafe_b64decode(s + b'==')
    n = struct.unpack('<Q', data + b'\x00' * (8 - len(data)))
    return n[0]
The struct module:
… performs conversions between Python values and C structs represented as Python bytes objects.
Because C doesn't have infinite-length integers, there's no functionality for packing them.
But it's very easy to write yourself. For example:
def pack_bigint(i):
    b = bytearray()
    while i:
        b.append(i & 0xFF)
        i >>= 8
    return b
Or:
def pack_bigint(i):
    bl = (i.bit_length() + 7) // 8
    fmt = '<{}B'.format(bl)
    # ...
And so on.
And of course you'll want an unpack function, like jbatista's from the comments:
def unpack_bigint(b):
    b = bytearray(b)  # in case you're passing in a bytes/str
    return sum((1 << (bi * 8)) * bb for (bi, bb) in enumerate(b))
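A quick round trip to sanity-check the two helpers (repeated here so the snippet runs on its own; pack_bigint emits bytes little-endian, and unpack_bigint mirrors that):

```python
def pack_bigint(i):
    # Emit the integer's bytes least-significant first
    b = bytearray()
    while i:
        b.append(i & 0xFF)
        i >>= 8
    return b

def unpack_bigint(b):
    # Reassemble, weighting byte bi by 2**(8*bi)
    b = bytearray(b)
    return sum((1 << (bi * 8)) * bb for (bi, bb) in enumerate(b))

n = 65537
packed = pack_bigint(n)
print(bytes(packed))          # b'\x01\x00\x01'
print(unpack_bigint(packed))  # 65537
```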
This is a bit late, but I figured I'd throw my hat in the ring:
import base64
import struct

def inttob64(n):
    """
    Given an integer, return the base64-encoded version of it (no trailing ==).
    """
    parts = []
    while n:
        parts.insert(0, n & 0xffffffff)  # take 32 bits at a time
        n >>= 32
    data = struct.pack('>' + 'L' * len(parts), *parts)
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s
def b64toint(s):
    """
    Given a string with a base64-encoded value, return the integer
    representation of it.
    """
    data = base64.urlsafe_b64decode(s + b'==')
    n = 0
    while data:
        n <<= 32
        (toor,) = struct.unpack('>L', data[:4])
        n |= toor & 0xffffffff
        data = data[4:]
    return n
These functions turn an arbitrary-sized long number to/from a big-endian base64 representation.
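A round-trip check of the pair (repeated here so it runs standalone). Note the 32-bit-word granularity: 65537 packs to the four bytes b'\x00\x01\x00\x01', so the result is b'AAEAAQ' rather than the minimal 'AQAB' from the question:

```python
import base64
import struct

def inttob64(n):
    # Collect 32-bit words, most significant first
    parts = []
    while n:
        parts.insert(0, n & 0xffffffff)
        n >>= 32
    data = struct.pack('>' + 'L' * len(parts), *parts)
    return base64.urlsafe_b64encode(data).rstrip(b'=')

def b64toint(s):
    data = base64.urlsafe_b64decode(s + b'==')
    n = 0
    while data:
        n <<= 32
        (word,) = struct.unpack('>L', data[:4])
        n |= word
        data = data[4:]
    return n

big = (1 << 100) + 12345
print(inttob64(65537))                 # b'AAEAAQ'
print(b64toint(inttob64(big)) == big)  # True
```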
Here is something that may help. Instead of using struct.pack(), I am building a string of bytes to encode and then calling the Base64 encode on that. I didn't write the decode, but clearly the decode can recover an identical string of bytes, and a loop could recover the original value. I don't know if you need fixed-size integers (like always 128-bit), and I don't know if you need big-endian, so I left the decoder to you.
Also, encode64() and decode64() are from msc's answer, but modified to work.
import base64
import struct

def encode64(n):
    data = struct.pack('<Q', n).rstrip(b'\x00')
    if len(data) == 0:
        data = b'\x00'
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s

def decode64(s):
    data = base64.urlsafe_b64decode(s + b'==')
    n = struct.unpack('<Q', data + b'\x00' * (8 - len(data)))
    return n[0]
def encode(n, big_endian=False):
    lst = []
    while True:
        n, lsb = divmod(n, 0x100)
        lst.append(lsb)
        if not n:
            break
    if big_endian:
        # I have not tested big-endian mode, and it may need to have
        # some initial zero bytes prepended; like, if the integer is
        # supposed to be a 128-bit integer, and you encode a 1, you
        # would need this to have 15 leading zero bytes.
        initial_zero_bytes = b'\x00' * 2
        data = initial_zero_bytes + bytes(reversed(lst))
    else:
        data = bytes(lst)
    s = base64.urlsafe_b64encode(data).rstrip(b'=')
    return s

print(encode(1234567890098765432112345678900987654321))
I'm writing a client for a P2P application at the minute and the spec for the protocol says that the header for each packet should have each field with a particular byte length like so:
Version: 1 Byte
Type: 1 Byte
Length: 2 Bytes
And then the data
I've got the way of packing and unpacking the header fields (I think) like this:
packed = struct.pack('cch', b'1', b'1', 26)
This constructs a header for a packet with a data length of 26, but when it comes to unpacking the data I'm unsure how to go about getting the rest of the data afterwards. To unpack we need to know the size of all the fields, unless I'm missing something? I guess to pack the data I'd use a format indicator 'cch26s' meaning:
1 Byte char
1 Byte char
2 Byte short
26 Byte char array
But how do I unpack the data when I don't know how much data will be included in the packet first?
The way you're describing the protocol, you should unpack the first four bytes first, and extract Length (a 16-bit int). This tells you how many bytes to unpack in a second step.
version, type, length = struct.unpack("cch", packed[:4])
content, = struct.unpack("%ds" % length, packed[4:])
This is if everything checks out. unpack() requires that the packed buffer contains exactly as much data as you unpack. Also, check whether the 4 header bytes are included in the length count.
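As a self-contained sketch, the whole round trip might look like this, using explicit network byte order ('!', which also avoids native alignment padding). The field layout follows the spec above; the unsigned formats B/H, and the function names, are assumptions:

```python
import struct

HEADER = '!BBH'  # version (1 byte), type (1 byte), length (2 bytes)

def pack_packet(version, ptype, payload):
    # Header first, then the variable-length data
    return struct.pack(HEADER, version, ptype, len(payload)) + payload

def unpack_packet(packet):
    # Length field tells us how much payload follows the 4-byte header
    version, ptype, length = struct.unpack(HEADER, packet[:4])
    payload = packet[4:4 + length]
    return version, ptype, payload

pkt = pack_packet(1, 1, b'hello, peer!')
print(unpack_packet(pkt))  # (1, 1, b'hello, peer!')
```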
You can determine the number of characters to unpack by inspecting len(data).
Here is a helper function which does this for you:
def unpack(fmt, astr):
    """
    Return struct.unpack(fmt, astr) with the optional single * in fmt replaced
    with the appropriate number, given the length of astr.
    """
    # http://stackoverflow.com/a/7867892/190597
    try:
        return struct.unpack(fmt, astr)
    except struct.error:
        flen = struct.calcsize(fmt.replace('*', ''))
        alen = len(astr)
        idx = fmt.find('*')
        before_char = fmt[idx - 1]
        n = (alen - flen) // struct.calcsize(before_char) + 1
        fmt = ''.join((fmt[:idx - 1], str(n), before_char, fmt[idx + 1:]))
        return struct.unpack(fmt, astr)
You can use it like this:
unpack('cchs*', data)
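For a runnable check, here is the helper again together with a buffer packed in the question's 'cch' layout (native alignment assumed, hence the 4-byte header before the payload):

```python
import struct

def unpack(fmt, astr):
    # Replace a single '*' in fmt with the count implied by len(astr)
    try:
        return struct.unpack(fmt, astr)
    except struct.error:
        flen = struct.calcsize(fmt.replace('*', ''))
        idx = fmt.find('*')
        before_char = fmt[idx - 1]
        n = (len(astr) - flen) // struct.calcsize(before_char) + 1
        fmt = ''.join((fmt[:idx - 1], str(n), before_char, fmt[idx + 1:]))
        return struct.unpack(fmt, astr)

# A header claiming a 26-byte payload, followed by 26 bytes of data
data = struct.pack('cch', b'1', b'1', 26) + b'x' * 26
print(unpack('cchs*', data))
```

This yields the two version/type chars, the length 26, and the 26-byte payload as one bytes field.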