The context:
I decode an AMF response from a Flex app with Python.
With PyAMF I can decode the whole response, but one value got my attention.
This value \xa2C is transformed to 4419:
#\xa2C -> 4419
#\xddI -> 11977
I know \x relates to a hex value, but I can't work out the function that transforms 4419 into \xa2C.
The 4419 is an integer.
--- Update 1
These original values are not hex,
because the value \xa2I transforms to 4425.
So what kind of value is \xa2I???
Thanks!
--- Update 2
DJ = 5834
0F = 15
0G = error
1F = 31
a1f = 4294
adI = 5833
adg = 5863
adh = 5864
It's strange that it sometimes accepts values after F and in other situations shows an error. But they are not hex values, that's for sure.
What you're seeing is the string representation of the bytes of an AmfInteger. The first example, \xa2C, consists of two bytes: 0xa2, aka 162, and C, which is the ASCII representation of 67:
>>> ord("\xa2C"[0])
162
>>> ord("\xa2C"[1])
67
To convert this into an AmfInteger, we have to follow the AMF3 specification, section 1.3.1 (the format of an AmfInteger is the same in AMF0 and AMF3, so it doesn't matter which specification we look at).
In that section, a U29 (variable length unsigned 29-bit integer, which is what AmfIntegers use internally to represent the value) is defined as either a 1-, 2-, 3- or 4-byte sequence. Each byte encodes information about the value itself, as well as whether another byte follows. To figure out whether another byte follows the current one, one just needs to check whether the most significant bit is set:
>>> (162 & 0x80) == 0x80
True
>>> (67 & 0x80) == 0x80
False
So we now confirmed that the byte sequence you see is indeed a full U29: the first byte has its high bit set, to indicate that it's followed by another byte. The second byte has the bit unset, to indicate the end of the sequence. To get the actual value from those bytes, we now only need to combine their values, while masking out the high bit of the first byte:
>>> 162 & 0x7f
34
>>> 34 << 7
4352
>>> 4352 | 67
4419
From this, it should be easy to figure out why the other values give the results you observe.
For completeness' sake, here's also a Python snippet with an example implementation that parses a U29, including all corner cases:
def parse_u29(byte_sequence):
    value = 0
    # Handle the initial bytes.
    for byte in byte_sequence[:-1]:
        # Ensure it has its high bit set.
        assert ord(byte) & 0x80
        # Extract the value and add it to the accumulator.
        value <<= 7
        value |= ord(byte) & 0x7F
    # Handle the last byte (it contributes a full 8 bits in a 4-byte sequence).
    value <<= 8 if len(byte_sequence) > 3 else 7
    value |= ord(byte_sequence[-1])
    # Handle sign (two's complement in 29 bits).
    value = (value + 2**28) % 2**29 - 2**28
    return value

print parse_u29("\xa2C"), 4419
print parse_u29(map(chr, [0x88, 0x00])), 1024
print parse_u29(map(chr, [0xFF, 0xFF, 0x7E])), 0x1ffffe
print parse_u29(map(chr, [0x80, 0xC0, 0x80, 0x00])), 0x200000
print parse_u29(map(chr, [0xBF, 0xFF, 0xFF, 0xFE])), 0xffffffe
print parse_u29(map(chr, [0xC0, 0x80, 0x80, 0x01])), -268435455
print parse_u29(map(chr, [0xFF, 0xFF, 0xFF, 0x81])), -127
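And since the original question asked for the inverse direction (turning 4419 back into \xa2C), here is a minimal sketch of an encoder that follows the same U29 rules; encode_u29 is just an illustrative helper, not PyAMF's API:

def encode_u29(value):
    # Map a signed value into the unsigned 29-bit range (two's complement).
    value %= 2**29
    if value < 0x80:        # 1 byte
        return chr(value)
    elif value < 0x4000:    # 2 bytes
        return chr((value >> 7) | 0x80) + chr(value & 0x7F)
    elif value < 0x200000:  # 3 bytes
        return (chr((value >> 14) | 0x80) + chr(((value >> 7) & 0x7F) | 0x80) +
                chr(value & 0x7F))
    else:                   # 4 bytes; the last byte carries a full 8 bits
        return (chr((value >> 22) | 0x80) + chr(((value >> 15) & 0x7F) | 0x80) +
                chr(((value >> 8) & 0x7F) | 0x80) + chr(value & 0xFF))

print repr(encode_u29(4419))  # '\xa2C'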
Related
I am sending data from a sensor node over TCP to my TCP server. The raw received data looks like this:
b'A\x10Vu\x87%\x00x\x0c\xc7\x03\x01\x00\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00'
When trying to decode it using utf-8, I receive the following error.
Code:
my_variable = b'A\x10Vu\x87%\x00x\x0c\xc7\x03\x01\x00\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00'
print(my_variable.decode('utf-8'))
Error:
print(my_variable.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x87 in position 4: invalid start byte
So the problem, apparently, is that the payload contains non-ASCII characters.
How can I decode this payload to something human-readable?
The payload description can be found here on p. 32; p. 20 shows a TCP connection example, but without decoding the payload.
Based on the documentation, it is NOT human-readable and you shouldn't decode it. Instead, you should write special code that converts every value from raw bytes to an integer and eventually to a string with extra characters, e.g. the dots in the version number.
Here is code for the beginning values from the payload:
data = b'A\x10Vu\x87%\x00x\x0c\xc7\x03\x01\x00\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00'

ID = data[:6].hex()
print('ID:', ID)

hardware = data[6]
if hardware == 1:
    hardware = 'NBSN95'
print('hardware version:', hardware)

software = data[7]
print('software version (raw):', software)
software = '.'.join(list(str(software)))
print('software version:', software)

battery = data[8:10].hex()
print('battery (raw):', battery)
battery = int(battery, 16)
print('battery:', battery, 'mV =', battery/1000, 'V')

signal = data[10]
print('signal (raw):', signal)
if signal == 0:
    signal = '-113dBm or less'
elif signal == 1:
    signal = '-111dBm'
elif 2 <= signal <= 30:
    signal = '-109dBm ... -53dBm'
elif signal == 31:
    signal = '-51dBm or greater'
elif signal == 99:
    signal = 'Not known or not detectable'
print('signal:', signal)

temp = data[11:13].hex()
print('temperature (raw):', temp)
temp = int(temp, 16)
if (temp & 0xFC00) == 0:  # positive value
    temp = temp/10
else:                     # negative value, 16-bit two's complement
    temp = (temp - 65536)/10
print('temperature:', temp, 'degree')
Result:
ID: 411056758725
hardware version: 0
software version (raw): 120
software version: 1.2.0
battery (raw): 0cc7
battery: 3271 mV = 3.271 V
signal (raw): 3
signal: -109dBm ... -53dBm
temperature (raw): 0100
temperature: 25.6 degree
You could find the answer in a Python prompt. In fact, I started my exploration using dir(my_variable):
>>> my_variable = b'A\x10Vu\x87%\x00x\x0c\xc7\x03\x01\x00\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00'
>>> dir(my_variable)  # output truncated
>>> my_variable.hex
<built-in method hex of bytes object at 0x00000232BDF129F0>
>>> help(my_variable.hex)  # truncated
Help on built-in function hex:

hex(...) method of builtins.bytes instance
    Create a str of hexadecimal numbers from a bytes object.

>>> my_variable.hex()
'41105675872500780cc7030100000000260000000000000000'
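For the reverse direction, bytes.fromhex() turns that string back into the original bytes:

>>> bytes.fromhex('41105675872500780cc7030100000000260000000000000000') == my_variable
True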
The data is raw byte data and should not be decoded. Instead, use the struct module to unpack the raw bytes into bytes and words of data. The spec (page 22) indicates how many bytes make up each field.
The struct module also has the advantage that you don't have to manually calculate the offset of each field, and if the unpack pattern doesn't match the length of the data, it will raise an error.
Note that the 2-byte version field is a hardware version byte and a software version byte, so I used BB (2 bytes) to extract them separately. The temperatures are documented as two's complement, so I used h (signed 16-bit value) for them. Also note the data is big-endian, so > is used.
See also the struct - Format String documentation.
import struct
from datetime import datetime
from pytz import UTC

data = b'A\x10Vu\x87%\x00x\x0c\xc7\x03\x01\x00\x00\x00\x00&\x00\x00\x00\x00\x00\x00\x00\x00'
devid, hw, sw, bat, ss, mod, t1, dii, adc, t2, h, ts = struct.unpack('>6sBBHBBhBHhHL', data)

# fields that need processing are done in the f-strings below
print(f"DeviceID={devid.hex()} HW={hw} SW={'.'.join(str(sw))}\n"
      f"BAT={bat/1000:.3f}V SignalStrength={-113+2*ss}dBm Mode={mod} Temp={t1/10}\N{DEGREE CELSIUS}\n"
      f"Door={dii==0x80} ADC={adc}mv Temp2={t2/10:.1f}\N{DEGREE CELSIUS} Humidity={h/10:.1f}%\n"
      f"Timestamp={datetime.fromtimestamp(ts,UTC)}")
Output:
DeviceID=411056758725 HW=0 SW=1.2.0
BAT=3.271V SignalStrength=-107dBm Mode=1 Temp=0.0℃
Door=False ADC=38mv Temp2=0.0℃ Humidity=0.0%
Timestamp=1970-01-01 00:00:00+00:00
I'm working on a communication command protocol between a PLC and a 3rd party device.
The manufacturer has provided me with the following information for calculating the CRC values that will change depending on the address of the device I wish to read information from.
A CRC is performed on a block of data, for example the first seven bytes of all transmissions are followed by a two byte CRC for that data. This CRC will be virtually unique for that particular combination of bytes. The process of calculating the CRC follows:
Inputs:
N.BYTES = Number of data bytes to CRC (maximum 64 bytes)
DATA() = An array of data bytes of quantity N.BYTES
CRC.MASK = 0xC9DA, a hexadecimal constant used in the process
Outputs:
CRC = two byte cyclic redundancy check made up of CRC1 (high byte) and CRC2 (low byte)
Process:
START
    CRC = 0xFFFF
    FOR N = 1 TO N.BYTES
        CRC = CRC XOR ( DATA(N) AND 0xFF )
        FOR I = 1 TO 8
            IF ( CRC AND 0x0001 ) = 0 THEN LSR CRC
            ELSE LSR CRC ; CRC = CRC XOR CRC.MASK
        NEXT I
    NEXT N
    X = CRC1 ; Change the two bytes in CRC around
    CRC1 = CRC2
    CRC2 = X
END
They also provided me with a couple of complete command strings for the first few device addresses.
RTU #1
05-64-00-02-10-01-00-6C-4B-53-45-EB-F7
RTU #2
05-64-00-02-10-02-00-1C-AE-53-45-EB-F7
RTU #3
05-64-00-02-10-03-00-CC-F2-53-45-EB-F7
The header CRC bytes in the previous three commands are 6C-4B, 1C-AE, and CC-F2 respectively.
I calculated the first few iterations by hand to have something to compare against when I wrote the following code in Python.
byte1 = 5     # 0x05
byte2 = 100   # 0x64
byte3 = 0     # 0x00
byte4 = 2     # 0x02
byte5 = 16    # 0x10
byte6 = 1     # 0x01
byte7 = 0     # 0x00
byte8 = 0     # 0x00
mask = 51674  # 0xC9DA

hexarray = [byte1, byte2, byte3, byte4, byte5, byte6, byte7, byte8]
#print hexarray
CRCdata = 65535  # 0xFFFF
for n in hexarray:
    CRCdata = CRCdata ^ (n & 255)
    print(n, CRCdata)
    for i in range(1, 8):
        if (CRCdata & 1) == 0:
            CRCdata = CRCdata >> 1
            # print 'if'
        else:
            CRCdata = CRCdata >> 1
            CRCdata = CRCdata ^ mask
            # print 'else'
        print(i, CRCdata)
print(CRCdata)
I added byte8 because some research I did mentioned that an extra byte of 0s needs to be appended to the end of the array for the CRC calculation. I converted the final result and did the byte swap manually. The problem I've been running into is that my CRC calculations, whether I keep byte8 or not, don't match any of the three examples provided.
I'm not quite sure where I am going wrong on this and any help would be greatly appreciated.
I was able to solve the issue by updating the inner loop to range(0, 8), so it actually performs eight shifts per byte instead of seven, and by dropping byte8.
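For reference, here is the corrected routine condensed into a function. Run against the seven header bytes of RTU #1 it yields 0x4b6c, which after the byte swap is the expected 6C-4B:

def crc16(data, mask=0xC9DA):
    crc = 0xFFFF
    for byte in data:
        crc ^= byte & 0xFF
        for _ in range(8):   # eight shifts per byte
            lsb = crc & 1
            crc >>= 1        # LSR CRC
            if lsb:
                crc ^= mask  # CRC = CRC XOR CRC.MASK
    # Swap CRC1 (high byte) and CRC2 (low byte), as the spec requires.
    return ((crc & 0xFF) << 8) | (crc >> 8)

print(hex(crc16([0x05, 0x64, 0x00, 0x02, 0x10, 0x01, 0x00])))  # 0x6c4b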
I have a file which contains binary data. The content of this file is just one long line.
Example: 010101000011101010101
Originally the content was an array of C++ objects with the following data types:
// Beware: pseudocode, just for visualisation
int64 var1;
int32 var2[50];
int08 var3;
I want to skip var1 and var3 and only extract the values of var2 into some readable decimal values. My idea was to read the file byte by byte and convert the bytes into hex values. In the next step I thought I could "combine" (append) 4 of those hex values to get one int32 value.
Example: 0x10 0xAA 0x00 0x50 -> 0x10AA0050
My code so far:
def append_hex(a, b):
    return (a << 4) | b

with open("file.dat", "rb") as f:
    counter = 0
    tickdifcounter = 0
    current_byte = " "
    while True:
        if (counter >= 8) and (counter < 208):
            tickdifcounter += 1
            if (tickdifcounter <= 4):
                current_byte = append_hex(current_byte, f.read(1))
                if (not current_byte):
                    break
                val = ord(current_byte)
            if (tickdifcounter > 4):
                print hex(val)
                tickdifcounter = 0
                current_byte = ""
        counter += 1
        if (counter == 209):  # 209 bytes = int64 + (int32*50) + int08
            counter = 0
            print
Now I have the problem that my append_hex is not working, because the variables are strings, so the bitshift does not work.
I am new to Python, so please give me hints when I can do something in a better way.
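For reference, two separate fixes are needed for the combining idea: each byte read in binary mode is a one-character string in Python 2 and needs ord() before any arithmetic, and combining whole bytes requires a shift of 8 bits, not 4. A minimal sketch using the example values from the question:

def append_hex(a, b):
    # Shift the accumulator one whole byte (8 bits) left, then OR in the new byte.
    return (a << 8) | b

value = 0
for byte in (0x10, 0xAA, 0x00, 0x50):
    value = append_hex(value, byte)
print(hex(value))  # 0x10aa0050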
You can use the struct module for reading binary files.
This can help you: Reading a binary file into a struct in Python
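A minimal sketch for the layout from the question might look like the following; the "<" (little-endian) byte order is an assumption, so use ">" instead if the data turns out to be big-endian:

import struct

# Layout from the question: int64 var1, int32 var2[50], int8 var3
# -> 8 + 4*50 + 1 = 209 bytes per record.
record = struct.Struct("<q50ib")

with open("file.dat", "rb") as f:
    while True:
        chunk = f.read(record.size)
        if len(chunk) < record.size:
            break
        fields = record.unpack(chunk)
        var2 = fields[1:51]  # skip var1 (first field) and var3 (last field)
        print(var2)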
A character can be converted to an int using the ord(x) method. In order to get the integer value of a multi-byte number, bit-shift left and add; note the parentheses, since << binds less tightly than +. For example, from an earlier project:

def parseNumber(string, index):
    return (ord(string[index]) << 24) + (ord(string[index+1]) << 16) + \
           (ord(string[index+2]) << 8) + ord(string[index+3])

Note this code assumes big-endian data; you will need to reverse the indices for parsing little-endian data.
If you know exactly what the size of the struct is going to be (or can easily calculate it based on the size of the file), you are probably better off using the struct module.
I actually get a value (b'\xc8\x00') from a temperature sensor. I want to convert it to a float value. Is it right that I need to decode it?
Here is my function:
def ToFloat(data):
    s = data.decode('utf-8')
    print(s)
    return s
But when I run it, I get this error:
'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte
You seem to have packed bytes, not unicode objects. Use struct.unpack:
In [3]: import struct
In [4]: struct.unpack('h', b'\xc8\x00')[0]
Out[4]: 200
Format h specifies a short value (2 bytes, native byte order). If your temperature values will always be positive, you can use H for unsigned short:
import struct

def to_float(data):
    return float(struct.unpack('H', data)[0])
Notice that ToFloat() is a bit irritating, as it returns a float but interprets the data as integer values. If the bytes represent a float, it would be necessary to know in which format the float is packed into these two bytes (usually a float takes more than two bytes).
data = b'\xc8\x00'

def ToFloat(data):
    byte0 = int(data[0])
    print(byte0)
    byte1 = int(data[1])
    print(byte1)
    number = byte0 + 256*byte1  # little-endian: low byte first
    print(number)
    return float(number)
This returns 200.0, which seems reasonable. If not, just check what the data means and process it accordingly.
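The manual combination above is the little-endian interpretation, i.e. the same thing struct.unpack produces with an explicit '<H' format:

>>> import struct
>>> struct.unpack('<H', b'\xc8\x00')[0]
200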
I'm trying to modify the code shown far below, which works in Python 2.7.x, so it will also work unchanged in Python 3.x. However, I'm encountering the following problem I can't solve in the first function, bin_to_float(), as shown by the output below:
float_to_bin(0.000000): '0'
Traceback (most recent call last):
File "binary-to-a-float-number.py", line 36, in <module>
float = bin_to_float(binary)
File "binary-to-a-float-number.py", line 9, in bin_to_float
return struct.unpack('>d', bf)[0]
TypeError: a bytes-like object is required, not 'str'
I tried to fix that by adding a bf = bytes(bf) right before the call to struct.unpack(), but doing so produced its own TypeError:
TypeError: string argument without an encoding
So my questions are: is it possible to fix this issue and achieve my goal? And if so, how? Preferably in a way that works in both versions of Python.
Here's the code that works in Python 2:
import struct

def bin_to_float(b):
    """ Convert binary string to a float. """
    bf = int_to_bytes(int(b, 2), 8)  # 8 bytes needed for IEEE 754 binary64
    return struct.unpack('>d', bf)[0]

def int_to_bytes(n, minlen=0):  # helper function
    """ Int/long to byte string. """
    nbits = n.bit_length() + (1 if n < 0 else 0)  # plus one for any sign bit
    nbytes = (nbits+7) // 8  # number of whole bytes
    bytes = []
    for _ in range(nbytes):
        bytes.append(chr(n & 0xff))
        n >>= 8
    if minlen > 0 and len(bytes) < minlen:  # zero pad?
        bytes.extend((minlen-len(bytes)) * '0')
    return ''.join(reversed(bytes))  # high bytes at beginning

# tests

def float_to_bin(f):
    """ Convert a float into a binary string. """
    ba = struct.pack('>d', f)
    ba = bytearray(ba)
    s = ''.join('{:08b}'.format(b) for b in ba)
    s = s.lstrip('0')  # strip leading zeros
    return s if s else '0'  # but leave at least one

for f in 0.0, 1.0, -14.0, 12.546, 3.141593:
    binary = float_to_bin(f)
    print('float_to_bin(%f): %r' % (f, binary))
    float = bin_to_float(binary)
    print('bin_to_float(%r): %f' % (binary, float))
    print('')
To make portable code that works with bytes in both Python 2 and 3, using libraries that literally use different data types between the two, you need to explicitly declare them using the appropriate literal mark for every string (or add from __future__ import unicode_literals to the top of every module doing this). This step ensures your data types are correct internally in your code.
Secondly, make the decision to support Python 3 going forward, with fallbacks specific to Python 2. This means overriding str with unicode, and figuring out which methods/functions do not return the same types in both Python versions, so they can be modified and replaced to return the correct type (being the Python 3 version). Do note that bytes is a built-in name, too, so don't shadow it.
Putting this together, your code will look something like this:
import struct
import sys

if sys.version_info < (3, 0):
    str = unicode
    chr = unichr

def bin_to_float(b):
    """ Convert binary string to a float. """
    bf = int_to_bytes(int(b, 2), 8)  # 8 bytes needed for IEEE 754 binary64
    return struct.unpack(b'>d', bf)[0]

def int_to_bytes(n, minlen=0):  # helper function
    """ Int/long to byte string. """
    nbits = n.bit_length() + (1 if n < 0 else 0)  # plus one for any sign bit
    nbytes = (nbits+7) // 8  # number of whole bytes
    ba = bytearray(b'')
    for _ in range(nbytes):
        ba.append(n & 0xff)
        n >>= 8
    if minlen > 0 and len(ba) < minlen:  # zero pad?
        ba.extend((minlen-len(ba)) * b'0')
    return u''.join(str(chr(b)) for b in reversed(ba)).encode('latin1')  # high bytes at beginning

# tests

def float_to_bin(f):
    """ Convert a float into a binary string. """
    ba = struct.pack(b'>d', f)
    ba = bytearray(ba)
    s = u''.join(u'{:08b}'.format(b) for b in ba)
    s = s.lstrip(u'0')  # strip leading zeros
    return (s if s else u'0').encode('latin1')  # but leave at least one

for f in 0.0, 1.0, -14.0, 12.546, 3.141593:
    binary = float_to_bin(f)
    print(u'float_to_bin(%f): %r' % (f, binary))
    float = bin_to_float(binary)
    print(u'bin_to_float(%r): %f' % (binary, float))
    print(u'')
I used the latin1 codec simply because that's how the byte mappings were originally defined, and it seems to work:
$ python2 foo.py
float_to_bin(0.000000): '0'
bin_to_float('0'): 0.000000
float_to_bin(1.000000): '11111111110000000000000000000000000000000000000000000000000000'
bin_to_float('11111111110000000000000000000000000000000000000000000000000000'): 1.000000
float_to_bin(-14.000000): '1100000000101100000000000000000000000000000000000000000000000000'
bin_to_float('1100000000101100000000000000000000000000000000000000000000000000'): -14.000000
float_to_bin(12.546000): '100000000101001000101111000110101001111110111110011101101100100'
bin_to_float('100000000101001000101111000110101001111110111110011101101100100'): 12.546000
float_to_bin(3.141593): '100000000001001001000011111101110000010110000101011110101111111'
bin_to_float('100000000001001001000011111101110000010110000101011110101111111'): 3.141593
Again, but this time under Python 3.5:
$ python3 foo.py
float_to_bin(0.000000): b'0'
bin_to_float(b'0'): 0.000000
float_to_bin(1.000000): b'11111111110000000000000000000000000000000000000000000000000000'
bin_to_float(b'11111111110000000000000000000000000000000000000000000000000000'): 1.000000
float_to_bin(-14.000000): b'1100000000101100000000000000000000000000000000000000000000000000'
bin_to_float(b'1100000000101100000000000000000000000000000000000000000000000000'): -14.000000
float_to_bin(12.546000): b'100000000101001000101111000110101001111110111110011101101100100'
bin_to_float(b'100000000101001000101111000110101001111110111110011101101100100'): 12.546000
float_to_bin(3.141593): b'100000000001001001000011111101110000010110000101011110101111111'
bin_to_float(b'100000000001001001000011111101110000010110000101011110101111111'): 3.141593
It's a lot more work, but in Python 3 you can more clearly see that the types are handled as proper bytes. I also changed your bytes = [] to a bytearray to more clearly express what you were trying to do.
I had a different approach from @metatoaster's answer. I just modified int_to_bytes to use and return a bytearray:
def int_to_bytes(n, minlen=0):  # helper function
    """ Int/long to byte string. """
    nbits = n.bit_length() + (1 if n < 0 else 0)  # plus one for any sign bit
    nbytes = (nbits+7) // 8  # number of whole bytes
    b = bytearray()
    for _ in range(nbytes):
        b.append(n & 0xff)
        n >>= 8
    if minlen > 0 and len(b) < minlen:  # zero pad?
        b.extend([0] * (minlen-len(b)))
    return bytearray(reversed(b))  # high bytes at beginning
This seems to work without any other modifications under both Python 2.7.11 and Python 3.5.1.
Note that I zero padded with 0 instead of '0'. I didn't do much testing, but surely that's what you meant?
In Python 3, integers have a to_bytes() method that can perform the conversion in a single call. However, since you asked for a solution that works on Python 2 and 3 unmodified, here's an alternative approach.
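If Python 3 alone were enough, the whole int_to_bytes() helper could be replaced by a one-liner along these lines (bin_to_float_py3 is just an illustrative name):

import struct

def bin_to_float_py3(b):
    # Python 3 only: int.to_bytes() produces the 8 big-endian bytes directly.
    return struct.unpack('>d', int(b, 2).to_bytes(8, 'big'))[0]

print(bin_to_float_py3('11111111110000000000000000000000000000000000000000000000000000'))  # 1.0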
If you take a detour via hexadecimal representation, the function int_to_bytes() becomes very simple:
import codecs

def int_to_bytes(n, minlen=0):
    hex_str = format(n, "0{}x".format(2 * minlen))
    return codecs.decode(hex_str, "hex")
You might need some special-case handling to deal with the case where the hex string ends up with an odd number of characters, for example along the lines of the sketch below.
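One possible way to do that (assuming the desired behaviour is to pad with a single leading zero):

def int_to_bytes(n, minlen=0):
    hex_str = format(n, "0{}x".format(2 * minlen))
    if len(hex_str) % 2:         # odd number of hex digits
        hex_str = "0" + hex_str  # pad with a leading zero
    return codecs.decode(hex_str, "hex")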
Note that I'm not sure this works with all versions of Python 3. I remember that pseudo-encodings weren't supported in some 3.x versions, but I don't remember the details. I tested the code with Python 3.5.