python- reading certain number of bytes from a variable

python- reading certain number of bytes from a variable - python

I'm trying to implement the AES algorithm, for which the message is to be divided into b-blocks each of 1byte(AES-128 would require 1 byte per state cell). So, if the message is: "This is saturday, and it is time to tell tale.", I'd have to read 1 byte out of this and store it in a state cell.
So, my first problem is, Is it possible to read(or extract) a certain number of bytes from a variable?
And the problem that follows immediately is, "if it is possible to get a certain number of bytes from a variable, then, how do we obtain the bits in that byte?"

had to do that just recently. this is an option:
from itertools import islice
byteorder = 'big'
plain = b"This is saturday, and it is time to tell tale."
def chunked(iterable, n):
it = iter(iterable)
values = bytes(islice(it, n))
while values:
yield values
values = bytes(islice(it, n))
for block_bytes in chunked(plain, n=8):
block_int = int.from_bytes(block_bytes, byteorder)
print(block_bytes, bin(block_int))
which outputs
b'This is ' 0b101010001101000011010010111001100100000011010010111001100100000
b'saturday' 0b111001101100001011101000111010101110010011001000110000101111001
b', and it' 0b10110000100000011000010110111001100100001000000110100101110100
b' is time' 0b10000001101001011100110010000001110100011010010110110101100101
b' to tell' 0b10000001110100011011110010000001110100011001010110110001101100
b' tale.' 0b1000000111010001100001011011000110010100101110
note that the byteorder can be 'little' as well.
from block_int it is easy to obtain the individual bits: e.g. the least-significant bit is block_int & 1; for bits in other positions you can shift: (block_int >> 5) & 1 etc. or you get the desired byte from block_bytes (which is an array of ints) and select the bit you want; e.g. (block_bytes[4] >> 7) & 1.
maybe this answer ia also helpful.

Related

Convert binary to signed, little endian 16bit integer in Python

Trying to a convert a binary list into a signed 16bit little endian integer
input_data = [['1100110111111011','1101111011111111','0010101000000011'],['1100111111111011','1101100111111111','0010110100000011']]
Desired Output =[[-1074, -34, 810],[-1703, -39, 813]]
This is what I've got so far. It's been adapted from: Hex string to signed int in Python 3.2?,
Conversion from HEX to SIGNED DEC in python
results = []
for i in input_data:
hex_convert = [hex(int(x,2)) for x in i]
convert = [int(y[4:6] + y[2:4], 16) for y in hex_convert]
results.append(convert)
print (results)
output: [[64461, 65502, 810], [64463, 65497, 813]]
This is works fine, but the above are unsigned integers. I need signed integers capable of handling negative values. I then tried a different approach:
results_2 = []
for i in input_data:
hex_convert = [hex(int(x,2)) for x in i]
to_bytes = [bytes(j, 'utf-8') for j in hex_convert]
split_bits = [int(k, 16) for k in to_bytes]
convert_2 = [int.from_bytes(b, byteorder = 'little', signed = True) for b in to_bytes]
results_2.append(convert_2)
print (results_2)
Output: [[108191910426672, 112589973780528, 56282882144304], [108191943981104, 112589235583024, 56282932475952]]
This result is even more wild than the first. I know my approach is wrong, and it doesn't help that i've never been able to get my head around binary conversion etc, but I feel i'm on the right path with:
(b, byteorder = 'little', signed = True)
but can't work out where i'm wrong. Any help explaining this concept would be greatly appreciated.

This result is even more wild than the first. I know my approach is wrong... but can't work out where i'm wrong.
The problem is in the conversion to bytes. Let's look at it a step at a time:
int(x, 2)
Fine; we treat the string as a base-2 representation of the integer value, and get that integer. Only problem is it's a) unsigned and b) big-endian.
hex(int(x,2))
What this does is create a string representation of the integer, in base 16, with a 0x prefix. Notably, there are two text characters per byte that we want. This is already heading is down the wrong path.
You might have thought of using hexadecimal because you've seen \xAB style escapes inside string representations. This is a completely different thing. The string '\xAB' contains one character. The string '0xAB' contains four.
From there, everything else is still nonsense. Converting to bytes with a text encoding just means that the text character 0 for example is replaced with the byte value 48 (since in UTF-8 it's encoded with a single byte with that value). For this data we get the same results with UTF-8 that we would by assuming plain ASCII (since UTF-8 is "ASCII transparent" and there are no non-ASCII characters in the text).
So how do we do it?
We want to convert the integer from the first step into the bytes used to represent it. Just as there is a .from_bytes class method allowing us to create an integer from underlying bytes, there is an instance method allowing us to get the bytes that would represent the integer.
So, we use .to_bytes, specifying the length, signedness and endianness that was assumed when we created the int from the binary string - that gives us bytes that correspond to that string. Then, we re-create the integer from those bytes, except now specifying the proper signedness and endianness. The reason that .to_bytes makes us specify a length is because the integer doesn't have a particular length - there are a minimum number of bytes required to represent it, but you could use as many more as you like. (This is especially important if you want to handle signed values, since it will do sign-extension automatically.)
Thus:
for i in input_data:
values = [int(x,2) for x in i]
as_bytes = [x.to_bytes(2, byteorder='big', signed=False) for x in values]
reinterpreted = [int.from_bytes(x, byteorder='little', signed=True) for x in as_bytes]
results_2.append(reinterpreted)
But let's improve the organization of the code a bit. I will first make a function to handle a single integer value, and then we can use comprehensions to process the list. In fact, we can use nested comprehensions for the nested list.
def as_signed_little(binary_str):
# This time, taking advantage of positional args and default values.
as_bytes = int(binary_str, 2).to_bytes(2, 'big')
return int.from_bytes(as_bytes, 'little', signed=True)
# And now we can do:
results_2 = [[as_signed_little(x) for x in i] for i in input_data]

Convert I2C Sensor (DS1624) reading into number

First off, sorry for the confusing title. It's pretty late here and I wasn't able to come up with a better one.
So, I have a I2C temperature sensor that outputs the current temperature as a 16 bit word. Reading from LEFT to RIGHT, the 1st bit is the MSB and the 13th bit is the LSB, so 13 bits are payload and the last 3 bits are zeros. I want to read out that sensor with a Raspberry Pi and convert the data.
The first byte (8 bits) are the integer part of the current temperature. If and only if the temperature is negative, the two's complement of the entire word has to be built.
the second byte is the decimal part which has to be multiplied by 0.03125.
So, just a couple of examples (TEMP DIGITAL OUTPUT (Binary) / DIGITAL OUTPUT (Hex), taken from the data sheet here http://datasheets.maximintegrated.com/en/ds/DS1624.pdf)
+125˚C | 01111101 00000000 | 7D00h
+25.0625˚C | 00011001 00010000 | 1910h
+½˚C | 00000000 10000000 | 0080h
0˚C | 00000000 00000000 | 0000h
-½˚C | 11111111 10000000 | FF80h
-25.0625˚C | 11100110 11110000 | E6F0h
-55˚C | 11001001 00000000 | C900h
Because of a difference in endianness the byte order is reversed when reading the sensor, which is not a problem. For example, the first line would become 0x007D instead of 0x7D00, 0xE6F0 becomes F0E6, and so on...
However, once I build the two's complement for negative values I'm not able to come up with a correct conversion.
What I came up with (not working for negative values) is:
import smbus
import time
import logging
class TempSensor:
"""
Class to read out an DS1624 temperature sensor with a given address.
DS1624 data sheet: http://datasheets.maximintegrated.com/en/ds/DS1624.pdf
Usage:
>>> from TempSensor import TempSensor
>>> sensor = TempSensor(0x48)
>>> print "%02.02f" % sensor.get_temperature()
23.66
"""
# Some constants
DS1624_READ_TEMP = 0xAA
DS1624_START = 0xEE
DS1624_STOP = 0x22
def __init__(self, address):
self.address = address
self.bus = smbus.SMBus(0)
def __send_start(self):
self.bus.write_byte(self.address, self.DS1624_START);
def __send_stop(self):
self.bus.write_byte(self.address, self.DS1624_STOP);
def __read_sensor(self):
"""
Gets the temperature data. As the DS1624 is Big-endian and the Pi Little-endian,
the byte order is reversed.
"""
"""
Get the two-byte temperature value. The second byte (endianness!) represents
the integer part of the temperature and the first byte the fractional part in terms
of a 0.03125 multiplier.
The first byte contains the value of the 5 least significant bits. The remaining 3
bits are set to zero.
"""
return self.bus.read_word_data(self.address, self.DS1624_READ_TEMP)
def __convert_raw_to_decimal(self, raw):
# Check if temperature is negative
negative = ((raw & 0x00FF) & 0x80) == 0x80
if negative:
# perform two's complement
raw = (~raw) + 1
# Remove the fractional part (first byte) by doing a bitwise AND with 0x00FF
temp_integer = raw & 0x00FF
# Remove the integer part (second byte) by doing a bitwise AND with 0XFF00 and
# shift the result bits to the right by 8 places and another 3 bits to the right
# because LSB is the 5th bit
temp_fractional = ((raw & 0xFF00) >> 8) >> 3
return temp_integer + ( 0.03125 * temp_fractional)
def run_test(self):
logging.basicConfig(filename='debug.log', level=logging.DEBUG)
# Examples taken from the data sheet (byte order swapped)
values = [0x7D, 0x1019, 0x8000, 0, 0x80FF, 0xF0E6, 0xC9]
for value in values:
logging.debug('value: ' + hex(value) + ' result: ' + str(self.__convert_raw_to_decimal(value)))
def get_temperature(self):
self.__send_start();
time.sleep(0.1);
return self.__convert_raw_to_decimal(self.__read_sensor())
If you run the run_test() method you'll see what i mean. All negatives values are wrong.
The results I get are:
DEBUG:root:value: 0x7d result: 125.0
DEBUG:root:value: 0x1019 result: 25.0625
DEBUG:root:value: 0x8000 result: 0.5
DEBUG:root:value: 0x0 result: 0.0
DEBUG:root:value: 0x80ff result: 1.46875
DEBUG:root:value: 0xf0e6 result: 26.03125
DEBUG:root:value: 0xc9 result: 55.96875
So, I've been banging my head for hours on this one, but it seems I'm lacking the fundamentals of bit-wise operations. I believe that the problem is the masking with the logical AND when values are negative.
EDIT: There are a couple of implementations on the web. None of them works for negative temperatures. I tried it by actually putting the sensor in ice water. I haven't tried the Arduino C++ version yet, but from looking at the source code it seems it doesn't build the two's complement at all, so no negative temperatures either (https://github.com/federico-galli/Arduino-i2c-temperature-sensor-DS1624/blob/master/DS1624.cpp).

Two things, you've got your masks turned around, raw & 0x00ff is the fractional part, not the integer part, and second, this is my solution, given the inputs in your table, this seems to work for me:
import struct
def convert_temp (bytes):
raw_temp = (bytes & 0xff00) >> 8
raw_frac = (bytes & 0x00ff) >> 3
a, b = struct.unpack('bb', '{}{}'.format(chr(raw_temp), chr(raw_frac)))
return a + (0.03125 * b)
The struct module is really nifty when working with more basic data types (such as signed bytes). Hope this helps!
Edit: ignore the comment on your masks, I see my own error now. You can switch around the bytes, should be no problem.
Struct explanation:
Struct.(un)pack both take 2 arguments, the first is a string that specified the attributes of your struct (think in terms of C). In C a struct is just a bunch of bytes, with some information about their types. The second argument is the data that you need decoded (which needs to be a string, explaining the nasty format()).
I can't seem to really explain it any further, I think if you read up on the struct module, and structs in C, and realize that a struct is nothing more then a bunch of bytes, then you should be ok :).
As for the two's complement, that is the regular representation for a signed byte, so no need to convert. The problem you where having is that Python doesn't understand 8-bit integers, and signedness. For instance, you might have a signed byte 0x10101010, but if you long() that in Python, it doesn't interpret that as a signed 8-bit int. My guess is, it just puts it inside and 32-bit int, in which case the sign bit gets interpretted as just the eighth bit.
What struct.unpack('b', ...) does, is actually interpret the bits as a 8-bit signed integer. Not sure if this makes it any clearer, but I hope it helps.

How can I convert two bytes of an integer back into an integer in Python?

I am currently using an Arduino that's outputting some integers (int) through Serial (using pySerial) to a Python script that I'm writing for the Arduino to communicate with X-Plane, a flight simulation program.
I managed to separate the original into two bytes so that I could send it over to the script, but I'm having a little trouble reconstructing the original integer.
I tried using basic bitwise operators (<<, >> etc.) as I would have done in a C++like program, but it does not seem to be working.
I suspect it has to do with data types. I may be using integers with bytes in the same operations, but I can't really tell which type each variable holds, since you don't really declare variables in Python, as far as I know (I'm very new to Python).
self.pot=self.myline[2]<<8
self.pot|=self.myline[3]

You can use the struct module to convert between integers and representation as bytes. In your case, to convert from a Python integer to two bytes and back, you'd use:
>>> import struct
>>> struct.pack('>H', 12345)
'09'
>>> struct.unpack('>H', '09')
(12345,)
The first argument to struct.pack and struct.unpack represent how you want you data to be formatted. Here, I ask for it to be in big-ending mode by using the > prefix (you can use < for little-endian, or = for native) and then I say there is a single unsigned short (16-bits integer) represented by the H.
Other possibilities are b for a signed byte, B for an unsigned byte, h for a signed short (16-bits), i for a signed 32-bits integer, I for an unsigned 32-bits integer. You can get the complete list by looking at the documentation of the struct module.

For example, using Big Endian encoding:
int.from_bytes(my_bytes, byteorder='big')

What you have seems basically like it should work, assuming the data stored in myline has the high byte first:
myline = [0, 1, 2, 3]
pot = myline[2]<<8 | myline[3]
print 'pot: {:d}, 0x{:04x}'.format(pot, pot) # outputs "pot: 515, 0x0203"
Otherwise, if it's low-byte first you'd need to do the opposite way:
myline = [0, 1, 2, 3]
pot = myline[3]<<8 | myline[2]
print 'pot: {:d}, 0x{:04x}'.format(pot, pot) # outputs "pot: 770, 0x0302"

This totally works:
long = 500
first = long & 0xff #244
second = long >> 8 #1
result = (second << 8) + first #500
If you are not sure of types in 'myline' please check Stack Overflow question How to determine the variable type in Python?.

To convert a byte or char to the number it represents, use ord(). Here's a simple round trip from an int to bytes and back:
>>> number = 3**9
>>> hibyte = chr(number / 256)
>>> lobyte = chr(number % 256)
>>> hibyte, lobyte
('L', '\xe3')
>>> print number == (ord(hibyte) << 8) + ord(lobyte)
True
If your myline variable is string or bytestring, you can use the formula in the last line above. If it somehow is a list of integers, then of course you don't need ord.

PNG Chunk type-code Bit #5

I'm trying to write my own little PNG reader in Python. There is something in the documentation I don't quite understand. In chapter 3.3 (where chunks are handled) it says:
Four bits of the type code, namely bit 5 (value 32) of each byte, are used to convey chunk properties. This
choice means that a human can read off the assigned properties according to whether each letter of the type
code is uppercase (bit 5 is 0) or lowercase (bit 5 is 1). However, decoders should test the properties of an unknown
chunk by numerically testing the specified bits; testing whether a character is uppercase or lowercase
is inefficient, and even incorrect if a locale-specific case definition is used.
Ok, so it explicitly denotes one should not test whether a byte is uppercase or lowercase. Then, how do I check that bit 5?
Furthermore, the documentation states
Ancillary bit: bit 5 of first byte
0 (uppercase) = critical, 1 (lowercase) = ancillary.
I have the following function to convert an integer to a bit-stream:
def bits(x, n):
""" Convert an integer value *x* to a sequence of *n* bits as a string. """
return ''.join(str([0, 1][x >> i & 1]) for i in xrange(n - 1, -1, -1))
Just for example, take the sRGB chunk. The lowercase s denotes the chunk is ancillary. But comparing the bit-streams of an uppercase S and lowercase s
01110011
01010011
we can see that bit #5 is zero in both cases.
I think I do have a wrong understanding of counting the bits. As the only bit that changes is the third one (i.e. indexed with 2), i assume this is the bit I'm searching for? It is also the 6th bit from the right and indexed with 5 (from the right of course). Is this what I'm searching for?

Python does have bitwise manipulation. You are doing it the hard way, when they already gave you the bitmask (32 or 0x20).
is_critical = (type_code & 0x20) == 0
or, equivalently:
is_critical = (type_code & (0x1 << 5)) == 0
(with extra parentheses for clarity)

Integer to Unique String

There's probably someone else who asked a similar question, but I didn't take much time to search for this, so just point me to it if someone's already answered this.
I'm trying to take an integer (or long) and turn it into a string, in a very specific way.
The goal is essentially to split the integer into 8-bit segments, then take each of those segments and get the corresponding ASCII character for that chunk, then glue the chunks together.
This is easy to implement, but I'm not sure I'm going about it in the most efficient way.
>>> def stringify(integer):
output = ""
part = integer & 255
while integer != 0:
output += chr(part)
integer = integer >> 8
return output
>>> stringify(10)
'\n'
>>> stringify(10 << 8 | 10)
'\n\n'
>>> stringify(32)
' '
Is there a more efficient way to do this?
Is this built into Python?
EDIT:
Also, as this will be run sequentially in a tight loop, is there some way to streamline it for such use?
>>> for n in xrange(1000): ## EXAMPLE!
print stringify(n)
...

struct can easily do this for integers up to 64 bits in size. Any larger will require you to carve the number up first.
>>> struct.pack('>Q', 12345678901234567890)
'\xabT\xa9\x8c\xeb\x1f\n\xd2'

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

python- reading certain number of bytes from a variable - python

Related

Convert binary to signed, little endian 16bit integer in Python

Convert I2C Sensor (DS1624) reading into number

How can I convert two bytes of an integer back into an integer in Python?

PNG Chunk type-code Bit #5

Integer to Unique String

Categories

Resources