Convert int16 to 16 big-endian bit binary - python

I've got an array of short/int16s that i need to convert to a padded 16 bit-string(?). I've tried using struct:
struct.pack('>H', 545)
To which i get:
'\x02!'
Whereas I need something formatted as 16 bits.
Does anyone know how to do this? I'm rather confused and know next to nothing about the binary system.
Cheers

That is 16 bits. '\x02' is 8 bits, and ! is the other 8.
Were you looking for '0000001000100001'? If so, you can do that with the format function:
>>> format(545, '016b')
'0000001000100001'
The 0 means "pad with zeros", the 16 means "show at least 16 digits", and the b means binary.
If you don't need the zero padding, you can just use bin:
>>> bin(545)
'0b1000100001'
>>> bin(545)[2:]
'1000100001'

Related

base64 string length calculation when encoding unsigned integers only

I am trying to figure out estimates for how many unsigned integer numbers I can encode with 5 characters of base64, 6 characters, and so on.
Through programmatic approach I found out that I can encode
2^28 - 1 = 268,435,455
with 6 characters and
2^35 - 1 = 34,359,738,368
with 7 characters.
(-1 because I start at uint 1)
I am struggling to generalize this though, since I would assume it starts at 2^8 = 256 but I don't get how I end up at 28 and 35.
This is my implementation in Go
func Shorten(num uint64) string {
buf := make([]byte, binary.MaxVarintLen64)
n := binary.PutUvarint(buf, num)
b := buf[:n]
encoded := base64.URLEncoding.EncodeToString(b)
return strings.Replace(encoded, "=", "", -1)
}
Also
0 -> AA
128 -> gAE
16384 -> gIAB
2097152 -> gICAAQ
268435456 -> gICAgAE
So it looks like it's going up in 7 increments: 2^7, 2^14, 2^21, etc. but why 7?
A byte is 8 bits and therefore 256 possible values. Base 64 uses 64 different characters to encode and therefore is using 6 bits. so how many 8 bit objects can you fit in 6 bits? 0 if you're rounding or 3/4 if you aren't. When you start talking about encoding integers however your numbers do not appear to make sense. Are you talking about integers written in ascii? with 6 base64 characters you have 36 bits to play with so if you're talking about binary 32-bit unsigned integers you can encode one at a time but you can encode any of them that you want for 2**32 different possibilities and then 4 wasted bits. With ascii you'd have 4 characters so it would be 10000 different possibilities (0 to 9999).
You are getting unexpected results because you're using go varints which are not encoded as regular binary integers. some ipython output for you:
In [22]: base64.b64encode((128).to_bytes(1,'little'))
Out[22]: b'gA=='
because 128 can be encoded in a single 8 bit byte it is only 2 characters with some padding. and look at this:
In [3]: base64.b64decode('gAE=')
Out[3]: b'\x80\x01'
In [4]: int.from_bytes(_,'little')
Out[4]: 384
So as you can see PutUVarint isn't just encoding an integer of variable length it's encoding a variable integer, ie it has been encoded in a way that it can be decoded without knowing in advance what size it is. If you look at the source code for the varint go module it describes this process. Go is using 7 bits of each byte to hold actual integer binary data and the most significant bit is a flag as to whether or not there is more data yet to come. 128 is just the most significant bit of one byte set. So basically you're encoding twice based on the way you're accomplishing this task. If you have a given integer to encode it as a var int you need the number of bytes that the integer uses *8/7 to store the value then you base64 encode that result so you need that value *8/6 to store that. Depending on what you're doing with the base64 you can likely determine how many bytes you're playing with without needing to resort to the go varints and then the calculation would just be the 8/6 conversion (which is 4/3 I just left it in bits to match the varint process more closely.)

Get binary mask in python

What is the easiest/fastest way to get a int in python which can be represented by all ones in binary. This is for generating N bit masks.
E.g:
If total number of bits is 4, then binary '1111' or int 15
If total number of bits is 8 then, binary '1111 1111' or 255
I was under the impression ~0 is for that purpose, looks like that is not the case or I am missing something.
it's very easy to achieve with bit shifting:
>>> (1<<4)-1
15
shifting 4 times 1 to the left gives you 0b10000, substract 1 you get 0b1111 aka 15.
(the int("1"*4,2) method is overkill because it involves building a string and parsing it back)

Convert an Integer into 32bit Binary Python

I am trying to make a program that converts a given integer(limited by the value 32 bit int can hold) into 32 bit binary number. For example 1 should return (000..31times)1. I have been searching the documents and everything but haven't been able to find some concrete way. I got it working where number of bits are according to the number size but in String. Can anybody tell a more efficient way to go about this?
'{:032b}'.format(n) where n is an integer. If the binary representation is greater than 32 digits it will expand as necessary:
>>> '{:032b}'.format(100)
'00000000000000000000000001100100'
>>> '{:032b}'.format(8589934591)
'111111111111111111111111111111111'
>>> '{:032b}'.format(8589934591 + 1)
'1000000000000000000000000000000000' # N.B. this is 33 digits long
You can just left or right shift integer and convert it to string for display if you need.
>>> 1<<1
2
>>> "{:032b}".format(2)
'00000000000000000000000000000010'
>>>
or if you just need a binary you can consider bin
>>> bin(4)
'0b100'
Say for example the number you want to convert into 32 bit binary is 4. So, num=4.
Here is the code that does this: "s" is the empty string initially.
for i in range(31,-1,-1):
cur=(num>>i) & 1 #(right shift operation on num and i and bitwise AND with 1)
s+=str(cur)
print(s)#s contains 32 bit binary representation of 4(00000000000000000000000000000100)
00000000000000000000000000000100
Lets say
a = 4
print(bin(a)) # 0b101
For the output you may append 0s from LSB to till 101 to get the 32bit address for the integer - 4.
If you don't want 0b you may slice it
print(bin(a)[-3:]) # 101

How to shift bits in a 2-5 byte long bytes object in python?

I am trying to extract data out of a byte object. For example:
From b'\x93\x4c\x00' my integer hides from bit 8 to 21.
I tried to do bytes >> 3 but that isn't possible with more than one byte.
I also tried to solve this with struct but the byte object must have a specific length.
How can I shift the bits to the right?
Don't use bytes to represent integer values; if you need bits, convert to an int:
value = int.from_bytes(your_bytes_value, byteorder='big')
bits_21_to_8 = (value & 0x1fffff) >> 8
where the 0x1fffff mask could also be calculated with:
mask = 2 ** 21 - 1
Demo:
>>> your_bytes_value = b'\x93\x4c\x00'
>>> value = int.from_bytes(your_bytes_value, byteorder='big')
>>> (value & 0x1fffff) >> 8
4940
You can then move back to bytes with the int.to_bytes() method:
>>> ((value & 0x1fffff) >> 8).to_bytes(2, byteorder='big')
b'\x13L'
As you have a bytes string and you want to strip the right-most eight bits (i.e. one byte), you can simply it from the bytes string:
>>> b'\x93\x4c\x00'[:-1]
b'\x93L'
If you want to convert that then to an integer, you can use Python’s struct to unpack it. As you correctly said, you need a fixed size to use structs, so you can just pad the bytes string to add as many zeros as you need:
>>> data = b'\x93\x4c\x00'
>>> data[:-1]
b'\x93L'
>>> data[:-1].rjust(4, b'\x00')
b'\x00\x00\x93L'
>>> struct.unpack('>L', data[:-1].rjust(4, b'\x00'))[0]
37708
Of course, you can also convert it first, and then shift off the 8 bits from the resulting integer:
>>> struct.unpack('>Q', data.rjust(8, b'\x00'))[0] >> 8
37708
If you want to make sure that you don’t actually interpret more than those 13 bits (bits 8 to 21), you have to apply the bit mask 0x1FFF of course:
>>> 37708 & 0x1FFF
4940
(If you need big-endianness instead, just use <L or <Q respectively.)
If you are really counting the bits from left to right (which would be unusual but okay), then you can use that padding technique too:
>>> struct.unpack('>Q', data.ljust(8, b'\x00'))[0] >> 43
1206656
Note that we’re adding the padding to the other side, and are shifting it by 43 bits (your 3 bits plus 5 bytes for the padded data we won’t need to look at)
Another approach that works for arbitrarily long byte sequences is to use the bitstring library which allows for bitwise operations on bitstrings e.g.
>>> import bitstring
>>> bitstring.BitArray(bytes=b'\x93\x4c\x00') >> 3
BitArray('0x126980')
You could convert your bytes to an integer then multiply or divide by powers of two to accomplish the shifting

Interpreting 5bit subsets within Packed Binary Data Python

I have been having some real trouble with this for a while. I am receiving a string of binary data in python and I am having trouble unpacking and interpreting only a 5bit subset (not an entire byte) of the data. It seems like whatever method comes to mind just simply fails miserably.
Let's say I have two bytes packed binary data, and I would like to interpret the first 10bits within the 16. How could I convert this to an 2 integers representing 5bits each?
Use bitmasks and bitshifting:
>>> example = 0x1234 # Hexadecimal example; 2 bytes, 4660 decimal.
>>> bin(example) # Show as binary digits
'0b1001000110100'
>>> example & 31 # Grab 5 most significant bits
20
>>> bin(example & 31) # Same, now represented as binary digits
'0b10100'
>>> (example >> 5) & 31 # Grab the next 5 bits (shift right 5 times first)
17
>>> bin(example >> 5 & 31)
'0b10001'
The trick here is to know that 31 is a 5-bit bitmask:
>>> bin(31)
'0b11111'
>>> 0b11111
31
>>> example & 0b11111
20
As you can see you could also just use the 0b binary number literal notation if you find that easier to work with.
See the Python Wiki on bit manipulation for more background info.

Categories