Little endian bits in Python

I'm trying to make a GIF analyzer, but I'm having problems reading an arbitrary number of bits as a little-endian integer. struct is nice for byte-sized arguments, but some of the GIF structures are 3-bit little-endian unsigned integers (specifically in the GIF header, http://www.onicos.com/staff/iz/formats/gif.html). What's the best way to invert these numbers?
I have tried reversing the endianness of the entire byte string with struct, but it doesn't invert anything:
struct.unpack('<'+str(len(string))+'s',string)[0]  # does not actually invert

I don't know if you can use struct to do the work on things that are less than a byte in size. But if you're not too worried about speed you could try this function. It takes a number to reverse and a size in bits and returns the reversed result.
def reverse(a, size):
    # Build b from the bits of a, taking the lowest bit of a first
    b = 0
    for i in range(size):
        b <<= 1
        b |= (a >> i) & 1
    return b
Use it like so:
>>> reverse(3,3) # 011 => 110
6
>>> reverse(6,3) # 110 => 011
3
>>> reverse(4,3) # 100 => 001
1
>>> reverse(5,3) # 101 => 101
5
Obviously you still need to extract the relevant bits into a number using struct, but this should take care of the endianness issue.
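As a rough sketch of that extraction step, sub-byte fields can be pulled out of a single byte with a shift and a mask. The layout below assumes the packed byte of the GIF logical screen descriptor (a 1-bit global color table flag, a 3-bit color resolution, a 1-bit sort flag and a 3-bit table size, most significant bit first). The function name and dictionary keys are made up for illustration.

```python
def parse_packed_byte(packed):
    """Split one packed byte into its bit fields using shifts and masks."""
    return {
        "global_color_table": (packed >> 7) & 0b1,    # bit 7
        "color_resolution":   (packed >> 4) & 0b111,  # bits 6-4
        "sort_flag":          (packed >> 3) & 0b1,    # bit 3
        "table_size":         packed & 0b111,         # bits 2-0
    }


fields = parse_packed_byte(0xF7)  # 0xF7 == 0b11110111
print(fields)
```

With this approach each 3-bit field comes out as an ordinary integer, so whether reverse() is needed afterwards depends on how the format numbers its bits.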

Related

What is the behavior of the & operator when comparing a list and an integer in Python [duplicate]

Consider this code:
x = 1 # 0001
x << 2 # Shift left 2 bits: 0100
# Result: 4
x | 2 # Bitwise OR: 0011
# Result: 3
x & 1 # Bitwise AND: 0001
# Result: 1
I can understand the arithmetic operators in Python (and other languages), but I never understood 'bitwise' operators quite well. In the above example (from a Python book), I understand the left-shift but not the other two.
Also, what are bitwise operators actually used for? I'd appreciate some examples.
Bitwise operators are operators that work on multi-bit values, but conceptually one bit at a time.
AND is 1 only if both of its inputs are 1, otherwise it's 0.
OR is 1 if one or both of its inputs are 1, otherwise it's 0.
XOR is 1 only if exactly one of its inputs is 1, otherwise it's 0.
NOT is 1 only if its input is 0, otherwise it's 0.
These can often be best shown as truth tables. Input possibilities are on the top and left, the resultant bit is one of the four (two in the case of NOT since it only has one input) values shown at the intersection of the inputs.
AND | 0 1     OR | 0 1     XOR | 0 1    NOT | 0 1
----+-----    ---+----     ----+----    ----+----
 0  | 0 0      0 | 0 1       0 | 0 1        | 1 0
 1  | 0 1      1 | 1 1       1 | 1 0
One example is if you only want the lower 4 bits of an integer, you AND it with 15 (binary 1111) so:
    201: 1100 1001
AND  15: 0000 1111
------------------
IS    9: 0000 1001
The zero bits in 15 in that case effectively act as a filter, forcing the bits in the result to be zero as well.
In addition, >> and << are often included as bitwise operators, and they "shift" a value respectively right and left by a certain number of bits, throwing away bits that roll off the end you're shifting towards, and feeding in zero bits at the other end.
So, for example:
1001 0101 >> 2 gives 0010 0101
1111 1111 << 4 gives 1111 0000
Note that the left shift in Python is unusual in that it does not use a fixed width where bits are discarded. While many languages use a fixed width based on the data type, Python simply expands the width to cater for extra bits. To get the discarding behaviour in Python, you can follow a left shift with a bitwise AND, as in this example of shifting an 8-bit value left by four bits:
bits8 = (bits8 << 4) & 255
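The same masking idea can be wrapped in a small helper for any width. This is only a sketch; the name shl_fixed and its width parameter are invented here, not part of any library.

```python
def shl_fixed(value, shift, width=8):
    """Left shift that discards bits beyond `width`, like a fixed-size register."""
    mask = (1 << width) - 1          # e.g. width=8 gives 0b11111111
    return (value << shift) & mask


print(bin(0b11111111 << 4))            # plain shift keeps growing: 0b111111110000
print(bin(shl_fixed(0b11111111, 4)))   # masked to 8 bits: 0b11110000
```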
With that in mind, another example of bitwise operators: if you have two 4-bit values that you want to pack into an 8-bit one, you can use all three of your operators (left shift, AND and OR):
packed_val = ((val1 & 15) << 4) | (val2 & 15)
The & 15 operation will make sure that both values only have the lower 4 bits.
The << 4 is a 4-bit shift left to move val1 into the top 4 bits of an 8-bit value.
The | simply combines these two together.
If val1 is 7 and val2 is 4:
                val1            val2
                ====            ====
 & 15 (and)   xxxx-0111       xxxx-0100  & 15
 << 4 (left)  0111-0000           |
                  |               |
                  +-------+-------+
                          |
| (or)                0111-0100
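The packing step above, together with its inverse, can be sketched in Python as follows. The function names are illustrative only.

```python
def pack_nibbles(val1, val2):
    """Pack two 4-bit values into one byte: val1 in the high nibble, val2 in the low."""
    return ((val1 & 15) << 4) | (val2 & 15)


def unpack_nibbles(packed):
    """Recover the two 4-bit values from a packed byte."""
    return (packed >> 4) & 15, packed & 15


packed = pack_nibbles(7, 4)
print(format(packed, "08b"))    # 01110100
print(unpack_nibbles(packed))   # (7, 4)
```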
One typical usage:
| is used to set a certain bit to 1
& is used to test or clear a certain bit
Set a bit (where n is the bit number, and 0 is the least significant bit; the examples are C, with every variable an unsigned char):
a |= (1 << n);
Clear a bit:
b &= ~(1 << n);
Toggle a bit:
c ^= (1 << n);
Test a bit:
e = d & (1 << n);
Taking the examples from your question:
x | 2 is used to set bit 1 of x to 1
x & 1 is used to test if bit 0 of x is 1 or 0
what are bitwise operators actually used for? I'd appreciate some examples.
One of the most common uses of bitwise operations is for parsing hexadecimal colours.
For example, here's a Python function that accepts a string like #FF09BE and returns a tuple of its red, green and blue values.
def hexToRgb(value):
    # Convert string to hexadecimal number (base 16)
    num = (int(value.lstrip("#"), 16))
    # Shift 16 bits to the right, and then binary AND to obtain 8 bits representing red
    r = ((num >> 16) & 0xFF)
    # Shift 8 bits to the right, and then binary AND to obtain 8 bits representing green
    g = ((num >> 8) & 0xFF)
    # Simply binary AND to obtain 8 bits representing blue
    b = (num & 0xFF)
    return (r, g, b)
I know that there are more efficient ways to achieve this, but I believe that this is a really concise example illustrating both shifts and bitwise boolean operations.
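The reverse direction uses left shifts and OR. The function below is a sketch whose name is invented here; it masks each channel to 8 bits before packing.

```python
def rgb_to_hex(r, g, b):
    """Pack three 8-bit channels into one 24-bit number and format it as #RRGGBB."""
    num = ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF)
    return "#{:06X}".format(num)


print(rgb_to_hex(255, 9, 190))   # #FF09BE
```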
I think that the second part of the question:
Also, what are bitwise operators actually used for? I'd appreciate some examples.
Has been only partially addressed. These are my two cents on that matter.
Bitwise operations play a fundamental role in many applications. Almost all low-level computing relies on these kinds of operations.
This is true of all applications that need to send data between two nodes, such as:
computer networks;
telecommunication applications (cellular phones, satellite communications, etc).
In the lower layers of communication, the data is usually sent in what are called frames. Frames are just strings of bytes that are sent through a physical channel. These frames usually contain the actual data plus some other fields (coded in bytes) that are part of what is called the header. The header usually contains bytes that encode some information related to the status of the communication (e.g. with flags (bits)), frame counters, correction and error detection codes, etc. To get the transmitted data out of a frame, and to build the frames to send data, you will certainly need bitwise operations.
In general, when dealing with these kinds of applications, an API is available so you don't have to deal with all those details. For example, all modern programming languages provide libraries for socket connections, so you don't actually need to build the TCP/IP communication frames. But think about the good people who programmed those APIs for you: they certainly had to deal with frame construction, using all kinds of bitwise operations to go back and forth between the low-level and the higher-level communication.
As a concrete example, imagine someone gives you a file that contains raw data captured directly by telecommunication hardware. In this case, in order to find the frames, you will need to read the raw bytes in the file and try to find some kind of synchronization words by scanning the data bit by bit. After identifying the synchronization words, you will need to get the actual frames, and SHIFT them if necessary (and that is just the start of the story), to get the actual data that is being transmitted.
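A minimal sketch of that bit-by-bit scan is shown below. It assumes the capture fits in memory and that the synchronization word is a known fixed value. The function name, the 16-bit word 0xFAF3 and the sample bytes are all made up for illustration.

```python
def find_sync(data, sync, sync_bits):
    """Return the bit offset of the first occurrence of `sync` in `data`, or -1."""
    total_bits = len(data) * 8
    stream = int.from_bytes(data, "big")   # treat the bytes as one long bit string
    mask = (1 << sync_bits) - 1
    for offset in range(total_bits - sync_bits + 1):
        # Shift the window starting at `offset` down to the low bits, then mask it
        shift = total_bits - sync_bits - offset
        if (stream >> shift) & mask == sync:
            return offset
    return -1


# Bits: 0000 1111 1010 1111 0011 0000; the word 0xFAF3 starts 4 bits in
capture = bytes([0x0F, 0xAF, 0x30])
print(find_sync(capture, 0xFAF3, 16))   # 4
```

A non-zero offset that is not a multiple of 8 means the frame is not byte-aligned, which is where the shifting mentioned above comes in.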
Another, very different, low-level family of applications is controlling hardware through some (kind of ancient) ports, such as parallel and serial ports. These ports are controlled by setting some bytes, and each bit of those bytes has a specific meaning, in terms of instructions, for that port (see for instance http://en.wikipedia.org/wiki/Parallel_port). If you want to build software that does something with that hardware, you will need bitwise operations to translate the instructions you want to execute into the bytes that the port understands.
For example, if you have some physical buttons connected to the parallel port to control some other device, this is a line of code that you might find in the software application:
read = ((read ^ 0x80) >> 4) & 0x0f;
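Broken into steps, that line does three things. The Python sketch below mirrors it. The function name and the sample value are invented, and the comment about why bit 7 is flipped is an assumption (one plausible reason is that the hardware delivers that line inverted).

```python
def decode_buttons(status):
    """Python version of: read = ((read ^ 0x80) >> 4) & 0x0f"""
    flipped = status ^ 0x80   # toggle bit 7 (assumed to arrive inverted)
    upper = flipped >> 4      # move bits 7-4 down to bits 3-0
    return upper & 0x0F       # keep only those four bits


print(format(decode_buttons(0b10110000), "04b"))   # 0011
```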
Hope this contributes.
I didn't see it mentioned above but you will also see some people use left and right shift for arithmetic operations. A left shift by x is equivalent to multiplying by 2^x (as long as it doesn't overflow) and a right shift is equivalent to dividing by 2^x.
Recently I've seen people using x << 1 and x >> 1 for doubling and halving, although I'm not sure if they are just trying to be clever or if there really is a distinct advantage over the normal operators.
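A quick check of that equivalence in Python, as a small sketch. Because Python integers never overflow, the left shift always matches the multiplication, and the right shift matches floor division (//), including for negative numbers.

```python
x = 13

doubled_thrice = x << 3    # same as x * 2**3
quartered = x >> 2         # same as x // 2**2
negative_half = -7 >> 1    # same as -7 // 2, which rounds down

print(doubled_thrice, x * 2**3)    # 104 104
print(quartered, x // 2**2)        # 3 3
print(negative_half, -7 // 2)      # -4 -4
```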
I hope this clarifies those two:
x | 2
0001 //x
0010 //2
0011 //result = 3
x & 1
0001 //x
0001 //1
0001 //result = 1
Think of 0 as false and 1 as true. Then bitwise and (&) and or (|) work just like regular and and or, except that they operate on all of the bits in the value at once. Typically you will see them used for flags: if you have 30 options that can be set (say, as draw styles on a window), you don't want to pass in 30 separate boolean values to set or unset each one, so you use | to combine options into a single value and then use & to check whether each option is set. This style of flag passing is heavily used by OpenGL. Since each bit is a separate flag, the flag values are powers of two (numbers that have only one bit set): 1 (2^0), 2 (2^1), 4 (2^2), 8 (2^3). The power of two tells you which bit is set if the flag is on.
Also note that 2 is 10 in binary, so if x were 4 (100), x | 2 would be 110 (6), not 111 (7). If none of the bits overlap (which is true in this case), | acts like addition.
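The flag pattern described above can be sketched with Python's enum.IntFlag. The DrawStyle class and its option names are hypothetical and not taken from OpenGL or any real API.

```python
from enum import IntFlag


class DrawStyle(IntFlag):
    """Hypothetical window draw options, one bit each."""
    BOLD = 1
    ITALIC = 2
    UNDERLINE = 4
    STRIKE = 8


# Combine options with | and test them with &
style = DrawStyle.BOLD | DrawStyle.UNDERLINE
print(int(style))                       # 5
print(bool(style & DrawStyle.BOLD))     # True
print(bool(style & DrawStyle.ITALIC))   # False
```

Plain integer constants work the same way; IntFlag mainly adds readable names.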
Sets
Sets can be combined using mathematical operations.
The union operator | combines two sets to form a new one containing items in either.
The intersection operator & gets items only in both.
The difference operator - gets items in the first set but not in the second.
The symmetric difference operator ^ gets items in either set, but not both.
Try It Yourself:
first = {1, 2, 3, 4, 5, 6}
second = {4, 5, 6, 7, 8, 9}
print(first | second)
print(first & second)
print(first - second)
print(second - first)
print(first ^ second)
Result:
{1, 2, 3, 4, 5, 6, 7, 8, 9}
{4, 5, 6}
{1, 2, 3}
{8, 9, 7}
{1, 2, 3, 7, 8, 9}
This example shows the operations for all four combinations of two input bits:
10 | 12
1010 #decimal 10
1100 #decimal 12
1110 #result = 14
10 & 12
1010 #decimal 10
1100 #decimal 12
1000 #result = 8
Here is one example of usage:
x = int(input('Enter a number: '))  # convert to int so that & works
print('x is %s.' % ('even', 'odd')[x & 1])
Another common use-case is manipulating/testing file permissions. See the Python stat module: http://docs.python.org/library/stat.html.
For example, to compare a file's permissions to a desired permission set, you could do something like:
import os
import stat
#Get the actual mode of a file
mode = os.stat('file.txt').st_mode
#File should be a regular file, readable and writable by its owner
#Each permission value has a single 'on' bit. Use bitwise or to combine
#them.
desired_mode = stat.S_IFREG|stat.S_IRUSR|stat.S_IWUSR
#check for exact match:
mode == desired_mode
#check for at least one bit matching:
bool(mode & desired_mode)
#check for at least one bit 'on' in one, and not in the other:
bool(mode ^ desired_mode)
#check that all bits from desired_mode are set in mode, but I don't care about
# other bits.
not bool((mode^desired_mode)&desired_mode)
I cast the results as booleans, because I only care about the truth or falsehood, but it would be a worthwhile exercise to print out the bin() values for each one.
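The last check can also be written as a mask-and-compare, which some readers may find easier to follow. The sketch below uses a hand-built mode value instead of a real file so that it runs anywhere. The helper name has_all_bits is invented.

```python
import stat


def has_all_bits(mode, wanted):
    """True if every bit set in `wanted` is also set in `mode`."""
    return (mode & wanted) == wanted


# Stand-in for os.stat(...).st_mode: a regular file with permissions rw-r--r--
mode = stat.S_IFREG | 0o644
wanted = stat.S_IRUSR | stat.S_IWUSR

print(has_all_bits(mode, wanted))         # True
print(has_all_bits(mode, stat.S_IXUSR))   # False
print(oct(stat.S_IMODE(mode)))            # 0o644
```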
Bit representations of integers are often used in scientific computing to represent arrays of true-false information because a bitwise operation is much faster than iterating through an array of booleans. (Higher level languages may use the idea of a bit array.)
A nice and fairly simple example of this is the general solution to the game of Nim. Take a look at the Python code on the Wikipedia page. It makes heavy use of bitwise exclusive or, ^.
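For readers who do not want to look it up, here is a small sketch of the idea. It is not the Wikipedia code, and it assumes normal play, where the player who takes the last object wins. The XOR of all heap sizes (the nim-sum) is zero exactly when the player to move is in a losing position.

```python
from functools import reduce
from operator import xor


def nim_sum(heaps):
    """XOR of all heap sizes."""
    return reduce(xor, heaps, 0)


def winning_move(heaps):
    """Return (heap_index, new_size) that leaves a nim-sum of zero, or None."""
    total = nim_sum(heaps)
    if total == 0:
        return None              # no winning move from this position
    for i, heap in enumerate(heaps):
        target = heap ^ total    # size this heap must shrink to
        if target < heap:
            return i, target
    return None


print(nim_sum([3, 4, 5]))        # 2
print(winning_move([3, 4, 5]))   # (0, 1): reduce the first heap to 1
```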
There may be a better way to find where an array element is between two values, but as this example shows, the & works here, whereas and does not.
import numpy as np
a=np.array([1.2, 2.3, 3.4])
np.where((a>2) and (a<3))
#Result: ValueError (the truth value of an array with more than one element is ambiguous)
np.where((a>2) & (a<3))
#Result: (array([1]),)
I didn't see it mentioned: this example shows the "subtraction" of bits, A - B, which matches decimal subtraction only if A contains all the bits of B.
This operation is needed when a variable in our program holds a set of bits. Sometimes we need to add bits (like above) and sometimes we need to remove bits (if the variable contains them).
111 #decimal 7
-
100 #decimal 4
--------------
011 #decimal 3
With Python:
7 & ~4 = 3 (remove from 7 the bits that represent 4)
001 #decimal 1
-
100 #decimal 4
--------------
001 #decimal 1
With Python:
1 & ~4 = 1 (remove from 1 the bits that represent 4; in this case 1 does not contain 4, so nothing changes)
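The two cases above can be checked with a one-line helper. The name remove_bits is made up here.

```python
def remove_bits(value, mask):
    """Clear every bit of `value` that is set in `mask`."""
    return value & ~mask


print(remove_bits(0b111, 0b100))   # 3: bit 2 was set and is removed
print(remove_bits(0b001, 0b100))   # 1: bit 2 was not set, nothing changes
```

Unlike plain subtraction, this never goes negative when the bits being removed are not present.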
Manipulating the bits of an integer is useful, but network protocols, which may be specified down to the bit, can require manipulating longer byte sequences (which aren't easily converted into one integer). In this case it is useful to employ the bitstring library, which allows bitwise operations on data. For example, one can import the string 'ABCDEFGHIJKLMNOPQ' as bytes or as hex and bit shift it (or perform other bitwise operations):
>>> import bitstring
>>> bitstring.BitArray(bytes=b'ABCDEFGHIJKLMNOPQ') << 4
BitArray('0x142434445464748494a4b4c4d4e4f50510')
>>> bitstring.BitArray(hex='0x4142434445464748494a4b4c4d4e4f5051') << 4
BitArray('0x142434445464748494a4b4c4d4e4f50510')
The bitwise operators &, |, ^, and ~ return values (based on their input) in the same way logic gates affect signals. You could use them to emulate circuits.
To flip bits (i.e. 1's complement/invert) you can do the following:
Since a value XORed with all 1s is inverted,
for a given bit width you can use XOR to invert the bits.
In binary:
a = 1010 --> this is 0xA or decimal 10
then
c = 1111 ^ a = 0101 --> 1111 is 0xF or decimal 15; the result is 0x5 or decimal 5
-----------------
In Python:
a = 10
b = 15
c = a ^ b  # 0101
print(bin(c))  # gives '0b101'
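Generalising to any width, the all-ones mask can be built with a shift. This is a sketch with an invented function name. It also shows why Python's ~ alone is not enough here: without a fixed width, ~ yields a negative number.

```python
def invert_bits(value, width):
    """Flip the lowest `width` bits of value by XORing with an all-ones mask."""
    mask = (1 << width) - 1    # width=4 gives 0b1111
    return value ^ mask


print(format(invert_bits(0b1010, 4), "04b"))   # 0101
print(~0b1010)                                 # -11, since ~x == -x - 1
```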
You can use bit masking to print a decimal number in binary (Java):
int a = 1 << 7;
int c = 55;
for (int i = 0; i < 8; i++) {
    System.out.print((a & c) >> 7);
    c = c << 1;
}
This handles 8 bits; the same approach can be extended to wider values.
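A Python sketch of the same idea is below. It shifts the value right rather than left, which avoids relying on a fixed width. The function name to_bits is made up. The built-in format() gives the same string.

```python
def to_bits(value, width=8):
    """Return the lowest `width` bits of value as a string, most significant first."""
    bits = []
    for i in range(width - 1, -1, -1):
        bits.append(str((value >> i) & 1))   # isolate bit i with shift and mask
    return "".join(bits)


print(to_bits(55))          # 00110111
print(format(55, "08b"))    # 00110111
```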

Unpacking ripemd160 result in python

I am working on a program which does a lot of hashing, and in one of the steps I take a result of hashlib's ripemd160 hash and convert it into an integer. The lines are:
ripe_fruit = new('ripemd160', sha256(key.to_der()).digest())
key_hash160 = struct.unpack("<Q", ripe_fruit.digest())[0]
It gives me the error:
struct.error: unpack requires a buffer of 8 bytes
I tried changing the value to L and other things, but they didn't work. How do I fix this?
RIPEMD-160 returns 160 bits, or 20 bytes. struct doesn't know how to unpack integers larger than 8 bytes. You have two options and the right one depends on what exactly you're trying to do.
If your algorithm is looking for just some of the bytes of the hash, you can take the first or last 8 bytes and unpack those.
key_hash160 = struct.unpack("<Q", ripe_fruit.digest()[:8])[0]
If you need a 160-bit integer, you first have to decide how it's represented. Is it little endian, big endian, or something in between? Then you can break the digest into its 20 bytes and calculate one number from them. Assuming little endian based on the < in your question, you can do something like:
key_parts = struct.unpack("B" * 20, ripe_fruit.digest())
key_hash160 = 0
for b in key_parts[::-1]:
    key_hash160 <<= 8
    key_hash160 |= b
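On Python 3, int.from_bytes does the same conversion in one call. The sketch below uses a made-up 20-byte value as a stand-in for the real digest, so it does not depend on ripemd160 being available in hashlib, and checks the result against the manual loop.

```python
# Stand-in for ripe_fruit.digest(): any 20-byte value works for the demonstration
digest = bytes(range(1, 21))

as_little = int.from_bytes(digest, "little")   # first byte is least significant
as_big = int.from_bytes(digest, "big")         # first byte is most significant

# Manual little-endian accumulation, as in the loop above
manual = 0
for b in reversed(digest):
    manual = (manual << 8) | b

print(manual == as_little)               # True
print(as_little.bit_length() <= 160)     # True
```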

What does this statement do (bin and int expression)? [duplicate]

Consider this code:
x = 1 # 0001
x << 2 # Shift left 2 bits: 0100
# Result: 4
x | 2 # Bitwise OR: 0011
# Result: 3
x & 1 # Bitwise AND: 0001
# Result: 1
I can understand the arithmetic operators in Python (and other languages), but I never understood 'bitwise' operators quite well. In the above example (from a Python book), I understand the left-shift but not the other two.
Also, what are bitwise operators actually used for? I'd appreciate some examples.
Bitwise operators are operators that work on multi-bit values, but conceptually one bit at a time.
AND is 1 only if both of its inputs are 1, otherwise it's 0.
OR is 1 if one or both of its inputs are 1, otherwise it's 0.
XOR is 1 only if exactly one of its inputs are 1, otherwise it's 0.
NOT is 1 only if its input is 0, otherwise it's 0.
These can often be best shown as truth tables. Input possibilities are on the top and left, the resultant bit is one of the four (two in the case of NOT since it only has one input) values shown at the intersection of the inputs.
AND | 0 1 OR | 0 1 XOR | 0 1 NOT | 0 1
----+----- ---+---- ----+---- ----+----
0 | 0 0 0 | 0 1 0 | 0 1 | 1 0
1 | 0 1 1 | 1 1 1 | 1 0
One example is if you only want the lower 4 bits of an integer, you AND it with 15 (binary 1111) so:
201: 1100 1001
AND 15: 0000 1111
------------------
IS 9 0000 1001
The zero bits in 15 in that case effectively act as a filter, forcing the bits in the result to be zero as well.
In addition, >> and << are often included as bitwise operators, and they "shift" a value respectively right and left by a certain number of bits, throwing away bits that roll of the end you're shifting towards, and feeding in zero bits at the other end.
So, for example:
1001 0101 >> 2 gives 0010 0101
1111 1111 << 4 gives 1111 0000
Note that the left shift in Python is unusual in that it's not using a fixed width where bits are discarded - while many languages use a fixed width based on the data type, Python simply expands the width to cater for extra bits. In order to get the discarding behaviour in Python, you can follow a left shift with a bitwise and such as in an 8-bit value shifting left four bits:
bits8 = (bits8 << 4) & 255
With that in mind, another example of bitwise operators is if you have two 4-bit values that you want to pack into an 8-bit one, you can use all three of your operators (left-shift, and and or):
packed_val = ((val1 & 15) << 4) | (val2 & 15)
The & 15 operation will make sure that both values only have the lower 4 bits.
The << 4 is a 4-bit shift left to move val1 into the top 4 bits of an 8-bit value.
The | simply combines these two together.
If val1 is 7 and val2 is 4:
val1 val2
==== ====
& 15 (and) xxxx-0111 xxxx-0100 & 15
<< 4 (left) 0111-0000 |
| |
+-------+-------+
|
| (or) 0111-0100
One typical usage:
| is used to set a certain bit to 1
& is used to test or clear a certain bit
Set a bit (where n is the bit number, and 0 is the least significant bit):
a |= (1 << n)
Clear a bit:
b &= ~(1 << n)
Toggle a bit:
c ^= (1 << n)
Test a bit:
e = d & (1 << n)
Take the case of your list for example:
x | 2 is used to set bit 1 of x to 1
x & 1 is used to test if bit 0 of x is 1 or 0
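As a rough Python sketch of the four operations (the function names are illustrative and not part of any library):

```python
def set_bit(value, n):
    """Return value with bit n forced to 1."""
    return value | (1 << n)

def clear_bit(value, n):
    """Return value with bit n forced to 0."""
    return value & ~(1 << n)

def toggle_bit(value, n):
    """Return value with bit n flipped."""
    return value ^ (1 << n)

def is_bit_set(value, n):
    """Return True if bit n of value is 1."""
    return bool(value & (1 << n))

x = 1
print(set_bit(x, 1))        # 3, the same as x | 2
print(is_bit_set(x, 0))     # True, the same test as x & 1
print(clear_bit(0b0111, 1)) # 5
print(toggle_bit(0b0101, 0))# 4
```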
what are bitwise operators actually used for? I'd appreciate some examples.
One of the most common uses of bitwise operations is for parsing hexadecimal colours.
For example, here's a Python function that accepts a String like #FF09BE and returns a tuple of its Red, Green and Blue values.
def hexToRgb(value):
    # Convert string to hexadecimal number (base 16)
    num = (int(value.lstrip("#"), 16))
    # Shift 16 bits to the right, and then binary AND to obtain 8 bits representing red
    r = ((num >> 16) & 0xFF)
    # Shift 8 bits to the right, and then binary AND to obtain 8 bits representing green
    g = ((num >> 8) & 0xFF)
    # Simply binary AND to obtain 8 bits representing blue
    b = (num & 0xFF)
    return (r, g, b)
I know that there are more efficient ways to achieve this, but I believe that this is a really concise example illustrating both shifts and bitwise boolean operations.
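The opposite direction uses left shifts and OR instead. This is a sketch only; rgb_to_hex is a name invented here and the upper-case output format is an arbitrary choice.

```python
def rgb_to_hex(r, g, b):
    """Combine three 0-255 channel values into a '#RRGGBB' string (hypothetical helper)."""
    # Mask each channel to 8 bits, move it to its position, then OR the parts together
    num = ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF)
    return "#{:06X}".format(num)

print(rgb_to_hex(255, 9, 190))  # #FF09BE
```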
I think that the second part of the question:
Also, what are bitwise operators actually used for? I'd appreciate some examples.
Has been only partially addressed. These are my two cents on that matter.
Bitwise operations play a fundamental role in many applications. Almost all low-level computing relies on these kinds of operations.
This is true of all applications that need to send data between two nodes, such as:
computer networks;
telecommunication applications (cellular phones, satellite communications, etc.).
At the lower layers of communication, the data is usually sent in what are called frames. Frames are just strings of bytes that are sent through a physical channel. These frames usually contain the actual data plus some other fields (coded in bytes) that are part of what is called the header. The header usually contains bytes that encode information related to the status of the communication (e.g., flags (bits)), frame counters, error detection and correction codes, etc. To get the transmitted data out of a frame, and to build the frames to send data, you will certainly need bitwise operations.
In general, when dealing with these kinds of applications, an API is available so you don't have to deal with all those details. For example, all modern programming languages provide libraries for socket connections, so you don't actually need to build the TCP/IP communication frames. But think about the good people who programmed those APIs for you: they certainly had to deal with frame construction, using all kinds of bitwise operations to go back and forth between the low-level and the higher-level communication.
As a concrete example, imagine someone gives you a file that contains raw data captured directly by telecommunication hardware. In this case, in order to find the frames, you will need to read the raw bytes in the file and try to find some kind of synchronization words by scanning the data bit by bit. After identifying the synchronization words, you will need to get the actual frames and shift them if necessary (and that is just the start of the story) to get the actual data that is being transmitted.
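To give a feel for the bit-by-bit scan, here is a simplified sketch. The sync word 0xA5, its 8-bit width, and the function name find_sync are all assumptions for illustration; real protocols define their own values and usually need far more care.

```python
def find_sync(data, sync_word, sync_bits):
    """Return the bit offsets (counted from the first bit, MSB first)
    at which sync_word appears in data. Hypothetical helper."""
    total_bits = len(data) * 8
    stream = int.from_bytes(data, byteorder='big')
    mask = (1 << sync_bits) - 1
    offsets = []
    for offset in range(total_bits - sync_bits + 1):
        # Shift the window of interest down to the low bits, then mask it
        shift = total_bits - sync_bits - offset
        if ((stream >> shift) & mask) == sync_word:
            offsets.append(offset)
    return offsets

# 0x14A0 is 000 10100101 00000 in binary: the sync word starts at bit offset 3
print(find_sync(b'\x14\xa0', 0xA5, 8))  # [3]
```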
Another very different low-level family of applications is controlling hardware through some (kind of ancient) ports, such as parallel and serial ports. These ports are controlled by setting some bytes, and each bit of those bytes has a specific meaning, in terms of instructions, for that port (see for instance http://en.wikipedia.org/wiki/Parallel_port). If you want to build software that does something with that hardware, you will need bitwise operations to translate the instructions you want to execute into the bytes that the port understands.
For example, if you have some physical buttons connected to the parallel port to control some other device, this is a line of code that you might find in the software application:
read = ((read ^ 0x80) >> 4) & 0x0f;
Hope this contributes.
I didn't see it mentioned above but you will also see some people use left and right shift for arithmetic operations. A left shift by x is equivalent to multiplying by 2**x (as long as it doesn't overflow) and a right shift is equivalent to dividing by 2**x (integer division, discarding the remainder).
Recently I've seen people using x << 1 and x >> 1 for doubling and halving, although I'm not sure if they are just trying to be clever or if there really is a distinct advantage over the normal operators.
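A quick check of the equivalence in Python, where integers do not overflow and a right shift matches floor division (also for negative numbers):

```python
x = 13

print(x << 3, x * 2**3)    # 104 104
print(x >> 2, x // 2**2)   # 3 3
print(-13 >> 2, -13 // 4)  # -4 -4
```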
I hope this clarifies those two:
x | 2
0001 //x
0010 //2
0011 //result = 3
x & 1
0001 //x
0001 //1
0001 //result = 1
Think of 0 as false and 1 as true. Then bitwise and (&) and or (|) work just like regular and and or, except that they operate on all of the bits in the value at once. Typically you will see them used for flags: if you have 30 options that can be set (say, draw styles on a window), you don't want to pass in 30 separate boolean values to set or unset each one, so you use | to combine options into a single value and then use & to check whether each option is set. This style of flag passing is heavily used by OpenGL. Since each bit is a separate flag, the flag values are powers of two (numbers that have only one bit set): 1 (2**0), 2 (2**1), 4 (2**2), 8 (2**3). The power of two tells you which bit is set if the flag is on.
Also note that 2 is 10 in binary, so with x = 1 (binary 01), x | 2 is 011 (3). If none of the bits overlap (which is true in this case), | acts like addition.
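A minimal sketch of flag passing, using made-up draw-style constants (these names are hypothetical and are not OpenGL's):

```python
# Hypothetical flags: each one has exactly one bit set
STYLE_BOLD = 1 << 0       # 1
STYLE_ITALIC = 1 << 1     # 2
STYLE_UNDERLINE = 1 << 2  # 4

def describe(style):
    """List the names of the flags that are set in style."""
    names = []
    if style & STYLE_BOLD:
        names.append("bold")
    if style & STYLE_ITALIC:
        names.append("italic")
    if style & STYLE_UNDERLINE:
        names.append("underline")
    return names

style = STYLE_BOLD | STYLE_UNDERLINE  # combine options with |
print(style)            # 5
print(describe(style))  # ['bold', 'underline']
```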
Sets
Sets can be combined using mathematical operations.
The union operator | combines two sets to form a new one containing items in either.
The intersection operator & gets items only in both.
The difference operator - gets items in the first set but not in the second.
The symmetric difference operator ^ gets items in either set, but not both.
Try It Yourself:
first = {1, 2, 3, 4, 5, 6}
second = {4, 5, 6, 7, 8, 9}
print(first | second)
print(first & second)
print(first - second)
print(second - first)
print(first ^ second)
Result:
{1, 2, 3, 4, 5, 6, 7, 8, 9}
{4, 5, 6}
{1, 2, 3}
{8, 9, 7}
{1, 2, 3, 7, 8, 9}
This example shows the operations for all four combinations of two input bits:
10 | 12
1010 #decimal 10
1100 #decimal 12
1110 #result = 14
10 & 12
1010 #decimal 10
1100 #decimal 12
1000 #result = 8
Here is one example of usage:
x = int(input('Enter a number: '))
print('x is %s.' % ('even', 'odd')[x & 1])
Another common use-case is manipulating/testing file permissions. See the Python stat module: http://docs.python.org/library/stat.html.
For example, to compare a file's permissions to a desired permission set, you could do something like:
import os
import stat
#Get the actual mode of a file
mode = os.stat('file.txt').st_mode
#File should be a regular file, readable and writable by its owner
#Each permission value has a single 'on' bit. Use bitwise or to combine
#them.
desired_mode = stat.S_IFREG|stat.S_IRUSR|stat.S_IWUSR
#check for exact match:
mode == desired_mode
#check for at least one bit matching:
bool(mode & desired_mode)
#check for at least one bit 'on' in one, and not in the other:
bool(mode ^ desired_mode)
#check that all bits from desired_mode are set in mode, but I don't care about
# other bits.
not bool((mode^desired_mode)&desired_mode)
I cast the results as booleans, because I only care about the truth or falsehood, but it would be a worthwhile exercise to print out the bin() values for each one.
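The same checks can be tried without touching the filesystem. In this sketch the mode value is made up (a regular file with permissions 0o644), and the subset test is written in the more common (mode & desired) == desired form.

```python
import stat

# Hypothetical mode: regular file, rw-r--r--
mode = stat.S_IFREG | 0o644
desired_mode = stat.S_IFREG | stat.S_IRUSR | stat.S_IWUSR

# True when every bit of desired_mode is also set in mode
has_all = (mode & desired_mode) == desired_mode

print(stat.filemode(mode))       # -rw-r--r--
print(has_all)                   # True
print(bin(mode ^ desired_mode))  # 0b100100: the group and other read bits differ
```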
Bit representations of integers are often used in scientific computing to represent arrays of true-false information because a bitwise operation is much faster than iterating through an array of booleans. (Higher level languages may use the idea of a bit array.)
A nice and fairly simple example of this is the general solution to the game of Nim. Take a look at the Python code on the Wikipedia page. It makes heavy use of bitwise exclusive or, ^.
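The core of that solution is the "nim-sum", the XOR of all heap sizes. The following is an independent sketch for normal-play Nim, not the Wikipedia code; the function names are chosen here for illustration.

```python
from functools import reduce
from operator import xor

def nim_sum(heaps):
    """XOR of all heap sizes; zero means the player to move is losing."""
    return reduce(xor, heaps, 0)

def winning_move(heaps):
    """Return (heap_index, new_size) that leaves a nim-sum of zero, or None."""
    total = nim_sum(heaps)
    if total == 0:
        return None
    for index, size in enumerate(heaps):
        target = size ^ total
        if target < size:  # we can only remove objects, never add them
            return index, target
    return None

print(nim_sum([3, 4, 5]))       # 2
print(winning_move([3, 4, 5]))  # (0, 1)
```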
There may be a better way to find where an array element is between two values, but as this example shows, the & works here, whereas and does not.
import numpy as np
a=np.array([1.2, 2.3, 3.4])
np.where((a>2) and (a<3))
#Result: ValueError (the truth value of an array with more than one element is ambiguous)
np.where((a>2) & (a<3))
#Result: (array([1]),)
I didn't see it mentioned: this example shows how to subtract (remove) bits, A - B, which matches ordinary subtraction only if A contains all the bits of B.
This operation is needed when a variable in our program holds a set of bit flags: sometimes we need to add bits (as above) and sometimes we need to remove bits (if the variable contains them).
111 #decimal 7
-
100 #decimal 4
--------------
011 #decimal 3
with python:
7 & ~4 = 3 (remove from 7 the bits that represent 4)
001 #decimal 1
-
100 #decimal 4
--------------
001 #decimal 1
with python:
1 & ~4 = 1 (remove from 1 the bits that represent 4; in this case 1 does not contain 4, so nothing changes)
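A short sketch comparing the two cases with ordinary subtraction (remove_bits is an illustrative name):

```python
def remove_bits(value, bits):
    """Clear in value every bit that is set in bits (hypothetical helper)."""
    return value & ~bits

print(remove_bits(7, 4))  # 3
print(remove_bits(1, 4))  # 1
print(7 - 4, 1 - 4)       # 3 -3: subtraction only agrees when the bits are present
```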
Manipulating the bits of an integer is useful, often for network protocols, which may be specified down to the bit. Sometimes, however, one needs to manipulate longer byte sequences (which aren't easily converted into one integer). In this case it is useful to employ the bitstring library, which allows bitwise operations on data. For example, one can import the bytes b'ABCDEFGHIJKLMNOPQ' as bytes or as hex and bit-shift them (or perform other bitwise operations):
>>> import bitstring
>>> bitstring.BitArray(bytes=b'ABCDEFGHIJKLMNOPQ') << 4
BitArray('0x142434445464748494a4b4c4d4e4f50510')
>>> bitstring.BitArray(hex='0x4142434445464748494a4b4c4d4e4f5051') << 4
BitArray('0x142434445464748494a4b4c4d4e4f50510')
The bitwise operators &, |, ^, and ~ return values (based on their inputs) in the same way logic gates affect signals. You could use them to emulate circuits.
To flip bits (i.e. take the 1's complement) you can do the following:
Since XORing a value with all 1s inverts it,
for a given bit width you can use XOR to invert the bits.
In binary
a = 1010 --> this is 0xA or decimal 10
then
c = 1111 ^ a = 0101 --> 1111 is 0xF or decimal 15; the result is 0x5 or decimal 5
-----------------
In Python
a = 10
b = 15
c = a ^ b  # 0101
print(bin(c)) # gives '0b101'
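The idea generalises to any width by building the all-ones mask from the width. A sketch, with invert_bits as an assumed helper name:

```python
def invert_bits(value, width):
    """Flip the lowest `width` bits of value (hypothetical helper)."""
    mask = (1 << width) - 1  # e.g. width 4 gives 0b1111
    return (value & mask) ^ mask

print(format(invert_bits(10, 4), '04b'))  # 0101
print(invert_bits(10, 8))                 # 245
print(~10)                                # -11: Python's ~ has no fixed width
print(~10 & 0xF)                          # 5: masking gives the same 4-bit result
```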
You can use bit masking to print the binary digits of a decimal number (Java):
int a = 1 << 7;
int c = 55;
for (int i = 0; i < 8; i++) {
    System.out.print((a & c) >> 7);
    c = c << 1;
}
This handles 8 binary digits; use a larger shift for wider values.
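A Python version of the same masking idea might look like this. The function name and the default width of 8 are assumptions of the sketch; the built-in format() gives the same result.

```python
def to_binary_digits(value, width=8):
    """Return the lowest `width` bits of value as a string, most significant bit first."""
    return ''.join(str((value >> i) & 1) for i in range(width - 1, -1, -1))

print(to_binary_digits(55))  # 00110111
print(format(55, '08b'))     # 00110111
```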

How to shift bits in a 2-5 byte long bytes object in python?

I am trying to extract data out of a byte object. For example:
From b'\x93\x4c\x00' my integer hides from bit 8 to 21.
I tried to do bytes >> 3 but that isn't possible with more than one byte.
I also tried to solve this with struct but the byte object must have a specific length.
How can I shift the bits to the right?
Don't use bytes to represent integer values; if you need bits, convert to an int:
value = int.from_bytes(your_bytes_value, byteorder='big')
bits_21_to_8 = (value & 0x1fffff) >> 8
where the 0x1fffff mask could also be calculated with:
mask = 2 ** 21 - 1
Demo:
>>> your_bytes_value = b'\x93\x4c\x00'
>>> value = int.from_bytes(your_bytes_value, byteorder='big')
>>> (value & 0x1fffff) >> 8
4940
You can then move back to bytes with the int.to_bytes() method:
>>> ((value & 0x1fffff) >> 8).to_bytes(2, byteorder='big')
b'\x13L'
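The mask-and-shift step can be wrapped in a small helper for arbitrary bit ranges. This is a sketch under the same assumption as above (big-endian bytes, bits counted from the least significant end); extract_bits is a hypothetical name.

```python
def extract_bits(data, low, high):
    """Return bits low (inclusive) to high (exclusive) of a big-endian bytes value,
    counting bit 0 as the least significant bit."""
    value = int.from_bytes(data, byteorder='big')
    width = high - low
    return (value >> low) & ((1 << width) - 1)

print(extract_bits(b'\x93\x4c\x00', 8, 21))  # 4940
```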
As you have a bytes string and you want to strip the right-most eight bits (i.e. one byte), you can simply slice it off the bytes string:
>>> b'\x93\x4c\x00'[:-1]
b'\x93L'
If you want to convert that then to an integer, you can use Python’s struct to unpack it. As you correctly said, you need a fixed size to use structs, so you can just pad the bytes string to add as many zeros as you need:
>>> data = b'\x93\x4c\x00'
>>> data[:-1]
b'\x93L'
>>> data[:-1].rjust(4, b'\x00')
b'\x00\x00\x93L'
>>> struct.unpack('>L', data[:-1].rjust(4, b'\x00'))[0]
37708
Of course, you can also convert it first, and then shift off the 8 bits from the resulting integer:
>>> struct.unpack('>Q', data.rjust(8, b'\x00'))[0] >> 8
37708
If you want to make sure that you don’t actually interpret more than those 13 bits (bits 8 to 21), you have to apply the bit mask 0x1FFF of course:
>>> 37708 & 0x1FFF
4940
(If you need little-endianness instead, just use <L or <Q respectively.)
If you are really counting the bits from left to right (which would be unusual but okay), then you can use that padding technique too:
>>> struct.unpack('>Q', data.ljust(8, b'\x00'))[0] >> 43
1206656
Note that we’re adding the padding to the other side, and are shifting it by 43 bits (your 3 bits plus 5 bytes for the padded data we won’t need to look at).
Another approach that works for arbitrarily long byte sequences is to use the bitstring library which allows for bitwise operations on bitstrings e.g.
>>> import bitstring
>>> bitstring.BitArray(bytes=b'\x93\x4c\x00') >> 3
BitArray('0x126980')
You could convert your bytes to an integer, then multiply or divide by powers of two to accomplish the shifting.
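A sketch of that approach, assuming the bytes are big-endian and that the result should keep the original length:

```python
data = b'\x93\x4c\x00'
value = int.from_bytes(data, byteorder='big')

# Integer division by 2**3 drops the three lowest bits, like value >> 3
shifted = value // 2**3

print(hex(shifted))                                   # 0x126980
print(shifted.to_bytes(len(data), byteorder='big'))   # b'\x12i\x80'
```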
