How can I limit the number of bits in an integer variable in Python?

I want to implement the IDEA algorithm in Python. Python integers have no size limit, but I need to limit the number of bits in an integer, for example to do a cyclic left shift. What do you advise?

One way is to use the BitVector library.
Example of use:
>>> from BitVector import BitVector
>>> bv = BitVector(intVal = 0x13A5, size = 32)
>>> print bv
00000000000000000001001110100101
>>> bv << 6 #does a cyclic left shift
>>> print bv
00000000000001001110100101000000
>>> bv[0] = 1
>>> print bv
10000000000001001110100101000000
>>> bv << 3 #cyclic shift again, should be more apparent
>>> print bv
00000000001001110100101000000100

An 8-bit mask with a cyclic left shift by one bit:
shifted = number << 1
overflowed = (shifted & 0x100) >> 8  # the bit that was shifted out of the low 8 bits
shifted &= 0xFF
result = overflowed | shifted
You should be able to make a class that does this for you. With a bit more of the same, it can rotate a value of arbitrary width by an arbitrary amount.
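The same idea can be generalised to any width and shift amount. The following is a minimal sketch rather than a definitive implementation; the name rotl and its parameters are assumptions made for illustration and do not come from any library:

```python
def rotl(value, shift, bits):
    """Rotate the low `bits` bits of `value` left by `shift` positions."""
    mask = (1 << bits) - 1          # e.g. 0xFF for bits == 8
    value &= mask                   # discard anything above the chosen width
    shift %= bits                   # rotating by the full width is a no-op
    return ((value << shift) | (value >> (bits - shift))) & mask


print(hex(rotl(0x81, 1, 8)))          # 8-bit rotate by one bit
print(hex(rotl(0x12345678, 20, 32)))  # 32-bit rotate by 20 bits
```

Because Python integers never overflow, the final mask is what enforces the fixed width.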

The bitstring module might be of help. This example creates a 22-bit bitstring and rotates the bits 3 places to the right:
>>> from bitstring import BitArray
>>> a = BitArray(22) # creates 22-bit zeroed bitstring
>>> a.uint = 12345 # set the bits with an unsigned integer
>>> a.bin # view the binary representation
'0b0000000011000000111001'
>>> a.ror(3) # rotate to the right
>>> a.bin
'0b0010000000011000000111'
>>> a.uint # and back to the integer representation
525831

If you want the low 32 bits of a number, you can use a bitwise AND with a mask. Here the mask trims a left rotation by 20 bits down to 32 bits:
>>> low32 = (1 << 32) - 1
>>> n = 0x12345678
>>> m = ((n << 20) | (n >> 12)) & low32
>>> "0x%x" % m
'0x67812345'

Related

How to output the difference of two floats as an integer in Python

Is there any way to output the difference between two float numbers as an integer?
Below are three examples of the float values provided to the script. My goal is to output the difference between these values as an integer. In the first example I should get 2, since num_two - num_one equals 0.000002, but I don't want the zeros because they don't matter. I could do it with string formatting, but I have no way of telling how big the number is or how many zeros it has.
## example 1
num_one = 0.000012
num_two = 0.000014
## example 2
num_0ne = 0.0123
num_tw0 = 0.013
## example 3
num_1 = 23.32
num_2 = 23.234
print (float(num_2) - float(num_1))
## this should output 86 as an integer
Beware of floats (see https://en.wikipedia.org/wiki/IEEE_754):
>>> 23.32 - 23.234
0.08599999999999852
You need exact precision. Use the decimal module:
>>> from decimal import Decimal
>>> n1 = Decimal("23.32")
>>> n2 = Decimal("23.234")
>>> n1, n2
(Decimal('23.32'), Decimal('23.234'))
>>> d = abs(n1-n2)
>>> d
Decimal('0.086')
Now just shift the decimal point right (that is, multiply by 10) until there is no fractional part left (d % 1 == 0):
>>> while d % 1:
...     d *= 10
(Don't be afraid: the loop will stop, because the number of decimal places is finite and decreases by one on each iteration.)
You get the expected result:
>>> d
Decimal('86.000')
>>> int(d)
86
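The steps above can be wrapped in a small function. This is a sketch under the assumption that the inputs arrive as strings (or Decimals), so that no float rounding has already happened; diff_as_int is a hypothetical name chosen for illustration:

```python
from decimal import Decimal


def diff_as_int(a, b):
    """Return |a - b| with the decimal point shifted right until it is whole.

    Pass `a` and `b` as strings (or Decimals) to avoid float rounding.
    """
    d = abs(Decimal(a) - Decimal(b))
    while d % 1:          # loop while a fractional part remains
        d *= 10           # shift the decimal point one place to the right
    return int(d)


print(diff_as_int("0.000012", "0.000014"))  # 2
print(diff_as_int("0.0123", "0.013"))       # 7
print(diff_as_int("23.32", "23.234"))       # 86
```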

Two's complement in Python (shift left on many bits with rounding)

How can we compute the reverse complement of a DNA sequence from its numeric code?
A DNA sequence can contain 4 different characters, A, C, G and T, where A is the complement of T and C is the complement of G.
The reverse complement of a DNA sequence is its complement read in reverse order (we compute the complement of each character from right to left).
Example: the reverse complement of AA is TT, the reverse complement of AC is GT, and so on.
In general, in Python we encode a sequence by mapping each character to a number from 0 to 3,
{A:0, C:1, G:2, T:3}
so the code of AA is 0, and the code of AC is:
AC = 0*4^0+1*4^1 = 4
The code of GT is:
GT = 2*4^0+3*4^1 = 14
How could I transform the code of each sequence into the code of its reverse complement in Python without creating a dictionary? For the above example: convert 4 to 14, 0 to 15, and so on.
Your symbol set is too small for a hash map to actually be efficient. And mixing two's complement into your problem has just caused confusion.
symbols = 'ACGT'
complements = symbols[::-1] # reverse order
import string
table = str.maketrans(symbols, complements) # string.maketrans in Python 2
sample = 'ACCGTT'
print(sample[::-1].translate(table))
# output: AACGGT
Converting to some bitpacked format would take less space but require a lot more special handling, as you'd need to track sizes separately, perform arbitrarily wide shifts and so on. Python can certainly do it, in particular with int() accepting many bases and creating arbitrary width results, but it's likely a counterproductive detour.
digits = string.digits[:len(symbols)]
length = len(sample)
digitmap = str.maketrans(symbols, digits) # string.maketrans in Python 2
number = int(sample.translate(digitmap), len(digits))
def reversemapnumber(function=lambda digit: digit, number=0, radix=0b100, length=0):
    # peel digits off the low end and push them onto the result in reverse order
    result = 0
    for i in range(length):
        number, digit = divmod(number, radix)
        result = result*radix + function(digit)
    return result
revcomplemented = reversemapnumber(function=lambda x: 3-x,
                                   number=number, length=length)
# binary form
print('{:0{}b}'.format(revcomplemented, length*2))
# back to text form
print(''.join(symbols[(revcomplemented>>i)&0b11]
              for i in range(2*length-2, -2, -2)))
In that jumble of code I've used division rather than shifts to be somewhat more generic (supporting radix not a power of two), but the printing examples rely on the width exactly. In the end it's just tricky and unclear.
To reverse a list in Python:
>>> xs = [1,2,3]
>>> reversed(xs)
<listreverseiterator object at 0x10089c9d0>
>>> list(reversed(xs))
[3, 2, 1]
>>>
def complement(x):
    return ~x & 15 # as 15 == int('1111', 2)
The 15 is a bitmask representing binary 1111. ANDing with it keeps only the low four bits of the inverted value.
>>> "{0:b}".format(complement(int('1111',2)))
'0'
>>> "{0:b}".format(complement(int('0001',2)))
'1110'
>>> "{0:b}".format(complement(int('1001',2)))
'110'
>>> xs = [int('1111',2), int('1001',2), int('0110',2), int('1011',2)]
>>> list(map(complement, xs))
[0, 6, 9, 4]
>>> list(reversed(list(map(complement, xs))))
[4, 9, 6, 0]
Taking your example: given the 6-character sequence ACCGTT, the complement of A is T and the complement of C is G, so the reverse complement of ACCGTT is AACGGT.
Assume that you have a complement function complement and a reverse function reverse.
Then reverse(ACCGTT) = TTGCCA and complement(ACCGTT) = TGGCAA.
Reversing a list after calling a function on each element gives the same result as calling the function on each element of the reversed list:
complement(reverse(ACCGTT)) = reverse(complement(ACCGTT))
So the other part of the question is that you want to map
{A:0, C:1, G:2, T:3}
A -> T | 0 -> 3
T -> A | 3 -> 0
C -> G | 1 -> 2
G -> C | 2 -> 1
which in binary would be
a = int('00', 2) # 0
c = int('01', 2) # 1
g = int('10', 2) # 2
t = int('11', 2) # 3
def complement(x):
    return ~x & 3 # this 3 is the same as int('11', 2)
def reverse_complement(list_of_ints):
    # build a list first: in Python 3, map() returns an iterator that reversed() rejects
    return list(reversed([complement(x) for x in list_of_ints]))
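To tie this back to the numbers in the question (4 should become 14, and 0 should become 15), here is a minimal sketch that works directly on the packed integer. It assumes the question's encoding (A=0, C=1, G=2, T=3, first character in the lowest-order base-4 digit) and that the sequence length is known; reverse_complement_code is a hypothetical name:

```python
def reverse_complement_code(code, length):
    """Reverse-complement a sequence packed as 2-bit symbols in an integer.

    Assumes A=0, C=1, G=2, T=3 with the first character in the lowest digit.
    """
    mask = (1 << (2 * length)) - 1
    code = ~code & mask              # complement every 2-bit symbol at once
    result = 0
    for _ in range(length):          # reverse the order of the 2-bit symbols
        result = (result << 2) | (code & 0b11)
        code >>= 2
    return result


print(reverse_complement_code(4, 2))   # AC -> GT, i.e. 14
print(reverse_complement_code(0, 2))   # AA -> TT, i.e. 15
```

The length has to be passed in because leading A symbols are zeros and therefore invisible in the integer itself.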

Extract bitfields from an int in Python

I have a number like 0x5423 from which I want to extract 4 values:
a = 0x5 # 15 downto 12
b = 0x42 # 11 downto 4
c = 0x0 # 3 downto 2
d = 0x3 # 1 downto 0
I discovered the bitstring module, which looks great. Unfortunately, its bits are numbered from the left (index 0 is the most significant bit).
This is bad because if I add some upper bits, as in 0xA5423, my extraction won't work anymore:
field = bitstring.BitArray('0x5423')
a = field[0:4].uint
b = field[4:12].uint
c = field[12:14].uint
d = field[14:16].uint
How can I properly extract my bitfields without complex arithmetic manipulations such as:
b = (a >> 4) & 0xFF
Ideally I would have:
b = field.range(11, 4)
Convert the string to 0x#### format before passing it to bitstring.BitArray:
>>> n = '0xA5423'
>>> n = '0x{:04x}'.format(int(n, 16) & 0xffff) # => '0x5423'
>>> field = bitstring.BitArray(n)
>>> field[0:4].uint
5
>>> field[4:12].uint # 0x42 == 66
66
>>> field[12:14].uint
0
>>> field[14:16].uint
3
UPDATE: another solution that does not depend on bitstring and counts bits from the right (least significant bit first), as the OP wants:
Convert the number into a binary string and reverse it:
>>> n = '0xA5423'
>>> n = format(int(n, 16), '016b')[::-1] # reversed
>>> n
'11000100001010100101'
>>> int(n[0:2][::-1], 2) # need to reverse again to get proper value
3
>>> int(n[2:4][::-1], 2)
0
>>> int(n[4:12][::-1], 2)
66
>>> int(n[12:16][::-1], 2)
5
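If the shift-and-mask arithmetic is the only objection, it can be hidden behind a helper that takes the "high downto low" bit positions from the question. This is a sketch only; bits is a hypothetical helper name, with bit 0 taken to be the least significant bit:

```python
def bits(value, high, low):
    """Return bits high..low of value (inclusive, bit 0 = least significant)."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


n = 0xA5423            # the extra upper bits do not disturb the result
a = bits(n, 15, 12)    # 0x5
b = bits(n, 11, 4)     # 0x42
c = bits(n, 3, 2)      # 0x0
d = bits(n, 1, 0)      # 0x3
print(a, b, c, d)
```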

drop trailing zeros from decimal

I have a long list of Decimals that I have to adjust by factors of 10, 100, 1000, ..., 1000000 depending on certain conditions. When I multiply them there is sometimes (though not always) a useless trailing zero that I want to get rid of. For example:
from decimal import Decimal
# outputs 25.0, PROBLEM! I would like it to output 25
print Decimal('2.5') * 10
# outputs 2567.8000, PROBLEM! I would like it to output 2567.8
print Decimal('2.5678') * 1000
Is there a function that tells the decimal object to drop these insignificant zeros? The only way I can think of doing this is to convert to a string and replace them using regular expressions.
Should probably mention that I am using python 2.6.5
EDIT
senderle's fine answer made me realize that I occasionally get a number like 250.0, which when normalized produces 2.5E+2. I guess in these cases I could try to sort them out and convert to an int.
You can use the normalize method to remove extra precision.
>>> print decimal.Decimal('5.500')
5.500
>>> print decimal.Decimal('5.500').normalize()
5.5
To avoid stripping zeros to the left of the decimal point, you could do this:
def normalize_fraction(d):
    normalized = d.normalize()
    sign, digits, exponent = normalized.as_tuple()
    if exponent > 0:
        return decimal.Decimal((sign, digits + (0,) * exponent, 0))
    else:
        return normalized
Or more compactly, using quantize as suggested by user7116:
def normalize_fraction(d):
    normalized = d.normalize()
    sign, digits, exponent = normalized.as_tuple()
    return normalized if exponent <= 0 else normalized.quantize(1)
You could also use to_integral() as shown here but I think using as_tuple this way is more self-documenting.
I tested these both against a few cases; please leave a comment if you find something that doesn't work.
>>> normalize_fraction(decimal.Decimal('55.5'))
Decimal('55.5')
>>> normalize_fraction(decimal.Decimal('55.500'))
Decimal('55.5')
>>> normalize_fraction(decimal.Decimal('55500'))
Decimal('55500')
>>> normalize_fraction(decimal.Decimal('555E2'))
Decimal('55500')
There's probably a better way of doing this, but you could use .rstrip('0').rstrip('.') to achieve the result that you want.
Using your numbers as an example:
>>> s = str(Decimal('2.5') * 10)
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
25
>>> s = str(Decimal('2.5678') * 1000)
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
2567.8
And here's the fix for the problem that @gerrit pointed out in the comments:
>>> s = str(Decimal('1500'))
>>> print s.rstrip('0').rstrip('.') if '.' in s else s
1500
Answer from the Decimal FAQ in the documentation:
>>> def remove_exponent(d):
...     return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
>>> remove_exponent(Decimal('5.00'))
Decimal('5')
>>> remove_exponent(Decimal('5.500'))
Decimal('5.5')
>>> remove_exponent(Decimal('5E+3'))
Decimal('5000')
The answer is given in the FAQ (https://docs.python.org/2/library/decimal.html#decimal-faq), but without explanation.
To drop trailing zeros from the fractional part, use normalize:
>>> Decimal('100.2000').normalize()
Decimal('100.2')
>>> Decimal('0.2000').normalize()
Decimal('0.2')
But this works differently for numbers whose integer part ends in zeros:
>>> Decimal('100.0000').normalize()
Decimal('1E+2')
In this case we should use to_integral:
>>> Decimal('100.000').to_integral()
Decimal('100')
So we could check if there's a fraction part:
>>> Decimal('100.2000') == Decimal('100.2000').to_integral()
False
>>> Decimal('100.0000') == Decimal('100.0000').to_integral()
True
Then use the appropriate method:
def remove_exponent(num):
    return num.to_integral() if num == num.to_integral() else num.normalize()
Try it:
>>> remove_exponent(Decimal('100.2000'))
Decimal('100.2')
>>> remove_exponent(Decimal('100.0000'))
Decimal('100')
>>> remove_exponent(Decimal('0.2000'))
Decimal('0.2')
Now we're done.
Use the format specifier %g. It removes trailing zeros, but note that it converts the value to a float and keeps only 6 significant digits by default.
>>> "%g" % (Decimal('2.5') * 10)
'25'
>>> "%g" % (Decimal('2.5678') * 1000)
'2567.8'
It also works without the Decimal function
>>> "%g" % (2.5 * 10)
'25'
>>> "%g" % (2.5678 * 1000)
'2567.8'
I ended up doing this:
import decimal
def dropzeros(number):
    mynum = decimal.Decimal(number).normalize()
    # e.g 22000 --> Decimal('2.2E+4')
    return mynum.__trunc__() if not mynum % 1 else float(mynum)
print dropzeros(22000.000)
22000
print dropzeros(2567.8000)
2567.8
note: casting the return value as a string will limit you to 12 significant digits
Slightly modified version of A-IV's answer
NOTE that Decimal('0.99999999999999999999999999995').normalize() will round to Decimal('1')
import decimal
def trailing(s: str, char="0"):
    return len(s) - len(s.rstrip(char))
def decimal_to_str(value: decimal.Decimal):
    """Convert decimal to str
    * Uses exponential notation when there are more than 4 trailing zeros
    * Handles decimal.InvalidOperation
    """
    # to_integral_value() removes decimals
    if value == value.to_integral_value():
        try:
            value = value.quantize(decimal.Decimal(1))
        except decimal.InvalidOperation:
            pass
        uncast = str(value)
        # use exponential notation if there are more than 4 zeros
        return str(value.normalize()) if trailing(uncast) > 4 else uncast
    else:
        # normalize values with decimal places
        return str(value.normalize())
        # or str(value).rstrip('0') if rounding edge cases are a concern
You could use :g to achieve this:
'{:g}'.format(3.140)
gives
'3.14'
This should work:
'{:f}'.format(decimal.Decimal('2.5') * 10).rstrip('0').rstrip('.')
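One caveat: when the formatted value has no decimal point (for example Decimal('1500')), rstrip('0') also removes significant zeros. A minimal sketch that guards against this follows; format_decimal is a hypothetical helper name, not part of the decimal module:

```python
from decimal import Decimal


def format_decimal(d):
    """Format a Decimal in fixed-point notation without trailing zeros."""
    s = '{:f}'.format(d)            # 'f' gives fixed-point, never an exponent
    if '.' in s:                    # only strip zeros after a decimal point
        s = s.rstrip('0').rstrip('.')
    return s


print(format_decimal(Decimal('2.5') * 10))       # 25
print(format_decimal(Decimal('2.5678') * 1000))  # 2567.8
print(format_decimal(Decimal('1500')))           # 1500
```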
Just to show a different possibility, I used as_tuple() to achieve the same result.
def my_normalize(dec):
    """
    >>> my_normalize(Decimal("12.500"))
    Decimal('12.5')
    >>> my_normalize(Decimal("-0.12500"))
    Decimal('-0.125')
    >>> my_normalize(Decimal("0.125"))
    Decimal('0.125')
    >>> my_normalize(Decimal("0.00125"))
    Decimal('0.00125')
    >>> my_normalize(Decimal("125.00"))
    Decimal('125')
    >>> my_normalize(Decimal("12500"))
    Decimal('12500')
    >>> my_normalize(Decimal("0.000"))
    Decimal('0')
    """
    if dec is None:
        return None
    sign, digs, exp = dec.as_tuple()
    for i in list(reversed(digs)):
        if exp >= 0 or i != 0:
            break
        exp += 1
        digs = digs[:-1]
    if not digs and exp < 0:
        exp = 0
    return Decimal((sign, digs, exp))
Why not multiply by 10 and take the result modulo 10 to check whether there is a remainder? No remainder means you can safely convert with int():
if (x * 10) % 10 == 0:
    x = int(x)
x = 2/1
Output: 2
x = 3/2
Output: 1.5

Hex string to signed int in Python

How do I convert a hex string to a signed int in Python 3?
The best I can come up with is
import binascii
h = '9DA92DAB'
b = bytes(h, 'utf-8')
ba = binascii.a2b_hex(b)
print(int.from_bytes(ba, byteorder='big', signed=True))
Is there a simpler way? Unsigned is so much easier: int(h, 16)
BTW, the origin of the question is itunes persistent id - music library xml version and iTunes hex version
In n-bit two's complement, the bits have these values:
bit 0 = 2^0
bit 1 = 2^1
bit n-2 = 2^(n-2)
bit n-1 = -2^(n-1)
But bit n-1 has value 2^(n-1) when unsigned, so the number is 2^n too high. Subtract 2^n if bit n-1 is set:
def twos_complement(hexstr, bits):
    value = int(hexstr, 16)
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value
print(twos_complement('FFFE', 16))
print(twos_complement('7FFF', 16))
print(twos_complement('7F', 8))
print(twos_complement('FF', 8))
Output:
-2
32767
127
-1
Using struct, for Python 3 (with help from the comments):
import struct
h = '9DA92DAB'
struct.unpack('>i', bytes.fromhex(h)) # returns a 1-tuple; take [0] for the int
For Python 2:
h = '9DA92DAB'
struct.unpack('>i', h.decode('hex'))
or if it is little endian:
h = '9DA92DAB'
struct.unpack('<i', h.decode('hex'))
Here's a general function you can use for hex of any size:
# hex string to signed integer
def htosi(val):
    uintval = int(val, 16)
    bits = 4 * (len(val) - 2)  # digit count, assuming a leading '0x' prefix
    if uintval >= 1 << (bits - 1):
        # integer shifts stay exact at any width, unlike math.pow floats
        uintval -= 1 << bits
    return uintval
And to use it:
h = str(hex(-5))
h2 = str(hex(-13589))
x = htosi(h)
x2 = htosi(h2)
This works for 16-bit signed ints; you can extend it to 32-bit ints. It uses the basic definition of two's complement signed numbers. Note that XOR with all ones (0xffff here) is the same as a bitwise NOT.
# convert to unsigned
x = int('ffbf', 16) # example (-65)
# check sign bit
if (x & 0x8000) == 0x8000:
    # if set, invert and add one to get the magnitude, then add the negative sign
    x = -( (x ^ 0xffff) + 1)
It's a very late answer, but here's a function to do the above. This will extend for whatever length you provide. Credit for portions of this to another SO answer (I lost the link, so please provide it if you find it).
def hex_to_signed(source):
    """Convert a hex string to a signed integer.
    This assumes that source is the proper length, and the sign bit
    is the first bit in the first byte of the correct length.
    hex_to_signed("F") should return -1.
    hex_to_signed("0F") should return 15.
    """
    if not isinstance(source, str):
        raise ValueError("string type required")
    if 0 == len(source):
        raise ValueError("string is empty")
    sign_bit_mask = 1 << (len(source)*4-1)
    other_bits_mask = sign_bit_mask - 1
    value = int(source, 16)
    return -(value & sign_bit_mask) | (value & other_bits_mask)
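As a sanity check, the approaches in this thread can be compared on the hex string from the question. The sketch below assumes Python 3 and a 32-bit big-endian value; to_signed is a hypothetical helper name used only for this comparison:

```python
import struct


def to_signed(value, bits):
    """Interpret an unsigned integer as a two's complement value of `bits` bits."""
    sign = 1 << (bits - 1)
    # the low bits count positively, the sign bit counts as -2**(bits-1)
    return (value & (sign - 1)) - (value & sign)


h = '9DA92DAB'
via_arith = to_signed(int(h, 16), 32)
via_bytes = int.from_bytes(bytes.fromhex(h), byteorder='big', signed=True)
via_struct = struct.unpack('>i', bytes.fromhex(h))[0]

print(via_arith, via_bytes, via_struct)  # all three should agree
```

The arithmetic version works for any bit width, while struct is limited to the fixed sizes its format codes support.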
