16 bit hex into 14 bit signed int python? - python

I get a 16 bit Hex number (so 4 digits) from a sensor and want to convert it into a signed integer so I can actually use it.
There are plenty of code snippets on the internet that get the job done, but with this sensor it is a bit more awkward.
In fact, the number has only 14 bits; the first two (from the left) are irrelevant.
I tried to do it (in Python 3) but failed pretty hard.
Any suggestions on how to "cut" the first two bits of the number and then make the rest a signed integer?
The datasheet says that E002 should be -8190 and 1FFE should be +8190.
Thanks a lot!

Let's define a conversion function:
>>> def f(x):
...     r = int(x, 16)
...     return r if r < 2**15 else r - 2**16
...
Now, let's test the function against the values that the datasheet provided:
>>> f('1FFE')
8190
>>> f('E002')
-8190
The usual convention for signed numbers is that a number is negative if the high bit is set and positive if it isn't. Following this convention, '0000' is zero and 'FFFF' is -1. The issue is that int assumes that a number is positive and we have to correct for that:
For any number equal to or less than 0x7FFF, the high bit is unset and the number is positive. Thus we return r=int(x,16) if r<2**15.
For any number r=int(x,16) that is equal to or greater than 0x8000, we return r - 2**16.
While your sensor may only produce 14-bit data, the manufacturer is following the standard convention for 16-bit integers.
Alternative
Instead of converting x to r and testing the value of r, we can directly test whether the high bit in x is set:
>>> def g(x):
...     return int(x, 16) if x[0] in '01234567' else int(x, 16) - 2**16
...
>>> g('1FFE')
8190
>>> g('E002')
-8190
Ignoring the upper bits
Let's suppose that the manufacturer is not following standard conventions and that the upper two bits are unreliable. In this case, we can use modulo, %, to remove them and, after adjusting the other constants as appropriate for 14-bit integers, we have:
>>> def h(x):
...     r = int(x, 16) % 2**14
...     return r if r < 2**13 else r - 2**14
...
>>> h('1FFE')
8190
>>> h('E002')
-8190
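The difference between f and h only shows up when the top two bits are not a copy of the sign bit. The sketch below repeats both definitions and compares them; '2002' is a made-up reading used purely for illustration, not a value from the datasheet.

```python
def f(x):
    # 16-bit interpretation: bit 15 is the sign bit
    r = int(x, 16)
    return r if r < 2**15 else r - 2**16

def h(x):
    # 14-bit interpretation: drop bits 14 and 15, bit 13 is the sign bit
    r = int(x, 16) % 2**14
    return r if r < 2**13 else r - 2**14

# Datasheet values: both interpretations agree.
print(f('E002'), h('E002'))  # -8190 -8190
print(f('1FFE'), h('1FFE'))  # 8190 8190

# Hypothetical reading with bit 13 set but bits 14 and 15 clear: they disagree.
print(f('2002'), h('2002'))  # 8194 -8190
```

If the sensor always sign-extends into the upper two bits, either function works; if those bits can carry anything else, only h ignores them.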

There is a general algorithm for sign-extending a two's-complement integer value val whose number of bits is nbits (so that the top-most of those bits is the sign bit).
That algorithm is:
1. treat the value as a non-negative number, and if needed, mask off additional bits
2. invert the sign bit, still treating the result as a non-negative number
3. subtract the numeric value of the sign bit considered as a non-negative number, producing as a result a signed number.
Expressing this algorithm in Python produces:
from __future__ import print_function

def sext(val, nbits):
    assert nbits > 0
    signbit = 1 << (nbits - 1)
    mask = (1 << nbits) - 1
    return ((val & mask) ^ signbit) - signbit

if __name__ == '__main__':
    print('sext(0xe002, 14) =', sext(0xe002, 14))
    print('sext(0x1ffe, 14) =', sext(0x1ffe, 14))
which when run shows the desired results:
sext(0xe002, 14) = -8190
sext(0x1ffe, 14) = 8190
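If the sensor hands over two raw bytes rather than a hex string, the same function can be applied after assembling the 16-bit word. This is only a sketch for Python 3: the helper name read_14bit and the big-endian byte order are assumptions, so check the datasheet for the actual byte order.

```python
def sext(val, nbits):
    # Sign-extend the low nbits of val (two's complement).
    signbit = 1 << (nbits - 1)
    mask = (1 << nbits) - 1
    return ((val & mask) ^ signbit) - signbit

def read_14bit(raw):
    # raw: two bytes from the sensor; big-endian order is an assumption.
    word = int.from_bytes(raw, byteorder='big')
    return sext(word, 14)

print(read_14bit(bytes.fromhex('E002')))  # -8190
print(read_14bit(bytes.fromhex('1FFE')))  # 8190
```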

Related

How are integers truncated for Python hash() function? [duplicate]

I've been playing with Python's hash function. For small integers, it appears hash(n) == n always. However this does not extend to large numbers:
>>> hash(2**100) == 2**100
False
I'm not surprised, I understand hash takes a finite range of values. What is that range?
I tried using binary search to find the smallest number hash(n) != n
>>> import codejamhelpers # pip install codejamhelpers
>>> help(codejamhelpers.binary_search)
Help on function binary_search in module codejamhelpers.binary_search:
binary_search(f, t)
Given an increasing function :math:`f`, find the greatest non-negative integer :math:`n` such that :math:`f(n) \le t`. If :math:`f(n) > t` for all :math:`n \ge 0`, return None.
>>> f = lambda n: int(hash(n) != n)
>>> n = codejamhelpers.binary_search(f, 0)
>>> hash(n)
2305843009213693950
>>> hash(n+1)
0
What's special about 2305843009213693951? I note it's less than sys.maxsize == 9223372036854775807
Edit: I'm using Python 3. I ran the same binary search on Python 2 and got a different result 2147483648, which I note is sys.maxint+1
I also played with [hash(random.random()) for i in range(10**6)] to estimate the range of hash function. The max is consistently below n above. Comparing the min, it seems Python 3's hash is always positively valued, whereas Python 2's hash can take negative values.
2305843009213693951 is 2^61 - 1. It's the largest Mersenne prime that fits into 64 bits.
If you have to make a hash just by taking the value mod some number, then a large Mersenne prime is a good choice -- it's easy to compute and ensures an even distribution of possibilities. (Although I personally would never make a hash this way)
It's especially convenient to compute the modulus for floating point numbers. They have an exponential component that multiplies the whole number by 2^x. Since 2^61 = 1 mod 2^61-1, you only need to consider the (exponent) mod 61.
See: https://en.wikipedia.org/wiki/Mersenne_prime
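The reduction can be checked directly. The sketch below assumes CPython 3 and reads the modulus from sys.hash_info instead of hard-coding 2**61 - 1; the helper name expected_int_hash is made up for this example.

```python
import sys

M = sys.hash_info.modulus  # 2**61 - 1 on typical 64-bit CPython 3 builds

def expected_int_hash(n):
    # Reduce |n| modulo M, restore the sign, and avoid -1
    # (CPython reserves -1 as an error code, so it is mapped to -2).
    h = abs(n) % M
    if n < 0:
        h = -h
    return -2 if h == -1 else h

for n in (0, 1, -1, M - 1, M, M + 1, 2**100, -2**100):
    assert hash(n) == expected_int_hash(n)

# Because 2**bits == 1 (mod M), powers of two wrap around with period `bits`.
bits = M.bit_length()
assert hash(2**200) == 2**(200 % bits)

# The same holds for a float that is an exact power of two.
assert hash(2.0**200) == hash(2**200)
```

This also explains the binary search result in the question: M - 1 is the last integer that hashes to itself, and hash(M) is 0.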
Based on python documentation in pyhash.c file:
For numeric types, the hash of a number x is based on the reduction
of x modulo the prime P = 2**_PyHASH_BITS - 1. It's designed so that
hash(x) == hash(y) whenever x and y are numerically equal, even if
x and y have different types.
So for a 64/32 bit machine, the reduction would be 2**_PyHASH_BITS - 1, but what is _PyHASH_BITS?
You can find it in pyhash.h header file which for a 64 bit machine has been defined as 61 (you can read more explanation in pyconfig.h file).
#if SIZEOF_VOID_P >= 8
# define _PyHASH_BITS 61
#else
# define _PyHASH_BITS 31
#endif
So first of all, it's based on your platform. For example, on my 64-bit Linux platform the reduction is 2**61 - 1, which is 2305843009213693951:
>>> 2**61 - 1
2305843009213693951
Also, you can use math.frexp to get the mantissa and exponent of sys.maxint (Python 2), which for a 64-bit machine shows that max int is about 2**63 (0.5 * 2**64):
>>> import math, sys
>>> math.frexp(sys.maxint)
(0.5, 64)
And you can see the difference by a simple test:
>>> hash(2**62) == 2**62
True
>>> hash(2**63) == 2**63
False
Read the complete documentation about python hashing algorithm https://github.com/python/cpython/blob/master/Python/pyhash.c#L34
As mentioned in a comment, you can use sys.hash_info (in Python 3.x), which will give you a struct sequence of parameters used for computing hashes.
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>>
Alongside the modulus that I've described in the preceding lines, you can also get the inf value as follows:
>>> hash(float('inf'))
314159
>>> sys.hash_info.inf
314159
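The pyhash.c comment quoted above says that numerically equal values hash equally across types. A quick check of that property, assuming CPython 3 and using only standard library numeric types (the sample values 7 and 0.5 are arbitrary):

```python
from decimal import Decimal
from fractions import Fraction

# The same number expressed as int, float, Fraction, Decimal and complex.
values = [7, 7.0, Fraction(7, 1), Decimal(7), complex(7, 0)]
assert len({hash(v) for v in values}) == 1

# Non-integers follow the same rule: equal values give equal hashes.
assert hash(0.5) == hash(Fraction(1, 2)) == hash(Decimal('0.5'))
```

This property is what lets 7 and 7.0 act as the same dictionary key.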
The hash function returns a plain int, which means the returned value lies between -sys.maxint and sys.maxint; if you pass sys.maxint + x to it, the result would be -sys.maxint + (x - 2).
hash(sys.maxint + 1) == sys.maxint + 1 # False
hash(sys.maxint + 1) == - sys.maxint -1 # True
hash(sys.maxint + sys.maxint) == -sys.maxint + sys.maxint - 2 # True
Meanwhile, 2**200 is n times greater than sys.maxint; my guess is that hash would wrap around the range -sys.maxint..+sys.maxint n times until it stops on a plain integer in that range, as in the code snippets above.
So generally, for any n <= sys.maxint:
hash(sys.maxint*n) == -sys.maxint*(n%2) + 2*(n%2)*sys.maxint - n/2 - (n + 1)%2 ## True
Note: this is true for python 2.
The implementation for the int type in cpython can be found here.
It just returns the value, except for -1, in which case it returns -2:
static long
int_hash(PyIntObject *v)
{
    /* XXX If this is changed, you also need to change the way
       Python's long, float and complex types are hashed. */
    long x = v -> ob_ival;
    if (x == -1)
        x = -2;
    return x;
}

Python binary value of integer of certain byte size

I know python probably isn't the best tool for this, but let's say I have a value that I would like to display as a signed char with values between -128 and 127. For example:
# ok for positive number
>>> f'0b{1:>08b}'
'0b00000001'
# how to do it for negative number?
>>> f'0b{-1:>08b}' # should be 0b11111111
'0b000000-1'
# how to do it for 2's complement?
>>> f'0b{~1:>08b}' # should be 0b11111110
'0b00000-10'
How could I do this display in python?
Use modulo 256.
# positive number is the same
>>> f'0b{1 % 0x100:>08b}'
'0b00000001'
# correct bit pattern, were you to notate -1 in a signed int8
# same as notating 255 in unsigned int8, which is what -1 % 256 is
>>> f'0b{-1 % 0x100:>08b}'
'0b11111111'
# flipped bits from 1, truncated to only the least significant 8 digits
>>> f'0b{~1 % 0x100:>08b}'
'0b11111110'
Essentially this is just 'convert your signed char into an unsigned char, and print the bit pattern' - the benefit of using the modulo operator is you always get a positive number, and if your modulo is a power of two, the bit pattern for every bit less than that modulus is left exactly the same.
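The modulo trick generalizes to any width. A minimal sketch, where to_bits is a hypothetical helper name and the width defaults to 8 bits:

```python
def to_bits(value, width=8):
    # Reduce modulo 2**width so a negative value maps to its
    # two's-complement bit pattern, then zero-pad to the full width.
    return '0b' + format(value % (1 << width), '0{}b'.format(width))

print(to_bits(1))       # 0b00000001
print(to_bits(-1))      # 0b11111111
print(to_bits(~1))      # 0b11111110
print(to_bits(-128))    # 0b10000000
print(to_bits(-3, 16))  # 0b1111111111111101
```

Values outside the representable range wrap silently (for example, 256 shows as all zeros at width 8), so add a range check if that matters.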
You could try setting the bits manually like this, knowing that 2^n - 1 sets the first n bits:
>>> negative = lambda value, bits: bin(2**bits - 1 - value + 1)
>>> complement = lambda value, bits: bin(2**bits - 1 - value)
# to verify (note: Python doesn't 'know' it's 1 byte, so the sum will
# equal 256 unless we apply the 1-byte mask with & 0xFF)
>>> (1 + int(negative(1, 8), 2)) & 0xFF
0

Converting an integer to signed 2's complement binary string

Right now, as far as I know, all means of conversion from int to binary bit string are for unsigned conversions (bin, format, etc.). Is there a way to quickly convert a given integer into its corresponding 2's complement bitstring (using minimal bits)?
For example, I'd want this function f to output:
f(-4) = '100'
f(5) = '0101'
f(-13) = '10011'
Right now, my implementation is this code here:
def f(x):
    """Convert decimal to two's complement binary string"""
    if x < 0:
        bs = bin(x)[3:]
        bs_pad = zero_pad(bs, roundup(tc_bits(x)))
        return bin((int(invert(bs_pad),2) + 1))#negate and add 1
    else: #Positive- sign bit 0.
        bs = bin(x)[2:]
        return "0b" + zero_pad(bs, roundup(tc_bits(x)))
which basically traces each step of the conversion process- zero-padding, negation, adding 1, then converting back to binary (it actually also ensures the bit width is a multiple of four). This was super tedious to write and I'm wondering if Python supports a faster/more code-concise way.
Nothing built in, but this is more concise:
def f(n):
    nbits = n.bit_length() + 1
    return f"{n & ((1 << nbits) - 1):0{nbits}b}"
Then, e.g.,
>>> f(0)
'0'
>>> f(1)
'01'
>>> f(2)
'010'
>>> f(3)
'011'
>>> f(-1)
'11'
>>> f(-2)
'110'
>>> f(-3)
'101'
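Note that bit_length() works on the absolute value, so the function above gives '1100' for -4 rather than the '100' asked for in the question. If "minimal" means the shortest string that still decodes to the same value, one possible variant is sketched below; min_twos_complement is a hypothetical name.

```python
def min_twos_complement(n):
    # For n >= 0 the magnitude needs n.bit_length() bits plus a 0 sign bit.
    # For n < 0, ~n == -n - 1 is non-negative and needs the same number of
    # magnitude bits as n does in two's complement, plus a 1 sign bit.
    magnitude = n if n >= 0 else ~n
    nbits = magnitude.bit_length() + 1
    return format(n & ((1 << nbits) - 1), '0{}b'.format(nbits))

print(min_twos_complement(-4))   # 100
print(min_twos_complement(5))    # 0101
print(min_twos_complement(-13))  # 10011
```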

How to get the signed integer value of a long in python?

If lv stores a long value, and the machine is 32 bits, the following code:
iv = int(lv & 0xffffffff)
results in an iv of type long, instead of the machine's int.
How can I get the (signed) int value in this case?
import ctypes
number = lv & 0xFFFFFFFF
signed_number = ctypes.c_long(number).value
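One caveat: the size of c_long depends on the platform (it is 8 bytes on 64-bit Linux and macOS), so the result is only a 32-bit signed value where long is 32 bits wide. A sketch using the fixed-width ctypes.c_int32 instead, with 0xDEADBEEF as an assumed example value:

```python
import ctypes

lv = 0xDEADBEEF  # example value, assumed for illustration

# c_int32 is always 32 bits wide, whatever the platform's long size is.
signed_number = ctypes.c_int32(lv & 0xFFFFFFFF).value
print(signed_number)  # -559038737
```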
You're working in a high-level scripting language; by nature, the native data types of the system you're running on aren't visible. You can't cast to a native signed int with code like this.
If you know that you want the value converted to a 32-bit signed integer, regardless of the platform, you can just do the conversion with simple math:
iv = 0xDEADBEEF
if(iv & 0x80000000):
    iv = -0x100000000 + iv
Essentially, the problem is to sign extend from 32 bits to... an infinite number of bits, because Python has arbitrarily large integers. Normally, sign extension is done automatically by CPU instructions when casting, so it's interesting that this is harder in Python than it would be in, say, C.
By playing around, I found something similar to BreizhGatch's function, but that doesn't require a conditional statement. n & 0x80000000 extracts the 32-bit sign bit; then, the - keeps the same 32-bit representation but sign-extends it; finally, the extended sign bits are set on n.
def toSigned32(n):
    n = n & 0xffffffff
    return n | (-(n & 0x80000000))
Bit Twiddling Hacks suggests another solution that perhaps works more generally. n ^ 0x80000000 flips the 32-bit sign bit; then - 0x80000000 will sign-extend the opposite bit. Another way to think about it is that initially, negative numbers are above positive numbers (separated by 0x80000000); the ^ swaps their positions; then the - shifts negative numbers to below 0.
def toSigned32(n):
    n = n & 0xffffffff
    return (n ^ 0x80000000) - 0x80000000
Can I suggest this:
def getSignedNumber(number, bitLength):
    mask = (2 ** bitLength) - 1
    if number & (1 << (bitLength - 1)):
        return number | ~mask
    else:
        return number & mask

print iv, '->', getSignedNumber(iv, 32)
You may use struct library to convert values like that. It's ugly, but works:
from struct import pack, unpack
signed = unpack('l', pack('L', lv & 0xffffffff))[0]
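The 'l' and 'L' format codes use the platform's native long size, which is 8 bytes on many 64-bit systems, so the snippet above only wraps at 32 bits where long is 32 bits wide. A sketch that uses the standard-size codes instead, again with 0xDEADBEEF as an assumed example value:

```python
import struct

lv = 0xDEADBEEF  # example value, assumed for illustration

# '<I' / '<i' are fixed 4-byte unsigned / signed ints, independent of platform.
signed = struct.unpack('<i', struct.pack('<I', lv & 0xFFFFFFFF))[0]
print(signed)  # -559038737

# Equivalent on Python 3 using int methods.
same = int.from_bytes((lv & 0xFFFFFFFF).to_bytes(4, 'little'), 'little', signed=True)
assert same == signed
```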
A quick and dirty solution (x is never greater than 32-bit in my case).
if x > 0x7fffffff:
x = x - 4294967296
If you know how many bits are in the original value, e.g. byte or multibyte values from an I2C sensor, then you can do the standard Two's Complement conversion:
def TwosComp8(n):
    return n - 0x100 if n & 0x80 else n

def TwosComp16(n):
    return n - 0x10000 if n & 0x8000 else n

def TwosComp32(n):
    return n - 0x100000000 if n & 0x80000000 else n
In case the hexadecimal representation of the number is of 4 bytes, this would solve the problem.
def B2T_32(x):
    num=int(x,16)
    if(num & 0x80000000): # If it has the negative sign bit. (MSB=1)
        num -= 0x80000000*2
    return num

print(B2T_32(input("enter a input as a hex value\n")))
Simplest solution with any bit-length of number
Why is the syntax of a signed integer so difficult for the human mind to understand? Because it is an idea made for machines. :-)
Let's explain.
If we have a bi-directional 7-bit counter with the initial state
000 0000
and it gets a pulse on the count-down input, then the next number it shows will be
111 1111
And the people said:
Hey, counter, we need to know that this is a negative reading. You
should add a sign bit to let us know.
And the counter added:
1111 1111
And people asked:
How are we going to calculate that this is -1?
The counter replied: Take the number one greater than the all-ones reading and subtract it from the reading to get the result.
     1111 1111
-  1 0000 0000
______________
(dec)       -1
def sigIntFromHex(a): # e.g. a = 0xffe1
    # (1 << (a.bit_length()-1)) is the highest set bit of a, e.g. 0x8000 for 0xffe1
    if a & (1 << (a.bit_length()-1)):
        return a - (1 << (a.bit_length())) # 0xffe1 - 0x10000
    else:
        return a

# and more compactly:
def sigIntFromHex(a):
    return a - (1 << (a.bit_length())) if a & (1 << (a.bit_length()-1)) else a

b = 0xFFE1
print(sigIntFromHex(b))
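A caveat with inferring the width from bit_length(): the highest set bit of a positive number is always 1, so the function treats it as the sign bit whatever the intended width is, and sigIntFromHex(0x0001) returns -1. A sketch that takes the width explicitly instead; signed_from_unsigned is a hypothetical name:

```python
def signed_from_unsigned(a, nbits):
    # Interpret the low nbits of a as a two's-complement number.
    a &= (1 << nbits) - 1
    return a - (1 << nbits) if a & (1 << (nbits - 1)) else a

print(signed_from_unsigned(0xFFE1, 16))  # -31
print(signed_from_unsigned(0x0001, 16))  # 1
print(signed_from_unsigned(0x7F, 7))     # -1
```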
I hope I helped
