How are integer truncated for Python hash() function? [duplicate] - python

I've been playing with Python's hash function. For small integers, it appears hash(n) == n always. However this does not extend to large numbers:
>>> hash(2**100) == 2**100
False
I'm not surprised, I understand hash takes a finite range of values. What is that range?
I tried using binary search to find the smallest number hash(n) != n
>>> import codejamhelpers # pip install codejamhelpers
>>> help(codejamhelpers.binary_search)
Help on function binary_search in module codejamhelpers.binary_search:
binary_search(f, t)
Given an increasing function :math:`f`, find the greatest non-negative integer :math:`n` such that :math:`f(n) \le t`. If :math:`f(n) > t` for all :math:`n \ge 0`, return None.
>>> f = lambda n: int(hash(n) != n)
>>> n = codejamhelpers.binary_search(f, 0)
>>> hash(n)
2305843009213693950
>>> hash(n+1)
0
What's special about 2305843009213693951? I note it's less than sys.maxsize == 9223372036854775807
Edit: I'm using Python 3. I ran the same binary search on Python 2 and got a different result 2147483648, which I note is sys.maxint+1
I also played with [hash(random.random()) for i in range(10**6)] to estimate the range of hash function. The max is consistently below n above. Comparing the min, it seems Python 3's hash is always positively valued, whereas Python 2's hash can take negative values.

2305843009213693951 is 2^61 - 1. It's the largest Mersenne prime that fits into 64 bits.
If you have to make a hash just by taking the value mod some number, then a large Mersenne prime is a good choice -- it's easy to compute and ensures an even distribution of possibilities. (Although I personally would never make a hash this way)
It's especially convenient to compute the modulus for floating point numbers. They have an exponential component that multiplies the whole number by 2^x. Since 2^61 = 1 mod 2^61-1, you only need to consider the (exponent) mod 61.
See: https://en.wikipedia.org/wiki/Mersenne_prime

Based on python documentation in pyhash.c file:
For numeric types, the hash of a number x is based on the reduction
of x modulo the prime P = 2**_PyHASH_BITS - 1. It's designed so that
hash(x) == hash(y) whenever x and y are numerically equal, even if
x and y have different types.
So for a 64/32 bit machine, the reduction would be 2 _PyHASH_BITS - 1, but what is _PyHASH_BITS?
You can find it in pyhash.h header file which for a 64 bit machine has been defined as 61 (you can read more explanation in pyconfig.h file).
#if SIZEOF_VOID_P >= 8
# define _PyHASH_BITS 61
#else
# define _PyHASH_BITS 31
#endif
So first off all it's based on your platform for example in my 64bit Linux platform the reduction is 261-1, which is 2305843009213693951:
>>> 2**61 - 1
2305843009213693951
Also You can use math.frexp in order to get the mantissa and exponent of sys.maxint which for a 64 bit machine shows that max int is 263:
>>> import math
>>> math.frexp(sys.maxint)
(0.5, 64)
And you can see the difference by a simple test:
>>> hash(2**62) == 2**62
True
>>> hash(2**63) == 2**63
False
Read the complete documentation about python hashing algorithm https://github.com/python/cpython/blob/master/Python/pyhash.c#L34
As mentioned in comment you can use sys.hash_info (in python 3.X) which will give you a struct sequence of parameters used for computing
hashes.
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>>
Alongside the modulus that I've described in preceding lines, you can also get the inf value as following:
>>> hash(float('inf'))
314159
>>> sys.hash_info.inf
314159

Hash function returns plain int that means that returned value is greater than -sys.maxint and lower than sys.maxint, which means if you pass sys.maxint + x to it result would be -sys.maxint + (x - 2).
hash(sys.maxint + 1) == sys.maxint + 1 # False
hash(sys.maxint + 1) == - sys.maxint -1 # True
hash(sys.maxint + sys.maxint) == -sys.maxint + sys.maxint - 2 # True
Meanwhile 2**200 is a n times greater than sys.maxint - my guess is that hash would go over range -sys.maxint..+sys.maxint n times until it stops on plain integer in that range, like in code snippets above..
So generally, for any n <= sys.maxint:
hash(sys.maxint*n) == -sys.maxint*(n%2) + 2*(n%2)*sys.maxint - n/2 - (n + 1)%2 ## True
Note: this is true for python 2.

The implementation for the int type in cpython can be found here.
It just returns the value, except for -1, than it returns -2:
static long
int_hash(PyIntObject *v)
{
/* XXX If this is changed, you also need to change the way
Python's long, float and complex types are hashed. */
long x = v -> ob_ival;
if (x == -1)
x = -2;
return x;
}

Related

How to prevent Python from identifying a very large float as integer [duplicate]

How could I check if a number is a perfect square?
Speed is of no concern, for now, just working.
See also: Integer square root in python.
The problem with relying on any floating point computation (math.sqrt(x), or x**0.5) is that you can't really be sure it's exact (for sufficiently large integers x, it won't be, and might even overflow). Fortunately (if one's in no hurry;-) there are many pure integer approaches, such as the following...:
def is_square(apositiveint):
x = apositiveint // 2
seen = set([x])
while x * x != apositiveint:
x = (x + (apositiveint // x)) // 2
if x in seen: return False
seen.add(x)
return True
for i in range(110, 130):
print i, is_square(i)
Hint: it's based on the "Babylonian algorithm" for square root, see wikipedia. It does work for any positive number for which you have enough memory for the computation to proceed to completion;-).
Edit: let's see an example...
x = 12345678987654321234567 ** 2
for i in range(x, x+2):
print i, is_square(i)
this prints, as desired (and in a reasonable amount of time, too;-):
152415789666209426002111556165263283035677489 True
152415789666209426002111556165263283035677490 False
Please, before you propose solutions based on floating point intermediate results, make sure they work correctly on this simple example -- it's not that hard (you just need a few extra checks in case the sqrt computed is a little off), just takes a bit of care.
And then try with x**7 and find clever way to work around the problem you'll get,
OverflowError: long int too large to convert to float
you'll have to get more and more clever as the numbers keep growing, of course.
If I was in a hurry, of course, I'd use gmpy -- but then, I'm clearly biased;-).
>>> import gmpy
>>> gmpy.is_square(x**7)
1
>>> gmpy.is_square(x**7 + 1)
0
Yeah, I know, that's just so easy it feels like cheating (a bit the way I feel towards Python in general;-) -- no cleverness at all, just perfect directness and simplicity (and, in the case of gmpy, sheer speed;-)...
Use Newton's method to quickly zero in on the nearest integer square root, then square it and see if it's your number. See isqrt.
Python ≥ 3.8 has math.isqrt. If using an older version of Python, look for the "def isqrt(n)" implementation here.
import math
def is_square(i: int) -> bool:
return i == math.isqrt(i) ** 2
Since you can never depend on exact comparisons when dealing with floating point computations (such as these ways of calculating the square root), a less error-prone implementation would be
import math
def is_square(integer):
root = math.sqrt(integer)
return integer == int(root + 0.5) ** 2
Imagine integer is 9. math.sqrt(9) could be 3.0, but it could also be something like 2.99999 or 3.00001, so squaring the result right off isn't reliable. Knowing that int takes the floor value, increasing the float value by 0.5 first means we'll get the value we're looking for if we're in a range where float still has a fine enough resolution to represent numbers near the one for which we are looking.
If youre interested, I have a pure-math response to a similar question at math stackexchange, "Detecting perfect squares faster than by extracting square root".
My own implementation of isSquare(n) may not be the best, but I like it. Took me several months of study in math theory, digital computation and python programming, comparing myself to other contributors, etc., to really click with this method. I like its simplicity and efficiency though. I havent seen better. Tell me what you think.
def isSquare(n):
## Trivial checks
if type(n) != int: ## integer
return False
if n < 0: ## positivity
return False
if n == 0: ## 0 pass
return True
## Reduction by powers of 4 with bit-logic
while n&3 == 0:
n=n>>2
## Simple bit-logic test. All perfect squares, in binary,
## end in 001, when powers of 4 are factored out.
if n&7 != 1:
return False
if n==1:
return True ## is power of 4, or even power of 2
## Simple modulo equivalency test
c = n%10
if c in {3, 7}:
return False ## Not 1,4,5,6,9 in mod 10
if n % 7 in {3, 5, 6}:
return False ## Not 1,2,4 mod 7
if n % 9 in {2,3,5,6,8}:
return False
if n % 13 in {2,5,6,7,8,11}:
return False
## Other patterns
if c == 5: ## if it ends in a 5
if (n//10)%10 != 2:
return False ## then it must end in 25
if (n//100)%10 not in {0,2,6}:
return False ## and in 025, 225, or 625
if (n//100)%10 == 6:
if (n//1000)%10 not in {0,5}:
return False ## that is, 0625 or 5625
else:
if (n//10)%4 != 0:
return False ## (4k)*10 + (1,9)
## Babylonian Algorithm. Finding the integer square root.
## Root extraction.
s = (len(str(n))-1) // 2
x = (10**s) * 4
A = {x, n}
while x * x != n:
x = (x + (n // x)) >> 1
if x in A:
return False
A.add(x)
return True
Pretty straight forward. First it checks that we have an integer, and a positive one at that. Otherwise there is no point. It lets 0 slip through as True (necessary or else next block is infinite loop).
The next block of code systematically removes powers of 4 in a very fast sub-algorithm using bit shift and bit logic operations. We ultimately are not finding the isSquare of our original n but of a k<n that has been scaled down by powers of 4, if possible. This reduces the size of the number we are working with and really speeds up the Babylonian method, but also makes other checks faster too.
The third block of code performs a simple Boolean bit-logic test. The least significant three digits, in binary, of any perfect square are 001. Always. Save for leading zeros resulting from powers of 4, anyway, which has already been accounted for. If it fails the test, you immediately know it isnt a square. If it passes, you cant be sure.
Also, if we end up with a 1 for a test value then the test number was originally a power of 4, including perhaps 1 itself.
Like the third block, the fourth tests the ones-place value in decimal using simple modulus operator, and tends to catch values that slip through the previous test. Also a mod 7, mod 8, mod 9, and mod 13 test.
The fifth block of code checks for some of the well-known perfect square patterns. Numbers ending in 1 or 9 are preceded by a multiple of four. And numbers ending in 5 must end in 5625, 0625, 225, or 025. I had included others but realized they were redundant or never actually used.
Lastly, the sixth block of code resembles very much what the top answerer - Alex Martelli - answer is. Basically finds the square root using the ancient Babylonian algorithm, but restricting it to integer values while ignoring floating point. Done both for speed and extending the magnitudes of values that are testable. I used sets instead of lists because it takes far less time, I used bit shifts instead of division by two, and I smartly chose an initial start value much more efficiently.
By the way, I did test Alex Martelli's recommended test number, as well as a few numbers many orders magnitude larger, such as:
x=1000199838770766116385386300483414671297203029840113913153824086810909168246772838680374612768821282446322068401699727842499994541063844393713189701844134801239504543830737724442006577672181059194558045164589783791764790043104263404683317158624270845302200548606715007310112016456397357027095564872551184907513312382763025454118825703090010401842892088063527451562032322039937924274426211671442740679624285180817682659081248396873230975882215128049713559849427311798959652681930663843994067353808298002406164092996533923220683447265882968239141724624870704231013642255563984374257471112743917655991279898690480703935007493906644744151022265929975993911186879561257100479593516979735117799410600147341193819147290056586421994333004992422258618475766549646258761885662783430625 ** 2
for i in range(x, x+2):
print(i, isSquare(i))
printed the following results:
1000399717477066534083185452789672211951514938424998708930175541558932213310056978758103599452364409903384901149641614494249195605016959576235097480592396214296565598519295693079257885246632306201885850365687426564365813280963724310434494316592041592681626416195491751015907716210235352495422858432792668507052756279908951163972960239286719854867504108121432187033786444937064356645218196398775923710931242852937602515835035177768967470757847368349565128635934683294155947532322786360581473152034468071184081729335560769488880138928479829695277968766082973795720937033019047838250608170693879209655321034310764422462828792636246742456408134706264621790736361118589122797268261542115823201538743148116654378511916000714911467547209475246784887830649309238110794938892491396597873160778553131774466638923135932135417900066903068192088883207721545109720968467560224268563643820599665232314256575428214983451466488658896488012211237139254674708538347237589290497713613898546363590044902791724541048198769085430459186735166233549186115282574626012296888817453914112423361525305960060329430234696000121420787598967383958525670258016851764034555105019265380321048686563527396844220047826436035333266263375049097675787975100014823583097518824871586828195368306649956481108708929669583308777347960115138098217676704862934389659753628861667169905594181756523762369645897154232744410732552956489694024357481100742138381514396851789639339362228442689184910464071202445106084939268067445115601375050153663645294106475257440167535462278022649865332161044187890625 True
1000399717477066534083185452789672211951514938424998708930175541558932213310056978758103599452364409903384901149641614494249195605016959576235097480592396214296565598519295693079257885246632306201885850365687426564365813280963724310434494316592041592681626416195491751015907716210235352495422858432792668507052756279908951163972960239286719854867504108121432187033786444937064356645218196398775923710931242852937602515835035177768967470757847368349565128635934683294155947532322786360581473152034468071184081729335560769488880138928479829695277968766082973795720937033019047838250608170693879209655321034310764422462828792636246742456408134706264621790736361118589122797268261542115823201538743148116654378511916000714911467547209475246784887830649309238110794938892491396597873160778553131774466638923135932135417900066903068192088883207721545109720968467560224268563643820599665232314256575428214983451466488658896488012211237139254674708538347237589290497713613898546363590044902791724541048198769085430459186735166233549186115282574626012296888817453914112423361525305960060329430234696000121420787598967383958525670258016851764034555105019265380321048686563527396844220047826436035333266263375049097675787975100014823583097518824871586828195368306649956481108708929669583308777347960115138098217676704862934389659753628861667169905594181756523762369645897154232744410732552956489694024357481100742138381514396851789639339362228442689184910464071202445106084939268067445115601375050153663645294106475257440167535462278022649865332161044187890626 False
And it did this in 0.33 seconds.
In my opinion, my algorithm works the same as Alex Martelli's, with all the benefits thereof, but has the added benefit highly efficient simple-test rejections that save a lot of time, not to mention the reduction in size of test numbers by powers of 4, which improves speed, efficiency, accuracy and the size of numbers that are testable. Probably especially true in non-Python implementations.
Roughly 99% of all integers are rejected as non-Square before Babylonian root extraction is even implemented, and in 2/3 the time it would take the Babylonian to reject the integer. And though these tests dont speed up the process that significantly, the reduction in all test numbers to an odd by dividing out all powers of 4 really accelerates the Babylonian test.
I did a time comparison test. I tested all integers from 1 to 10 Million in succession. Using just the Babylonian method by itself (with my specially tailored initial guess) it took my Surface 3 an average of 165 seconds (with 100% accuracy). Using just the logical tests in my algorithm (excluding the Babylonian), it took 127 seconds, it rejected 99% of all integers as non-Square without mistakenly rejecting any perfect squares. Of those integers that passed, only 3% were perfect Squares (a much higher density). Using the full algorithm above that employs both the logical tests and the Babylonian root extraction, we have 100% accuracy, and test completion in only 14 seconds. The first 100 Million integers takes roughly 2 minutes 45 seconds to test.
EDIT: I have been able to bring down the time further. I can now test the integers 0 to 100 Million in 1 minute 40 seconds. A lot of time is wasted checking the data type and the positivity. Eliminate the very first two checks and I cut the experiment down by a minute. One must assume the user is smart enough to know that negatives and floats are not perfect squares.
import math
def is_square(n):
sqrt = math.sqrt(n)
return (sqrt - int(sqrt)) == 0
A perfect square is a number that can be expressed as the product of two equal integers. math.sqrt(number) return a float. int(math.sqrt(number)) casts the outcome to int.
If the square root is an integer, like 3, for example, then math.sqrt(number) - int(math.sqrt(number)) will be 0, and the if statement will be False. If the square root was a real number like 3.2, then it will be True and print "it's not a perfect square".
It fails for a large non-square such as 152415789666209426002111556165263283035677490.
My answer is:
def is_square(x):
return x**.5 % 1 == 0
It basically does a square root, then modulo by 1 to strip the integer part and if the result is 0 return True otherwise return False. In this case x can be any large number, just not as large as the max float number that python can handle: 1.7976931348623157e+308
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
This can be solved using the decimal module to get arbitrary precision square roots and easy checks for "exactness":
import math
from decimal import localcontext, Context, Inexact
def is_perfect_square(x):
# If you want to allow negative squares, then set x = abs(x) instead
if x < 0:
return False
# Create localized, default context so flags and traps unset
with localcontext(Context()) as ctx:
# Set a precision sufficient to represent x exactly; `x or 1` avoids
# math domain error for log10 when x is 0
ctx.prec = math.ceil(math.log10(x or 1)) + 1 # Wrap ceil call in int() on Py2
# Compute integer square root; don't even store result, just setting flags
ctx.sqrt(x).to_integral_exact()
# If previous line couldn't represent square root as exact int, sets Inexact flag
return not ctx.flags[Inexact]
For demonstration with truly huge values:
# I just kept mashing the numpad for awhile :-)
>>> base = 100009991439393999999393939398348438492389402490289028439083249803434098349083490340934903498034098390834980349083490384903843908309390282930823940230932490340983098349032098324908324098339779438974879480379380439748093874970843479280329708324970832497804329783429874329873429870234987234978034297804329782349783249873249870234987034298703249780349783497832497823497823497803429780324
>>> sqr = base ** 2
>>> sqr ** 0.5 # Too large to use floating point math
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float
>>> is_perfect_power(sqr)
True
>>> is_perfect_power(sqr-1)
False
>>> is_perfect_power(sqr+1)
False
If you increase the size of the value being tested, this eventually gets rather slow (takes close to a second for a 200,000 bit square), but for more moderate numbers (say, 20,000 bits), it's still faster than a human would notice for individual values (~33 ms on my machine). But since speed wasn't your primary concern, this is a good way to do it with Python's standard libraries.
Of course, it would be much faster to use gmpy2 and just test gmpy2.mpz(x).is_square(), but if third party packages aren't your thing, the above works quite well.
I just posted a slight variation on some of the examples above on another thread (Finding perfect squares) and thought I'd include a slight variation of what I posted there here (using nsqrt as a temporary variable), in case it's of interest / use:
import math
def is_square(n):
if not (isinstance(n, int) and (n >= 0)):
return False
else:
nsqrt = math.sqrt(n)
return nsqrt == math.trunc(nsqrt)
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
A variant of #Alex Martelli's solution without set
When x in seen is True:
In most cases, it is the last one added, e.g. 1022 produces the x's sequence 511, 256, 129, 68, 41, 32, 31, 31;
In some cases (i.e., for the predecessors of perfect squares), it is the second-to-last one added, e.g. 1023 produces 511, 256, 129, 68, 41, 32, 31, 32.
Hence, it suffices to stop as soon as the current x is greater than or equal to the previous one:
def is_square(n):
assert n > 1
previous = n
x = n // 2
while x * x != n:
x = (x + (n // x)) // 2
if x >= previous:
return False
previous = x
return True
x = 12345678987654321234567 ** 2
assert not is_square(x-1)
assert is_square(x)
assert not is_square(x+1)
Equivalence with the original algorithm tested for 1 < n < 10**7. On the same interval, this slightly simpler variant is about 1.4 times faster.
This is my method:
def is_square(n) -> bool:
return int(n**0.5)**2 == int(n)
Take square root of number. Convert to integer. Take the square. If the numbers are equal, then it is a perfect square otherwise not.
It is incorrect for a large square such as 152415789666209426002111556165263283035677489.
If the modulus (remainder) leftover from dividing by the square root is 0, then it is a perfect square.
def is_square(num: int) -> bool:
return num % math.sqrt(num) == 0
I checked this against a list of perfect squares going up to 1000.
It is possible to improve the Babylonian method by observing that the successive terms form a decreasing sequence if one starts above the square root of n.
def is_square(n):
assert n > 1
a = n
b = (a + n // a) // 2
while b < a:
a = b
b = (a + n // a) // 2
return a * a == n
If it's a perfect square, its square root will be an integer, the fractional part will be 0, we can use modulus operator to check fractional part, and check if it's 0, it does fail for some numbers, so, for safety, we will also check if it's square of the square root even if the fractional part is 0.
import math
def isSquare(n):
root = math.sqrt(n)
if root % 1 == 0:
if int(root) * int(root) == n:
return True
return False
isSquare(4761)
You could binary-search for the rounded square root. Square the result to see if it matches the original value.
You're probably better off with FogleBirds answer - though beware, as floating point arithmetic is approximate, which can throw this approach off. You could in principle get a false positive from a large integer which is one more than a perfect square, for instance, due to lost precision.
A simple way to do it (faster than the second one) :
def is_square(n):
return str(n**(1/2)).split(".")[1] == '0'
Another way:
def is_square(n):
if n == 0:
return True
else:
if n % 2 == 0 :
for i in range(2,n,2):
if i*i == n:
return True
else :
for i in range(1,n,2):
if i*i == n:
return True
return False
This response doesn't pertain to your stated question, but to an implicit question I see in the code you posted, ie, "how to check if something is an integer?"
The first answer you'll generally get to that question is "Don't!" And it's true that in Python, typechecking is usually not the right thing to do.
For those rare exceptions, though, instead of looking for a decimal point in the string representation of the number, the thing to do is use the isinstance function:
>>> isinstance(5,int)
True
>>> isinstance(5.0,int)
False
Of course this applies to the variable rather than a value. If I wanted to determine whether the value was an integer, I'd do this:
>>> x=5.0
>>> round(x) == x
True
But as everyone else has covered in detail, there are floating-point issues to be considered in most non-toy examples of this kind of thing.
If you want to loop over a range and do something for every number that is NOT a perfect square, you could do something like this:
def non_squares(upper):
next_square = 0
diff = 1
for i in range(0, upper):
if i == next_square:
next_square += diff
diff += 2
continue
yield i
If you want to do something for every number that IS a perfect square, the generator is even easier:
(n * n for n in range(upper))
I think that this works and is very simple:
import math
def is_square(num):
sqrt = math.sqrt(num)
return sqrt == int(sqrt)
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
a=int(input('enter any number'))
flag=0
for i in range(1,a):
if a==i*i:
print(a,'is perfect square number')
flag=1
break
if flag==1:
pass
else:
print(a,'is not perfect square number')
In kotlin :
It's quite easy and it passed all test cases as well.
really thanks to >> https://www.quora.com/What-is-the-quickest-way-to-determine-if-a-number-is-a-perfect-square
fun isPerfectSquare(num: Int): Boolean {
var result = false
var sum=0L
var oddNumber=1L
while(sum<num){
sum = sum + oddNumber
oddNumber = oddNumber+2
}
result = sum == num.toLong()
return result
}
def isPerfectSquare(self, num: int) -> bool:
left, right = 0, num
while left <= right:
mid = (left + right) // 2
if mid**2 < num:
left = mid + 1
elif mid**2 > num:
right = mid - 1
else:
return True
return False
Decide how long the number will be.
take a delta 0.000000000000.......000001
see if the (sqrt(x))^2 - x is greater / equal /smaller than delta and decide based on the delta error.
import math
def is_square(n):
sqrt = math.sqrt(n)
return sqrt == int(sqrt)
It fails for a large non-square such as 152415789666209426002111556165263283035677490.
The idea is to run a loop from i = 1 to floor(sqrt(n)) then check if squaring it makes n.
bool isPerfectSquare(int n)
{
for (int i = 1; i * i <= n; i++) {
// If (i * i = n)
if ((n % i == 0) && (n / i == i)) {
return true;
}
}
return false;
}

16 bit hex into 14 bit signed int python?

I get a 16 bit Hex number (so 4 digits) from a sensor and want to convert it into a signed integer so I can actually use it.
There are plenty of codes on the internet that get the job done, but with this sensor it is a bit more arkward.
In fact, the number has only 14 bit, the first two (from the left) are irrelevant.
I tried to do it (in Python 3) but failed pretty hard.
Any suggestions how to "cut" the first two digits of the number and then make the rest a signed integer?
The Datasheet says, that E002 should be -8190 ane 1FFE should be +8190.
Thanks a lot!
Let's define a conversion function:
>>> def f(x):
... r = int(x, 16)
... return r if r < 2**15 else r - 2**16
...
Now, let's test the function against the values that the datahsheet provided:
>>> f('1FFE')
8190
>>> f('E002')
-8190
The usual convention for signed numbers is that a number is negative if the high bit is set and positive if it isn't. Following this convention, '0000' is zero and 'FFFF' is -1. The issue is that int assumes that a number is positive and we have to correct for that:
For any number equal to or less than 0x7FFF, then high bit is unset and the number is positive. Thus we return r=int(x,16) if r<2**15.
For any number r-int(x,16) that is equal to or greater than 0x8000, we return r - 2**16.
While your sensor may only produce 14-bin data, the manufacturer is following the standard convention for 16-bit integers.
Alternative
Instead of converting x to r and testing the value of r, we can directly test whether the high bit in x is set:
>>> def g(x):
... return int(x, 16) if x[0] in '01234567' else int(x, 16) - 2**16
...
>>> g('1FFE')
8190
>>> g('E002')
-8190
Ignoring the upper bits
Let's suppose that the manufacturer is not following standard conventions and that the upper 2-bits are unreliable. In this case, we can use modulo, %, to remove them and, after adjusting the other constants as appropriate for 14-bit integers, we have:
>>> def h(x):
... r = int(x, 16) % 2**14
... return r if r < 2**13 else r - 2**14
...
>>> h('1FFE')
8190
>>> h('E002')
-8190
There is a general algorithm for sign-extending a two's-complement integer value val whose number of bits is nbits (so that the top-most of those bits is the sign bit).
That algorithm is:
treat the value as a non-negative number, and if needed, mask off additional bits
invert the sign bit, still treating the result as a non-negative number
subtract the numeric value of the sign bit considered as a non-negative number, producing as a result, a signed number.
Expressing this algorithm in Python produces:
from __future__ import print_function
def sext(val, nbits):
assert nbits > 0
signbit = 1 << (nbits - 1)
mask = (1 << nbits) - 1
return ((val & mask) ^ signbit) - signbit
if __name__ == '__main__':
print('sext(0xe002, 14) =', sext(0xe002, 14))
print('sext(0x1ffe, 14) =', sext(0x1ffe, 14))
which when run shows the desired results:
sext(0xe002, 14) = -8190
sext(0x1ffe, 14) = 8190

When is hash(n) == n in Python?

I've been playing with Python's hash function. For small integers, it appears hash(n) == n always. However this does not extend to large numbers:
>>> hash(2**100) == 2**100
False
I'm not surprised, I understand hash takes a finite range of values. What is that range?
I tried using binary search to find the smallest number hash(n) != n
>>> import codejamhelpers # pip install codejamhelpers
>>> help(codejamhelpers.binary_search)
Help on function binary_search in module codejamhelpers.binary_search:
binary_search(f, t)
Given an increasing function :math:`f`, find the greatest non-negative integer :math:`n` such that :math:`f(n) \le t`. If :math:`f(n) > t` for all :math:`n \ge 0`, return None.
>>> f = lambda n: int(hash(n) != n)
>>> n = codejamhelpers.binary_search(f, 0)
>>> hash(n)
2305843009213693950
>>> hash(n+1)
0
What's special about 2305843009213693951? I note it's less than sys.maxsize == 9223372036854775807
Edit: I'm using Python 3. I ran the same binary search on Python 2 and got a different result 2147483648, which I note is sys.maxint+1
I also played with [hash(random.random()) for i in range(10**6)] to estimate the range of hash function. The max is consistently below n above. Comparing the min, it seems Python 3's hash is always positively valued, whereas Python 2's hash can take negative values.
2305843009213693951 is 2^61 - 1. It's the largest Mersenne prime that fits into 64 bits.
If you have to make a hash just by taking the value mod some number, then a large Mersenne prime is a good choice -- it's easy to compute and ensures an even distribution of possibilities. (Although I personally would never make a hash this way)
It's especially convenient to compute the modulus for floating point numbers. They have an exponential component that multiplies the whole number by 2^x. Since 2^61 = 1 mod 2^61-1, you only need to consider the (exponent) mod 61.
See: https://en.wikipedia.org/wiki/Mersenne_prime
Based on python documentation in pyhash.c file:
For numeric types, the hash of a number x is based on the reduction
of x modulo the prime P = 2**_PyHASH_BITS - 1. It's designed so that
hash(x) == hash(y) whenever x and y are numerically equal, even if
x and y have different types.
So for a 64/32 bit machine, the reduction would be 2 _PyHASH_BITS - 1, but what is _PyHASH_BITS?
You can find it in pyhash.h header file which for a 64 bit machine has been defined as 61 (you can read more explanation in pyconfig.h file).
#if SIZEOF_VOID_P >= 8
# define _PyHASH_BITS 61
#else
# define _PyHASH_BITS 31
#endif
So first off all it's based on your platform for example in my 64bit Linux platform the reduction is 261-1, which is 2305843009213693951:
>>> 2**61 - 1
2305843009213693951
Also You can use math.frexp in order to get the mantissa and exponent of sys.maxint which for a 64 bit machine shows that max int is 263:
>>> import math
>>> math.frexp(sys.maxint)
(0.5, 64)
And you can see the difference by a simple test:
>>> hash(2**62) == 2**62
True
>>> hash(2**63) == 2**63
False
Read the complete documentation about python hashing algorithm https://github.com/python/cpython/blob/master/Python/pyhash.c#L34
As mentioned in comment you can use sys.hash_info (in python 3.X) which will give you a struct sequence of parameters used for computing
hashes.
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>>
Alongside the modulus that I've described in preceding lines, you can also get the inf value as following:
>>> hash(float('inf'))
314159
>>> sys.hash_info.inf
314159
Hash function returns plain int that means that returned value is greater than -sys.maxint and lower than sys.maxint, which means if you pass sys.maxint + x to it result would be -sys.maxint + (x - 2).
hash(sys.maxint + 1) == sys.maxint + 1 # False
hash(sys.maxint + 1) == - sys.maxint -1 # True
hash(sys.maxint + sys.maxint) == -sys.maxint + sys.maxint - 2 # True
Meanwhile 2**200 is a n times greater than sys.maxint - my guess is that hash would go over range -sys.maxint..+sys.maxint n times until it stops on plain integer in that range, like in code snippets above..
So generally, for any n <= sys.maxint:
hash(sys.maxint*n) == -sys.maxint*(n%2) + 2*(n%2)*sys.maxint - n/2 - (n + 1)%2 ## True
Note: this is true for python 2.
The implementation for the int type in cpython can be found here.
It just returns the value, except for -1, than it returns -2:
static long
int_hash(PyIntObject *v)
{
/* XXX If this is changed, you also need to change the way
Python's long, float and complex types are hashed. */
long x = v -> ob_ival;
if (x == -1)
x = -2;
return x;
}

How to check if a float value is a whole number

I am trying to find the largest cube root that is a whole number, that is less than 12,000.
processing = True
n = 12000
while processing:
n -= 1
if n ** (1/3) == #checks to see if this has decimals or not
I am not sure how to check if it is a whole number or not though! I could convert it to a string then use indexing to check the end values and see whether they are zero or not, that seems rather cumbersome though. Is there a simpler way?
To check if a float value is a whole number, use the float.is_integer() method:
>>> (1.0).is_integer()
True
>>> (1.555).is_integer()
False
The method was added to the float type in Python 2.6.
Take into account that in Python 2, 1/3 is 0 (floor division for integer operands!), and that floating point arithmetic can be imprecise (a float is an approximation using binary fractions, not a precise real number). But adjusting your loop a little this gives:
>>> for n in range(12000, -1, -1):
... if (n ** (1.0/3)).is_integer():
... print n
...
27
8
1
0
which means that anything over 3 cubed, (including 10648) was missed out due to the aforementioned imprecision:
>>> (4**3) ** (1.0/3)
3.9999999999999996
>>> 10648 ** (1.0/3)
21.999999999999996
You'd have to check for numbers close to the whole number instead, or not use float() to find your number. Like rounding down the cube root of 12000:
>>> int(12000 ** (1.0/3))
22
>>> 22 ** 3
10648
If you are using Python 3.5 or newer, you can use the math.isclose() function to see if a floating point value is within a configurable margin:
>>> from math import isclose
>>> isclose((4**3) ** (1.0/3), 4)
True
>>> isclose(10648 ** (1.0/3), 22)
True
For older versions, the naive implementation of that function (skipping error checking and ignoring infinity and NaN) as mentioned in PEP485:
def isclose(a, b, rel_tol=1e-9, abs_tol=0.0):
return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
We can use the modulo (%) operator. This tells us how many remainders we have when we divide x by y - expresses as x % y. Every whole number must divide by 1, so if there is a remainder, it must not be a whole number.
This function will return a boolean, True or False, depending on whether n is a whole number.
def is_whole(n):
return n % 1 == 0
You could use this:
if k == int(k):
print(str(k) + " is a whole number!")
You don't need to loop or to check anything. Just take a cube root of 12,000 and round it down:
r = int(12000**(1/3.0))
print r*r*r # 10648
You can use a modulo operation for that.
if (n ** (1.0/3)) % 1 != 0:
print("We have a decimal number here!")
How about
if x%1==0:
print "is integer"
Wouldn't it be easier to test the cube roots? Start with 20 (20**3 = 8000) and go up to 30 (30**3 = 27000). Then you have to test fewer than 10 integers.
for i in range(20, 30):
print("Trying {0}".format(i))
if i ** 3 > 12000:
print("Maximum integral cube root less than 12000: {0}".format(i - 1))
break
The above answers work for many cases but they miss some. Consider the following:
fl = sum([0.1]*10) # this is 0.9999999999999999, but we want to say it IS an int
Using this as a benchmark, some of the other suggestions don't get the behavior we might want:
fl.is_integer() # False
fl % 1 == 0 # False
Instead try:
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
def is_integer(fl):
return isclose(fl, round(fl))
now we get:
is_integer(fl) # True
isclose comes with Python 3.5+, and for other Python's you can use this mostly equivalent definition (as mentioned in the corresponding PEP)
Just a side info, is_integer is doing internally:
import math
isInteger = (math.floor(x) == x)
Not exactly in python, but the cpython implementation is implemented as mentioned above.
All the answers are good but a sure fire method would be
def whole (n):
return (n*10)%10==0
The function returns True if it's a whole number else False....I know I'm a bit late but here's one of the interesting methods which I made...
Edit: as stated by the comment below, a cheaper equivalent test would be:
def whole(n):
return n%1==0
You can use something like:
num = 1.9899
bool(int(num)-num)
#returns True
if it is True, It means it holds some value, hence not a whole number. Else
num = 1.0
bool(int(num)-num)
# returns False
>>> def is_near_integer(n, precision=8, get_integer=False):
... if get_integer:
... return int(round(n, precision))
... else:
... return round(n) == round(n, precision)
...
>>> print(is_near_integer(10648 ** (1.0/3)))
True
>>> print(is_near_integer(10648 ** (1.0/3), get_integer=True))
22
>>> for i in [4.9, 5.1, 4.99, 5.01, 4.999, 5.001, 4.9999, 5.0001, 4.99999, 5.000
01, 4.999999, 5.000001]:
... print(i, is_near_integer(i, 4))
...
4.9 False
5.1 False
4.99 False
5.01 False
4.999 False
5.001 False
4.9999 False
5.0001 False
4.99999 True
5.00001 True
4.999999 True
5.000001 True
>>>
This problem has been solved, but I would like to propose an additional mathematical-based solution for funcies.
The benefit of this approach is that it calculates the whole number part of your number, which may be beneficial depending on your general task.
Algorithm:
Decompose whole number part of your number its a sum of its decimals (e.g., 327=3*100+2*10+7*1)
take difference between calculated whole number and number itself
decide whether difference is close enough to be considered an integer.
from math import ceil, log, isclose
def is_whole(x: float) -> bool:
n_digits = ceil(log(x,10)) # number of digits of decimals at or above ones
digits = [(n//(10**i))%10 for i in range(n_digits)] # parse digits of `x` at or above ones decimal
whole = 0 # will equal the whole number part of `x`
for i in range(n_digits):
decimal = 10**i
digit = digits[i]
whole += digit*decimal
diff = whole - x
return isclose(diff, 0.0)
NOTE: the idea of parsing digits of a number was realized from here
Try using:
int(val) == val
It will give lot more precision than any other methods.
You can use the round function to compute the value.
Yes in python as many have pointed when we compute the value of a cube root, it will give you an output with a little bit of error. To check if the value is a whole number you can use the following function:
def cube_integer(n):
if round(n**(1.0/3.0))**3 == n:
return True
return False
But remember that int(n) is equivalent to math.floor and because of this if you find the int(41063625**(1.0/3.0)) you will get 344 instead of 345.
So please be careful when using int withe cube roots.

Check if a number is a perfect square

How could I check if a number is a perfect square?
Speed is of no concern, for now, just working.
See also: Integer square root in python.
The problem with relying on any floating point computation (math.sqrt(x), or x**0.5) is that you can't really be sure it's exact (for sufficiently large integers x, it won't be, and might even overflow). Fortunately (if one's in no hurry;-) there are many pure integer approaches, such as the following...:
def is_square(apositiveint):
x = apositiveint // 2
seen = set([x])
while x * x != apositiveint:
x = (x + (apositiveint // x)) // 2
if x in seen: return False
seen.add(x)
return True
for i in range(110, 130):
print i, is_square(i)
Hint: it's based on the "Babylonian algorithm" for square root, see wikipedia. It does work for any positive number for which you have enough memory for the computation to proceed to completion;-).
Edit: let's see an example...
x = 12345678987654321234567 ** 2
for i in range(x, x+2):
print i, is_square(i)
this prints, as desired (and in a reasonable amount of time, too;-):
152415789666209426002111556165263283035677489 True
152415789666209426002111556165263283035677490 False
Please, before you propose solutions based on floating point intermediate results, make sure they work correctly on this simple example -- it's not that hard (you just need a few extra checks in case the sqrt computed is a little off), just takes a bit of care.
And then try with x**7 and find clever way to work around the problem you'll get,
OverflowError: long int too large to convert to float
you'll have to get more and more clever as the numbers keep growing, of course.
If I was in a hurry, of course, I'd use gmpy -- but then, I'm clearly biased;-).
>>> import gmpy
>>> gmpy.is_square(x**7)
1
>>> gmpy.is_square(x**7 + 1)
0
Yeah, I know, that's just so easy it feels like cheating (a bit the way I feel towards Python in general;-) -- no cleverness at all, just perfect directness and simplicity (and, in the case of gmpy, sheer speed;-)...
Use Newton's method to quickly zero in on the nearest integer square root, then square it and see if it's your number. See isqrt.
Python ≥ 3.8 has math.isqrt. If using an older version of Python, look for the "def isqrt(n)" implementation here.
import math
def is_square(i: int) -> bool:
return i == math.isqrt(i) ** 2
Since you can never depend on exact comparisons when dealing with floating point computations (such as these ways of calculating the square root), a less error-prone implementation would be
import math
def is_square(integer):
root = math.sqrt(integer)
return integer == int(root + 0.5) ** 2
Imagine integer is 9. math.sqrt(9) could be 3.0, but it could also be something like 2.99999 or 3.00001, so squaring the result right off isn't reliable. Knowing that int takes the floor value, increasing the float value by 0.5 first means we'll get the value we're looking for if we're in a range where float still has a fine enough resolution to represent numbers near the one for which we are looking.
If youre interested, I have a pure-math response to a similar question at math stackexchange, "Detecting perfect squares faster than by extracting square root".
My own implementation of isSquare(n) may not be the best, but I like it. Took me several months of study in math theory, digital computation and python programming, comparing myself to other contributors, etc., to really click with this method. I like its simplicity and efficiency though. I havent seen better. Tell me what you think.
def isSquare(n):
## Trivial checks
if type(n) != int: ## integer
return False
if n < 0: ## positivity
return False
if n == 0: ## 0 pass
return True
## Reduction by powers of 4 with bit-logic
while n&3 == 0:
n=n>>2
## Simple bit-logic test. All perfect squares, in binary,
## end in 001, when powers of 4 are factored out.
if n&7 != 1:
return False
if n==1:
return True ## is power of 4, or even power of 2
## Simple modulo equivalency test
c = n%10
if c in {3, 7}:
return False ## Not 1,4,5,6,9 in mod 10
if n % 7 in {3, 5, 6}:
return False ## Not 1,2,4 mod 7
if n % 9 in {2,3,5,6,8}:
return False
if n % 13 in {2,5,6,7,8,11}:
return False
## Other patterns
if c == 5: ## if it ends in a 5
if (n//10)%10 != 2:
return False ## then it must end in 25
if (n//100)%10 not in {0,2,6}:
return False ## and in 025, 225, or 625
if (n//100)%10 == 6:
if (n//1000)%10 not in {0,5}:
return False ## that is, 0625 or 5625
else:
if (n//10)%4 != 0:
return False ## (4k)*10 + (1,9)
## Babylonian Algorithm. Finding the integer square root.
## Root extraction.
s = (len(str(n))-1) // 2
x = (10**s) * 4
A = {x, n}
while x * x != n:
x = (x + (n // x)) >> 1
if x in A:
return False
A.add(x)
return True
Pretty straight forward. First it checks that we have an integer, and a positive one at that. Otherwise there is no point. It lets 0 slip through as True (necessary or else next block is infinite loop).
The next block of code systematically removes powers of 4 in a very fast sub-algorithm using bit shift and bit logic operations. We ultimately are not finding the isSquare of our original n but of a k<n that has been scaled down by powers of 4, if possible. This reduces the size of the number we are working with and really speeds up the Babylonian method, but also makes other checks faster too.
The third block of code performs a simple Boolean bit-logic test. The least significant three digits, in binary, of any perfect square are 001. Always. Save for leading zeros resulting from powers of 4, anyway, which has already been accounted for. If it fails the test, you immediately know it isnt a square. If it passes, you cant be sure.
Also, if we end up with a 1 for a test value then the test number was originally a power of 4, including perhaps 1 itself.
Like the third block, the fourth tests the ones-place value in decimal using simple modulus operator, and tends to catch values that slip through the previous test. Also a mod 7, mod 8, mod 9, and mod 13 test.
The fifth block of code checks for some of the well-known perfect square patterns. Numbers ending in 1 or 9 are preceded by a multiple of four. And numbers ending in 5 must end in 5625, 0625, 225, or 025. I had included others but realized they were redundant or never actually used.
Lastly, the sixth block of code resembles very much what the top answerer - Alex Martelli - answer is. Basically finds the square root using the ancient Babylonian algorithm, but restricting it to integer values while ignoring floating point. Done both for speed and extending the magnitudes of values that are testable. I used sets instead of lists because it takes far less time, I used bit shifts instead of division by two, and I smartly chose an initial start value much more efficiently.
By the way, I did test Alex Martelli's recommended test number, as well as a few numbers many orders magnitude larger, such as:
x=1000199838770766116385386300483414671297203029840113913153824086810909168246772838680374612768821282446322068401699727842499994541063844393713189701844134801239504543830737724442006577672181059194558045164589783791764790043104263404683317158624270845302200548606715007310112016456397357027095564872551184907513312382763025454118825703090010401842892088063527451562032322039937924274426211671442740679624285180817682659081248396873230975882215128049713559849427311798959652681930663843994067353808298002406164092996533923220683447265882968239141724624870704231013642255563984374257471112743917655991279898690480703935007493906644744151022265929975993911186879561257100479593516979735117799410600147341193819147290056586421994333004992422258618475766549646258761885662783430625 ** 2
for i in range(x, x+2):
print(i, isSquare(i))
printed the following results:
1000399717477066534083185452789672211951514938424998708930175541558932213310056978758103599452364409903384901149641614494249195605016959576235097480592396214296565598519295693079257885246632306201885850365687426564365813280963724310434494316592041592681626416195491751015907716210235352495422858432792668507052756279908951163972960239286719854867504108121432187033786444937064356645218196398775923710931242852937602515835035177768967470757847368349565128635934683294155947532322786360581473152034468071184081729335560769488880138928479829695277968766082973795720937033019047838250608170693879209655321034310764422462828792636246742456408134706264621790736361118589122797268261542115823201538743148116654378511916000714911467547209475246784887830649309238110794938892491396597873160778553131774466638923135932135417900066903068192088883207721545109720968467560224268563643820599665232314256575428214983451466488658896488012211237139254674708538347237589290497713613898546363590044902791724541048198769085430459186735166233549186115282574626012296888817453914112423361525305960060329430234696000121420787598967383958525670258016851764034555105019265380321048686563527396844220047826436035333266263375049097675787975100014823583097518824871586828195368306649956481108708929669583308777347960115138098217676704862934389659753628861667169905594181756523762369645897154232744410732552956489694024357481100742138381514396851789639339362228442689184910464071202445106084939268067445115601375050153663645294106475257440167535462278022649865332161044187890625 True
1000399717477066534083185452789672211951514938424998708930175541558932213310056978758103599452364409903384901149641614494249195605016959576235097480592396214296565598519295693079257885246632306201885850365687426564365813280963724310434494316592041592681626416195491751015907716210235352495422858432792668507052756279908951163972960239286719854867504108121432187033786444937064356645218196398775923710931242852937602515835035177768967470757847368349565128635934683294155947532322786360581473152034468071184081729335560769488880138928479829695277968766082973795720937033019047838250608170693879209655321034310764422462828792636246742456408134706264621790736361118589122797268261542115823201538743148116654378511916000714911467547209475246784887830649309238110794938892491396597873160778553131774466638923135932135417900066903068192088883207721545109720968467560224268563643820599665232314256575428214983451466488658896488012211237139254674708538347237589290497713613898546363590044902791724541048198769085430459186735166233549186115282574626012296888817453914112423361525305960060329430234696000121420787598967383958525670258016851764034555105019265380321048686563527396844220047826436035333266263375049097675787975100014823583097518824871586828195368306649956481108708929669583308777347960115138098217676704862934389659753628861667169905594181756523762369645897154232744410732552956489694024357481100742138381514396851789639339362228442689184910464071202445106084939268067445115601375050153663645294106475257440167535462278022649865332161044187890626 False
And it did this in 0.33 seconds.
In my opinion, my algorithm works the same as Alex Martelli's, with all the benefits thereof, but has the added benefit highly efficient simple-test rejections that save a lot of time, not to mention the reduction in size of test numbers by powers of 4, which improves speed, efficiency, accuracy and the size of numbers that are testable. Probably especially true in non-Python implementations.
Roughly 99% of all integers are rejected as non-Square before Babylonian root extraction is even implemented, and in 2/3 the time it would take the Babylonian to reject the integer. And though these tests dont speed up the process that significantly, the reduction in all test numbers to an odd by dividing out all powers of 4 really accelerates the Babylonian test.
I did a time comparison test. I tested all integers from 1 to 10 Million in succession. Using just the Babylonian method by itself (with my specially tailored initial guess) it took my Surface 3 an average of 165 seconds (with 100% accuracy). Using just the logical tests in my algorithm (excluding the Babylonian), it took 127 seconds, it rejected 99% of all integers as non-Square without mistakenly rejecting any perfect squares. Of those integers that passed, only 3% were perfect Squares (a much higher density). Using the full algorithm above that employs both the logical tests and the Babylonian root extraction, we have 100% accuracy, and test completion in only 14 seconds. The first 100 Million integers takes roughly 2 minutes 45 seconds to test.
EDIT: I have been able to bring down the time further. I can now test the integers 0 to 100 Million in 1 minute 40 seconds. A lot of time is wasted checking the data type and the positivity. Eliminate the very first two checks and I cut the experiment down by a minute. One must assume the user is smart enough to know that negatives and floats are not perfect squares.
import math
def is_square(n):
sqrt = math.sqrt(n)
return (sqrt - int(sqrt)) == 0
A perfect square is a number that can be expressed as the product of two equal integers. math.sqrt(number) return a float. int(math.sqrt(number)) casts the outcome to int.
If the square root is an integer, like 3, for example, then math.sqrt(number) - int(math.sqrt(number)) will be 0, and the if statement will be False. If the square root was a real number like 3.2, then it will be True and print "it's not a perfect square".
It fails for a large non-square such as 152415789666209426002111556165263283035677490.
My answer is:
def is_square(x):
return x**.5 % 1 == 0
It basically does a square root, then modulo by 1 to strip the integer part and if the result is 0 return True otherwise return False. In this case x can be any large number, just not as large as the max float number that python can handle: 1.7976931348623157e+308
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
This can be solved using the decimal module to get arbitrary precision square roots and easy checks for "exactness":
import math
from decimal import localcontext, Context, Inexact
def is_perfect_square(x):
# If you want to allow negative squares, then set x = abs(x) instead
if x < 0:
return False
# Create localized, default context so flags and traps unset
with localcontext(Context()) as ctx:
# Set a precision sufficient to represent x exactly; `x or 1` avoids
# math domain error for log10 when x is 0
ctx.prec = math.ceil(math.log10(x or 1)) + 1 # Wrap ceil call in int() on Py2
# Compute integer square root; don't even store result, just setting flags
ctx.sqrt(x).to_integral_exact()
# If previous line couldn't represent square root as exact int, sets Inexact flag
return not ctx.flags[Inexact]
For demonstration with truly huge values:
# I just kept mashing the numpad for awhile :-)
>>> base = 100009991439393999999393939398348438492389402490289028439083249803434098349083490340934903498034098390834980349083490384903843908309390282930823940230932490340983098349032098324908324098339779438974879480379380439748093874970843479280329708324970832497804329783429874329873429870234987234978034297804329782349783249873249870234987034298703249780349783497832497823497823497803429780324
>>> sqr = base ** 2
>>> sqr ** 0.5 # Too large to use floating point math
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float
>>> is_perfect_power(sqr)
True
>>> is_perfect_power(sqr-1)
False
>>> is_perfect_power(sqr+1)
False
If you increase the size of the value being tested, this eventually gets rather slow (takes close to a second for a 200,000 bit square), but for more moderate numbers (say, 20,000 bits), it's still faster than a human would notice for individual values (~33 ms on my machine). But since speed wasn't your primary concern, this is a good way to do it with Python's standard libraries.
Of course, it would be much faster to use gmpy2 and just test gmpy2.mpz(x).is_square(), but if third party packages aren't your thing, the above works quite well.
I just posted a slight variation on some of the examples above on another thread (Finding perfect squares) and thought I'd include a slight variation of what I posted there here (using nsqrt as a temporary variable), in case it's of interest / use:
import math
def is_square(n):
if not (isinstance(n, int) and (n >= 0)):
return False
else:
nsqrt = math.sqrt(n)
return nsqrt == math.trunc(nsqrt)
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
A variant of #Alex Martelli's solution without set
When x in seen is True:
In most cases, it is the last one added, e.g. 1022 produces the x's sequence 511, 256, 129, 68, 41, 32, 31, 31;
In some cases (i.e., for the predecessors of perfect squares), it is the second-to-last one added, e.g. 1023 produces 511, 256, 129, 68, 41, 32, 31, 32.
Hence, it suffices to stop as soon as the current x is greater than or equal to the previous one:
def is_square(n):
assert n > 1
previous = n
x = n // 2
while x * x != n:
x = (x + (n // x)) // 2
if x >= previous:
return False
previous = x
return True
x = 12345678987654321234567 ** 2
assert not is_square(x-1)
assert is_square(x)
assert not is_square(x+1)
Equivalence with the original algorithm tested for 1 < n < 10**7. On the same interval, this slightly simpler variant is about 1.4 times faster.
This is my method:
def is_square(n) -> bool:
return int(n**0.5)**2 == int(n)
Take square root of number. Convert to integer. Take the square. If the numbers are equal, then it is a perfect square otherwise not.
It is incorrect for a large square such as 152415789666209426002111556165263283035677489.
If the modulus (remainder) leftover from dividing by the square root is 0, then it is a perfect square.
def is_square(num: int) -> bool:
return num % math.sqrt(num) == 0
I checked this against a list of perfect squares going up to 1000.
It is possible to improve the Babylonian method by observing that the successive terms form a decreasing sequence if one starts above the square root of n.
def is_square(n):
assert n > 1
a = n
b = (a + n // a) // 2
while b < a:
a = b
b = (a + n // a) // 2
return a * a == n
If it's a perfect square, its square root will be an integer, the fractional part will be 0, we can use modulus operator to check fractional part, and check if it's 0, it does fail for some numbers, so, for safety, we will also check if it's square of the square root even if the fractional part is 0.
import math
def isSquare(n):
root = math.sqrt(n)
if root % 1 == 0:
if int(root) * int(root) == n:
return True
return False
isSquare(4761)
You could binary-search for the rounded square root. Square the result to see if it matches the original value.
You're probably better off with FogleBirds answer - though beware, as floating point arithmetic is approximate, which can throw this approach off. You could in principle get a false positive from a large integer which is one more than a perfect square, for instance, due to lost precision.
A simple way to do it (faster than the second one) :
def is_square(n):
return str(n**(1/2)).split(".")[1] == '0'
Another way:
def is_square(n):
if n == 0:
return True
else:
if n % 2 == 0 :
for i in range(2,n,2):
if i*i == n:
return True
else :
for i in range(1,n,2):
if i*i == n:
return True
return False
This response doesn't pertain to your stated question, but to an implicit question I see in the code you posted, ie, "how to check if something is an integer?"
The first answer you'll generally get to that question is "Don't!" And it's true that in Python, typechecking is usually not the right thing to do.
For those rare exceptions, though, instead of looking for a decimal point in the string representation of the number, the thing to do is use the isinstance function:
>>> isinstance(5,int)
True
>>> isinstance(5.0,int)
False
Of course this applies to the variable rather than a value. If I wanted to determine whether the value was an integer, I'd do this:
>>> x=5.0
>>> round(x) == x
True
But as everyone else has covered in detail, there are floating-point issues to be considered in most non-toy examples of this kind of thing.
If you want to loop over a range and do something for every number that is NOT a perfect square, you could do something like this:
def non_squares(upper):
next_square = 0
diff = 1
for i in range(0, upper):
if i == next_square:
next_square += diff
diff += 2
continue
yield i
If you want to do something for every number that IS a perfect square, the generator is even easier:
(n * n for n in range(upper))
I think that this works and is very simple:
import math
def is_square(num):
sqrt = math.sqrt(num)
return sqrt == int(sqrt)
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
a=int(input('enter any number'))
flag=0
for i in range(1,a):
if a==i*i:
print(a,'is perfect square number')
flag=1
break
if flag==1:
pass
else:
print(a,'is not perfect square number')
In kotlin :
It's quite easy and it passed all test cases as well.
really thanks to >> https://www.quora.com/What-is-the-quickest-way-to-determine-if-a-number-is-a-perfect-square
fun isPerfectSquare(num: Int): Boolean {
var result = false
var sum=0L
var oddNumber=1L
while(sum<num){
sum = sum + oddNumber
oddNumber = oddNumber+2
}
result = sum == num.toLong()
return result
}
def isPerfectSquare(self, num: int) -> bool:
left, right = 0, num
while left <= right:
mid = (left + right) // 2
if mid**2 < num:
left = mid + 1
elif mid**2 > num:
right = mid - 1
else:
return True
return False
This is an elegant, simple, fast and arbitrary solution that works for Python version >= 3.8:
from math import isqrt
def is_square(number):
if number >= 0:
return isqrt(number) ** 2 == number
return False
Decide how long the number will be.
take a delta 0.000000000000.......000001
see if the (sqrt(x))^2 - x is greater / equal /smaller than delta and decide based on the delta error.
import math
def is_square(n):
sqrt = math.sqrt(n)
return sqrt == int(sqrt)
It fails for a large non-square such as 152415789666209426002111556165263283035677490.
The idea is to run a loop from i = 1 to floor(sqrt(n)) then check if squaring it makes n.
bool isPerfectSquare(int n)
{
for (int i = 1; i * i <= n; i++) {
// If (i * i = n)
if ((n % i == 0) && (n / i == i)) {
return true;
}
}
return false;
}

Categories