How to fix an int overflow? - Python

I have Python code that uses the number 2637268776 (bigger than sys.maxint on 32-bit systems), so it is stored as a long.
I'm using bindings to a C++ framework in my code, and in one case the number gets converted to int32, resulting in an int32 overflow:
2637268776 --> -1657698520
In my case this can happen only once, so it's safe to assume that if the integer is negative, a single int overflow occurred. How can I mathematically reverse the conversion?

In short, you can't. There are many long integers that would map to the same negative number. In your example, these are 2637268776L, 6932236072L, 11227203368L, 15522170664L, 19817137960L etc.
Also, it is possible to get a positive number as a result of such an overflow. For example, 4294967297L would map to 1.
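To see the many-to-one mapping concretely, here is a minimal sketch (to_int32 is a hypothetical helper that emulates the C conversion by keeping only the low 32 bits):
def to_int32(n):
    # hypothetical helper: emulate C int32 truncation
    n &= 0xFFFFFFFF                               # keep the low 32 bits
    return n - 0x100000000 if n >= 0x80000000 else n

print(to_int32(2637268776))   # -1657698520
print(to_int32(6932236072))   # -1657698520
print(to_int32(11227203368))  # -1657698520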

You could add 2 * (sys.maxint + 1) to it:
>>> -1657698520 + (2 * (sys.maxint + 1))
2637268776L
but that only works for original values < 2 * (sys.maxint + 1); beyond that, the overflow will run into positive numbers or, worse, overflow again.
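On a 32-bit build, 2 * (sys.maxint + 1) is exactly 2**32, so the correction can be wrapped in a small guard (undo_single_overflow is a hypothetical helper, valid only under the single-overflow assumption above):
def undo_single_overflow(n):
    # hypothetical helper: assumes the original value was non-negative and below 2**32
    return n + 2**32 if n < 0 else n

print(undo_single_overflow(-1657698520))  # 2637268776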


Answer becomes infinite [duplicate]

I am working on a program that outputs the condition number of a big matrix, so I used the power method to get the largest eigenvalue. But the values are large floats (larger than 1*10^310), and in the end they become "Infinity". I tried the decimal module, but got the same result. How can I store those large float values? Or is there another method that works with smaller values?
(I'm not allowed to use any module that explicitly helps with the process, such as NumPy.)
You want to use the decimal module:
from decimal import Decimal
x = Decimal('1.345e1310')
y = Decimal('1.0e1310')
print(x + y)
Result:
2.345E+1310
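For contrast, a plain float of the question's magnitude overflows to infinity, which is exactly the failure described in the question:
>>> 1.345e310
inf
>>> Decimal('1.345e310')
Decimal('1.345E+310')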
Don't work with floating point values if you can help it; they are very difficult to reason about and will bite you!
Whenever you are trying to work with floats, especially ones with many digits, consider how you can shift them into an integer range, and whether you have invalid or needless accuracy beyond the floating part of your value. Two options:
- scale up into a bigger int, such as 10**400 or 10**100000, which should provide plenty of room for your floating-point digits while letting you work in integer space
- directly convert or scale down, discarding digits beyond the decimal point (consider how accurate the measurement really is)
>>> int(1.0 * 10) * 10**999 # divide off 10**690 later or note in units
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
>>> int(1.0 * 10**10) # multiply by 10**300 later or note in units
10000000000
Practically, this is why you would want scientific notation: don't store the data with all its digits if you don't need them; keep the smallest amount you need plus a second multiplier for the size factor (scientific notation does use a float, but the idea is the same for integers).
Then, rather than working with floats, you can recall the multiplier(s) at the end when you're done with your math (even multiplying them out separately).
It may even be sufficient to remove a significant portion of the digits entirely in some regular manner, and report the factor in the post-calculation units for whoever or whatever is consuming the data.
While this question is about large numbers, even decimal.Decimal unfortunately does not handle the small (fractional) bits of floats the way one might expect, as they're subject to aliasing from how they're stored:
https://en.wikipedia.org/wiki/Floating-point_arithmetic#IEEE_754:_floating_point_in_modern_computers
This is problematic with normal Python floats, and so extends to Decimals built from them, even at sizes you'd expect to see in normal use!
>>> 9007199254740993.0
9007199254740992.0
>>> Decimal(9007199254740993.0) # NOTE converted to float before Decimal
Decimal('9007199254740992')
Adapted from Which is the first integer that an IEEE 754 float is incapable of representing exactly?
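Constructing the Decimal from a string sidesteps the float round-trip entirely:
>>> Decimal('9007199254740993')
Decimal('9007199254740993')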
Example for the original question:
>>> a = Decimal(10**310) * Decimal(1.0)
>>> b = Decimal(1)
>>> a + b - a
Decimal('0E+283')
Further examples
>>> a = Decimal(10**310)
>>> b = Decimal(0.1)
>>> a + b - a
Decimal('0')
>>> a
Decimal('10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000')
>>> b
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> 10**-100
1e-100
>>> Decimal(10**-100)
Decimal('1.00000000000000001999189980260288361964776078853415942018260300593659569925554346761767628861329298958274607481091185079852827053974965402226843604196126360835628314127871794272492894246908066589163059300043457860230145025079449986855914338755579873208034769049845635890960693359375E-100')
>>> 10**-1000
0.0
>>> Decimal(10**-1000)
Decimal('0')
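The cancellation in the a + b - a examples comes from the context's default 28 significant digits, not from the Decimal type itself; raising the precision past the number of digits in a makes the sum exact (a sketch using the default context object):
>>> from decimal import Decimal, getcontext
>>> getcontext().prec = 320  # 10**310 has 311 digits, so this covers the units place
>>> a = Decimal(10**310)
>>> b = Decimal(1)
>>> a + b - a
Decimal('1')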

Problem handling large numbers in Python

I was solving a problem on Codeforces (here is the question). I wrote Python code to solve it:
n=int(input())
print(0 if ((n*(n+1))/2)%2==0 else 1)
But it failed for the test case 1999999997 (see submission, test case 6).
Why did it fail, given that Python can handle large numbers effectively? [See this thread]
Also, the same logic worked flawlessly when I coded it in C++ [see submission here]:
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;
    long long int sum = 1ll * (n * (n + 1)) / 2;
    if (sum % 2 == 0) cout << 0;
    else cout << 1;
    return 0;
}
Ran a test based on the insight from @juanpa.arrivillaga, and this has been a great rabbit hole:
n = 1999999997
temp = n * (n + 1)
# type(temp) is int; temp is 3999999990000000006. We can clearly see that after
# dividing by 2 we should get an odd number, and therefore output 1.
divided = temp / 2
# type(divided) is float. Printing divided gives 1.999999995e+18
# divided % 2 is 0
divided_int = temp // 2
# type(divided_int) is int. Printing divided_int gives 1999999995000000003
# // forces integer division and always returns an integer: 7 // 2 equals 3, not 3.5
As per the other answer you have linked, the int type in Python can handle very large numbers.
Floats can also handle large numbers, but there are issues with how precisely we can represent them across languages. The crux of it is that not all values can be captured exactly: in many scenarios the difference between 1.999999995e+18 and 1.999999995000000003e+18 is so minute it won't matter, but this is a scenario where it does, because you care about the parity of the final digit of the number.
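To make the rounding concrete (CPython 3; at this magnitude the nearest representable double is a multiple of 256, so the trailing ...003 is lost):
>>> 3999999990000000006 / 2
1.999999995e+18
>>> int(3999999990000000006 / 2)
1999999995000000000
>>> 3999999990000000006 // 2
1999999995000000003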
You can learn more about this by watching this video
As mentioned by @juanpa.arrivillaga and @DarrylG in the comments, I should have used the floor-division operator // for integer division; the anomaly was caused by float division with the / operator.
So, the correct code should be:
n=int(input())
print(0 if (n*(n+1)//2)%2==0 else 1)
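A quick check against the failing test case confirms the fix:
>>> n = 1999999997
>>> (n * (n + 1) // 2) % 2
1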

How to adjust the numerical precision for integers

I'm trying to work with big numbers in R; in my opinion they aren't even that big. I asked R to return the remainder of dividing 6001532020609003100 by 97 and got the answer 1; doing the same calculation in Python I got the answer 66.
Can someone tell me what's going on?
R doesn't have the same kind of "magic" arbitrary-length integers that Python does: its base integer type is 32-bit, which maxes out at .Machine$integer.max == 2147483647. When confronted with a number greater than this value, R automatically converts it to double-precision floating point, and then the %% operator gets messed up by floating-point imprecision. (If you try to insist that the input is an integer by entering 6001532020609003100L (L indicates integer), R still converts it to float, but warns you...)
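You can reproduce both answers from Python by forcing the same double conversion R performs; the nearest IEEE double to the input is a different integer, and that integer happens to leave remainder 1:
>>> 6001532020609003100 % 97           # exact, arbitrary-precision int
66
>>> int(float(6001532020609003100))    # the nearest double is a different integer
6001532020609003520
>>> int(float(6001532020609003100)) % 97
1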
@JonSpring is right that you can do completely arbitrary-length integer computation (up to your computer's memory capacity) with Rmpfr, but you can also use the bit64 package for 64-bit integers, which your example just fits into:
library(bit64)
x <- as.integer64("6001532020609003100")
x %% 97
## [1] 66
But doubling this value puts you outside the 64-bit integer range: 2*x gives an overflow error.
Honestly, if you want to do a lot of big-integer calculation I'd say that Python is more convenient ...
library(Rmpfr)
as.integer(mpfr("6001532020609003100") %% 97)
[1] 66

Python: Solution for high int-precision needed (generate primes)

At the moment I'm trying to implement a generate_random_prime() function from the algorithm shown in FIPS 186-4 from NIST (Appendix B3.2.1); see here.
But there seems to be a big problem with step 4.4 (if p < sqrt(2)*(2**((nlen/2)-1))), because of the precision in Python.
To show the relevant part and the problem with my code, see this example:
import os
from decimal import Decimal
import math

for i in range(100):
    nlen = 2048  # my key size should be 2048 bits
    p = int.from_bytes(os.urandom(int(2048/2/8)), byteorder="little")  # see Ann1 and Ann2
    print(p < Decimal(math.sqrt(2)) * (Decimal(2**(int(2048/2))) - 1))
Ann1: 2048/2/8 because urandom takes a byte count.
Ann2: I know that os.urandom is not the best generator; I will later use an approved one... for the testing phase it should be acceptable, I think...
The result is always "True", so the algorithm will never leave step 4.4.
I think the problem is Decimal(math.sqrt(2))*(Decimal(2**(int(2048/2))) - 1), because its result is Decimal('2.542322012307292741109308792E+308'). Converted to int via int(Decimal(math.sqrt(2))*(Decimal(2**(int(2048/2))) - 1)), the result is
254232201230729274110930879200000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
It is rounded up! Is this the reason for the always-True result? I think in this case it will never be possible to get a p less than Decimal(math.sqrt(2))*(Decimal(2**(int(2048/2))) - 1).
How can I solve this problem?
Edit: found a mistake: Decimal(math.sqrt(2))*(Decimal(2**(int(2048/2))) - 1) should be Decimal(math.sqrt(2))*(Decimal(2**(int(2048/2-1)))), so the result should be Decimal('1.271161006153646370554654396E+308') instead of Decimal('2.542322012307292741109308791E+308').
You are constantly converting between floats, integers and Decimal. Drop all use of float; this includes not using functions that produce float values, such as math.sqrt().
Stick to Decimal objects instead, and only convert the final value to an integer:
int(Decimal(2).sqrt() * 2 ** ((nlen // 2) - 1))
Note the use of // for integer division rather than true division (which would produce floats again).
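A minimal sketch of the question's loop with this advice applied (keeping the question's os.urandom placeholder; the 350-digit precision setting is an assumption, chosen so the 309-digit bound is exact down to the units place):
import os
from decimal import Decimal, getcontext

nlen = 2048
getcontext().prec = 350  # the bound sqrt(2) * 2**1023 has 309 integer digits
low = int(Decimal(2).sqrt() * 2 ** ((nlen // 2) - 1))

for i in range(100):
    p = int.from_bytes(os.urandom(nlen // 2 // 8), byteorder="little")
    print(p < low)  # now False whenever p falls at or above the bound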

Python, len, and size of ints

So, CPython (2.4) has some interesting behaviour when the length of something gets near 1<<32 (the size of an int).
r = xrange(1<<30)
assert len(r) == 1<<30
is fine, but:
r = xrange(1<<32)
assert len(r) == 1<<32
ValueError: xrange object size cannot be reported; __len__() should return 0 <= outcome
Alex's wowrange has this behaviour as well. wowrange(1<<32).l is fine, but len(wowrange(1<<32)) is bad. I'm guessing there is some floating-point behaviour (the value being read as negative) going on here.
What exactly is happening here? (this is pretty well-solved below!)
How can I get around it? Longs?
(My specific application is random.sample(xrange(1<<32), ABUNCH), if people want to tackle that question directly!)
CPython assumes that lists fit in memory. This extends to objects that behave like lists, such as xrange. Essentially, the len function expects the __len__ method to return something convertible to a signed C size type (Py_ssize_t in modern CPython), which won't happen if the number of logical elements is too large, even if those elements don't actually exist in memory.
You'll find that
xrange((1 << 31) - 1)
is the last one that behaves as you want, because the maximum signed (32-bit) integer is 2^31 - 1. (Note the parentheses: - binds tighter than <<, so the unparenthesized 1 << 31 - 1 would mean 1 << 30.)
1 << 32 is not a positive signed 32-bit integer (Python's int datatype), so that's why you're getting that error.
In Python 2.6, I can't even do xrange(1 << 32) or xrange(1 << 31) without getting an error, much less len on the result.
Edit: If you want a little more detail...
1 << 31 represents the number 0x80000000, which in two's-complement representation is the lowest representable negative number (-1 * 2^31) for a 32-bit int. So yes, due to the bit-wise representation of the numbers you're working with, it's actually becoming negative.
For a 32-bit two's-complement number, 0x7FFFFFFF is the highest representable integer (2^31 - 1) before you "overflow" into negative numbers.
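A quick way to see this reinterpretation from Python is to round-trip the bit pattern through the struct module ('<I' packs an unsigned 32-bit int, '<i' unpacks it as signed):
>>> import struct
>>> struct.unpack('<i', struct.pack('<I', 1 << 31))[0]
-2147483648
>>> struct.unpack('<i', struct.pack('<I', (1 << 31) - 1))[0]
2147483647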
Further reading, if you're interested.
Note that when you see something like 2147483648L in the prompt, the "L" at the end signifies that it's now being represented as a Python long integer, which is arbitrary-precision rather than a fixed-width machine type.
1<<32 does not fit in a signed 32-bit integer, so when it's squeezed into one it wraps around, which is why len() complains.
