Converting An "Infinite" Float To An Int [duplicate] - python

This question already has answers here:
Integer square root in python
(14 answers)
Closed 8 years ago.
I'm trying to check if a number is a perfect square. However, I am dealing with extraordinarily large numbers, so Python thinks it's infinity for some reason. It gets up to about 1.1 × 10^154 before the code returns "Inf". Is there any way to get around this? Here is the code; the lst variable just holds a bunch of really, really big numbers.
import math
from decimal import Decimal
def main():
    for i in lst:
        root = math.sqrt(Decimal(i))
        print(root)
        if int(root + 0.5) ** 2 == i:
            print(str(i) + " True")

Replace math.sqrt(Decimal(i)) with Decimal(i).sqrt() to prevent your Decimals decaying into floats
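For example, a minimal sketch of that change applied to the loop from the question (assuming lst holds arbitrarily large integers; the precision value here is an assumption and should be raised to cover your largest inputs):
from decimal import Decimal, getcontext

getcontext().prec = 200            # assumption: enough digits for ~160-digit inputs

lst = [10**154, 10**154 + 1]       # hypothetical sample values

for i in lst:
    root = Decimal(i).sqrt()                      # stays a Decimal, never becomes a float
    if int(root.to_integral_value()) ** 2 == i:
        print(str(i) + " True")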

I think that you need to take a look at the BigFloat module, e.g.:
import bigfloat as bf
b = bf.BigFloat('1e1000', bf.precision(21))
print bf.sqrt(b)
Prints BigFloat.exact('9.9999993810013282e+499', precision=53)

@casevh has the right answer -- use a library that can do math on arbitrarily large integers. Since you're looking for squares, you presumably are working with integers, and one could argue that using floating point types (including decimal.Decimal) is, in some sense, inelegant.
You definitely shouldn't use Python's float type; it has limited precision (about 16 decimal places). If you do use decimal.Decimal, be careful to specify the precision (which will depend on how big your numbers are).
Since Python has a big integer type, one can write a reasonably simple algorithm to check for squareness; see my implementation of such an algorithm, along with illustrations of problems with float, and how you could use decimal.Decimal, below.
import math
import decimal
def makendigit(n):
    """Return an arbitraryish n-digit number"""
    return sum((j%9+1)*10**i for i,j in enumerate(range(n)))
x=makendigit(30)
# it looks like float will work...
print 'math.sqrt(x*x) - x: %.17g' % (math.sqrt(x*x) - x)
# ...but actually they won't
print 'math.sqrt(x*x+1) - x: %.17g' % (math.sqrt(x*x+1) - x)
# by default Decimal won't be sufficient...
print 'decimal.Decimal(x*x).sqrt() - x:',decimal.Decimal(x*x).sqrt() - x
# ...you need to specify the precision
print 'decimal.Decimal(x*x).sqrt(decimal.Context(prec=100)) - x:',decimal.Decimal(x*x).sqrt(decimal.Context(prec=100)) - x
def issquare_decimal(y,prec=1000):
    x=decimal.Decimal(y).sqrt(decimal.Context(prec=prec))
    return x==x.to_integral_value()
print 'issquare_decimal(x*x):',issquare_decimal(x*x)
print 'issquare_decimal(x*x+1):',issquare_decimal(x*x+1)
# you can check for "squareness" without going to floating point.
# one option is a bisection search; this Newton's method approach
# should be faster.
# For "industrial use" you should use gmpy2 or some similar "big
# integer" library.
def isqrt(y):
    """Find largest integer <= sqrt(y)"""
    if not isinstance(y,(int,long)):
        raise ValueError('arg must be an integer')
    if y<0:
        raise ValueError('arg must be positive')
    if y in (0,1):
        return y
    x0=y//2
    while True:
        # newton's rule
        x1= (x0**2+y)//2//x0
        # we don't always converge to x0==x1, e.g., for y=3
        if abs(x1-x0)<=1:
            # nearly converged; find biggest
            # integer satisfying our condition
            x=max(x0,x1)
            if x**2>y:
                while x**2>y:
                    x-=1
            else:
                while (x+1)**2<=y:
                    x+=1
            return x
        x0=x1
def issquare(y):
    """Return true if non-negative integer y is a perfect square"""
    return y==isqrt(y)**2
print 'isqrt(x*x)-x:',isqrt(x*x)-x
print 'issquare(x*x):',issquare(x*x)
print 'issquare(x*x+1):',issquare(x*x+1)

math.sqrt() converts the argument to a Python float which has a maximum value around 10^308.
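For instance (a quick interactive illustration; the exact limit is sys.float_info.max, roughly 1.8 × 10^308):
>>> import sys
>>> sys.float_info.max
1.7976931348623157e+308
>>> float(10**309)
Traceback (most recent call last):
  ...
OverflowError: int too large to convert to float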
You should probably look at using the gmpy2 library. gmpy2 provides very fast multiple-precision arithmetic.
If you want to check for arbitrary powers, the function gmpy2.is_power() will return True if a number is a perfect power. It may be a cube or a fifth power, so you will need to check for the power you are interested in.
>>> gmpy2.is_power(456789**372)
True
You can use gmpy2.isqrt_rem() to check if it is an exact square.
>>> gmpy2.isqrt_rem(9)
(mpz(3), mpz(0))
>>> gmpy2.isqrt_rem(10)
(mpz(3), mpz(1))
You can use gmpy2.iroot_rem() to check for arbitrary powers.
>>> gmpy2.iroot_rem(13**7 + 1, 7)
(mpz(13), mpz(1))

Related

How do I check a large floating point number to see if it is an integer? [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Is floating point arbitrary precision available?
(5 answers)
Closed 5 months ago.
It is not a problem to check whether a small number is an integer:
>>> 4.0.is_integer()
True
>>> 4.123.is_integer()
False
if a - int(a) == 0:
    print('Integer')
else:
    print('Not Integer')
But when I have a large number, it does not work anymore:
>>> 31231242354234534534534534534534534534534534535434645755453543543453534534534534534535345346756423423.111.is_integer()
True
I would like to check very many and very large numbers, and the results of my calculations are floating-point numbers. I want to check if the result is an integer. For large numbers, the conventional methods do not work.
Using floats, it's not possible with numbers of the magnitude shown in your example, because eventually the precision of floating point becomes too coarse to distinguish the integer from the float. For example:
>>> 100000000000000000000000.5 == \
... 100000000000000000000000.0
False
>>> 1000000000000000000000000.5 == \
... 1000000000000000000000000.0
True
Both inputs were parsed to identical numbers. If you need the granularity to distinguish such values, parse them from strings into a different type such as Decimal.
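A small sketch of that approach; since no arithmetic is performed, the default context precision is enough for exact comparisons:
from decimal import Decimal

a = Decimal("1000000000000000000000000.5")
b = Decimal("1000000000000000000000000.0")
print(a == b)                        # False: the fractional part is preserved
print(a == a.to_integral_value())    # False: a is not an integer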
For high precision calculations, you may also be interested in using a multiple-precision arithmetic library such as gmpy2.
Simply checking the type might be misleading. If you're trying to solve a mathematical problem where your function produces a continuous range of numbers and you want to check which of them are integers, then type(a) likely won't work for you.
Solution: use is_integer instead.
a = 31231242354234534534534534534534534534534534535434645755453543543453534534534534534535345346756423423.0
print('Type: ',type(a))
print('Is integer: ',a.is_integer())
The first check tells you the value is a float, while the second tells you that it represents an integer.
You can use isinstance. Here is example:
a = 100
print(isinstance(a, int))
It will print True or False depending on the type of the variable a (True in this case above).
print(isinstance(a, float))
In the example above this will print False.
The expression int(math.log10(a) + 1) returns the number of digits of the integer part only.
Try it here:
https://onlinegdb.com/Sk1jmeTxV
import math
a= 31231242354234534534534534534534534534534534535434645755453543543453534534534534534535345346756423423
b = 31231242354234534534534534534534534534534534535434645755453543543453534534534534534535345346756423423.111
length = int(math.log10(a) + 1)
if length == len(str(abs(a))):
    print('is a perfect int')
Likely you get the number as string from the input or file, and you can use the decimal module:
import decimal
ctx=decimal.Context(prec=200)
b=decimal.Decimal("31231242354234534534534534534534534534534534535434645755453543543453534534534534534535345346756423423.111")
print( ctx.divmod(b,1) )
Out: (Decimal('31231242354234534534534534534534534534534534535434645755453543543453534534534534534535345346756423423'), Decimal('0.111'))
print(ctx.divmod(b,1)[1].is_zero()) # integer?
Out: False
I want to check if the cube root of a number "x" is an integer.
The result of a root calculation is a floating-point number.
For example, the following numbers "y" should be checked:
x = 12341254534XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
y = x**(1/3) # y = cube root of x
y = 6456535XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.2146544753325
or
y = 6456535XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.0
or
y = 6456535XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.99999999999646546547
This does not work for these large numbers.

Why is math.sqrt() incorrect for large numbers?

Why does the math module return the wrong result?
First test
A = 12345678917
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 12345678917
B = 12345678917
Here, the result is correct.
Second test
A = 123456758365483459347856
print 'A =',A
B = sqrt(A**2)
print 'B =',int(B)
Result
A = 123456758365483459347856
B = 123456758365483467538432
Here the result is incorrect.
Why is that the case?
Because math.sqrt(..) first casts the number to a floating-point value, and floating-point values have a limited mantissa: they can only represent part of the number correctly. So float(A**2) is not equal to A**2. Next it calculates the square root, which is also only approximately correct.
Most functions that work with floating-point values will never be exactly equal to their integer counterparts. Floating-point calculations are almost inherently approximative.
If one calculates A**2 one gets:
>>> 12345678917**2
152415787921658292889L
Now if one converts it to a float(..), one gets:
>>> float(12345678917**2)
1.5241578792165828e+20
But if you now ask whether the two are equal:
>>> float(12345678917**2) == 12345678917**2
False
So information has been lost while converting it to a float.
You can read more about how floats work and why these are approximative in the Wikipedia article about IEEE-754, the formal definition on how floating points work.
The documentation for the math module states "It provides access to the mathematical functions defined by the C standard." It also states "Except when explicitly noted otherwise, all return values are floats."
Those together mean that the parameter to the square root function is a float value. In most systems that means a floating point value that fits into 8 bytes, which is called "double" in the C language. Your code converts your integer value into such a value before calculating the square root, then returns such a value.
However, the 8-byte floating point value can store at most 15 to 17 significant decimal digits. That is what you are getting in your results.
If you want better precision in your square roots, use a function that is guaranteed to give full precision for an integer argument. Just do a web search and you will find several. Those usually do a variation of the Newton-Raphson method to iterate and eventually end at the correct answer. Be aware that this is significantly slower than the math module's sqrt function.
Here is a routine that I modified from the internet. I can't cite the source right now. This version also works for non-integer arguments but just returns the integer part of the square root.
def isqrt(x):
    """Return the integer part of the square root of x, even for very
    large values."""
    if x < 0:
        raise ValueError('square root not defined for negative numbers')
    n = int(x)
    if n == 0:
        return 0
    a, b = divmod(n.bit_length(), 2)
    x = (1 << (a+b)) - 1
    while True:
        y = (x + n//x) // 2
        if y >= x:
            return x
        x = y
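On Python 3.8 and later the standard library already ships such a routine: math.isqrt() returns the exact integer square root of an arbitrarily large non-negative integer, so a squareness check needs no hand-written iteration. A small sketch:
import math

def is_square(n):
    """Return True if the non-negative integer n is a perfect square."""
    r = math.isqrt(n)      # exact floor of the square root, computed on integers
    return r * r == n

x = 123456758365483459347856
print(is_square(x * x))        # True
print(is_square(x * x + 1))    # False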
If you want to calculate sqrt of really large numbers and you need exact results, you can use sympy:
import sympy
num = sympy.Integer(123456758365483459347856)
print(int(num) == int(sympy.sqrt(num**2)))
The way floating-point numbers are stored in memory makes calculations with them prone to slight errors that can nevertheless be significant when exact results are needed. As mentioned in one of the comments, the decimal library can help you here:
>>> A = Decimal(123456758365483459347856)
>>> A
Decimal('123456758365483459347856')
>>> B = A.sqrt()**2
>>> B
Decimal('123456758365483459347856.0000')
>>> A == B
True
>>> int(B)
123456758365483459347856
I use version 3.6, which has no hardcoded limit on the size of integers. I don't know if, in 2.7, casting B as an int would cause overflow, but decimal is incredibly useful regardless.

I'm making mistakes dividing large numbers

I am trying to write a program in Python 2.7 that will first see if one number divides the other evenly, and if it does, get the result of the division.
However, I am getting some interesting results when I use large numbers.
Currently I am using:
from __future__ import division
import math
a=82348972389472433334783
b=2
if a/b==math.trunc(a/b):
    answer=a/b
    print 'True' #to quickly see if the if loop was invoked
When I run this I get:
True
But 82348972389472433334783 is clearly not even.
Any help would be appreciated.
That's a crazy way to do it. Just use the remainder operator.
if a % b == 0:
    # then b divides a evenly
    quotient = a // b
The true division implicitly converts the input to floats which don't provide the precision to store the value of a accurately. E.g. on my machine
>>> int(1E15+1)
1000000000000001
>>> int(1E16+1)
10000000000000000
hence you lose precision. A similar thing happens with your big number (compare int(float(a))-a).
Now, if you check your division, you see the result "is" actually found to be an integer
>>> (a/b).is_integer()
True
which is again not really expected beforehand.
The math.trunc function does something similar (from the docs):
Return the Real value x truncated to an Integral (usually a long integer).
The duck typing nature of python allows a comparison of the long integer and float, see
Checking if float is equivalent to an integer value in python and
Comparing a float and an int in Python.
Why don't you use the modulus operator instead to check if a number can be divided evenly?
n % x == 0

How to print floating point numbers as it is without any truncation in python?

I have some number 0.0000002345E^-60. I want to print the floating point value as it is.
What is the way to do it?
print %f truncates it to 6 digits. Also %n.nf gives fixed numbers. What is the way to print without truncation.
Like this?
>>> print('{:.100f}'.format(0.0000002345E-60))
0.0000000000000000000000000000000000000000000000000000000000000000002344999999999999860343602938602754
As you might notice from the output, it’s not really that clear how you want to do it. Due to the float representation you lose precision and can’t really represent the number precisely. As such it’s not really clear where you want the number to stop displaying.
Also note that the exponential representation is often used to more explicitly show the number of significant digits the number has.
You could also use decimal to not lose the precision due to binary float truncation:
>>> from decimal import Decimal
>>> d = Decimal('0.0000002345E-60')
>>> p = abs(d.as_tuple().exponent)
>>> print(('{:.%df}' % p).format(d))
0.0000000000000000000000000000000000000000000000000000000000000000002345
You can use decimal.Decimal:
>>> from decimal import Decimal
>>> str(Decimal(0.0000002345e-60))
'2.344999999999999860343602938602754401109865640550232148836753621775217856801120686600683401464097113374472942165409862789978024748827516129306833728589548440037314681709534891496105046826414763927459716796875E-67'
This is the actual value of float created by literal 0.0000002345e-60. Its value is a number representable as python float which is closest to actual 0.0000002345 * 10**-60.
float should be generally used for approximate calculations. If you want accurate results you should use something else, like mentioned Decimal.
If I understand, you want to print a float?
The problem is, you cannot print a float.
You can only print a string representation of a float. So, in short, you cannot print a float, that is your answer.
If you accept that you need to print a string representation of a float, and your question is how specify your preferred format for the string representations of your floats, then judging by the comments you have been very unclear in your question.
If you would like to print the string representations of your floats in exponent notation, then the format specification language allows this:
{:g} or {:G}, depending on whether or not you want the E in the output to be capitalized. This gets around the default precision for the e and E types, which leads to unwanted trailing 0s in the part before the exponent symbol.
Assuming your value is my_float, "{:G}".format(my_float) would print the output the way that the Python interpreter prints it. You could probably just print the number without any formatting and get the same exact result.
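For instance (a small sketch using the number from the question):
>>> my_float = 0.0000002345e-60
>>> "{:G}".format(my_float)
'2.345E-67'
>>> print(my_float)
2.345e-67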
If your goal is to print the string representation of the float with its current precision, in non-exponentiated form, User poke describes a good way to do this by casting the float to a Decimal object.
If, for some reason, you do not want to do this, you can do something like is mentioned in this answer. However, you should set 'max_digits' to sys.float_info.max_10_exp, instead of 14 used in the answer. This requires you to import sys at some point prior in the code.
A full example of this would be:
import math
import sys
def precision_and_scale(x):
    max_digits = sys.float_info.max_10_exp
    int_part = int(abs(x))
    magnitude = 1 if int_part == 0 else int(math.log10(int_part)) + 1
    if magnitude >= max_digits:
        return (magnitude, 0)
    frac_part = abs(x) - int_part
    multiplier = 10 ** (max_digits - magnitude)
    frac_digits = multiplier + int(multiplier * frac_part + 0.5)
    while frac_digits % 10 == 0:
        frac_digits /= 10
    scale = int(math.log10(frac_digits))
    return (magnitude + scale, scale)
f = 0.0000002345E-60
p, s = precision_and_scale(f)
print "{:.{p}f}".format(f, p=p)
But I think the method involving casting to Decimal is probably better, overall.

Increment a Python floating point value by the smallest possible amount

How can I increment a floating point value in python by the smallest possible amount?
Background: I'm using floating point values as dictionary keys.
Occasionally, very occasionally (and perhaps never, but not certainly never), there will be collisions. I would like to resolve these by incrementing the floating point value by as small an amount as possible. How can I do this?
In C, I would twiddle the bits of the mantissa to achieve this, but I assume that isn't possible in Python.
Since Python 3.9 there is math.nextafter in the stdlib. Read on for alternatives in older Python versions.
Increment a python floating point value by the smallest possible amount
The nextafter(x,y) functions return the next discretely different representable floating-point value following x in the direction of y. The nextafter() functions are guaranteed to work on the platform or to return a sensible value to indicate that the next value is not possible.
The nextafter() functions are part of the POSIX and ISO C99 standards and appear as _nextafter() in Visual C. C99-compliant standard math libraries, Visual C, C++, Boost and Java all implement the IEEE-recommended nextafter() functions or methods. (I do not honestly know if .NET has nextafter(). Microsoft does not care much about C99 or POSIX.)
None of the bit-twiddling functions here fully or correctly deal with the edge cases, such as values going through 0.0, negative 0.0, subnormals, infinities, negative values, over- or underflows, etc. Here is a reference implementation of nextafter() in C to give an idea of how to do the correct bit twiddling if that is your direction.
There are two solid workarounds to get nextafter() or other excluded POSIX math functions in Python < 3.9:
Use Numpy:
>>> import numpy
>>> numpy.nextafter(0,1)
4.9406564584124654e-324
>>> numpy.nextafter(.1, 1)
0.10000000000000002
>>> numpy.nextafter(1e6, -1)
999999.99999999988
>>> numpy.nextafter(-.1, 1)
-0.099999999999999992
Link directly to the system math DLL:
import ctypes
import sys
from sys import platform as _platform
if _platform == "linux" or _platform == "linux2":
_libm = ctypes.cdll.LoadLibrary('libm.so.6')
_funcname = 'nextafter'
elif _platform == "darwin":
_libm = ctypes.cdll.LoadLibrary('libSystem.dylib')
_funcname = 'nextafter'
elif _platform == "win32":
_libm = ctypes.cdll.LoadLibrary('msvcrt.dll')
_funcname = '_nextafter'
else:
# these are the ones I have access to...
# fill in library and function name for your system math dll
print("Platform", repr(_platform), "is not supported")
sys.exit(0)
_nextafter = getattr(_libm, _funcname)
_nextafter.restype = ctypes.c_double
_nextafter.argtypes = [ctypes.c_double, ctypes.c_double]
def nextafter(x, y):
"Returns the next floating-point number after x in the direction of y."
return _nextafter(x, y)
assert nextafter(0, 1) - nextafter(0, 1) == 0
assert 0.0 + nextafter(0, 1) > 0.0
And if you really really want a pure Python solution:
# handles edge cases correctly on MY computer
# not extensively QA'd...
import math
# 'double' means IEEE 754 double precision -- c 'double'
epsilon = math.ldexp(1.0, -53) # smallest double that 0.5+epsilon != 0.5
maxDouble = float(2**1024 - 2**971) # From the IEEE 754 standard
minDouble = math.ldexp(1.0, -1022) # min positive normalized double
smallEpsilon = math.ldexp(1.0, -1074) # smallest increment for doubles < minFloat
infinity = math.ldexp(1.0, 1023) * 2
def nextafter(x,y):
    """returns the next IEEE double after x in the direction of y if possible"""
    if y==x:
        return y #if x==y, no increment
    # handle NaN
    if x!=x or y!=y:
        return x + y
    if x >= infinity:
        return infinity
    if x <= -infinity:
        return -infinity
    if -minDouble < x < minDouble:
        if y > x:
            return x + smallEpsilon
        else:
            return x - smallEpsilon
    m, e = math.frexp(x)
    if y > x:
        m += epsilon
    else:
        m -= epsilon
    return math.ldexp(m,e)
Or, use Mark Dickinson's excellent solution
Obviously the Numpy solution is the easiest.
Python 3.9 and above
Starting with Python 3.9, released 2020-10-05, you can use the math.nextafter function:
math.nextafter(x, y)
Return the next floating-point value after x towards y.
If x is equal to y, return y.
Examples:
math.nextafter(x, math.inf) goes up: towards positive infinity.
math.nextafter(x, -math.inf) goes down: towards minus infinity.
math.nextafter(x, 0.0) goes towards zero.
math.nextafter(x, math.copysign(math.inf, x)) goes away from zero.
See also math.ulp().
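For instance (an interactive sketch; the values shown assume IEEE 754 double precision, which CPython uses for float):
>>> import math
>>> math.nextafter(1.0, math.inf)
1.0000000000000002
>>> math.nextafter(1.0, -math.inf)
0.9999999999999999
>>> math.nextafter(0.0, 1.0)
5e-324
>>> math.ulp(1.0)
2.220446049250313e-16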
First, this "respond to a collision" is a pretty bad idea.
If they collide, the values in the dictionary should have been lists of items with a common key, not individual items.
Your "hash probing" algorithm will have to loop through more than one "tiny increments" to resolve collisions.
And sequential hash probes are known to be inefficient.
Read this: http://en.wikipedia.org/wiki/Quadratic_probing
Second, use math.frexp and sys.float_info.epsilon to fiddle with mantissa and exponent separately.
>>> m, e = math.frexp(4.0)
>>> (m+sys.float_info.epsilon)*2**e
4.0000000000000018
Forgetting about why we would want to increment a floating point value for a moment, I would have to say I think Autopulated's own answer is probably correct.
But for the problem domain, I share the misgivings of most of the responders to the idea of using floats as dictionary keys. If the objection to using Decimal (as proposed in the main comments) is that it is a "heavyweight" solution, I suggest a do-it-yourself compromise: Figure out what the practical resolution is on the timestamps, pick a number of digits to adequately cover it, then multiply all the timestamps by the necessary amount so that you can use integers as the keys. If you can afford an extra digit or two beyond the timer precision, then you can be even more confident that there will be no or fewer collisions, and that if there are collisions, you can just add 1 (instead of some rigamarole to find the next floating point value).
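A rough sketch of that compromise (the microsecond resolution and the helper name are assumptions for illustration):
# assumption: timestamps are float seconds with at most microsecond resolution
def to_key(timestamp):
    return int(round(timestamp * 10**6))    # integer microseconds as the dict key

d = {}
t = 1700000000.123456                       # hypothetical timestamp
key = to_key(t)
while key in d:                             # on the rare collision, just add 1
    key += 1
d[key] = "some value"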
I recommend against assuming that floats (or timestamps) will be unique if at all possible. Use a counting iterator, database sequence or other service to issue unique identifiers.
Instead of incrementing the value, just use a tuple for the colliding key. If you need to keep them in order, every key should be a tuple, not just the duplicates.
A better answer (now I'm just doing this for fun...), motivated by twiddling the bits. Handling the carry and overflows between parts of the number, and negative values, is somewhat tricky.
import struct
def floatToieee754Bits(f):
    return struct.unpack('<Q', struct.pack('<d', f))[0]

def ieee754BitsToFloat(i):
    return struct.unpack('<d', struct.pack('<Q', i))[0]

def incrementFloat(f):
    i = floatToieee754Bits(f)
    if f >= 0:
        return ieee754BitsToFloat(i+1)
    else:
        raise Exception('f not >= 0: unsolved problem!')
Instead of resolving the collisions by changing the key, how about collecting the collisions? IE:
bag = {}
bag[1234.] = 'something'
becomes
bag = collections.defaultdict(list)
bag[1234.].append('something')
would that work?
For colliding key k, add: k / 2^50
Interesting problem. The amount you need to add obviously depends on the magnitude of the colliding value, so that a normalized add will affect only the least significant bits.
It's not necessary to determine the smallest value that can be added. All you need to do is approximate it. The FPU format provides 52 mantissa bits plus a hidden bit for 53 bits of precision. No physical constant is known to anywhere near this level of precision. No sensor is able measure anything near it. So you don't have a hard problem.
In most cases, for key k, you would be able to add k/2^53, because of that 52-bit fraction plus the hidden bit.
But it's not necessary to risk triggering library bugs or exploring rounding issues by shooting for the very last bit or anything near it.
So I would say, for colliding key k, just add k / 2^50 and call it a day.¹
1. Possibly more than once until it doesn't collide any more, at least to foil any diabolical unit test authors.
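A sketch of what that looks like in practice (the dictionary and retry loop are illustrative, not part of the answer above):
def bump(key):
    # nudge a positive colliding key up by roughly one part in 2**50
    return key + key / 2**50

d = {1234.5: "existing entry"}
k = 1234.5
while k in d:          # repeat until it no longer collides
    k = bump(k)
d[k] = "new entry"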
>>> import sys
>>> sys.float_info.epsilon
2.220446049250313e-16
Instead of modifying your float timestamp, use a tuple for every key as Mark Ransom suggests where the tuple (x,y) is composed of x=your_unmodified_time_stamp and y=(extremely unlikely to be a same value twice).
So:
1. x is just the unmodified timestamp and can be the same value many times;
2. for y you can use:
   2.1. a random integer from a large range,
   2.2. a serial integer (0, 1, 2, etc.),
   2.3. a UUID.
While 2.1 (a random int from a large range) works great for Ethernet, I would use 2.2 (a serializer) or 2.3 (a UUID). Easy, fast, bulletproof. For 2.2 and 2.3 you don't even need collision detection (though you might still want it for 2.1, as Ethernet does).
The advantage of 2.2 is that you can also tell, and sort, data elements that have the same float time stamp.
Then just extract x from the tuple for any sorting type operations and the tuple itself is a collision free key for the hash / dictionary.
Edit
I guess example code will help:
#!/usr/bin/env python
import time
import sys
import random
#generator for ints from 0 to maxinteger on system:
serializer=(sn for sn in xrange(0,sys.maxint))
#a list with guaranteed collisions:
times=[]
for c in range(0,35):
    t=time.clock()
    for i in range(0,random.choice(range(0,4))):
        times.append(t)

print len(set(times)), "unique items in a list of",len(times)

#dictionary of tuples; no possibilities of collisions:
di={}
for time in times:
    sn=serializer.next()
    di[(time,sn)]='Element {}'.format(sn)

#for tuples of multiple numbers, Python sorts
# as you expect: first by t[0] then t[1], until t[n]
for key in sorted(di.keys()):
    print "{:>15}:{}".format(key, di[key])
Output:
26 unique items in a list of 55
(0.042289, 0):Element 0
(0.042289, 1):Element 1
(0.042289, 2):Element 2
(0.042305, 3):Element 3
(0.042305, 4):Element 4
(0.042317, 5):Element 5
# and so on until Element n...
Here is part of it. This is dirty and slow, but maybe that is how you like it. It is missing several corner cases, but maybe this gets someone else close.
The idea is to get the hex string of the floating-point number. That gives you a string with the mantissa and exponent bits to twiddle. The twiddling is a pain since you have to do it all manually and keep converting to/from strings. Anyway, you add (subtract) 1 to (from) the last digit for positive (negative) numbers. Make sure you carry through to the exponent if you overflow. Negative numbers are a little more tricky so that you don't waste any bits.
def increment(f):
    h = f.hex()
    # decide if we need to increment up or down
    if f > 0:
        sign = '+'
        inc = 1
    else:
        sign = '-'
        inc = -1
    # pull the string apart
    h = h.split('0x')[-1]
    h,e = h.split('p')
    h = ''.join(h.split('.'))
    h2 = shift(h, inc)
    # increase the exponent if we added a digit
    h2 = '%s0x%s.%sp%s' % (sign, h2[0], h2[1:], e)
    return float.fromhex(h2)

def shift(s, num):
    if not s:
        return ''
    right = s[-1]
    right = int(right, 16) + num
    if right > 15:
        num = right // 16
        right = right%16
    elif right < 0:
        right = 0
        num = -1
    else:
        num = 0
    # drop the leading 0x
    right = hex(right)[2:]
    return shift(s[:-1], num) + right
a = 1.4e4
print increment(a) - a
a = -1.4e4
print increment(a) - a
a = 1.4
print increment(a) - a
I think you mean "by as small an amount possible to avoid a hash collision", since for example the next-highest-float may already be a key! =)
while toInsert.key in myDict: # assumed to be positive
    toInsert.key *= 1.000000000001
myDict[toInsert.key] = toInsert
That said you probably don't want to be using timestamps as keys.
After looking at Autopulated's answer I came up with a slightly different answer:
import math, sys
def incrementFloatValue(value):
if value == 0:
return sys.float_info.min
mant, exponent = math.frexp(value)
epsilonAtValue = math.ldexp(1, exponent - sys.float_info.mant_dig)
return math.fsum([value, epsilonAtValue])
Disclaimer: I'm really not as great at maths as I think I am ;) Please verify this is correct before using it. Also, I'm not sure about the performance.
Some notes:
epsilonAtValue calculates how many bits are used for the mantissa (the maximum minus what is used for the exponent).
I'm not sure if the math.fsum() is needed but hey it doesn't seem to hurt.
It turns out that this is actually quite complicated (maybe why seven people have answered without actually providing an answer yet...).
I think this is the right solution; it certainly seems to handle 0 and positive values correctly:
import math
import sys
def incrementFloat(f):
    if f == 0.0:
        return sys.float_info.min
    m, e = math.frexp(f)
    return math.ldexp(m + sys.float_info.epsilon / 2, e)
