number approximation in python - python

I have a list of floating-point numbers which represent the x and y coordinates of points.
(-379.99418604651157, 47.517234218543351, 0.0) #representing point x
An edge contains two such points.
I'd like to use a graph traversal algorithm such as Dijkstra's, but floating-point numbers like the ones above don't help.
What I'm actually looking for is a way of approximating those numbers:
(-37*.*, 4*.*, 0.0)
Is there a Python function that does that?

"...using floating point numbers such as the ones above don't help..." - why not? I don't recall integers as a requirement for Dijkstra. Aren't you concerned with the length of the edge? That's more likely to be a floating point number, even if the endpoints are expressed in integer values.
I'm quoting from Steve Skiena's "Algorithm Design Manual":
Dijkstra's algorithm proceeds in a series of rounds, where each round establishes the shortest path from s to some new vertex. Specifically, x is the vertex that minimizes dist(s, vi) + w(vi, x) over all unfinished 1 <= i <= n...
Distance - no mention of integer.
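To illustrate that floating-point coordinates are no obstacle, here is a minimal sketch of Dijkstra's algorithm that uses Euclidean edge lengths as weights. Only the first point comes from the question; the other points, the edge list, and the names `edge_length` and `dijkstra` are made up for illustration.

```python
import heapq
import math

def edge_length(p, q):
    """Euclidean distance between two coordinate tuples."""
    return math.dist(p, q)

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable vertex.

    graph maps each vertex to a list of (neighbor, weight) pairs;
    weights may be any non-negative numbers, including floats.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical sample data: vertices are float coordinate tuples.
a = (-379.99418604651157, 47.517234218543351, 0.0)
b = (-376.0, 47.517234218543351, 0.0)
c = (-376.0, 50.517234218543351, 0.0)
graph = {
    a: [(b, edge_length(a, b)), (c, edge_length(a, c))],
    b: [(c, edge_length(b, c))],
    c: [],
}
distances = dijkstra(graph, a)
print(distances[c])
```

The vertices are used as dictionary keys exactly as they are, so no rounding of the coordinates is needed; `math.dist` requires Python 3.8 or later.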

Like so?
>>> x, y, z = (-379.99418604651157, 47.517234218543351, 0.0)
>>> abs(x - -370) < 10
True
>>> abs(y - 40) < 10
True

Given your vector
(-379.99418604651157, 47.517234218543351, 0.0) #representing point x
The easiest way to perform rounding that works like you would expect would probably be to use the decimal module: http://docs.python.org/library/decimal.html .
from decimal import Decimal
point = (-379.99418604651157, 47.517234218543351, 0.0) #representing point x
converted = [Decimal(str(x)) for x in point]
Then, to get an approximation, you can use the quantize method:
>>> converted[0].quantize(Decimal('.0001'), rounding="ROUND_DOWN")
Decimal("-379.9941")
This approach has the advantage of the built in ability to avoid rounding errors. Hopefully this is helpful.
Edit:
After seeing your comment, it looks like you're trying to see if two points are close to each other. These functions might do what you want:
def roundable(a, b):
    """Returns true if a can be rounded to b at any precision"""
    a = Decimal(str(a))
    b = Decimal(str(b))
    return a.quantize(b) == b

def close(point_1, point_2):
    for a, b in zip(point_1, point_2):
        if not (roundable(a, b) or roundable(b, a)):
            return False
    return True
I don't know if this is better than an epsilon approach, but it's fairly simple to implement.
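For comparison, the epsilon approach mentioned above could look roughly like the following sketch. The name `close_eps`, the default tolerance of 1e-6, and the second sample point are assumptions chosen for illustration, not values from the question.

```python
import math

def close_eps(point_1, point_2, eps=1e-6):
    """Return True if every pair of coordinates differs by at most eps."""
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=eps)
        for a, b in zip(point_1, point_2)
    )

p = (-379.99418604651157, 47.517234218543351, 0.0)
q = (-379.99418604651200, 47.517234218543400, 0.0)  # hypothetical near-duplicate
print(close_eps(p, q))                     # True
print(close_eps(p, (-370.0, 40.0, 0.0)))   # False
```

An absolute tolerance is used here because the coordinates include 0.0, where a purely relative tolerance would never match.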

I'm not sure what the problem is with the floating point numbers, but there are several ways you can approximate your values. If you just want to round them you can use math.ceil(), math.floor() and math.trunc().
If you actually want to keep track of the precision, there are a bunch of multi-precision math libraries listed on the wiki which might be useful.

I suppose that you want to approximate the numbers so that you can easily follow your algorithm visually while stepping through it (Dijkstra's algorithm places no restriction on the coordinates of the nodes; in fact, it is only interested in the cost of the edges).
A simple function to approximate numbers:
>>> import math
>>> def approximate(value, places=0):
...     factor = 10. ** places
...     return factor * math.trunc(value / factor)
...
>>> p = (-379.99418604651157, 47.517234218543351, 0.0)
>>> print([approximate(x, 1) for x in p])
[-370.0, 40.0, 0.0]

How do I calculate square root in Python?

I need to calculate the square root of some numbers, for example √9 = 3 and √2 = 1.4142. How can I do it in Python?
The inputs will probably be all positive integers, and relatively small (say less than a billion), but just in case they're not, is there anything that might break?
Related
Integer square root in python
How to find integer nth roots?
Is there a short-hand for nth root of x in Python?
Difference between **(1/2), math.sqrt and cmath.sqrt?
Why is math.sqrt() incorrect for large numbers?
Python sqrt limit for very large numbers?
Which is faster in Python: x**.5 or math.sqrt(x)?
Why does Python give the "wrong" answer for square root? (specific to Python 2)
calculating n-th roots using Python 3's decimal module
How can I take the square root of -1 using python? (focused on NumPy)
Arbitrary precision of square roots
Note: This is an attempt at a canonical question after a discussion on Meta about an existing question with the same title.
Option 1: math.sqrt()
The math module from the standard library has a sqrt function to calculate the square root of a number. It takes any type that can be converted to float (which includes int) as an argument and returns a float.
>>> import math
>>> math.sqrt(9)
3.0
Option 2: Fractional exponent
The power operator (**) or the built-in pow() function can also be used to calculate a square root. Mathematically speaking, the square root of a equals a to the power of 1/2.
The power operator requires numeric types and matches the conversion rules for binary arithmetic operators, so in this case it will return either a float or a complex number.
>>> 9 ** (1/2)
3.0
>>> 9 ** .5 # Same thing
3.0
>>> 2 ** .5
1.4142135623730951
(Note: in Python 2, 1/2 is truncated to 0, so you have to force floating point arithmetic with 1.0/2 or similar. See Why does Python give the "wrong" answer for square root?)
This method can be generalized to nth root, though fractions that can't be exactly represented as a float (like 1/3 or any denominator that's not a power of 2) may cause some inaccuracy:
>>> 8 ** (1/3)
2.0
>>> 125 ** (1/3)
4.999999999999999
Edge cases
Negative and complex
Exponentiation works with negative numbers and complex numbers, though the results have some slight inaccuracy:
>>> (-25) ** .5 # Should be 5j
(3.061616997868383e-16+5j)
>>> 8j ** .5 # Should be 2+2j
(2.0000000000000004+2j)
Note the parentheses on -25! Otherwise it's parsed as -(25**.5) because exponentiation is more tightly binding than unary negation.
Meanwhile, math is only built for floats, so for x<0, math.sqrt(x) will raise ValueError: math domain error and for complex x, it'll raise TypeError: can't convert complex to float. Instead, you can use cmath.sqrt(x), which is more accurate than exponentiation (and will likely be faster too):
>>> import cmath
>>> cmath.sqrt(-25)
5j
>>> cmath.sqrt(8j)
(2+2j)
Precision
Both options involve an implicit conversion to float, so floating point precision is a factor. For example:
>>> n = 10**30
>>> x = n**2
>>> root = x**.5
>>> n == root
False
>>> n - root # how far off are they?
0.0
>>> int(root) - n # how far off is the float from the int?
19884624838656
Very large numbers might not even fit in a float and you'll get OverflowError: int too large to convert to float. See Python sqrt limit for very large numbers?
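When the input is an integer and an integer result is acceptable, the float conversion can be avoided entirely. The following sketch reuses the numbers from the precision example and relies on `math.isqrt`, which is available from Python 3.8.

```python
import math

n = 10**30
x = n**2

root = math.isqrt(x)       # integer arithmetic only, no float conversion
print(root == n)           # True
print(math.isqrt(10))      # 3, the floor of the true square root
```

`math.isqrt` always rounds down, so for non-squares it returns the integer part of the root rather than a float approximation.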
Other types
Let's look at Decimal for example:
Exponentiation fails unless the exponent is also Decimal:
>>> import decimal
>>> decimal.Decimal('9') ** .5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for ** or pow(): 'decimal.Decimal' and 'float'
>>> decimal.Decimal('9') ** decimal.Decimal('.5')
Decimal('3.000000000000000000000000000')
Meanwhile, math and cmath will silently convert their arguments to float and complex respectively, which could mean loss of precision.
decimal also has its own .sqrt(). See also calculating n-th roots using Python 3's decimal module
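As a small sketch of the `.sqrt()` method just mentioned, the precision can be raised through a local context. The choice of 50 significant digits is arbitrary and only for illustration.

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 50                     # 50 significant digits, an arbitrary choice
    root = Decimal(2).sqrt()
    error = abs(root * root - 2)      # how far the square is from 2

print(root)
print(error < Decimal("1e-48"))       # True
```

The result is still a rounded value, only with more digits than a float can hold.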
SymPy
Depending on your goal, it might be a good idea to delay the calculation of square roots for as long as possible. SymPy might help.
SymPy is a Python library for symbolic mathematics.
import sympy
sympy.sqrt(2)
# => sqrt(2)
This doesn't seem very useful at first.
But sympy can give more information than floats or Decimals:
sympy.sqrt(8) / sympy.sqrt(27)
# => 2*sqrt(6)/9
Also, no precision is lost. (√2)² is still an integer:
s = sympy.sqrt(2)
s**2
# => 2
type(s**2)
# => <class 'sympy.core.numbers.Integer'>
In comparison, floats and Decimals would return a number which is very close to 2 but not equal to 2:
(2**0.5)**2
# => 2.0000000000000004
from decimal import Decimal
(Decimal('2')**Decimal('0.5'))**Decimal('2')
# => Decimal('1.999999999999999999999999999')
Sympy also understands more complex examples like the Gaussian integral:
from sympy import Symbol, integrate, pi, sqrt, exp, oo
x = Symbol('x')
integrate(exp(-x**2), (x, -oo, oo))
# => sqrt(pi)
integrate(exp(-x**2), (x, -oo, oo)) == sqrt(pi)
# => True
Finally, if a decimal representation is desired, it's possible to ask for more digits than will ever be needed:
sympy.N(sympy.sqrt(2), 1_000_000)
# => 1.4142135623730950488016...........2044193016904841204
NumPy
>>> import numpy as np
>>> np.sqrt(25)
5.0
>>> np.sqrt([2, 3, 4])
array([1.41421356, 1.73205081, 2. ])
Negative
For negative reals, it'll return nan, so np.emath.sqrt() is available for that case.
>>> a = np.array([4, -1, np.inf])
>>> np.sqrt(a)
<stdin>:1: RuntimeWarning: invalid value encountered in sqrt
array([ 2., nan, inf])
>>> np.emath.sqrt(a)
array([ 2.+0.j, 0.+1.j, inf+0.j])
Another option, of course, is to convert to complex first:
>>> a = a.astype(complex)
>>> np.sqrt(a)
array([ 2.+0.j, 0.+1.j, inf+0.j])
Newton's method
A simple and accurate way to compute a square root is Newton's method.
You have a number whose square root you want to compute (num) and a guess at its square root (estimate). The estimate can be any number greater than 0, but a sensible guess shortens the recursive call depth significantly.
new_estimate = (estimate + num/estimate) / 2
This line computes a more accurate estimate from those two parameters. You can pass the new_estimate value back to the function to compute another new_estimate that is more accurate than the previous one, or you can write a recursive function definition like this:
import math

def newtons_method(num, estimate):
    # Compute a new estimate from the current one
    new_estimate = (estimate + num/estimate) / 2
    print(new_estimate)
    # Base case: stop when the estimate equals the built-in function's value
    if new_estimate == math.sqrt(num):
        return True
    else:
        return newtons_method(num, new_estimate)
For example we need to find 30's square root. We know that the result is between 5 and 6.
newtons_method(30,5)
The number is 30 and the estimate is 5. The results of the recursive calls are:
5.5
5.477272727272727
5.4772255752546215
5.477225575051661
The last result is the most accurate computation of the square root of number. It is the same value as the built-in function math.sqrt().
This answer was originally posted by gunesevitan, but is now deleted.
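The same iteration can also be written as a loop that stops once successive estimates agree, which avoids both recursion and the comparison against `math.sqrt`. This is only a sketch: the name `newton_sqrt`, the tolerance, and the iteration cap are assumptions, not part of the answer above.

```python
def newton_sqrt(num, estimate=1.0, tol=1e-12, max_iter=100):
    """Approximate sqrt(num) for num >= 0 using Newton's iteration."""
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0:
        return 0.0
    for _ in range(max_iter):
        new_estimate = (estimate + num / estimate) / 2
        # Stop once successive estimates agree to within a relative tolerance.
        if abs(new_estimate - estimate) <= tol * new_estimate:
            return new_estimate
        estimate = new_estimate
    return estimate  # best effort if the cap was reached

print(newton_sqrt(30, 5))
```

Starting from 5, the loop goes through the same estimates as the recursive version and returns after they stop changing.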
Python's fractions module and its class, Fraction, implement arithmetic with rational numbers. The Fraction class doesn't implement a square root operation, because most square roots are irrational numbers. However, it can be used to approximate a square root with arbitrary accuracy, because a Fraction's numerator and denominator are arbitrary-precision integers.
The following method takes a positive number x and a number of iterations, and returns upper and lower bounds for the square root of x.
from fractions import Fraction

def sqrt(x, n):
    x = x if isinstance(x, Fraction) else Fraction(x)
    upper = x + 1
    for i in range(0, n):
        upper = (upper + x/upper) / 2
    lower = x / upper
    if lower > upper:
        raise ValueError("Sanity check failed")
    return (lower, upper)
See the reference below for details on this operation's implementation. It also shows how to implement other operations with upper and lower bounds (although there is apparently at least one error with the log operation there).
Daumas, M., Lester, D., Muñoz, C., "Verified Real Number Calculations: A Library for Interval Arithmetic", arXiv:0708.3721 [cs.MS], 2007.
Alternatively, using Python's math.isqrt, we can calculate a square root to arbitrary precision:
Square root of i within 1/2**n of the correct value, where i is an integer: Fraction(math.isqrt(i * 2**(n*2)), 2**n).
Square root of i within 1/10**n of the correct value, where i is an integer: Fraction(math.isqrt(i * 10**(n*2)), 10**n).
Square root of x within 1/2**n of the correct value, where x is a multiple of 1/2**n: Fraction(math.isqrt(x * 2**(n)), 2**n).
Square root of x within 1/10**n of the correct value, where x is a multiple of 1/10**n: Fraction(math.isqrt(x * 10**(n)), 10**n).
In the foregoing, i or x must be 0 or greater.
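As a worked instance of the integer case with powers of ten, the sketch below wraps the expression in a helper. The name `sqrt_lower_bound` is made up for illustration.

```python
import math
from fractions import Fraction

def sqrt_lower_bound(i, n):
    """Largest multiple of 1/10**n not exceeding sqrt(i), for an integer i >= 0."""
    return Fraction(math.isqrt(i * 10**(2 * n)), 10**n)

approx = sqrt_lower_bound(2, 4)
print(approx)          # 7071/5000, i.e. 1.4142
```

Because `math.isqrt` rounds down, the true root lies between the returned value and that value plus 1/10**n.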
Binary search
Disclaimer: this is for a more specialised use-case. This method might not be practical in all circumstances.
Benefits:
can find integer values (i.e. which integer is the root?)
no need to convert to float, so better precision (though it can be done with floats as well)
I personally implemented this one for a crypto CTF challenge (RSA cube root attack), where I needed a precise integer value.
The general idea can be extended to any other root.
def int_squareroot(d: int) -> tuple[int, bool]:
    """Try calculating integer squareroot and return if it's exact"""
    left, right = 1, (d+1)//2
    while left<right-1:
        x = (left+right)//2
        if x**2 > d:
            left, right = left, x
        else:
            left, right = x, right
    return left, left**2==d
EDIT:
As @wjandrea has also pointed out, this example code can NOT compute non-integer roots. This is a side effect of the fact that it does not convert anything into floats, so no precision is lost. If the root is an integer, you get that back. If it's not, you get the largest integer whose square is smaller than your number. I updated the code so that it also returns a bool indicating whether the value is exact, and also fixed an issue causing it to loop infinitely (also pointed out by @wjandrea). This implementation of the general method still behaves somewhat oddly for smaller numbers, but above 10 I had no problems with it.
Overcoming the issues and limits of this method/implementation:
For smaller numbers, you can just use the methods from the other answers. They generally use floats, which can lose precision, but for small integers that should be no problem at all. All of the float-based methods share the same (or nearly the same) limit.
If you still want to use this method and get float results, it should be trivial to convert it to use floats too. Note that doing so reintroduces precision loss and gives up this method's unique benefit over the others, in which case you can just use any of the other answers. I think the Newton's method version converges a bit faster, but I'm not sure.
For larger numbers, where loss of precision with floats comes into play, this method can give results closer to the actual answer (depending on how big the input is). If you want to work with non-integers in this range, you can use other types, for example fixed-precision numbers, in this method too.
Edit 2, on other answers:
Currently, and as far as I know, the only other answer with similar or better precision for large numbers than this implementation is the one that suggests SymPy, by Eric Duminil. That version is also easier to use and works for any kind of number; the only downside is that it requires SymPy. My implementation is free of any large dependencies, if that is what you are looking for.
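Since the general idea is said to extend to other roots, here is one possible sketch of an integer n-th root by binary search. It is not the author's code: the name `int_nth_root` and the choice of search bounds (0 and d + 1, which also cover small inputs) are assumptions.

```python
def int_nth_root(d: int, n: int) -> tuple[int, bool]:
    """Return (floor of the n-th root of d, whether it is exact) for d >= 0, n >= 1."""
    if d < 0 or n < 1:
        raise ValueError("expected d >= 0 and n >= 1")
    # Invariant maintained by the loop: low**n <= d < high**n
    low, high = 0, d + 1
    while high - low > 1:
        mid = (low + high) // 2
        if mid**n > d:
            high = mid
        else:
            low = mid
    return low, low**n == d

print(int_nth_root(27, 3))   # (3, True)
print(int_nth_root(10, 2))   # (3, False)
```

Only integer arithmetic is used, so the result stays exact for arbitrarily large inputs, at the cost of one big-integer power per iteration.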
Arbitrary precision square root
This variation uses string manipulations to convert a string which represents a decimal floating-point number to an int, calls math.isqrt to do the actual square root extraction, and then formats the result as a decimal string. math.isqrt rounds down, so all produced digits are correct.
The input string, num, must use plain float format: 'e' notation is not supported. The num string can be a plain integer, and leading zeroes are ignored.
The digits argument specifies the number of decimal places in the result string, i.e., the number of digits after the decimal point.
from math import isqrt

def str_sqrt(num, digits):
    """ Arbitrary precision square root
        num arg must be a string
        Return a string with `digits` after
        the decimal point
        Written by PM 2Ring 2022.01.26
    """
    int_part , _, frac_part = num.partition('.')
    num = int_part + frac_part
    # Determine the required precision
    width = 2 * digits - len(frac_part)
    # Truncate or pad with zeroes
    num = num[:width] if width < 0 else num + '0' * width
    s = str(isqrt(int(num)))
    if digits:
        # Pad, if necessary
        s = '0' * (1 + digits - len(s)) + s
        s = f"{s[:-digits]}.{s[-digits:]}"
    return s
Test
print(str_sqrt("2.0", 30))
Output
1.414213562373095048801688724209
For small numbers of digits, it's faster to use decimal.Decimal.sqrt. Around 32 digits or so, str_sqrt is roughly the same speed as Decimal.sqrt. But at 128 digits, str_sqrt is 2.2× faster than Decimal.sqrt, at 512 digits, it's 4.3× faster, at 8192 digits, it's 7.4× faster.
Here's a live version running on the SageMathCell server.
find square-root of a number
while True:
    num = int(input("Enter a number:\n>>"))
    for i in range(2, num):
        if num % i == 0:
            if i*i == num:
                print("Square root of", num, "==>", i)
                break
    else:
        # No integer root was found, so fall back to a float result
        kd = (num**0.5)  # (num**(1/2))
        print("Square root of", num, "==>", kd)
OUTPUT:-
Enter a number: 24
Square root of 24 ==> 4.898979485566356
Enter a number: 36
Square root of 36 ==> 6
Enter a number: 49
Square root of 49 ==> 7

How are data types interpreted, calculated, and/or stored?

In python, suppose the code is:
import math
a = math.sqrt(2.0)
if a * a == 2.0:
    x = 2
else:
    x = 1
This is a variant of "Floating Point Numbers are Approximations -- Not Exact".
Mathematically speaking, you are correct that sqrt(2) * sqrt(2) == 2. But sqrt(2) cannot be exactly represented as a native datatype (read: floating point number). (Heck, sqrt(2) is irrational, so its decimal expansion is guaranteed to be infinite!) It can get really close, but not exact:
>>> import math
>>> math.sqrt(2)
1.4142135623730951
>>> math.sqrt(2) * math.sqrt(2)
2.0000000000000004
Note the result is, in fact, not exactly 2.
If you want the x = 2 branch to execute, you will need to use an epsilon value of "is the result close enough?":
epsilon = 1e-6 # 0.000001
if abs(2.0 - a*a) < epsilon:
    x = 2
else:
    x = 1
Numbers with decimals are stored as floating point numbers and they can only be an approximation to the real number in some cases.
So your comparison needs to be not "are these two numbers exactly equal (==)" but "are they sufficiently close as to be considered equal".
Fortunately, the math library has a function that does this conveniently. Using math.isclose(), you can compare with a defined tolerance. The function isn't too complicated; you could write it yourself.
math.isclose(a*a, 2, abs_tol=0.0001)
>> True
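A slightly fuller sketch of how the two tolerance parameters of `math.isclose` behave; the specific tolerance values are arbitrary examples.

```python
import math

a = math.sqrt(2.0)

print(a * a == 2.0)                             # False
print(math.isclose(a * a, 2.0))                 # True with the default rel_tol=1e-09

# Near zero a relative tolerance never matches, so abs_tol must be given.
print(math.isclose(1e-12, 0.0))                 # False
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))   # True
```

The default comparison is purely relative (`abs_tol` defaults to 0.0), which is why comparisons against zero need an explicit absolute tolerance.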

numpy arange: how to make "precise" array of floats?

In short, the problem I encounter is this:
aa = np.arange(-1., 0.001, 0.01)
aa[-1]
Out[16]: 8.8817841970012523e-16
In practice, this causes a serious problem, since my simulation doesn't allow positive input values.
I can sort of get around by doing:
aa = np.arange(-100, 1, 1)/100.
aa[-1]
Out[21]: 0.0
But this is a pain. Practically you can't do this every time.
This seems like such a basic problem. There's gotta be something I am missing here.
By the way, I am using Python 2.7.13.
This happens because Python (like most modern programming languages) uses floating point arithmetic, which cannot exactly represent some numbers (see Is floating point math broken?).
This means that, regardless of whether you're using Python 2, Python 3, R, C, Java, etc. you have to think about the effects of adding two floating point numbers together.
np.arange works by repeatedly adding the step value to the start value, and this leads to imprecision in the end:
>>> start = -1
>>> for i in range(1000):
...     start += 0.001
...
>>> start
8.81239525796218e-16
Similarly:
>>> x = np.arange(-1., 0.001, 0.01)
>>> x[-1]
8.8817841970012523e-16
The typical pattern used to circumvent this is to work with integers whenever possible if repeated operations are needed. So, for example, I would do something like this:
>>> x = 0.01 * np.arange(-100, 0.1)
>>> x[-1]
0.0
Alternatively, you could create a one-line convenience function that will do this for you:
>>> def safe_arange(start, stop, step):
...     return step * np.arange(start / step, stop / step)
...
>>> x = safe_arange(-1, 0.001, 0.01)
>>> x[-1]
0.0
But note that even this can't get around the limits of floating point precision; for example, the number -0.99 cannot be represented exactly in floating point:
>>> val = -0.99
>>> print('{0:.20f}'.format(val))
-0.98999999999999999112
So you must always keep that in mind when working with floating point numbers, in any language.
Using np.linspace solved it for me:
For example, np.linspace(0.5, 0.9, 5) produces [0.5 0.6 0.7 0.8 0.9].
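Applied to the range from the question, a sketch might look like this. The count of 101 samples is an assumption derived from a step of 0.01 over the interval from -1 to 0.

```python
import numpy as np

# 101 evenly spaced samples from -1 to 0, both endpoints included
aa = np.linspace(-1.0, 0.0, 101)

print(aa[-1])             # 0.0
print((aa <= 0).all())    # True
```

With the default `endpoint=True`, the last element is the stop value itself, so it cannot drift to a small positive number.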
We don't get to forget about the limitations of floating-point arithmetic. Repeatedly adding 0.01, or rather the double-precision float that is close to 0.01, will result in the kind of effects you observe.
To ensure that an array does not contain positive numbers, use numpy.clip:
aa = np.clip(np.arange(-1., 0.001, 0.01), None, 0)
I have the same problem as you do.
Here is a simple solution:
for b in np.arange(maxb, minb+stepb, stepb):
    for a in np.arange(mina, maxa+stepa, stepa):
        a = round(a, 2)  # 2 is the number of decimal places to keep
        b = round(b, 2)

Measuring the accuracy of floating point results to N decimal places

I'm testing some implementations of Pi in python (64-bit OS) and am interested in measuring how accurate the answer is (how many decimal places were correct?) for increasing iterations. I don't wish to compare more than 15 decimal places because beyond that the floating point representation itself is inaccurate.
E.g. for a low iteration count, the answer I got is
>>> x
3.140638056205993
I wish to compare to math.pi
>>> math.pi
3.141592653589793
For the above I wish my answer to be 3 (3rd decimal is wrong)
The way I've done it is:
>>> p = str('%.51f' % math.pi)
>>> q = str('%.51f' % x)
>>> for i,(a,b) in enumerate(zip(p,q)):
...     if a != b:
...         break
...
The above looks clumsy to me, i.e. converting floats to strings and then comparing character by character, is there a better way of doing this, say more Pythonic or that uses the raw float values themselves?
Btw I found math.frexp, can this be used to do this?
>>> math.frexp(x)
(0.7851595140514982, 2)
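You can compute the logarithm of the difference between the two:
>>> val = 3.140638056205993
>>> epsilon = abs(val - math.pi)
>>> abs(int(math.log(epsilon, 10))) + 1
4
Essentially, you're finding which power of 10 it takes to equal the difference between the two numbers. This only works if the difference between the two numbers is less than 1. Note that it measures the size of the error rather than comparing digits: here the difference is about 0.00095, just under 0.001, so the result is 4 even though the third decimal digit (0 versus 1) already differs.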
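If the goal is to compare digits one by one without going through strings, one possible sketch scales both values by powers of ten and compares the integer parts. The name `matching_decimals` and the cap of 15 places are assumptions, and the function expects positive inputs with the same integer part.

```python
import math

def matching_decimals(value, reference, max_places=15):
    """Count the leading decimal places on which two positive floats agree."""
    for places in range(1, max_places + 1):
        scale = 10 ** places
        if math.floor(value * scale) != math.floor(reference * scale):
            return places - 1
    return max_places

x = 3.140638056205993
print(matching_decimals(x, math.pi))   # 2, so the 3rd decimal is the first wrong one
```

The multiplication itself is subject to floating-point rounding, so results for the last one or two of the 15 places should be treated with caution.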

Normalize Small Probabilities in Python

I have a list of probabilities, which I need to normalize to equal 1.0.
e.g. probs = [0.01,0.03,0.005]
I realize that this is done by dividing each probability by the sum of probs. However, if the probabilities become really small, Python will tell me that sum(probs)=0.0. I understand that this is an underflow issue. I suppose I should use the log of each probability. How would I do this?
The sum of even very small floating point values will never truly be 0; they may be close to zero, but can never be exactly zero.
Just divide 1 by their sum, and multiply the probabilities by that factor:
def normalize(probs):
    prob_factor = 1 / sum(probs)
    return [prob_factor * p for p in probs]
Some probabilities may make up but a very small percentage in the total sum, of course, and that percentage may approach zero. But this just means that when normalising you may end up with normalized probabilities that are either very close to zero, or if smaller than the smallest representable floating point value, equal to zero. The latter only happens if there are probabilities in the list that are so much smaller than the others that they no longer represent anything close to something that'll ever occur.
Demo:
>>> def normalize(probs):
...     prob_factor = 1 / sum(probs)
...     return [prob_factor * p for p in probs]
...
>>> normalize([0.0000000001,0.000000000003,0.000000000000005])
[0.9708266589000533, 0.029124799767001597, 4.854133294500266e-05]
And the extreme case:
>>> import sys
>>> normalize([sys.float_info.max, sys.float_info.min])
[0.9999999999999999, 0.0]
>>> normalize([sys.float_info.max, sys.float_info.min])[-1] == 0
True
You can always use a scale factor to avoid the underflow problem, either manually entered or automatically calculated, e.g.:
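import math

# Ignore exact zeros when looking for the smallest probability
no_z = [x for x in probs if x > 0.0]
if len(no_z) == 0:
    raise ValueError("Unable to calculate with 0.0 as all the probabilities")
# Number of decimal orders of magnitude to scale up by (never scale down)
order = int(-math.log10(min(no_z)))
if order < 0:
    order = 0
sf = 10**order
scaled = [x * sf for x in probs]
tot = sum(scaled)
norm = [x/tot for x in scaled]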
Of course you would probably be better off just using bigfloat or numpy and doing high precision maths.
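Regarding the log-based idea raised in the question, a common way to do it is to keep the probabilities as logarithms and subtract the largest one before exponentiating. The sketch below assumes the inputs are natural logs; the name `normalize_log` and the sample values are made up for illustration.

```python
import math

def normalize_log(log_probs):
    """Normalize probabilities given as natural logs; returns plain probabilities."""
    m = max(log_probs)
    # Subtracting the maximum makes the largest term exp(0) == 1,
    # so the sum cannot underflow to zero.
    weights = [math.exp(lp - m) for lp in log_probs]
    total = sum(weights)
    return [w / total for w in weights]

# exp(-1000) underflows to 0.0 as a float, but its logarithm is harmless.
log_probs = [-1000.0, -1001.0, -1002.0]
normalized = normalize_log(log_probs)
print(normalized)
```

Only the ratios between the probabilities matter for normalization, which is why shifting every log by the same amount leaves the result unchanged.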
