I have a lot of measurements, with measurement uncertainties. To speed up the process of reporting all the measurements and their uncertainties, I'm writing a script to print them for me.
For example, when I have measurement x = 0.012345 with uncertainty dx = 0.000321, I want my Python script to print '0.0123(3)', with (3) being the uncertainty on the last digit of the rounded x.
I feel like this should be very easy, but so far, I'm coming up with incredibly ugly solutions, iterating over numbers as a string. What would be a good pythonic way to do this?
from math import log10, floor
x = 0.012345
dx = 0.000321
print('{:.{prec}f}({:.0f})'.format(x, floor(dx / (10**floor(log10(dx)))), prec=-floor(log10(dx))))
This code prints
0.0123(3)
You can do it like this:
print("{:.4f}({:.0f})".format(x, dx*10**4))
However, that will only work if your error is on the order of 10**-4; otherwise it will output more digits, and if it is smaller it will output (0). So you might have to deal with the powers of your values. This works for your specific example; I'm not sure how flexible it is supposed to be, but it could be a start for an idea on how to do it.
Based on the very clever solution from Yaroslav Kornachevskyi, I wrote a little function to do this for all numbers, rounding correctly. Probably nobody will ever use this but me:
from math import log10, floor, ceil
x = 0.012345
dx = 0.0000968
def get_number(x, dx):
    """Return a string of the measurement value
    together with the measurement error.

    x:  measurement value
    dx: measurement error
    """
    # Power of dx
    power_err = log10(dx)
    # Digits of dx in format a.bcd
    n_err = dx / (10**floor(power_err))
    # If the second digit in dx is >= 5,
    # round the 1st digit in dx up
    if n_err % 1 >= 0.5:
        # If the first digit of dx is 9, rounding up
        # carries over and the precision is one digit less
        if int(n_err) == 9:
            err = 1
            # The precision of x is determined by the precision of dx
            prec = int(-floor(log10(dx))) - 1
        else:
            err = ceil(n_err)
            # The precision of x is determined by the precision of dx
            prec = int(-floor(log10(dx)))
    # Otherwise round down
    else:
        err = floor(n_err)
        # The precision of x is determined by the precision of dx
        prec = int(-floor(log10(dx)))
    return '{:.{prec}f}({:.0f})'.format(x, err, prec=prec)
print(get_number(x, dx))
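With the question's original values (x = 0.012345, dx = 0.000321) this should print 0.0123(3); with the dx = 0.0000968 used above it should print 0.0123(1), since 9.68e-5 rounds up to 1e-4.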
Recently we encountered an issue with math.log(). Since 243 is a perfect power of 3, the assumption that taking the floor should be fine was wrong, as the result seems to have a precision error on the lower side.
So as a hack we started adding a small value before taking the logarithm. Is there a way to configure math.log upfront, or something similar, so that we don't have to add EPS every time?
To clarify some of the comments: note that we are not looking to round to the nearest integer. Our goal is to keep the value exact, or at times take the floor. But if the precision error is on the lower side, the floor screws up big time; that's what we are trying to avoid.
code:
import math
math.log(243, 3)
int(math.log(243, 3))
output:
4.999999999999999
4
code:
import math
EPS = 1e-09
math.log(243 + EPS, 3)
int(math.log(243 + EPS, 3))
output:
5.0000000000037454
5
Instead of trying to patch up math.log, it might be easier to solve this iteratively, taking advantage of Python's arbitrary-precision integers. This way you can avoid the float domain, and its associated precision loss, entirely.
Here's a rough attempt:
def ilog(a: int, p: int) -> tuple[int, bool]:
    """
    find the largest b such that p ** b <= a
    return tuple of (b, exact)
    """
    if p == 1:
        return a, True
    b = 0
    x = 1
    while x < a:
        x *= p
        b += 1
    if x == a:
        return b, True
    else:
        return b - 1, False
There are plenty of opportunities for optimization if this is too slow (consider Newton's method, binary search...)
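For instance, a binary search on the exponent keeps everything in exact integer arithmetic while cutting the number of multiplications. A rough sketch (the function name is mine, not part of the answer above):
def ilog_bsearch(a: int, p: int) -> tuple[int, bool]:
    """Binary-search variant: largest b with p ** b <= a, plus an exactness flag."""
    if p == 1:
        return a, True
    hi = 1
    while p ** hi <= a:        # grow an exponent that overshoots a
        hi *= 2
    lo = 0
    while lo < hi - 1:         # invariant: p ** lo <= a < p ** hi
        mid = (lo + hi) // 2
        if p ** mid <= a:
            lo = mid
        else:
            hi = mid
    return lo, p ** lo == a

print(ilog_bsearch(243, 3))    # (5, True)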
How about this? Is this what you are looking for?
import math

def ilog(a: int, p: int) -> int:
    """
    find the largest b such that p ** b <= a
    """
    float_log = math.log(a, p)
    if p ** (candidate := math.ceil(float_log)) <= a:
        return candidate
    return int(float_log)

print(ilog(243, 3))
print(ilog(3**31, 3))
print(ilog(8, 2))
Output:
5
31
3
You can use decimals and play with precision and rounding instead of floats in this case
Like this:
from decimal import Decimal, Context, ROUND_HALF_UP, ROUND_HALF_DOWN
ctx1 = Context(prec=20, rounding=ROUND_HALF_UP)
ctx2 = Context(prec=20, rounding=ROUND_HALF_DOWN)
ctx1.divide(Decimal(243).ln(ctx1), Decimal(3).ln(ctx2))
Output:
Decimal('5')
First, the rounding works like the epsilon: the numerator is rounded up and the denominator down, so you always get a slightly higher answer.
Second, you can adjust the precision you need.
However, fundamentally the problem is unsolvable.
Let's consider this situation:
from math import sqrt
x = sqrt(19) # x : 4.358898943540674
print("{:.4f}".format(x))
# I don't want to get 4.3589
# I want to get 4.3588
The print() function rounds the number automatically, but I don't want this. What should I do?
If you want to round the number down to the 4th decimal place rather than round it to the nearest possibility, you could do the rounding yourself.
x = int(x * 10**4) / 10**4
print("{:.4f}".format(x))
This gives you
4.3588
Multiplying and later dividing by 10**4 shifts the number 4 decimal places, and the int function rounds down to an integer. Combining them all accomplishes what you want. There are some edge cases that will give an unexpected result due to floating point issues, but those will be rare.
Here is one way; the truncate function is courtesy of @user648852.
from math import sqrt, floor
def truncate(f, n):
    return floor(f * 10 ** n) / 10 ** n
x = sqrt(19) # x : 4.358898943540674
print("{0}".format(truncate(x, 4)))
# 4.3588
Do more work initially and cut away a fixed number of excess digits:
from math import sqrt
x = sqrt(19) # x : 4.358898943540674
print(("{:.9f}".format(x))[:-5])
gives the desired result. This could still fail if x has the form ?.????999996 or similar, but the density of these numbers is rather small.
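Another option, not used in the answers above, is to let the decimal module do the truncation explicitly, which avoids the string slicing altogether. A minimal sketch:
from decimal import Decimal, ROUND_DOWN
from math import sqrt

x = sqrt(19)  # x : 4.358898943540674
# quantize with ROUND_DOWN truncates toward zero at the 4th decimal place
print(Decimal(x).quantize(Decimal("0.0001"), rounding=ROUND_DOWN))  # 4.3588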
I'm trying to write a program to look for a number, n, between 0 and 100 such that n! + 1 is a perfect square. I'm trying to do this because I know there are only three so it was meant as a test of my Python ability.
Refer to Brocard's problem.
math.sqrt always returns a float, even if that float happens to be, say, 4.0. As the docs say, "Except when explicitly noted otherwise, all return values are floats."
So, your test for type(math.sqrt(x)) == int will never be true.
You could try to work around that by checking whether the float represents an integer, like this:
sx = math.sqrt(x)
if round(sx) == sx:
There's even a built-in method that does this as well as possible:
if sx.is_integer():
But keep in mind that float values are not a perfect representation of real numbers, and there are always rounding issues. For a large enough number, the sqrt might round to an integer even though the number really wasn't a perfect square: for example, math.sqrt(10000000000**2 + 1).is_integer() is True, even though that number is obviously not a perfect square.
I could tell you whether this is safe within your range of values, but can you convince yourself? If not, you shouldn't just assume that it is.
So, is there a way we can check that isn't affected by float rounding issues? Sure, we can use integer arithmetic to check:
sx = int(round(math.sqrt(x)))
if sx*sx == x:
But, as Stefan Pochmann points out, even if this check is safe, does that mean the whole algorithm is? No; sqrt itself could have already been rounded to the point where you've lost integer precision.
So, you need an exact sqrt. You could do this by using decimal.Decimal with a huge configured precision. This will take a bit of work, and a lot of memory, but it's doable. Like this:
import decimal

decimal.getcontext().prec = ENOUGH_DIGITS
sx = decimal.Decimal(x).sqrt()
But how many digits is ENOUGH_DIGITS? Well, how many digits do you need to represent 100!+1 exactly?
So:
import decimal
import math

decimal.getcontext().prec = 156
n = 0
while n <= 100:
    x = math.factorial(n) + 1
    sx = decimal.Decimal(x).sqrt()
    if int(sx) ** 2 == x:
        print(sx)
    n = n + 1
If you think about it, there's a way to reduce the needed precision to 79 digits, but I'll leave that as an exercise for the reader.
The way you're presumably supposed to solve this is by using purely integer math. For example, you can find out whether an integer is a square in logarithmic time just by using Newton's method until your approximation error is small enough to just check the two bordering integers.
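A minimal sketch in that spirit, using an all-integer Newton iteration for the floor of the square root (the helper name is mine, not from the answer):
def is_square_newton(n: int) -> bool:
    """Exact perfect-square test via an integer Newton iteration for floor(sqrt(n))."""
    if n < 0:
        return False
    if n < 2:
        return True
    x = n
    while True:
        y = (x + n // x) // 2      # Newton step, done entirely in integer arithmetic
        if y >= x:                 # iterates stopped decreasing: x is floor(sqrt(n))
            break
        x = y
    return x * x == n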
For very large numbers it's better to avoid using floating point square roots altogether because you will run into too many precision issues and you can't even guarantee that you will be within 1 integer value of the correct answer. Fortunately Python natively supports integers of arbitrary size, so you can write an integer square root checking function, like this:
def isSquare(x):
    if x == 1:
        return True
    low = 0
    high = x // 2
    root = high
    while root * root != x:
        root = (low + high) // 2
        if low + 1 >= high:
            return False
        if root * root > x:
            high = root
        else:
            low = root
    return True
Then you can run through the integers from 0 to 100 like this:
import math

n = 0
while n <= 100:
    x = math.factorial(n) + 1
    if isSquare(x):
        print n
    n = n + 1
Here's another version working only with integers, computing the square root by adding decreasing powers of 2, for example intsqrt(24680) will be computed as 128+16+8+4+1.
def intsqrt(n):
    pow2 = 1
    while pow2 < n:
        pow2 *= 2
    sqrt = 0
    while pow2:
        if (sqrt + pow2) ** 2 <= n:
            sqrt += pow2
        pow2 //= 2
    return sqrt

factorial = 1
for n in range(1, 101):
    factorial *= n
    if intsqrt(factorial + 1) ** 2 == factorial + 1:
        print(n)
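On Python 3.8+ the standard library already ships an exact integer square root, math.isqrt, which makes the whole check a one-liner. A minimal sketch:
import math

def is_square(x: int) -> bool:
    # math.isqrt returns the exact floor of the square root, using only integer arithmetic
    r = math.isqrt(x)
    return r * r == x

for n in range(101):
    if is_square(math.factorial(n) + 1):
        print(n)  # prints 4, 5, 7 (see Brocard's problem)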
The number math.sqrt returns is never an int, even if its value is a whole number. See: How to check if a float value is a whole number.
I'm solving a one-dimensional non-linear equation with Newton's method. I'm trying to figure out why one of the implementations of Newton's method converges exactly within floating point precision, whereas another does not.
The following algorithm does not converge:
x_{n+1} = x_n - f(x_n) / f'(x_n)
whereas the following does converge:
x_{n+1} = (x_n * f'(x_n) - f(x_n)) / f'(x_n)
(The two updates are algebraically identical; they differ only in how they are evaluated in floating point.)
You may assume that the functions f and f' are smooth and well behaved. The best explanation I was able to come up with is that this is somehow related to what's called iterative improvement (Golub and Van Loan, 1989). Any further insight would be greatly appreciated!
Here is a simple Python example illustrating the issue:
# Python
def f(x):
    return x*x - 2.

def fp(x):
    return 2.*x

xprev = 0.

# converges
x = 1.  # guess
while x != xprev:
    xprev = x
    x = (x*fp(x) - f(x))/fp(x)
    print(x)

# does not converge
x = 1.  # guess
while x != xprev:
    xprev = x
    dx = -f(x)/fp(x)
    x = x + dx
    print(x)
Note: I'm aware of how floating point numbers work (please don't post your favourite link to a website telling me to never compare two floating point numbers). Also, I'm not looking for a solution to a problem but for an explanation as to why one of the algorithms converges but not the other.
Update:
As @uhoh pointed out, there are many cases where the second method does not converge. However, I still don't know why the second method converges so much more easily in my real world scenario than the first. All the test cases have very simple functions f, whereas the real world f has several hundred lines of code (which is why I don't want to post it). So maybe the complexity of f is important. If you have any additional insight into this, let me know!
None of the methods is perfect:
One situation in which both methods will tend to fail is if the root is about exactly midway between two consecutive floating-point numbers f1 and f2. Then both methods, having arrived to f1, will try to compute that intermediate value and have a good chance of turning up f2, and vice versa.
                /  f(x)
               /
              /
             /
            /
  f1       /
---+--------------+-----> x
          /       f2
         /
        /
       /
"I'm aware of how floating point numbers work...". Perhaps the workings of floating-point arithmetic are more complicated than imagined.
This is a classic example of cycling of iterates using Newton's method. The comparison of a difference to an epsilon is "mathematical thinking" and can burn you when using floating-point. In your example, you visit several floating-point values for x, and then you are trapped in a cycle between two numbers. The "floating-point thinking" is better formulated as the following (sorry, my preferred language is C++)
std::set<double> visited;
double xprev = 0.0;
double x = 1.0;
while (x != xprev)
{
    xprev = x;
    double dx = -F(x) / DF(x);
    x = x + dx;
    if (visited.find(x) != visited.end())
    {
        break; // found a cycle
    }
    visited.insert(x);
}
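For reference, a rough Python translation of the same cycle-detection idea (the function name is mine; f and fp are as in the question's example):
def newton_with_cycle_detection(f, fp, x0):
    visited = set()
    xprev, x = None, x0
    while x != xprev:
        xprev = x
        x = x - f(x) / fp(x)
        if x in visited:
            break              # trapped in a cycle of iterates
        visited.add(x)
    return x

print(newton_with_cycle_detection(f, fp, 1.0))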
I'm trying to figure out why one of the implementations of Newton's method is converging exactly within floating point precision, whereas another is not.
Technically, it doesn't converge to the correct value. Try printing more digits, or using float.hex.
The first one gives
>>> print "%.16f" % x
1.4142135623730949
>>> float.hex(x)
'0x1.6a09e667f3bccp+0'
whereas the correctly rounded value is the next floating point value:
>>> print "%.16f" % math.sqrt(2)
1.4142135623730951
>>> float.hex(math.sqrt(2))
'0x1.6a09e667f3bcdp+0'
The second algorithm is actually alternating between the two values, so doesn't converge.
The problem is due to catastrophic cancellation in f(x): as x*x will be very close to 2, when you subtract 2, the result will be dominated by the rounding error incurred in computing x*x.
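To make the cancellation concrete, here is a small illustration of my own (not from the answer): evaluating f at the two neighbouring floats around sqrt(2) gives residuals that are pure rounding noise, with opposite signs.
import math

x = math.sqrt(2.0)           # the correctly rounded value of sqrt(2)
y = math.nextafter(x, 0.0)   # the neighbouring float just below it (Python 3.9+)
print(x*x - 2.0)             # small positive residual, dominated by rounding error in x*x
print(y*y - 2.0)             # the neighbour's residual is small and negative
The sign flip between the two neighbours is what makes the x + dx update bounce back and forth instead of settling.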
I think trying to force an exact equal (instead of err < small) is always going to fail frequently. In your example, for 100,000 random numbers between 1 and 10 (instead of your 2.0) the first method fails about 1/3 of the time, the second method about 1/6 of the time. I'll bet there's a way to predict that!
This takes ~30 seconds to run, and the results are cute!:
def f(x, a):
    return x*x - a

def fp(x):
    return 2.*x

def A(a):
    xprev = 0.
    x = 1.
    n = 0
    while x != xprev:
        xprev = x
        x = (x * fp(x) - f(x, a)) / fp(x)
        n += 1
        if n > 100:
            return n, x
    return n, x

def B(a):
    xprev = 0.
    x = 1.
    n = 0
    while x != xprev:
        xprev = x
        dx = - f(x, a) / fp(x)
        x = x + dx
        n += 1
        if n > 100:
            return n, x
    return n, x

import numpy as np
import matplotlib.pyplot as plt

n = 100000
aa = 1. + 9. * np.random.random(n)

data_A = np.zeros((2, n))
data_B = np.zeros((2, n))
for i, a in enumerate(aa):
    data_A[:, i] = A(a)
    data_B[:, i] = B(a)

bins = np.linspace(0, 110, 12)
hist_A = np.histogram(data_A, bins=bins)
hist_B = np.histogram(data_B, bins=bins)

print "A: n<10: ", hist_A[0][0], " n>=100: ", hist_A[0][-1]
print "B: n<10: ", hist_B[0][0], " n>=100: ", hist_B[0][-1]

plt.figure()
plt.subplot(1, 2, 1)
plt.scatter(aa, data_A[0])
plt.subplot(1, 2, 2)
plt.scatter(aa, data_B[0])
plt.show()
I have a list of probabilities, which I need to normalize to equal 1.0.
e.g. probs = [0.01,0.03,0.005]
I realize that this is done by dividing each probability by the sum of probs. However, if the probabilities become really small, Python will tell me that sum(probs)=0.0. I understand that this is an underflow issue. I suppose I should use the log of each probability. How would I do this?
The sum of even very small floating point values will never truly be 0; they may be close to zero, but can never be exactly zero.
Just divide 1 by their sum, and multiply the probabilities by that factor:
def normalize(probs):
    prob_factor = 1 / sum(probs)
    return [prob_factor * p for p in probs]
Some probabilities may make up but a very small percentage in the total sum, of course, and that percentage may approach zero. But this just means that when normalising you may end up with normalized probabilities that are either very close to zero, or if smaller than the smallest representable floating point value, equal to zero. The latter only happens if there are probabilities in the list that are so much smaller than the others that they no longer represent anything close to something that'll ever occur.
Demo:
>>> def normalize(probs):
...     prob_factor = 1 / sum(probs)
...     return [prob_factor * p for p in probs]
...
>>> normalize([0.0000000001,0.000000000003,0.000000000000005])
[0.9708266589000533, 0.029124799767001597, 4.854133294500266e-05]
And the extreme case:
>>> import sys
>>> normalize([sys.float_info.max, sys.float_info.min])
[0.9999999999999999, 0.0]
>>> normalize([sys.float_info.max, sys.float_info.min])[-1] == 0
True
You can always use a scale factor to avoid the underflow problem, either manually entered or automatically calculated, e.g.:
import math
no_z = [x for x in probs if x > 0.0]
if len(no_z) == 0:
    print "Unable to calculate with 0.0 as all the probabilities"
order = int(-math.log10(min(no_z)))
if order < 0:
    order = 0
sf = 10**order
scaled = [x * sf for x in probs]
tot = sum(scaled)
norm = [x/tot for x in scaled]
Of course you would probably be better off just using bigfloat or numpy and doing high precision maths.
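Since the question asks about working with logs: one standard way, sketched below (not taken from the answers above), is to keep log-probabilities throughout and normalise with the log-sum-exp trick, so nothing ever underflows.
import math

def normalize_from_logs(log_probs):
    # Shift by the maximum so the largest exponent is 0; this avoids underflow in exp()
    m = max(log_probs)
    log_total = m + math.log(sum(math.exp(lp - m) for lp in log_probs))
    return [math.exp(lp - log_total) for lp in log_probs]

probs = [0.01, 0.03, 0.005]
print(normalize_from_logs([math.log(p) for p in probs]))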