Numerical accuracy loss in Python

I wish to calculate the standard error of a series of numbers. Suppose the numbers are x[i] where i = 1 ... N. To do this
I set
averageX = 0.0
averageXSquared = 0.0
I then loop over all i=1,...N and for each I calculate
averageX += x[i]
averageXSquared += x[i]**2
I then divide by N
averageX = averageX / N
averageXSquared = averageXSquared / N
I then take the square root of the difference
stdX = math.sqrt(averageXSquared - averageX * averageX)
The argument here is sure to always be >=0.
However if I set all x[i] = 0.07 (for example) then I get a math domain error as the argument of the root function is negative. There seems to be some loss of precision.
The argument is of the order of 10e-15.
This does not look encouraging. I now have to check whether the argument is negative myself before taking the root.
Or have I done something wrong?

This is not a python problem, but a problem with finite precision in general. If you set all numbers to the same value, the standard error is mathematically 0, but not for a computer. A simple way to handle this is to clamp values that come out slightly below 0 to 0.
import math
x = [0.7, 0.7, 0.7]
average = sum(x) / len(x)
sqav = sum(y**2 for y in x) / len(x)
stderr = math.sqrt(max(sqav - average**2, 0))

The better way, of course, is to never subtract nearly equal numbers in the first place. Make another pass over the data, which guarantees non-negativity (a little algebra shows the result is mathematically the same):
y = [ v - average for v in x ]
dev = sum(v*v for v in y) / len(x)
stderr = math.sqrt(dev)
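For illustration, a quick check on the all-equal case from the question, comparing the raw one-pass difference with the clamped and the two-pass versions (a minimal sketch; the exact rounding noise will vary):
import math

x = [0.07] * 10
N = len(x)
avg = sum(x) / N
avg_sq = sum(v * v for v in x) / N

print(avg_sq - avg * avg)                       # rounding noise on the order of 1e-17, possibly negative
print(math.sqrt(max(avg_sq - avg * avg, 0.0)))  # clamping makes the sqrt safe either way
centred = sum((v - avg) ** 2 for v in x) / N    # second pass over centred values
print(math.sqrt(centred))                       # non-negative by construction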


How do I evaluate this equation in z3 for python

I'm trying to evaluate a simple absolute value inequality like this using z3.
x = Int("x")
y = Int("y")
def abs(x):
    return If(x >= 0, x, -x)
solve(abs(x / 1000 - y / 1000) < .01, y == 1000)
The output is no solution every time. I know this is mathematically possible, I just can't figure out how z3 does stuff like this.
This is a common gotcha in z3py bindings. Constants are "promoted" to fit into the right type, following the usual Python methodology. But more often than not, it ends up doing the wrong conversion, and you end up with a very confusing situation.
Since your variables x and y are Int values, the comparison against .01 forces that constant to be 0 to fit the types, and that's definitely not what you wanted to say. The general advice is simply not to mix-and-match arithmetic like this: Cast this as a problem over real-values, not integers. (In general SMTLib doesn't allow mixing-and-matching types in numbers, though z3py does. I think that's misguided, but that's a different discussion.)
To address your issue, the simplest thing to do would be to wrap 0.01 into a real-constant, making the z3py bindings interpret it correctly. So, you'll have:
from z3 import *
x = Int("x")
y = Int("y")
def abs(x):
    return If(x >= 0, x, -x)
solve(abs(x / 1000 - y / 1000) < RealVal(.01), y == 1000)
Note the use of RealVal. This returns:
[x = 1000, y = 1000]
I guess this is what you are after.
But I'd, in general, recommend against using conversions like this. Instead, be very explicit yourself, and cast this as a problem, for instance, over Real values. Note that your division / 1000 is also interpreted in this equation as an integer division, i.e., one that produces an integer result. So, I'm guessing this isn't really what you want either. But I hope this gets you started on the right path.
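For completeness, here is a sketch of the all-Real formulation recommended above (the helper is renamed abs_val to avoid shadowing Python's built-in abs; over the reals, / 1000 is true division and the 0.01 bound is interpreted as a real constant rather than being truncated to 0):
from z3 import *

x = Real("x")
y = Real("y")

def abs_val(v):               # same If-based absolute value as in the question
    return If(v >= 0, v, -v)

solve(abs_val(x / 1000 - y / 1000) < 0.01, y == 1000)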
Int('a') < 0.01 is turned (rightly or wrongly) into Int('a') < 0 and clearly the absolute value can never be smaller than 0.
I believe you want Int('a') <= 0 here.
Examples:
solve(Int('a') < 0.01, Int('a') > -1)
no solution
solve(Int('a') <= 0.01, Int('a') > -1)
[a = 0]
And indeed, Int('a') < 0.01 simplifies to a < 0.

How to determine a proportionately decreasing weight given a list of N elements?

I have a list composed of N elements. For context, I am making a time series forecast, and, once the forecasts have been made, would like to weight the forecasts made at the beginning as more important than the later ones. This is useful because when I calculate performance error scores (MAPE), the score will then reflect both the forecasts per item and the way I want to distinguish good models from bad ones.
How should I update my existing function in order to take any list of elements (N) in order to generate these steadily decreasing weights?
Here is the function that I have come up with on my own. It works for examples like compute_equal_perc(5), but not for other combinations...
def compute_equal_perc(rng):
    perc_allocation = []
    equal_perc = 1 / rng
    half_rng = rng / 2
    step_val = equal_perc / (rng - 1)
    print(step_val)
    for x in [v for v in range(0, rng)]:
        if x == int(half_rng):
            perc_allocation.append(equal_perc)
        elif x < int(half_rng):
            diff_plus = ((abs(int(half_rng) - x) * step_val)) + equal_perc
            perc_allocation.append(round(float(diff_plus), 3))
        elif x >= int(half_rng):
            diff_minus = equal_perc - ((abs(int(half_rng) - x) * step_val))
            perc_allocation.append(round(float(diff_minus), 3))
    return perc_allocation
For compute_equal_perc(5), the output that I get is:
[0.3, 0.25, 0.2, 0.15, 0.1]
The sum of this sequence should always equal 1, and the increments between values should always be equal.
This can be solved through the application of basic algebra. An arithmetic sequence is defined as
A[i] = a + b*i, for i = 0, 1, 2, 3, ... where a is the initial term and b is the common difference
The sum of a sequence of elements 0 through n is
S = (A[0] + A[n]) * (n+1) / 2
in words, the sum of the first and last terms, times half the number of terms.
Since you know S and n, you need only decide one more "spread" factor to generate your sequence. The mean element must be 1/n -- this is where your algorithm is wrong, as it fumbles this computation for even values of n.
Your code fails in this coupling of statements:
half_rng = rng / 2
step_val = equal_perc / (rng - 1)
# comparing x to int(half_rng)
If rng is even, you assign the mean value to position rng/2, giving you something such as the list for 4 elements:
[0.417, 0.333, 0.25, 0.167]
This means that you have two elements larger than the desired mean, and only one smaller, forcing the sum over 1.0. Instead, when you have an even quantity of elements, you have to make the mean a "phantom" middle element, and take half-steps around it. Let's look at this with fractions: you already have
[5/12, 4/12, 3/12, 2/12]
Your common difference is 1/12, i.e. 1 / (n * (n-1)), and you need to shift these values lower by half a step. With the spread you've chosen (1/12), the solution is therefore to start a half-step to the side: subtract 1/24 from each element.
[9/24, 7/24, 5/24, 3/24]
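Putting the half-step idea together, here is one sketch that generates the equal-step weights for any n, odd or even, directly from the mean 1/n and the step 1/(n*(n-1)) (exact fractions are used so the sum-to-1 property is obvious):
from fractions import Fraction

def equal_step_weights(n):
    step = Fraction(1, n * (n - 1))   # common difference between consecutive weights
    mean = Fraction(1, n)             # the (possibly phantom) middle element
    return [mean + (Fraction(n - 1, 2) - i) * step for i in range(n)]

print([float(w) for w in equal_step_weights(5)])   # [0.3, 0.25, 0.2, 0.15, 0.1]
print([float(w) for w in equal_step_weights(4)])   # [0.375, 0.2916..., 0.2083..., 0.125]
print(sum(equal_step_weights(4)))                  # 1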
You could also change your step with a simple linear factor. Decide on the ratio you want for your elements in simple integers, such as 5:4:3:2, and then generate your weights from the obvious sum of 5+4+3+2:
[5/14, 4/14, 3/14, 2/14]
Note that this works with any arithmetic sequence of integers, another way of choosing your "spread". If you use 4:3:2:1 you get
[4/10, 3/10, 2/10, 1/10]
or you can cluster them more closely with, say, 13:12:11:10
[13/46, 12/46, 11/46, 10/46]
So ... pick the spread you want and simplify your code to take advantage of that.
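And a sketch of the integer-ratio approach: pick the spread as consecutive integers ending at some start value, then normalise by their sum (the start parameter is just an illustrative knob; start=2 with n=5 reproduces the 6:5:4:3:2 weights behind the question's expected output):
from fractions import Fraction

def ratio_weights(n, start=1):
    terms = [start + n - 1 - i for i in range(n)]   # e.g. 4, 3, 2, 1 for n=4, start=1
    total = sum(terms)
    return [Fraction(t, total) for t in terms]

print([float(w) for w in ratio_weights(4)])            # [0.4, 0.3, 0.2, 0.1]
print([float(w) for w in ratio_weights(5, start=2)])   # [0.3, 0.25, 0.2, 0.15, 0.1]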

Avoid underflow using exp and minimum positive float128 in numpy

I am trying to calculate the following ratio:
w(i) / sum(w(j)), where the w are updated using an exponentially decreasing function, i.e. w(i) = w(i) * exp(-k), k being a positive parameter. All the numbers are non-negative.
This ratio is then used in a formula (multiplied by a constant, with another constant added). As expected, I soon run into underflow problems.
I guess this happens often, but can someone give me some references on how to deal with it? I did not find an appropriate transformation, so one thing I tried was to set some minimum positive number as a safety threshold, but I did not manage to find the minimum positive float (I am representing numbers in numpy.float128). How can I actually get the minimum positive such number on my machine?
The code looks like this:
w = np.ones(n, dtype='float128')
lt = np.ones(n)
for t in range(T):
    p = (1-k) * w / w.sum() + (k/n)
    # Process a subset of the n elements, call it set I, j is some range()
    for i in I:
        s = p[list(j[i])].sum()
        lt /= s
        w[s] *= np.exp(-k * lt)
where k is some constant in (0,1) and n is the length of the array
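As for the machine-limits part of the question, numpy exposes them through np.finfo (a quick check; note that np.float128 only exists on platforms that provide an extended-precision type):
import numpy as np

fi = np.finfo(np.float128)
print(fi.tiny)   # smallest positive normal number for this type
print(fi.eps)    # machine epsilon
print(fi.max)    # largest representable value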
When working with exponentially small numbers it's usually better to work in log space. For example, log(w*exp(-k)) = log(w) - k, which won't have any over/underflow problems unless k is itself exponentially large or w is zero. And, if w is zero, numpy will correctly return -inf. Then, when doing the sum, you factor out the largest term:
log_w = np.log(w) - k
max_log_w = np.max(log_w)
# Individual terms in the following may underflow, but then they wouldn't
# contribute to the sum anyways.
log_sum_w = max_log_w + np.log(np.sum(np.exp(log_w - max_log_w)))
log_ratio = log_w - log_sum_w
This probably isn't exactly what you want since you could just factor out the k completely (assuming it's a constant and not an array), but it should get you on your way.
Scikit-learn implements a similar thing with extmath.logsumexp, but it's basically the same as the above.
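These days the same helper also lives in scipy as scipy.special.logsumexp, so the normalised ratio can be written compactly (a sketch with toy weights spanning many orders of magnitude):
import numpy as np
from scipy.special import logsumexp

k = 0.5
w = np.exp(-np.arange(5) * 50.0)      # toy weights, from 1 down to ~1e-87
log_w = np.log(w) - k                 # log of w * exp(-k), no underflow
log_ratio = log_w - logsumexp(log_w)  # log of w[i] / sum(w)
print(np.exp(log_ratio))              # exponentiate only at the very end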

Different implementations of Newton's method in floating point arithmetic

I'm solving a one-dimensional non-linear equation with Newton's method. I'm trying to figure out why one of the implementations of Newton's method converges exactly within floating point precision, whereas the other does not.
The following update does not converge:
x_new = x - f(x) / f'(x)
whereas the following, algebraically equivalent, update does converge:
x_new = (x * f'(x) - f(x)) / f'(x)
You may assume that the functions f and f' are smooth and well behaved. The best explanation I was able to come up with is that this is somehow related to what's called iterative improvement (Golub and Van Loan, 1989). Any further insight would be greatly appreciated!
Here is a simple python example illustrating the issue
# Python
def f(x):
    return x*x - 2.

def fp(x):
    return 2.*x

xprev = 0.

# converges
x = 1.  # guess
while x != xprev:
    xprev = x
    x = (x*fp(x) - f(x)) / fp(x)
print(x)

# does not converge
x = 1.  # guess
while x != xprev:
    xprev = x
    dx = -f(x) / fp(x)
    x = x + dx
print(x)
Note: I'm aware of how floating point numbers work (please don't post your favourite link to a website telling me to never compare two floating point numbers). Also, I'm not looking for a solution to a problem but for an explanation as to why one of the algorithms converges but not the other.
Update:
As @uhoh pointed out, there are many cases where the second method does not converge. However, I still don't know why the second method converges so much more easily in my real-world scenario than the first. All the test cases have very simple functions f, whereas the real-world f has several hundred lines of code (which is why I don't want to post it). So maybe the complexity of f is important. If you have any additional insight into this, let me know!
Neither method is perfect:
One situation in which both methods will tend to fail is if the root is about exactly midway between two consecutive floating-point numbers f1 and f2. Then both methods, having arrived to f1, will try to compute that intermediate value and have a good chance of turning up f2, and vice versa.
[sketch: the line y = f(x) crossing the x-axis at a root lying roughly midway between two consecutive floating-point numbers f1 and f2]
"I'm aware of how floating point numbers work...". Perhaps the workings of floating-point arithmetic are more complicated than imagined.
This is a classic example of cycling of iterates using Newton's method. The comparison of a difference to an epsilon is "mathematical thinking" and can burn you when using floating-point. In your example, you visit several floating-point values for x, and then you are trapped in a cycle between two numbers. The "floating-point thinking" is better formulated as the following (sorry, my preferred language is C++)
std::set<double> visited;
xprev = 0.0;
x = 1.0;
while (x != xprev)
{
    xprev = x;
    dx = -F(x) / DF(x);
    x = x + dx;
    if (visited.find(x) != visited.end())
    {
        break;  // found a cycle
    }
    visited.insert(x);
}
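A rough Python translation of the same cycle-detection idea, using the f and fp from the question's example (the function name is just a placeholder):
def newton_with_cycle_detection(f, fp, x0):
    visited = set()
    xprev, x = None, x0
    while x != xprev:
        xprev = x
        x = x - f(x) / fp(x)
        if x in visited:      # the iterates have entered a cycle; stop here
            break
        visited.add(x)
    return x

print(newton_with_cycle_detection(lambda x: x*x - 2., lambda x: 2.*x, 1.))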
I'm trying to figure out why one of the implementations of Newton's method is converging exactly within floating point precision, whereas another is not.
Technically, it doesn't converge to the correct value. Try printing more digits, or using float.hex.
The first one gives
>>> print "%.16f" % x
1.4142135623730949
>>> float.hex(x)
'0x1.6a09e667f3bccp+0'
whereas the correctly rounded value is the next floating point value:
>>> print "%.16f" % math.sqrt(2)
1.4142135623730951
>>> float.hex(math.sqrt(2))
'0x1.6a09e667f3bcdp+0'
The second algorithm is actually alternating between the two values, so doesn't converge.
The problem is due to catastrophic cancellation in f(x): as x*x will be very close to 2, when you subtract 2, the result will be dominated by the rounding error incurred in computing x*x.
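A small demonstration of both points, using the two neighbouring doubles quoted above (math.nextafter requires Python 3.9+, and the exact bouncing behaviour can depend on the platform's rounding):
import math

a = 1.4142135623730949        # the value the first loop converges to
b = math.nextafter(a, 2.0)    # the next double up, 1.4142135623730951

def f(x):
    return x*x - 2.

def fp(x):
    return 2.*x

print(b - a)                            # one ulp, about 2.2e-16
print(f(a), f(b))                       # both results are dominated by rounding error in x*x
print(a - f(a)/fp(a), b - f(b)/fp(b))   # the plain update can bounce between the two values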
I think trying to force an exact equal (instead of err < small) is always going to fail frequently. In your example, for 100,000 random numbers between 1 and 10 (instead of your 2.0) the first method fails about 1/3 of the time, the second method about 1/6 of the time. I'll bet there's a way to predict that!
This takes ~30 seconds to run, and the results are cute!:
import numpy as np
import matplotlib.pyplot as plt

def f(x, a):
    return x*x - a

def fp(x):
    return 2.*x

def A(a):
    xprev = 0.
    x = 1.
    n = 0
    while x != xprev:
        xprev = x
        x = (x * fp(x) - f(x, a)) / fp(x)
        n += 1
        if n > 100:
            return n, x
    return n, x

def B(a):
    xprev = 0.
    x = 1.
    n = 0
    while x != xprev:
        xprev = x
        dx = - f(x, a) / fp(x)
        x = x + dx
        n += 1
        if n > 100:
            return n, x
    return n, x

n = 100000
aa = 1. + 9. * np.random.random(n)
data_A = np.zeros((2, n))
data_B = np.zeros((2, n))
for i, a in enumerate(aa):
    data_A[:, i] = A(a)
    data_B[:, i] = B(a)

bins = np.linspace(0, 110, 12)
hist_A = np.histogram(data_A, bins=bins)
hist_B = np.histogram(data_B, bins=bins)

print("A: n<10: ", hist_A[0][0], " n>=100: ", hist_A[0][-1])
print("B: n<10: ", hist_B[0][0], " n>=100: ", hist_B[0][-1])

plt.figure()
plt.subplot(1, 2, 1)
plt.scatter(aa, data_A[0])
plt.subplot(1, 2, 2)
plt.scatter(aa, data_B[0])
plt.show()

Normalize Small Probabilities in Python

I have a list of probabilities, which I need to normalize to equal 1.0.
e.g. probs = [0.01,0.03,0.005]
I realize that this is done by dividing each probability by the sum of probs. However, if the probabilities become really small, Python will tell me that sum(probs)=0.0. I understand that this is an underflow issue. I suppose I should use the log of each probability. How would I do this?
The sum of even very small floating point values will never truly be 0; they may be close to zero, but can never be exactly zero.
Just divide 1 by their sum, and multiply the probabilities by that factor:
def normalize(probs):
    prob_factor = 1 / sum(probs)
    return [prob_factor * p for p in probs]
Some probabilities may make up only a very small fraction of the total sum, of course, and that fraction may approach zero. But this just means that when normalising you may end up with normalized probabilities that are either very close to zero, or, if smaller than the smallest representable floating point value, equal to zero. The latter only happens if some probabilities in the list are so much smaller than the others that they no longer represent anything close to something that will ever occur.
Demo:
>>> def normalize(probs):
...     prob_factor = 1 / sum(probs)
...     return [prob_factor * p for p in probs]
...
>>> normalize([0.0000000001,0.000000000003,0.000000000000005])
[0.9708266589000533, 0.029124799767001597, 4.854133294500266e-05]
And the extreme case:
>>> import sys
>>> normalize([sys.float_info.max, sys.float_info.min])
[0.9999999999999999, 0.0]
>>> normalize([sys.float_info.max, sys.float_info.min])[-1] == 0
True
You can always use a scale factor to avoid the underflow problem, either manually entered or automatically calculated, e.g.:
import math

no_z = [x for x in probs if x > 0.0]
if len(no_z) == 0:
    print("Unable to calculate with 0.0 as all the probabilities")
order = int(-math.log10(min(no_z)))
if order < 0:
    order = 0        # only scale up; never scale down values that are already >= 1
sf = 10**order
scaled = [x * sf for x in probs]
tot = sum(scaled)
norm = [x / tot for x in scaled]
Of course you would probably be better off just using bigfloat or numpy and doing high precision maths.
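If you do want to go through the log of each probability, as the question suggests, normalisation becomes a subtraction in log space followed by a single exponentiation (a sketch using scipy.special.logsumexp, the same trick as in the underflow answer earlier on this page):
import numpy as np
from scipy.special import logsumexp

def normalize_from_logs(log_probs):
    log_probs = np.asarray(log_probs, dtype=float)
    return np.exp(log_probs - logsumexp(log_probs))   # ratios computed without ever forming the raw sum

print(normalize_from_logs(np.log([0.01, 0.03, 0.005])))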
