Comparing float variables precisely? - Python

I have code that looks like this:
for i in range(1, 256):
    if ((i - 1) * (1 / float(256)) <= proba) and (proba <= i * (1 / float(256))):
        problist[i] += 1
Here proba is a float between 0 and 1 (mostly 0.625 or 0.5).
I want to assign proba, which is calculated earlier, to a specific interval. The problem is that Python seems to assign one value to more than one interval due to rounding errors.
Is there another way to compare these two float numbers more precisely?

This has nothing to do with rounding errors. There aren't any. But if you have intervals [0.49609375, 0.5] and [0.5, 0.50390625], then 0.5 truly is in both of them. Use half-open intervals instead, i.e., change one of the <= to <.
Btw, it would be simpler and faster to calculate the interval number directly by multiplying by 256:
problist[min(int(proba * 256) + 1, 256)] += 1
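As a quick illustration (these sample values are hypothetical, not from the question), the direct computation corresponds to half-open intervals and puts each value into exactly one bucket:
problist = [0] * 257  # index 0 unused; buckets 1 through 256

for proba in (0.5, 0.625, 0.999):  # hypothetical sample values
    # equivalent to the half-open intervals [(i-1)/256, i/256)
    problist[min(int(proba * 256) + 1, 256)] += 1

print(sum(problist))  # 3: each value was counted exactly once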

Continued fraction for pi in Python

I'm trying to approximate pi using a continued fraction. I'm using this formula:
pi = 3 + 1^2 / (6 + 3^2 / (6 + 5^2 / (6 + ...)))
After many hours, I ended up here:
for i in range(1, 15):
    e = (2*i - 1) ** 2
    b = e / (6 + (2*(i+1) - 1) ** 2)
    print(3 + b)
But my output is not so great...
Your approach is incorrect. If you substitute some example values in place of i, you will notice that you completely disregard everything other than the i-th and (i+1)-th parts of this fraction. You need to make sure that your loop takes into account all the higher levels as well as the new one you are calculating. I think the easiest way to code the solution is with recursion:
def inf_pi(its, ctr=1):
    if its == 0:
        return 0
    else:
        x = ((2*ctr) - 1)**2 / (6 + inf_pi(its - 1, ctr + 1))
        return x + (3 if ctr == 1 else 0)

print(inf_pi(10))
If you need an iterative approach, you have to consider two things. First, you can only calculate this with finite precision, so you need to replace the uncalculated remainder with some value. Second, you are trying to calculate this from the outermost fraction, but you don't know what the value of the "infinite" inner fraction is. If you reverse the order, starting from the innermost fraction (after replacing the part beyond the wanted precision with a constant), you can keep calculating a value for each step all the way up to the outer fraction.
def it_pi(its):
    pi = 0
    for i in range(its, 0, -1):
        pi = ((2*i) - 1)**2 / (6 + pi)
    return 3 + pi
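As a quick check (not part of the original answer), the iterative version should approach math.pi as the number of terms grows:
import math

for its in (5, 50, 500):
    approx = it_pi(its)
    print(its, approx, abs(approx - math.pi))  # error shrinks as its grows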

While loop in Python: how to compare only 3 decimal places to cease execution

Brand new to programming and Python. I want to make this while loop cease execution when the values are equal, but only to 3 decimal places. Appreciate the help!
def newtonSqrt(n):
    approx = 0.5 * n
    better = 0.5 * (approx + n/approx)
    count = 0
    while better != approx:
        approx = better
        better = 0.5 * (approx + n/approx)
        count = 1 + count
    return (approx, count)

print(newtonSqrt(10))
Round the values to 3 decimal places (they used 2 in that link, but you can just do 3 instead) and then compare them, e.g.:
while round(better, 3) != round(approx, 3):
You should not compare floats directly for equality because of binary representation errors. Rounding might work, but looks hackish.
Instead, use something like while abs(better - approx) > error: with error set to 0.001, for example.
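A minimal sketch of that tolerance-based loop applied to the function above (using the 0.001 threshold suggested here):
def newtonSqrt(n, error=0.001):
    approx = 0.5 * n
    better = 0.5 * (approx + n / approx)
    count = 0
    # iterate while successive estimates still differ by more than the tolerance
    while abs(better - approx) > error:
        approx = better
        better = 0.5 * (approx + n / approx)
        count += 1
    return (approx, count)

print(newtonSqrt(10))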

How to determine a proportionately decreasing weight given a list of N elements?

I have a list composed of N elements. For context, I am making a time series forecast, and, once the forecasts have been made, I would like to weight the forecasts made at the beginning as more important than the later forecasts. This is useful because when I calculate performance error scores (MAPE), the score will then reflect both the per-item forecasts and the way I want to distinguish good models from bad ones.
How should I update my existing function so that, for any number of elements N, it generates these steadily decreasing weights?
Here is the function that I have come up with on my own. It works for examples like compute_equal_perc(5), but not for other combinations...
def compute_equal_perc(rng):
    perc_allocation = []
    equal_perc = 1 / rng
    half_rng = rng / 2
    step_val = equal_perc / (rng - 1)
    print(step_val)
    for x in range(rng):
        if x == int(half_rng):
            perc_allocation.append(equal_perc)
        elif x < int(half_rng):
            diff_plus = (abs(int(half_rng) - x) * step_val) + equal_perc
            perc_allocation.append(round(float(diff_plus), 3))
        elif x >= int(half_rng):
            diff_minus = equal_perc - (abs(int(half_rng) - x) * step_val)
            perc_allocation.append(round(float(diff_minus), 3))
    return perc_allocation
For compute_equal_perc(5), the output that I get is:
[0.3, 0.25, 0.2, 0.15, 0.1]
The sum of this sequence should always equal 1, and the increments between values should always be equal.
This can be solved through the application of basic algebra. An arithmetic sequence is defined as
A[i] = a + b*i, for i = 0, 1, 2, 3, ... where a is the initial term and b is the common difference
The sum of elements 0 through n of such a sequence is
S = (A[0] + A[n]) * (n+1) / 2
in words, the sum of the first and last terms, times half the number of terms.
Since you know S and n, you need only decide one more "spread" factor to generate your sequence. The mean element must be 1/n -- this is where your algorithm goes wrong, as it fumbles this computation for even values of n.
Your code fails in this coupling of statements:
half_rng = rng / 2
step_val = equal_perc / (rng - 1)
# comparing x to int(half_rng)
If rng is even, you assign the mean value to position rng/2, giving you something such as the list for 4 elements:
[0.417, 0.333, 0.25, 0.167]
This means that you have two elements larger than the desired mean and only one smaller, forcing the sum above 1.0. Instead, when you have an even number of elements, you have to make the mean a "phantom" middle element and take half-steps around it. Let's look at this with fractions: you already have
[5/12, 4/12, 3/12, 2/12]
Your difference is 1/12 ... 1 / (n * (n-1)) ... and you need to shift these values lower by half a step. Keeping the spread you've chosen (1/12), that means starting a half-step to the side: subtract 1/24 from each element.
[9/24, 7/24, 5/24, 3/24]
You could also change your step with a simple linear factor. Decide on the ratio you want for your elements in simple integers, such as 5:4:3:2, and then generate your weights from the obvious sum of 5+4+3+2:
[5/14, 4/14, 3/14, 2/14]
Note that this works with any arithmetic sequence of integers, another way of choosing your "spread". If you use 4:3:2:1 you get
[4/10, 3/10, 2/10, 1/10]
or you can cluster them more closely with, say, 13:12:11:10
[13/46, 12/46, 11/46, 10/46]
So ... pick the spread you want and simplify your code to take advantage of that.
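For instance, here is a minimal sketch of that integer-ratio approach (the function name and its start parameter are illustrative, not from the original post):
def decreasing_weights(n, start=2):
    # The integer ratios start+n-1, ..., start+1, start form an arithmetic
    # sequence; dividing each by their sum yields weights that sum to 1.
    ratios = list(range(start + n - 1, start - 1, -1))
    total = sum(ratios)
    return [r / total for r in ratios]

print(decreasing_weights(5))  # [0.3, 0.25, 0.2, 0.15, 0.1]
print(decreasing_weights(4))  # even counts work too: [5/14, 4/14, 3/14, 2/14]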

Is it possible to generate a random number in python on a completely open interval or one that is closed on the high end?

I would like to generate a random number n such that n is in the range (a, b) or (a, b], where a < b. Is this possible in Python? It seems the only choices are a + random.random()*(b-a), which includes [a, b), or random.uniform(a, b), which includes [a, b], so neither meets my needs.
Computer generation of "random" numbers is tricky, and especially of "random" floats. You need to think long & hard about what you really want. In the end, you'll need to build something on top of integers, not directly out of floats.
Under the covers, in Python (and every other language using the Mersenne Twister's source code), generating a "random" IEEE-754 double (Python's basic random.random()) really works by generating a random 53-bit integer, then dividing by 2**53:
randrange(2**53) / 9007199254740992.0
That's why the output range is [0.0, 1.0), but not all representable floats in that range are equally likely: only the ones that can be expressed in the form I/2**53 for an integer 0 <= I < 2**53 are possible. For example, the float 1.0 / 2**60 can never be returned.
There are no "real numbers" here, just representable binary-floating-point numbers, so to answer your question first requires that you specify the exact set of those from which you're trying to pick.
If the answer is that you don't want to get that picky, then the distinction between open and closed is also too picky to bother with. If you can specify the precise set, then the solution is to generate more-or-less obvious random integers that map to your output set.
For example, if you want to pick "random" floats from [3.0, 6.0] with just 2 bits after the radix point, there are 13 possible outputs. So the first step is
i = random.randrange(13)
Then map to the range of interest:
return 3.0 + i / 4.0
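Putting those two steps together (the function name here is illustrative):
import random

def pick_quarter_step():
    # 13 equally likely grid points: 3.00, 3.25, ..., 6.00, both ends included
    i = random.randrange(13)
    return 3.0 + i / 4.0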
EDIT: USELESS BUT EDUCATIONAL ;-)
As noted in the comments, picking uniformly from all representable floats x with 0.0 < x < 1.0 can be done, but is very far from being uniformly distributed across that range. There are, for example, 2**52 representable floats in [0.5, 1.0), but also 2**52 representable floats in [0.25, 0.5), and ... in [2.0**-i, 2.0**(1-i)) for increasing i until the number of representable floats starts shrinking when we hit the subnormal range, eventually falling to none when we underflow to 0 completely.
As bit patterns they're very simple, though: the set of representable IEEE-754 doubles (Python floats on almost all platforms) in (0, 1) consists of, when viewing the bit patterns as integers, simply
range(1, 0x3ff0000000000000)
So a function to generate each of those with equal likelihood is straightforward to write using bit-fiddling tricks:
from struct import unpack
from random import randrange

def gen01():
    i = randrange(1, 0x3ff0000000000000)
    as_bytes = i.to_bytes(8, "big")
    return unpack(">d", as_bytes)[0]
Just run that a few times to see why it's useless - it's very heavily skewed toward the 0.0 end of the range:
>>> for i in range(10):
... print(gen01())
9.796357610869274e-104
4.125848254595866e-197
1.8114434720880952e-253
1.4937625148849258e-285
1.0537573744489343e-304
2.79008159472542e-58
4.718459887295062e-217
2.7996009087703915e-295
3.4129442284798105e-170
2.299402306630583e-115
random.randint(a,b) seems to do that. https://docs.python.org/2/library/random.html
Though a bit tricky, you may use np.random.rand to generate random numbers in (a, b]:
import numpy as np
size = 10 # No. of random numbers to be generated
a, b = 0, 10 # Can be any values
rand_num = np.random.rand(size) # [0, 1)
rand_num *= -1 # (-1, 0]
rand_num += 1 # (0, 1]
rand_num = a + rand_num * (b - a) # (a, b]
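The same reflection trick works in plain Python, up to the float-rounding caveats discussed above (a sketch):
import random

def uniform_right_closed(a, b):
    # random.random() is in [0, 1), so 1.0 - random.random() is in (0, 1],
    # which scales to (a, b]
    return a + (1.0 - random.random()) * (b - a)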

Numerical accuracy loss in Python

I wish to calculate the standard error of a series of numbers. Suppose the numbers are x[i] where i = 1 ... N. To do this I set
averageX = 0.0
averageXSquared = 0.0
I then loop over all i=1,...N and for each I calculate
averageX += x[i]
averageXSquared += x[i]**2
I then divide by N
averageX = averageX / N
averageXSquared = averageXSquared/N
I then take the square root of the difference
stdX = math.sqrt(averageXSquared - averageX * averageX)
The argument here is sure to always be >=0.
However if I set all x[i] = 0.07 (for example) then I get a math domain error as the argument of the root function is negative. There seems to be some loss of precision.
The argument is of the order of 10e-15.
This does not look encouraging. I now have to check whether the result is negative before taking the root. Or have I done something wrong?
This is not a Python problem, but a problem with finite precision in general. If you set all numbers to the same value, the standard error is mathematically 0, but not for a computer. A simple way to handle this is to clamp very small negative values to 0:
import math

x = [0.7, 0.7, 0.7]
average = sum(x) / len(x)
sqav = sum(y**2 for y in x) / len(x)
stderr = math.sqrt(max(sqav - average**2, 0))
The truly correct way, of course, is to never subtract large, nearly equal numbers. Make another pass, which guarantees non-negativity (you need to do some algebra to see that the result is mathematically the same):
y = [ v - average for v in x ]
dev = sum(v*v for v in y) / len(x)
stderr = math.sqrt(dev)
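To see the difference on values like those in the question (exact magnitudes depend on platform rounding, so treat the comments as illustrative):
import math

x = [0.07] * 10
average = sum(x) / len(x)

# one-pass form: two nearly equal numbers are subtracted (cancellation)
sqav = sum(v**2 for v in x) / len(x)
print(sqav - average**2)  # tiny, possibly negative

# two-pass form: squared residuals can never sum to a negative value
y = [v - average for v in x]
print(math.sqrt(sum(v*v for v in y) / len(x)))  # ~0.0, never a domain error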
