Multiplication of two small numbers with tensorflow - python

At some point in my project I have to multiply two small floats, like 8.696503446228892e-159 and 1.2425389522444519e-158, as I test in the following code:
def a2(a, b):
    a = tf.cast(a, tf.float64)
    b = tf.cast(b, tf.float64)
    d = a * b
    return d
It returns 0, which causes lots of problems (because it is used in my loss function). Any solution for how I can multiply them?

Handling large discrepancies in computational magnitude is a field of study in itself.
The first-order way to do this is to write your evaluation code to detect the situation and re-order the operations so as to preserve significant bits of each result. For instance, let's simplify your names a bit:
tf.log(tf.linalg.det(temp_sigma) /
       (tf.sqrt(tf.linalg.det(sigma1) * tf.linalg.det(sigma2))))
turns into
log(det(A) / (sqrt(det(B) * det(C))))
The case you have is that det(B) and det(C) are barely above zero, but relatively near each other: the result of sqrt(det(B) * det(C)) will be close to either determinant.
Change the order of operations. For this instance, distribute the square root and do the divisions individually:
log( (det(A) / sqrt(det(B))) / sqrt(det(C)) )
Does that get you moving along?
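To make the effect concrete, here is a minimal sketch (with made-up determinant values, not the asker's matrices) showing how the naive product underflows to zero while the reordered form survives:

```python
import math

# Hypothetical near-zero determinants, chosen so the product underflows float64.
det_A = 1e-170
det_B = 1e-170
det_C = 1e-170

naive = det_B * det_C   # 1e-340 is below the float64 range -> 0.0
reordered = (det_A / math.sqrt(det_B)) / math.sqrt(det_C)

print(naive)      # 0.0 -- sqrt of this, then log, would blow up
print(reordered)  # approximately 1.0, safe to pass to log
```

Each intermediate value in the reordered form stays within float64 range, so no information is lost before the log.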

Related

What's the fastest way to generate a 2D grid of values in python?

I need to generate a 2D array in python whose entries are given by different functions above and below the diagonal.
I tried the following:
x = np.reshape(np.logspace(0.001, 10, 2**12), (1, 4096))

def F(a, x):
    y = x.T
    Fu = np.triu(1/(2*y**2) * (y/x)**a * ((2*a+1) + (a-1)) / (a+1))
    Fl = np.tril(1/(2*y**3) * (x/y)**a * a/(2*a+1), -1)
    return Fu + Fl
and this works, but it's a bit too inefficient since it's computing a lot of values that are discarded anyway, some of which are especially slow due to the (x/y)**a term which leads to an overflow for high a (80+). This takes me 1-2s to run, depending on the value of a, but I need to use this function thousands of times, so any improvement would be welcome. Is there a way to avoid computing the whole matrix twice before discarding the upper or lower triangular (which would also avoid the overflow problem), and make this function faster?
You can move the multiplication earlier so as to avoid multiplying a big temporary array (NumPy operations are evaluated from left to right). You can also precompute (x/y)**a from (y/x)**a, since one is just the inverse of the other; this is faster because computing the power of a floating-point number is slow (especially in double precision). Additionally, you could distribute the power so as to compute x**a / y**a, which is faster still because there are only O(n) values to compute instead of O(n²); that being said, this operation is not numerically stable in your case because of the big power, so it is not safe. Finally, you can use numexpr to compute the power in parallel using multiple threads, and compute the sum in place to avoid creating expensive temporary arrays (and to use your RAM more efficiently). Here is the resulting code:
import numexpr

def F_opt(a, x):
    y = x.T
    tmp = numexpr.evaluate('(y/x)**a')   # power computed once, in parallel
    Fu = np.triu(1/(2*y**2) * ((2*a+1) + (a-1)) / (a+1) * tmp)
    Fl = np.tril(1/(2*y**3) * a/(2*a+1) / tmp, -1)
    return np.add(Fu, Fl, out=Fu)        # in-place sum avoids a temporary
This is 5 times faster on my machine. Note that there are still a few overflow warnings, as in the original code, plus an additional division-by-zero warning.
Note that you can make this code a bit faster using parallel Numba code (especially if a is an integer known at compile time). If you have access to an (expensive) server-side Nvidia GPU, then you can compute this more efficiently using the cupy package.
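If you want to avoid computing the discarded triangle entirely (which also sidesteps the overflow in the unused (x/y)**a entries), one numpy-only sketch is to fill each triangle from index arrays; F_tri is a hypothetical name, and this trades the 2D broadcast for gather-style indexing:

```python
import numpy as np

def F_tri(a, x):
    # x has shape (1, n); build only the entries each triangle actually keeps.
    n = x.shape[1]
    out = np.empty((n, n))
    iu = np.triu_indices(n)        # upper triangle, diagonal included
    il = np.tril_indices(n, -1)    # strict lower triangle
    # The row value plays the role of y = x.T in the broadcasted version.
    yu, xu = x[0, iu[0]], x[0, iu[1]]
    yl, xl = x[0, il[0]], x[0, il[1]]
    out[iu] = 1/(2*yu**2) * (yu/xu)**a * ((2*a+1) + (a-1)) / (a+1)
    out[il] = 1/(2*yl**3) * (xl/yl)**a * a/(2*a+1)
    return out
```

For moderate n the broadcasted version may still win, since the index arrays themselves cost memory, but no discarded values are ever computed, so the spurious overflow warnings disappear.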

Increase float precision

I am developing a machine-learning-based algorithm in python. The main thing I need to calculate to solve this problem is probabilities. This way I have the following code:
class_ans = class_probability[current_class] * lambdas[current_class]
for word in appears_words:
    if word in message:
        class_ans *= words_probability[(word, current_class)]
    else:
        class_ans *= (1 - words_probability[(word, current_class)])
ans.append(class_ans)
ans[current_class] /= summ
It works, but if the dataset is too big or the lambdas values are too small, I run out of float precision.
I've tried to find another algorithm for calculating my answer's value, multiplying and dividing different variables by some constants to keep them from underflowing. Nothing helped.
So I would like to ask: is there any way to increase float precision in python?
Thanks!
You cannot increase the precision of the built-in float. For serious scientific computation where precision is key (and speed is not), consider the following two options:
Instead of using float, switch your datatype to decimal.Decimal and set your desired precision.
For a more battle-hardened thorough implementation, switch to gmpy2.mpfr as your data type.
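A minimal sketch of the first option; the magnitudes below are made up, chosen so the plain-float product underflows to zero while Decimal keeps full precision at any magnitude:

```python
from decimal import Decimal, getcontext

getcontext().prec = 50          # 50 significant digits, independent of magnitude

a = Decimal('1e-200')
b = Decimal('1e-200')
product = a * b                 # 1e-400: not representable at all in float64

print(product)                  # 1E-400
print(1e-200 * 1e-200)          # 0.0 -- the same product in plain floats
```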
However, if your entire computation (or at least the problematic part) involves the multiplication of factors, you can often bypass the need for the above by working in log-space as Konrad Rudolph suggests in the comments:
a * b * c * d * ... = exp(log(a) + log(b) + log(c) + log(d) + ...)
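A quick sketch of the log-space trick with made-up probabilities: the direct product underflows to zero, while the sum of logs stays comfortably in range:

```python
import math

probs = [1e-200, 1e-150, 1e-30]     # hypothetical per-word probabilities

direct = 1.0
for p in probs:
    direct *= p                     # underflows to 0.0 along the way

log_product = sum(math.log(p) for p in probs)

print(direct)        # 0.0
print(log_product)   # about -874.9, i.e. log(1e-380)
```

Ratios like ans[current_class] / summ then become subtractions of logs, so the normalization step never has to leave log-space either.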

Python Numerical Differentiation and the minimum value for h

I calculate the first derivative using the following code:
def f(x):
    return np.exp(x)

def dfdx(x):
    Df = (f(x + h) - f(x - h)) / (2 * h)
    return Df
For example, for x = 10 this works fine. But when I set h to around 1E-14 or below, Df starts to get values that are really far away from the expected value f(10), and the relative error between the expected value and Df becomes huge.
Why is that? What is happening here?
The evaluation of f(x) has, at best, a rounding error of |f(x)|*mu where mu is the machine constant of the floating point type. The total error of the central difference formula is thus approximately
2*|f(x)|*mu/(2*h) + |f'''(x)|/6 * h^2
In the present case, the exponential function is equal to all of its derivatives, so that the error is proportional to
mu/h + h^2/6
which has a minimum at h = (3*mu)^(1/3), which for the double format with mu=1e-16 is around h=1e-5.
The precision is increased if instead of 2*h the actual difference (x+h)-(x-h) between the evaluation points is used in the denominator. This can be seen in the following loglog plot of the distance to the exact derivative.
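The predicted optimum can be checked numerically; a small sketch for f = exp at x = 10, comparing the relative error at the analytic optimum h ≈ 1e-5 against the much smaller h from the question:

```python
import numpy as np

x = 10.0
exact = np.exp(x)                  # f'(x) = f(x) for the exponential

def rel_err(h):
    approx = (np.exp(x + h) - np.exp(x - h)) / (2 * h)
    return abs(approx - exact) / exact

# Near h = (3*mu)**(1/3) ~ 1e-5 both error terms are tiny;
# at h = 1e-14 the mu/h round-off term dominates completely.
print(rel_err(1e-5))    # around 1e-11
print(rel_err(1e-14))   # a few percent
```

At h = 1e-14 the gap between 2*h and the actually representable difference (x+h)-(x-h) alone costs several percent, which is exactly the improvement the denominator trick above recovers.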
You are probably encountering numerical instability: for x = 10 and h ≈ 1E-13, the argument to np.exp is very close to 10 whether h is added or subtracted, so small approximation errors in the value of np.exp are scaled up significantly by the division by the very small 2 * h.
In addition to the answer by @LutzL, I will add some information from a great book, Numerical Recipes, 3rd Edition: The Art of Scientific Computing, chapter 5.7 on numerical derivatives, especially about the choice of the optimal h value for a given x:
Always choose h so that h and x differ by an exactly representable number. Funny stuff like 1/3 should be avoided, except when x is equal to something along the lines of 14.3333333.
Round-off error is approximately epsilon * |f(x) / h|, where epsilon is the floating point accuracy; Python represents floating point numbers in double precision, so it's 1e-16. It may differ for more complicated functions (where precision errors accumulate further), though that's not your case.
Choice of optimal h: without getting into details, it would be sqrt(epsilon) * x for the simple forward case, except when your x is near zero (you will find more information in the book), which is your case. You may want to use higher x values in such cases; a complementary answer is already provided. In the case of f(x+h) - f(x-h), as in your example, it would amount to epsilon ** (1/3) * x, so approximately 5e-6 times x, a choice which might be a little difficult for small values like yours. Quite close (if one can say so, bearing floating point arithmetic in mind...) to the practical results posted by @LutzL, though.
You may use other derivative formulas besides the symmetric one you are using. You may want to use forward or backward evaluation if the function is costly to evaluate and you have calculated f(x) beforehand. If your function is cheap to evaluate, you may want to evaluate it multiple times using higher-order methods to make the precision error smaller (see the five-point stencil on Wikipedia, as provided in the comment to your question).
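As a sketch of that higher-order idea, the five-point stencil has O(h^4) truncation error, so a comparatively large h already gives near machine-level accuracy:

```python
import numpy as np

def d5(f, x, h):
    # Five-point stencil: approximates f'(x) with O(h**4) truncation error.
    return (-f(x + 2*h) + 8*f(x + h) - 8*f(x - h) + f(x - 2*h)) / (12 * h)

# For exp at x = 1, even h = 1e-3 leaves only a tiny error,
# far from the round-off cliff that a central difference hits at small h.
print(abs(d5(np.exp, 1.0, 1e-3) - np.e))
```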
This Python tutorial explains the reason behind the limited precision. In summary, decimals are ultimately represented in binary and the precision is about 17 significant digits. So, you are right that it gets fuzzy beyond 10E-14.

np.exp overflow workaround

I have the following equation:
result = (A * np.exp(b * (t - t0))) / (1 + np.exp(c * (t - t0)))
I feed in an array of t values to get results out. A, b, c, t0 are all constants (b and c are very large, t0 is small but not as small as b and c are large). The problem is, I run into an overflow error because the exponential value quickly gets much too large to fit into a float64 beyond a certain range of t. I'm trying to find a workaround to this while still maintaining a decent level of precision. The result value is well within the range of a float64 container, however the overly large intermediate values of the np.exp calculation prevent me from getting as far as the result.
Some thoughts I had:
Scale down the t input to be able to get the desired range of values, and then de-scale the output so the result is correct
Convert the exponential to a log function
However I'm not sure how to implement either of these ideas, or if they would actually work.
Essentially, this problem boils down to result = np.exp(a) / np.exp(b), where a and b are in the range of 100-1000. np.exp(709) results in 8.2e307, right at the limit of a float64, but I have larger values that need to feed into it. While the ratio of the two exponentials is a reasonable value, the exponentials themselves are too large to be calculated.
Keeping everything in the log scale is the common solution to this sort of thing; at least that's what we do in statistics, where you're often down in the 1e-10000 range, especially at the start, before you're anywhere near convergence. For example, all the scipy probability density functions have logpdf variants that work in the log scale.
I think your expression would be rewritten something like:
d = t - t0
log_result = (np.log(A) + (b * d)) - np.logaddexp(0, c * d)
(untested)
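A quick check of that rewrite with made-up constants (A, b, c, t0 below are placeholders, not the asker's values): at moderate t the log-space form matches the direct formula, and at large t it keeps working where the direct formula returns nan from inf/inf:

```python
import numpy as np

A, b, c, t0 = 2.0, 5.0, 7.0, 0.1       # hypothetical constants
t = np.array([0.0, 0.5, 250.0])        # the last t overflows the direct form

d = t - t0
with np.errstate(over='ignore', invalid='ignore'):
    direct = (A * np.exp(b * d)) / (1 + np.exp(c * d))   # inf/inf -> nan at t=250

log_result = np.log(A) + b * d - np.logaddexp(0, c * d)
stable = np.exp(log_result)            # finite everywhere
```

np.logaddexp(0, z) computes log(1 + exp(z)) without ever forming exp(z), which is exactly the denominator here.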

Logarithm over x

Since the following expansion for the logarithm holds:
log(1-x)=-x-x^2/2-x^3/3-...
one can calculate the following functions, which have removable singularities at x = 0:
log(1-x)/x=-1-x/2-...
(log(1-x)/x+1)/x=-1/2-x/3-...
((log(1-x)/x+1)/x+1/2)/x=-1/3-x/4-...
I am trying to use NumPy for these calculations, and specifically the log1p function, which is accurate near x=0. However, convergence for the aforementioned functions is still problematic.
Do you have any ideas for existing functions implementing these formulas, or should I write one myself using the expansions above, even though that will be less efficient?
The simplest thing to do is something like
from numpy import log1p

def logf(x, eps=1e-6):
    if abs(x) < eps:
        return -0.5 - x/3.
    else:
        return (1. + log1p(-x)/x)/x
and play a bit with the threshold eps.
If you want a numpy-like, vectorized solution, replace the if with np.where (note that np.where evaluates both branches, so expect a harmless divide-by-zero warning from the entries near x = 0):
np.where(np.abs(x) > eps, (1. + np.log1p(-x)/x) / x, -0.5 - x/3.)
Why not successively square the candidate, after initially extracting the exponent component? When the square results in a number greater than 2, divide it by two and set the bit in the mantissa of your result that corresponds to the iteration. This is a much quicker and simpler way of determining log base 2, which can then be transformed to base e or base 10 with a single multiplication.
Some predefined functions don't work at singularity points. One simple-minded solution is to sum the series directly, adding terms of the appropriate sequence.
For your example, computing log(1-x) would look like:
s = 0.0
for k in range(1, n + 1):
    s += x**k / k
s = -s   # log(1-x) = -(x + x^2/2 + x^3/3 + ...)
In practice you keep adding terms until the last term falls below a small threshold.
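A runnable version of that idea for the second function in the question, (log(1-x)/x + 1)/x = -1/2 - x/3 - x**2/4 - ..., summing terms until they drop below a tolerance (g_series and g_direct are hypothetical names):

```python
import math

def g_series(x, tol=1e-16):
    # (log(1-x)/x + 1)/x = -sum_{k>=0} x**k / (k + 2),  valid for |x| < 1.
    total, term, k = 0.0, 1.0, 0
    while True:
        t = term / (k + 2)     # term holds x**k
        total -= t
        if abs(t) < tol:
            return total
        term *= x
        k += 1

def g_direct(x):
    # The log1p-based formula, fine away from x = 0.
    return (1.0 + math.log1p(-x) / x) / x

print(g_series(0.0))     # -0.5, with no trouble at the singularity
print(g_series(1e-9))    # about -0.5000000003
```

Away from zero the two agree to machine precision, so eps in logf above only needs to mark where the direct formula starts losing digits.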
