Precision loss numpy - mpmath - python

I use numpy and mpmath in my Python program. I use numpy because it gives easy access to many linear algebra operations. But because numpy's solver for linear equations is not that exact, I use mpmath for higher-precision operations. After I compute the solution of a system:
solution = mpmath.lu_solve(A,b)
I want the solution as an array. So I use
array = np.zeros(m)
and then set the values in a loop:
for i in range(m):
    array[i] = solution[i]
or
for i in range(m):
    array.put([i], solution[i])
but either way I again get numerical inaccuracies like:
solution[0] = 12.375
array[0] = 12.37500000000000177636
Is there a way to avoid these errors?

NumPy ndarrays have a homogeneous type. When you create the array with np.zeros, the default dtype is float64, which doesn't have as much precision as you want:
>>> array = np.zeros(3)
>>> array
array([ 0., 0., 0.])
>>> array.dtype
dtype('float64')
You can get around this by using dtype=object:
>>> import numpy as np
>>> import mpmath as mp
>>> mp.mp.prec = 65
>>> mp.mpf("12.37500000000000177636")
mpf('12.37500000000000177636')
>>> array = np.zeros(3, dtype=object)
>>> array[0] = 12.375
>>> array[1] = mp.mpf("12.37500000000000177636")
>>> array
array([12.375, mpf('12.37500000000000177636'), 0], dtype=object)
but note that there's a significant performance hit when you do this.
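Applied to the loop from the question, the same idea might look like the sketch below. The 2x2 system and the `mp.mp.dps = 30` setting are illustrative assumptions, not values from the question. The only change from the original loop is that the array is created with dtype=object, so each element stays an mpf.

```python
import mpmath as mp
import numpy as np

mp.mp.dps = 30  # assumed working precision in decimal digits, for illustration only

# Hypothetical small system standing in for the question's A and b
A = mp.matrix([[2, 1], [1, 3]])
b = mp.matrix([1, 2])
solution = mp.lu_solve(A, b)

m = solution.rows
array = np.empty(m, dtype=object)  # object array: elements are stored as-is
for i in range(m):
    array[i] = solution[i]         # stays an mpf, no conversion to float64

# Converting to float64 is still possible, but it is an explicit, lossy step
as_float64 = np.array([float(v) for v in array])
```

Arithmetic on the object array then dispatches to mpmath element by element, which is where the performance cost comes from.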

For completeness, and for people like me who stumbled upon this question because numpy's linear solver is not exact enough (it seems to handle only 64-bit numbers), there is also sympy.
The API is somewhat similar to numpy, but needs a few tweaks every now and then.
In [104]: A = Matrix([
[17928014155669123106522437234449354737723367262236489360399559922258155650097260907649387867023242961198972825743674594974017771680414642705007756271459833, 13639120912900071306285490050678803027789658562874829601953000463099941578381997997439951418291413106684405816668933580642992992427754602071359317117391198, 2921704428390104906296926198429197524950528698980675801502622843572749765891484935643316840553487900050392953088680445022408396921815210925936936841894852],
[14748352608418286197003564525494635018601621545162877692512866070970871867506554630832144013042243382377181384934564249544095078709598306314885920519882886, 2008780320611667023380867301336185953729900109553256489538663036530355388609791926150229595099056264556936500639831205368501493630132784265435798020329993, 6522019637107271075642013448499575736343559556365957230686263307525076970365959710482607736811185215265865108154015798779344536283754814484220624537361159],
[ 5150176345214410132408539250659057272148578629292610140888144535962281110335200803482349370429701981480772441369390017612518504366585966665444365683628345, 1682449005116141683379601801172780644784704357790687066410713584101285844416803438769144460036425678359908733332182367776587521824356306545308780262109501, 16960598957857158004200152340723768697140517883876375860074812414430009210110270596775612236591317858945274366804448872120458103728483749408926203642159476]])
In [105]: B = Matrix([
.....: [13229751631544149067279482127723938103350882358472000559554652108051830355519740001369711685002280481793927699976894894174915494730969365689796995942384549941729746359],
.....: [ 6297029075285965452192058994038386805630769499325569222070251145048742874865001443012028449109256920653330699534131011838924934761256065730590598587324702855905568792],
.....: [ 2716399059127712137195258185543449422406957647413815998750448343033195453621025416723402896107144066438026581899370740803775830118300462801794954824323092548703851334]])
In [106]: A.solve(B)
Out[106]:
Matrix([
[358183301733],
[498758543457],
[ 1919512167]])
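A smaller, self-checking version of the same approach is sketched below. The 2x2 matrix is a made-up example (not the system above), chosen so that its entries exceed what float64 can represent exactly. The right-hand side is built from a known solution so the result can be verified.

```python
from sympy import Matrix

# Hypothetical system whose entries need far more than float64's 53 bits
A = Matrix([[10**40 + 1, 3],
            [7, 10**40 - 1]])
x_expected = Matrix([358183301733, 498758543457])
B = A * x_expected   # exact integer arithmetic

x = A.solve(B)       # solved in exact rational arithmetic, no rounding

# For comparison: float64 cannot tell these two integers apart
same_in_float64 = float(10**40 + 1) == float(10**40)
```

Because sympy works with exact integers and rationals here, `x` equals `x_expected` exactly rather than approximately.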

Related

Why will numpy.round not round my array?

I am trying to round a numpy array that is the output of a Keras model prediction. However, after executing numpy.round/numpy.around, there is no change.
The end goal here is for the array to be rounded down to 0 if it is less than or equal to 0.50, or rounded up if it is above 0.50.
The code is here:
from keras.models import load_model
import numpy
model = load_model('tried.h5')
data = numpy.loadtxt(r"AppData\Roaming\MetaQuotes\Terminal\94DDB309C90B408373EFC53AC730F336\MQL4\Files\indicatorout.csv", delimiter=",")
data = numpy.array([data])
print(data)
outdata = model.predict(data)
print(outdata)
numpy.around(outdata, 0)
print(outdata)
numpy.savetxt(r"AppData\Roaming\MetaQuotes\Terminal\94DDB309C90B408373EFC53AC730F336\MQL4\Files\modelout.txt", outdata)
The logs are also here:
Using TensorFlow backend.
[[1.19539070e+01 1.72686310e+01 2.24426384e+01 1.82771435e+01
2.23788052e+01 1.62105408e+01 1.44595184e+01 1.90179043e+01
1.71749554e+01 1.69194088e+01 1.89911938e+01 1.76701393e+01
5.19613740e-01 5.38522415e+01 9.64037247e+01 1.73570000e-04
4.35710000e-04 9.55710000e-04]]
[[0.4215713]]
[[0.4215713]]
Any help would be greatly appreciated, thank you.
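numpy.around returns a new, rounded array and leaves its input unchanged, so the result has to be assigned to a name (or written back with out=). I assume that you want to round the elements of the array to n decimal places. Below is an illustration: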
# sample array to work with
In [21]: arr = np.random.randn(4)
In [22]: arr
Out[22]: array([-0.94817409, -1.61453252, 0.16566428, -0.53507549])
# round to 3 decimal places; note that `arr` is still unaffected.
In [23]: arr.round(decimals=3)
Out[23]: array([-0.948, -1.615, 0.166, -0.535])
# if you want to round it to nearest integer
In [24]: arr_rint = np.rint(arr)
In [25]: arr_rint
Out[25]: array([-1., -2., 0., -1.])
To make the rounding work in place, pass the out= argument:
In [26]: arr.round(decimals=3, out=arr)
Out[26]: array([-0.948, -1.615, 0.166, -0.535])
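For the code in the question, a minimal sketch of the fix could look like this. The array below is a hypothetical stand-in for the value returned by model.predict, and the explicit threshold is one possible way to get the "0 if at most 0.50, else 1" behaviour the question asks for. Note that numpy.around rounds exact halves to the nearest even value, so a comparison states the rule more directly.

```python
import numpy as np

# Hypothetical stand-in for outdata = model.predict(data)
outdata = np.array([[0.4215713]], dtype=np.float32)

np.around(outdata, 0)             # result is discarded; outdata is unchanged

rounded = np.around(outdata, 0)   # keep the returned array instead

# Explicit rule: 0 if value <= 0.5, 1 if value > 0.5
thresholded = (outdata > 0.5).astype(outdata.dtype)
```

Either `rounded` or `thresholded` can then be passed to numpy.savetxt in place of `outdata`.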

Difference between sum and np.sum for complex numbers numpy

I am trying to split up the multiplication by a DFT matrix into real and imaginary parts
from scipy.linalg import dft
import numpy as np
# x is always real
x = np.ones(4)
W = dft(4)
Wx = W.dot(x)
Wxme = np.real(W).dot(x) + np.imag(W).dot(x)*1.0j
I would expect Wx and Wxme to give the same values, but they do not. I have narrowed down the problem a bit more:
In [62]: W[1]
Out[62]:
array([ 1.00000000e+00 +0.00000000e+00j,
6.12323400e-17 -1.00000000e+00j,
-1.00000000e+00 -1.22464680e-16j, -1.83697020e-16 +1.00000000e+00j])
In [63]: np.sum(W[1])
Out[63]: (-2.2204460492503131e-16-1.1102230246251565e-16j)
In [64]: sum(W[1])
Out[64]: (-1.8369701987210297e-16-2.2204460492503131e-16j)
Why do sum and np.sum give different values? Addition of complex numbers should be nothing more than adding the real parts and the imaginary parts separately, right?
Adding them by hand gives me the result I would expect, as opposed to what numpy gives me:
In [65]: 1.00000000e+00 + 6.12323400e-17 + -1.00000000e+00 + 1.83697020e-16
Out[65]: 1.8369702e-16
What am I missing?
Up to rounding error, these results are equal. The results have slightly different rounding error due to factors such as different summation order or different levels of precision used to represent intermediate results.
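A short sketch of both points follows. The DFT matrix is rebuilt here directly with NumPy rather than with scipy.linalg.dft, which is an assumption made to keep the example self-contained, so its entries may differ from scipy's in the last bits. The tolerance of 1e-12 is likewise an arbitrary illustrative choice.

```python
import numpy as np

# Floating-point addition is not associative: grouping changes the result
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

# 4-point DFT matrix built by hand (stand-in for scipy.linalg.dft(4))
n = 4
k = np.arange(n)
W = np.exp(-2j * np.pi * np.outer(k, k) / n)
x = np.ones(n)

Wx = W.dot(x)
Wxme = np.real(W).dot(x) + np.imag(W).dot(x) * 1.0j

builtin_sum = sum(W[1])    # strict left-to-right accumulation
numpy_sum = np.sum(W[1])   # NumPy's own reduction

# Compare with a tolerance instead of expecting bit-identical results
agree = np.allclose(Wx, Wxme) and np.isclose(builtin_sum, numpy_sum, atol=1e-12)
```

Both sums are zero to within a few units of 1e-16, which is the size of one rounding step for numbers of magnitude 1.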

How to keep precision of gmpy2 mpfr in Numpy matrix operation

I am using multiple-precision floating-point (mpfr) objects in a NumPy matrix,
matrix([[ mpfr('-366998.93593422093364191959435957721088073331222596080623233278164906447646654043966366647797',300),
mpfr('-366997.28868432286431885359868309613943011772698563764930700121744888472828510537502286003536',300),
mpfr('-366997.28868432286431885359868309613943011772698563764930700121744888472828510537502286003536',300),
mpfr('-366997.28868432286431885359868309613943011772698563764930700121744888472828510537502310955189',300),
mpfr('-366997.33936304224917822062156336656390364691713762458391131405889211470102834400572590888586',300),
mpfr('-366997.28868432286431885359868309613943011772698563764930700121744888472828510537502286003536',300)],
[ mpfr('-40813927.104656436832435886099653290386078894027773129049451436960078610548203287954114434382',300),
mpfr('-10418349883335.380900703935580692318458974868691020694148304775624032110383967472053357462067',300),
mpfr('-40813927.104656436832435886099653290386078894027773129049451436960078610548203287954114434382',300),
mpfr('-40813927.104656436832435886099653290386078894027773129049451436960078610548203287954114434382',300),
mpfr('-40813927.104656436832435886099653290386078894027773129049451436960078610548203287954114434382',300),
mpfr('-40813927.104656436832435886099653290386078894027773129049451436960078610548203287954114434382',300)]], dtype=object)
but when I compute the inverse of the matrix, I lose the precision.
In [10]: a.I
Out[10]:
matrix([[ -5.44966727e-07, 1.91970239e-14],
[ 1.06745086e-11, -9.59848660e-14],
[ -5.44964281e-07, 1.91969377e-14],
[ -5.44964281e-07, 1.91969377e-14],
[ -5.44964356e-07, 1.91969404e-14],
[ -5.44964281e-07, 1.91969377e-14]])
So how can I keep the precision of mpfr?
Any suggestion will be appreciated!
For performance reasons, numpy uses the LAPACK libraries and has to convert the matrix elements to the standard double type. You may want to try mpmath for multiple precision matrix inversion.
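A minimal sketch of the mpmath route is shown below. It assumes a square matrix: the 2x2 example reuses truncated values from the question purely for illustration, whereas the matrix in the question is 2x6 and would need a pseudo-inverse instead. The precision of 300 bits mirrors the mpfr precision used in the question.

```python
import mpmath as mp

mp.mp.prec = 300  # working precision in bits, matching the question's mpfr values

# Hypothetical square matrix; mpf is built from strings so no digits are lost
A = mp.matrix([
    [mp.mpf("-366998.93593422093364191959435957721088073331222596"),
     mp.mpf("-366997.28868432286431885359868309613943011772698563")],
    [mp.mpf("-40813927.104656436832435886099653290386078894027773"),
     mp.mpf("-10418349883335.380900703935580692318458974868691020")],
])

A_inv = A**-1  # inversion carried out at the current mpmath precision

# Check the result: A * A_inv should be the identity up to rounding
residual = A * A_inv - mp.eye(2)
max_err = max(abs(residual[i, j]) for i in range(2) for j in range(2))
```

The entries of `A_inv` are mpf values, so they keep the full working precision as long as they are not converted to float.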

Why is the mean larger than the max in this array?

I have found myself with a very confusing array in Python. The following is the output from IPython when I work with it (with the pylab flag):
In [1]: x = np.load('x.npy')
In [2]: x.shape
Out[2]: (504000,)
In [3]: x
Out[3]:
array([ 98.20354462, 98.26583099, 98.26529694, ..., 98.20297241,
98.19876862, 98.29492188], dtype=float32)
In [4]: min(x), mean(x), max(x)
Out[4]: (97.950058, 98.689438, 98.329773)
I have no idea what is going on. Why is the mean() function providing what is obviously the wrong answer?
I don't even know where to begin to debug this problem.
I am using Python 2.7.6.
I would be willing to share the .npy file if necessary.
Probably due to accumulated rounding error in computing mean(). float32 relative precision is about 1e-7, and you have about 500,000 elements, so a direct computation of sum() can accumulate a rounding error of roughly 5%.
The algorithm for computing sum() and mean() is more sophisticated (pairwise summation) in the latest Numpy version 1.9.0:
>>> import numpy
>>> numpy.__version__
'1.9.0'
>>> x = numpy.random.random(500000).astype("float32") + 300
>>> min(x), numpy.mean(x), max(x)
(300.0, 300.50024, 301.0)
In the meantime, you may want to use a higher-precision accumulator type: numpy.mean(x, dtype=numpy.float64)
I have included a snippet from the np.mean.__doc__ below. You should try using np.mean(x, dtype=np.float64).
-----
The arithmetic mean is the sum of the elements along the axis divided
by the number of elements.
Note that for floating-point input, the mean is computed using the
same precision the input has. Depending on the input data, this can
cause the results to be inaccurate, especially for `float32` (see
example below). Specifying a higher-precision accumulator using the
`dtype` keyword can alleviate this issue.
In single precision, `mean` can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.mean(a)
0.546875
Computing the mean in float64 is more accurate:
>>> np.mean(a, dtype=np.float64)
0.55000000074505806
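The effect can be reproduced on any NumPy version with the sketch below. It uses a constant array of 98.2 as a hypothetical stand-in for the data in the question, and np.cumsum as an assumed way to force a strictly sequential float32 accumulation (the last element of the cumulative sum is the naive running total).

```python
import numpy as np

# Hypothetical data: same length and magnitude as in the question
x = np.full(504000, 98.2, dtype=np.float32)

# Sequential accumulation in float32: once the running total is large,
# each added 98.2 is rounded to the spacing of float32 at that magnitude
naive_sum = np.cumsum(x)[-1]
naive_mean = float(naive_sum) / x.size

# Accumulating in float64 avoids the drift
accurate_mean = float(np.mean(x, dtype=np.float64))
```

With the float64 accumulator the mean stays within the range of the data, while the naive float32 total drifts away from it.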

Rounding numpy float array read from img with python, values returned not rounded

Simple rounding of a floating-point numpy array does not seem to work for some reason.
I get numpy array from reading a huge img (shape of (7352, 7472)). Ex values:
>>> imarray[3500:3503, 5000:5003]
array([[ 73.33999634, 73.40000153, 73.45999908],
[ 73.30999756, 73.37999725, 73.43000031],
[ 73.30000305, 73.36000061, 73.41000366]], dtype=float32)
For rounding, I've tried using numpy.around() on the raw values, and also writing the values to a new array (a copy of the raw array), but for some reason I get no results.
arr=imarray
numpy.around(imarray, decimals=3, out=arr)
arr[3500,5000] #results in 73.3399963379, as well as accessing imarray
So the result shows even more digits!
Is that because the array is so big?
I need to round it to get the most frequent value (mode), and I'm looking for a way to do this without pulling in more libraries.
Your array has dtype float32. That is a 4-byte float.
The closest float to 73.340 representable using float32 is roughly 73.33999634:
In [62]: x = np.array([73.33999634, 73.340], dtype = np.float32)
In [63]: x
Out[63]: array([ 73.33999634, 73.33999634], dtype=float32)
So I think np.around is rounding correctly, it is just that your dtype has too large a granularity to round to the number you might be expecting.
In [60]: y = np.around(x, decimals = 3)
In [61]: y
Out[61]: array([ 73.33999634, 73.33999634], dtype=float32)
Whereas, if the dtype were np.float64:
In [64]: x = np.array([73.33999634, 73.340], dtype = np.float64)
In [65]: y = np.around(x, decimals = 3)
In [66]: y
Out[66]: array([ 73.34, 73.34])
Note that even though printed representation for y shows 73.34, it is not necessarily true that the real number 73.34 is exactly representable as a float64 either. The float64 representation is probably just so close to 73.34 that NumPy chooses to print it as 73.34.
The answer by @unutbu is absolutely correct. NumPy is rounding it as close to the number as it can, given the precision that you requested. The only thing that I have to add is that you can use numpy.set_printoptions to change how the array is displayed:
>>> import numpy as np
>>> x = np.array([73.33999634, 73.340], dtype = np.float32)
>>> y = np.round(x, decimals = 3)
>>> y
array([ 73.33999634, 73.33999634], dtype=float32)
>>> np.set_printoptions(precision=3)
>>> y
array([ 73.34, 73.34], dtype=float32)
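For the stated goal of finding the most frequent value, one possible workaround is to avoid comparing rounded floats at all and count integer hundredths instead. This is a sketch under assumptions: the small array is a hypothetical sample (with one value repeated so that a mode exists), and two decimal places are assumed to be the intended resolution.

```python
import numpy as np

# Hypothetical sample; 73.34 appears twice so there is a clear mode
imarray = np.array([[73.33999634, 73.40000153, 73.45999908],
                    [73.30999756, 73.37999725, 73.43000031],
                    [73.33999634, 73.36000061, 73.41000366]], dtype=np.float32)

# 73.34 has no exact float32 representation, so rounding cannot produce it
nearest_float32 = float(np.float32(73.34))

# Scale to integer hundredths, where equality comparisons are exact
hundredths = np.rint(imarray.astype(np.float64) * 100).astype(np.int64)
values, counts = np.unique(hundredths, return_counts=True)
mode = values[np.argmax(counts)] / 100
```

np.unique with return_counts=True needs only NumPy, so no additional library is required.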
