Difference between sum and np.sum for complex numbers numpy - python

I am trying to split the multiplication of a DFT matrix into real and imaginary parts:
from scipy.linalg import dft
import numpy as np
# x is always real
x = np.ones(4)
W = dft(4)
Wx = W.dot(x)
Wxme = np.real(W).dot(x) + np.imag(W).dot(x)*1.0j
I would expect Wx and Wxme to give the same value, but they do not. I have narrowed the problem down a bit more:
In [62]: W[1]
Out[62]:
array([  1.00000000e+00 +0.00000000e+00j,
         6.12323400e-17 -1.00000000e+00j,
        -1.00000000e+00 -1.22464680e-16j,
        -1.83697020e-16 +1.00000000e+00j])
In [63]: np.sum(W[1])
Out[63]: (-2.2204460492503131e-16-1.1102230246251565e-16j)
In [64]: sum(W[1])
Out[64]: (-1.8369701987210297e-16-2.2204460492503131e-16j)
Why do sum and np.sum give different values? Addition of complex numbers should be nothing more than adding the real parts and the imaginary parts separately, right?
Adding them by hand gives me the result I would expect, as opposed to what numpy gives me:
In [65]: 1.00000000e+00 + 6.12323400e-17 + -1.00000000e+00 + 1.83697020e-16
Out[65]: 1.8369702e-16
What am I missing?

Up to rounding error, these results are equal. The results have slightly different rounding error due to factors such as different summation order or different levels of precision used to represent intermediate results.
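A minimal check of both claims (a sketch assuming the same setup as in the question):
import numpy as np
from scipy.linalg import dft

x = np.ones(4)
W = dft(4)
Wx = W.dot(x)
Wxme = np.real(W).dot(x) + np.imag(W).dot(x) * 1.0j

# Equal up to rounding error, though not bit-identical:
print(np.allclose(Wx, Wxme))    # True
print(np.abs(Wx - Wxme).max())  # on the order of 1e-16

# Python's builtin sum() adds strictly left to right, while np.sum() uses
# pairwise summation, so the two accumulate rounding error differently:
print(sum(W[1]))     # left-to-right accumulation
print(np.sum(W[1]))  # pairwise summation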

Related

Numpy linalg: linear system with unlikely results

Consider the following matrix equation:
x = A b
where:
In [1]: A
Out[1]:
matrix([[ 0.477, -0.277, -0.2  ],
        [-0.277,  0.444, -0.167],
        [-0.2  , -0.167,  0.367]])
In [2]: b
Out[2]: [0, 60, 40]
How come, when I use numpy.linalg.solve(), I get the following results?
import numpy as np
x = np.linalg.solve(A, b)
res=x.tolist()
# res=[1.8014398509481981e+18, 1.801439850948198e+18, 1.8014398509481984e+18]
These numbers are huge! What's wrong here? I suspect A is in the wrong form, as it multiplies b in my equation, whereas numpy.linalg.solve() treats A as if it multiplies x.
What you give as an equation (x = A b) is just a matrix multiplication, rather than a set of linear equations to solve (A x = b), which is what np.linalg.solve is for. To get x in your case, simply use np.dot: A.dot(b).
Your matrix is singular: its columns sum to the zero vector, so they are linearly dependent. Mathematically, the system A x = b is then only solvable for the very small set of b vectors that lie in the column space of A, and [0, 60, 40] is not one of them.
The solution you're getting is most likely just numerical noise.
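A quick numerical check of both points (a sketch using the values above):
import numpy as np

A = np.array([[ 0.477, -0.277, -0.2  ],
              [-0.277,  0.444, -0.167],
              [-0.2  , -0.167,  0.367]])
b = np.array([0, 60, 40])

print(A.dot(b))          # the product x = A b the question actually describes
print(np.linalg.det(A))  # ~0 up to rounding: A is singular
print(A.sum(axis=1))     # the columns add up to the (numerically) zero vector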

Factorial of a matrix elementwise with Numpy

I'd like to know how to calculate the factorial of a matrix elementwise. For example,
import numpy as np
mat = np.array([[1,2,3],[2,3,4]])
np.the_function_i_want(mat)
would give a matrix mat2 such that mat2[i,j] = mat[i,j]!. I've tried something like
np.fromfunction(lambda i,j: np.math.factorial(mat[i,j]), mat.shape)
but it passes the entire matrix as argument for np.math.factorial. I've also tried to use scipy.vectorize but for matrices larger than 10x10 I get an error. This is the code I wrote:
import scipy as sp
javi = sp.fromfunction(lambda i,j: i+j, (15,15))
fact = sp.vectorize(sp.math.factorial)
fact(javi)
OverflowError: Python int too large to convert to C long
Such an integer number would be greater than 2e9, so I don't understand what this means.
There's a factorial function in scipy.special which allows element-wise computations on arrays:
>>> from scipy.special import factorial
>>> factorial(mat)
array([[  1.,   2.,   6.],
       [  2.,   6.,  24.]])
The function returns an array of float values and so can compute "larger" factorials up to the accuracy floating point numbers allow:
>>> factorial(15)
array(1307674368000.0)
You may need to adjust the print precision of NumPy arrays if you want to avoid the number being displayed in scientific notation.
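For instance, a minimal sketch of that adjustment (suppress is the relevant print option):
import numpy as np
np.set_printoptions(suppress=True)  # print fixed-point notation instead of scientific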
Regarding scipy.vectorize: the OverflowError implies that the results of some of the calculations are too big to be stored as integers (normally int32 or int64).
If you want to vectorize sp.math.factorial and want arbitrarily large integers, you'll need to specify that the function return an output array with the 'object' datatype. For instance:
fact = sp.vectorize(sp.math.factorial, otypes='O')
Specifying the 'object' type allows Python integers to be returned by fact. These are not limited in size and so you can calculate factorials as large as your computer's memory will permit. Be aware that arrays of this type lose some of the speed and efficiency benefits which regular NumPy arrays have.
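A minimal sketch of this (sp.vectorize is NumPy's np.vectorize re-exported, so the np alias behaves identically):
import numpy as np
import math

fact = np.vectorize(math.factorial, otypes='O')  # 'O' = object dtype
javi = np.fromfunction(lambda i, j: i + j, (15, 15), dtype=int)
print(fact(javi)[14, 14])  # 28! as an exact, arbitrarily large Python int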

Why is the mean larger than the max in this array?

I have found myself with a very confusing array in Python. The following is the output from IPython when I work with it (with the pylab flag):
In [1]: x = np.load('x.npy')
In [2]: x.shape
Out[2]: (504000,)
In [3]: x
Out[3]:
array([ 98.20354462,  98.26583099,  98.26529694, ...,  98.20297241,
        98.19876862,  98.29492188], dtype=float32)
In [4]: min(x), mean(x), max(x)
Out[4]: (97.950058, 98.689438, 98.329773)
I have no idea what is going on. Why is the mean() function providing what is obviously the wrong answer?
I don't even know where to begin to debug this problem.
I am using Python 2.7.6.
I would be willing to share the .npy file if necessary.
This is probably due to accumulated rounding error when computing mean(): float32 has a relative precision of ~1e-7, and with 500,000 elements a naive left-to-right sum() can accumulate a relative error of up to ~5%.
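A sketch of the failure mode (cumsum accumulates sequentially in the array's own dtype, much like a naive sum):
import numpy as np

x = np.full(500000, 98.2, dtype=np.float32)
print(x.cumsum()[-1] / x.size)   # sequential float32 accumulation drifts from 98.2
print(x.mean(dtype=np.float64))  # accurate with a float64 accumulator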
The algorithm used for sum() and mean() is more sophisticated (pairwise summation) in the latest NumPy version, 1.9.0:
>>> import numpy
>>> numpy.__version__
'1.9.0'
>>> x = numpy.random.random(500000).astype("float32") + 300
>>> min(x), numpy.mean(x), max(x)
(300.0, 300.50024, 301.0)
In the meantime, you may want to use a higher-precision accumulator type: numpy.mean(x, dtype=numpy.float64).
I have included a snippet from the np.mean.__doc__ below. You should try using np.mean(x, dtype=np.float64).
-----
The arithmetic mean is the sum of the elements along the axis divided
by the number of elements.
Note that for floating-point input, the mean is computed using the
same precision the input has. Depending on the input data, this can
cause the results to be inaccurate, especially for `float32` (see
example below). Specifying a higher-precision accumulator using the
`dtype` keyword can alleviate this issue.
In single precision, `mean` can be inaccurate:
>>> a = np.zeros((2, 512*512), dtype=np.float32)
>>> a[0, :] = 1.0
>>> a[1, :] = 0.1
>>> np.mean(a)
0.546875
Computing the mean in float64 is more accurate:
>>> np.mean(a, dtype=np.float64)
0.55000000074505806

Adding 2D matrices as numpy arrays

I want to add two 3x2 matrices, g and temp_g.
Currently g is
[[ 2.77777778e+000 6.58946653e-039]
[ 4.96398713e+173 1.64736663e-039]
[ -1.88888889e+000 -3.29473326e-039]]
And temp_g is:
[[ -5.00000000e-01 -2.77777778e+00]
[ -1.24900090e-16 -4.44444444e-01]
[ 5.00000000e-01 1.88888889e+00]]
But when I do g = g + temp_g, and output g, I get this:
[[ 2.27777778e+000 -2.77777778e+000]
[ 4.96398713e+173 -4.44444444e-001]
[ -1.38888889e+000 1.88888889e+000]]
Maybe I'm having trouble understanding long float numbers... but is this what the result ought to be? I expected that g[0][0] would be added to temp_g[0][0], g[0][1] to temp_g[0][1], and so on...
Your addition is working fine, but your two arrays have some seriously different orders of magnitude.
Taking for example 4.96398713e+173 - 1.24900090e-16: the first number is 189 orders of magnitude bigger than the second. Floating-point numbers don't have this level of accuracy; you're talking about taking a number with ~170 zeros before the decimal point and adding a number on the order of 10^-16 to it.
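The point in one line: the tiny term is simply absorbed, because the result rounds back to the nearest representable float64:
>>> 4.96398713e+173 - 1.24900090e-16 == 4.96398713e+173
True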
I would suggest reading up on floating-point representation to see some of the limitations of floating-point numbers (in all languages, not just Python).
The Decimal library can be used for handling numbers more accurately than floats.
import numpy as np
import decimal
a = decimal.Decimal(4.96398713e+173)
b = decimal.Decimal(1.24900090e-16)
print(a+b)
# 4.963987129999999822073620193E+173
# You can also set the dtype of your array to decimal.Decimal
a = np.array([[  2.77777778e+000,  6.58946653e-039],
              [  4.96398713e+173,  1.64736663e-039],
              [ -1.88888889e+000, -3.29473326e-039]],
             dtype=np.dtype(decimal.Decimal))

Precision loss numpy - mpmath

I use numpy and mpmath in my Python program. I use numpy because it gives easy access to many linear algebra operations. But because numpy's solver for linear equations is not that exact, I use mpmath for higher-precision operations. After I compute the solution of a system:
solution = mpmath.lu_solve(A,b)
I want the solution as an array. So I use
array = np.zeros(m)
and then loop to set the values:
for i in range(m):
    array[i] = solution[i]
or
for i in range(m):
    array.put([i], solution[i])
but with both ways I again get numerical inaccuracies like:
solution[0] = 12.375
array[0] = 12.37500000000000177636
Is there a way to avoid these errors?
numpy ndarrays have homogeneous type. When you make array, the default dtype will be some type of float, which doesn't have as much precision as you want:
>>> array = np.zeros(3)
>>> array
array([ 0., 0., 0.])
>>> array.dtype
dtype('float64')
You can get around this by using dtype=object:
>>> mp.mp.prec = 65
>>> mp.mpf("12.37500000000000177636")
mpf('12.37500000000000177636')
>>> array = np.zeros(3, dtype=object)
>>> array[0] = 12.375
>>> array[1] = mp.mpf("12.37500000000000177636")
>>> array
array([12.375, mpf('12.37500000000000177636'), 0], dtype=object)
but note that there's a significant performance hit when you do this.
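Applied to the original loop, a minimal sketch (solution here is just a stand-in for the result of mpmath.lu_solve):
import numpy as np
import mpmath as mp

solution = [mp.mpf("12.375"), mp.mpf("1.5")]  # stand-in for mpmath.lu_solve(A, b)
m = len(solution)
array = np.zeros(m, dtype=object)
for i in range(m):
    array[i] = solution[i]  # stored as mpf, not rounded to float64
print(array[0])             # 12.375, exactly as mpmath produced it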
For completeness, and for people like me who stumbled upon this question because numpy's linear solver is not exact enough (it seems to handle only 64-bit floats), there is also sympy.
The API is somewhat similar to numpy, but needs a few tweaks every now and then.
In [104]: A = Matrix([
[17928014155669123106522437234449354737723367262236489360399559922258155650097260907649387867023242961198972825743674594974017771680414642705007756271459833, 13639120912900071306285490050678803027789658562874829601953000463099941578381997997439951418291413106684405816668933580642992992427754602071359317117391198, 2921704428390104906296926198429197524950528698980675801502622843572749765891484935643316840553487900050392953088680445022408396921815210925936936841894852],
[14748352608418286197003564525494635018601621545162877692512866070970871867506554630832144013042243382377181384934564249544095078709598306314885920519882886, 2008780320611667023380867301336185953729900109553256489538663036530355388609791926150229595099056264556936500639831205368501493630132784265435798020329993, 6522019637107271075642013448499575736343559556365957230686263307525076970365959710482607736811185215265865108154015798779344536283754814484220624537361159],
[ 5150176345214410132408539250659057272148578629292610140888144535962281110335200803482349370429701981480772441369390017612518504366585966665444365683628345, 1682449005116141683379601801172780644784704357790687066410713584101285844416803438769144460036425678359908733332182367776587521824356306545308780262109501, 16960598957857158004200152340723768697140517883876375860074812414430009210110270596775612236591317858945274366804448872120458103728483749408926203642159476]])
In [105]: B = Matrix([
.....: [13229751631544149067279482127723938103350882358472000559554652108051830355519740001369711685002280481793927699976894894174915494730969365689796995942384549941729746359],
.....: [ 6297029075285965452192058994038386805630769499325569222070251145048742874865001443012028449109256920653330699534131011838924934761256065730590598587324702855905568792],
.....: [ 2716399059127712137195258185543449422406957647413815998750448343033195453621025416723402896107144066438026581899370740803775830118300462801794954824323092548703851334]])
In [106]: A.solve(B)
Out[106]:
Matrix([
[358183301733],
[498758543457],
[ 1919512167]])
