I have a numpy array A with mpf elements that have decimal precision 100. Is this precision cast away if I decide to take the numpy dot product of A with itself?
If this is the case, is there any way to convert a numpy array to an mpmath matrix, so I can keep the precision?
Numpy arrays can hold objects, in particular mpf objects, and their methods such as dot can use the addition/multiplication methods of these objects. Example:
import mpmath
import numpy
mpmath.mp.dps = 25 # higher precision for demonstration
a = [mpmath.sin(mpmath.pi*n/3) for n in range(99)]
b = numpy.array(a)
b.dot(b)
outputs mpf('49.50000000000000000000000165')
For comparison, this is what happens if the array elements are cast to double-precision floats when converting to numpy:
c = numpy.array(a, dtype=float)
c.dot(c)
outputs 49.499999999999993. So, the higher precision provided by mpmath is preserved when the dot method is invoked in the first version.
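As a quick sanity check (continuing the example above), you can confirm that the array holds mpf objects rather than doubles:
print(b.dtype)                           # object
print(isinstance(b.dot(b), mpmath.mpf))  # True: the dot product is itself an mpf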
The previous answer is correct. However, sometimes there are things that work in numpy but not in mpmath (or at least are done in a different way there). Hence the original (general) question of
"...is there any way to convert a numpy array to an mpmath matrix, so I can keep the precision?.."
In my experience, this more general question still deserves a general answer. One way to do it is to convert the numpy array to a list first, and then the list to an mpmath matrix.
Here is a simple example that works for me (warning, may not be efficient):
import mpmath as mp, numpy as np
N = 5
L = np.ones(N)
M = np.diag(L, 2)  # a 7x7 numpy array with ones on the second superdiagonal
# Note that mpmath's "diag" function accepts only one argument (no offset)
M = mp.matrix(M.tolist())
print(type(M),'\n', M)
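Going the other way is just as straightforward, since mpmath matrices have a tolist method; a small sketch (not part of the original answer):
M_np = np.array(M.tolist(), dtype=float)  # cast the mpf entries back to plain floats
print(M_np.shape)  # (7, 7)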
If I want to calculate the k smallest eigenvalues of the matrix product AA', with A of size 300K by 512 and "'" denoting the transpose, then it would be infeasible to do in the traditional way. Matlab, however, provides a nice facility: a function handle that performs the product, Afun = @(x) A*(A'*x);, can be passed to the eigs function. Then, to find the 6 smallest eigenvalues/eigenvectors we call d = eigs(Afun,300000,6,'smallestabs'), where the second input is the size of the matrix AA'. Is there a function in python that does something similar?
To my knowledge, there is no such functionality in numpy. However, nothing stops you from simply using numpy.linalg.eigvals to retrieve an array of all the matrix's eigenvalues and then finding the N smallest with a sort:
import numpy as np

A = ...  # your matrix
eigvals = np.linalg.eigvals(A)
eigvals.sort()
smallest_6_eigvals = eigvals[:6]
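Applied to a scaled-down version of the AA' setup from the question, it would look something like this (the 300 x 50 shape here is just an illustrative stand-in for the 300K x 512 matrix):
A = np.random.rand(300, 50)        # stand-in for the tall matrix A
AAt = A.dot(A.T)                   # the symmetric product AA'
eigvals = np.linalg.eigvals(AAt)
eigvals.sort()
smallest_6 = eigvals[:6]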
I have two matrices in Python 2.7: a dense one, A_dense, and a sparse one, A_sparse. I am interested in computing their element-wise multiplication followed by a sum. There are two ways to do it: use numpy's multiplication or scipy's sparse multiplication. I expect them to give exactly the same result, differing only in execution time. But I find that they give different results for certain matrix sizes.
import numpy as np
from scipy import sparse
L=2000
np.random.seed(2)
rand_x=np.random.rand(L)
A_sparse_init=np.diag(rand_x, -1)+np.diag(rand_x, 1)
A_sparse=sparse.csr_matrix(A_sparse_init)
A_dense=np.random.rand(L+1,L+1)
print np.sum(A_sparse.multiply(A_dense))-np.sum(np.multiply(A_dense[A_sparse.nonzero()], A_sparse.data))
Output:
1.1368683772161603e-13
If I choose L=2001, then output is:
0.0
To check the size dependence of the difference between the two multiplication methods, I wrote:
L=100
np.random.seed(2)
N_loop=100
multiply_diff_arr=np.zeros(N_loop)
for i in xrange(N_loop):
    rand_x=np.random.rand(L)
    A_sparse_init=np.diag(rand_x, -1)+np.diag(rand_x, 1)
    A_sparse=sparse.csr_matrix(A_sparse_init)
    A_dense=np.random.rand(L+1,L+1)
    multiply_diff_arr[i]=np.sum(A_sparse.multiply(A_dense))-np.sum(np.multiply(A_dense[A_sparse.nonzero()], A_sparse.data))
    L+=1
I got the following plot:
Can anyone help me understand what's happening? Don't we expect the difference between the two methods to be on the order of 1e-18 rather than 1e-13?
I don't have a full answer, but this might help find the answer:
Under the hood, scipy.sparse will convert to coo format and do this:
ret = self.tocoo()
if self.shape == other.shape:
    data = np.multiply(ret.data, other[ret.row, ret.col])
The question is then why these two operations give different results:
ret = A_sparse.tocoo()
c = np.multiply(ret.data, A_dense[ret.row, ret.col])
ret.data = c.view(type=np.ndarray)
c.sum() - ret.sum()
-1.1368683772161603e-13
Edit:
The difference stems from different defaults on which axis to add.reduce first.
E.g.:
A_sparse.multiply(A_dense).sum(axis=1).sum()
A_sparse.multiply(A_dense).sum(axis=0).sum()
Numpy defaults to axis 0 first.
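The underlying reason is that floating point addition is not associative, so summing the same numbers in a different order (along a different axis first) can change the last bits of the result. A minimal, scipy-free illustration:
import numpy as np

np.random.seed(0)
x = np.random.rand(1000, 1000)

# Same entries, reduced in a different order; the two totals typically
# differ by a tiny amount (and may occasionally match exactly).
print(x.sum(axis=0).sum() - x.sum(axis=1).sum())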
I'm trying to make a dot product of an expression and it was supposed to be symmetric.
It turns out that it just isn't.
B is a 4D array whose last two dimensions I must transpose to obtain B^t.
D is a 2D array. (It's an expression of the Stiffness Matrix known to the Finite Element Method programmers)
numpy.dot combined with numpy.transpose, and as a second alternative numpy.einsum (the idea came from this topic: Numpy Matrix Multiplication U*B*U.T Results in Non-symmetric Matrix), have already been tried and the problem persists.
At the end of the calculations the product B^tDB is obtained, and when I check whether it really is symmetric by subtracting its transpose (B^tDB)^t, there is still a residue.
The dot product or the Einstein summation is applied only over the dimensions of interest (the last ones).
The question is: How can these residues be eliminated?
You need to use arbitrary precision floating point math. Here's how you can combine numpy and the mpmath package to define an arbitrary precision version of matrix multiplication (ie the np.dot method):
from mpmath import mp, mpf
import numpy as np
# "dps" stands for "decimal places". Larger values
# mean higher precision, but slower computation
mp.dps = 75
def tompf(arr):
    """Convert any numpy array to one of arbitrary precision mpmath.mpf floats
    """
    if arr.size and not isinstance(arr.flat[0], mpf):
        return np.array([mpf(x) for x in arr.flat]).reshape(*arr.shape)
    else:
        return arr

def dotmpf(arr0, arr1):
    """An arbitrary precision version of np.dot
    """
    return tompf(arr0).dot(tompf(arr1))
As an example, if you then set up B, B^t, and D matrices like so:
bshape = (8,8,8,8)
dshape = (8,8)
B = np.random.rand(*bshape)
BT = np.swapaxes(B, -2, -1)
d = np.random.rand(*dshape)
D = d.dot(d.T)
then B^tDB - (B^tDB)^t will always have a non-zero value if you calculate it using the standard matrix multiplication method from numpy:
M = np.dot(np.dot(B, D), BT)
np.sum(M - M.T)
but if you use the arbitrary precision version given above it won't have a residue:
M = dotmpf(dotmpf(B, D), BT)
np.sum(M - M.T)
Watch out though. Calculations using arbitrary precision math run much slower than those done using standard floating point numbers.
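To get a rough feel for that slowdown, a quick (unscientific) timing sketch along these lines can be used; the exact numbers will depend on mp.dps and on the matrix sizes:
import time

x = np.random.rand(50, 50)
y = np.random.rand(50, 50)

t0 = time.perf_counter()
x.dot(y)            # ordinary double-precision dot product
t1 = time.perf_counter()
dotmpf(x, y)        # arbitrary-precision version defined above
t2 = time.perf_counter()

print("float dot:", t1 - t0)
print("mpf dot:  ", t2 - t1)  # expect this to be orders of magnitude slower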
Matlab Code:
AP(queryIdx) = diff([0;recall]')*prec
My python code:
AP[queryIdx] = np.dot(np.diff(np.concatenate(([[0]], recall), axis=0).transpose()),prec)
Variables (checked, and I am quite sure they are equivalent in Python and in Matlab):
recall: 1000x1 np array*
prec: 1000x1 np array
* prints out as [[.],.....,[.]]
Results:
Matlab: .1011
Python: 0.05263158
The only cause I can think of outside of the code is that Python uses more precision, but I doubt that would make such a large difference.
*Edit: There was a problem with my prec variable. The above code works.
That code looks a bit messy. Try replacing it with this:
AP[queryIdx] = np.dot(np.diff(np.hstack([0, recall.ravel()])), prec.ravel())
In your post, you mentioned that you have a 1000 x 1 array for both recall and prec. This to me is interpreted as a 2D array with a singleton dimension: the second dimension. As such, you'd need to convert this back to a 1D array using ravel.
Now, np.hstack horizontally stacks 1D arrays together, so this will append a 0 at the front, then apply the diff operator, and then perform the dot product with prec.
One common gotcha that MATLAB coders have with numpy is the representation of 1D arrays. Transposing a 1D array is a no-op, and in most operations a 1D array behaves like a row vector. If you explicitly want a column vector, you need to add a second dimension of size 1, for example by inserting a new axis and then transposing. Something like this:
r = v[:][None].T  # shape (N, 1): an explicit column vector
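A quick illustration of that point (a small sketch, not from the original answer):
import numpy as np

v = np.arange(3)
print(v.shape, v.T.shape)  # (3,) (3,): transposing a 1D array changes nothing
col = v[:][None].T         # same trick as above
print(col.shape)           # (3, 1): an explicit column vector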
In any case, let's verify the results:
MATLAB
>> recall = (1:1000).';
>> prec = (1000:-1:1).';
>> diff([0; recall].')*prec
ans =
500500
Python (IPython)
In [1]: import numpy as np
In [2]: recall = np.arange(1,1001)
In [3]: prec = np.arange(1000,0,-1)
In [4]: np.dot(np.diff(np.hstack([0, recall.ravel()])), prec.ravel())
Out[4]: 500500
I have:
import numpy as np
from mpmath import *
mpf(np.array(range(0,600)))
But it won't let me do it:
TypeError: cannot create mpf from array
So what should I be doing?
Essentially I'm going to have to use this array and multiply it element-wise with an incredibly large or incredibly small number, depending on circumstance (e.g. 1.35626567e1084 or 6.2345252e-2732), hence the need for mpf.
More specifically, I'll be using the besseli and besselk functions, which produce the incredibly large and incredibly small values.
How do I get an mpf array to hold these numbers?
Multiplying an array by a mpf number just works:
import numpy as np
import mpmath as mp
small_number = mp.besseli(400, 2) # This is an mpf number
# Note that creating a list using `range` and then converting it
# to an array is not very efficient. Do this instead:
A = np.arange(600)
result = small_number * A # Array of dtype object, i.e., it contains mpf numbers
Multiplying element-wise two arrays containing mpf numbers also works:
result * result
So your real problem is how to evaluate an mpmath function on a numpy array. To do that, I'd use np.frompyfunc (some time ago this was the only option).
besseli_vec = np.frompyfunc(mp.besseli, 2, 1)
besseli_vec(0, A)
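From there it is just the element-wise arithmetic shown earlier; for example (a sketch, with order 400 used purely as an illustrative value, and skipping A[0] == 0 where besselk diverges):
besselk_vec = np.frompyfunc(mp.besselk, 2, 1)

big = besselk_vec(400, A[1:10])    # enormous mpf values
small = besseli_vec(400, A[1:10])  # correspondingly tiny mpf values
print((big * small).dtype)         # object: every entry is still an mpf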
Check out mpmath.arange:
import numpy as np
import mpmath as mp
np.array(mp.arange(600))
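mp.arange returns a list of mpf numbers, so the resulting array has dtype object and full mpmath precision is kept in subsequent element-wise operations, for example:
A = np.array(mp.arange(600))
print(A.dtype)          # object
print(type(A[0]))       # an mpf
A * mp.besselk(400, 2)  # stays an array of mpf numbers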