I am facing a mystery right now. I get strange results in some program and I think it may be related to the computation since I got different results with my functions compared to manual computation.
This is from my program, I am printing the values pre-computation :
print("\nPrecomputation:\nmatrix\n:", matrix)
tmp = likelihood_left * likelihood_right
print("\nconditional_dep:", tmp)
print("\nfinal result:", matrix @ tmp)
I got the following output:
Precomputation:
matrix:
[array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294])
array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784])
array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768])
array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674])
array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])]
conditional_dep: [0.01391123 0.01388155 0.17221067 0.02675524 0.01033257]
final result: [0.07995043 0.03485223 0.02184015 0.04721548 0.05323298]
The thing is when I compute the following code:
matrix = [np.array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294]),
np.array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784]),
np.array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768]),
np.array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674]),
np.array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])]
tmp = np.asarray([0.01391123, 0.01388155, 0.17221067, 0.02675524, 0.01033257])
matrix @ tmp
The values in use are exactly the same as they should be in the computation before but I get the following result:
array([0.04171218, 0.04535276, 0.02546353, 0.04688848, 0.03106443])
This result is then obviously different than the previous one and is the true one (I computed the dot product by hand).
I have been facing this problem the whole day and I did not find anything useful online. If any of you have any even tiny idea where it can come from I'd be really happy :D
Thanks in advance
Yann
PS: I can show more of the code if needed.
PS2: I don't know if it is relevant but this is used in a dynamic programming algorithm.
To recap our discussion in the comments, in the first part ("pre-computation"), the following is true about the matrix object:
>>> matrix.shape
(5,)
>>> matrix.dtype
dtype('O') # aka object
And as you say, this is due to matrix being a slice of a larger, non-uniform array. Let's recreate this situation:
>>> matrix = np.array([[], np.array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294]), np.array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784]), np.array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768]), np.array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674]), np.array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])])[1:]
It is now not a matrix with scalars in rows and columns, but a column vector of column vectors. Technically, matrix @ tmp is an operation between two 1-D arrays and hence NumPy should, according to the documentation, calculate the inner product of the two. This is true in this case, with the convention that the sum be over the first axis:
>>> np.array([matrix[i] * tmp[i] for i in range(5)]).sum(axis=0)
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
>>> matrix @ tmp
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
This is essentially the same as taking the transpose of the proper 2-D matrix before the multiplication:
>>> np.stack(matrix).T @ tmp
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
Equivalently, as noted by @jirasssimok:
>>> tmp @ np.stack(matrix)
array([0.07995043, 0.03485222, 0.02184015, 0.04721548, 0.05323298])
Hence the erroneous or unexpected result.
As you have already resolved to do in the comments, this can be avoided in the future by ensuring all matrices are proper 2-D arrays.
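The effect can be reproduced on a tiny example. The following is only a minimal sketch with made-up values (rows, tmp, obj and dense are illustrative names, not the asker's data). It builds an object array of rows by hand, computes the sum over the first axis that the object array effectively performs, and compares it with both operand orders on the stacked 2-D array.

```python
import numpy as np

# Illustrative values only (not the data from the question).
rows = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
tmp = np.array([10.0, 1.0])

# Build a 1-D object array whose elements are the row arrays,
# mimicking a slice of a larger, non-uniform array.
obj = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    obj[i] = row

# "Inner product" over the first axis: sum_i obj[i] * tmp[i].
first_axis_sum = sum(obj[i] * tmp[i] for i in range(len(rows)))

# Stacking the elements gives a proper 2-D float array.
dense = np.stack(list(obj))

print(obj.shape, obj.dtype)   # (2,) object
print(dense @ tmp)            # [12. 34.]  row-wise dot products
print(tmp @ dense)            # [13. 24.]  sum over the first axis
print(first_axis_sum)         # [13. 24.]
```

Converting to a 2-D array with np.stack before multiplying makes the operand order explicit, so the two results can no longer be confused.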
It looks like you got the operands switched in one of your matrix multiplications.
Using the same values of matrix and tmp that you provided, matrix @ tmp and tmp @ matrix provide the two results you showed (see note 1).
matrix = [np.array([0.08078721, 0.5802404 , 0.16957052, 0.09629893, 0.07310294]),
np.array([0.14633129, 0.45458744, 0.20096238, 0.02142105, 0.17669784]),
np.array([0.41198731, 0.06197812, 0.05934063, 0.23325626, 0.23343768]),
np.array([0.15686545, 0.29516415, 0.20095091, 0.14720275, 0.19981674]),
np.array([0.15965914, 0.18383683, 0.10606946, 0.14234812, 0.40808645])]
tmp = np.asarray([0.01391123, 0.01388155, 0.17221067, 0.02675524, 0.01033257])
print(matrix @ tmp) # [0.04171218 0.04535276 0.02546353 0.04688848 0.03106443]
print(tmp @ matrix) # [0.07995043 0.03485222 0.02184015 0.04721548 0.05323298]
To make it a little more obvious what your code is doing, you might also consider using np.dot instead of @. If you pass matrix as the first argument and tmp as the second, it will have the result you want, and make it clearer that you're conceptually calculating dot products rather than multiplying matrices.
As an additional note, if you're performing matrix operations on matrix, it might be better if it were a single two-dimensional array instead of a list of 1-dimensional arrays. This will prevent errors of the sort you'll see right now if you try to run matrix @ matrix. It would also let you say matrix.dot(tmp) instead of np.dot(matrix, tmp) if you wanted to.
(I'd guess that you can use np.stack or a similar function to create matrix, or you can call np.stack on matrix after creating it.)
1 Because tmp has only one dimension and matrix has two, NumPy can and will treat tmp as whichever type of vector makes the multiplication work (a 1-D operand of @ is temporarily promoted to 2-D, rather than broadcast). So tmp is treated as a column vector in matrix @ tmp and a row vector in tmp @ matrix.
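To make the note concrete, here is a small sketch with made-up values (M and v are illustrative names). It spells out the column and row interpretations by reshaping v explicitly and checks that they agree with the plain 1-D products.

```python
import numpy as np

# Illustrative values only.
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([10.0, 1.0])

# v on the right: behaves like a column of shape (2, 1), then the extra axis is dropped.
right = (M @ v[:, np.newaxis])[:, 0]
# v on the left: behaves like a row of shape (1, 2), then the extra axis is dropped.
left = (v[np.newaxis, :] @ M)[0, :]

print(np.allclose(M @ v, right))   # True
print(np.allclose(v @ M, left))    # True
print(np.dot(M, v), M.dot(v))      # [12. 34.] [12. 34.]
```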
Related
I'm converting MATLAB code to Python
This is my code in python:
import numpy as np
import math
n=150
L=1
inter=L/n
y=np.linspace(inter/2,L-inter/2,n).transpose()
E=(210000000000)*np.ones(n)
Rho=7800*np.ones(n)
PI=math.pi
A=np.exp( 5+2*y*(np.sin(2*PI*y/L)) )*0.000001
This works fine up until this point with no difference in values or issues until I have to execute this piece of MATLAB code.
Mass=sum(Rho*inter.*A)
I tried the same using np.sum(Rho*inter*A) and just Rho*inter*A
The first case I got a single answer 1.0626206716847877 but MATLAB returns a 150 element array.
In the second case I got an ndarray like I wanted, but the values were not the same as what I got in MATLAB.
Values I got in MATLAB : matlab values pastebin
Values I got in python : python values pastebin
What am I doing wrong?
(Rho[:,None]*inter*A).sum(axis=0)
matches your MATLAB pastebin.
Or using einsum to sort out the axes:
np.einsum('i,j->j', Rho,inter*A)
which just reduces to:
Rho.sum() * inter*A
Is that really what you are trying to do in MATLAB?
It might help if you showed the actual MATLAB code used to create Rho, A etc.
Mass=sum(Rho*inter.*A)
What's the size of Rho and A in MATLAB? One may be [1x150], but the other? Is Rho [1x150] also, or [150x150]? The * is matrix multiplication, like @ in NumPy, but .* is elementwise.
In the numpy code y, Rho and A all have shape (150,). The transpose on y does nothing. Rho*inter*A is elementwise multiplication producing a (150,) as well.
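The three expressions discussed above can be compared side by side. This is a sketch with a small n and stand-in values for A (the question's A is not reproduced here), intended only to show the shapes involved and that the outer-product form reduces to Rho.sum() * inter * A.

```python
import numpy as np

n = 4                              # small size for illustration (the question uses 150)
inter = 1.0 / n
Rho = 7800.0 * np.ones(n)
A = np.arange(1, n + 1) * 1e-6     # stand-in values, not the question's A

elementwise = Rho * inter * A                          # shape (n,), plain elementwise product
outer_sum = (Rho[:, None] * inter * A).sum(axis=0)     # (n, n) outer product, summed over rows
scaled = Rho.sum() * inter * A                         # same values as outer_sum
total = np.sum(elementwise)                            # a single number

print(elementwise.shape, outer_sum.shape, np.ndim(total))   # (4,) (4,) 0
print(np.allclose(outer_sum, scaled))                       # True
```

Because Rho is constant here, outer_sum is simply n times the elementwise product, which would explain an ndarray of the right shape but with different values.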
By default, np.sum sums all elements of an array. MATLAB's default is column-based, i.e. each of your 150 columns is summed to its own total, hence the array. Use sum(matrix,'all') in MATLAB to sum over all elements in a matrix. If you have a MATLAB older than R2018b, use sum(matrix(:)), i.e. store your matrix in a temporary variable, then flatten it to a column before summing.
To sum over columns in Python, specify the axis, being 0: np.sum(matrix,axis=0)
numpy.sum():
Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array.
sum() from MATLAB:
S = sum(A) returns the sum of the elements of A along the first array dimension whose size does not equal 1.
If A is a matrix, then sum(A) returns a row vector containing the sum of each column.
S = sum(A,'all') computes the sum of all elements of A. This syntax is valid for MATLAB® versions R2018b and later.
To prevent this kind of ambiguity, I prefer to always specify which direction to sum over, i.e. sum(matrix,1) for MATLAB and np.sum(matrix,axis=0) for NumPy, regardless of the default.
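A quick illustration of the NumPy side, using a small made-up matrix M (not data from the question):

```python
import numpy as np

# Illustrative 2 x 3 matrix.
M = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.sum(M))            # 21       all elements (NumPy default, axis=None)
print(np.sum(M, axis=0))    # [5 7 9]  column sums, like MATLAB sum(M) or sum(M,1)
print(np.sum(M, axis=1))    # [ 6 15]  row sums, like MATLAB sum(M,2)
```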
I think that in MATLAB using sum on a matrix you will get the sum of its individual columns and you will end up with an array with its number of elements equal to that of the columns. Use one more sum command in MATLAB: sum(sum(M)), which is the equivalent of np.sum(M) in Python.
I tried to solve a PDE numerically and in the course of this I faced the problem of a triple-nested for loop representing the 3 spatial dimensions. This construct is nested in another time loop, so you can imagine that the computing takes forever for sufficiently large node numbers. The code block looks like this
for jy in range(0,cy-1):
    for jx in range(0,cx-1):
        for jz in range(0,cz-1):
            T[n+1,jx,jy,jz] = (T[n,jx,jy,jz]
                + s*(T[n,jx-1,jy,jz] - 2*T[n,jx,jy,jz] + T[n,jx+1,jy,jz])
                + s*(T[n,jx,jy-1,jz] - 2*T[n,jx,jy,jz] + T[n,jx,jy+1,jz])
                + s*(T[n,jx,jy,jz-1] - 2*T[n,jx,jy,jz] + T[n,jx,jy,jz+1]))
It might look intimidating at first, but is quite easy. I have a 3 dimensional matrix representing a solid bulk material, where each point represents the current temperature. The iteratively calculated next temperature at each point is calculated taking into account each point next to that point - so 6 in total. In the case of a 1-dimensional solid the solution is just a simple matrix multiplication. Is there any chance to represent the 3-loop-system above in a simple matrix solution like in the 1D case?
Best regards!
With NumPy you can easily do these kinds of array operations,
e.g. for a 3x3x3 array
import numpy as np
T = np.random.random((3,3,3))
T = T*T - 2*T ... etc.
First off, you need to be a bit more careful with your terminology. A "matrix" is a 2-dimensional array of numbers. So you are really talking about an array. NumPy (which SciPy builds on) has a data type called ndarray. You need to be very careful manipulating them, because although they are sometimes used to represent matrices, there are operations that can be performed on 2-D arrays that are not mathematically legal for matrices.
I strongly recommend you use @ and not * to perform multiplication of 1- or 2-D matrices, and be sure to add code to check that the operations you are doing are legal mathematically. As a trivial example, NumPy lets you add a 1 x n or an n x 1 vector to an n x n matrix, even though that is not mathematically correct. The reason it allows it is, as intimated above, because an ndarray is a general array rather than a true matrix type.
It very well may be that you can reformulate your problem to use a 3-D array, and by experimentation find the particular operation you are trying to perform. Just keep in mind that the rules of linear algebra are only casually applied in Python.
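One common way to reformulate such an update is with array slices instead of loops. The following is only a sketch under stated assumptions: step_interior is a hypothetical helper name, it handles a single time level (a 3-D array T_now rather than the 4-D T from the question), and it updates interior points only while leaving the boundary values unchanged, which differs from the index ranges in the original loops.

```python
import numpy as np

def step_interior(T_now, s):
    """One explicit update of all interior points of a 3-D array (hypothetical helper).

    Assumption: boundary values are kept fixed; only points with all six
    neighbours inside the array are updated.
    """
    T_next = T_now.copy()
    centre = T_now[1:-1, 1:-1, 1:-1]
    # Sum of the six neighbours minus 6 * centre, i.e. the three
    # second differences from the loop version added together.
    neighbours = (T_now[:-2, 1:-1, 1:-1] + T_now[2:, 1:-1, 1:-1]
                  + T_now[1:-1, :-2, 1:-1] + T_now[1:-1, 2:, 1:-1]
                  + T_now[1:-1, 1:-1, :-2] + T_now[1:-1, 1:-1, 2:])
    T_next[1:-1, 1:-1, 1:-1] = centre + s * (neighbours - 6.0 * centre)
    return T_next

# Usage with made-up sizes and values.
rng = np.random.default_rng(0)
T0 = rng.random((4, 5, 6))
T1 = step_interior(T0, s=0.1)
print(T1.shape)   # (4, 5, 6)
```

The slices shift the whole array by one index in each direction, so every interior point sees its six neighbours in a single vectorized expression.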
I am trying to find an efficient code instead of the following piece of code (that is only one part of my code), to increase the speed:
for pr in some_list:
    Tp = T[partition[pr]].sum(0)
    Tpx = np.dot(Tp, xhat)
    hp = h[partition[pr]].sum(0)
    up = (uk[partition[pr][:]].sum(0))/len(partition[pr])
    hpu = hpu + np.dot(hp.T, up)
    Tpu = Tpu + np.dot(Tp.T, up)
I have at least two more similar blocks of code. As you can see, I used fancy indexing three times (really couldn't find another way). In my algorithm, I need this part to be done very quickly, but it's not happening now. I will really appreciate any suggestion.
Thank you all.
Best,
If your partitions are few and have many elements each, you should consider swapping around the indices of your objects. Summing an array of shape (30,1000) along its second dimension should be faster than summing an array of shape (1000,30) along its first dimension, since in the former case you are always summing contiguous blocks of memory (i.e. arr[k,:] for each k) for each remaining index. So if you put the summation index last (and get rid of some trailing singleton dimension while you're at it), you might get speed-up.
As hpaulj noted in a comment, it's not clear how your loop could be vectorized. However, since it's performance-critical, you could still try vectorizing some of the work.
I suggest that you store hp, up and Tp for each partition (following pre-allocation), then perform the scalar/matrix products in a single vectorized step. Also note that Tpx is unused in your example, so I omitted it here (whatever you're doing with it, you can do it similarly to the other examples):
part_len = len(some_list) # number of partitions, N
Tpshape = (part_len,) + T.shape[1:] # (N,30,100) if T was (1000,30,100)
hpshape = (part_len,) + h.shape[1:] # (N,30,1) if h was (1000,30,1)
upshape = (part_len,) + uk.shape[1:] # (N,30,1) if uk was (1000,30,1)
Tp = np.zeros(Tpshape)
hp = np.zeros(hpshape)
up = np.zeros(upshape)
for ipr,pr in enumerate(some_list):
    Tp[ipr,:,:] = T[partition[pr]].sum(0)
    hp[ipr,:,:] = h[partition[pr]].sum(0)
    up[ipr,:,:] = uk[partition[pr]].sum(0)/len(partition[pr])
# compute vectorized dot products:
#Tpx unclear in original, omitted
# sum over second index (dot), sum over first index (sum in loop)
hpu = np.einsum('abc,abd->cd',hp,up) # shape (1,1)
Tpu = np.einsum('abc,abd->cd',Tp,up) # shape (100,1)
Clearly the key player is numpy.einsum. And of course if hpu and Tpu had some prior values before the loop, you have to increment those values with the results from einsum above.
As for einsum, it performs summations and contractions of arrays of arbitrary dimensions. The pattern appearing above, 'abc,abd->cd', when applied to 3d arrays A and B, will return a 2d array C, with the following definition (math pseudocode):
C(c,d) = sum_a sum_b A(a,b,c)*B(a,b,d)
For a given fixed value of the summation index a, what's inside is
sum_b A(a,b,c)*B(a,b,d)
which, if the c and d indices are kept, will be equivalent to np.dot(A[a,:,:].T, B[a,:,:]). Since we're summing these matrices with respect to a too, we do exactly what your loopy version does: add up each np.dot() contribution to the total sum.
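The equivalence can be checked numerically. This is a small sketch with random arrays of made-up shapes (A, B, C_loop and C_einsum are illustrative names), comparing the accumulated np.dot loop with the single einsum call.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4, 5))   # indices (a, b, c)
B = rng.random((3, 4, 2))   # indices (a, b, d)

# Loop version: add up one matrix product per value of a.
C_loop = np.zeros((5, 2))
for a in range(A.shape[0]):
    C_loop += np.dot(A[a].T, B[a])

# Single einsum call: sum over a and b, keep c and d.
C_einsum = np.einsum('abc,abd->cd', A, B)

print(C_einsum.shape)                  # (5, 2)
print(np.allclose(C_loop, C_einsum))   # True
```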
I am trying to do an element-wise multiplication for two large sparse matrices. Both are of size around (400K X 500K), with around 100M elements.
However, they might not have non-zero elements in the same positions, and they might not have the same number of non-zero elements. In either situation, I'm okay with multiplying the non-zero value of one matrix and the zero value in the other matrix to zero.
I keep running out of memory (8GB) in every approach, which doesn't make much sense. I shouldn't be. These are what I've tried.
A and B are sparse matrices (I've tried with COO and CSC formats).
# I have loaded sparse matrices A and B, and have a file opened in write mode
row,col = A.nonzero()
index = zip(row,col)
del row,col
for i,j in index :
    # Approach 1
    A[i,j] *= B[i,j]
    # Approach 2
    someopenfile.write(' '.join([str(i),str(j),str(A[i,j]*B[i,j]),'\n']))
    # Approach 3
    if B[i,j] != 0 :
        A[i,j] = A[i,j]*B[i,j] # or, I wrote it to a file instead
                               # like in approach 2
If I comment out the for loop, I see that I use almost 3.5GB of memory. But the moment I use the loop, whether I'm writing the products to a file or back to a matrix, the memory usage shoots up to the full memory, causing me to stop the execution, or the system hangs. How can I do this operation without consuming so much memory?
I suspect that your sparse matrices are becoming non-sparse when you perform the operation. Have you tried just:
A.multiply(B)
As I suspect that it will be better optimised than anything that you can easily do.
If A is not already the correct type of sparse matrix you might need:
A = A.tocsr()
# May also need
# B = B.tocsr()
A = A.multiply(B)
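As a small illustration of what multiply does, here is a sketch on two tiny made-up matrices whose non-zero patterns only partly overlap (the values are illustrative, not the asker's data). Only positions that are non-zero in both inputs survive in the product.

```python
import numpy as np
from scipy import sparse

# Illustrative 2 x 3 sparse matrices with different non-zero patterns.
A = sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                [0.0, 0.0, 3.0]]))
B = sparse.csr_matrix(np.array([[4.0, 5.0, 0.0],
                                [0.0, 0.0, 6.0]]))

# Element-wise product; the result stays sparse.
C = A.multiply(B)

print(sparse.issparse(C))   # True
print(C.toarray())          # non-zero only at (0, 0) = 4 and (1, 2) = 18
```

Calling toarray() is done here only to display the tiny result; on matrices of the size in the question it would defeat the purpose.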
I need to invert a large, dense matrix, and I hoped to use SciPy's gmres to do it. Fortunately, the dense matrix A follows a pattern and I do not need to store the matrix in memory. The LinearOperator class allows us to construct an object which acts as the matrix for GMRES and can compute directly the matrix vector product A*v. That is, we write a function mv(v) which takes as input a vector v and returns mv(v) = A*v. Then, we can use the LinearOperator class to create A_LinOp = LinearOperator(shape = shape, matvec = mv). We can put the linear operator into the SciPy gmres command to evaluate the matrix vector products without ever having to fully load A into memory.
The documentation for the LinearOperator is found here: LinearOperator Documentation.
Here is my problem: to write the routine to compute the matrix vector product mv(v) = A*v, I need another input vector C. The entries in A are of the form A[i,j] = f(C[i] - C[j]). So, what I really want is for mv to be of two inputs, one fixed vector input C, and one variable input v which we want to compute A*v.
MATLAB has a similar setup, where one would write x = gmres(@(v) mv(v,C),b), where b is the right-hand side of the problem Ax = b, mv is the function that takes the variable input v for which we want to compute A*v, and C is the fixed, known vector which we need for the assembly of A.
My problem is that I can't figure out how to allow the LinearOperator class to accept two inputs, one variable and one "fixed" like I can in MATLAB.
Is there a way to do the analogous operation in SciPy? Alternatively, if anyone knows of a better way of inverting a large, dense matrix (50000, 50000) where the entries follow a pattern, I would greatly appreciate any suggestions.
Thanks!
EDIT: I should have stated this information actually. The matrix is actually (in block form) [A C; C^T 0], where A is N x N (N large) and C is N x 3, and the 0 is 3 x 3 and C^T is the transpose of C. This array C is the same array as the one mentioned above. The entries of A follow a pattern A[i,j] = f(C[i] - C[j]).
I wrote mv(v,C) to go row by row to construct A*v[i] for i=0,N, by computing sum f(C[i]-C[j])*v[j] (actually, I do numpy.dot(FC,v) where FC[j] = f(C[i]-C[j]), which works well). Then, at the end, I do the computations for the C^T rows. I was hoping to eventually replace the large for loop with a multiprocessing call to parallelize the for loop, but that's a future thing to consider. I will also look into using Cython to speed up the computations.
This is very late, but if you're still interested...
Your A matrix must be very low rank since it's a nonlinearly transformed version of a rank-2 matrix. Plus it's symmetric. That means it's trivial to invert: get the truncated eigenvalue decomposition with, say, 5 eigenvalues: A = U*S*U', then invert that: A^-1 = U*S^-1*U'. S is diagonal so this is inexpensive. You can get the truncated eigenvalue decomposition with eigh.
That takes care of A. Then for the rest: use the block matrix inversion formula. Looks nasty, but I will bet you 100,000,000 prussian francs that it's 50x faster than the direct method you were using.
I faced the same situation (some years later than you) of trying to use more than one argument to LinearOperator, but for another problem. The solution I found was the use of global variables, to avoid passing the variables as arguments to the function.
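An alternative to global variables is to bind the fixed argument with a lambda (or functools.partial), which is the closest Python analogue of MATLAB's anonymous function. The following is a minimal sketch under stated assumptions: the kernel f is hypothetical and chosen only so that the small system is well conditioned, C is taken to be a 1-D array of points rather than the N x 3 array from the question, and the block structure with C^T is left out.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def f(d):
    # Hypothetical kernel, chosen only so the example is well conditioned.
    return np.exp(-d ** 2)

def mv(v, C):
    # Row-by-row product A @ v with A[i, j] = f(C[i] - C[j]); A is never stored.
    v = np.ravel(v)
    return np.array([np.dot(f(C[i] - C), v) for i in range(C.shape[0])])

C = 2.0 * np.arange(6)      # fixed 1-D points (assumption; the question's C is N x 3)
b = np.ones(C.shape[0])     # made-up right-hand side
N = C.shape[0]

# The lambda captures C, so LinearOperator only ever sees a function of v.
A_op = LinearOperator((N, N), matvec=lambda v: mv(v, C), dtype=float)

x, info = gmres(A_op, b)    # info == 0 signals convergence
print(info)
```

The same binding can be written as functools.partial(mv, C=C); either way C stays a local variable instead of a global one.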