I am looking for a way to optimize the following code. It computes all the possible pairwise sums of the elements in an array:
import numpy as np
from itertools import combinations
N = 5000
a = np.random.rand(N)
c = ([a[i]+a[j] for i,j in combinations(range(N),2)])
This is relatively slow. I could get much better performances using the following:
b = a+a[:,None]
c = b[np.triu_indices(N,1)]
yet it still seems largely non-optimized: computing the full matrix b is inefficient because half of it ends up useless, and extracting its upper part (omitting the diagonal) is actually even slower than computing b.
Is there a way to do this faster? This makes me think of a similar problem, computing pairwise distances between points (Fastest pairwise distance metric in python) but I don't know if there is a way to do something similar here using scipy.
edit: I would like to keep the order of the sums, so that the result contains the sum of the indices in this order:
[(0,1), (0,2), ... (0,N-1), (1,2), .... (1,N-1), (2,3), ...],
i.e. the order you would get doing a double for loop for 0=<i<N, for j<i<N
Related
How can I create Matrix P consisting of three eigenvector columns by using a double nested loop.
from sympy.matrices import Matrix, zeros
from sympy import pprint
A = Matrix([[6,2,6], [2,6,6], [6,6,2]])
ew_A = A.eigenvals()
ev_A = A.eigenvects()
pprint(ew_A)
pprint(ev_A)
# Matrix P
(n,m) = A.shape
P = TODO # Initialising
# "filling Matrix P with ...
for i in TODO:
for j in TODO:
P[:,i+j] = TODO
## Calculating Diagonalmatrix
D= P**-1*P*A
Thanks so much in Advance
Finding the eigenvalues of a matrix, or diagonalizing it, is equivalent to finding the zeros of a polynomial with a degree equal to the size of the matrix. So in your case diagonalizing a 3x3 matrix is equivalent to finding the zeros of a 3rd degree polynomial. Maybe there is a simple algorithm for that, but mathematicians always go for the general case.
And in the general case you can show that there is no terminating algorithm for finding the zeros of a 5th-or-higher degree polynomial (that is called Galois theory), so there is also no simple "triple loop" algorithm for matrices of size 5x5 and higher. Eigenvalue software works by an iterative approximation algorithm, so that is a "while" loop around some finite loops.
This means that your question has no answer in the general case. In the 3x3 case maybe, but even that is not going to be terribly simple.
I have a N-square matrix B of integers and i want to build the matrix A such that
A[m,n] = sum([B[i,j] for i in range(1,m) for j in range(1,n)])
As B can be quite big, computing A naively coefficient by coefficient takes much time.
What is the most effective way to compute A ?
import numpy as np
A = np.cumsum(np.cumsum(a, axis=0), axis=1)
You're in luck. Two calls to numpy.cumsum (cumulative sum) should do the trick.
https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html
Equation f(x,a,b) below requires an iterative solution, for which I am using one of the scipy optimisation methods ('brentq') which is essentially calculating the value of x for which f(x,a,b)=0.
However, I need to use array inputs for 'a' and 'b' and the arrays are very large e.g. could be as high as 1-100 million.
What is the most efficient/fastest way to do this with scipy/numpy? At present I am resorting to for loops as per below, but this becomes slow with my actual underlying equations (not shown). Note that each row in array is independent of others.
import numpy as np
from scipy import optimize
# function to solve (simplified)
def f(x,a,b): return (a/x)**0.25 * (x**0.5) - b*x
# array size
N = 10000000
# example input arrays from which 'a' and 'b' are taken (in reality values come from other complex functions)
A = np.linspace(1,500,N)
B = np.linspace(0.1,1,N)
# solution using brentq
results = [optimize.brentq(f, 1e10, 1000, args=(a,b)) for a,b in zip(A,B)]
results
I am trying to get rid of the for loop and instead do an array-matrix multiplication to decrease the processing time when the weights array is very large:
import numpy as np
sequence = [np.random.random(10), np.random.random(10), np.random.random(10)]
weights = np.array([[0.1,0.3,0.6],[0.5,0.2,0.3],[0.1,0.8,0.1]])
Cov_matrix = np.matrix(np.cov(sequence))
results = []
for w in weights:
result = np.matrix(w)*Cov_matrix*np.matrix(w).T
results.append(result.A)
Where:
Cov_matrix is a 3x3 matrix
weights is an array of n lenght with n 1x3 matrices in it.
Is there a way to multiply/map weights to Cov_matrix and bypass the for loop? I am not very familiar with all the numpy functions.
I'd like to reiterate what's already been said in another answer: the np.matrix class has much more disadvantages than advantages these days, and I suggest moving to the use of the np.array class alone. Matrix multiplication of arrays can be easily written using the # operator, so the notation is in most cases as elegant as for the matrix class (and arrays don't have several restrictions that matrices do).
With that out of the way, what you need can be done in terms of a call to np.einsum. We need to contract certain indices of three matrices while keeping one index alone in two matrices. That is, we want to perform w_{ij} * Cov_{jk} * w.T_{ki} with a summation over j, k, giving us an array with i indices. The following call to einsum will do:
res = np.einsum('ij,jk,ik->i', weights, Cov_matrix, weights)
Note that the above will give you a single 1d array, whereas you originally had a list of arrays with shape (1,1). I suspect the above result will even make more sense. Also, note that I omitted the transpose in the second weights argument, and this is why the corresponding summation indices appear as ik rather than ki. This should be marginally faster.
To prove that the above gives the same result:
In [8]: results # original
Out[8]: [array([[0.02803215]]), array([[0.02280609]]), array([[0.0318784]])]
In [9]: res # einsum
Out[9]: array([0.02803215, 0.02280609, 0.0318784 ])
The same can be achieved by working with the weights as a matrix and then looking at the diagonal elements of the result. Namely:
np.diag(weights.dot(Cov_matrix).dot(weights.transpose()))
which gives:
array([0.03553664, 0.02394509, 0.03765553])
This does more calculations than necessary (calculates off-diagonals) so maybe someone will suggest a more efficient method.
Note: I'd suggest slowly moving away from np.matrix and instead work with np.array. It takes a bit of getting used to not being able to do A*b but will pay dividends in the long run. Here is a related discussion.
Suppose I have two dense matrices U (10000x50) and V(50x10000), and one sparse matrix A(10000x10000). Each element in A is either 1 or 0. I hope to find A*(UV), noting that '*' is element-wise multiplication. To solve the problem, Scipy/numpy will calculate a dense matrix UV first. But UV is dense and large (10000x10000) so it's very slow.
Because I only need a few elements of UV indicated by A, it should save a lot of time if only necessary elements are calculated instead of calculating all elements then filtering using A. Is there a way to instruct scipy to do this?
BTW, I used Matlab to solve this problem and Matlab is smart enough to find what I'm trying to do and works efficiently.
Update:
I found Matlab calculated UV fully as scipy does. My scipy installation is simply too slow...
Here's a test script and possible speedup. The basic idea is to use the nonzero coordinates of A to select rows and columns of U and V, and then use einsum to perform a subset of the possible dot products.
import numpy as np
from scipy import sparse
#M,N,d = 10,5,.1
#M,N,d = 1000,50,.1
M,N,d = 5000,50,.01 # about the limit for my memory
A=sparse.rand(M,M,d)
A.data[:] = 1 # a sparse 0,1 array
U=(np.arange(M*N)/(M*N)).reshape(M,N)
V=(np.arange(M*N)/(M*N)).reshape(N,M)
A1=A.multiply(U.dot(V)) # the direct solution
A2=np.einsum('ij,ik,kj->ij',A.A,U,V)
print(np.allclose(A1,A2))
def foo(A,U,V):
# use A to select elements of U and V
A3=A.copy()
U1=U[A.row,:]
V1=V[:,A.col]
A3.data[:]=np.einsum('ij,ji->i',U1,V1)
return A3
A3 = foo(A,U,V)
print(np.allclose(A1,A3.A))
The 3 solutions match. For large arrays, foo is about 2x faster than the direct solution. For small size, the pure einsum is competitive, but bogs down for large arrays.
The use of dot in foo would have computed too many products, ij,jk->ik as opposed to ij,ji->i.