I have two lists of matrices (numpy.ndarray): (a1, a2, a3, ..., an) and (b1, b2, b3, ..., bn). Each one is a square matrix of some size. Not all a matrices are the same size, and not all b matrices are the same size, but it is guaranteed that dim(a[i]) == dim(b[i]), so we only ever multiply matrices of the same size.
I want to compute the matrix products pairwise: a1*b1, a2*b2, ..., an*bn, and store the results in, say, c1, c2, etc.
Is there any way to do it besides going over the pairs one by one in a for loop?
I'm currently using:
# a_list and b_list contain n matrices each
# a[i] & b[i] are numpy.ndarray objects
a_list = [a1,a2,.....]
b_list = [b1,b2,.....]
result_list = []
for i in range(n):
    result_list.append(numpy.dot(a_list[i], b_list[i]))
I think the accepted solution is syntactic sugar for a for loop. However, we can look for a more interesting option here.
Technically, what we want is a numpy array of numpy arrays that allows vectorized operations between them, similar to how np.array([1,2,3]) * np.array([3,4,5]) multiplies the two arrays element by element.
So we'd like a numpy array of numpy arrays, except that we'd like the * operator to be defined as matrix multiplication instead of element-wise multiplication. It's interesting to note that this is the case for the np.matrix class. This class is deprecated and can cause complications, but for the sake of learning and understanding things all the way, we can try using it.
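import numpy as np
b_0 = np.asmatrix(np.arange(9).reshape(3, 3))
# b_0 = 0 1 2
#       3 4 5
#       6 7 8
b_1 = np.asmatrix(np.arange(4).reshape(2, 2))
# b_1 = 0 1
#       2 3
a_0 = np.asmatrix(np.eye(3))
a_1 = np.asmatrix(np.eye(2))
# dtype=object is needed on recent NumPy versions, which refuse to build
# an array from differently shaped inputs otherwise
a = np.asarray([a_0, a_1], dtype=object)
b = np.asarray([b_0, b_1], dtype=object)
a * b  # We get [b_0, b_1]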
If this were an important syntactic option for you, you could perhaps write a custom class that would be compatible with numpy arrays (and thus not use np.matrix). This will probably however be slightly slower than using a plain old for loop with np.dot.
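As a rough illustration of the custom-class idea, here is a minimal sketch. The class name MatrixList and its methods are hypothetical (they are not part of NumPy); the wrapper only stores plain ndarrays and defines * as pairwise matrix multiplication.

```python
import numpy as np


class MatrixList:
    """Hypothetical container whose `*` means pairwise matrix multiplication."""

    def __init__(self, matrices):
        # Store plain ndarrays; np.matrix is not needed.
        self.matrices = [np.asarray(m) for m in matrices]

    def __len__(self):
        return len(self.matrices)

    def __getitem__(self, index):
        return self.matrices[index]

    def __mul__(self, other):
        if len(self) != len(other):
            raise ValueError("both lists must hold the same number of matrices")
        # Still a Python-level loop under the hood, just hidden behind `*`.
        return MatrixList(x @ y for x, y in zip(self.matrices, other.matrices))


a = MatrixList([np.eye(3), np.eye(2)])
b = MatrixList([np.arange(9).reshape(3, 3), np.arange(4).reshape(2, 2)])
c = a * b  # identity times b_i gives back b_i
```

This buys nicer syntax only; the work is still done one pair at a time.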
You can use python list comprehensions:
result_list = [a.dot(b) for a, b in zip(a_list, b_list)]
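For completeness, a small self-contained check of the comprehension against the explicit loop. The matrix sizes and random inputs below are made up for illustration.

```python
import numpy as np

# Hypothetical ragged input: square matrices whose size differs between pairs.
sizes = [2, 3, 4]
rng = np.random.default_rng(0)
a_list = [rng.random((n, n)) for n in sizes]
b_list = [rng.random((n, n)) for n in sizes]

# Pairwise matrix products; for 2-D arrays `@` matches np.dot.
result_list = [a @ b for a, b in zip(a_list, b_list)]

# Reference: the index-based loop from the question.
loop_result = []
for i in range(len(a_list)):
    loop_result.append(np.dot(a_list[i], b_list[i]))
```

Because the matrices differ in size, they cannot be stacked into one 3-D array, so some per-pair iteration is hard to avoid.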
I want to add two numpy arrays of different sizes, starting at a specific index. As I need to do this a couple of thousand times with large arrays, it needs to be efficient, and I am not sure how to do this without iterating through each cell.
a = [5,10,15]
b = [0,0,10,10,10,0,0]
res = add_arrays(b,a,2)
print(res)  # => [0,0,15,20,25,0,0]
Naive approach:
# b is the bigger array
def add_arrays(b, a, i):
    for j in range(len(a)):
        b[i+j] += a[j]  # += so that a is added rather than overwriting b
    return b
You could assign the smaller array into an array of zeros and then add. I would do it the following way:
import numpy as np
a = np.array([5,10,15])
b = np.array([0,0,10,10,10,0,0])
z = np.zeros(b.shape,dtype=int)
z[2:2+len(a)] = a # 2 is offset
res = z+b
print(res)
output
[ 0 0 15 20 25 0 0]
Disclaimer: I assume that offset + len(a) is always less than or equal to len(b).
Nothing wrong with your approach. You cannot get better asymptotic time or space complexity. If you want to reduce code lines (which is not an end in itself), you could use slice assignment and some other utils:
def add_arrays(b, a, i):
    # list() makes the slice assignment work for NumPy arrays as well as lists
    b[i:i+len(a)] = list(map(sum, zip(b[i:i+len(a)], a)))
But the functional overhead should make this less performant, if anything.
Some docs: map, sum, zip
The following should be roughly 1.5-5x faster than Daweo's answer (depending on the size ratio between a and b):
result = b.copy()
result[offset: offset+len(a)] += a
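Wrapped up as a function, the slice-based addition might look like the sketch below. The bounds check is an addition of mine, not part of the answer above; it guards against a silently truncated slice when a does not fit.

```python
import numpy as np


def add_arrays(b, a, offset):
    """Return a copy of b with a added in, starting at index `offset`."""
    # Assumed guard: without it, an out-of-range slice would raise a
    # less obvious broadcasting error.
    if offset < 0 or offset + len(a) > len(b):
        raise ValueError("a does not fit inside b at this offset")
    result = b.copy()
    result[offset:offset + len(a)] += a  # in-place add on the slice view
    return result


a = np.array([5, 10, 15])
b = np.array([0, 0, 10, 10, 10, 0, 0])
res = add_arrays(b, a, 2)
print(res)  # [ 0  0 15 20 25  0  0]
```

If b may be modified in place, dropping the copy() avoids one allocation per call.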
If I have a.shape = (3,4,5) and b.shape = (3,5), using np.einsum() makes broadcasting then multiplying the two arrays super easy and explicit:
result = np.einsum('abc, ac -> abc', a, b)
But if I want to add the two arrays, so far as I can tell, I need two separate steps so that the broadcasting happens properly, and the code feels less explicit.
b = np.expand_dims(b, 1)
result = a + b
Is there a way to do this array addition with the clarity of np.einsum()?
Broadcasting aligns shapes starting from the trailing dimension, so b with shape (3,5) does not line up with a of shape (3,4,5) until a length-1 axis is inserted in the middle. To add these two arrays, you can expand b in a one-liner as follows:
import numpy as np
a = np.random.rand(3,4,5); b = np.random.rand(3,5);
c = a + b[:, None, :] # c is shape of a, broadcasting occurs along 2nd dimension
Note this is no different from c = a + np.expand_dims(b, 1). In terms of clarity it is a matter of personal style. I prefer broadcasting, others prefer einsum.
I want the first array to keep its values only where the values at the same index in both arrays are greater than zero, and otherwise be set to zero. I'm not really sure how to frame the question. Hopefully the expected output provides better insight.
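A short check that the two spellings agree, and that the einsum product from the question matches the same broadcasting pattern. The random inputs are only placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 4, 5))
b = rng.random((3, 5))

# Two equivalent ways to insert the length-1 middle axis before adding.
c_index = a + b[:, None, :]
c_expand = a + np.expand_dims(b, 1)

# The multiplication from the question, written with einsum and with broadcasting.
prod_einsum = np.einsum('abc,ac->abc', a, b)
prod_broadcast = a * b[:, None, :]
```

Both sums have the shape of a, with b repeated along the second axis.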
I tried playing around with np.where, but I can't seem to make it work when 2 arrays are provided.
a = np.array([0,2,1,0,4])
b = np.array([1,1,3,4,0])
# Expected Output
a = ([0,2,1,0,0])
The zip function, which takes elements of two arrays side by side, is useful here. You don't necessarily need an np/numpy function.
import numpy as np
a = np.array([0,2,1,0,4])
b = np.array([1,1,3,4,0])
c = np.array([x if x * y > 0 else 0 for x,y in zip(a, b)])
print(c)
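Since the question mentions np.where, here is a vectorized sketch using it. It tests "both greater than zero" directly, which differs from the x * y > 0 test above only when both values are negative.

```python
import numpy as np

a = np.array([0, 2, 1, 0, 4])
b = np.array([1, 1, 3, 4, 0])

# Keep a where both arrays are positive at that index, otherwise write 0.
c = np.where((a > 0) & (b > 0), a, 0)
print(c)  # [0 2 1 0 0]
```

The parentheses around each comparison are needed because & binds more tightly than >.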
I currently have the following double loop in my Python code:
for i in range(a):
    for j in range(b):
        A[:,i]*=B[j][:,C[i,j]]
(A is a float matrix, B is a list of float matrices, and C is a matrix of integers. By matrices I mean m x n np.arrays.
To be precise, the sizes are: A is m x a; B holds b matrices of size m x l (with l different for each matrix); C is a x b. Here m is very large, a is very large, b is small, and the l's are even smaller than b.)
I tried to speed it up by doing
for j in range(b):
    A[:,:]*=B[j][:,C[:,j]]
but, surprisingly to me, this performed worse.
More precisely, this did improve performance for small values of m and a (the "large" numbers), but from m=7000, a=700 onwards the first approach is roughly twice as fast.
Is there anything else I can do?
Maybe I could parallelize? But I don't really know how.
(I am not committed to either Python 2 or 3)
Here's a vectorized approach assuming B as a list of arrays that are of the same shape -
# Convert B to a 3D array
B_arr = np.asarray(B)
# Use advanced indexing to index into the last axis of B array with C
# and then do product-reduction along the second axis.
# Finally, we perform elementwise multiplication with A
A *= B_arr[np.arange(B_arr.shape[0]),:,C].prod(1).T
For cases with smaller a, we could run a loop that iterates through the length of a instead. Also, for more performance, it might be a better idea to store those elements into a separate 2D array instead and perform the elementwise multiplication only once after we get out of the loop.
Thus, we would have an alternative implementation like so -
range_arr = np.arange(B_arr.shape[0])
out = np.empty_like(A)
for i in range(a):
    out[:,i] = B_arr[range_arr,:,C[i,:]].prod(0)
A *= out
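To sanity-check the vectorized expression against the original double loop, a small sketch with made-up sizes follows. It assumes, as the answer does, that every matrix in B has the same shape.

```python
import numpy as np

rng = np.random.default_rng(0)
m, a, b, l = 6, 5, 3, 2  # small hypothetical sizes, for checking only
A = rng.random((m, a))
B = [rng.random((m, l)) for _ in range(b)]
C = rng.integers(0, l, size=(a, b))

# Reference: the double loop from the question, applied to a copy of A.
A_loop = A.copy()
for i in range(a):
    for j in range(b):
        A_loop[:, i] *= B[j][:, C[i, j]]

# Vectorized: the indexed array has shape (a, b, m); the product over
# axis 1 collapses j, and the transpose brings it back to (m, a).
B_arr = np.asarray(B)
A_vec = A * B_arr[np.arange(b), :, C].prod(1).T
```

The intermediate array holds a * b * m elements, which is worth keeping in mind when m and a are both large.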
So I feel like I might have coded myself into a corner -- but here I am.
I have created a dictionary of arrays (well, specifically astropy Columns) because I needed to create five arrays by performing the same calculation on an array with five different parameters (the calculation involved multiplying arrays and one of five arbitrary constants).
I now want to create an array where each element is the sum of the corresponding elements from all five arrays. I'd rather not use the ugly for loop that I've created (it's also hard to check whether I'm getting the right answer with the loop).
Here is a modified snippet for testing!
import numpy as np
from astropy.table import Column
from pylab import *
# The five parameters for the Columns
n1 = [14.18,19.09,33.01,59.73,107.19,172.72] #uJy/beam
n2 = [14.99,19.04,32.90,59.99,106.61,184.06] #uJy/beam
n1 = np.array([x*1e-32 for x in n1]) #W/Hz
n2 = np.array([x*1e-32 for x in n2]) #W/Hz
# an example of the arrays being mathed upon
luminosity=np.array([2.393e+24,1.685e+24,2.264e+23,5.466e+22,3.857e+23,4.721e+23,1.818e+23,3.237e+23])
redshift = np.array([1.58,1.825,0.624,0.369,1.247,0.906,0.422,0.66])
field = np.array([True,True,False,True,False,True,False,False])
DMs = {}
for i in range(len(n1)):
    DMs['C{0}'.format(i)]=0
for SC,SE,level in zip(n1,n2,DMs):
    DMmax = Column([1 for x in redshift], name='DMmax')
    DMmax[field]=(((1+redshift[field])**(-0.25))*(luminosity[field]/(4*pi*5*SE))**0.5)*3.24078e-23
    DMmax[~field]=(((1+redshift[~field])**(-0.25))*(luminosity[~field]/(4*pi*5*SC))**0.5)*3.24078e-23
    DMs[level] = DMmax
Thanks all!
Numpy was built for this! (provided all arrays are of the same shape)
Just add them, and numpy will move element-wise through the arrays. This also has the benefit of being orders of magnitude faster than using a for-loop in the Python layer.
Example:
>>> n1 = np.array([1,2,3])
>>> n2 = np.array([1,2,3])
>>> total = n1 + n2
>>> total
array([2,4,6])
>>> mask = np.array([True, False, True])
>>> n1[mask] ** n2[mask]
array([ 1, 27])
Edit additional input
You might be able to do something like this:
SE_array = (((1+redshift[field]) ** (-0.25)) * (luminosity[field]/(4*pi*5*n1[field])) ** 0.5) * 3.24078e-23
SC_array = (((1+redshift[field]) ** (-0.25)) * (luminosity[field]/(4*pi*5*n2[field])) ** 0.5) * 3.24078e-23
and make the associations by stacking the new arrays:
DM = np.dstack((SE_array, SC_array))
reshaper = DM.shape[1:] # take from shape (1, 6, 2) to (6,2), where 6 is the length of the arrays
DM = DM.reshape(reshaper)
This will give you a 2d array like:
array([[SE_1, SC_1],
[SE_2, SC_2]])
Hope this is helpful
If you can't just add the numpy arrays you can extract the creation of the composite array into a function.
def get_element(i):
    global n1, n2, luminosity, redshift, field
    return n1[i] + n2[i] + luminosity[i] + redshift[i] + field[i]
L = len(n1)
composite = [get_element(i) for i in range(L)]
The answer was staring me in the face, but thanks to @willnx, @cricket_007, and @andrew-lavq. Your suggestions made me realise how simple the solution is.
Just add them, and numpy will move element-wise through the arrays. -- willnx
You need a loop to sum all values of a collection -- cricket_007
so it really is as simple as
sum(x for x in DMs.values())
I'm not sure if this is the fastest solution, but I think it's the simplest.
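As a possible alternative, np.sum over the stacked values gives the same element-wise total. The sketch below uses plain ndarrays with made-up values in place of the astropy Columns, so how it behaves on Column objects is an assumption.

```python
import numpy as np

# Hypothetical stand-in for the dictionary of equally shaped arrays.
DMs = {
    'C0': np.array([1.0, 2.0, 3.0]),
    'C1': np.array([10.0, 20.0, 30.0]),
    'C2': np.array([100.0, 200.0, 300.0]),
}

# Built-in sum adds the arrays one after another, starting from 0.
total_builtin = sum(DMs.values())

# np.sum stacks the arrays into a 2-D array and reduces along axis 0.
total_numpy = np.sum(list(DMs.values()), axis=0)
print(total_numpy)  # [111. 222. 333.]
```

For a handful of arrays the difference in speed is unlikely to matter, so the simpler built-in sum is a reasonable choice.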