Multiplying arrays with broadcasting - python

I have an mxn matrix A and an nxr matrix B that I want to multiply in a specific way to get an mxr matrix. I want to multiply every element in the ith column of A as a scalar with the ith row of B, and then sum the n resulting matrices.
For example
a = [[0, 1, 2],
[3, 4, 5]]
b = [[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11]]
The product would be
a*b = [[0, 0, 0, 0],   + [[ 4,  5,  6,  7],   + [[16, 18, 20, 22],   = [[20, 23, 26, 29],
       [0, 3, 6, 9]]      [16, 20, 24, 28]]      [40, 45, 50, 55]]      [56, 68, 80, 92]]
I can't use any loops so I'm pretty sure I have to use broadcasting but I don't know how. Any help is appreciated

Your input matrices are of shape (2, 3) and (3, 4) respectively and the result you want is of shape (2, 4).
What you need is just the matrix product of your two matrices, which np.dot computes:
import numpy as np
a = np.array([[0, 1, 2],
[3, 4, 5]])
b = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11]])
print(np.dot(a, b))
# [[20 23 26 29]
#  [56 68 80 92]]
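To see why the matrix product matches the sum described in the question, the n intermediate matrices can also be built explicitly with broadcasting. This is only a sketch; the names terms and total are illustrative and do not come from the original posts.

```python
import numpy as np

a = np.array([[0, 1, 2],
              [3, 4, 5]])
b = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])

# a.T has shape (3, 2): row i of a.T is column i of a.
# a.T[:, :, None] is (3, 2, 1) and b[:, None, :] is (3, 1, 4),
# so the product broadcasts to (3, 2, 4): one (2, 4) matrix per index i.
terms = a.T[:, :, None] * b[:, None, :]

# Summing over the first axis adds the n matrices together.
total = terms.sum(axis=0)

print(np.array_equal(total, np.dot(a, b)))  # True
```

In practice np.dot (or a @ b) is the simpler choice, since it avoids materialising the intermediate (n, m, r) array.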

Related

How can you multiply all the values within a 2D df with all the values within a 1D df separately?

I'm new to numpy and I'm currently working on a modeling project for which I have to perform some calculations based on two different data sources. However until now I haven't managed to figure out how I could multiply all the individual values to each other:
I have two data frames
One 2D-dataframe:
df1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
One 1D-dataframe:
df2 = np.array([1, 2, 3, 4, 5])
I would like to multiply all the individual values within the first dataframe (df1) separately with all the values that are stored within the second dataframe in order to create a data cube / new 3D-dataframe that has the shape 5x3x3:
df3 = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[2, 4, 6], [8, 10, 12], [14, 16, 18]], ..... ])
I tried different methods but every time I failed to obtain something that looks like df3.
x = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
y = np.array([1, 2, 3, 4, 5])
z = y
for i in range(len(z)):
    z.iloc[i] = x
for i in range(0, 5):
    for j in range(0, 3):
        for k in range(0, 3):
            z.iloc[i, j, k] = y.iloc[i] * x.iloc[j, k]
print(z)
Could someone help me out with some example code? Thank you!
Try this:
df1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df2 = np.array([1, 2, 3, 4, 5])
df3 = df1 * df2[:, None, None]
Output:
>>> df3
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[ 2, 4, 6],
[ 8, 10, 12],
[14, 16, 18]],
[[ 3, 6, 9],
[12, 15, 18],
[21, 24, 27]],
[[ 4, 8, 12],
[16, 20, 24],
[28, 32, 36]],
[[ 5, 10, 15],
[20, 25, 30],
[35, 40, 45]]])
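The reason this works comes down to shapes: indexing with None inserts length-1 axes, and broadcasting then stretches them. A short sketch of the shapes involved (the names scale and same are illustrative only):

```python
import numpy as np

df1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df2 = np.array([1, 2, 3, 4, 5])

# df2[:, None, None] inserts two length-1 axes: shape (5,) becomes (5, 1, 1).
scale = df2[:, None, None]
print(scale.shape)  # (5, 1, 1)

# Broadcasting (5, 1, 1) against (3, 3) yields (5, 3, 3):
# slice k of the result is df2[k] * df1.
df3 = df1 * scale
print(df3.shape)  # (5, 3, 3)

# np.multiply.outer builds the same cube without manual reshaping.
same = np.multiply.outer(df2, df1)
print(np.array_equal(df3, same))  # True
```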

Slicing 3D numpy array using list of index

The objective is to slice a 3D array using a list of indices.
Here, the array has shape (2, 5, 5). For simplicity, let's assume the indices 0 to 4 are labelled A, B, C, D, E.
Assume we have 3d array as below
array([[[44, 47, 64, 67, 67],
[ 9, 83, 21, 36, 87],
[70, 88, 88, 12, 58],
[65, 39, 87, 46, 88],
[81, 37, 25, 77, 72]],
[[ 9, 20, 80, 69, 79],
[47, 64, 82, 99, 88],
[49, 29, 19, 19, 14],
[39, 32, 65, 9, 57],
[32, 31, 74, 23, 35]]], dtype=int64)
The indices of interest are [1,3,4]. Again, we label these as B, D, E. The expected output, when slicing the 3D array based on these indices, is as below
array([[[83, 36, 87],
[39, 46, 88],
[37, 77, 72]],
[[64, 99, 88],
[32, 9, 57],
[31, 23, 35]]], dtype=int64)
However, slicing the array as below
import numpy as np
np.random.seed(0)
arr = np.random.randint(0, 100, size=(2, 5, 5))
k=arr[:,(1,3,4),(1,3,4)]
does not produce the expected output.
In the actual use case, the number of elements to be sliced is greater than 3 (more than B, D, E). Sorry for the lack of correct terminology.
Try this, which has a similar structure to your arr[:,idx,idx] but uses np.ix_(). Do read the documentation for np.ix_().
idx = [1,3,4]
ixgrid = np.ix_(idx,idx)
arr[:,ixgrid[0],ixgrid[1]]
array([[[83, 36, 87],
[39, 46, 88],
[37, 77, 72]],
[[64, 99, 88],
[32, 9, 57],
[31, 23, 35]]])
Explanation
What you WANT to do is extract a mesh from the last 2 axes of the array. But what you are doing is extracting exact index pairs from the 2 axes.
When you use arr[:,(1,3,4),(1,3,4)], you are essentially asking for (1,1), (3,3) and (4,4) from the two matrices arr[0] and arr[1]
What you need is to extract a mesh. This can be achieved with np.ix_ and the magic of broadcasting.
If you ask for ...
[[1],
 [3],    and    [1, 3, 4]
 [4]]
... which is what the np.ix_ constructs, you broadcast the indexes and instead ask for a cross product between them, which is (1,1), (1,3), (1,4), (3,1), (3,3)... etc.
Hope that clarifies why you get the result you are getting and how you can actually get what you need.
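The difference between the two indexing styles can be checked directly by looking at shapes. This is a small sketch reusing the question's seeded array; the names paired, mesh, row_idx and col_idx are illustrative only.

```python
import numpy as np

np.random.seed(0)
arr = np.random.randint(0, 100, size=(2, 5, 5))
idx = [1, 3, 4]

# Two plain index lists are paired element by element:
# this picks (1, 1), (3, 3), (4, 4) from each 5x5 matrix.
paired = arr[:, idx, idx]
print(paired.shape)  # (2, 3)

# np.ix_ reshapes the lists so that they broadcast into a grid.
row_idx, col_idx = np.ix_(idx, idx)
print(row_idx.shape, col_idx.shape)  # (3, 1) (1, 3)

# Broadcasting (3, 1) against (1, 3) gives every row/column combination.
mesh = arr[:, row_idx, col_idx]
print(mesh.shape)  # (2, 3, 3)
```

The paired result is simply the diagonal of each 3x3 block in the mesh result.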
The problem
When you pass several index sequences, advanced indexing broadcasts them against each other and pairs them up element by element instead of forming a grid. What you're doing here is grabbing the elements at coordinates (1, 1), (3, 3), (4, 4) in each array along axis 0.
The solution
What you need to do is this instead:
idx = (1, 3, 4) # the indices of interest
arr[np.ix_((0, 1), idx, idx)]
Where (0, 1) corresponds to the first two arrays along axis 0.
Output:
array([[[83, 36, 87],
[39, 46, 88],
[37, 77, 72]],
[[64, 99, 88],
[32, 9, 57],
[31, 23, 35]]], dtype=int64)
As shown above, np.ix_((0, 1), idx, idx) produces a tuple of index arrays which can be used for advanced indexing. The (0, 1) means that you're explicitly selecting the elements from the arrays arr[0] and arr[1]. If you have a more general 3D array of shape (n, m, q) and want to grab the same subarray out of every array along axis 0, you can use
np.ix_(np.arange(arr.shape[0]), idx, idx)
as your indices. Note that idx is repeated here because you wanted those specific indices, but in general the row and column indices don't need to match.
Generalizing
More generally, you can slice and dice however you want like so:
In [1]: arrays_to_select = (0, 1)
In [2]: rows_to_select = (1, 3, 4)
In [3]: cols_to_select = (1, 3, 4)
In [4]: indices = np.ix_(arrays_to_select, rows_to_select, cols_to_select)
In [5]: arr[indices]
Out[5]:
array([[[83, 36, 87],
[39, 46, 88],
[37, 77, 72]],
[[64, 99, 88],
[32, 9, 57],
[31, 23, 35]]], dtype=int64)
Let's consider some other shape:
In [4]: x = np.random.randint(0, 9, (4, 3, 5))
In [5]: x
Out[5]:
array([[[1, 0, 2, 1, 0],
[3, 5, 1, 4, 3],
[1, 8, 1, 4, 2]],
[[1, 6, 8, 2, 8],
[0, 0, 4, 2, 3],
[8, 5, 6, 2, 5]],
[[4, 4, 8, 6, 0],
[3, 0, 1, 2, 8],
[0, 8, 2, 4, 3]],
[[7, 8, 8, 1, 4],
[5, 7, 4, 8, 5],
[7, 5, 5, 3, 4]]])
In [6]: rows = (0, 2)
In [7]: cols = (0, 2, 3, 4)
By using those rows and cols, you'll be grabbing the subarrays composed of the elements from columns 0, 2, 3 and 4 (column 1 is skipped), from only the rows 0 and 2. Let's verify that with the first array along axis 0:
In [8]: arrs = (0,) # A 1-tuple which will give us only the first array along axis 0
In [9]: x[np.ix_(arrs, rows, cols)]
Out[9]:
array([[[1, 2, 1, 0],
[1, 1, 4, 2]]])
Now suppose you want the subarrays produced by rows and cols of only the first and last arrays along axis 0. You can explicitly select (0, -1):
In [10]: arrs = (0, -1)
In [11]: x[np.ix_(arrs, rows, cols)]
Out[11]:
array([[[1, 2, 1, 0],
[1, 1, 4, 2]],
[[7, 8, 1, 4],
[7, 5, 3, 4]]])
If, instead, you want that same subarray from all the arrays along axis 0:
In [12]: arrs = np.arange(x.shape[0])
In [13]: arrs
Out[13]: array([0, 1, 2, 3])
In [14]: x[np.ix_(arrs, rows, cols)]
Out[14]:
array([[[1, 2, 1, 0],
[1, 1, 4, 2]],
[[1, 8, 2, 8],
[8, 6, 2, 5]],
[[4, 8, 6, 0],
[0, 2, 4, 3]],
[[7, 8, 1, 4],
[7, 5, 3, 4]]])
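When every array along axis 0 is wanted, a plain slice on that axis combined with manually broadcast index arrays should give the same result as the np.ix_ call. A sketch under that assumption, using a deterministic array instead of the random one above (the names explicit and sliced are illustrative):

```python
import numpy as np

x = np.arange(4 * 3 * 5).reshape(4, 3, 5)
rows = np.array([0, 2])
cols = np.array([0, 2, 3, 4])

# Explicit selection of every array along axis 0 with np.ix_.
explicit = x[np.ix_(np.arange(x.shape[0]), rows, cols)]

# A slice on axis 0, plus a (2, 1) row index broadcast against a (4,) column index.
sliced = x[:, rows[:, None], cols]

print(explicit.shape, sliced.shape)  # (4, 2, 4) (4, 2, 4)
print(np.array_equal(explicit, sliced))  # True
```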

Select different slices from each numpy row

I have a 3d tensor and I want to select different slices from the dim=2. something like a[[0, 1], :, [slice(2, 4), slice(1, 3)]].
a=np.arange(2*3*5).reshape(2, 3, 5)
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]],
[[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]])
# then I want something like a[[0, 1], :, [slice(2, 4), slice(1, 3)]]
# that gives me np.stack([a[0, :, 2:4], a[1, :, 1:3]]) without a for loop
array([[[ 2, 3],
[ 7, 8],
[12, 13]],
[[16, 17],
[21, 22],
[26, 27]]])
and I've seen this and it is not what I want.
You can use advanced indexing as explained here. You will have to pass the row ids which are [0, 1] in your case and the column ids 2, 3 and 1, 2. Here 2,3 means [2:4] and 1, 2 means [1:3]
import numpy as np
a=np.arange(2*3*5).reshape(2, 3, 5)
rows = np.array([[0], [1]], dtype=np.intp)
cols = np.array([[2, 3], [1, 2]], dtype=np.intp)
aa = np.stack(a[rows, :, cols]).swapaxes(1, 2)
# array([[[ 2, 3],
# [ 7, 8],
# [12, 13]],
# [[16, 17],
# [21, 22],
# [26, 27]]])
Another equivalent way to avoid swapaxes and getting the result in desired format is
aa = np.stack(a[rows, :, cols], axis=2).T
A third way I figured out is by passing the list of indices. Here [0, 0] will correspond to [2,3] and [1, 1] will correspond to [1, 2]. The swapaxes is just to get your desired format of output
a[[[0,0], [1,1]], :, [[2,3], [1,2]]].swapaxes(1,2)
A solution...
import numpy as np
a = np.arange(2*3*5).reshape(2, 3, 5)
np.array([a[0,:,2:4], a[1,:,1:3]])
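Another loop-free option, assuming every window has the same width, is to build the column indices from the start positions and let np.take_along_axis do the gathering. This is a sketch, not one of the original answers; starts, width and cols are hypothetical names.

```python
import numpy as np

a = np.arange(2 * 3 * 5).reshape(2, 3, 5)

starts = np.array([2, 1])  # first column of each window (hypothetical name)
width = 2                  # every window must have the same width

# Column indices per array along axis 0: [[2, 3], [1, 2]]
cols = starts[:, None] + np.arange(width)

# Insert a length-1 axis so the (2, 1, 2) indices broadcast over the 3 rows.
result = np.take_along_axis(a, cols[:, None, :], axis=2)
print(result.shape)  # (2, 3, 2)
```

The result is already in the desired (2, 3, 2) layout, so no swapaxes is needed.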

Python Matrix Multiplication Variations

I just asked a question about multiplying matrices, which can be found here. I have one more question about multiplying matrices, though. Say I have the following matrices:
matrix_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_b = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
How could I get a result like this:
[[1, 4, 9], [16, 25, 36], [49, 64, 81]]
...so that each element is basically being multiplied by the single corresponding element of the other array. Does anyone know how to do that?
Thanks guys!
You could express the element-wise product (and matrix product) using list comprehensions, zip, and the * argument-unpacking operator:
matrix_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_b = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
elementwise_product = [[ai*bi for ai, bi in zip(*rows)]
for rows in zip(matrix_a, matrix_b)]
print(elementwise_product)
# [[1, 4, 9], [16, 25, 36], [49, 64, 81]]
matrix_product = [[sum([ai*bi for ai, bi in zip(row_a, col_b)])
for col_b in zip(*matrix_b)]
for row_a in matrix_a]
print(matrix_product)
# [[30, 36, 42], [66, 81, 96], [102, 126, 150]]
The numpy package provides an array object that can do both element-wise and matrix-wise calculations:
import numpy as np
matrix_a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix_b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix_a*matrix_b
np.dot(matrix_a, matrix_b)
This outputs:
array([[ 1, 4, 9],
[16, 25, 36],
[49, 64, 81]])
array([[ 30, 36, 42],
[ 66, 81, 96],
[102, 126, 150]])
Numpy is available using pip install numpy or by using one of the numerical python distributions such as anaconda or pythonxy.
Since those lists are equal, you can just multiply it with itself. Here is a slightly verbose way to iterate the matrix and store the result in a new one.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result_matrix = [[], [], []]
print(matrix)
for i in range(len(matrix)):
    for j in range(len(matrix[i])):
        result_matrix[i].append(matrix[i][j] * matrix[i][j])
print(result_matrix)
Output
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[[1, 4, 9], [16, 25, 36], [49, 64, 81]]
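The loop above relies on both matrices being identical. A small sketch of the same idea for two different matrices of equal shape, without NumPy (the function name elementwise_product and the second matrix are made up for illustration):

```python
def elementwise_product(m1, m2):
    """Multiply two equally shaped nested lists element by element."""
    same_rows = len(m1) == len(m2)
    if not same_rows or any(len(r1) != len(r2) for r1, r2 in zip(m1, m2)):
        raise ValueError("matrices must have the same shape")
    # Pair up rows, then pair up the entries within each row.
    return [[x * y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


matrix_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]  # illustrative values
print(elementwise_product(matrix_a, matrix_b))
# [[9, 16, 21], [24, 25, 24], [21, 16, 9]]
```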

Using NumPy arrays as indices to NumPy arrays

I have a 3x3x3 NumPy array:
>>> x = np.arange(27).reshape((3, 3, 3))
>>> x
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
Now I create an ordinary list of indices:
>>> i = [[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 0, 1]]
As expected, I get four values using this list as the index:
>>> x[i]
array([ 7, 14, 18, 13])
But if I now convert i into a NumPy array, I won't get the same answer.
>>> j = np.asarray(i)
>>> x[j]
array([[[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]],
...,
[[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]],
[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]]]])
Why is this so? Why can't I use NumPy arrays as indices to NumPy array?
x[j] is the equivalent of x[j,:,:]
In [163]: j.shape
Out[163]: (3, 4)
In [164]: x[j].shape
Out[164]: (3, 4, 3, 3)
The resulting shape is the shape of j joined with the last 2 dimensions of x. j just selects from the 1st dimension of x.
x[i] on the other hand, is the equivalent to x[tuple(i)], that is:
In [168]: x[[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 0, 1]]
Out[168]: array([ 7, 14, 18, 13])
In fact x[tuple(j)] produces the same 4 item array.
The different ways of indexing numpy arrays can be confusing.
Another example of how the shape of the index array or lists affects the output:
In [170]: x[[[0, 1], [2, 1]], [[2, 1], [0, 1]], [[1, 2], [0, 1]]]
Out[170]:
array([[ 7, 14],
[18, 13]])
Same items, but in a 2d array.
Check out the NumPy docs: what you are doing is "Integer Array Indexing", so you need to pass each coordinate in as a separate array, wrapped in a tuple:
j = tuple(np.array(idx) for idx in i)
x[j]
Out[191]: array([ 7, 14, 18, 13])
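The two interpretations can be compared side by side. As far as I know, how a plain list of lists is read has differed between NumPy releases (older ones read it like a tuple, as in the question, while newer ones treat it as a single array index), so an explicit tuple is the unambiguous spelling. A short sketch, where picked is an illustrative name:

```python
import numpy as np

x = np.arange(27).reshape((3, 3, 3))
i = [[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 0, 1]]
j = np.asarray(i)

# A single array index only addresses axis 0:
# the result shape is j.shape + x.shape[1:].
print(x[j].shape)  # (3, 4, 3, 3)

# A tuple supplies one index sequence per axis, so the entries are
# read as coordinates (0, 2, 1), (1, 1, 2), (2, 0, 0), (1, 1, 1).
picked = x[tuple(j)]
print(picked)  # [ 7 14 18 13]
```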
