Multiplication with two 2D matrix to output 3D matrix - python

I am looking for a good way to compute this kind of multiplication.
It simply multiplies each row x[i] by x element-wise, producing an array of shape [2, 2, 3].
>>> x
array([[0, 1, 2],
[3, 4, 5]])
>>> output
array([[[ 0, 1, 4],
[ 0, 4, 10]],
[[ 0, 4, 10],
[ 9, 16, 25]]])
I tried the code below and am wondering whether there is a faster version using numpy.
np.array([
    np.multiply(x[i], x)
    for i in range(x.shape[0])
])

There are two straightforward ways to do this: the first uses broadcasting and the second uses einsum. I'd recommend using timeit to compare the versions for speed on the array sizes you have in mind:
out_broadcast = x[:, None, :] * x
out_einsum = np.einsum('ij,kj->ikj',x,x)
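As a rough sketch of such a comparison, the snippet below first checks that the loop, broadcasting, and einsum versions agree on the example array, then times each one. The function names, the (100, 200) test shape, and the repeat count are arbitrary assumptions chosen for illustration; timings will vary with array shape and hardware.

```python
import timeit

import numpy as np


def loop_version(x):
    # The original approach: one element-wise product per row.
    return np.array([np.multiply(x[i], x) for i in range(x.shape[0])])


def broadcast_version(x):
    # (n, 1, m) * (n, m) broadcasts to (n, n, m).
    return x[:, None, :] * x


def einsum_version(x):
    return np.einsum('ij,kj->ikj', x, x)


# Correctness check on the small example from the question.
x = np.arange(6).reshape(2, 3)
expected = loop_version(x)
assert np.array_equal(broadcast_version(x), expected)
assert np.array_equal(einsum_version(x), expected)

# Timing on a larger, arbitrary input.
big = np.random.rand(100, 200)
repeats = 5
for fn in (loop_version, broadcast_version, einsum_version):
    seconds = timeit.timeit(lambda: fn(big), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats:.4f} s per call")
```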

Related

How to sum specific row values together in Sparse COO matrix to reshape matrix

I have a sparse coo matrix built in python using the scipy library. An example data set looks something like this:
>>> v.toarray()
array([[1, 0, 2, 4],
[0, 0, 3, 1],
[4, 5, 6, 9]])
I would like to add columns 0 and 2 together, and columns 1 and 3 together, so that the shape changes from (3, 4) to (3, 2).
However, looking at the docs, the sum method doesn't seem to support slicing of any sort. The only approach I have come up with is to loop over the matrix as a dense array and use numpy to get the summed values, like so:
a_col = []
b_col = []
for x in range(len(v.toarray())):
    a_col.append(np.sum(v.toarray()[x, [0, 2]], axis=0))
    b_col.append(np.sum(v.toarray()[x, [1, 3]], axis=0))
Then use those values for a_col and b_col to create the matrix again.
But surely there should be a way to handle it with the sum method?
You can add the values with a simple loop and 2D slicing, and then take the columns you want:
v = np.array([[1, 0, 2, 4],
[0, 0, 3, 1],
[4, 5, 6, 9]])
for i in range(2):
    v[:, i] = v[:, i] + v[:, i+2]
print(v[:, :2])
Output
[[ 3 4]
[ 3 1]
[10 14]]
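For a dense array, the same result can be obtained without a loop and without modifying v in place. This is a minimal sketch, assuming (as in the example) that the columns to combine are the left and right halves of the array; the names `summed` and `summed_alt` are illustrative.

```python
import numpy as np

v = np.array([[1, 0, 2, 4],
              [0, 0, 3, 1],
              [4, 5, 6, 9]])

# Add the left half (columns 0, 1) to the right half (columns 2, 3).
summed = v[:, :2] + v[:, 2:]

# Equivalent formulation: split the columns into 2 groups and sum the groups.
summed_alt = v.reshape(v.shape[0], 2, -1).sum(axis=1)

assert np.array_equal(summed, summed_alt)
print(summed)
```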
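You can also use csr_matrix.dot with a 0/1 selector matrix to achieve the same result:
# csr is assumed to hold the original matrix in CSR format, e.g. csr = v.tocsr()
csr = csr_matrix(csr.dot(np.array([[1,0,1,0],[0,1,0,1]]).T))
# csr.data
# [ 3, 4, 3, 1, 10, 14]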
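To make the selector-matrix idea concrete, here is a small self-contained sketch that keeps everything sparse. It assumes SciPy is installed; the names `selector` and `reduced` are illustrative and not part of the original answer.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix

v = coo_matrix(np.array([[1, 0, 2, 4],
                         [0, 0, 3, 1],
                         [4, 5, 6, 9]]))

# selector[j, k] == 1 means input column j contributes to output column k.
selector = csr_matrix(np.array([[1, 0],
                                [0, 1],
                                [1, 0],
                                [0, 1]]))

# Sparse-by-sparse product: the (3, 4) input becomes a sparse (3, 2) result.
reduced = v.tocsr() @ selector
print(reduced.toarray())
```

Because both operands are sparse, the product does not require converting v to a dense array first.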

Is there a way to use numpy.outer on only a subset of dimensions?

I have an array of arrays, like this:
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
I would like to calculate the pairwise differences between the array elements, something like:
[[[ 0, 0, 0], [-3, -3, -3]],
 [[ 3, 3, 3], [ 0, 0, 0]]]
My first thought was to use np.subtract.outer(a, a), but that doesn't do what I want: it goes one level too deep into the arrays. I can see that the numbers I need are in the output of np.subtract.outer(a, a), but the array that I'm actually working with is very large, and I don't have enough memory to allocate that full result.
Thank you!
You can simply use broadcasting to solve this.
a[:, None, :] - a[None, :, :]
This gives you what you want.
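The sketch below compares the shapes of the two approaches on the example array. Since memory was the concern, it also shows one possible way to produce the result block by block; `pairwise_diff_blocks` and its `block` parameter are hypothetical helpers introduced here for illustration, not part of NumPy.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Broadcasting: diff[i, j, k] == a[i, k] - a[j, k], shape (n, n, m).
diff = a[:, None, :] - a[None, :, :]

# The outer version pairs every element with every element: shape (n, m, n, m).
outer = np.subtract.outer(a, a)
print(diff.shape, outer.shape)


def pairwise_diff_blocks(arr, block=1024):
    """Yield (start_row, differences) for a few rows of the result at a time."""
    for start in range(0, arr.shape[0], block):
        yield start, arr[start:start + block, None, :] - arr[None, :, :]


# Processing block by block keeps only `block * n * m` values in memory at once.
for start, part in pairwise_diff_blocks(a, block=1):
    assert np.array_equal(part, diff[start:start + 1])
```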

Numpy - Apply a custom function on all combination of rows in matrix to get a new matrix?

I have the following function, that applies the histogram intersection kernel for 2 arrays:
def histogram_intersection_kernel(X, Y):
    k = np.array([])
    for x_i,y_i in zip(X,Y):
        k = np.append(k,np.minimum(x_i,y_i))
    return np.sum(k)
Now, let's say I have the following matrix "mat":
[[1,0,0,2,3],
[2,3,4,0,1],
[3,3,5,0,1]]
I would like to find an efficient way to get the matrix that is the result of applying "histogram_intersection_kernel" to all of the combinations of rows in mat. In this example it would be:
[[6,2,2],
[2,10,10],
[2,10,12]]
Extend dimensions to 3D and leverage broadcasting -
np.minimum(a[:,None,:],a[None,:,:]).sum(axis=2)
Or simply -
np.minimum(a[:,None],a).sum(2)
Sample run -
In [248]: a
Out[248]:
array([[1, 0, 0, 2, 3],
[2, 3, 4, 0, 1],
[3, 3, 5, 0, 1]])
In [249]: np.minimum(a[:,None],a).sum(2)
Out[249]:
array([[ 6, 2, 2],
[ 2, 10, 10],
[ 2, 10, 12]])
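As a sanity check, the broadcasting result can be compared against a pairwise reference built from the kernel function. This is a sketch: the kernel here is a condensed equivalent of the loop-based function in the question, and `reference` and `vectorized` are illustrative names.

```python
import numpy as np


def histogram_intersection_kernel(X, Y):
    # Sum of element-wise minima of the two histograms.
    return np.minimum(X, Y).sum()


mat = np.array([[1, 0, 0, 2, 3],
                [2, 3, 4, 0, 1],
                [3, 3, 5, 0, 1]])

# Reference: apply the kernel to every ordered pair of rows.
n = mat.shape[0]
reference = np.array([[histogram_intersection_kernel(mat[i], mat[j])
                       for j in range(n)]
                      for i in range(n)])

# Vectorized: (n, 1, m) against (1, n, m), then sum over the last axis.
vectorized = np.minimum(mat[:, None, :], mat[None, :, :]).sum(axis=2)

assert np.array_equal(reference, vectorized)
print(vectorized)
```

Note that the intermediate array has shape (n, n, m), so for many rows the memory cost grows quadratically with n.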

C++ equivalent of numpy.expand_dims() and numpy.concatenate()

As mentioned in the title, I currently have some 2D image data read with OpenCV, and I need to change its dimensions to 4D, e.g. from [320, 720] to [1, 320, 720, 1], and then combine all of the data into a single 4D matrix.
In Python, I can just call numpy.expand_dims() on each image and then numpy.concatenate() them together. I'm wondering whether there are equivalent APIs that I could use in C++. I've found expand_dims() in Tensorflow, but it only works on tensors, and I haven't found anything for concatenate() yet.
Libraries like OpenCV, Tensorflow, and Boost are welcome. But I want to keep things light, so it would be better if I could implement it myself (if it is not too complicated). Thank you in advance.
expand_dims plus concatenate could, in a more iterative language, be written as:
In [107]: x = np.arange(12).reshape(3,4)
In [109]: y = np.zeros((2,3,4,3),dtype=int)
In [110]: for i in range(2):
     ...:     for j in range(3):
     ...:         y[i,:,:,j] = x
     ...:
In [111]: y
Out[111]:
array([[[[ 0, 0, 0],
[ 1, 1, 1],
[ 2, 2, 2],
[ 3, 3, 3]],
[[ 4, 4, 4],
[ 5, 5, 5],
....
[10, 10, 10],
[11, 11, 11]]]])
In other words, all you need is the ability to create a target array of the right size, and the ability to copy the 2d arrays into appropriate slots.
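To connect this to the image case in the question, the sketch below builds an (N, H, W, 1) array both ways in NumPy and checks that they match. The small constant-valued `images` are hypothetical stand-ins for frames read with OpenCV. The manual route (allocate once, then copy per image) is the part that should carry over to a C++ implementation over a contiguous buffer.

```python
import numpy as np

# Hypothetical stand-ins for 2D images (kept tiny for illustration).
images = [np.full((4, 6), fill_value=n, dtype=np.uint8) for n in range(3)]

# NumPy route: expand each image to (1, H, W, 1), then join along axis 0.
stacked = np.concatenate(
    [np.expand_dims(np.expand_dims(img, 0), -1) for img in images],
    axis=0,
)

# Manual route: allocate the target once, then copy each image into its slot.
height, width = images[0].shape
target = np.empty((len(images), height, width, 1), dtype=images[0].dtype)
for n, img in enumerate(images):
    target[n, :, :, 0] = img

assert np.array_equal(stacked, target)
print(target.shape)  # (3, 4, 6, 1)
```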

Numpy.where used with list of values

I have a 2D array and a 1D array. For each value in the 1D array, I am looking for the rows of the 2D array that contain it at least once, as follows:
import numpy as np
A = np.array([[0, 3, 1],
[9, 4, 6],
[2, 7, 3],
[1, 8, 9],
[6, 2, 7],
[4, 8, 0]])
B = np.array([0,1,2,3])
results = []
for elem in B:
    results.append(np.where(A==elem)[0])
This works and results in the following array:
[array([0, 5], dtype=int64),
array([0, 3], dtype=int64),
array([2, 4], dtype=int64),
array([0, 2], dtype=int64)]
But this is probably not the best way of proceeding. Following the answers given in this question (Search Numpy array with multiple values) I tried the following solutions:
out1 = np.where(np.in1d(A, B))
num_arr = np.sort(B)
idx = np.searchsorted(B, A)
idx[idx==len(num_arr)] = 0
out2 = A[A == num_arr[idx]]
But these give me incorrect values:
In [36]: out1
Out[36]: (array([ 0, 1, 2, 6, 8, 9, 13, 17], dtype=int64),)
In [37]: out2
Out[37]: array([0, 3, 1, 2, 3, 1, 2, 0])
Thanks for your help
If you only need to know whether each row of A contains any element of B, regardless of which element it is, you can use the following:
input:
np.isin(A,B).sum(axis=1)>0
output:
array([ True, False, True, True, True, True])
Since you're dealing with a 2D array*, you can use broadcasting to compare B with a raveled version of A. This gives you the matching indices in the raveled array. You can then convert them back to indices in the original array using np.unravel_index.
In [50]: d = np.where(B[:, None] == A.ravel())[1]
In [51]: np.unravel_index(d, A.shape)
Out[51]: (array([0, 5, 0, 3, 2, 4, 0, 2]), array([0, 2, 2, 0, 0, 1, 1, 2]))
# the first array holds the expected row indices
* From documentation: For 3-dimensional arrays this is certainly efficient in terms of lines of code, and, for small data sets, it can also be computationally efficient. For large data sets, however, the creation of the large 3-d array may result in sluggish performance.
Also, Broadcasting is a powerful tool for writing short and usually intuitive code that does its computations very efficiently in C. However, there are cases when broadcasting uses unnecessarily large amounts of memory for a particular algorithm. In these cases, it is better to write the algorithm's outer loop in Python. This may also produce more readable code, as algorithms that use broadcasting tend to become more difficult to interpret as the number of dimensions in the broadcast increases.
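If the row indices should stay grouped per element of B, as in the question's expected output, one option is to build a boolean mask with broadcasting. The sketch below also shows the lower-memory variant with the outer loop in Python; `mask`, `rows_per_value`, and `rows_loop` are illustrative names.

```python
import numpy as np

A = np.array([[0, 3, 1],
              [9, 4, 6],
              [2, 7, 3],
              [1, 8, 9],
              [6, 2, 7],
              [4, 8, 0]])
B = np.array([0, 1, 2, 3])

# mask[b, r] is True when row r of A contains B[b] at least once.
# The intermediate comparison has shape (len(B), rows, cols).
mask = (A[None, :, :] == B[:, None, None]).any(axis=2)
rows_per_value = [np.flatnonzero(m) for m in mask]

# Lower-memory variant: loop over B in Python, one 2D comparison at a time.
rows_loop = [np.flatnonzero((A == b).any(axis=1)) for b in B]

for value, rows in zip(B, rows_per_value):
    print(value, rows)
```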
Is something like this what you are looking for?
import numpy as np
from itertools import combinations
A = np.array([[0, 3, 1],
[9, 4, 6],
[2, 7, 3],
[1, 8, 9],
[6, 2, 7],
[4, 8, 0]])
B = np.array([0,1,2,3])
for i in combinations(A, 2):
    if np.all(np.isin(B, np.hstack(i))):
        print(i[0], ' ', i[1])
which prints the following:
[0 3 1] [2 7 3]
[0 3 1] [6 2 7]
Note: this solution does NOT require the rows to be consecutive. Please let me know if that is required.
