Is it possible to apply numpy broadcasting (with 1D arrays),
x=np.arange(3)[:,np.newaxis]
y=np.arange(3)
x+y=
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
to 3D arrays similar to the one below, such that each a[i] is treated like an element of the 1D vectors in the example above?
a=np.zeros((2,2,2))
a[0]=1
b=a
result=a+b
resulting in
result[0,0]=array([[2, 2],
[2, 2]])
result[0,1]=array([[1, 1],
[1, 1]])
result[1,0]=array([[1, 1],
[1, 1]])
result[1,1]=array([[0, 0],
[0, 0]])
You can do this the same way as with 1D arrays, i.e., insert a new axis between axis 0 and axis 1 in either a or b:
a + b[:,None] # or a[:,None] + b
(a + b[:,None])[0,0]
#array([[ 2., 2.],
# [ 2., 2.]])
(a + b[:,None])[0,1]
#array([[ 1., 1.],
# [ 1., 1.]])
(a + b[:,None])[1,0]
#array([[ 1., 1.],
# [ 1., 1.]])
(a + b[:,None])[1,1]
#array([[ 0., 0.],
# [ 0., 0.]])
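To see what the extra axis does to the shapes, here is a minimal sketch using the arrays from the question. The only change is that b is made with a.copy() instead of b = a, which is my own choice so the two names don't alias the same data.

```python
import numpy as np

a = np.zeros((2, 2, 2))
a[0] = 1
b = a.copy()  # a copy, so in-place changes to a would not affect b

# b[:, None] has shape (2, 1, 2, 2); a is treated as (1, 2, 2, 2)
result = a + b[:, None]
print(result.shape)  # (2, 2, 2, 2)

# result[i, j] is the elementwise sum of b[i] and a[j]
for i in range(2):
    for j in range(2):
        assert np.array_equal(result[i, j], b[i] + a[j])
```

So the leading two axes of the result enumerate every pairing of a block from b with a block from a, just like the 3 x 3 table in the 1D case.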
Since a and b are of same shape, say (2,2,2), a+b will indeed work.
Broadcasting matches the dimensions of the operands in reverse order, starting from the last dimension and working backwards (e.g. columns before rows in the two-dimensional case). If the sizes match, the next dimension is considered.
If the sizes don't match and one of them is 1, that operand is repeated along that dimension to match the other operand (e.g. if a.shape = (2,1,2) and b.shape = (2,2,2), the values of a are repeated along axis 1, the middle dimension, to make the shape (2,2,2)). If the sizes differ and neither is 1, the operands cannot be broadcast and numpy raises a ValueError.
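A small sketch of that rule, assuming NumPy 1.20 or newer for np.broadcast_shapes. The array contents here are made up for illustration; only the shapes matter.

```python
import numpy as np

a = np.arange(4).reshape(2, 1, 2)   # size-1 axis in the middle
b = np.ones((2, 2, 2), dtype=int)

# Shapes are compared right to left: 2 vs 2, 1 vs 2 (stretched), 2 vs 2
print(np.broadcast_shapes(a.shape, b.shape))  # (2, 2, 2)

# broadcast_to makes the repetition along axis 1 explicit
a_stretched = np.broadcast_to(a, (2, 2, 2))
assert np.array_equal(a + b, a_stretched + b)

# Sizes that differ where neither is 1 cannot be broadcast
try:
    np.ones((2, 3)) + np.ones((2, 2))
except ValueError as err:
    print("broadcast error:", err)
```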
I am trying to find a fast, at least partially vectorized way to count combinatorial occurrences between two 2D numpy arrays, in order to identify Single Point Polymorphism linkage.
The shape of each array is
(factors, samples)
an example for matrix 1 is as follows:
array([[0., 1., 1.],
[1., 0., 1.]])
and matrix 2
array([[1., 1., 0.],
[0., 0., 0.]])
I need to find the total number of occurrences along the samples axis for each ordered pair of values at the same position in the two matrices (order matters because the (1,0) count is different from the (0,1) count). The combinations are therefore [(0, 0), (0, 1), (1, 0), (1, 1)], and the final output for each combination has shape (factor, factor).
For combination (0,0), for instance, we get the matrix
array([[0, 1],
[0, 1]])
because (0,0) occurs
0 times along row 0 of matrix 1 & row 0 of matrix 2,
1 time along row 0 of matrix 1 & row 1 of matrix 2,
0 times along row 1 of matrix 1 & row 0 of matrix 2,
1 time along row 1 of matrix 1 & row 1 of matrix 2.
With example data
import numpy as np
array1 = np.array([
[0., 1., 1.],
[1., 0., 1.]])
array2 = np.array([
[1., 1., 0.],
[0., 0., 0.]])
We can count the desired combinations with np.einsum and reshape to a suitable array
c1 = np.array([1-array1, array1]).astype('int')
c2 = np.array([1-array2, array2]).astype('int')
np.einsum('ijk,lmk->iljm', c1, c2).reshape(-1, len(array1), len(array2))
Output
array([[[0, 1], # counts for (0,0)
[0, 1]],
[[1, 0], # counts for (0,1)
[1, 0]],
[[1, 2], # counts for (1,0)
[1, 2]],
[[1, 0], # counts for (1,1)
[1, 0]]])
Checking that the previous results are equal to dot products
import itertools as it
np.array([x @ y.T for x, y in it.product(c1, c2)])
Output
array([[[0, 1],
[0, 1]],
[[1, 0],
[1, 0]],
[[1, 2],
[1, 2]],
[[1, 0],
[1, 0]]])
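If you want an independent check that does not rely on dot products at all, something like the following brute-force count should agree with the einsum result. This is just a sketch; the names counts and brute are mine, not from the answer.

```python
import itertools as it
import numpy as np

array1 = np.array([[0., 1., 1.], [1., 0., 1.]])
array2 = np.array([[1., 1., 0.], [0., 0., 0.]])

c1 = np.array([1 - array1, array1]).astype(int)
c2 = np.array([1 - array2, array2]).astype(int)
counts = np.einsum('ijk,lmk->iljm', c1, c2).reshape(-1, len(array1), len(array2))

# Brute force: for each value pair (u, v) and each row pair (j, m),
# count the samples where array1[j] == u and array2[m] == v
brute = np.array([
    [[np.sum((array1[j] == u) & (array2[m] == v)) for m in range(len(array2))]
     for j in range(len(array1))]
    for u, v in it.product((0, 1), repeat=2)
])
assert np.array_equal(counts, brute)
```

The brute-force version loops in Python, so it is only meant for verifying small inputs.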
I realized the solution while working through a manual example for the question: the counts can be computed as dot products of indicator matrices:
matrix1_0 = (array1 == 0).astype('int')  # 1 where array1 is 0
matrix1_1 = (array1 == 1).astype('int')  # 1 where array1 is 1
matrix2_0 = (array2 == 0).astype('int')
matrix2_1 = (array2 == 1).astype('int')
count_00 = np.dot(matrix1_0, matrix2_0.T)  # (factor, factor) counts of (0,0)
count_01 = np.dot(matrix1_0, matrix2_1.T)
count_10 = np.dot(matrix1_1, matrix2_0.T)
count_11 = np.dot(matrix1_1, matrix2_1.T)
Each result holds the number of occurrences of that combination for every pair of factors, summed along the sample axis (axis 1 here).
I have an array of shape [batch_size, N], for example:
[[1 2]
[3 4]
[5 6]]
and I need to create a 3-index array with shape [batch_size, N, N] where, for every batch, I have an N x N diagonal matrix whose diagonal is taken from the corresponding batch element. In this simple case, the result I am looking for is:
[
[[1,0],[0,2]],
[[3,0],[0,4]],
[[5,0],[0,6]],
]
How can I do this without for loops, exploiting vectorization? I guess it is an extension of dimensions, but I cannot find the correct function to do this.
(I need it as I am working with tensorflow and prototyping with numpy).
Try it in TensorFlow (this is the TF 1.x API; in TF 2.x the same op is available as tf.linalg.diag):
import tensorflow as tf
A = [[1,2],[3 ,4],[5,6]]
B = tf.matrix_diag(A)
print(B.eval(session=tf.Session()))
[[[1 0]
[0 2]]
[[3 0]
[0 4]]
[[5 0]
[0 6]]]
Approach #1
Here's a vectorized one with np.einsum for input array, a -
# Initialize o/p array
out = np.zeros(a.shape + (a.shape[1],),dtype=a.dtype)
# Get diagonal view and assign into it input array values
diag = np.einsum('ijj->ij',out)
diag[:] = a
Approach #2
Another based on slicing for assignment -
m,n = a.shape
out = np.zeros((m,n,n),dtype=a.dtype)
out.reshape(-1,n**2)[...,::n+1] = a
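For reference, here are both approaches run on the array from the question and checked against a plain np.diag loop. A minimal sketch; out1, out2 and expected are names I picked.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
m, n = a.shape

# Approach #1: write through a diagonal view obtained with einsum
out1 = np.zeros((m, n, n), dtype=a.dtype)
np.einsum('ijj->ij', out1)[:] = a

# Approach #2: in each flattened n x n block, every (n + 1)-th
# element lies on the diagonal
out2 = np.zeros((m, n, n), dtype=a.dtype)
out2.reshape(m, n * n)[:, ::n + 1] = a

expected = np.array([np.diag(row) for row in a])
assert np.array_equal(out1, expected)
assert np.array_equal(out2, expected)
```

Both rely on getting a writable view into the output array, so they fill the diagonals without allocating anything beyond the output itself.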
Using np.expand_dims with an element-wise product with np.eye
a = np.array([[1, 2],
[3, 4],
[5, 6]])
N = a.shape[1]
a = np.expand_dims(a, axis=1)
a*np.eye(N)
array([[[1., 0.],
[0., 2.]],
[[3., 0.],
[0., 4.]],
[[5., 0.],
[0., 6.]]])
Explanation
np.expand_dims(a, axis=1) adds a new axis to a, which will now be a (3, 1, 2) ndarray:
array([[[1, 2]],
[[3, 4]],
[[5, 6]]])
You can now multiply this array with a size N identity matrix, which you can generate with np.eye:
np.eye(N)
array([[1., 0.],
[0., 1.]])
Which will yield the desired output:
a*np.eye(N)
array([[[1., 0.],
[0., 2.]],
[[3., 0.],
[0., 4.]],
[[5., 0.],
[0., 6.]]])
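Note that np.eye returns floats by default, which is why the result above is a float array. If you need to keep the input dtype, one option (a sketch, not the only way) is to pass dtype to np.eye:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
N = a.shape[1]

# a[:, None, :] has shape (3, 1, 2), same as np.expand_dims(a, axis=1)
out = a[:, None, :] * np.eye(N, dtype=a.dtype)
print(out.dtype, out.shape)
```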
You can use numpy.diag:
m = [[1, 2],
[3, 4],
[5, 6]]
[np.diag(b) for b in m]
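The list comprehension gives a Python list of 2D arrays. If you want a single array of shape [batch_size, N, N], wrapping it in np.array should do it (the variable name out is mine):

```python
import numpy as np

m = [[1, 2],
     [3, 4],
     [5, 6]]

# Stack the per-row diagonal matrices into one 3D array
out = np.array([np.diag(row) for row in m])
print(out.shape)  # (3, 2, 2)
```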
EDIT: The following plot shows the average execution time for the solution above (solid line) compared against @Divakar's (dashed line) for different batch sizes and different matrix sizes.
I don't believe you get much of an improvement, but this is based only on this simple metric.
You basically want a function that does the opposite of/reverses np.block(..)
I needed the same thing, so I wrote this little function:
def split_blocks(x, m=2, n=2):
    """
    Reverse the action of np.block(..)

    >>> x = np.random.uniform(-1, 1, (2, 18, 20))
    >>> assert (np.block(split_blocks(x, 3, 4)) == x).all()

    :param x: (.., M, N) input matrix to split into blocks
    :param m: number of row splits
    :param n: number of column splits
    :return: nested list (m lists of n blocks), each block of shape (.., M//m, N//n)
    """
    x = np.asarray(x)  # no copy when x is already an ndarray
    nd = x.ndim
    *shape, nr, nc = x.shape
    return list(map(list, x.reshape((*shape, m, nr//m, n, nc//n)).transpose(nd-2, nd, *range(nd-2), nd-1, nd+1)))
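The reshape/transpose trick is easier to follow in the plain 2D case. Here is a stripped-down sketch of the same idea for a 4 x 6 matrix; the sizes and the name blocks are just for illustration.

```python
import numpy as np

x = np.arange(24).reshape(4, 6)
m, n = 2, 3                      # 2 row splits, 3 column splits
nr, nc = x.shape

# (4, 6) -> (m, nr//m, n, nc//n) -> (m, n, nr//m, nc//n)
blocks = x.reshape(m, nr // m, n, nc // n).transpose(0, 2, 1, 3)
print(blocks.shape)              # (2, 3, 2, 2)

# blocks[i, j] is the tile in block-row i, block-column j
assert np.array_equal(blocks[0, 1], x[0:2, 2:4])

# np.block on the nested list of tiles reassembles the original
assert np.array_equal(np.block([list(row) for row in blocks]), x)
```

This only works when the number of rows is divisible by m and the number of columns by n; otherwise the reshape raises an error.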
How do I reshape a 1D numpy array into a 2D numpy array
and fill the remaining columns with zeros?
For example:
Input:
a = np.array([1,2,3])
Expected output:
np.array([[0, 0, 1],
[0, 0, 2],
[0, 0, 3]])
How do I do this?
a = np.array([1,2,3])
Option 1
np.pad (this should be fast)
np.pad(a[:, None], ((0, 0), (2, 0)), mode='constant')
array([[0, 0, 1],
[0, 0, 2],
[0, 0, 3]])
Option 2
Assign a slice to np.zeros (also very fast)
b = np.zeros((3, 3))
b[:, -1] = a
array([[0., 0., 1.],
[0., 0., 2.],
[0., 0., 3.]])
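If you need this for other lengths or widths, Option 2 generalizes easily. A small sketch follows; as_last_column is a hypothetical helper name of my own, and it keeps the input dtype instead of the float default of np.zeros.

```python
import numpy as np

def as_last_column(a, width):
    """Hypothetical helper: put 1D array `a` in the last of `width` columns."""
    a = np.asarray(a)
    out = np.zeros((a.size, width), dtype=a.dtype)  # keeps a's dtype
    out[:, -1] = a
    return out

a = np.array([1, 2, 3])
print(as_last_column(a, 3))

# Same result as padding two zero columns on the left (Option 1)
padded = np.pad(a[:, None], ((0, 0), (2, 0)), mode='constant')
assert np.array_equal(as_last_column(a, 3), padded)
```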
For your specific example:
a = np.array([1,2,3])
a.resize([3, 3])
a = np.rot90(a, k=3)
hope this helps
Create a function that creates a zero array of m x n x 3 dimensionality. Then go through your original matrix and assign its values to those parts of the new zero matrix that should be non-zero.
I'm trying to solve a problem using multidimensional arrays, rather than resorting to for loops, in order to gain a performance boost, but am having trouble with the indexing.
I've tried various permutations using np.newaxis, but can't seem to achieve the following functionality.
Problem:
Part 1) Take an M x N x N array called a, and for each of the M square matrices, set the upper triangular matrix elements as their negative values.
Part 2) Sum all elements in each of the M matrices (of shape N X N), returning a 1D array with M elements. Let's call this array b.
Attempted Solution
Here is my MWE / attempt using loops (which does work, but I'd rather find a fully array/matrix-based approach):
a = np.array(
[[[ 0, 1],
[ 5, 0]],
[[ 0, 3],
[ 2, 0]]])
Part 1):
triangular_upper_idx = np.triu_indices_from(a[0])
for i in range(len(a)):
    a[i][triangular_upper_idx] *= -1
a
result:
array([[[ 0, -1],
[ 5, 0]],
[[ 0, -3],
[ 2, 0]]])
Part 2):
b = np.zeros(len(a))
for i in range(len(a)):
    b[i] = np.sum(a[i])
b
result:
array([ 4., -1.])
Note:
I have seen a similar question on this topic (Triangular indices for multidimensional arrays in numpy) but the solution there was nested for loops... I feel like numpy may offer a more efficient, clever array-based solution?
Any guidance would be much appreciated.
Thanks
Yes, numpy has the tools:
r = 2
neg_uppr = np.triu(-np.ones((r,r)),1) + np.tril(np.ones((r,r)))
I can't tell from your numerical example whether you want the diagonal negated too. If so, use np.triu(-np.ones((r,r))) + np.tril(np.ones((r,r)),-1) instead.
neg_uppr
Out[23]:
array([[ 1., -1.],
[ 1., 1.]])
a = np.array(
[[[ 0, 1],
[ 5, 0]],
[[ 0, 3],
[ 2, 0]]])
It's fast to use the built-in element-wise arithmetic:
a = a * neg_uppr
a
Out[26]:
array([[[ 0., -1.],
[ 5., 0.]],
[[ 0., -3.],
[ 2., 0.]]])
You can specify the axes to sum over:
np.sum(a, (1,2))
Out[27]: array([ 4., -1.])
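Putting both parts together for any N, here is a sketch that builds the sign mask from the array's own shape and keeps the integer dtype. It assumes, as in the first variant above, that the diagonal is left alone (k=1); the names sign, flipped and b_einsum are mine.

```python
import numpy as np

a = np.array([[[0, 1],
               [5, 0]],
              [[0, 3],
               [2, 0]]])
n = a.shape[-1]

# +1 on and below the diagonal, -1 strictly above it (k=1 skips the diagonal)
sign = np.where(np.triu(np.ones((n, n), dtype=bool), k=1), -1, 1)

flipped = a * sign              # (M, N, N) * (N, N) broadcasts over M
b = flipped.sum(axis=(1, 2))    # one sum per matrix
print(b)                        # [ 4 -1]

# Both steps in one call, without the intermediate array
b_einsum = np.einsum('mij,ij->m', a, sign)
assert np.array_equal(b, b_einsum)
```

Use k=0 in np.triu if the diagonal should be negated as well, which matches what np.triu_indices_from does in the question's loop.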
I have a matrix like this
x = [[a, b],
[c, d]]
But in place of each of the values a, b, c, d there's a list of numbers, for example [x, xx, xxx].
I would like to create another matrix that has ones only at positions where x==0 && xx==0 && xxx==0. How can I do that without loops? For example, I could do B = [x == 0], but how can I do that when there's a list instead of a single matrix element?
If the list is of fixed length, you can create a 3d array and then use np.all() on its last axis:
In [1]: import numpy as np
In [2]: a = np.zeros((2, 2, 3)) # 2x2 matrix, 3 variants for each element
In [3]: a[0, 0] = [0, 1, 2] # filling one element of the "matrix"
In [4]: a[0, 1] = 1
In [5]: a[1, 1] = 0 # this
In [6]: a[1, 0] = 0 # and this are "all zeros"
In [7]: a
Out[7]:
array([[[ 0., 1., 2.],
[ 1., 1., 1.]],
[[ 0., 0., 0.],
[ 0., 0., 0.]]])
Now let's construct the matrix b:
In [8]: np.all(a == 0, axis=-1).astype(int)
Out[8]:
array([[0, 0],
[1, 1]])
If you want another condition, you can modify the expression in the following way:
In [9]: np.all(a - [0, 1, 2] == 0, axis=-1).astype(int)
Out[9]:
array([[1, 0],
[0, 0]])
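The same thing as a plain script, in case the IPython prompts get in the way. A minimal sketch; target, all_zero and matches_target are names I made up, and comparing against the target directly is equivalent to subtracting it first.

```python
import numpy as np

a = np.zeros((2, 2, 3))      # 2x2 "matrix" with 3 values per element
a[0, 0] = [0, 1, 2]
a[0, 1] = 1                  # a[1, 0] and a[1, 1] stay all zeros

# 1 where every value in the element's list is zero
all_zero = (a == 0).all(axis=-1).astype(int)

# 1 where the element's list equals a given target list
target = np.array([0, 1, 2])
matches_target = (a == target).all(axis=-1).astype(int)

print(all_zero)
print(matches_target)
```

For values that come out of floating-point computations, np.isclose(a, target) is usually a safer comparison than ==.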