I'm trying to reimplement NumPy's as_strided function in plain Python. Before calling my version, I convert the strides from bytes to element counts according to the dtype of the array (for float32 I divide each stride by 4, and so on).
The code I implemented:
def as_strided(x, shape, strides):
    x = x.flatten()
    size = 1
    for value in shape:
        size *= value
    arr = np.zeros(size, dtype=np.float32)
    curr = 0
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                arr[curr] = x[i * strides[0] + j * strides[1] + k * strides[2]]
                curr = curr + 1
    return np.reshape(arr, shape)
In order to test the function I wrote 2 auxiliary functions:
def sliding_window(x, shape, strides):
    f_mine = as_strided(x, shape, [stride // 4 for stride in strides])
    f_np = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides).copy()
    check_strides(x.flatten(), f_mine)
    check_strides(x.flatten(), f_np)
    return f_mine, f_np

def check_strides(original, strided):
    s1 = int(np.where(original == strided[1][0][0])[0])
    s2 = int(np.where(original == strided[0][1][0])[0])
    s3 = int(np.where(original == strided[0][0][1])[0])
    print([s1, s2, s3])
    return [s1, s2, s3]
In the main code, I selected some shape and strides values and ran 2 cases:
Case 1: loaded a .npy file that contains a float32 matrix (variable x).
Case 2: created a random matrix of the same size and dtype as x (variable y).
When I check the strides of the resulting matrices I get a strange phenomenon.
For case 1, the strides recovered from the output of the NumPy function differ from the requested strides (and from my implementation).
For case 2 - the outputs are identical.
The main code:
shape = (30, 818, 300)
strides = (4, 120, 120)
# case 1
x = np.load('x.npy')
s_mine, s_np = sliding_window(x, shape, strides)
print(np.array_equal(s_mine, s_np))
#case 2
y = np.random.randn(x.shape[0], x.shape[1]).astype(np.float32)
s_mine, s_np = sliding_window(y, shape, strides)
print(np.array_equal(s_mine, s_np))
Here you can find the x.npy file that triggers this stride change in the NumPy function. I'd be happy if anyone could explain to me why this is happening.
I downloaded x.npy and loaded it. And ran as_strided on y. I haven't looked at your code.
Normally when playing with as_strided I like to look at the arrays, but in this case they are large enough that I'll focus more on making sense of the strides and shape.
In [39]: x.shape, x.strides
Out[39]: ((30, 1117), (4, 120))
In [40]: y.shape, y.strides
Out[40]: ((30, 1117), (4468, 4))
I wondered where you got the
shape = (30, 818, 300)
strides = (4, 120, 120)
OK, the 30 is shared, but the 4 is only for x. And with those strides x looks like it's F ordered, maybe even a transpose of a (1117,30) array. Your y, which was constructed with random, has the typical strides for a C ordered array: 4 bytes for the inner, trailing dimension, and 4*1117 for the leading dimension.
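To make the stride observation concrete, here is a minimal sketch. It assumes, purely for illustration, that x is the transpose of a C-ordered (1117, 30) float32 array (the origin of x.npy is not stated), and it uses a small shape of (3, 4, 5) instead of the one in the question. It suggests why indexing x.flatten() with element strides need not agree with np.lib.stride_tricks.as_strided: flatten() returns a C-ordered copy, while as_strided walks the original memory buffer.

```python
import numpy as np

# Hypothetical stand-in for x.npy: the transpose of a C-ordered (1117, 30) array.
base = np.arange(1117 * 30, dtype=np.float32).reshape(1117, 30)
x = base.T
y = np.ascontiguousarray(x)  # same values, but C-ordered like the random y

print(x.shape, x.strides)  # (30, 1117) (4, 120)
print(y.shape, y.strides)  # (30, 1117) (4468, 4)

# flatten() always copies in C order, which is not how x sits in memory.
c_order = x.flatten()
memory_order = x.ravel(order="K")  # elements in the order they occur in memory

# Small illustrative shape; the byte strides are the ones from the question.
shape, strides = (3, 4, 5), (4, 120, 120)
view = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides)

# Convert byte strides to element strides and index both 1-D buffers.
es = [s // x.itemsize for s in strides]
i, j, k = np.indices(shape)
offsets = i * es[0] + j * es[1] + k * es[2]

same_as_memory = np.array_equal(view, memory_order[offsets])
same_as_flatten = np.array_equal(view, c_order[offsets])
print(same_as_memory, same_as_flatten)
```

Under these assumptions only the memory-ordered buffer reproduces the as_strided result. For a C-contiguous array such as y, flatten() and the memory order coincide, which would be consistent with case 2 matching.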
I am attempting a numpy.matmul call using as variables
Matrix A of dimensions (p, t, q)
Matrix B of dimensions (r, t).
A categories vector of shape r with p distinct categories, used to take slices of B and to define which index of A to use.
The multiplications are done iteratively using the indices of each category. For each category p_i, I extract from A a submatrix (t, q). Then, I multiply those with a subset of B (x, t), where x is a mask defined by r == p_i. Finally, the matrix multiplication of (x, t) and (t, q) produces the output (x, q) which is stored at S[x].
I cannot figure out a non-iterative version of this algorithm. The first snippet describes an iterative solution. The second one is an attempt at what I would like to get, where everything is calculated in a single step and would presumably be faster. However, it is incorrect because matrix A has three dimensions instead of two. Maybe there is no way to do this in NumPy with a single call; in general, I am looking for advice or ideas to try out.
import numpy as np
p, q, r, t = 2, 9, 512, 4
# data initialization (random)
S = np.random.rand(r, q)
A = np.random.randint(0, 3, size=(p, t, q))
B = np.random.rand(r, t)
categories = np.random.randint(0, p, r)
print('iterative') # iterative
for i in range(p):
    # print(i)
    a = A[i, :, :]
    mask = categories == i
    b = B[mask]
    print(b.shape, a.shape, S[mask].shape,
          np.matmul(b, a).shape)
    S[mask] = np.matmul(b, a)
A simple way to write it down:
S = np.random.rand(r, q)
result = np.matmul(B, A[:p,:,:])
# iterative assignment
i = 0
S[categories == i] = result[i, categories == i, :]
i = 1
S[categories == i] = result[i, categories == i, :]
The next snippet will produce an error during the multiplication step.
# attempt to multiply once, indexing all categories only once (not possible)
S = np.random.rand(r, q)
# attempt to use the categories vector
a = A[categories, :, :]
b = B[categories]
# due to the shapes of the arrays, this multiplication is not possible
print('\nsingle step (error due to shapes of the matrix a')
print(b.shape, a.shape, S[categories].shape)
S[categories] = np.matmul(b, a)
(250, 4) (4, 9) (250, 9) (250, 9)
(262, 4) (4, 9) (262, 9) (262, 9)
(512, 9)
single step (error due to shapes of the 2nd matrix a).
(512, 4) (512, 4, 9) (512, 9)
In [63]: (np.ones((512,4))@np.ones((512,4,9))).shape
Out[63]: (512, 512, 9)
This is because the first array is broadcast to (1,512,4). I think you want instead to do:
In [64]: (np.ones((512,1,4))@np.ones((512,4,9))).shape
Out[64]: (512, 1, 9)
Then remove the middle dimension to get a (512,9) array.
Another way:
In [72]: np.einsum('ij,ijk->ik', np.ones((512,4)), np.ones((512,4,9))).shape
Out[72]: (512, 9)
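As a check, both suggestions can be applied to arrays shaped like the ones in the question by gathering one (t, q) matrix per row with A[categories]. This is a sketch, not a benchmark: the seed and the names rng, A_rows, S_loop, S_matmul and S_einsum are additions for illustration.

```python
import numpy as np

p, q, r, t = 2, 9, 512, 4
rng = np.random.default_rng(0)  # seed chosen only for reproducibility
A = rng.integers(0, 3, size=(p, t, q))
B = rng.random((r, t))
categories = rng.integers(0, p, size=r)

# Reference: the iterative version from the question.
S_loop = np.empty((r, q))
for i in range(p):
    mask = categories == i
    S_loop[mask] = B[mask] @ A[i]

# One (t, q) matrix per row of B, selected by that row's category.
A_rows = A[categories]  # shape (r, t, q)

# Batched matmul: (r, 1, t) @ (r, t, q) -> (r, 1, q), then drop the middle axis.
S_matmul = (B[:, None, :] @ A_rows)[:, 0, :]

# The same contraction written with einsum.
S_einsum = np.einsum('ij,ijk->ik', B, A_rows)

print(np.allclose(S_loop, S_matmul), np.allclose(S_loop, S_einsum))
```

Note that A[categories] materializes r copies of the (t, q) matrices, so for large r the loop over p categories may still use less memory.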
To remove the loop altogether, you can try this
bigmask = np.arange(p)[:, np.newaxis] == categories
C = np.matmul(B, A)
res = C[np.broadcast_to(bigmask[..., np.newaxis], C.shape)].reshape(r, q)
# `res` has the same rows as the iterative `S` but in the wrong order
# so we need to reorder the rows
sort_index = np.argsort(np.broadcast_to(np.arange(r), bigmask.shape)[bigmask])
assert np.allclose(S, res[sort_index])
Though I'm not sure it's much faster than the iterative version.
I have a 3D array of shape (4,3,3) whose four (3,3) slices I would like to multiply by the corresponding entries of a 1D array (the t variable) and then sum, so that I end up with an array (A) that is the weighted sum of the four (3,3) arrays.
I'm unsure how I should be assigning indexes, or whether and how I should be using np.ndenumerate.
import numpy as np
import math
#Enter material constants for calculation of stiffness matrix
E1 = 20
E2 = 1.2
G12 = 0.8
theta = np.array([30,-30,-30,30])
deg = ((math.pi*theta/180))
k = len(theta) #number of layers
t = np.array([0.005,0.005,0.005,0.005])
#Calculation of Q Values
Q11 = 1
Q12 = 2
Q21 = 3
Q22 = 4
Q66 = 5
Qbar = np.zeros((len(theta),3,3),order='F')
for i, x in np.ndenumerate(deg):
    m= np.cos(x) #cos of rotated lamina
    n= np.sin(x) #sin of rotated lamina
    Qbar21 = Qbar12
    Qbar[i] = np.array([[Qbar11, Qbar12, Qbar16], [Qbar21, Qbar22, Qbar26], [Qbar16, Qbar26, Qbar66]], order = 'F')
A = np.zeros((3,3))
for i in np.nditer(t):
If I understand correctly, you want to multiply Qbar and t over the first axis, and then sum the result over the first axis (which results in an array of shape (3, 3)).
I created random arrays to make the code minimal:
import numpy as np
Qbar = np.random.randint(2, size=(4, 3, 3))
t = np.arange(4)
A = (Qbar * t[:, None, None]).sum(axis=0)
t[:, None, None] will create two new dimensions so that the shape becomes (4, 1, 1), which can be multiplied to Qbar element-wise. Then we just have to sum over the first axis.
NB: A = np.tensordot(t, Qbar, axes=([0],[0])) also works and can be faster for larger dimensions, but for the dimensions you provided I prefer the first solution.
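A quick way to convince yourself that the two forms agree with an explicit loop is the sketch below. The random Qbar and the seed are stand-ins for the real stiffness matrices; the einsum line is an extra equivalent spelling that does not appear in the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
Qbar = rng.random((4, 3, 3))    # stand-in for the four layer matrices
t = np.array([0.005, 0.005, 0.005, 0.005])

# Explicit loop: weight each (3, 3) layer by its thickness and accumulate.
A_loop = np.zeros((3, 3))
for layer, thickness in zip(Qbar, t):
    A_loop += layer * thickness

# Broadcasting: t becomes (4, 1, 1) and scales each layer before the sum.
A_broadcast = (Qbar * t[:, None, None]).sum(axis=0)

# Contraction over the first axis of both arrays.
A_tensordot = np.tensordot(t, Qbar, axes=([0], [0]))
A_einsum = np.einsum('k,kij->ij', t, Qbar)

print(np.allclose(A_loop, A_broadcast),
      np.allclose(A_loop, A_tensordot),
      np.allclose(A_loop, A_einsum))
```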
I have a constant symmetric matrix A with shape (50,50) and inputs x with shape (batch_size, 50) where each entry is an integer in [0,49] - these correspond to indexes in A.
I wish to create a new tensor with shape (batch_size, 50, 50) where each element in the batch is the matrix A permuted according to the ordering given in the input x. Each input has a different ordering of the integers from 0 to 49. Then, this
The only way I've thought to do this does not work, and I fear it would be inefficient even if it didn't give an error:
#Given x and A
x = np.zeros((b, 50), dtype=int)
for i in range(b):
    x[i,:] = np.random.permutation(50)
rand_mat = np.random.rand(50,50)
A = np.matmul(rand_mat, np.transpose(rand_mat)) # a random symmetric matrix
# do permutation
batch_size = x.shape[0] # infer batch size from inputs
permuted_matrices = np.zeros((batch_size, 50, 50))
for i in range(batch_size):
    permuted_matrices[i,:,:] = A[:,x[i,:]][x[i,:],:] # permute both rows and columns according to x[i,:]
But when I call my layer, I get the error TypeError: 'Tensor' object cannot be interpreted as an integer (because of the for loop). If I use tf.shape(x)[0] instead of x.shape[0], then I get TypeError: Expected int32, got None of type 'NoneType' (because of np.zeros). Is there a TensorFlow function I could use that would be easier?
Use gather() and gather_nd():
r = 50 # use 10 to check
batch_size = 10
x = tf.random.uniform((batch_size, r), 0, r, tf.int32)
A = tf.range(r * r)
A = tf.reshape(A, (r, r))
ind = x[..., tf.newaxis]
output = tf.gather_nd(A, ind) # permute rows
output = tf.transpose(output, (0, 2, 1))
output = tf.gather(output, x, axis=1, batch_dims=1) # permute columns
output = tf.transpose(output, (0, 2, 1))
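For checking the result on small inputs, a NumPy reference can be handy. The sketch below is not TensorFlow code. It only illustrates the indexing that the two gathers implement, namely out[b, i, j] = A[x[b, i], x[b, j]], and compares it with the loop from the question. The sizes and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
batch_size, n = 4, 5            # small sizes so the result is easy to inspect

rand_mat = rng.random((n, n))
A = rand_mat @ rand_mat.T       # a random symmetric matrix
x = np.stack([rng.permutation(n) for _ in range(batch_size)])

# Loop version from the question: permute columns, then rows, per batch element.
loop = np.stack([A[:, xi][xi, :] for xi in x])

# Vectorized version: broadcast row indices (b, n, 1) against column indices (b, 1, n).
vectorized = A[x[:, :, None], x[:, None, :]]

print(vectorized.shape, np.array_equal(loop, vectorized))
```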
I am new to Python and I don't know exactly how to multiply arrays of different shapes.
I have two arrays W and b such that:
W.shape = [32, 5, 20]
b.shape = [5,]
and I want to multiply
W[:, i, :]*b[i]
for each i from 0 to 4.
How can I do that? Thanks in advance.
You could add a new axis to b so it is multiplied across the rows of W's inner arrays, i.e. the second axis:
W * b[:,None]
What you want to do is called Broadcasting. In numpy, you can multiply this way, but only if the shapes match according to some restrictions:
Starting from the right, each pair of corresponding components of the arrays' shapes must be equal, or one of them must be 1, or one of them must not exist
so right now you have:
W.shape = (32, 5, 20)
b.shape = (5,)
since 20 and 5 don't match, they can't be broadcast.
If you were to have:
W.shape = (32, 5, 20)
b.shape = (5, 1 )
20 would match with 1 (1 is always ok) and the 5's would match, and you can then multiply them.
To get b's shape to (5, 1), you can either do .reshape(5, 1) (or, more robustly, .reshape(-1, 1)) or index with [:, None], which inserts a new axis
Thus either of these work:
W * b[:,None] #yatu's answer
W * b.reshape(-1, 1)
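The following sketch checks both spellings against the explicit loop from the question, and confirms that the unreshaped product raises an error. The random data, the seed and the helper names (expected, via_none, via_reshape, broadcast_failed) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
W = rng.random((32, 5, 20))
b = rng.random(5)

# Reference: multiply each slice W[:, i, :] by the scalar b[i].
expected = np.empty_like(W)
for i in range(5):
    expected[:, i, :] = W[:, i, :] * b[i]

# Shapes (32, 5, 20) and (5, 1) broadcast: 20 vs 1, 5 vs 5, 32 vs missing.
via_none = W * b[:, None]
via_reshape = W * b.reshape(-1, 1)

# Shapes (32, 5, 20) and (5,) do not broadcast: 20 vs 5.
try:
    W * b
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

print(np.allclose(expected, via_none), np.allclose(expected, via_reshape), broadcast_failed)
```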
Let's say I have a tensor shaped (1, 64, 128, 128) and I want to create a tensor of shape (1, 64, 255) holding the sums of all diagonals for every (128, 128) matrix (there are 1 main, 127 below, 127 above diagonals so in total 255). What I am currently doing is the following:
x = torch.rand(1, 64, 128, 128)
diag_sums = torch.zeros(1, 64, 255)
j = 0
for k in range(-127, 128):
    diag_sums[j, :, k + 127] = torch.diagonal(x, offset=k, dim1=-2, dim2=-1).sum(dim=2)
This is obviously very slow, since it is using Python loops and is not done in parallel with respect to k.
I don't think this can be done using torch.diagonal since the function explicitly uses a single int for the offset parameter. If I could pass a list there, this would work, but I guess it would be complicated to implement (requiring changes in PyTorch itself).
I think it could be possible to implement this using torch.einsum, but I cannot think of a way to do it.
So this is my question: how do I get the tensor described above?
Have you considered using torch.nn.functional.conv2d?
You can sum the diagonals with a diagonal filter sliding across the tensor with appropriate zero padding.
import torch
import torch.nn.functional as nnf
# construct a diagonal filter using `eye` function, shape it appropriately
f = torch.eye(x.shape[2])[None, None,...].repeat(x.shape[1], 1, 1, 1)
# compute the diagonal sum with appropriate zero padding
conv_diag_sums = nnf.conv2d(x, f, padding=(x.shape[2]-1,0), groups=x.shape[1])[..., 0]
Note that the result has a slightly different order than the one you computed in the loop:
diag_sums = torch.zeros(1, 64, 255)
for k in range(-127, 128):
    diag_sums[j, :, 127-k] = torch.diagonal(x, offset=k, dim1=-2, dim2=-1).sum(dim=2)
# compare
(conv_diag_sums == diag_sums).all()
This evaluates to True: they are the same.
Shai's answer works, however it looks like it has a lot of multiplications, due to the large size of the kernel. I figured out a way to do this for my use case. It is based on this answer for a similar question in Numpy: https://stackoverflow.com/a/35074207/6636290
I am doing the following:
digitized = np.sum(np.indices(a.shape), axis=0).ravel()
digitized_tensor = torch.Tensor(digitized).int()
a_tensor = torch.Tensor(a)
torch.bincount(digitized_tensor, a_tensor.view(-1))
If I could figure out a way to do this entirely in PyTorch (without Numpy's indices function), this would be great, but this answers the question.
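One possible PyTorch-only variant of this idea is sketched below. It swaps torch.bincount for Tensor.index_add_ so that leading batch dimensions are handled in one call, and it labels each element by j - i (the diagonal offset) rather than i + j, so the output follows the same ordering as the loop in the question (offset k stored at position k + n - 1). The function name diagonal_sums is hypothetical, and the sketch assumes square matrices in the last two dimensions.

```python
import torch

def diagonal_sums(x: torch.Tensor) -> torch.Tensor:
    """Sum every diagonal of the trailing (n, n) matrices of x.

    Returns a tensor of shape (..., 2n - 1); offset k lands at index k + n - 1.
    """
    n = x.shape[-1]
    rows = torch.arange(n, device=x.device).view(-1, 1)
    cols = torch.arange(n, device=x.device).view(1, -1)
    # Element (i, j) lies on the diagonal with offset j - i; shift it to 0..2n-2.
    idx = (cols - rows + n - 1).reshape(-1)

    flat = x.reshape(*x.shape[:-2], n * n)
    out = torch.zeros(*x.shape[:-2], 2 * n - 1, dtype=x.dtype, device=x.device)
    # Accumulate each flattened element into the slot of its diagonal.
    out.index_add_(out.dim() - 1, idx, flat)
    return out

small = torch.arange(9.0).reshape(1, 1, 3, 3)
print(diagonal_sums(small))  # sums for offsets -2..2: 6, 10, 12, 6, 2
```

For the tensor in the question, diagonal_sums(torch.rand(1, 64, 128, 128)) should have shape (1, 64, 255).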
The previous answers work, but there is another, faster solution using strides (and that only uses PyTorch).
First I'll explain with a matrix as it is easier to understand.
Given a matrix M with size (n, n), you can change the matrix strides so that the resulting matrix has M's diagonals as columns. Then you can just sum the columns to get your result.
import torch
def sum_all_diagonal_matrix(mat: torch.Tensor):
    n,_ = mat.shape
    zero_mat = torch.zeros((n, n)) # Zero matrix used for padding
    mat_padded = torch.cat((zero_mat, mat, zero_mat), 1) # pads the matrix on left and right
    mat_strided = mat_padded.as_strided((n, 2*n), (3*n + 1, 1)) # Change the strides
    sum_diags = torch.sum(mat_strided, 0) # Sums the resulting matrix's columns
    return sum_diags[1:] # The first column only contains padding
X = torch.arange(9).reshape(3,3)
# tensor([[0, 1, 2],
#         [3, 4, 5],
#         [6, 7, 8]])
sum_all_diagonal_matrix(X)
# tensor([ 6., 10., 12., 6., 2.])
You can do exactly the same with one more dimension:
def sum_all_diagonal(mat: torch.Tensor):
    k,n,_ = mat.shape
    zero_mat = torch.zeros((k, n, n))
    mat_padded = torch.cat((zero_mat, mat, zero_mat), 2)
    mat_strided = mat_padded.as_strided((k, n, 2*n), (3*n*n, 3*n + 1, 1))
    sum_diags = torch.sum(mat_strided, 1)
    return sum_diags[:, 1:] # drop the padding-only first column, as in the 2D case