Tying all values along the diagonal of a matrix in PyTorch

The model I am implementing has a set of parameters a1, ..., aH that weight previous outputs. This is realized by multiplying by a matrix that looks like this:
a1 0 0 0 0 ...
a2 a1 0 0 0 ...
a3 a2 a1 0 0 ...
: : : : :
In the current implementation, the a's are saved in a one-dimensional nn.parameter.Parameter with H entries, from which the matrix is constructed during each forward pass. The gradient of the matrix automatically propagates to the parameters via autograd.
However, this requires constructing the matrix anew every forward pass. Is there a way to have the matrix itself be the parameter but tie the weights along the main diagonal and lower subdiagonals so that it is equivalent to constructing it from the parameter vector?

You can use dia_matrix from scipy
from scipy.sparse import dia_matrix
a = [1,2,3]
b = [4,5]
m = dia_matrix([a, b], [0, -1])
print(m)
# (0, 0) [1, 2, 3]
# (0, 1) [4, 5]
a[0] = 10
print(m)
# (0, 0) [10, 2, 3]
# (0, 1) [4, 5]
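Staying inside PyTorch, one possible approach is to keep the 1-D parameter but precompute an index matrix once, so that each forward pass only performs a single gather instead of rebuilding the layout. The following is a minimal sketch under that assumption, not a built-in PyTorch feature: the class name LowerToeplitz and the buffer names idx and mask are hypothetical, and the matrix is assumed to be square (H x H).

```python
import torch
import torch.nn as nn


class LowerToeplitz(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, H):
        super().__init__()
        self.a = nn.Parameter(torch.randn(H))  # a1, ..., aH
        rows = torch.arange(H).unsqueeze(1)
        cols = torch.arange(H).unsqueeze(0)
        diff = rows - cols  # diff[i, j] = i - j
        # Entry (i, j) of the matrix is a[i - j] when i >= j, else 0.
        self.register_buffer("idx", diff.clamp(min=0))
        self.register_buffer("mask", (diff >= 0).float())

    def matrix(self):
        # One gather plus a mask; autograd sums the gradients of all tied
        # entries back into the matching element of self.a.
        return self.a[self.idx] * self.mask

    def forward(self, x):
        return self.matrix() @ x


layer = LowerToeplitz(4)
W = layer.matrix()
W.sum().backward()
print(W)
print(layer.a.grad)  # a[k] appears H - k times, so this is tensor([4., 3., 2., 1.])
```

The matrix is still materialized on every call, but the index and mask tensors are built only once, and the tied weights stay exactly equal because they all read from the same parameter.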

Related

Numpy - Index last dimension of array with index array

I'm trying to index the last dimension of a 3D matrix with a matrix consisting of indices that I wish to keep.
I have a matrix of thrust values with shape:
(3, 3, 5)
I would like to filter the last index according to some criteria so that it is reduced from size 5 to size 1. I have already found the indices in the last dimension that fit my criteria:
[[0 0 1]
[0 0 1]
[1 4 4]]
What I want to achieve: for the first row and first column I want the 0th index of the last dimension. For the first row and third column I want the 1st index of the last dimension. In terms of indices to keep the final matrix will become a (3, 3) 2D matrix like this:
[[0,0,0], [0,1,0], [0,2,1];
[1,0,0], [1,1,0], [1,2,1];
[2,0,1], [2,1,4], [2,2,4]]
I'm pretty confident numpy can achieve this, but I'm unable to figure out exactly how. I'd rather not build a construction with nested for loops.
I have already tried:
minValidPower = totalPower[:, :, tuple(indexMatrix)]
But this results in a (3, 3, 3, 3) matrix, so I am not entirely sure how I'm supposed to approach this.
With a as input array and idx as the indexing one -
np.take_along_axis(a,idx[...,None],axis=-1)[...,0]
Alternatively, with open-grids -
I,J = np.ogrid[:idx.shape[0],:idx.shape[1]]
out = a[I,J,idx]
You can build corresponding index arrays for the first two dimensions. Those would basically be:
[0 1 2]
[0 1 2]
[0 1 2]
[0 0 0]
[1 1 1]
[2 2 2]
You can construct these using the meshgrid function. I stored them as m1 and m2 in the example:
import numpy as np
vals = np.arange(3*3*5).reshape(3, 3, 5) # test sample
m1, m2 = np.meshgrid(range(3), range(3), indexing='ij')
m3 = np.array([[0, 0, 1], [0, 0, 1], [1, 4, 4]])
sel_vals = vals[m1, m2, m3]
The shape of the result matches the shape of the indexing arrays m1, m2 and m3.
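To check that the approaches above agree, here is a small self-contained comparison. It is only a sketch: the array a is a stand-in for the thrust values (filled with np.arange so each entry is easy to identify), and idx is the index matrix from the question.

```python
import numpy as np

a = np.arange(3 * 3 * 5).reshape(3, 3, 5)          # stand-in data: a[i, j, k] = 15*i + 5*j + k
idx = np.array([[0, 0, 1], [0, 0, 1], [1, 4, 4]])  # index to keep for each (row, column)

# 1) take_along_axis: idx needs the same number of dimensions as a
out1 = np.take_along_axis(a, idx[..., None], axis=-1)[..., 0]

# 2) open grids: I has shape (3, 1) and J has shape (1, 3); both broadcast against idx
I, J = np.ogrid[:idx.shape[0], :idx.shape[1]]
out2 = a[I, J, idx]

# 3) full index grids built with meshgrid
m1, m2 = np.meshgrid(range(3), range(3), indexing="ij")
out3 = a[m1, m2, idx]

print(out1)
# [[ 0  5 11]
#  [15 20 26]
#  [31 39 44]]
print(np.array_equal(out1, out2), np.array_equal(out1, out3))  # True True
```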

How can I get exactly the same number of elements replaced in a numpy 2D matrix?

I have a symmetric 2D numpy matrix that contains only ones and zeros, and its diagonal elements are always 0.
I want to change some of the ones to zeros while keeping the result symmetric. How many elements are selected depends on the parameter replace_rate.
Since the matrix is symmetric, I take its upper half, randomly select some of the elements whose value is 1, and change them from 1 to 0. A mirror operation then makes the whole matrix symmetric again.
For example
import numpy as np
com = np.array([[0, 1, 1, 1, 1],
                [1, 0, 1, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 1, 1, 0, 1],
                [1, 1, 1, 1, 0]])
replace_rate = 0.1
com = np.triu(com)
mask = np.random.choice([0, 1], size=com.shape, p=(1 - replace_rate, replace_rate)).astype(bool)
r1 = np.random.rand(*com.shape)
com[mask] = r1[mask]  # values in [0, 1) are truncated to 0 in the integer array
com += com.T - np.diag(com.diagonal())
com is a (5, 5) symmetric matrix, and 10% of its elements (counting only those whose value is 1; the diagonal elements are excluded) should be replaced with 0 at random.
The question is: how can I make sure the number of changed elements stays the same each time?
With the same replace_rate = 0.1, I sometimes get a result like:
com = np.array([[0, 1, 1, 1, 1],
                [1, 0, 1, 1, 1],
                [1, 1, 0, 1, 1],
                [1, 1, 1, 0, 1],
                [1, 1, 1, 1, 0]])
Nothing changed this time, and when I repeat it, 2 elements change:
com = np.array([[0, 1, 1, 1, 1],
                [1, 0, 1, 1, 1],
                [1, 1, 0, 1, 0],
                [1, 1, 1, 0, 1],
                [1, 1, 0, 1, 0]])
How can I fix the number of changed elements for a given replace_rate?
Thanks in advance!
How about something like this:
import random

def make_transform(m, replace_rate):
    changed = []  # keep track of index pairs we already changed
    def get_random():
        # Get a random pair of indices that are not equal (i.e. not on the diagonal)
        c1, c2 = random.choices(range(len(m)), k=2)
        if c1 == c2 or (c1, c2) in changed or (c2, c1) in changed:
            return get_random()  # recurse until we find a pair i != j that hasn't already been changed
        else:
            changed.append((c1, c2))
            return c1, c2
    n_changes = int(m.shape[0]**2 * replace_rate)  # the number of changes to make
    print(n_changes)
    for _ in range(n_changes):
        i, j = get_random()  # get a valid index pair
        m[i][j] = m[j][i] = 0
    return m
This is the solution I suggest:
import numpy as np

def rand_zero(mat, replace_rate):
    triu_mat = np.triu(mat)
    _ind = np.where(triu_mat != 0)  # gets indices of non-zero elements, not just non-diagonals
    ind = [x for x in zip(*_ind)]
    chng = np.random.choice(range(len(ind)),  # select some indices, at rate 'replace_rate'
                            size=int(replace_rate*mat.size),
                            replace=False)  # do not select duplicates
    mod_mat = triu_mat
    for c in chng:
        mod_mat[ind[c]] = 0
    mod_mat = mod_mat + mod_mat.T
    return mod_mat
I use int() to truncate to an integer in size, but you can use round() if that's what you desire.
Hope this gives consistent results!
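For comparison, the same idea can be written without Python loops by drawing a fixed-size sample from the ones above the diagonal. This is a sketch, not a drop-in replacement for either answer: the function name zero_exact is hypothetical, and it assumes replace_rate is meant as a fraction of the ones in the upper triangle (the answers above scale by the full matrix size instead).

```python
import numpy as np


def zero_exact(mat, replace_rate, rng=None):
    """Zero a fixed number of the ones above the diagonal and mirror the change."""
    rng = np.random.default_rng() if rng is None else rng
    out = mat.copy()
    rows, cols = np.triu_indices_from(out, k=1)   # positions strictly above the diagonal
    ones = np.flatnonzero(out[rows, cols] == 1)   # candidates whose value is 1
    n_changes = int(round(replace_rate * ones.size))
    picked = rng.choice(ones, size=n_changes, replace=False)  # no duplicates
    out[rows[picked], cols[picked]] = 0
    out[cols[picked], rows[picked]] = 0           # mirror to keep the matrix symmetric
    return out


com = np.ones((5, 5), dtype=int) - np.eye(5, dtype=int)
res = zero_exact(com, 0.1, rng=np.random.default_rng(0))
print(res)
print(com.sum() - res.sum())  # 2: one pair above the diagonal plus its mirror
```

Because the sample size is computed once and drawn without replacement, the number of changed entries is the same on every call for a given matrix and rate; only their positions vary.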

Why does dim=1 return row indices in torch.argmax?

I am working with the argmax function of PyTorch, which is defined as:
torch.argmax(input, dim=None, keepdim=False)
Consider an example
a = torch.randn(4, 4)
print(a)
print(torch.argmax(a, dim=1))
Here, when I use dim=1, the function searches along the row vectors instead of the column vectors, as shown below.
print(a) :
tensor([[-1.7739,  0.8073,  0.0472, -0.4084],
        [ 0.6378,  0.6575, -1.2970, -0.0625],
        [ 1.7970, -1.3463,  0.9011, -0.8704],
        [ 1.5639,  0.7123,  0.0385,  1.8410]])
print(torch.argmax(a, dim=1))
tensor([1, 1, 0, 3])
My assumption was that dim=0 represents rows and dim=1 represents columns.
It's time to correctly understand how the axis or dim argument works in PyTorch:
The following example should make sense once you comprehend the above picture:
   |
   v
 dim-0  --->  ----->  dim-1  ------>  ----->  -------->  dim-1
   |   [[-1.7739,  0.8073,  0.0472, -0.4084],
   v    [ 0.6378,  0.6575, -1.2970, -0.0625],
   |    [ 1.7970, -1.3463,  0.9011, -0.8704],
   v    [ 1.5639,  0.7123,  0.0385,  1.8410]]
   |
   v
# argmax (indices where max values are present) along dimension-1
In [215]: torch.argmax(a, dim=1)
Out[215]: tensor([1, 1, 0, 3])
Note: dim (short for 'dimension') is the torch equivalent of 'axis' in NumPy.
Dimensions are defined as shown in the above excellent answer. I have highlighted the way I understand dimensions in Torch and Numpy (dim and axis respectively) and hope that this will be helpful to others.
Notice that only the specified dimension’s index varies during the argmax operation, and the specified dimension’s index range reduces to a single index once the operation is completed. Let tensor A have M rows and N columns, so that its shape is (M, N), and consider the sum operation for simplicity. If dim=0 is specified, then the vectors A[0,:], A[1,:], ..., A[M-1,:] are summed elementwise and the result is another tensor with 1 row and N columns. Notice that only the 0th dimension’s indices vary, from 0 through M-1. Similarly, if dim=1 is specified, then the vectors A[:,0], A[:,1], ..., A[:,N-1] are summed elementwise and the result is another tensor with M rows and 1 column.
An example is given below:
>>> A = torch.tensor([[1,2,3], [4,5,6]])
>>> A
tensor([[1, 2, 3],
        [4, 5, 6]])
>>> S0 = torch.sum(A, dim = 0)
>>> S0
tensor([5, 7, 9])
>>> S1 = torch.sum(A, dim = 1)
>>> S1
tensor([ 6, 15])
In the above sample code, the first sum operation specifies dim=0, so A[0,:] and A[1,:], which are [1, 2, 3] and [4, 5, 6], are summed to give [5, 7, 9]. When dim=1 is specified, the vectors A[:,0], A[:,1], and A[:,2], which are [1, 4], [2, 5], and [3, 6], are added elementwise to give [6, 15].
Note also that the specified dimension collapses. Again let A have the shape (M, N). If dim=0, then the result will have the shape (1, N), where dimension 0 is reduced from M to 1. Similarly if dim=1, then the result would have the shape (M, 1), where N is reduced to 1. Note also that shapes (1, N) and (M,1) are represented by a single-dimensional tensor with N and M elements respectively.
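The collapsing of the specified dimension can be checked directly by printing shapes. A short sketch using the same tensor A as above; the keepdim argument appears in the torch.argmax signature quoted in the question, and torch.sum accepts it as well.

```python
import torch

A = torch.tensor([[1, 2, 3], [4, 5, 6]])  # shape (M, N) = (2, 3)

# Default keepdim=False: the reduced dimension is dropped entirely.
print(torch.sum(A, dim=0).shape)                # torch.Size([3])
print(torch.sum(A, dim=1).shape)                # torch.Size([2])

# keepdim=True keeps the reduced dimension with size 1.
print(torch.sum(A, dim=0, keepdim=True).shape)  # torch.Size([1, 3])
print(torch.sum(A, dim=1, keepdim=True).shape)  # torch.Size([2, 1])

# argmax follows the same rule: with dim=1 the column index varies within each row.
print(torch.argmax(A, dim=1))                   # tensor([2, 2])
print(torch.argmax(A, dim=0))                   # tensor([1, 1, 1])
```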

Numpy: Storing standard basis vector in a memory efficient way

This was the only question I found about standard basis vectors in numpy but it's not really related to my question.
I have a numpy array of integers and I want to determine the co-occurrence matrix, which stores the number of times indices have the same value in the same column. This question describes the problem in more detail.
I have a method of solving my problem but it doesn't scale well.
My question then is this:
Is it possible to store standard basis vectors in a numpy array in a memory efficient manner?
I want to be able to do the following:
Given an array
M = e1 e2 e1
    e1 e2 e2
    e3 e1 e3
    e2 e3 e3
where ei is the transposed i-th standard basis vector of the vector space (R^3 in this case), perform matrix multiplication with the transpose of M, i.e. determine np.dot(M, M.T). To be clear, the matrix M above could be written as:
M = 1 0 0   0 1 0   1 0 0
    1 0 0   0 1 0   0 1 0
    0 0 1   1 0 0   0 0 1
    0 1 0   0 0 1   0 0 1
(extra spaces added for emphasis).
The issue with representing the matrix like this is that it isn't scalable in memory with the number of rows and dimension of the vector space.
EDIT: I should mention that the number of columns can increase as well. The memory complexity is D * R * C where D is the dimension of the vector space, R is the number of rows and C is the number of columns. In an average working example I have roughly D == 150, R == 2000 and C == 1000 though R can go up to 20,000 and C is unbounded (though 10,000 is a reasonable estimate).
The rules for standard basis vector multiplication are simple (ei * ei.T == 1, ei * ej.T == 0 if i != j) so I was wondering if it's possible to store these rules in a numpy array to save memory.
Let's encode the basis vectors with numbers: e1 -> 1, e2 -> 2, ... This allows a very memory-efficient storage.
import numpy as np
M = np.array([[1, 2, 1], [1, 2, 2], [3, 1, 3], [2, 3, 3]], dtype=np.uint8)
# if more than 255 basis vectors, use uint16.
Now we only need to implement a special dot product that works with these basis vectors. Basically we only replace the multiplication with a comparison:
def basis_dot(a, b):
    return np.sum(a[:, :, np.newaxis] == b[np.newaxis, :, :], axis=1)
print(basis_dot(M, M.T))
# [[3 2 0 0]
# [2 3 0 0]
# [0 0 3 1]
# [0 0 1 3]]
Let's verify the result:
M = np.array([[1, 0, 0, 0, 1, 0, 1, 0, 0],
              [1, 0, 0, 0, 1, 0, 0, 1, 0],
              [0, 0, 1, 1, 0, 0, 0, 0, 1],
              [0, 1, 0, 0, 0, 1, 0, 0, 1]])
np.dot(M, M.T)
# array([[3, 2, 0, 0],
# [2, 3, 0, 0],
# [0, 0, 3, 1],
# [0, 0, 1, 3]])
A potential drawback with the approach is the large temporary array required in basis_dot. The memory requirement can be reduced by explicitly coding the loops, at the cost of performance (unless you use a jit compiler).
# slower but more memory friendly
def basis_dot(a, b):
    out = np.empty((a.shape[0], b.shape[1]))
    for i in range(a.shape[0]):
        for j in range(b.shape[1]):
            out[i, j] = np.sum(a[i, :] == b[:, j])
    return out
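A possible middle ground between the two versions of basis_dot is to process the rows of a in blocks, which bounds the size of the temporary array while keeping most of the work vectorized. This is a sketch under that assumption; the name basis_dot_chunked and the chunk parameter are hypothetical and the chunk size would need tuning for real data.

```python
import numpy as np


def basis_dot_chunked(a, b, chunk=256):
    """Same result as the vectorized basis_dot, computed `chunk` rows of a at a time."""
    out = np.empty((a.shape[0], b.shape[1]), dtype=np.int64)
    for start in range(0, a.shape[0], chunk):
        block = a[start:start + chunk]  # at most `chunk` rows
        # The boolean temporary has shape (rows in block, a.shape[1], b.shape[1])
        # instead of (a.shape[0], a.shape[1], b.shape[1]).
        out[start:start + chunk] = np.sum(
            block[:, :, np.newaxis] == b[np.newaxis, :, :], axis=1
        )
    return out


M = np.array([[1, 2, 1], [1, 2, 2], [3, 1, 3], [2, 3, 3]], dtype=np.uint8)
print(basis_dot_chunked(M, M.T, chunk=2))
# [[3 2 0 0]
#  [2 3 0 0]
#  [0 0 3 1]
#  [0 0 1 3]]
```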
So, my assumption based on your example is that you're actually working with a higher dimensionality than just 3. My other assumption is that you're not computing any basis vectors, but just auto-generating basis vectors for R^N. I'll ignore the question of exactly what you're trying to accomplish or why you're storing vectors that you can easily auto-generate for now.
If all of the above assumptions are accurate then you can likely gain a lot of benefit by storing in a sparse data format. This will only improve storage if you've got a preponderance of zeroes, but that seems like a reasonable assumption. There are a large number of sparse formats which you can view here. My best guess for you would be the coo_matrix class.
from scipy.sparse import coo_matrix
new_matrix = coo_matrix(<your_matrix>)
Then save the new matrix in your format of choice.
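If the dense one-hot matrix is too large to create in the first place, the sparse matrix can also be built directly from integer codes. The sketch below assumes the basis vectors are stored as integers 1..D (as in the encoding e1 -> 1, e2 -> 2 above); the names codes, row_idx and col_idx are hypothetical. It uses the coordinate form of the coo_matrix constructor and converts to CSR before multiplying.

```python
import numpy as np
from scipy.sparse import coo_matrix

codes = np.array([[1, 2, 1], [1, 2, 2], [3, 1, 3], [2, 3, 3]])  # e_i stored as i
R, C = codes.shape
D = 3  # dimension of the vector space

# Row r has one nonzero per column block c, at position c * D + (code - 1).
row_idx = np.repeat(np.arange(R), C)
col_idx = (np.arange(C) * D + (codes - 1)).ravel()
data = np.ones(R * C, dtype=np.int32)

M_sparse = coo_matrix((data, (row_idx, col_idx)), shape=(R, C * D)).tocsr()
cooc = (M_sparse @ M_sparse.T).toarray()
print(cooc)
# [[3 2 0 0]
#  [2 3 0 0]
#  [0 0 3 1]
#  [0 0 1 3]]
```

Only R * C nonzeros are stored instead of D * R * C entries, although the product itself can still be dense when many rows share values.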
