I have three tensors a, b, c
a = torch.tensor([1,2])
b = torch.tensor([3,4])
c = b.view(2,1)
Now if I do a @ b == a @ c, it returns tensor([True]),
but when I check b.shape and c.shape, they are different.
My question is: what is the direction of a 1D tensor, is it vertical or horizontal?
Whichever it is, a @ b should not work without transposing b.
How should I understand a 1D tensor's direction in PyTorch? Is shape (3,) the same as shape (3, 1)?
Or could shape (3,) be either shape (3, 1) or shape (1, 3)?
In your example, a and b are 1D tensors, which means they are vectors. These vectors live in a 2-dimensional space; vector a has x=1 and y=2.
a @ b is the dot product of vectors a and b:
a @ b = a[0]*b[0] + a[1]*b[1] = 1*3 + 2*4 = 11
so the result 11 is a scalar. But c is a matrix, and the product with it is:
a @ c = a[0]*c[0] + a[1]*c[1] = 1*[3] + 2*[4] = [11]
If you compare these results in torch you get:
a @ b == a @ c is torch.tensor(11) == torch.tensor([11]), and the result is tensor([True]).
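A quick way to see what is going on is to print the shapes and results directly, using the tensors from the question:
import torch
a = torch.tensor([1, 2])
b = torch.tensor([3, 4])
c = b.view(2, 1)
print(a @ b)           # tensor(11)   -- dot product of two 1D tensors: a 0-D scalar
print(a @ c)           # tensor([11]) -- (2,) @ (2, 1): a is treated as a row, result is 1-D
print(a @ b == a @ c)  # tensor([True]) -- the scalar broadcasts against the 1-element tensor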
a and b are both vectors (1st-order tensors) of size 2. The vector dot product is possible as long as both have the same size.
a = torch.tensor([1,2])
b = torch.tensor([3,4])
print(a.shape)
print(b.shape)
#Output:
#torch.Size([2])
#torch.Size([2])
c is b reshaped into a matrix (2nd-order tensor) of shape (2, 1).
A product between a vector and a matrix is possible when the sizes are compatible.
c = b.view(2,1)
print(a.shape)  # torch.Size([2])
print(c.shape)  # torch.Size([2, 1])
print(a @ c)    # possible
print(c @ a)    # not possible, will throw a size-mismatch error
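As a rough illustration of why the order matters (continuing with a and c from the snippet above): when the 1D tensor is on the left, matmul prepends a dimension of size 1; when it is on the right, it appends one.
print(a.unsqueeze(0) @ c)  # tensor([[11]]): (1, 2) @ (2, 1); a @ c does the same internally, then drops the added dim, giving tensor([11])
# c @ a behaves like (2, 1) @ (2, 1): the inner dimensions (1 and 2) do not match, hence the size-mismatch error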
Related
Suppose I have three vectors A, B, C
A vector size of 256
B vector size of 256
C vector size of 256
Now I want to do concatenation in the following way:
AB= vector size will be 512
AC = vector size will be 512
BC = vector size will be 512
However, I need to restrict all the concatenated vectors to 256, like:
AB= vector size will be 256
AC = vector size will be 256
BC = vector size will be 256
One way is to take the element-wise mean of the two vectors: average A's first value with B's first value, A's second value with B's second value, and so on; for example, with A = [1, 3] and B = [5, 7], AB = [(1+5)/2, (3+7)/2] = [3, 5]. The same applies to the other pairs.
Here is how I implement this:
x  # torch.Size([32, 3, 256]): 32 is the batch size, 3 indexes vectors A, B, C, and 256 is each vector's dimension
def my_fun(self, x):
    iter = x.shape[0]
    counter = 0
    new_x = torch.zeros((10, x.shape[1]), dtype=torch.float32, device=torch.device('cuda'))
    for i in range(0, x.shape[0] - 1):
        iter -= 1
        for j in range(0, iter):
            mean = (x[i, :] + x[i+j, :]) / 2
            new_x[counter, :] = torch.unsqueeze(mean, 0)
            counter += 1
    final_T = torch.cat((x, new_x), dim=0)
    return final_T
ref = torch.zeros((x.shape[0], 15, x.shape[2]), dtype=torch.float32, device=torch.device('cuda'))
for i in range(x.shape[0]):
    ref[i, :, :] = self.my_fun(x[i, :, :])
But this implementation is computationally expensive. One reason is that I am iterating over the batch dimension in a Python loop. Is there a more efficient way to implement this task?
Torch has a built-in mean function, torch.mean, which can compute the element-wise mean of stacked vectors along a given dimension.
import torch
import numpy as np
import itertools as it
allvectors = torch.stack((a, b, c), dim=0)
values = it.combinations([0, 1, 2], 2)
for i, j in values:
    pairedvectors = torch.stack((allvectors[i], allvectors[j]), dim=0)
    mean = torch.mean(pairedvectors, dim=0)
for 3 example vectors:
a=torch.from_numpy(np.zeros((5)))
b=torch.from_numpy(np.ones((5)))
c=torch.from_numpy(np.ones((5))*5)
It results in the following vectors:
tensor([0.5000, 0.5000, 0.5000, 0.5000, 0.5000], dtype=torch.float64)
tensor([2.5000, 2.5000, 2.5000, 2.5000, 2.5000], dtype=torch.float64)
tensor([3., 3., 3., 3., 3.], dtype=torch.float64)
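For the batched tensor from the question ([32, 3, 256]), the pairwise means can also be computed without any Python loop by indexing both members of each pair at once. A minimal sketch, assuming x stacks A, B and C along dimension 1 (the index tensors first/second below are illustrative names, not from the original code):
import torch
x = torch.randn(32, 3, 256)                           # [batch, 3, dim] as in the question
first = torch.tensor([0, 0, 1])                       # A, A, B
second = torch.tensor([1, 2, 2])                      # B, C, C
pair_means = (x[:, first, :] + x[:, second, :]) / 2   # [32, 3, 256]: means of (A,B), (A,C), (B,C)
out = torch.cat((x, pair_means), dim=1)               # [32, 6, 256]: originals followed by the pair means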
I have a 3D array of shape (4, 3, 3) which I would like to iteratively multiply with a 1D array (the t variable) and then sum, ending up with a (3, 3) array A that is the summation of the four (3, 3) arrays.
I'm unsure how I should be assigning indices, or how (and whether) I should be using np.ndenumerate.
Thanks
import numpy as np
import math
#Enter material constants for calculation of stiffness matrix
E1 = 20
E2 = 1.2
G12 = 0.8
v12=0.25
v21=(v12/E1)*E2
theta = np.array([30,-30,-30,30])
deg = ((math.pi*theta/180))
k = len(theta) #number of layers
t = np.array([0.005,0.005,0.005,0.005])
#Calculation of Q Values
Q11 = 1
Q12 = 2
Q21 = 3
Q22 = 4
Q66 = 5
Qbar = np.zeros((len(theta),3,3),order='F')
#CALCULATING THE VALUES OF THE QBAR MATRIX
for i, x in np.ndenumerate(deg):
    m = np.cos(x)  # cos of the rotated lamina angle
    n = np.sin(x)  # sin of the rotated lamina angle
    Qbar11 = Q11*3
    Qbar12 = Q22*4
    Qbar16 = Q16*4
    Qbar21 = Qbar12
    Qbar22 = Q22*1
    Qbar26 = Q66*2
    Qbar66 = Q12*3
    Qbar[i] = np.array([[Qbar11, Qbar12, Qbar16], [Qbar21, Qbar22, Qbar26], [Qbar16, Qbar26, Qbar66]], order='F')
print(Qbar)
A = np.zeros((3,3))
for i in np.nditer(t):
    A[i] = Qbar[i]*t[i]
    A = sum(A[i])
If I understand correctly, you want to multiply Qbar and t over the first axis, and then sum the result over the first axis (which gives an array of shape (3, 3)).
I created random arrays to make the code minimal:
import numpy as np
Qbar = np.random.randint(2, size=(4, 3, 3))
t = np.arange(4)
A = (Qbar * t[:, None, None]).sum(axis=0)
t[:, None, None] creates two new dimensions so that the shape becomes (4, 1, 1), which can be multiplied with Qbar element-wise (broadcasting over the last two axes). Then we just have to sum over the first axis.
NB: A = np.tensordot(t, Qbar, axes=([0],[0])) also works and can be faster for larger dimensions, but for the dimensions you provided I prefer the first solution.
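Both forms give the same result, which is easy to check with the random arrays above:
A1 = (Qbar * t[:, None, None]).sum(axis=0)
A2 = np.tensordot(t, Qbar, axes=([0], [0]))
print(np.allclose(A1, A2))  # True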
Consider the following vector:
import numpy as np
u = np.random.randn(5)
print(u)
[-0.30153275 -1.48236907 -1.09808763 -0.10543421 -1.49627068]
When we print its shape:
print(u.shape)
(5,)
I was told this is neither a column vector nor a row vector. So what essentially is the shape (m,) in numpy?
# one-dimensional array (rank 1 array)
# array([ 0.202421 , 1.04496629, -0.28473552, 0.22865349, 0.49918827])
a = np.random.randn(5,) # or b = np.random.randn(5)
# column vector (5 x 1)
# array([[-0.52259951],
# [-0.2200037 ],
# [-1.07033914],
# [ 0.9890279 ],
# [ 0.38434068]])
c = np.random.randn(5,1)
# row vector (1 x 5)
# array([[ 0.42688689, -0.80472245, -0.86294221, 0.28738552, -0.86776229]])
d = np.random.randn(1,5)
For example (see docs):
numpy.dot(a, b)
If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
If both a and b are 2-D arrays, it is matrix multiplication.
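A small example of that difference, using throwaway arrays just to show the shapes np.dot produces:
import numpy as np
a = np.random.randn(5)       # shape (5,): 1-D, neither a row nor a column
c = np.random.randn(5, 1)    # shape (5, 1): a column vector (2-D)
print(np.dot(a, a).shape)    # () -- inner product of two 1-D arrays, a scalar
print(np.dot(c.T, c).shape)  # (1, 1) -- matrix multiplication of 2-D arrays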
In AE-LSTM,
there is an LSTM output with shape [batch, timestamp, diml] (regarded as [b, t, dl]), and the aspect vector has shape [batch, dima] (regarded as [b, da]).
How can the two be concatenated so the result has shape [b, t, dl+da]?
That is, for every batch element, the aspect vector should be concatenated to every timestep row.
I am not absolutely sure, but I think what you want is
C = tf.concat([A, tf.tile(tf.expand_dims(B, axis=1), [1, t, 1])], axis=-1)
where A is your LSTM output and B the aspect vector. I validated it simply by checking the dimensions, which seem to be correct. Let's see if this is what you actually need.
Edit: Just to be clear, this is the entire code I used to test this:
import tensorflow as tf
b = 5
t = 10
dl = 15
da = 12
A = tf.ones(shape=(b, t, dl))
B = tf.ones(shape=(b, da))
C = tf.concat([A, tf.tile(tf.expand_dims(B, axis=1), [1, t, 1])], axis=-1)
print(C)
Which gives the expected output:
Tensor("concat:0", shape=(5, 10, 27), dtype=float32)
I am trying to update the weights in a neural network with this line:
self.l1weights[0] = self.l1weights[0] + self.learning_rate * l1error
And this results in a value error:
ValueError: could not broadcast input array from shape (7,7) into shape (7)
Printing learning_rate * l1error and the weights gives something like this:
[[-0.00657573]
[-0.01430752]
[-0.01739463]
[-0.00038115]
[-0.01563393]
[-0.02060908]
[-0.01559269]]
[ 4.17022005e-01 7.20324493e-01 1.14374817e-04 3.02332573e-01
1.46755891e-01 9.23385948e-02 1.86260211e-01]
It is clear the weights are initialized as a vector of length 7 in this example and the error is initialized as a 7x1 matrix. I would expect addition to return a 7x1 matrix or a vector as well, but instead it generates a 7x7 matrix like this:
[[ 4.10446271e-01 7.13748760e-01 -6.46135890e-03 2.95756839e-01
1.40180157e-01 8.57628611e-02 1.79684478e-01]
[ 4.02714481e-01 7.06016970e-01 -1.41931487e-02 2.88025049e-01
1.32448367e-01 7.80310713e-02 1.71952688e-01]
[ 3.99627379e-01 7.02929868e-01 -1.72802505e-02 2.84937947e-01
1.29361266e-01 7.49439695e-02 1.68865586e-01]
[ 4.16640855e-01 7.19943343e-01 -2.66775370e-04 3.01951422e-01
1.46374741e-01 9.19574446e-02 1.85879061e-01]
[ 4.01388075e-01 7.04690564e-01 -1.55195551e-02 2.86698643e-01
1.31121961e-01 7.67046648e-02 1.70626281e-01]
[ 3.96412924e-01 6.99715412e-01 -2.04947062e-02 2.81723492e-01
1.26146810e-01 7.17295137e-02 1.65651130e-01]
[ 4.01429313e-01 7.04731801e-01 -1.54783174e-02 2.86739880e-01
1.31163199e-01 7.67459026e-02 1.70667519e-01]]
Numpy.sum also returns the same 7x7 matrix. Is there a way to solve this without explicit reshaping? Output size is variable and this is an issue specific to when the output size is one.
When adding a (7,) array (call it a) to a (7, 1) array (call it b), broadcasting kicks in and generates a (7, 7) array. If you just want element-by-element addition, keep them in the same shape.
a + b.flatten() gives (7,). flatten collapses all the dimensions into one. This keeps the result as a flat 1-D array.
a.reshape(-1, 1) + b gives (7, 1). The -1 in reshape tells numpy to infer that dimension from the others. This keeps the result as a column.
a = np.arange(7) # row
b = a.reshape(-1, 1) # column
print((a + b).shape) # (7, 7)
print((a + b.flatten()).shape) # (7,)
print((a.reshape(-1, 1) + b).shape) # (7, 1)
In your case, a and b would be self.l1weights[0] and self.learning_rate * l1error respectively.
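Applied to the line from the question, a minimal sketch of the fix (assuming l1error has shape (7, 1) as in the printout above) is to flatten the update before adding it:
# keeps the weights 1-D instead of broadcasting to (7, 7)
self.l1weights[0] = self.l1weights[0] + (self.learning_rate * l1error).flatten()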