I am new to Python and I don't know exactly how to perform multiplication between arrays of different shapes.
I have two arrays W and b such that:
W.shape = [32, 5, 20]
b.shape = [5,]
and I want to multiply
W[:, i, :]*b[i]
for each i from 0 to 4.
How can I do that? Thanks in advance.
You could add a new axis to b so it is multiplied across the rows of W's inner arrays, i.e. the second axis:
W * b[:,None]
What you want to do is called Broadcasting. In numpy, you can multiply this way, but only if the shapes match according to some restrictions:
Starting from the right, each pair of corresponding dimensions must be equal, or one of them must be 1, or one of them must not exist
so right now you have:
W.shape = (32, 5, 20)
b.shape = (5,)
since 20 and 5 don't match, they can't be broadcast.
If you were to have:
W.shape = (32, 5, 20)
b.shape = (5, 1)
20 would match with 1 (a 1 is always compatible) and the 5s would match, so you could then multiply them.
To get b's shape to (5, 1), you can either do .reshape(5, 1) (or, more robustly, .reshape(-1, 1)) or add a new axis with [:, None] (equivalent to [:, np.newaxis]).
Thus either of these works:
W * b[:, None]        # yatu's answer
W * b.reshape(-1, 1)
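A quick sanity check with random data of the shapes from the question (a minimal sketch):
import numpy as np

W = np.random.rand(32, 5, 20)
b = np.random.rand(5)
out = W * b[:, None]  # b[:, None] has shape (5, 1), broadcasting against W's trailing (5, 20)
print(out.shape)  # (32, 5, 20)
print(np.allclose(out[:, 2, :], W[:, 2, :] * b[2]))  # True: matches the desired W[:, i, :] * b[i]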
I have two numpy arrays, one with shape, let's say, (10, 5, 200), and another with shape (1, 200). How can I stack them so the result is an array of shape (10, 6, 200)? Basically, by stacking b onto each 2-D slice of a along the first dimension:
a = np.random.random((10, 5, 200))
b = np.zeros((1, 200))
I've tried hstack and vstack but I get an error about an incorrect number of axes.
Let's say:
a = np.random.random((10, 5, 200))
b = np.zeros((1, 200))
Let's look at the volume (number of elements) of each array:
The volume of a is 10*5*200 = 10000.
The volume of an array with shape (10, 6, 200) is 10*6*200 = 12000.
That is, you want to create an array that has 2000 more elements.
However, the volume of b is 1*200 = 200.
This means a and b can't be stacked directly; b has to be broadcast to fill the extra (10, 1, 200) slice.
As hpaulj mentioned in the comments, one way is to allocate an empty NumPy array of the target shape and then fill it:
# allocate the target (10, 6, 200) array
result = np.empty((a.shape[0], a.shape[1] + b.shape[0], a.shape[2]))
# copy a into the first a.shape[1] rows of the middle axis
result[:, :a.shape[1], :] = a
# broadcast b across the first axis to fill the remaining row
result[:, a.shape[1]:, :] = b
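For the shapes in the question this yields the desired (10, 6, 200) array. A quick check (a minimal sketch; the last line relies on the broadcast described above):
print(result.shape)                         # (10, 6, 200)
print(np.array_equal(result[:, :5, :], a))  # True
print((result[:, 5:, :] == b).all())        # True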
I'm trying to reimplement NumPy's as_strided function in pure Python. Beforehand I convert the strides from byte counts to element counts according to the variable's dtype (for float32 I divide each stride by 4, etc.).
The code I implemented:
def as_strided(x, shape, strides):
    x = x.flatten()
    size = 1
    for value in shape:
        size *= value
    arr = np.zeros(size, dtype=np.float32)
    curr = 0
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                arr[curr] = x[i * strides[0] + j * strides[1] + k * strides[2]]
                curr = curr + 1
    return np.reshape(arr, shape)
To test the function, I wrote two auxiliary functions:
def sliding_window(x, shape, strides):
    f_mine = as_strided(x, shape, [stride // 4 for stride in strides])
    f_np = np.lib.stride_tricks.as_strided(x, shape=shape, strides=strides).copy()
    check_strides(x.flatten(), f_mine)
    check_strides(x.flatten(), f_np)
    return f_mine, f_np

def check_strides(original, strided):
    s1 = int(np.where(original == strided[1][0][0])[0])
    s2 = int(np.where(original == strided[0][1][0])[0])
    s3 = int(np.where(original == strided[0][0][1])[0])
    print([s1, s2, s3])
    return [s1, s2, s3]
In the main code, I selected some shape and stride values and ran two cases:
Loaded a .npy file that contains a float32 matrix - variable x.
Created a random matrix of the same size and type as x - variable y.
When I check the strides of the resulting matrices, I see a strange phenomenon.
For case 1, the strides obtained using the NumPy function differ from the requested strides (and from my implementation).
For case 2, the outputs are identical.
The main code:
shape = (30, 818, 300)
strides = (4, 120, 120)
# case 1
x = np.load('x.npy')
s_mine, s_np = sliding_window(x, shape, strides)
print(np.array_equal(s_mine, s_np))
# case 2
y = np.random.randn(x.shape[0], x.shape[1]).astype(np.float32)
s_mine, s_np = sliding_window(y, shape, strides)
print(np.array_equal(s_mine, s_np))
Here you can find the x.npy file that causes the desired stride change in the numpy function. I'd be happy if anyone could explain to me why this is happening.
I downloaded x.npy and loaded it, and ran as_strided on y. I haven't looked at your code.
Normally when playing with as_strided I like to look at the arrays, but in this case they are large enough that I'll focus more on making sense of the strides and shape.
In [39]: x.shape, x.strides
Out[39]: ((30, 1117), (4, 120))
In [40]: y.shape, y.strides
Out[40]: ((30, 1117), (4468, 4))
I wondered where you got the
shape = (30, 818, 300)
strides = (4, 120, 120)
OK, the 30 is shared, but the 4 only applies to x. With those strides, x looks like it's F-ordered, maybe even the transpose of a (1117, 30) array. Your y, which was constructed with random, has the typical strides for a C-ordered array: 4 bytes for the inner, trailing dimension and 4*1117 for the leading dimension.
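To illustrate the difference (a minimal sketch, assuming a float32 array of the same (30, 1117) shape):
import numpy as np

c = np.zeros((30, 1117), dtype=np.float32)  # C-ordered (row-major)
f = np.asfortranarray(c)                    # F-ordered (column-major)
print(c.strides)  # (4468, 4) - like y
print(f.strides)  # (4, 120)  - like x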
I have a tensor A of shape (100, 16, 16) and a tensor B of shape (100,), where 100 is the batch size. I want to create a binary mask of A with shape (100, 16, 16): in each element of the mask (each element has shape (1, 16, 16)), the value is 1 if the entry is greater than the computed quantile value, else 0. Each element in tensor B indicates the quantile value for the corresponding element in A. If B is simply a scalar, I can use:
flat_A = torch.reshape(A, (100, -1))
quants = torch.quantile(flat_A, B, dim=1)
quants = torch.reshape(quants, (100, 1, 1))  # quants will have shape (100, 1, 1)
mask = torch.where(A >= quants, 1, 0)
The question is: if B is a 1D tensor of shape (100,) like I said above, how can I compute the quantile value for each individual element in A? I tried the following, but the results did not look like what I expected:
>>> torch.quantile(flat_A, B, dim=1).shape
torch.Size([100, 100])
>>> torch.quantile(flat_A, B, dim=0).shape
torch.Size([100, 256])
I think the result's shape should be (100,), so I can use mask = torch.where(A >= quants, 1, 0), or maybe I misunderstand something?
For more context, this question is also an extension of the scalar-B question I asked previously here.
This is one way using torch.quantile() function. Note that here I am using tensors of shape (5, 2, 2) instead of (100, 16, 16) for simplicity.
import torch
# Generate some data of shape (5, 2, 2)
A = torch.arange(5 * 2 * 2).reshape(5, 2, 2) + 1.0
B = torch.linspace(0, 1, 5)  # 5 quantile values, one per element in A
Af = A.reshape(A.shape[0], -1)  # flatten A to a 2D tensor of shape (5, 4)
quantiles = torch.quantile(Af, B, dim=1, keepdim=True)  # shape (5, 5, 1)
quants = quantiles[torch.arange(A.shape[0]), torch.arange(A.shape[0]), 0]  # diagonal, shape (5,)
mask = (A >= quants[:, None, None]).type(torch.uint8)
Here the tensor quantiles is of shape torch.Size([5, 5, 1]) because it stores the thresholds for each quantile value in B for each element in A (or row in Af). Since we have 5 quantile values, we get 5 thresholds for each element in A.
For instance, quantiles[i, j, 0] holds the threshold for the B[i]-th quantile of A[j] (or Af[j]), and you essentially need the values quantiles[k, k, 0] for k in the range of the batch size (5 here).
Now, to satisfy the requirement that thresholds correspond to the quantiles in B and the elements in A, simply index out the diagonal elements from quantiles to populate quants, which has shape torch.Size([5]).
Finally to get the mask, compare A with the corresponding thresholds for each element. Note that this uses a broadcasted elementwise comparison with the thresholds. mask has the required shape of torch.Size([5, 2, 2]).
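To verify the intermediate shapes, a small sketch continuing the example above:
print(quantiles.shape)  # torch.Size([5, 5, 1])
print(quants.shape)     # torch.Size([5])
print(mask.shape)       # torch.Size([5, 2, 2])
# the same diagonal can also be taken with torch.diagonal:
assert torch.equal(quants, quantiles[:, :, 0].diagonal())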
I have a NumPy array of shape (2000, 1) and I need its shape to be (2000, 7).
The values in the 6 columns we are adding can be anything.
Is there a function or method to accomplish this?
Thanks.
You can try numpy.hstack for this.
>>> x = np.zeros((2000, 1))
>>> x.shape
(2000, 1)
>>> x = np.hstack((x, np.zeros((2000, 6))))
>>> x.shape
(2000, 7)
An interesting option is the np.pad function.
The second parameter (pad_width - a list of 2-tuples) defines how many
elements to add at the beginning / end of each dimension.
Example:
arr = np.arange(1,7)[:, np.newaxis]
arr.shape # (6, 1)
result = np.pad(arr, [(0, 0), (0, 6)])
result.shape # (6, 7)
A third parameter, mode, can also be passed, defining various
ways to choose the values to pad with. For details see the documentation.
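For instance, mode='edge' repeats the existing edge values instead of the default constant zeros (a small sketch reusing arr from above):
result_edge = np.pad(arr, [(0, 0), (0, 6)], mode='edge')
result_edge.shape  # (6, 7) - each row now holds its original value repeated 7 times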
You can do this with broadcasting:
x = np.zeros((2000, 1))
np.broadcast_to(x, (2000,7))
In this case, each value in the single column is repeated along the second axis. Note that np.broadcast_to returns a read-only view; call .copy() on it if you need a writable array.
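A minimal sketch showing the view/copy distinction:
x = np.zeros((2000, 1))
view = np.broadcast_to(x, (2000, 7))  # read-only view, no extra memory
arr = view.copy()                     # writable copy
print(arr.shape)  # (2000, 7)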
I am trying to update the weights in a neural network with this line:
self.l1weights[0] = self.l1weights[0] + self.learning_rate * l1error
And this results in a value error:
ValueError: could not broadcast input array from shape (7,7) into shape (7)
Printing the learning_rate*error and the weights returns something like this:
[[-0.00657573]
[-0.01430752]
[-0.01739463]
[-0.00038115]
[-0.01563393]
[-0.02060908]
[-0.01559269]]
[ 4.17022005e-01 7.20324493e-01 1.14374817e-04 3.02332573e-01
1.46755891e-01 9.23385948e-02 1.86260211e-01]
It is clear the weights are initialized as a vector of length 7 in this example and the error is initialized as a 7x1 matrix. I would expect addition to return a 7x1 matrix or a vector as well, but instead it generates a 7x7 matrix like this:
[[ 4.10446271e-01 7.13748760e-01 -6.46135890e-03 2.95756839e-01
1.40180157e-01 8.57628611e-02 1.79684478e-01]
[ 4.02714481e-01 7.06016970e-01 -1.41931487e-02 2.88025049e-01
1.32448367e-01 7.80310713e-02 1.71952688e-01]
[ 3.99627379e-01 7.02929868e-01 -1.72802505e-02 2.84937947e-01
1.29361266e-01 7.49439695e-02 1.68865586e-01]
[ 4.16640855e-01 7.19943343e-01 -2.66775370e-04 3.01951422e-01
1.46374741e-01 9.19574446e-02 1.85879061e-01]
[ 4.01388075e-01 7.04690564e-01 -1.55195551e-02 2.86698643e-01
1.31121961e-01 7.67046648e-02 1.70626281e-01]
[ 3.96412924e-01 6.99715412e-01 -2.04947062e-02 2.81723492e-01
1.26146810e-01 7.17295137e-02 1.65651130e-01]
[ 4.01429313e-01 7.04731801e-01 -1.54783174e-02 2.86739880e-01
1.31163199e-01 7.67459026e-02 1.70667519e-01]]
numpy.sum also returns the same 7x7 matrix. Is there a way to solve this without explicit reshaping? The output size is variable, and this issue is specific to the case when the output size is one.
When adding a (7,) array (named a) to a (7, 1) array (named b), broadcasting happens and generates a (7, 7) array. If you just want element-by-element addition, keep them in the same shape.
a + b.flatten() gives (7,). flatten collapses all the dimensions into one. This keeps the result as a row.
a.reshape(-1, 1) + b gives (7, 1). The -1 in reshape tells numpy to infer that dimension from the remaining ones. This keeps the result as a column.
a = np.arange(7) # row
b = a.reshape(-1, 1) # column
print((a + b).shape) # (7, 7)
print((a + b.flatten()).shape) # (7,)
print((a.reshape(-1, 1) + b).shape) # (7, 1)
In your case, a and b would be self.l1weights[0] and self.learning_rate * l1error respectively.
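Applied to the line from the question (assuming self.learning_rate * l1error is the (7, 1) column shown above), flattening the update keeps the weights a (7,) vector:
# collapse the (7, 1) update to (7,) so it matches the weight vector
self.l1weights[0] = self.l1weights[0] + (self.learning_rate * l1error).flatten()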