If I have an array, A, with shape (n, m, o) and an array, B, with shape (n, m), is there a way to divide each array at A[n, m] by the scalar at B[n, m] without a list comprehension?
>>> A.shape
(4,173,1469)
>>> B.shape
(4,173)
>>> # Better way to do:
>>> np.array([[A[i, j] / B[i, j] for j in range(len(B[i]))] for i in range(len(B))])
The problem with a list comprehension is that it is slow, it doesn't return an array (so you have to np.array(_) it, which makes it even slower), it is hard to read, and the whole point of numpy is to move loops from Python into C or Fortran.
If A were of shape (n,) and B a scalar (of shape ()), this would be trivial: A / B. But this property does not scale with dimensions:
>>> A / B
ValueError: operands could not be broadcast together with shapes (4,173,1469) (4,173)
I am looking for a fast way to do this (preferably not by tiling B to an array of shape (n, m, o), and preferably using native numpy tools).
You are absolutely right, there is a better way; I think you are getting the spirit of numpy.
The solution in your case is to add a new dimension to B consisting of a single entry:
if your A is of shape (n, m, o), your B has to be of shape (n, m, 1), and then you can use native broadcasting to get your operation A / B done.
You can add that dimension to B by indexing it with np.newaxis:
import numpy as np
A = np.ones((10, 5, 3))   # np.ones takes the shape as a tuple
B = np.ones((10, 5))
Result = A / B[:, :, np.newaxis]
B[:,:,np.newaxis] --> this will turn B into an array of shape (10, 5, 1).
From here, the rules of broadcasting are:
When operating on two arrays, NumPy compares their shapes
element-wise. It starts with the trailing dimensions, and works its
way forward. Two dimensions are compatible when
they are equal, or
one of them is 1
Your shapes are (n, m, o) and (n, m), so they are not compatible.
The / division operator will work via broadcasting if you use either:
(o, n, m) divided by (n, m), or
(n, m, o) divided by (n, m, 1)
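To make this concrete, here is a minimal runnable sketch (with random data standing in for the asker's arrays) showing two equivalent spellings of the trailing-singleton fix:
import numpy as np

A = np.random.rand(4, 173, 1469)
B = np.random.rand(4, 173)

# Option 1: index with np.newaxis to get B into shape (4, 173, 1)
out1 = A / B[:, :, np.newaxis]

# Option 2: np.expand_dims does the same thing and can read more clearly
out2 = A / np.expand_dims(B, axis=-1)

assert out1.shape == (4, 173, 1469)
assert np.allclose(out1, out2)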
Related
I have a 3D numpy array A with shape (10227, 127, 340) and a 1D array B with shape (10227,), both of dtype float64. I just want to add B to A along the first axis, at each of the 127x340 grid points.
The output array should be C with shape (10227, 127, 340), with the values changed by the sum, of course.
This may be one way of achieving it:
C = A + np.repeat(B, A.shape[1] * A.shape[2]).reshape(A.shape)
The following code also works but is much slower:
C = A.copy()
for i in range(A.shape[0]):
    C[i] += B[i]   # add B[i] to the whole (127, 340) slice; note C[i], not C[i:]
A more compact way of doing this uses numpy's fancy dimensional indexing tools:
C = A + B[:, None, None]
This automatically expands B to match the dimensions of A along the selected axes (indicated by None or np.newaxis).
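As a quick sanity check, here is a small sketch (with stand-in shapes, since the asker's full arrays are large) confirming that the np.repeat version and the broadcasting version agree:
import numpy as np

A = np.random.rand(6, 4, 5)   # stands in for shape (10227, 127, 340)
B = np.random.rand(6)         # stands in for shape (10227,)

C_repeat = A + np.repeat(B, A.shape[1] * A.shape[2]).reshape(A.shape)
C_broadcast = A + B[:, None, None]

assert np.allclose(C_repeat, C_broadcast)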
I'm trying to assemble a tensor based on the contents of two other tensors, like so:
I have a 2D tensor called A, with shape I * J, and another 2D tensor called B, with shape M * N, whose elements are indices into the 1st dimension of A.
I want to obtain a 3D tensor C with shape M * N * J such that C[m,n,:] == A[B[m,n],:] for all m in [0, M) and n in [0, N).
I could do this using nested for-loops to iterate over all indices in M and N, assigning the right values to C at each one, but M and N are large so this is quite slow. I suspect there's some nicer, faster way of doing this using clever slicing or a built-in pytorch function, but I don't know what it would be. It looks a bit like somewhere one would use torch.gather(), but that requires all tensors to have the same number of dimensions. Does anyone know how this ought to be done?
EDIT: torch.index_select(input, dim, index) is almost what I want, but it won't work here because it requires that index be a 1D tensor, while my tensor of indices is 2D.
You could achieve this by flattening B, which lets you index into the first dimension of A; a reshape then recovers the final shape:
>>> A[B.flatten(),:].reshape(*B.shape, A.size(-1))
Indexing with A[B.flatten(),:] is equivalent to torch.index_select(A, 0, B.flatten()).
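For completeness, a small runnable sketch (sizes chosen arbitrarily). Note that plain advanced indexing, A[B], should produce the same tensor directly, since indexing with an integer tensor selects along the first dimension:
import torch

I, J, M, N = 7, 3, 4, 5
A = torch.randn(I, J)
B = torch.randint(0, I, (M, N))

# Flatten the index tensor, gather rows of A, then restore the (M, N) layout
C = A[B.flatten(), :].reshape(*B.shape, A.size(-1))
assert C.shape == (M, N, J)

# Advanced indexing with the 2D index tensor gives the same result in one step
assert torch.equal(C, A[B])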
I have a numpy array of shape (29, 10) and a list of 29 elements and I want to end up with an array of shape (29,11)
I am basically converting the list to a numpy array and trying to vstack, but it complains about the dimensions not being the same.
Toy example
a = np.zeros((29,10))
a.shape
(29,10)
b = np.array(['A']*29)
b.shape
(29,)
np.vstack((a, b))
ValueError: all the input array dimensions except for the concatenation axis must match exactly
The dimensions do actually match, so why am I getting this error, and how can I solve it?
I think you are looking for np.hstack.
np.hstack((a, b.reshape(-1,1)))
Moreover, b must be 2-dimensional; that's why I used a reshape.
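If you'd rather avoid the explicit reshape, np.column_stack is a possible alternative, since it promotes 1D inputs to columns automatically. A minimal sketch (note the result dtype becomes a string type here, because b holds strings):
import numpy as np

a = np.zeros((29, 10))
b = np.array(['A'] * 29)

# column_stack treats the 1D array b as a single column, no reshape needed
c = np.column_stack((a, b))
print(c.shape)  # (29, 11)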
The problem is that you want to append a 1D array to a 2D array.
Also, for the dimension you've given for b, you are probably looking for hstack.
Try this:
a = np.zeros((29,10))
a.shape
(29,10)
b = np.array(['A']*29)[:,None] #to ensure 2D structure
b.shape
(29,1)
np.hstack((a, b))
If you do want to vertically stack, you'd need this:
a = np.zeros((29,10))
a.shape
(29,10)
b = np.array(['A']*10)[None,:] #to ensure 2D structure
b.shape
(1,10)
np.vstack((a, b))
Consider the following simple example:
X = numpy.zeros([10, 4]) # 2D array
x = numpy.arange(0,10) # 1D array
X[:,0] = x # WORKS
X[:,0:1] = x # returns ERROR:
# ValueError: could not broadcast input array from shape (10) into shape (10,1)
X[:,0:1] = (x.reshape(-1, 1)) # WORKS
Can someone explain why numpy has vectors of shape (N,) rather than (N,1) ?
What is the best way to do the casting from 1D array into 2D array?
Why do I need this?
Because I have code which inserts a result x into a 2D array X, and the size of x changes from time to time, so I use X[:, idx1:idx2] = x, which works if x is 2D but not if x is 1D.
Do you really need to be able to handle both 1D and 2D inputs with the same function? If you know the input is going to be 1D, use
X[:, i] = x
If you know the input is going to be 2D, use
X[:, start:end] = x
If you don't know the input dimensions, I recommend switching between one line or the other with an if, though there might be some indexing trick I'm not aware of that would handle both identically.
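One such trick, sketched below under the assumption that x always carries X.shape[0] rows' worth of data, is to reshape x to (rows, -1) before assigning, which makes the 1D and 2D cases look identical:
import numpy as np

X = np.zeros((10, 4))
x1 = np.arange(10)                  # 1D result
x2 = np.arange(20).reshape(10, 2)   # 2D result

# Reshaping to (rows, -1) turns both cases into 2D columns
X[:, 0:1] = x1.reshape(X.shape[0], -1)
X[:, 1:3] = x2.reshape(X.shape[0], -1)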
Your x has shape (N,) rather than shape (N, 1) (or (1, N)) because numpy isn't built for just matrix math. ndarrays are n-dimensional; they support efficient, consistent vectorized operations for any non-negative number of dimensions (including 0). While this may occasionally make matrix operations a bit less concise (especially in the case of dot for matrix multiplication), it produces more generally applicable code for when your data is naturally 1-dimensional or 3-, 4-, or n-dimensional.
I think you have the answer already included in your question. Numpy allows arrays to be of any dimensionality (while afaik Matlab prefers two dimensions where possible), so you need to be precise about this (and always distinguish between (n,) and (n,1)). By giving one number as one of the indices (like the 0 in X[:,0] = x), you reduce the dimensionality by one. By giving a range as one of the indices (like the 0:1 in X[:,0:1] = x), you don't reduce the dimensionality.
The assignment X[:,0] = x makes perfect sense to me, and I would assign to the 2-D array this way.
Here are two tricks that make the code a little shorter.
X = numpy.zeros([10, 4]) # 2D array
x = numpy.arange(0,10) # 1D array
X.T[:1, :] = x          # the transposed view X.T[:1, :] has shape (1, 10), so x broadcasts into it
X[:, 2:3] = x[:, None]  # x[:, None] has shape (10, 1), matching the sliced column
Why does the program
import numpy as np
c = np.array([1,2])
print(c.shape)
d = np.array([[1],[2]]).transpose()
print(d.shape)
give
(2,)
(1,2)
as its output? Shouldn't it be
(1,2)
(1,2)
instead? I got this in both python 2.7.3 and python 3.2.3
When you invoke the .shape attribute of an ndarray, you get a tuple with as many elements as the array has dimensions. The length, i.e., the number of rows, is the first dimension (shape[0]).
You start with an array : c=np.array([1,2]). That's a plain 1D array, so its shape will be a 1-element tuple, and shape[0] is the number of elements, so c.shape = (2,)
Consider c=np.array([[1,2]]). That's a 2D array, with 1 row. The first and only row is [1,2], that gives us two columns. Therefore, c.shape=(1,2) and len(c)=1
Consider c=np.array([[1,],[2,]]). Another 2D array, with 2 rows, 1 column: c.shape=(2,1) and len(c)=2.
Consider d=np.array([[1,],[2,]]).transpose(): this array is the same as np.array([[1,2]]), therefore its shape is (1,2).
Another useful attribute is .size: that's the number of elements across all dimensions, so for an array c you have c.size == np.prod(c.shape).
More information on the shape in the documentation.
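A tiny example tying these attributes together:
import numpy as np

c = np.array([[1, 2]])
print(c.shape)           # (1, 2)
print(len(c))            # 1, the size of the first dimension
print(c.size)            # 2, the total number of elements
print(np.prod(c.shape))  # 2, same as c.size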
len(c.shape) is the "depth" of the array.
For c, the array is just a list (a vector), the depth is 1.
For d, the array is a list of lists, the depth is 2.
Note:
c.transpose()
# array([1, 2])
which is not d, so this behaviour is not inconsistent.
dt = d.transpose()
# array([[1],
# [2]])
dt.shape # (2,1)
Quick fix: check the .ndim property. If it's 2, then the .shape property will work as you expect.
Reason why: if the .ndim property is 2, then numpy reports a shape value that agrees with the (rows, columns) convention. If the .ndim property is 1, then numpy just reports shape in a different way.
More talking: When you pass np.array a list of lists, the .shape property will agree with standard notions of the dimensions of a matrix: (rows, columns).
If you pass np.array just a list, then numpy doesn't think it has a matrix on its hands, and reports the shape in a different way.
The question is: does numpy think it has a matrix, or does it think it has something else on its hands.
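A two-line check makes the distinction visible:
import numpy as np

print(np.array([1, 2]).ndim, np.array([1, 2]).shape)      # 1 (2,)
print(np.array([[1, 2]]).ndim, np.array([[1, 2]]).shape)  # 2 (1, 2)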
transpose does not change the number of dimensions of the array. If c.ndim == 1, c.transpose() == c. Try:
c = np.array([1,2])
print(c.shape)
print(c.T.shape)
c = np.atleast_2d(c)
print(c.shape)
print(c.T.shape)
Coming from Matlab, I also found it difficult that a single-dimensional array is not organized as (row_count, column_count).
My function had to respond consistently to both a single-dimensional ndarray like [x1, x2, x3] and a list of arrays like [[x1, x2, x3], [x1, x2, x3], [x1, x2, x3]].
This worked for me:
dim = np.shape(subtract_matrix)[-1]
Picking the last dimension.
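A minimal sketch of that idea (last_dim is a hypothetical helper name, and the data is made up):
import numpy as np

def last_dim(arr):
    # np.shape works on lists and ndarrays alike; [-1] picks the last axis
    return np.shape(arr)[-1]

print(last_dim(np.array([1.0, 2.0, 3.0])))   # 3, for a 1D input
print(last_dim([[1.0, 2.0, 3.0]] * 4))       # 3, for a list of rows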