I have used numpy.where() so many times now, and I always wondered about the following statement in the docs:
x, y and condition need to be broadcastable to some shape.
I see why this is necessary for both x and y. We want to assemble the resulting array from the two, so they should be broadcastable to the same shape. However, I do not understand why this is so important for the condition as well. It is only the decision rule. Suppose I have the following three shapes:
condition = (100,)
x = (100, 5)
y = (100, 5)
result = np.where(condition, x, y)
This results in a ValueError, because the "operands could not be broadcast together". To my understanding, this expression should work just fine, because I compose my result of both x and y which are broadcastable.
Can you help me understand why it is so important for the condition to be broadcastable along with x and y?
The condition is fundamentally a boolean array, not a generic condition. You could think of it as a mask over the final broadcasted shape of x and y.
If you think of it that way, it should be clear that the mask must have the same shape, or be broadcastable to the same shape, as the final output.
To illustrate this, here's a simple example. To begin with, consider a scenario in which we have hand-defined a 3x3 mask array as our condition, and we pass in two 3-item arrays as x and y, shaped to broadcast appropriately:
condition = numpy.array([[0, 1, 1],
[1, 0, 1],
[0, 0, 1]])
ones = numpy.ones(3)
numpy.where(condition, ones[:, None], ones[None, :] + 1)
The result looks like this:
>>> numpy.where(condition, ones[:, None], ones[None, :] + 1)
array([[2., 1., 1.],
[1., 2., 1.],
[2., 2., 1.]])
Because of the broadcasting step, x and y behave as if they were defined like this:
>>> x
array([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
>>> y
array([[2., 2., 2.],
[2., 2., 2.],
[2., 2., 2.]])
>>> numpy.where(condition, ones[:, None], ones[None, :] + 1)
array([[2., 1., 1.],
[1., 2., 1.],
[2., 2., 1.]])
This is the fundamental behavior of where. The fact that you can pass in a condition like (x > 5) doesn't change anything about the above; (x > 5) becomes a boolean array, and it must have the same shape as the output, or else it must be broadcastable to that shape. Otherwise, the behavior of where would be ill-defined.
(By the way, I am assuming your question is not about why the shapes (100,), (100, 5), and (100, 5) aren't broadcastable; that seems to be a different question.)
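For completeness, the shapes from the question can be made to line up by giving the condition a trailing axis of length 1. This is only a minimal sketch with made-up data; the names cond, x and y are placeholders, not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)
cond = rng.random(100) > 0.5      # shape (100,): one decision per row
x = np.ones((100, 5))
y = np.zeros((100, 5))

# (100,) vs (100, 5): trailing axes are compared first and 100 != 5, so this fails.
try:
    np.where(cond, x, y)
    failed = False
except ValueError:
    failed = True

# A trailing length-1 axis gives (100, 1), which broadcasts to (100, 5).
result = np.where(cond[:, None], x, y)
print(failed, result.shape)
```

With cond[:, None], each row of the output is taken entirely from x or entirely from y, which is presumably what a per-row decision rule is meant to do.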
I have a torch tensor, pred, in the form (B, 2, H, W) and I want to sum two different values, val1 and val2, to the channels on axis 1.
I managed to do it in a "mechanical" way by accessing the single channels directly, e.g.:
def thresh_format(pred, val1, val2):
    tr = torch.zeros_like(pred)
    tr[:, 0, :, :] = tr[:, 0, :, :].add(val1)
    tr[:, 1, :, :] = tr[:, 1, :, :].add(val2)
    return pred + tr
However I'm wondering if there's a "better" way to do it, e.g. by exploiting broadcasting. My understanding from the documentation is that broadcasting happens from trailing dimensions, so in this case I'm puzzled how to make it work for dimension 1.
Any ideas?
The easiest way to achieve this is to stack val1 and val2 in a tensor and reshape it to match the shape of the pred tensor along the common dimension.
pred + torch.tensor([val1, val2]).reshape((1,-1,1,1))
This way, during the addition, torch automatically broadcasts the values along every dimension where the reshaped tensor has size 1 and pred is larger.
It's pretty similar to what happens when you just add a simple scalar value to a tensor, like:
>>> torch.ones((2, 2)) + 3.
tensor([[4., 4.],
[4., 4.]])
But instead of broadcasting a single scalar value to every element of the tensor during the addition, in the case above the values are broadcast along the dimensions that do not already match.
>>> B=1; W=2; H=2; val1=3; val2=7
>>> pred = torch.zeros((B,2,W,H))
>>> val = torch.tensor([val1, val2]).reshape((1,-1,1,1))
>>> pred
tensor([[[[0., 0.],
[0., 0.]],
[[0., 0.],
[0., 0.]]]])
>>> val
tensor([[[[3]],
[[7]]]])
>>> pred + val
tensor([[[[3., 3.],
[3., 3.]],
[[7., 7.],
[7., 7.]]]])
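The "trailing dimensions" rule from the question can be checked on shapes alone. The sketch below uses NumPy as a stand-in, on the assumption that torch follows the same broadcasting rules; np.broadcast_shapes needs NumPy 1.20 or later, and the sizes B, H, W are arbitrary illustrative values:

```python
import numpy as np

B, H, W = 1, 3, 4
pred_shape = (B, 2, H, W)

# Shapes are aligned from the right, so a plain (2,) is compared against W.
try:
    np.broadcast_shapes(pred_shape, (2,))
    plain_ok = True
except ValueError:
    plain_ok = False

# (1, 2, 1, 1) puts the two values on axis 1 and lets the other axes stretch.
reshaped = np.broadcast_shapes(pred_shape, (1, 2, 1, 1))
print(plain_ok, reshaped)
```

This is why the reshape to (1, -1, 1, 1) is needed: it pins the two values to axis 1 instead of the last axis.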
I am trying to replace/overwrite values in a array using the following commands:
import numpy as np
test = np.array([[4,5,0],[0,0,0],[0,0,6]])
test
Out[20]:
array([[4., 5., 0.],
[0., 0., 0.],
[0., 0., 6.]])
test[np.where(test[...,0] != 0)][...,1:3] = np.array([[10,11]])
test
Out[22]:
array([[4., 5., 0.],
[0., 0., 0.],
[0., 0., 6.]])
However, as one can see in Out[22], the array test has not been modified. So I am concluding that it is not possible to simply overwrite a part of an array or just a few cells.
Nevertheless, in other contexts, it is possible to overwrite a few cells of an array. For example, in the code below:
test = np.array([[1,2,0],[0,0,0],[0,0,3]])
test
Out[11]:
array([[1., 2., 0.],
[0., 0., 0.],
[0., 0., 3.]])
test[test>0]
Out[12]:
array([1., 2., 3.])
test[test>0] = np.array([4,5,6])
test
Out[14]:
array([[4., 5., 0.],
[0., 0., 0.],
[0., 0., 6.]])
Therefore, my 2 questions:
1- Why does the first command
test[np.where(test[...,0] != 0)][...,1:3] = np.array([10,11])
not modify the array test? Why does it not allow accessing the array cells and overwriting them?
2- How could I make it work considering that for my code I would need to select the cells using the command above?
Many thanks!
I'll do you one up. This does work:
test[...,1:3][np.where(test[...,0] != 0)] = np.array([[10,11]])
array([[ 4, 10, 11],
[ 0, 0, 0],
[ 0, 0, 6]])
Why? It's the combination of two factors - numpy indexing and .__setitem__ calls.
Python resolves the left-hand side of an assignment from left to right: every subscript except the last one is evaluated with .__getitem__, and the final subscript together with the = becomes a single .__setitem__ call on whatever object the preceding expression produced. __setitem__ is a method of that object and takes two arguments: the index (whatever is between the last [...]) and the value being assigned.
a[b] = c  # is interpreted as
a.__setitem__(b, c)
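To see which calls a chained assignment actually makes, here is a small sketch with a toy class. Logged is a hypothetical helper written for this illustration, not part of numpy:

```python
class Logged:
    """Toy container that records which special methods get called."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append((self.name, "getitem", key))
        # Return a new object, the way an indexing expression returns a result.
        return Logged(self.name + "[...]", self.log)

    def __setitem__(self, key, value):
        self.log.append((self.name, "setitem", key, value))


log = []
a = Logged("a", log)
a["b"]["c"] = 1   # chained indexing followed by assignment
print(log)
```

The log shows one __getitem__ on a, then one __setitem__ on the object that __getitem__ returned. If that returned object is a copy, a itself never sees the assignment.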
Now, when we index in numpy we have three basic ways we can do it.
slicing (returns views)
'advanced indexing' with integer or boolean arrays (returns copies)
'simple indexing' with plain integers (selecting a single element returns a scalar, which is also a copy)
The difference that matters here is between views and copies. A view shares its data with the original array, so writing into a view writes into the original; a copy does not. A numpy array's __setitem__ accepts all of these index types, including advanced indexes, and always writes into the array it is called on.
So:
test[np.where(test[...,0] != 0)][...,1:3] = np.array([[10,11]])  # is interpreted as
(test[np.where(test[...,0] != 0)]).__setitem__((Ellipsis, slice(1, 3)),
                                               np.array([[10,11]]))
But, since np.where(test[...,0] != 0) is an advanced index, (test[np.where(test[...,0] != 0)]) returns a copy, which is discarded because it is never bound to a name. The assignment does set the selected elements of that copy to [10,11], but test itself is never touched.
If we do:
test[..., 1:3][np.where(test[..., 0] != 0)] = np.array([[10, 11]])  # is interpreted as
(test[..., 1:3]).__setitem__(np.where(test[..., 0] != 0), np.array([[10, 11]]))
test[...,1:3] is a view, so it still points to the same memory. Now __setitem__ looks for the locations in test[...,1:3] that correspond to np.where(test[...,0] != 0), and sets them equal to np.array([[10,11]]). And everything works.
You can also do this:
test[np.where(test[...,0] != 0), 1:3] = np.array([10, 11])
Now, since all the indexing is in one set of brackets, it's calling test.__setitem__ on those indices, which sets the data correctly as well.
Even simpler (and most pythonic) would be:
test[test[...,0] != 0, 1:3] = np.array([10,11])
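The view/copy distinction can be checked directly with np.shares_memory. A short sketch using the array from the question:

```python
import numpy as np

test = np.array([[4, 5, 0], [0, 0, 0], [0, 0, 6]])
rows = np.where(test[..., 0] != 0)        # advanced index: (array([0]),)

# A slice shares memory with the original; an advanced index does not.
slice_is_view = np.shares_memory(test, test[..., 1:3])
fancy_is_view = np.shares_memory(test, test[rows])

# Chained form: writes into the temporary copy, so `test` is untouched.
before = test.copy()
test[rows][..., 1:3] = [10, 11]
unchanged = np.array_equal(test, before)

# Single-bracket form: one __setitem__ call on `test` itself.
test[test[..., 0] != 0, 1:3] = [10, 11]
print(slice_is_view, fancy_is_view, unchanged)
print(test)
```

Only the last assignment changes test, because it is the only one whose __setitem__ is called on test (or on a view of it).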
I want to create a numpy array b where each component is a 2D matrix, which dimensions are determined by the coordinates of vector a.
What I get doing the following satisfies me:
>>> a = [3,4,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> np.array(b)
array([ array([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.]]),
array([[ 0., 0., 0., 0., 0.]])], dtype=object)
but I have found this pathological case where it does not work:
>>> a = [2,1,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> b
[array([[ 0., 0., 0.]]), array([[ 0., 0.]])]
>>> np.array(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (3) into shape (1)
I will present a solution to the problem, but do take into account what was said in the comments. Having Numpy arrays that are not aligned prevents most of the useful operations from working their magic. Consider using lists instead.
That being said, curious error indeed. I got the thing to work by assigning in a basic for-loop instead of using the np.array call.
a = [2,1,1]
b = np.zeros(len(a)-1, dtype=object)
for i in range(1, len(a)):
b[i-1] = np.zeros((a[i], a[i - 1] + 1))
And the result:
>>> b
array([array([[0., 0., 0.]]), array([[0., 0.]])], dtype=object)
This is a bit peculiar. Typically, numpy will try to create one array with a common data type from the input of np.array. A list of arrays is interpreted with the list as the new leading dimension. For instance, np.array([np.zeros((3, 1)), np.zeros((3, 1))]) would produce a 2 x 3 x 1 array. This can only happen if the arrays in your list match in shape. Otherwise, you end up with an array of arrays (with dtype=object), which as commented, is not really an ideal scenario. (Recent NumPy releases no longer do this silently and raise an error for mismatched shapes unless dtype=object is passed explicitly.)
However, your error seems to occur when the first dimension matches. Numpy for some reason tries to broadcast the arrays somehow and fails. I can reproduce your error even if the arrays are of higher dimension, as long as the first dimension between arrays matches.
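Both behaviours can be seen side by side in a small sketch. The variable names are illustrative only, and the second half simply repeats the preallocate-and-fill approach on the shapes from the question:

```python
import numpy as np

same = [np.zeros((3, 1)), np.zeros((3, 1))]
stacked = np.array(same)          # shapes agree, so the list becomes a new axis
print(stacked.shape, stacked.dtype)

# For mismatched shapes, filling a preallocated object array avoids
# asking np.array to guess the layout.
ragged = [np.zeros((1, 3)), np.zeros((1, 2))]
obj = np.empty(len(ragged), dtype=object)
for i, arr in enumerate(ragged):
    obj[i] = arr
print(obj.shape, obj[0].shape, obj[1].shape)
```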
I know this isn't a solution, but this wouldn't fit in a comment. As noted by @roganjosh, making this kind of array really gives you no benefit. You're better off sticking to a list of arrays for readability and to avoid the cost of creating these arrays.
I want to create a 2D numpy.array knowing at the beginning only its shape, i.e. shape=2. Now, I want to create one-dimensional numpy.arrays in a for loop and add them to the main matrix of shape=2, so I'll get something like this:
matrix=
[numpy.array 1]
[numpy.array 2]
...
[numpy.array n]
How can I achieve that? I try to use:
matrix = np.empty(shape=2)
for i in np.arange(100):
    array = np.zeros(random_value)
    matrix = np.append(matrix, array)
But as a result of print(np.shape(matrix)), after loop, I get something like:
(some_number, )
How can I append each new array in the next row of the matrix? Thank you in advance.
I would suggest working with list
matrix = []
for i in range(10):
    a = np.ones(2)
    matrix.append(a)
matrix = np.array(matrix)
A list does not have the downside of being copied in memory every time you use append, so you avoid the problem described by ali_m. At the end of your operation you just convert the list object into a numpy array.
I suspect the root of your problem is the meaning of 'shape' in np.empty(shape=2)
If I run a small version of your code
matrix = np.empty(shape=2)
for i in np.arange(3):
    array = np.zeros(3)
    matrix = np.append(matrix, array)
I get
array([ 9.57895902e-259, 1.51798693e-314, 0.00000000e+000,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
0.00000000e+000, 0.00000000e+000])
See those 2 odd numbers at the start? Those are produced by np.empty(shape=2). That matrix starts as a (2,) shaped array, not an empty 2d array. append just adds sets of 3 zeros to that, resulting in a (11,) array.
Now if you started with a 2d array with the right number of columns, and did concatenate on the 1st dimension, you would get a multirow array. (Rows only have meaning in 2d or larger.)
mat=np.zeros((1,3))
for i in range(1,3):
    mat = np.concatenate([mat, np.ones((1,3))*i],axis=0)
produces:
array([[ 0., 0., 0.],
[ 1., 1., 1.],
[ 2., 2., 2.]])
A better way of doing an iterative construction like this is with list append
alist = []
for i in range(0,3):
    alist.append(np.ones((1,3))*i)
mat=np.vstack(alist)
alist is:
[array([[ 0., 0., 0.]]), array([[ 1., 1., 1.]]), array([[ 2., 2., 2.]])]
mat is
array([[ 0., 0., 0.],
[ 1., 1., 1.],
[ 2., 2., 2.]])
With vstack you can get by with np.ones((3,)), since it turns all of its inputs into 2d arrays.
append would work, but it also requires axis=0 parameter, and 2 arrays. It gets misused, often by mistaken analogy to the list append. It is just another front end to concatenate. So I prefer not to use it.
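For comparison, here is a minimal sketch of what np.append does with and without axis=0. The names mat and row are illustrative:

```python
import numpy as np

mat = np.zeros((1, 3))
row = np.ones(3)

# Without axis, np.append flattens both inputs into one 1d array.
flat = np.append(mat, row)

# With axis=0 both inputs must be 2d with matching column counts,
# so the 1d row needs an explicit leading axis.
grown = np.append(mat, row[np.newaxis, :], axis=0)
print(flat.shape, grown.shape)
```

Like concatenate, each call builds a new array, so the list-then-vstack pattern is usually the cheaper choice inside a loop.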
Notice that other posters assumed your random value changed during the iteration. That would produce arrays of differing lengths. For 1d appending that would still produce the long 1d array. But a 2d append wouldn't work, because a 2d array can't be ragged.
mat = np.zeros((2,),int)
for i in range(4):
    mat=np.append(mat,np.ones((i,),int)*i)
# array([0, 0, 1, 2, 2, 3, 3, 3])
The function you are looking for is np.vstack
Here is a modified version of your example
import numpy as np
# Note: np.empty(shape=2) is an uninitialized (2,) array; it becomes the first row.
matrix = np.empty(shape=2)
for i in np.arange(3):
    array = np.zeros(2)
    matrix = np.vstack((matrix, array))
The result is
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
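If the extra leading row is not wanted, one option is to start from an array with zero rows. This is a sketch of that variant, not the original poster's code; the row values are made up so the rows can be told apart:

```python
import numpy as np

matrix = np.empty((0, 2))         # zero rows, two columns: nothing uninitialized
for i in range(3):
    row = np.full(2, float(i))
    matrix = np.vstack((matrix, row))
print(matrix)
```

The result has exactly one row per loop iteration, shape (3, 2).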
I need to calculate n number of points(3D) with equal spacing along a defined line(3D).
I know the starting and end point of the line. First, I used
for k in range(nbin):
    step = k/float(nbin-1)
    bin_point.append(beam_entry+(step*(beamlet_intersection-beam_entry)))
Then I found that using append for large arrays takes more time, so I changed the code like this:
bin_point = [start_point+((k/float(nbin-1))*(end_point-start_point)) for k in range(nbin)]
I got a suggestion that using newaxis will further improve the time.
The modified code looks like this.
step = arange(nbin) / float(nbin-1)
bin_point = start_point + ( step[:,newaxis,newaxis]*((end_point - start_point))[newaxis,:,:] )
But I could not understand the newaxis function. I also doubt whether the same code will work if the structure or the shape of start_point and end_point is changed. Similarly, how can I use newaxis to modify the following code?
for j in range(32): # for all los
    line_dist[j] = sqrt([sum(l) for l in (end_point[j]-start_point[j])**2])
Sorry for being so clunky. To be clearer, the structure of start_point and end_point is:
array([ [[1,1,1],[],[],[]....[]],
[[],[],[],[]....[]],
[[],[],[],[]....[]]......,
[[],[],[],[]....[]] ])
Explanation of the newaxis version in the question: these are not matrix multiplies, ndarray multiply is element-by-element multiply with broadcasting. step[:,newaxis,newaxis] is num_steps x 1 x 1 and point[newaxis,:,:] is 1 x num_points x num_dimensions. Broadcasting together ndarrays with shape (num_steps x 1 x 1) and (1 x num_points x num_dimensions) will work, because the broadcasting rules are that every dimension should be either 1 or the same; it just means "repeat the array with dimension 1 as many times as the corresponding dimension of the other array". This results in an ndarray with shape (num_steps x num_points x num_dimensions) in a very efficient way; the i, j, k subscript will be the k-th coordinate of the i-th step along the j-th line (given by the j-th pair of start and end points).
Walkthrough:
>>> start_points = numpy.array([[1, 0, 0], [0, 1, 0]])
>>> end_points = numpy.array([[10, 0, 0], [0, 10, 0]])
>>> steps = numpy.arange(10)/9.0
>>> start_points.shape
(2, 3)
>>> steps.shape
(10,)
>>> steps[:,numpy.newaxis,numpy.newaxis].shape
(10, 1, 1)
>>> (steps[:,numpy.newaxis,numpy.newaxis] * start_points).shape
(10, 2, 3)
>>> (steps[:,numpy.newaxis,numpy.newaxis] * (end_points - start_points)) + start_points
array([[[ 1., 0., 0.],
[ 0., 1., 0.]],
[[ 2., 0., 0.],
[ 0., 2., 0.]],
[[ 3., 0., 0.],
[ 0., 3., 0.]],
[[ 4., 0., 0.],
[ 0., 4., 0.]],
[[ 5., 0., 0.],
[ 0., 5., 0.]],
[[ 6., 0., 0.],
[ 0., 6., 0.]],
[[ 7., 0., 0.],
[ 0., 7., 0.]],
[[ 8., 0., 0.],
[ 0., 8., 0.]],
[[ 9., 0., 0.],
[ 0., 9., 0.]],
[[ 10., 0., 0.],
[ 0., 10., 0.]]])
As you can see, this produces the correct answer :) In this case broadcasting (10,1,1) and (2,3) results in (10,2,3). What you had is broadcasting (10,1,1) and (1,2,3) which is exactly the same and also produces (10,2,3).
The code for the distance part of the question does not need newaxis: the inputs are num_points x num_dimensions, the output is num_points, so one dimension has to be removed. That is actually the axis you sum along. This should work:
line_dist = numpy.sqrt(numpy.sum((end_point - start_point) ** 2, axis=1))
Here numpy.sum(..., axis=1) means sum along that axis only, rather than over all elements: an ndarray with shape num_points x num_dimensions summed along axis=1 produces a result with shape (num_points,), which is correct.
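A quick check of the distance formula on made-up points whose lengths are known in advance. The data is hypothetical, and axis=-1 is used here as an assumption so the same line also works when the inputs have extra leading axes, as the question's 3D structure suggests:

```python
import numpy as np

# Hypothetical data: 2 lines of sight, each with a 3D start and end point.
start_point = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
end_point = np.array([[3.0, 4.0, 0.0], [1.0, 1.0, 3.0]])

diff = end_point - start_point
# axis=-1 always sums over the coordinate axis, whatever comes before it.
line_dist = np.sqrt(np.sum(diff ** 2, axis=-1))
# np.linalg.norm with the same axis computes the same Euclidean length.
same = np.allclose(line_dist, np.linalg.norm(diff, axis=-1))
print(line_dist, same)
```

For 2D inputs axis=-1 and axis=1 refer to the same axis, so this matches the formula above.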
EDIT: removed code example without broadcasting.
EDIT: fixed up order of indexes.
EDIT: added line_dist
I'm not through understanding all you wrote, but some things I already can tell you; maybe they help.
newaxis is rather a marker than a function (in fact, it is plain None). It is used to add an (unused) dimension to a multi-dimensional value. With it you can make a 3D value out of a 2D value (or even more). Each dimension already there in the input value must be represented by a colon : in the index (assuming you want to use all values, otherwise it gets complicated beyond our usecase), the dimensions to be added are denoted by newaxis.
Example:
input is a one-dimensional vector (1D): 1,2,3
output shall be a matrix (2D).
There are two ways to accomplish this; the vector could fill the lines with one value each, or the vector could fill just the first and only line of the matrix. The first is created by vector[:,newaxis], the second by vector[newaxis,:]. Results of this:
>>> array([ 7,8,9 ])[:,newaxis]
array([[7],
[8],
[9]])
>>> array([ 7,8,9 ])[newaxis,:]
array([[7, 8, 9]])
(Dimensions of multi-dimensional values are represented by nesting of arrays of course.)
If you have more dimensions in the input, use the colon more than once (otherwise the deeper nested dimensions are simply ignored, i.e. the arrays are treated as simple values). I won't paste a representation of this here as it won't clarify things due to the optical complexity when 3D and 4D values are written on a 2D display using nested brackets. I hope it gets clear anyway.
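Printing shapes instead of the nested brackets keeps the higher-dimensional case readable. A small sketch with an arbitrary 2D input:

```python
import numpy as np

print(np.newaxis is None)

m = np.arange(6).reshape(2, 3)            # 2D input, shape (2, 3)
front = m[np.newaxis, :, :]               # new axis first
middle = m[:, np.newaxis, :]              # new axis between the existing two
back = m[:, :, np.newaxis]                # new axis last
print(front.shape, middle.shape, back.shape)
```

Each colon keeps one existing dimension, and each newaxis inserts a dimension of length 1 at that position.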
The newaxis reshapes the array in such a way so that when you multiply numpy uses broadcasting. Here is a good tutorial on broadcasting.
step[:, newaxis, newaxis] is the same as step.reshape((step.shape[0], 1, 1)) (if step is 1d). Either method for reshaping should be very fast, because reshaping arrays in numpy is very cheap: it just makes a view of the array, and you should only need to do it once.
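The equivalence, and the fact that both forms are views, can be verified in a few lines. The step values below are arbitrary:

```python
import numpy as np

step = np.arange(5) / 4.0
via_newaxis = step[:, np.newaxis, np.newaxis]
via_reshape = step.reshape((step.shape[0], 1, 1))

same_values = np.array_equal(via_newaxis, via_reshape)
# Both results are views: they share memory with `step`.
shares = np.shares_memory(step, via_newaxis) and np.shares_memory(step, via_reshape)
print(via_newaxis.shape, same_values, shares)
```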