I am trying to merge a sliced array with a list in Python, but I get an
error: ValueError: operands could not be broadcast together with shapes `(4,)` `(2,)`.
This is my code:
y = np.array([5,3,2,4,6,1])
row = y[2:6] + np.array([0,0])
I am expecting to get a 2-item shifted vector to the left and last 2 items being assigned to 0.
A NumPy array behaves somewhat like a matrix. When you apply the addition operator to a NumPy array, you are performing element-wise addition. That is why the value you add must either have the same shape as the array or be a value that can be broadcast to that shape.
Compare the following examples to see the difference.
Adding two lists with addition sign:
>>> [1,2] + [3,4]
[1, 2, 3, 4]
Adding two numpy arrays:
>>> np.array([1,2]) + np.array([3,4])
array([4, 6])
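A minimal sketch of the shape rule, using a made-up array `a` rather than the asker's data: equal shapes add element-wise, a scalar is broadcast to every element, and mismatched shapes such as `(4,)` and `(2,)` raise the ValueError from the question.

```python
import numpy as np

a = np.array([2, 4, 6, 1])

# Same shape: element-wise addition.
same = a + np.array([1, 1, 1, 1])

# A scalar is broadcast to every element.
scalar = a + 10

# Shapes (4,) and (2,) cannot be broadcast together.
try:
    a + np.array([0, 0])
    failed = False
except ValueError:
    failed = True

print(same, scalar, failed)
```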
To get your work done, use the np.append(arr, values, axis=None) function:
>>> np.append([1,2], np.array([3,4]))
array([1, 2, 3, 4])
To concatenate arrays use np.concatenate:
In [93]: y = np.array([5,3,2,4,6,1])
In [94]: y[2:6]
Out[94]: array([2, 4, 6, 1])
In [95]: np.concatenate((y[2:6], np.array([0,0])))
Out[95]: array([2, 4, 6, 1, 0, 0])
+ concatenates lists. For arrays it performs addition (an element-wise numeric sum).
Your question uses "list" and "array" interchangeably. They are different things (in Python/NumPy), and mixing them up can produce confusing answers.
Other answers already explain why your code fails. You can do:
out = np.zeros_like(y)
out[:-2] = y[2:]
Output:
array([2, 4, 6, 1, 0, 0])
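The same idea can be wrapped in a small function. This is only a sketch: `shift_left` is a hypothetical helper name, not a NumPy function, and it assumes a 1-D input and a non-negative shift.

```python
import numpy as np

def shift_left(arr, k):
    """Return a copy of 1-D arr shifted left by k positions, padded with zeros.

    Hypothetical helper for illustration; assumes k >= 0.
    """
    out = np.zeros_like(arr)
    if k < len(arr):
        # Copy the tail of arr into the front of out; the rest stays zero.
        out[:len(arr) - k] = arr[k:]
    return out

y = np.array([5, 3, 2, 4, 6, 1])
print(shift_left(y, 2))
```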
For concatenation with +, both operands must be lists, so convert the NumPy arrays to lists first (the result is then a list, not an array).
row = list(y[2:6]) + list(np.array([0,0]))
or equivalently
row = y[2:6].tolist() + np.array([0,0]).tolist()
However, if you wish to add the two (superpose a list and numpy array), then the numpy array just needs to be the same shape as y[2:6]:
In : y[2:6] + np.array([1, 2, 3, 4])
Out: array([y[2] + 1, y[3] + 2, y[4] + 3, y[5] + 4])
How do I properly concatenate two numpy vectors without flattening the result? This is really obvious with append, but it gets shamefully messy when turning to numpy.
I've tried concatenate (with and without an explicit axis), hstack, and vstack. None gave the result I want.
In [1]: a
Out[1]: array([1, 2, 3])
In [2]: b
Out[2]: array([6, 7, 8])
In [3]: c = np.concatenate((a,b),axis=0)
In [4]: c
Out[4]: array([1, 2, 3, 6, 7, 8])
Note that the code above works indeed if a and b are lists instead of numpy arrays.
The output I want:
Out[4]: array([[1, 2, 3], [6, 7, 8]])
EDIT
vstack works indeed for a and b as in above. It does not in my real life case, where I want to iteratively fill an empty array with vectors of some dimension.
hist=[]
for i in range(len(filenames)):
    fileload = np.load(filenames[i])
    maxarray.append(fileload['maxamp'])
    hist_t, bins_t = np.histogram(maxarray[i], bins=np.arange(0,4097,4))
    hist = np.vstack((hist,hist_t))
SOLUTION:
I found the solution: you have to properly initialize the array e.g.: How to add a new row to an empty numpy array
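A rough sketch of what "properly initialize" could look like for the loop above. The `fake_data` list is a made-up stand-in for the `maxamp` arrays loaded from files, so the names and values here are assumptions, not the asker's data. The key point is that the starting array needs zero rows and the same number of columns as each histogram; collecting the rows in a list and stacking once is a common alternative.

```python
import numpy as np

bins = np.arange(0, 4097, 4)   # same bin edges as in the question
n_bins = len(bins) - 1         # np.histogram returns len(bins) - 1 counts

# Hypothetical stand-in for the per-file 'maxamp' arrays.
fake_data = [np.array([1, 5, 9]), np.array([4000, 4001]), np.array([10, 10])]

# Option 1: start from an array with 0 rows and n_bins columns.
hist = np.empty((0, n_bins), dtype=int)
for data in fake_data:
    hist_t, _ = np.histogram(data, bins=bins)
    hist = np.vstack((hist, hist_t))

# Option 2: collect the rows in a list and stack once at the end.
rows = [np.histogram(data, bins=bins)[0] for data in fake_data]
hist_once = np.vstack(rows)

print(hist.shape, hist_once.shape)
```

Option 2 avoids reallocating the growing array on every iteration, which tends to matter when there are many files.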
For np.concatenate to work here the input arrays should have two dimensions, since you want the vectors stacked as rows of a 2D result and the input arrays only have 1 dimension.
You can use np.vstack here, which as explained in the docs:
It is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N)
a = np.array([1, 2, 3])
b = np.array([6, 7, 8])
np.vstack([a, b])
array([[1, 2, 3],
       [6, 7, 8]])
For some reason I cannot resolve this.
According to the example here for 1-dim array,
https://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html
x = np.array([3, 1, 2])
np.argsort(x)
array([1, 2, 0])
And I have tried this myself. But by default, the returned result should be in ascending order, meaning
x[result]
returns
array([1, 2, 3])
Thus shouldn't the result be [2,0,1]?
What am I missing here?
From the docs, the first line states "Returns the indices that would sort an array." Hence if you want the positions of the sorted values we have:
x = np.array([3, 1, 2])
np.argsort(x)
>>>array([1, 2, 0])
Here we want the index positions of 1, 2 and 3 in x. The position of 3 is 0, the position of 1 is 1, and the position of 2 is 2. Listing those positions in sorted order of the values (1, then 2, then 3) gives array([1, 2, 0]).
Again from the notes, " argsort returns an array of indices of the same shape as a that index data along the given axis in sorted order."
A more intuitive way of looking at what that means is to use a for loop, where we loop over our returned argsort values, and then index the initial array with these values:
x = np.array([3, 1, 2])
srt_positions = np.argsort(x)
for k in srt_positions:
    print(x[k])
# prints 1, 2, 3 (one value per line)
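The `[2, 0, 1]` that the question expected is the rank of each element, which is a different quantity from what argsort returns. A short sketch of both, with `order` and `ranks` as illustrative variable names:

```python
import numpy as np

x = np.array([3, 1, 2])

order = np.argsort(x)       # indices that would sort x
sorted_x = x[order]         # indexing with them yields the sorted values
ranks = np.argsort(order)   # position each element takes in the sorted array

print(order, sorted_x, ranks)
```

Applying argsort twice is one common way to get ranks when there are no ties.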
I found an interesting thing when comparing MATLAB and numpy.
MATLAB:
x = [1, 2]
n = size(x, 2)
% n = 2
Python:
x = np.array([1, 2])
n = x.shape[1]
# error
The question is: how do I handle input that may be either an ndarray with shape (n,) or an ndarray with shape (n, m)?
e.g.
def my_summation(X):
    """
    X : ndarray
        each column of X is an observation.
    """
    # my solution for ndarray shape (n,)
    # if X.ndim == 1:
    #     X = X.reshape((-1, 1))
    num_of_sample = X.shape[1]
    total = np.zeros(X.shape[0])
    for i in range(num_of_sample):
        total = total + X[:, i]
    return total
a = np.array([[1, 2], [3, 4]])
b = np.array([1, 2])
print(my_summation(a))
print(my_summation(b))  # raises IndexError unless the reshape above is enabled
My solution is forcing ndarray shape (n,) to be shape (n, 1).
The summation is used as an example. What I want is to find an elegant way to handle the possibility of matrix with only one observation(vector) and matrix with more than one observation using ndarray.
Does anyone have better solutions?
I recently learned about numpy.atleast_2d from the Python control
toolbox. You also don't need a for-loop for summation, rather use
numpy.sum.
import numpy as np
def my_summation(X):
    """
    X : ndarray
        each column of X is an observation.
    """
    # my solution for ndarray shape (n,)
    # if X.ndim == 1:
    #     X = X.reshape((-1, 1))
    X = np.atleast_2d(X)
    return np.sum(X, axis=1)
a = np.array([[1, 2], [3, 4]])
b = np.array([1, 2])
print(my_summation(a))
print(my_summation(b))
gives
[3 7]
[3]
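Note that np.atleast_2d turns a 1-D array of shape (n,) into a row of shape (1, n), which is why the second result is [3]. If a 1-D input is instead meant to be a single observation (one column, as in the question's own reshape), a sketch along these lines may be closer to the intent. The function name `sum_observations` is made up for illustration.

```python
import numpy as np

def sum_observations(X):
    """Sum over observations, where each column of X is one observation.

    Assumption: a 1-D input of shape (n,) is a single observation,
    so it is treated as a column of shape (n, 1).
    """
    X = np.asarray(X)
    if X.ndim == 1:
        X = X[:, np.newaxis]   # (n,) -> (n, 1)
    return X.sum(axis=1)

a = np.array([[1, 2], [3, 4]])
b = np.array([1, 2])
row_shape = np.atleast_2d(b).shape   # atleast_2d prepends the new axis

print(sum_observations(a), sum_observations(b), row_shape)
```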
In an ndarray X, len(X) is the number of elements along the first axis. So, for a 2D array, it is the number of rows, and for a 1D array, it is the number of elements in the array itself. This property can be used to reshape an input that may be 1D or 2D into a 2D output. For a 1D input, the output 2D array has as many rows as the input has elements. For a 2D input, the number of rows stays the same, so nothing changes.
To sum up, one solution would be to put a reshaping code at the top of the function definition, like so -
X = X.reshape(len(X),-1)
Sample runs -
2D Case:
In [50]: X
Out[50]:
array([[6, 7, 8, 1],
[6, 2, 3, 0],
[5, 1, 8, 6]])
In [51]: X.reshape(len(X),-1)
Out[51]:
array([[6, 7, 8, 1],
[6, 2, 3, 0],
[5, 1, 8, 6]])
1D Case:
In [53]: X
Out[53]: array([2, 5, 2])
In [54]: X.reshape(len(X),-1)
Out[54]:
array([[2],
[5],
[2]])
Can someone explain exactly what the axis parameter in NumPy does?
I am terribly confused.
I'm trying to use the function myArray.sum(axis=num)
At first I thought if the array is itself 3 dimensions, axis=0 will return three elements, consisting of the sum of all nested items in that same position. If each dimension contained five dimensions, I expected axis=1 to return a result of five items, and so on.
However this is not the case, and the documentation does not do a good job helping me out (they use a 3x3x3 array so it's hard to tell what's happening)
Here's what I did:
>>> e
array([[[1, 0],
        [0, 0]],
       [[1, 1],
        [1, 0]],
       [[1, 0],
        [0, 1]]])
>>> e.sum(axis = 0)
array([[3, 1],
       [1, 1]])
>>> e.sum(axis=1)
array([[1, 0],
       [2, 1],
       [1, 1]])
>>> e.sum(axis=2)
array([[1, 0],
       [2, 1],
       [1, 1]])
>>>
Clearly the result is not intuitive.
Clearly,
e.shape == (3, 2, 2)
Sum over an axis is a reduction operation so the specified axis disappears. Hence,
e.sum(axis=0).shape == (2, 2)
e.sum(axis=1).shape == (3, 2)
e.sum(axis=2).shape == (3, 2)
Intuitively, we are "squashing" the array along the chosen axis, and summing the numbers that get squashed together.
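A small sketch of the "squashing" idea using the array `e` from the question. It collects the result shape for each axis, shows that `keepdims=True` keeps the reduced axis with length 1, and checks that summing over axis 0 is the same as adding the three 2x2 blocks.

```python
import numpy as np

e = np.array([[[1, 0], [0, 0]],
              [[1, 1], [1, 0]],
              [[1, 0], [0, 1]]])

# Shape of the result for each choice of axis: the chosen axis disappears.
shapes = {axis: e.sum(axis=axis).shape for axis in range(e.ndim)}

# keepdims=True keeps the reduced axis, with length 1.
kept = e.sum(axis=1, keepdims=True).shape

# Summing over axis 0 adds the three 2x2 blocks element-wise.
manual_axis0 = e[0] + e[1] + e[2]

print(shapes, kept)
print(manual_axis0)
```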
To understand the axis intuitively, refer to the picture below (source: Physics Dept, Cornell Uni).
The shape of the (boolean) array in the above figure is shape=(8, 3). ndarray.shape returns a tuple whose entries are the lengths of the corresponding dimensions. In our example, 8 is the length of axis 0 and 3 is the length of axis 1.
There are good answers with visualizations; however, it might help to think about this from a purely analytical perspective.
You can create an array of arbitrary dimension with numpy.
For example, here's a 5-dimensional array:
>>> a = np.random.rand(2, 3, 4, 5, 6)
>>> a.shape
(2, 3, 4, 5, 6)
You can access any element of this array by specifying indices. For example, here's the first element of this array:
>>> a[0, 0, 0, 0, 0]
0.0038908603263844155
Now if you replace one of the indices with a slice, you get all the elements along that dimension:
>>> a[0, 0, :, 0, 0]
array([0.00389086, 0.27394775, 0.26565889, 0.62125279])
When you apply a function like sum with the axis parameter, that dimension is eliminated and an array with one dimension fewer than the original is created. For each cell in the new array, the operator takes the list of elements along that axis and applies the reduction function to get a scalar.
>>> np.sum(a, axis=2).shape
(2, 3, 5, 6)
Now you can check that the first element of this array is sum of above elements:
>>> np.sum(a, axis=2)[0, 0, 0, 0]
1.1647502999560164
>>> a[0, 0, :, 0, 0].sum()
1.1647502999560164
axis=None has the special meaning of flattening the array and applying the function to all the numbers.
Now you can think about more complex cases where axis is not just a number but a tuple:
>>> np.sum(a, axis=(2,3)).shape
(2, 3, 6)
Note that we can use the same technique to figure out how this reduction was done:
>>> np.sum(a, axis=(2,3))[0,0,0]
7.889432081931909
>>> a[0, 0, :, :, 0].sum()
7.88943208193191
You can also use the same reasoning for adding a dimension to an array instead of reducing one:
>>> x = np.random.rand(3, 4)
>>> y = np.random.rand(3, 4)
# New dimension is created on specified axis
>>> np.stack([x, y], axis=2).shape
(3, 4, 2)
>>> np.stack([x, y], axis=0).shape
(2, 3, 4)
# To retrieve item i in stack set i in that axis
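As a sketch of that last comment: indexing the newly created axis gives back the corresponding input array. The seeded generator below is only there to make the example repeatable and is an assumption, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((3, 4))
y = rng.random((3, 4))

s2 = np.stack([x, y], axis=2)   # shape (3, 4, 2)
s0 = np.stack([x, y], axis=0)   # shape (2, 3, 4)

# Index the new axis to get an input back.
x_from_s2 = s2[:, :, 0]
y_from_s0 = s0[1]

print(s2.shape, s0.shape)
```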
Hope this gives you generic and full understanding of this important parameter.
Some answers are too specific or do not address the main source of confusion. This answer attempts to provide a more general but simple explanation of the concept, with a simple example.
The main source of confusion is related to expressions such as "Axis along which the means are computed", which is the documentation of the argument axis of the numpy.mean function. What the heck does "along which" even mean here? "Along which" essentially means that you sum the rows (and divide by the number of rows, given that we are computing the mean) if the axis is 0, and the columns if the axis is 1. When the axis is 0 (or 1), the rows (or columns) can be scalars, vectors, or even other multi-dimensional arrays.
In [1]: import numpy as np
In [2]: a=np.array([[1, 2], [3, 4]])
In [3]: a
Out[3]:
array([[1, 2],
[3, 4]])
In [4]: np.mean(a, axis=0)
Out[4]: array([2., 3.])
In [5]: np.mean(a, axis=1)
Out[5]: array([1.5, 3.5])
So, in the example above, np.mean(a, axis=0) returns array([2., 3.]) because (1 + 3)/2 = 2 and (2 + 4)/2 = 3. It returns an array of two numbers because it returns the mean of the rows for each column (and there are two columns).
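The same arithmetic can be written out by hand as a quick check. This is just a sketch that recomputes the means from sums and the length of the reduced axis.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# axis=0: combine the rows, giving one result per column.
mean0 = a.sum(axis=0) / a.shape[0]

# axis=1: combine the columns, giving one result per row.
mean1 = a.sum(axis=1) / a.shape[1]

print(mean0, mean1)
```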
Both the 1st and 2nd replies are great for understanding the ndarray concept in numpy. Here is a simple example,
based on this image by @debaonline4u:
https://i.stack.imgur.com/O5hBF.jpg
Suppose you have a 2D array:
[1, 2, 3]
[4, 5, 6]
In numpy format it will be:
c = np.array([[1, 2, 3],
[4, 5, 6]])
Now,
c.ndim = 2 (two axes: axis 0 and axis 1)
c.shape = (2,3) (axis0, axis1)
c.sum(axis=0) = [1+4, 2+5, 3+6] = [5, 7, 9] (sum of the elements at the same position in each row, so along axis0)
c.sum(axis=1) = [1+2+3, 4+5+6] = [6, 15] (sum of the elements in a row, so along axis1)
So for your 3D array,
I want to compare the elements of an array to a scalar and get an array with the maximum of the compared values. That is, I want to call
import numpy as np
np.max([1,2,3,4], 3)
and want to get
array([3,3,3,4])
But I get
ValueError: 'axis' entry is out of bounds
When I run
np.max([[1,2,3,4], 3])
I get
[1, 2, 3, 4]
which is one of the two elements in the list and not the result I am looking for. Is there a NumPy solution for this that is as fast as the other built-in functions?
This is already built into numpy with the function np.maximum:
a = np.arange(1,5)
n = 3
np.maximum(a, n)
#array([3, 3, 3, 4])
This doesn't mutate a:
a
#array([1, 2, 3, 4])
If you want to mutate the original array as in @jamylak's answer, you can give a as the output:
np.maximum(a, n, a)
#array([3, 3, 3, 4])
a
#array([3, 3, 3, 4])
Docs:
maximum(x1, x2[, out])
Element-wise maximum of array elements.
Equivalent to np.where(x1 > x2, x1, x2) but faster and does proper broadcasting.
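For comparison, here is a short sketch of the equivalence mentioned in the docs, plus np.clip with only a lower bound, which gives the same result for this case. None of these modify `a`.

```python
import numpy as np

a = np.arange(1, 5)
n = 3

via_maximum = np.maximum(a, n)      # element-wise maximum, n is broadcast
via_where = np.where(a > n, a, n)   # the equivalent np.where form
via_clip = np.clip(a, n, None)      # clip with a lower bound only

print(via_maximum, via_where, via_clip)
```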
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> n = 3
>>> a[a<n] = n
>>> a
array([3, 3, 3, 4])