How do I properly concatenate two numpy vectors without flattening the result? This is really obvious with append, but it gets shamefully messy when turning to numpy.
I've tried concatenate (with and without an explicit axis), hstack, and vstack, all with no results.
In [1]: a
Out[1]: array([1, 2, 3])
In [2]: b
Out[2]: array([6, 7, 8])
In [3]: c = np.concatenate((a,b),axis=0)
In [4]: c
Out[4]: array([1, 2, 3, 6, 7, 8])
Note that the code above works indeed if a and b are lists instead of numpy arrays.
The output I want:
Out[4]: array([[1, 2, 3], [6, 7, 8]])
EDIT
vstack does work for a and b as above. It does not work in my real-life case, where I want to iteratively fill an empty array with vectors of some dimension.
hist=[]
for i in range(len(filenames)):
    fileload = np.load(filenames[i])
    maxarray.append(fileload['maxamp'])
    hist_t, bins_t = np.histogram(maxarray[i], bins=np.arange(0,4097,4))
    hist = np.vstack((hist,hist_t))
SOLUTION:
I found the solution: you have to properly initialize the array e.g.: How to add a new row to an empty numpy array
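For anyone landing here, this is a minimal sketch of what "properly initialize" can look like. The random arrays are stand-ins I made up for the 'maxamp' data loaded from the files, and the names fake_files, rows and hist2 are hypothetical. The idea is that the starting array needs zero rows but the right number of columns, so vstack has something compatible to stack onto.

```python
import numpy as np

bins = np.arange(0, 4097, 4)      # same bin edges as in the question
n_bins = len(bins) - 1            # np.histogram returns len(bins) - 1 counts

# Stand-in data (assumption): in the question each array comes from np.load(...)['maxamp'].
rng = np.random.default_rng(0)
fake_files = [rng.integers(0, 4096, size=100) for _ in range(3)]

# Option 1: start from an array with zero rows and n_bins columns.
hist = np.empty((0, n_bins), dtype=np.int64)
for data in fake_files:
    hist_t, _ = np.histogram(data, bins=bins)
    hist = np.vstack((hist, hist_t))

# Option 2: collect the rows in a list and stack once at the end.
rows = [np.histogram(data, bins=bins)[0] for data in fake_files]
hist2 = np.vstack(rows)
```

The second option avoids copying the growing array on every iteration, which tends to matter once there are many files.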
For np.concatenate to work here the input arrays should have two dimensions, because you want the result to have a second axis and the input arrays only have one dimension.
You can use np.vstack here, which as explained in the docs:
It is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N)
a = np.array([1, 2, 3])
b = np.array([6, 7, 8])
np.vstack([a, b])
array([[1, 2, 3],
[6, 7, 8]])
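To make the equivalence from the docs concrete, here is a small sketch comparing np.vstack with an explicit reshape followed by np.concatenate, and with np.stack. The variable names are just for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([6, 7, 8])

stacked = np.vstack([a, b])

# Give each vector a leading axis of length 1, i.e. shape (1, 3),
# then join along the first axis.
via_concat = np.concatenate((a[np.newaxis, :], b[np.newaxis, :]), axis=0)

# np.stack joins the inputs along a brand-new axis.
via_stack = np.stack((a, b), axis=0)
```

All three should give the same (2, 3) array.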
Related
I would like to index an array of dimension N using an array of size (N,).
For example, let us consider a case where N is 2.
import numpy as np
foo = np.arange(9).reshape(3,3)
bar = np.array((2,1))
>>> foo
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> bar
array([2, 1])
>>> foo[bar[0], bar[1]]
7
This works fine. However, with this method I would need to write bar[i] N times, which is not a nice solution if N is large.
The following command does not give the result that I need:
>>> foo[bar]
array([[6, 7, 8],
[3, 4, 5]])
What could I do to get the result that I want in a nice and concise way?
I think you can turn bar into a tuple:
foo[tuple(bar)]
# 7
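A short sketch of why the tuple matters: an index array selects along the first axis only, while a tuple supplies one index per axis. The 3-D array cube and the index idx are extra examples of mine, not from the question.

```python
import numpy as np

foo = np.arange(9).reshape(3, 3)
bar = np.array((2, 1))

single = foo[tuple(bar)]   # same as foo[2, 1]
rows = foo[bar]            # index array: picks rows 2 and 1

# The same trick for N = 3 (hypothetical extra example).
cube = np.arange(24).reshape(2, 3, 4)
idx = np.array((1, 2, 3))
value = cube[tuple(idx)]   # same as cube[1, 2, 3]
```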
I have recently started numpy and noticed a peculiar thing.
import numpy as np
a = np.array([[1,2,3], [4,5,9, 8]])
print a.shape, "shape"
print a[1, 0]
The shape, in this case, comes out to be (2L,). However, if I make a homogeneous numpy array as
a = np.array([[1,2,3], [4,5,6]]), then a.shape gives (2L, 3L). I understand that the shape of a non-homogeneous array is difficult to represent as a tuple.
Additionally, print a[1,0] for the non-homogeneous array that I created earlier gives a traceback: IndexError: too many indices for array. Doing the same on the homogeneous array gives back the correct element, 4.
Noticing these two peculiarities, I am curious to know how Python looks at non-homogeneous numpy arrays at a low level.
Thank you in advance.
When the sublists differ in length, np.array falls back to creating an object dtype array:
In [272]: a = np.array([[1,2,3], [4,5,9, 8]])
In [273]: a
Out[273]: array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)
This array is similar to the list we started with. Both store the sublists as pointers. The sublists exist elsewhere in memory.
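One caveat if you try this today: as far as I know, recent NumPy releases (1.24 and later) raise a ValueError for ragged input instead of falling back silently, so the object dtype has to be requested explicitly. A minimal sketch under that assumption:

```python
import numpy as np

alist = [[1, 2, 3], [4, 5, 9, 8]]

# Ask for object dtype explicitly; the array then holds references to the sublists.
a = np.array(alist, dtype=object)

first = a[0]   # a plain Python list, the very same object as alist[0]
```

The result is a 1d array of shape (2,) whose elements are the original list objects, which matches the pointer picture above.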
With equal-length sublists, it can create a 2d array, with integer elements:
In [274]: a2 = np.array([[1,2,3], [4,5,9]])
In [275]: a2
Out[275]:
array([[1, 2, 3],
[4, 5, 9]])
In fact to confirm my claim that the sublists are stored elsewhere in memory, let's try to change one:
In [276]: alist = [[1,2,3], [4,5,9, 8]]
In [277]: a = np.array(alist)
In [278]: a
Out[278]: array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)
In [279]: a[0].append(4)
In [280]: a
Out[280]: array([[1, 2, 3, 4], [4, 5, 9, 8]], dtype=object)
In [281]: alist
Out[281]: [[1, 2, 3, 4], [4, 5, 9, 8]]
That would not work in the case of a2. a2 has its own data storage, independent of the source list.
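A quick sketch of that contrast, using a separate list alist2 (a name I introduce here) so the session above is left alone:

```python
import numpy as np

alist2 = [[1, 2, 3], [4, 5, 9]]
a2 = np.array(alist2)          # 2d integer array, the values are copied

# Changing the source list afterwards does not touch the array.
alist2[0][0] = 100

row = a2[0]                    # an ndarray row, not a Python list
has_append = hasattr(row, "append")
```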
The basic point is that np.array tries to create an n-d array where possible. If it can't, it falls back to creating an object dtype array. And, as has been discussed in other questions, it sometimes raises an error. It is also tricky to intentionally create an object array.
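To illustrate the "tricky" part: when the sublists happen to have equal lengths, passing dtype=object still gives a 2d array (of Python ints), not a 1d array of lists. One pattern that I believe works reliably is to preallocate with np.empty and fill element by element. The names rows, as_2d and as_1d are just for this sketch.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6]]            # equal lengths on purpose

# dtype=object alone is not enough here: the result is still 2d.
as_2d = np.array(rows, dtype=object)

# Preallocate a 1d object array and assign each list to one slot.
as_1d = np.empty(len(rows), dtype=object)
for i, r in enumerate(rows):
    as_1d[i] = r
```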
The shape of a is easy, (2,). A single-element tuple. a is a 1d array. But that shape does not convey information about the elements of a. And the same goes for the elements of alist. len(alist) is 2. An object array can have a more complex shape, e.g. a.reshape(1,2,1), but it still just contains pointers.
a contains 2 4-byte pointers; a2 contains 6 4-byte integers.
In [282]: a.itemsize
Out[282]: 4
In [283]: a.nbytes
Out[283]: 8
In [284]: a2.nbytes
Out[284]: 24
In [285]: a2.itemsize
Out[285]: 4
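The numbers above come from a 32-bit build. On a 64-bit Python the pointers would typically be 8 bytes, so a sketch that does not hard-code the pointer size might look like this (dtype=np.int32 is my choice, to reproduce the 4-byte integers):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)
a2 = np.array([[1, 2, 3], [4, 5, 9]], dtype=np.int32)

pointer_bytes = a.itemsize          # size of one stored reference
object_total = a.nbytes             # 2 references, sublists not included
int_total = a2.nbytes               # 6 integers of 4 bytes each
```

Note that nbytes for the object array counts only the references, not the memory used by the lists they point to.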
In other words, each element of the outer array will be a row vector from the original 2D array.
As @Jaime already said, a 2D array can be interpreted as an array of 1D arrays. Suppose:
a = np.array([[1,2,3],
[4,5,6],
[7,8,9]])
doing a[0] will return array([1, 2, 3]).
So you don't need to do any conversion.
I think it makes little sense to use numpy arrays to do that; you would be missing out on all the advantages of numpy.
I had the same issue when appending a row with a different length to a 2D array.
The only trick I have found so far is to use a list comprehension and append the new row (see below). Not very optimal I guess, but at least it works ;-)
Hope this can help
>>> x=np.reshape(np.arange(0,9),(3,3))
>>> x
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> row_to_append = np.arange(9,11)
>>> row_to_append
array([ 9, 10])
>>> result=[item for item in x]
>>> result.append(row_to_append)
>>> result
[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10])]
np.vsplit: Split an array into multiple sub-arrays vertically (row-wise).
x=np.arange(12).reshape(3,4)
In [7]: np.vsplit(x,3)
Out[7]: [array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8, 9, 10, 11]])]
A comprehension could be used to reshape those arrays into 1d ones.
This is a list of arrays, not an array of arrays. Such a sequence of arrays can be recombined with vstack (or hstack, dstack).
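Here is a small sketch of both steps mentioned above: flattening each piece with a comprehension, and recombining the pieces with vstack. The names pieces, flat_pieces and rebuilt are mine.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

pieces = np.vsplit(x, 3)                    # list of three (1, 4) arrays
flat_pieces = [p.ravel() for p in pieces]   # list of three (4,) arrays

rebuilt = np.vstack(flat_pieces)            # back to a single (3, 4) array
```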
np.array([np.arange(3),np.arange(4)])
makes a 2 element array of arrays. But if the arrays in the list are all the same shape (or compatible), it makes a 2d array. In terms of data storage it may not matter whether it is 2d or 1d of 1d arrays.
I just want to know if there is a short cut to unrolling numpy arrays into a single vector. For instance (convert the following Matlab code to python):
Matlab way:
A = zeros(10,10) %
A_unroll = A(:) % <- How can I do this in python
Thanks in advance.
Is this what you have in mind?
Edit: As Patrick points out, one has to be careful with translating A(:) to Python.
Of course if you just want to flatten out a matrix or 2-D array of zeros it does not matter.
So here is a way to get behavior like matlab's.
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
[4, 5, 6]])
>>> # one way to get Matlab behavior
... (a.T).ravel()
array([1, 4, 2, 5, 3, 6])
numpy.ravel does flatten 2D array, but does not do it the same way matlab's (:) does.
>>> import numpy as np
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
[4, 5, 6]])
>>> a.ravel()
array([1, 2, 3, 4, 5, 6])
You have to be careful here, since ravel doesn't unravel the elements in the same order that Matlab does with A(:). If you use:
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a.shape
(2, 3)
>>> a.ravel()
array([1, 2, 3, 4, 5, 6])
While in Matlab:
>> A = [1:3;4:6];
>> size(A)
ans =
2 3
>> A(:)
ans =
1
4
2
5
3
6
In Matlab, the elements are unraveled first down the columns, then by the rows. In Python it's the opposite. This has to do with the order that elements are stored in (C order by default in NumPy vs. Fortran order in Matlab).
Knowing that A(:) is equivalent to reshape(A,[numel(A),1]), you can get the same behaviour in Python with:
>>> a.reshape(a.size,order='F')
array([1, 4, 2, 5, 3, 6])
Note order='F', which refers to Fortran order (column-first unravelling).
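For completeness, ravel and flatten accept the same order argument, so a sketch of the alternatives could be (variable names are just for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

c_order = a.ravel()                    # row by row (NumPy default)
f_order = a.ravel(order='F')           # column by column, like Matlab's A(:)
f_copy = a.flatten(order='F')          # same values, always a copy

# Matlab's A(:) is a column vector; this gives the matching (6, 1) shape.
column = a.reshape(-1, 1, order='F')
```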
I have a 2x2 numpy array :
x = array(([[1,2],[4,5]]))
which I must merge (or stack, if you wish) with a one-dimensional array :
y = array(([3,6]))
by adding it to the end of the rows, thus making a 2x3 numpy array that would output like so :
array([[1, 2, 3],
[4, 5, 6]])
now the proposed method for this in the numpy guides is :
hstack((x,y))
however this doesn't work, returning the following error :
ValueError: arrays must have same number of dimensions
The only workaround possible seems to be to do this :
hstack((x, array(([y])).T ))
which works, but looks and sounds rather hackish. It seems there is no other way to transpose the given array so that hstack is able to digest it. I was wondering, is there a cleaner way to do this? Wouldn't there be a way for numpy to guess what I wanted to do?
unutbu's answer works in general, but in this case there is also np.column_stack
>>> x
array([[1, 2],
[4, 5]])
>>> y
array([3, 6])
>>> np.column_stack((x,y))
array([[1, 2, 3],
[4, 5, 6]])
Also works:
In [22]: np.append(x, y[:, np.newaxis], axis=1)
Out[22]:
array([[1, 2, 3],
[4, 5, 6]])
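A few more spellings of the same idea, as a sketch: in each case y is turned into a (2, 1) column before it is joined to x along axis 1. The variable names are hypothetical.

```python
import numpy as np

x = np.array([[1, 2], [4, 5]])
y = np.array([3, 6])

via_column_stack = np.column_stack((x, y))

# Add a trailing axis so y has shape (2, 1), then hstack works as expected.
via_hstack = np.hstack((x, y[:, np.newaxis]))

# reshape(-1, 1) builds the same column explicitly.
via_concat = np.concatenate((x, y.reshape(-1, 1)), axis=1)

# np.c_ is a shorthand that treats 1d inputs as columns.
via_c = np.c_[x, y]
```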