Numpy append to empty array - python

I want to append a numpy array to an empty numpy array, but it's not working.
reconstructed = numpy.empty((4096,))
to_append = reconstruct(p, e_faces, weights, mu, i)
# to_append = array([129.47776809, 129.30775937, 128.90932868, ..., 103.64777681, 104.99912816, 105.93984307]); its shape is (4096,)
numpy.append(reconstructed, to_append, axis=0)
# The axis argument doesn't seem to make a difference anyway.
How do I put that long array into the empty one? The result is just empty.

Look at what empty produces:
In [140]: x = np.empty((5,))
In [141]: x
Out[141]: array([0. , 0.25, 0.5 , 0.75, 1. ])
append makes a new array; it does not change x
In [142]: np.append(x, [1,2,3,4,5], axis=0)
Out[142]: array([0. , 0.25, 0.5 , 0.75, 1. , 1. , 2. , 3. , 4. , 5. ])
In [143]: x
Out[143]: array([0. , 0.25, 0.5 , 0.75, 1. ])
we have to assign it to a new variable:
In [144]: y = np.append(x, [1,2,3,4,5], axis=0)
In [145]: y
Out[145]: array([0. , 0.25, 0.5 , 0.75, 1. , 1. , 2. , 3. , 4. , 5. ])
Look at that y: the random values that were in x are also in y!
Contrast that with list
In [146]: alist = []
In [147]: alist
Out[147]: []
In [148]: alist.append([1,2,3,4,5])
In [149]: alist
Out[149]: [[1, 2, 3, 4, 5]]
The results are very different. Don't use this as a model for creating arrays.
If you need to build an array row by row, use the list append to collect the rows in one list, and then make the array from that.
In [150]: z = np.array(alist)
In [151]: z
Out[151]: array([[1, 2, 3, 4, 5]])
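A minimal sketch of that list-then-convert pattern (the loop and row contents here are made up just for illustration):

import numpy as np

rows = []
for i in range(3):
    rows.append(i * np.arange(5))   # collect each row in a plain Python list
arr = np.array(rows)                # build the 2D array once, at the end
# arr.shape -> (3, 5)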

Related

Numpy: How to stack a single array into each row of a bigger array and turn it into a 2D array?

I have a numpy array named heartbeats with 100 rows. Each row has 5 elements.
I also have a single array named time_index with 5 elements.
I need to prepend the time index to each row of heartbeats.
heartbeats = np.array([
[-0.58, -0.57, -0.55, -0.39, -0.40],
[-0.31, -0.31, -0.32, -0.46, -0.46]
])
time_index = np.array([-2, -1, 0, 1, 2])
What I need:
array([[-2, -0.58],
[-1, -0.57],
[0, -0.55],
[1, -0.39],
[2, -0.40],
[-2, -0.31],
[-1, -0.31],
[0, -0.32],
[1, -0.46],
[2, -0.46]])
I only wrote two rows of heartbeats to illustrate.
Assuming you are using numpy, the exact output array you are looking for can be made by stacking a repeated version of time_index with the raveled version of heartbeats:
np.stack((np.tile(time_index, len(heartbeats)), heartbeats.ravel()), axis=-1)
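Applied to the two-row sample above (a quick check, not part of the original answer), that one-liner produces the requested (10, 2) array:

import numpy as np

heartbeats = np.array([[-0.58, -0.57, -0.55, -0.39, -0.40],
                       [-0.31, -0.31, -0.32, -0.46, -0.46]])
time_index = np.array([-2, -1, 0, 1, 2])

# tile repeats time_index once per row; ravel flattens heartbeats row by row
out = np.stack((np.tile(time_index, len(heartbeats)), heartbeats.ravel()), axis=-1)
# out.shape -> (10, 2); first row is [-2. , -0.58], last row is [ 2. , -0.46]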
Another approach, using broadcasting
In [13]: heartbeats = np.array([
...: [-0.58, -0.57, -0.55, -0.39, -0.40],
...: [-0.31, -0.31, -0.32, -0.46, -0.46]
...: ])
...: time_index = np.array([-2, -1, 0, 1, 2])
Make a target array:
In [14]: res = np.zeros(heartbeats.shape + (2,), heartbeats.dtype)
In [15]: res[:,:,1] = heartbeats # insert a (2,5) into a (2,5) slot
In [17]: res[:,:,0] = time_index[None] # insert a (5,) into a (2,5) slot
In [18]: res
Out[18]:
array([[[-2. , -0.58],
[-1. , -0.57],
[ 0. , -0.55],
[ 1. , -0.39],
[ 2. , -0.4 ]],
[[-2. , -0.31],
[-1. , -0.31],
[ 0. , -0.32],
[ 1. , -0.46],
[ 2. , -0.46]]])
and then reshape to 2d:
In [19]: res.reshape(-1,2)
Out[19]:
array([[-2. , -0.58],
[-1. , -0.57],
[ 0. , -0.55],
[ 1. , -0.39],
[ 2. , -0.4 ],
[-2. , -0.31],
[-1. , -0.31],
[ 0. , -0.32],
[ 1. , -0.46],
[ 2. , -0.46]])
[17] takes a (5,), expands it to (1,5), and then to (2,5) for the insert. Read up on broadcasting.
As an alternative, you can repeat time_index the required number of times with np.concatenate:
concatenated = np.concatenate([time_index] * heartbeats.shape[0])
# [-2 -1 0 1 2 -2 -1 0 1 2]
# result = np.dstack((concatenated, heartbeats.reshape(-1))).squeeze()
result = np.array([concatenated, heartbeats.reshape(-1)]).T
Using np.concatenate may be faster than np.tile. This solution is faster than Mad Physicist's, but the fastest is the broadcasting approach in hpaulj's answer.
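If you want to check the speed claims on your own data, a rough timing sketch (the array sizes here are arbitrary, and the numbers will vary by machine):

import numpy as np
from timeit import timeit

heartbeats = np.random.rand(1000, 5)
time_index = np.arange(-2, 3)

# time the tile-based and concatenate-based one-liners under the same conditions
t_tile = timeit(lambda: np.stack((np.tile(time_index, len(heartbeats)),
                                  heartbeats.ravel()), axis=-1), number=1000)
t_cat = timeit(lambda: np.array([np.concatenate([time_index] * heartbeats.shape[0]),
                                 heartbeats.reshape(-1)]).T, number=1000)
print(t_tile, t_cat)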

How to divide an array by an other array element wise in numpy?

I have two arrays, and I want all the elements of one to be divided by the second. For example,
In [24]: a = np.array([1,2,3])
In [25]: b = np.array([1,2,3])
In [26]: a/b
Out[26]: array([1., 1., 1.])
In [27]: 1/b
Out[27]: array([1. , 0.5 , 0.33333333])
This is not the answer I want. The output I want looks like the following (each element of a divided by all of b):
In [28]: c = []
In [29]: for i in a:
...: c.append(i/b)
...:
In [30]: c
Out[30]:
[array([1. , 0.5 , 0.33333333]),
 array([2. , 1. , 0.66666667]),
 array([3. , 1.5 , 1. ])]
In [34]: np.array(c)
Out[34]:
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
But I don't want a for loop; it's too slow for big data. Is there a function in the numpy package, or any other good (faster) way, to solve this problem?
It is simple to do in pure numpy: you can use broadcasting to calculate the outer product (or any other outer operation) of two vectors:
import numpy as np
a = np.arange(1, 4)
b = np.arange(1, 4)
c = a[:,np.newaxis] / b
# array([[1. , 0.5 , 0.33333333],
# [2. , 1. , 0.66666667],
# [3. , 1.5 , 1. ]])
This works, since a[:,np.newaxis] increases the dimension of the (3,) shaped array a into a (3, 1) shaped array, which can be used for the desired broadcasting operation.
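If you prefer a named operation over the manual reshape, the same broadcasted division can be spelled with the divide ufunc's outer method; a small sketch (equivalent to a[:, np.newaxis] / b):

import numpy as np

a = np.arange(1, 4)
b = np.arange(1, 4)

# np.divide is a binary ufunc, so .outer applies it to every pair (a[i], b[j]),
# giving the same (3, 3) result as the broadcasting expression above
c = np.divide.outer(a, b)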
First reshape a into a 2D column vector, then repeat it along the second dimension so it has the same shape as the output. Then vectorized division will work.
>>> a.reshape(-1,1)
array([[1],
[2],
[3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1)
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1) / b
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
# Transpose will let you do it the other way around, but then you just get 1 for everything
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1).T
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1).T / b
array([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
This should do the job:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(a.reshape(-1, 1) / b)
Output:
[[ 1. 0.5 0.33333333]
[ 2. 1. 0.66666667]
[ 3. 1.5 1. ]]

How to index elements from a column of a ndarray such that the output is a column vector?

I have an nx2 array of points represented as an ndarray. I want to index some of the elements (indices are given in an ndarray as well) of one of the two column vectors such that the output is a column vector. If however the index array contains only one index, a (1,)-shaped array should be returned.
I already tried the following things without success:
import numpy as np
points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
index = np.array([0, 1, 2])
points[index, [0]] -> array([0. , 1. , 2.5]) -> shape (3,)
points[[index], 0] -> array([[0. , 1. , 2.5]]) -> shape (1, 3)
points[[index], [0]] -> array([[0. , 1. , 2.5]]) -> shape (1, 3)
points[index, 0, np.newaxis] -> array([[0. ], [1. ], [2.5]]) -> shape(3, 1) # desired
np.newaxis works for this scenario; however, if the index array contains only one value, it does not deliver the right shape:
import numpy as np
points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
index = np.array([0])
points[index, 0, np.newaxis] -> array([[0.]]) -> shape (1, 1)
points[index, [0]] -> array([0.]) -> shape (1,) # desired
Is there a way to index the ndarray such that the output has shape (3, 1) for the first example and (1,) for the second, without case differentiation based on the size of the index array?
Thanks in advance for your help!
In [329]: points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
...: index = np.array([0, 1, 2])
We can select 3 rows with:
In [330]: points[index,:]
Out[330]:
array([[0. , 1. ],
[1. , 1.5],
[2.5, 0.5]])
However if we select a column as well, the result is 1d, even if we use [0]. That's because the (3,) row index is broadcast against the (1,) column index, resulting in a (3,) result:
In [331]: points[index,0]
Out[331]: array([0. , 1. , 2.5])
In [332]: points[index,[0]]
Out[332]: array([0. , 1. , 2.5])
If we make the row index (3,1) shape, the result is also (3,1):
In [333]: points[index[:,None],[0]]
Out[333]:
array([[0. ],
[1. ],
[2.5]])
In [334]: points[index[:,None],0]
Out[334]:
array([[0. ],
[1. ],
[2.5]])
We get the same thing if we use a row slice:
In [335]: points[0:3,[0]]
Out[335]:
array([[0. ],
[1. ],
[2.5]])
Using [index] doesn't help because it makes the row index (1,3) shape, resulting in a (1,3) result. Of course you could transpose it to get (3,1).
With a 1 element index:
In [336]: index1 = np.array([0])
In [337]: points[index1[:,None],0]
Out[337]: array([[0.]])
In [338]: _.shape
Out[338]: (1, 1)
In [339]: points[index1,0]
Out[339]: array([0.])
In [340]: _.shape
Out[340]: (1,)
If the row index was a scalar, as opposed to 1d:
In [341]: index1 = np.array(0)
In [342]: points[index1[:,None],0]
...
IndexError: too many indices for array
In [343]: points[index1[...,None],0] # use ... instead
Out[343]: array([0.])
In [344]: points[index1, 0] # scalar result
Out[344]: 0.0
I think handling the np.array([0]) case separately requires an if test. At least I can't think of a builtin numpy way of burying it.
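If you do accept an explicit if test, a small helper along these lines covers both cases (the function name and signature are just an illustration):

import numpy as np

def select_column(points, index, col=0):
    # single index: return a (1,)-shaped array, as requested
    if index.size == 1:
        return points[index, col]
    # multiple indices: an (n, 1) row index broadcasts with the scalar column
    return points[index[:, None], col]

# select_column(points, np.array([0, 1, 2])).shape -> (3, 1)
# select_column(points, np.array([0])).shape       -> (1,)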
I'm not certain I understand the wording in your question, but it seems as though you may be after the ndarray.swapaxes method (see https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.ndarray.swapaxes.html#numpy.ndarray.swapaxes)
for your snippet:
points = np.array([[0, 1], [1, 1.5], [2.5, 0.5], [4, 1], [5, 2]])
swapped = points.swapaxes(0,1)
print(swapped)
gives
[[0. 1. 2.5 4. 5. ]
[1. 1.5 0.5 1. 2. ]]

How to use arrays to access matrix elements?

I need to change all nans of a matrix to a different value. I can easily get the nan positions using argwhere, but then I am not sure how to access those positions programmatically. Here is my nonworking code:
myMatrix = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))
for pos in nanPositions:
    myMatrix[pos] = maxVal
the problem is that myMatrix[pos] does not accept pos as an array.
The more-efficient way of generating your output has already been covered by sacul. However, you're incorrectly indexing your 2D matrix in the case where you want to use an array.
At least to me, it's a bit unintuitive, but you need to use:
myMatrix[all_row_indices, all_column_indices]
The following will give you what you expect:
import numpy as np
myMatrix = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))
print(myMatrix[nanPositions[:, 0], nanPositions[:, 1]])
You can see more about advanced indexing in the documentation
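To actually replace the NaNs (the original goal), the same row and column index arrays can go on the left-hand side of an assignment; a minimal sketch:

import numpy as np

myMatrix = np.array([[3.2, 2, float('NaN'), 3],
                     [3, 1, 2, float('NaN')],
                     [3, 3, 3, 3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))

# assign through the (row, column) integer index arrays
myMatrix[nanPositions[:, 0], nanPositions[:, 1]] = maxVal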
In [54]: arr = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
...:
In [55]: arr
Out[55]:
array([[3.2, 2. , nan, 3. ],
[3. , 1. , 2. , nan],
[3. , 3. , 3. , 3. ]])
Location of the nan:
In [56]: np.where(np.isnan(arr))
Out[56]: (array([0, 1]), array([2, 3]))
In [57]: np.argwhere(np.isnan(arr))
Out[57]:
array([[0, 2],
[1, 3]])
where produces a tuple of arrays; argwhere returns the same values but as a 2D array.
In [58]: arr[Out[56]]
Out[58]: array([nan, nan])
In [59]: arr[Out[56]] = [100,200]
In [60]: arr
Out[60]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3. , 1. , 2. , 200. ],
[ 3. , 3. , 3. , 3. ]])
The argwhere result can be used to index individual items:
In [72]: for ij in Out[57]:
...: print(arr[tuple(ij)])
100.0
200.0
The tuple() is needed here because np.array([1,3]) is interpreted as 2-element indexing on the first dimension.
Another way to get that indexing tuple is to use unpacking:
In [74]: [arr[i,j] for i,j in Out[57]]
Out[74]: [100.0, 200.0]
So while argwhere looks useful, it is trickier to use than plain where.
You could, as noted in the other answers, use boolean indexing (I've already modified arr so the isnan test no longer works):
In [75]: arr[arr>10]
Out[75]: array([100., 200.])
More on indexing with a list or array, and indexing with a tuple:
In [77]: arr[[0,0]] # two copies of row 0
Out[77]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
In [78]: arr[(0,0)] # one element
Out[78]: 3.2
In [79]: arr[np.array([0,0])] # same as list
Out[79]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
In [80]: arr[np.array([0,0]),:] # making the trailing : explicit
Out[80]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
You can do this instead (IIUC):
myMatrix[np.isnan(myMatrix)] = np.nanmax(abs(myMatrix))

Repeating numpy values and specifying dtype

I want to generate a numpy array of the form:
0.5*[[0, 0], [1, 1], [2, 2], ...]
I want the final array to have a dtype of numpy.float32.
Here is my attempt:
>>> import numpy as np
>>> N = 5
>>> x = np.array(np.repeat(0.5*np.arange(N), 2), np.float32)
>>> x
array([ 0. , 0. , 0.5, 0.5, 1. , 1. , 1.5, 1.5, 2. , 2. ], dtype=float32)
Is this a good way? Can I avoid the copy (if it is indeed copying) just for type conversion?
You only have to reshape your final result to obtain what you want:
x = x.reshape(-1, 2)
You could also run arange passing the dtype:
x = np.repeat(0.5*np.arange(N, dtype=np.float32), 2).reshape(-1, 2)
You can easily cast the array as another type using the astype method, which accepts an argument copy:
x.astype(np.int8, copy=False)
But, as explained in the documentation, numpy checks for some requirements in order to return the view. If those requirements are not satisfied, a copy is returned.
You can check if a given array is a copy or a view from another by checking the OWNDATA attribute, accessible through the flags property of the ndarray.
EDIT: more on checking if a given array is a copy...
Is there a way to check if numpy arrays share the same data?
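As a concrete way to perform that check (assuming a NumPy version that provides np.shares_memory, 1.11+), compare the original and converted arrays directly:

import numpy as np

x = np.repeat(0.5 * np.arange(5, dtype=np.float32), 2)

same_dtype = x.astype(np.float32, copy=False)   # dtype already matches: no copy needed
new_dtype = x.astype(np.float64, copy=False)    # dtype change forces a copy anyway

print(np.shares_memory(x, same_dtype))  # True: still the same underlying data
print(np.shares_memory(x, new_dtype))   # False: a new buffer was allocated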
An alternative:
np.array([0.5*np.arange(N, dtype=np.float32)]*2)
Gives:
array([[ 0. , 0.5, 1. , 1.5, 2. ],
[ 0. , 0.5, 1. , 1.5, 2. ]], dtype=float32)
You might want to rotate it:
np.rot90(np.array([0.5*np.arange(N, dtype=np.float32)]*2),3)
Giving:
array([[ 0. , 0. ],
[ 0.5, 0.5],
[ 1. , 1. ],
[ 1.5, 1.5],
[ 2. , 2. ]], dtype=float32)
Note, this is slower than @Saullo_Castro's answer:
np.rot90(np.array([0.5*np.arange(N, dtype=np.float32)]*2),3)
10000 loops, best of 3: 24.3 us per loop
np.repeat(0.5*np.arange(N, dtype=np.float32), 2).reshape(-1, 2)
10000 loops, best of 3: 9.23 us per loop
np.array(np.repeat(0.5*np.arange(N), 2), np.float32).reshape(-1, 2)
10000 loops, best of 3: 10.4 us per loop
(using %%timeit on ipython)
