I have a 3 dimensional numpy array similar to this:
a = np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]],
[[9, 10],
[11, 12]]])
What I'd like to do is intersperse each 2D array contained inside the outer array to produce this result:
t = np.array([[[1, 2], [5, 6], [9, 10]],
[[3, 4], [7, 8], [11, 12]]])
I could do this in Python like this, but I'm hoping there's a more efficient, numpy version:
t = np.empty((a.shape[1], a.shape[0], a.shape[2]), a.dtype)
for i, x in np.ndenumerate(a):
t[i[1], i[0], i[2]] = x
As #UdayrajDeshmukh said, you can use the transpose method (which, despite the name that evokes the "transpose" operator in linear algebra, is better understood as "permuting the axes"):
>>> t = a.transpose(1, 0, 2)
>>> t
array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[ 3, 4],
[ 7, 8],
[11, 12]]])
The newly created object t is a shallow array looking into a's data with a different permutation of indices. To replicate your own example, you need to copy it, e.g. t = a.transpose(1, 0, 2).copy()
Try the transpose function. You simply change the first two axes.
t = np.transpose(a, axes=(1, 0, 2))
Seemingly simple question: I have an array with two columns, the first represents an ID and the second a count. I'd like to update it with another, similar array such that
import numpy as np
a = np.array([[1, 2],
[2, 2],
[3, 1],
[4, 5]])
b = np.array([[2, 2],
[3, 1],
[4, 0],
[5, 3]])
a.update(b) # ????
>>> np.array([[1, 2],
[2, 4],
[3, 2],
[4, 5],
[5, 3]])
Is there a way to do this with indexing/slicing such that I don't simply have to iterate over each row?
Generic case
Approach #1: You can use np.add.at to do such an ID-based adding operation like so -
# First column of output array as the union of first columns of a,b
out_id = np.union1d(a[:,0],b[:,0])
# Initialize second column of output array
out_count = np.zeros_like(out_id)
# Find indices where the first columns of a,b are placed in out_id
_,a_idx = np.where(a[:,None,0]==out_id)
_,b_idx = np.where(b[:,None,0]==out_id)
# Place second column of a into out_id & add in second column of b
out_count[a_idx] = a[:,1]
np.add.at(out_count, b_idx,b[:,1])
# Stack the ID and count arrays into a 2-column format
out = np.column_stack((out_id,out_count))
To find a_idx and b_idx, as probably a faster alternative, np.searchsorted could be used like so -
a_idx = np.searchsorted(out_id, a[:,0], side='left')
b_idx = np.searchsorted(out_id, b[:,0], side='left')
Sample input-output :
In [538]: a
Out[538]:
array([[1, 2],
[4, 2],
[3, 1],
[5, 5]])
In [539]: b
Out[539]:
array([[3, 7],
[1, 1],
[4, 0],
[2, 3],
[6, 2]])
In [540]: out
Out[540]:
array([[1, 3],
[2, 3],
[3, 8],
[4, 2],
[5, 5],
[6, 2]])
Approach #2: You can use np.bincount to do the same ID based adding -
# First column of output array as the union of first columns of a,b
out_id = np.union1d(a[:,0],b[:,0])
# Get all IDs and counts in a single arrays
id_arr = np.concatenate((a[:,0],b[:,0]))
count_arr = np.concatenate((a[:,1],b[:,1]))
# Get binned summations
summed_vals = np.bincount(id_arr,count_arr)
# Get mask of valid bins
mask = np.in1d(np.arange(np.max(out_id)+1),out_id)
# Mask valid summed bins for final counts array output
out_count = summed_vals[mask]
# Stack the ID and count arrays into a 2-column format
out = np.column_stack((out_id,out_count))
Specific case
If the ID columns in a and b are sorted, it becomes easier, as we can just use masks with np.in1d to index into the output ID array created with np.union like so -
# First column of output array as the union of first columns of a,b
out_id = np.union1d(a[:,0],b[:,0])
# Masks of first columns of a and b matches in the output ID array
mask1 = np.in1d(out_id,a[:,0])
mask2 = np.in1d(out_id,b[:,0])
# Initialize second column of output array
out_count = np.zeros_like(out_id)
# Place second column of a into out_id & add in second column of b
out_count[mask1] = a[:,1]
np.add.at(out_count, np.where(mask2)[0],b[:,1])
# Stack the ID and count arrays into a 2-column format
out = np.column_stack((out_id,out_count))
Sample run -
In [552]: a
Out[552]:
array([[1, 2],
[2, 2],
[3, 1],
[4, 5],
[8, 5]])
In [553]: b
Out[553]:
array([[2, 2],
[3, 1],
[4, 0],
[5, 3],
[6, 2],
[8, 2]])
In [554]: out
Out[554]:
array([[1, 2],
[2, 4],
[3, 2],
[4, 5],
[5, 3],
[6, 2],
[8, 7]])
>>> col=np.unique(np.hstack((b[:,0],a[:,0])))
>>> dif=np.setdiff1d(col,a[:,0])
>>> val=b[np.in1d(b[:,0],dif)]
>>> result=np.concatenate((a,val))
array([[1, 2],
[2, 2],
[3, 1],
[4, 5],
[5, 3]])
Note that if you want the result become sorted you can use np.lexsort :
result[np.lexsort((result[:,0],result[:,0]))]
Explanation :
First you can find the unique ids with following command :
>>> col=np.unique(np.hstack((b[:,0],a[:,0])))
>>> col
array([1, 2, 3, 4, 5])
Then find the different between the ids if a and all of ids :
>>> dif=np.setdiff1d(col,a[:,0])
>>> dif
array([5])
Then find the items within b with the ids in diff :
>>> val=b[np.in1d(b[:,0],dif)]
>>> val
array([[5, 3]])
And at last concatenate the result with list a:
>>> np.concatenate((a,val))
consider another example with sorting :
>>> a = np.array([[1, 2],
... [2, 2],
... [3, 1],
... [7, 5]])
>>>
>>> b = np.array([[2, 2],
... [3, 1],
... [4, 0],
... [5, 3]])
>>>
>>> col=np.unique(np.hstack((b[:,0],a[:,0])))
>>> dif=np.setdiff1d(col,a[:,0])
>>> val=b[np.in1d(b[:,0],dif)]
>>> result=np.concatenate((a,val))
>>> result[np.lexsort((result[:,0],result[:,0]))]
array([[1, 2],
[2, 2],
[3, 1],
[4, 0],
[5, 3],
[7, 5]])
That's an old question but here is a solution with pandas (that could be generalized for other aggregation functions than sum). Also sorting will occur automatically:
import pandas as pd
import numpy as np
a = np.array([[1, 2],
[2, 2],
[3, 1],
[4, 5]])
b = np.array([[2, 2],
[3, 1],
[4, 0],
[5, 3]])
print((pd.DataFrame(a[:, 1], index=a[:, 0])
.add(pd.DataFrame(b[:, 1], index=b[:, 0]), fill_value=0)
.astype(int))
.reset_index()
.to_numpy())
Output:
[[1 2]
[2 4]
[3 2]
[4 5]
[5 3]]
I have a numpy array that looks like this
[
[[1,2,3], [4,5,6]],
[[3,8,9], [2,9,4]],
[[7,1,3], [1,3,6]]
]
I want it like this after deleting first column
[
[[2,3], [5,6]],
[[8,9], [9,4]],
[[1,3], [3,6]]
]
so currently the dimension is 3*3*3, after removing the first column it should be 3*3*2
You can slice it as so, where 1: signifies that you only want the second and all remaining columns from the inner most array (i.e. you 'delete' its first column).
>>> a[:, :, 1:]
array([[[2, 3],
[5, 6]],
[[8, 9],
[9, 4]],
[[1, 3],
[3, 6]]])
Since you are using numpy I'll mention numpy way of doing this. First of all, the dimension you have specified for the question seems wrong. See below
x = np.array([
[[1,2,3], [4,5,6]],
[[3,8,9], [2,9,4]],
[[7,1,3], [1,3,6]]
])
The shape of x is
x.shape
(3, 2, 3)
You can use numpy.delete to remove a column as shown below
a = np.delete(x, 0, 2)
a
array([[[2, 3],
[5, 6]],
[[8, 9],
[9, 4]],
[[1, 3],
[3, 6]]])
To find the shape of a
a.shape
(3, 2, 2)