Reorganizing a 3d numpy array - python

I've tried and searched for a few days, I've come closer but need your help.
I have a 3d array in python,
shape(files)
>> (31,2049,2)
which corresponds to 31 input files with 2 columns of data with 2048 rows and a header.
I'd like to sort this array based on the header, which is a number, in each file.
I tried to follow "NumPy: sorting 3D array but keeping 2nd dimension assigned to first", but I'm incredibly confused.
First I tried to get my headers for the argsort. I thought I could do
sortval=files[:][0][0]
but this does not work..
Then I simply did a for loop to iterate and get my headers
sortval = []
for i in range(shape(files)[0]):
    sortval.append(files[i][0][0])
Then
sortedIdx = np.argsort(sortval)
This works; however, I don't understand what's happening in the last line:
files = files[np.arange(len(deck))[:,np.newaxis],sortedIdx]
Help would be appreciated.

Another way to do this is with np.take
header = a[:,0,0]
sorted_a = np.take(a, np.argsort(header), axis=0)  # avoid shadowing the built-in sorted()

Here we can use a simple example to demonstrate what your code is doing:
First we create a random 3D numpy matrix:
a = (np.random.rand(3,3,2)*10).astype(int)
array([[[3, 1],
[3, 7],
[0, 3]],
[[2, 9],
[1, 0],
[9, 2]],
[[9, 2],
[8, 8],
[8, 0]]])
Then a[:] just returns a itself, so a[:][0][0] is only the first row of the first 2D array in a:
a[:][0]
# array([[3, 1],
# [3, 7],
# [0, 3]])
a[:][0][0]
# array([3, 1])
What you want are the headers, which are 3, 2, 9 in this example, so we can use a[:, 0, 0] to extract them:
a[:,0,0]
# array([3, 2, 9])
Now we argsort the headers to get the index array that would sort them:
np.argsort(a[:,0,0])
# array([1, 0, 2])
In order to rearrange the entire 3D array, we need to slice the array with correct order. And np.arange(len(a))[:,np.newaxis] is equal to np.arange(len(a)).reshape(-1,1) which creates a sequential 2D index array:
np.arange(len(a))[:,np.newaxis]
# array([[0],
# [1],
# [2]])
Without the 2D index array, the two 1D index arrays are paired element by element and the result is only 2-dimensional:
a[np.arange(3), np.argsort(a[:,0,0])]
# array([[3, 7],
# [2, 9],
# [8, 0]])
With the 2D index array, the two index arrays broadcast against each other and the result keeps its 3D shape:
a[np.arange(3).reshape(-1,1), np.argsort(a[:,0,0])]
array([[[3, 7],
[3, 1],
[0, 3]],
[[1, 0],
[2, 9],
[9, 2]],
[[8, 8],
[9, 2],
[8, 0]]])
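This keeps the 3D shape, but note that it reorders the rows inside each 2D array (the second axis) rather than the 2D arrays themselves.
Edit:
To rearrange the 2D arrays themselves (that is, to sort the files by header), index the first axis directly: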
a[np.argsort(a[:,0,0])]
array([[[2, 9],
[1, 0],
[9, 2]],
[[3, 1],
[3, 7],
[0, 3]],
[[9, 2],
[8, 8],
[8, 0]]])
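To check the difference between the two indexing forms, here is a minimal sketch on synthetic data. The array shape and header values are assumptions chosen for illustration, not the asker's real files.

```python
import numpy as np

# Synthetic stand-in for `files`: 4 "files", each with 1 header row
# plus 3 data rows and 2 columns (shape and values are assumptions).
rng = np.random.default_rng(0)
files = rng.random((4, 4, 2))
files[:, 0, 0] = [30.0, 10.0, 40.0, 20.0]  # header values, deliberately unsorted

header = files[:, 0, 0]      # one header per file
order = np.argsort(header)   # indices that would sort the headers

# Reorder the files themselves (axis 0); both forms give the same array.
by_index = files[order]
by_take = np.take(files, order, axis=0)

# The broadcasted form permutes axis 1 instead:
# rows_permuted[i, j] == files[i, order[j]]
rows_permuted = files[np.arange(len(files))[:, np.newaxis], order]

print(by_index[:, 0, 0])  # headers now in ascending order
```

Only the first form sorts the files by header; the broadcasted form leaves the files in place and shuffles the rows inside each one.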

Related

Python numpy 2D array sum over certain indices

There is a 2-d array like this:
img = [
[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
[[2, 2, 2], [3, 2, 3], [6, 7, 6]],
[[9, 8, 1], [9, 8, 3], [9, 8, 5]]
]
I just want to get the sum over certain indices, given like this:
indices = [[0, 0], [0, 1]] # which means img[0][0] and img[0][1]
A similar question about 1-d arrays was asked on Stack Overflow, but I got an error when I tried print(img[indices]). To be clear, I want to select the elements of img given by indices and then get their sum.
Expected output
[5, 7, 9]
Use NumPy:
import numpy as np
img = np.array(img)
img[tuple(indices)].sum(axis = 0)
#array([5, 7, 9])
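If the expected result is [5, 7, 9], which is the column-wise sum of the selected rows, then it is easy:
img = np.asarray(img)
indices = [[0, 0], [0, 1]]
# Pass a tuple: recent NumPy versions treat a bare list of lists as a single array index
img[tuple(indices)].sum(axis = 0)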
Result:
array([5, 7, 9])
When you supply a fancy index, each element of the index tuple represents a different axis. The shape of the index arrays broadcasts to the shape of the output you get.
In your case, each row of indices is one (row, column) pair, so the rows of indices.T are the indices along each axis. Since indices is a plain list in the question, convert it to an array first. You can then turn the transposed rows into an index tuple and append slice(None), which is the programmatic equivalent of :. You can take the sum of the resulting 2D array directly:
indices = np.asarray(indices)
img[tuple(indices.T) + (slice(None),)].sum(0)
Another way is to use the splat operator:
img[(*indices.T, slice(None))].sum(0)
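The indices in the question happen to be symmetric, so the transpose makes no visible difference there. The sketch below uses a hypothetical pair list, named pairs and not taken from the question, where it does matter.

```python
import numpy as np

img = np.array([
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[2, 2, 2], [3, 2, 3], [6, 7, 6]],
    [[9, 8, 1], [9, 8, 3], [9, 8, 5]],
])

# Hypothetical index pairs: img[0][1] and img[2][0]
pairs = np.array([[0, 1], [2, 0]])

rows, cols = pairs.T          # rows = [0, 2], cols = [1, 0]
picked = img[rows, cols]      # [[4, 5, 6], [9, 8, 1]]
total = picked.sum(axis=0)    # [13, 13, 7]

# Without the transpose, each row of `pairs` is read as the indices for
# one axis, so this selects img[0][2] and img[1][0] instead.
wrong = img[tuple(pairs)]     # [[7, 8, 9], [2, 2, 2]]

print(total)
```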

Intersperse items of a numpy array

I have a 3 dimensional numpy array similar to this:
a = np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]],
[[9, 10],
[11, 12]]])
What I'd like to do is intersperse each 2D array contained inside the outer array to produce this result:
t = np.array([[[1, 2], [5, 6], [9, 10]],
[[3, 4], [7, 8], [11, 12]]])
I could do this in Python like this, but I'm hoping there's a more efficient, numpy version:
t = np.empty((a.shape[1], a.shape[0], a.shape[2]), a.dtype)
for i, x in np.ndenumerate(a):
t[i[1], i[0], i[2]] = x
As @UdayrajDeshmukh said, you can use the transpose method (which, despite the name that evokes the "transpose" operator in linear algebra, is better understood as "permuting the axes"):
>>> t = a.transpose(1, 0, 2)
>>> t
array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[ 3, 4],
[ 7, 8],
[11, 12]]])
The newly created object t is a view into a's data with a different permutation of indices, so no data is copied. To replicate your own example, which builds an independent array, you need to copy it, e.g. t = a.transpose(1, 0, 2).copy()
Try the transpose function. You simply swap the first two axes.
t = np.transpose(a, axes=(1, 0, 2))
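A short sketch of the view-versus-copy distinction, using np.shares_memory to check whether two arrays use the same buffer:

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 2, 2)   # same values as the array in the question

view = a.transpose(1, 0, 2)             # permuted axes, no data copied
copy = a.transpose(1, 0, 2).copy()      # independent array

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False

view[0, 0, 0] = 100    # writes through to a
print(a[0, 0, 0])      # 100
print(copy[0, 0, 0])   # 1
```

If the result will be modified later and a must stay untouched, the copy is the safer choice; otherwise the view avoids allocating a second array.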

Numpy Search & Slice 3D Array

I'm very new to Python & Numpy and am trying to accomplish the following:
Given, 3D Array:
arr_3d = [[[1,2,3],[4,5,6],[0,0,0],[0,0,0]],
          [[3,2,1],[0,0,0],[0,0,0],[0,0,0]],
          [[1,2,3],[4,5,6],[7,8,9],[0,0,0]]]
arr_3d = np.array(arr_3d)
Get the indices where [0,0,0] appears in the given 3D array.
Slice the given 3D array from where [0,0,0] appears first.
In other words, I'm trying to remove the padding (In this case: [0,0,0]) from the given 3D array.
Here is what I have tried,
arr_zero = np.zeros(3)
for index in range(0, len(arr_3d)):
rows, cols = np.where(arr_3d[index] == arr_zero)
arr_3d[index] = np.array(arr_3d[0][:rows[0]])
But doing this, I keep getting the following error:
Could not broadcast input array from shape ... into shape ...
I'm expecting something like this:
[[[1,2,3],[4,5,6]],
 [[3,2,1]],
 [[1,2,3],[4,5,6],[7,8,9]]]
Any help would be appreciated.
Get the first occurrence of those indices with an all() reduction along with argmax(), and then slice each 2D slice off the 3D array:
In [106]: idx = (arr_3d == [0,0,0]).all(-1).argmax(-1)
# Output as list of arrays
In [107]: [a[:i] for a,i in zip(arr_3d,idx)]
Out[107]:
[array([[1, 2, 3],
[4, 5, 6]]), array([[3, 2, 1]]), array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])]
# Output as list of lists
In [108]: [a[:i].tolist() for a,i in zip(arr_3d,idx)]
Out[108]: [[[1, 2, 3], [4, 5, 6]], [[3, 2, 1]], [[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
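One edge case: argmax() returns 0 when a 2D slice contains no all-zero row, which would cut that slice down to nothing. A minimal sketch that guards against this follows; the helper name strip_zero_padding is made up for illustration.

```python
import numpy as np

def strip_zero_padding(arr_3d):
    """Cut each 2D slice at its first all-zero row (hypothetical helper)."""
    is_pad = (arr_3d == 0).all(axis=-1)    # (n_slices, n_rows) boolean mask
    first_pad = is_pad.argmax(axis=-1)     # first padding row, or 0 if none
    has_pad = is_pad.any(axis=-1)
    # Where no padding row exists, keep every row instead of slicing to empty.
    stop = np.where(has_pad, first_pad, arr_3d.shape[1])
    return [block[:n] for block, n in zip(arr_3d, stop)]

arr = np.array([[[1, 2, 3], [4, 5, 6], [0, 0, 0]],
                [[3, 2, 1], [0, 0, 0], [0, 0, 0]],
                [[1, 2, 3], [4, 5, 6], [7, 8, 9]]])   # last slice has no padding

result = strip_zero_padding(arr)
print([r.tolist() for r in result])
```

The result is a list rather than an array because the trimmed slices have different lengths.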

What does x=x[class_id] do when used on NumPy arrays

I am learning Python and solving a machine learning problem.
class_ids=np.arange(self.x.shape[0])
np.random.shuffle(class_ids)
self.x=self.x[class_ids]
This is a shuffle function using NumPy, but I can't understand what self.x=self.x[class_ids] means, because I think it just assigns the value of the array to a variable.
It's a very complicated way to shuffle the first dimension of your self.x. For example:
>>> x = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
>>> x
array([[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5]])
Then using the mentioned approach
>>> class_ids=np.arange(x.shape[0]) # create an array [0, 1, 2, 3, 4]
>>> np.random.shuffle(class_ids) # shuffle the array
>>> x[class_ids] # use integer array indexing to shuffle x
array([[5, 5],
[3, 3],
[1, 1],
[4, 4],
[2, 2]])
Note that the same could be achieved just by using np.random.shuffle because the docstring explicitly mentions:
This function only shuffles the array along the first axis of a multi-dimensional array. The order of sub-arrays is changed but their contents remains the same.
>>> np.random.shuffle(x)
>>> x
array([[5, 5],
[3, 3],
[1, 1],
[2, 2],
[4, 4]])
or by using np.random.permutation:
>>> class_ids = np.random.permutation(x.shape[0]) # shuffle the first dimensions indices
>>> x[class_ids]
array([[2, 2],
[4, 4],
[3, 3],
[5, 5],
[1, 1]])
Assuming self.x is a numpy array:
class_ids is a 1-d numpy array that is being used as an integer array index in the expression: x[class_ids]. Because the previous line shuffled class_ids, x[class_ids] evaluates to self.x shuffled by rows.
The assignment self.x=self.x[class_ids] assigns the shuffled array to self.x
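One reason to shuffle through an index array rather than in place is that the same indices can reorder a second array identically. The sketch below assumes a hypothetical label array y alongside x; neither y nor the seed comes from the question.

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded only to make the sketch repeatable

x = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]])
y = np.array([10, 20, 30, 40, 50])   # hypothetical labels: y[i] belongs to x[i]

class_ids = rng.permutation(x.shape[0])   # shuffled copy of [0, 1, 2, 3, 4]
x_shuffled = x[class_ids]   # integer array indexing returns a new array
y_shuffled = y[class_ids]   # same order, so each row keeps its label

print(x_shuffled)
print(y_shuffled)
```

The original x and y are left unchanged; only the assignment back to self.x in the question's code replaces the old array.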

How to delete column in 3d numpy array

I have a numpy array that looks like this
[
[[1,2,3], [4,5,6]],
[[3,8,9], [2,9,4]],
[[7,1,3], [1,3,6]]
]
I want it like this after deleting first column
[
[[2,3], [5,6]],
[[8,9], [9,4]],
[[1,3], [3,6]]
]
so currently the dimension is 3*3*3, after removing the first column it should be 3*3*2
You can slice it as so, where 1: signifies that you only want the second and all remaining columns from the inner most array (i.e. you 'delete' its first column).
>>> a[:, :, 1:]
array([[[2, 3],
[5, 6]],
[[8, 9],
[9, 4]],
[[1, 3],
[3, 6]]])
Since you are using numpy, I'll mention the numpy way of doing this. First of all, the dimensions you have specified in the question seem wrong. See below:
x = np.array([
[[1,2,3], [4,5,6]],
[[3,8,9], [2,9,4]],
[[7,1,3], [1,3,6]]
])
The shape of x is
x.shape
(3, 2, 3)
You can use numpy.delete to remove a column as shown below
a = np.delete(x, 0, 2)
a
array([[[2, 3],
[5, 6]],
[[8, 9],
[9, 4]],
[[1, 3],
[3, 6]]])
To find the shape of a
a.shape
(3, 2, 2)
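The two answers produce equal arrays but differ in memory behaviour, which this short sketch checks with np.shares_memory:

```python
import numpy as np

x = np.array([
    [[1, 2, 3], [4, 5, 6]],
    [[3, 8, 9], [2, 9, 4]],
    [[7, 1, 3], [1, 3, 6]],
])

sliced = x[:, :, 1:]                  # basic slicing returns a view of x
deleted = np.delete(x, 0, axis=2)     # np.delete returns a new array

print(np.array_equal(sliced, deleted))   # True
print(np.shares_memory(x, sliced))       # True
print(np.shares_memory(x, deleted))      # False
```

Slicing is enough when a leading or trailing column is dropped; np.delete is more convenient for removing a column from the middle.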
