Iterate over numpy array columnwise - python

np.nditer automatically iterates over the elements of an array row-wise. Is there a way to iterate over the elements of an array column-wise?
x = np.array([[1,3],[2,4]])
for i in np.nditer(x):
    print(i)
# 1
# 3
# 2
# 4
What I want is:
for i in columnwise_iteration(x):  # pseudocode: the iterator I am looking for
    print(i)
# 1
# 2
# 3
# 4
Is my best bet just to transpose my array before doing the iteration?

For completeness, you don't necessarily have to transpose the matrix before iterating through its elements. np.nditer lets you specify the iteration order. The default is usually row-major (C-like) order. You can override this and choose column-major (Fortran-like) order, which is what you want. Simply pass the additional argument order='F' to np.nditer:
In [16]: x = np.array([[1,3],[2,4]])
In [17]: for i in np.nditer(x, order='F'):
   ....:     print(i)
   ....:
1
2
3
4
You can read more about how to control the order of iteration here: http://docs.scipy.org/doc/numpy-1.10.0/reference/arrays.nditer.html#controlling-iteration-order
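For reference, here is a minimal Python 3 sketch of the same idea. It also compares the nditer result with flattening the array in Fortran order, which gives the same sequence without an explicit iterator. The variable names f_order and flat_f are only illustrative.

```python
import numpy as np

x = np.array([[1, 3], [2, 4]])

# Column-major traversal with nditer; each item is a 0-d array, so convert to int.
f_order = [int(v) for v in np.nditer(x, order='F')]

# The same sequence without nditer: flatten in Fortran (column-major) order.
flat_f = x.ravel(order='F').tolist()

print(f_order)  # [1, 2, 3, 4]
print(flat_f)   # [1, 2, 3, 4]
```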

You could use the shape and slice each column
>>> [x[:, i] for i in range(x.shape[1])]
[array([1, 2]), array([3, 4])]

You could transpose it?
>>> x = np.array([[1,3],[2,4]])
>>> [y for y in x.T]
[array([1, 2]), array([3, 4])]
Or less elegantly:
>>> [np.array([x[j,i] for j in range(x.shape[0])]) for i in range(x.shape[1])]
[array([1, 2]), array([3, 4])]

nditer is not the best iteration tool for this case. It is useful when working toward a compiled (Cython) solution, but not in pure Python code.
Look at some regular iteration strategies:
In [832]: x=np.array([[1,3],[2,4]])
In [833]: x
Out[833]:
array([[1, 3],
       [2, 4]])
In [834]: for i in x: print(i)  # print each row
[1 3]
[2 4]
In [835]: for i in x.T: print(i)  # print each column
[1 2]
[3 4]
In [836]: for i in x.ravel(): print(i)  # print values in order
1
3
2
4
In [837]: for i in x.T.ravel(): print(i)  # print values in column order
1
2
3
4
You commented: "I need to fill values into an array based on the index of each cell in the array."
What do you mean by index?
A crude 2d iteration with indexing:
In [838]: for i in range(2):
   .....:     for j in range(2):
   .....:         print((i, j), x[i, j])
(0, 0) 1
(0, 1) 3
(1, 0) 2
(1, 1) 4
ndindex uses nditer to generate similar indexes
In [841]: for i,j in np.ndindex(x.shape):
   .....:     print((i, j), x[i, j])
   .....:
(0, 0) 1
(0, 1) 3
(1, 0) 2
(1, 1) 4
enumerate is a good Python way of getting both values and indexes:
In [847]: for i,v in enumerate(x): print(i, v)
0 [1 3]
1 [2 4]
Or you can use meshgrid to generate all the indexes, as arrays
In [843]: I,J=np.meshgrid(range(2),range(2))
In [844]: I
Out[844]:
array([[0, 1],
       [0, 1]])
In [845]: J
Out[845]:
array([[0, 0],
       [1, 1]])
In [846]: x[I,J]
Out[846]:
array([[1, 2],
       [3, 4]])
Note that most of these iterative methods just treat your array as a list of lists. They don't take advantage of the array nature, and will be slow compared to methods that work with the whole x.
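As a rough illustration of working with the whole array, here is a sketch that fills an array from the index of each cell without a Python loop. The fill rule (10 * row + column) is a made-up example, since the question does not say what the values should be.

```python
import numpy as np

shape = (2, 3)

# np.indices returns one index array per axis, each with the target shape.
rows, cols = np.indices(shape)

# Hypothetical rule: each cell holds 10 * row index + column index.
filled = 10 * rows + cols

print(filled.tolist())  # [[0, 1, 2], [10, 11, 12]]
```

Any expression built from rows and cols is evaluated for all cells at once, which is usually much faster than visiting the cells one by one.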

Related

Get part of np array with parameters

I am using python and numpy. I am using n dimensional array.
I want to select all elements with index like
arr[a,b,:,c]
but I want to be able to select slice position like parameter. For example if the parameter
#pos =2
arr[a,b,:,c]
#pos =1
arr[a,:,b,c]
I would move the axis of interest (at pos) to the front with numpy.moveaxis(array, pos, 0) and then simply slice with [:,a,b,c].
There is also numpy.take, but in your case you would still need to loop over each dimension a,b,c, so I think moveaxis is more convenient. Maybe there is an even more direct way to do this.
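A short sketch of the moveaxis approach, assuming a, b and c are scalar indices; the array contents and the index values are chosen only for illustration.

```python
import numpy as np

arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
a, b, c = 1, 2, 3   # illustrative scalar indices
pos = 2             # axis to keep as the full slice

# Move the axis at `pos` to the front, then index the remaining axes in order.
moved = np.moveaxis(arr, pos, 0)
result = moved[:, a, b, c]

print(result.shape)                              # (4,)
print(np.array_equal(result, arr[a, b, :, c]))   # True
```

np.moveaxis returns a view, so no data is copied when the axis is moved.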
The idea of moving the slicing axis to one end is a good one. Various numpy functions use that idea.
In [171]: arr = np.ones((2,3,4,5),int)
In [172]: arr[0,0,:,0].shape
Out[172]: (4,)
In [173]: arr[0,:,0,0].shape
Out[173]: (3,)
Another idea is to build an indexing tuple:
In [176]: idx = (0,0,slice(None),0)
In [177]: arr[idx].shape
Out[177]: (4,)
In [178]: idx = (0,slice(None),0,0)
In [179]: arr[idx].shape
Out[179]: (3,)
To do this programmatically it may be easier to start with a list or array that can be modified, and then convert it to a tuple for indexing. Details will vary depending on how you prefer to specify the axis and variables.
If any of a,b,c are arrays (or lists), you may get some shape surprises, since it's a case of mixing advanced and basic indexing. But as long as they are scalars, that's not an issue.
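One possible way to build the indexing tuple programmatically is sketched below. The helper name slice_at and its signature are hypothetical, and it assumes all the other indices are scalars.

```python
import numpy as np

def slice_at(arr, pos, scalars):
    """Hypothetical helper: index every axis with a scalar except axis `pos`."""
    idx = list(scalars)            # e.g. [a, b, c]
    idx.insert(pos, slice(None))   # put ':' at the requested axis
    return arr[tuple(idx)]         # convert to a tuple for multi-axis indexing

arr = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
a, b, c = 1, 2, 3

out2 = slice_at(arr, 2, (a, b, c))   # same as arr[a, b, :, c]
out1 = slice_at(arr, 1, (a, b, c))   # same as arr[a, :, b, c]

print(out2.shape, out1.shape)  # (4,) (3,)
```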
You could np.transpose the array arr before slicing it, so that the axis of interest (i.e. the :) moves to the back. This way, you can rearrange arr such that you can always call arr[a,b,c].
Example with only a and b:
import numpy as np
a = 0
b = 2
target_axis = 1
# Generate some random data
arr = np.random.randint(10, size=[3, 3, 3], dtype=int)
print(arr)
#[[[0 8 2]
# [3 9 4]
# [0 3 6]]
#
# [[8 5 4]
# [9 8 5]
# [8 6 1]]
#
# [[2 2 5]
# [5 3 3]
# [9 1 8]]]
# Define transpose s.t. target_axis is the last axis
transposed_shape = np.arange(arr.ndim)
transposed_shape = np.delete(transposed_shape, target_axis)
transposed_shape = np.append(transposed_shape, target_axis)
print(transposed_shape)
#[0 2 1]
# Caution! These 0 and 2 above do not come from a or b.
# Instead they are the indices of the axes.
# Transpose arr
arr_T = np.transpose(arr, transposed_shape)
print(arr_T)
#[[[0 3 0]
# [8 9 3]
# [2 4 6]]
#
# [[8 9 8]
# [5 8 6]
# [4 5 1]]
#
# [[2 5 9]
# [2 3 1]
# [5 3 8]]]
print(arr_T[a,b])
#[2 4 6]

How to define arrays in a list?

I have the code snippet shown below. After it runs, I want the original arrays (x1, x2, y1, y2, etc.) to be changed accordingly, which currently does not happen. Is there any way to do this? For example, after the code executes, x1 remains unchanged, but I want it to change whenever list1[0] changes.
import numpy as np
x1=np.array([10,2,10,5,10,7,10,6])
y1=np.array([2,3,6,5,8,9,7,8])
r1=np.array([0,4,0,3,0,5,0,3])
x2=np.array([10,3,10,6,10,8,10,7])
y2=np.array([2,3,6,5,8,9,7,8])
r2=np.array([0,5,0,7,0,9,0,3])
list1=[x1,x2]
list2=[y1,y2]
list3=[r1,r2]
for plane in range(0,2):
    x=list1[plane]
    y=list2[plane]
    r=list3[plane]
    comb=np.array([x,y,r])
    comb=np.transpose(comb)
    combsort=comb[np.argsort(comb[:,0])]
    combsort=combsort.transpose()
    x=combsort[0]
    y=combsort[1]
    r=combsort[2]
    ind1=np.where(x==10)
    ind2=ind1[0]
    if(ind2.size):
        indd=ind2[0]
        x[indd:indd+len(ind2)]=np.ones(len(ind2))
        y[indd:indd+len(ind2)]=np.ones(len(ind2))
        r[indd:indd+len(ind2)]=np.ones(len(ind2))
    list1[plane]=x
    list2[plane]=y
    list3[plane]=r
print(x1)
print(list1[0])
Output
[10 2 10 5 10 7 10 6]
[2 5 6 7 1 1 1 1]
You can use numpy's column stack
np.column_stack((x1, y1))
A much simpler case:
In [63]: x=np.array([1,2,3])
In [64]: alist = [x]
In [65]: alist
Out[65]: [array([1, 2, 3])]
In [66]: alist[0]
Out[66]: array([1, 2, 3])
In [67]: alist[0][:] = [4,5,6] # this changes elements of the array
In [68]: alist
Out[68]: [array([4, 5, 6])]
In [69]: x
Out[69]: array([4, 5, 6])
This, in contrast, replaces the element of the list itself (here with a list of a different size), so x is unaffected:
In [70]: alist[0]=[7,8]
In [71]: alist
Out[71]: [[7, 8]]
In [72]: x
Out[72]: array([4, 5, 6])
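Applied to the question, the difference between rebinding a list slot and writing into the array it holds can be sketched as follows. This is a reduced example with a single short array, not the asker's full code.

```python
import numpy as np

x1 = np.array([10, 2, 10, 5])
list1 = [x1]

# Rebinding the list slot to a new array leaves x1 untouched.
sorted_copy = np.sort(list1[0])
list1[0] = sorted_copy
print(x1.tolist())   # [10, 2, 10, 5]

# Writing through a slice copies the values into x1's own buffer.
list1[0] = x1               # point the slot back at the original array
list1[0][:] = sorted_copy   # in-place assignment; shapes must match
print(x1.tolist())   # [2, 5, 10, 10]
```

In the question's loop, this would correspond to assigning with list1[plane][:] = x instead of list1[plane] = x, assuming the new values have the same shape as the original array.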

Looping through Numpy Array elements

Is there a more readable way to code a loop in Python that goes through each element of a Numpy array? I have come up with the following code, but it seems cumbersome & not very readable:
import numpy as np
arr01 = np.random.randint(1,10,(3,3))
for i in range(0,(np.shape(arr01[0])[0]+1)):
    for j in range(0,(np.shape(arr01[1])[0]+1)):
        print (arr01[i,j])
I could make it more explicit such as:
import numpy as np
arr01 = np.random.randint(1,10,(3,3))
rows = np.shape(arr01[0])[0]
cols = np.shape(arr01[1])[0]
for i in range(0, (rows + 1)):
    for j in range(0, (cols + 1)):
        print (arr01[i,j])
However, that still seems a bit more cumbersome, compared to other languages, i.e. an equivalent code in VBA could read (supposing the array had already been populated):
dim i, j as integer
for i = lbound(arr01,1) to ubound(arr01,1)
    for j = lbound(arr01,2) to ubound(arr01,2)
        msgBox arr01(i, j)
    next j
next i
You should use NumPy's nditer function if you don't need the index values.
for elem in np.nditer(arr01):
    print(elem)
EDIT: If you need indexes (as a tuple for 2D table), then:
for index, elem in np.ndenumerate(arr01):
    print(index, elem)
Seems like you've skipped over some intro Python chapters. With a list there are several simple ways of iterating:
In [1]: alist = ['a','b','c']
In [2]: for i in alist: print(i) # on the list itself
a
b
c
In [3]: len(alist)
Out[3]: 3
In [4]: for i in range(len(alist)): print(i,alist[i]) # index is ok
0 a
1 b
2 c
In [5]: for i,v in enumerate(alist): print(i,v) # but enumerate is simpler
0 a
1 b
2 c
Note the indexes. range(3) is sufficient. alist[3] produces an error.
In [6]: arr = np.arange(6).reshape(2,3)
In [7]: arr
Out[7]:
array([[0, 1, 2],
       [3, 4, 5]])
In [8]: for row in arr:
   ...:     for col in row:
   ...:         print(row,col)
   ...:
[0 1 2] 0
[0 1 2] 1
[0 1 2] 2
[3 4 5] 3
[3 4 5] 4
[3 4 5] 5
The shape is a tuple. The row count is then arr.shape[0], and columns arr.shape[1]. Or you can 'unpack' both at once:
In [9]: arr.shape
Out[9]: (2, 3)
In [10]: n,m = arr.shape
In [11]: [arr[i,j] for i in range(n) for j in range(m)]
Out[11]: [0, 1, 2, 3, 4, 5]
But we can get the same flat list of values with ravel and optional conversion to list:
In [12]: arr.ravel()
Out[12]: array([0, 1, 2, 3, 4, 5])
In [13]: arr.ravel().tolist()
Out[13]: [0, 1, 2, 3, 4, 5]
But usually with numpy arrays, you shouldn't be iterating at all. Learn enough of the numpy basics so you can work with the whole array, not elements.
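To make "working with the whole array" concrete, here is a small sketch that computes the same result with a nested loop and with a single array expression. The operation itself (square and add one) is an arbitrary example.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# Element-by-element version.
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] ** 2 + 1

# Whole-array version: the expression applies to every element at once.
vectorized = arr ** 2 + 1

print(vectorized.tolist())                  # [[1, 2, 5], [10, 17, 26]]
print(np.array_equal(looped, vectorized))   # True
```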
nditer can be used, as the other answer shows, to iterate through an array in a flat manner, but there are a number of details about it that could easily confuse a beginner. There are a couple of intro pages to nditer, but they should be read in full. Usually I discourage its use.
In [14]: for i in np.nditer(arr):
    ...:     print(i, type(i), i.shape)
    ...:
0 <class 'numpy.ndarray'> () # this element is a 0d array, not a scalar integer
1 <class 'numpy.ndarray'> ()
2 <class 'numpy.ndarray'> ()
...
Iterating with ndenumerate or over the result of tolist() produces different element types. The type may matter if you do more than display the value, so be careful.
In [15]: list(np.ndenumerate(arr))
Out[15]: [((0, 0), 0), ((0, 1), 1), ((0, 2), 2), ((1, 0), 3), ((1, 1), 4), ((1, 2), 5)]
In [16]: for ij, v in np.ndenumerate(arr):
    ...:     print(ij, v, type(v))
    ...:
(0, 0) 0 <class 'numpy.int64'>
(0, 1) 1 <class 'numpy.int64'>
...
In [17]: for i, v in enumerate(arr.ravel().tolist()):
    ...:     print(i, v, type(v))
    ...:
0 0 <class 'int'>
1 1 <class 'int'>
...

What's the most efficient way to split up a Numpy ndarray using percentage?

Hi I'm new to Python & Numpy and I'd like to ask what is the most efficient way to split a ndarray into 3 parts: 20%, 60% and 20%
import numpy as np
row_indices = np.random.permutation(10)
Let's assume the ndarray has 10 items: [7 9 3 1 2 4 5 6 0 8]
The expected results are the ndarray separated into 3 parts like part1, part2 and part3.
part1: [7 9]
part2: [3 1 2 4 5 6]
part3: [0 8]
Here's one way -
# data array
In [85]: a = np.array([7, 9, 3, 1, 2, 4, 5, 6, 0, 8])
# percentages (ratios) array
In [86]: p = np.array([0.2,0.6,0.2]) # must sum up to 1
In [87]: np.split(a,(len(a)*p[:-1].cumsum()).astype(int))
Out[87]: [array([7, 9]), array([3, 1, 2, 4, 5, 6]), array([0, 8])]
Alternative to np.split :
np.split could be slower when working with large data, so, we could alternatively use a loop there -
split_idx = np.r_[0,(len(a)*p.cumsum()).astype(int)]
out = [a[i:j] for (i,j) in zip(split_idx[:-1],split_idx[1:])]
I normally just go for the most obvious solution, although there are much fancier ways to do the same. It takes a second to implement and doesn't even require debugging (since it's extremely simple)
# Boundaries are cumulative: 0-20%, 20-80%, 80-100%
part1 = [a[i, ...] for i in range(int(a.shape[0] * 0.2))]
part2 = [a[i, ...] for i in range(int(a.shape[0] * 0.2), int(len(a) * 0.8))]
part3 = [a[i, ...] for i in range(int(a.shape[0] * 0.8), len(a))]
A few things to note, though:
This is rounded, so you may get only roughly a 20-60-20 split.
You get back a list of elements, so you might have to convert it back to an array with np.asarray().
You can use this method to index multiple objects (e.g. labels and inputs) for the same elements.
If you build the indices once before the splits (indices = list(range(a.shape[0]))), you can also shuffle them, taking care of data shuffling at the same time.
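As a sketch of how the rounding concern could be handled, the split boundaries can be rounded to the nearest integer before calling np.split, so that a product such as 5.999... does not truncate to 5. The names fractions and bounds are illustrative, and the fractions are assumed to sum to 1.

```python
import numpy as np

a = np.array([7, 9, 3, 1, 2, 4, 5, 6, 0, 8])
fractions = np.array([0.2, 0.6, 0.2])   # assumed to sum to 1

# Cumulative boundaries (20%, 80%), rounded to the nearest index.
bounds = np.rint(len(a) * fractions.cumsum()[:-1]).astype(int)

part1, part2, part3 = np.split(a, bounds)
print(part1.tolist(), part2.tolist(), part3.tolist())
# [7, 9] [3, 1, 2, 4, 5, 6] [0, 8]
```

The same bounds array can be reused to split a second array of the same length (for example labels), so both are cut at identical positions.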

Next argmax values in python

I have a function that returns the argmax from a large 2d array
getMax = np.argmax(dist, axis=1)
However I want to get the next biggest values, is there a way of removing the getMax values from the original array and then performing argmax again?
Use the command np.argsort(a, axis=-1, kind='quicksort', order=None), but with appropriate choice of arguments (below).
From the documentation: "It returns an array of indices of the same shape as a that index data along the given axis in sorted order."
The default order is small to large. So sort with -dist (for quick coding). Caution: doing -dist causes a new array to be generated which you may care about if dist is huge. See bottom of post for a better alternative there.
Here is an example:
x = np.array([[1,2,5,0],[5,7,2,3]])
L = np.argsort(-x, axis=1)
print(L)
[[2 1 0 3]
 [1 0 3 2]]
x
array([[1, 2, 5, 0],
       [5, 7, 2, 3]])
So the n'th entry in a row of L gives the locations of the n'th largest element of x.
x is unchanged.
L[:,0] will give the same output as np.argmax(x, axis=1)
L[:,0]
array([2, 1])
np.argmax(x,axis=1)
array([2, 1])
and L[:,1] will give the same as a hypothetical argsecondmax(x)
L[:,1]
array([1, 0])
If you don't want to generate a new array, and so don't want to use -x:
L = np.argsort(x, axis=1)
print(L)
[[3 0 1 2]
 [2 3 0 1]]
L[:,-1]
array([2, 1])
L[:,-2]
array([1, 0])
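To go from the index of the second-largest entry to the value itself, one option is to pair each row number with its column index. The sketch below reuses x from above; second_idx and second_val are illustrative names.

```python
import numpy as np

x = np.array([[1, 2, 5, 0], [5, 7, 2, 3]])

order = np.argsort(x, axis=1)   # ascending indices per row
second_idx = order[:, -2]       # column of the second-largest value in each row

# Pick one value per row: row i, column second_idx[i].
second_val = x[np.arange(x.shape[0]), second_idx]

print(second_idx.tolist())  # [1, 0]
print(second_val.tolist())  # [2, 5]
```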
If speed is important to you, using argpartition rather than argsort could be useful.
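For example, to return the n largest elements from an array:
import numpy as np
n = 10  # example value; number of largest elements to keep
l = np.random.randint(0, 101, int(1e6))  # random integers in [0, 100]
top_n_1 = l[np.argsort(-l)[0:n]]  # sorted, largest first
top_n_2 = l[np.argpartition(l, -n)[-n:]]  # same values, in no guaranteed order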
The %timeit function in ipython reports
10 loops, best of 3: 56.9 ms per loop for top_n_1 and 100 loops, best of 3: 8.06 ms per loop for top_n_2.
I hope this is useful.
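For the 2D case in the question, argpartition can also be applied along axis=1. The following is a sketch under assumed data: dist is filled with random integers here, and n = 3 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
dist = rng.integers(0, 100, size=(4, 50))   # illustrative data
n = 3                                       # how many largest values per row

# Indices of the n largest entries in each row, in no particular order.
top_idx = np.argpartition(dist, -n, axis=1)[:, -n:]
top_val = np.take_along_axis(dist, top_idx, axis=1)

# Sort only the n candidates if descending order is needed.
top_val_sorted = -np.sort(-top_val, axis=1)

print(top_val_sorted.shape)  # (4, 3)
```

The first column of top_val_sorted then holds the row maxima, the second column the next-largest values, and so on.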
