So I have an array:
array([[[27, 27, 28],
[27, 14, 28]],
[[14, 5, 4],
[ 5, 6, 14]]])
How can I iterate through it and get the [a, b, c] values on each iteration? I tried this:
for v in np.nditer(a):
    print(v)
but it just prints
27
27
28
27
14
28
14
5
4
5
6
I need:
[27 27 28]
[27 14 28]...
b = a.reshape(-1, 3)
for triplet in b:
    ...
Apparently you want to iterate on the first 2 dimensions of the array, returning the 3rd (as 1d array).
In [242]: y = np.array([[[27, 27, 28],
...: [27, 14, 28]],
...:
...: [[14, 5, 4],
...: [ 5, 6, 14]]])
Double loops are fine, as is reshaping to (4,3) and iterating.
nditer isn't usually needed, or encouraged as an iteration mechanism (its documentation needs a stronger disclaimer). It's really meant for C level code. It isn't used much in Python level code. One exception is the np.ndindex function, which can be useful in this case:
In [244]: for ij in np.ndindex(y.shape[:2]):
...:     print(ij, y[ij])
...:
(0, 0) [27 27 28]
(0, 1) [27 14 28]
(1, 0) [14 5 4]
(1, 1) [ 5 6 14]
ndindex uses nditer in multi_index mode on a temp array of the specified shape.
Where possible try to work without iteration. Iteration, with any of these tricks, is relatively slow.
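As a small illustration of working without iteration, here is a sketch that computes one value per triplet twice: once with an ndindex loop and once with a single whole-array call. The per-triplet sum is only a stand-in for whatever you actually need to compute; it is an assumption, not something taken from the question.

```python
import numpy as np

y = np.array([[[27, 27, 28],
               [27, 14, 28]],
              [[14, 5, 4],
               [5, 6, 14]]])

# Loop version: visit every (i, j) position and reduce the triplet found there.
looped = np.empty(y.shape[:2], dtype=y.dtype)
for ij in np.ndindex(y.shape[:2]):
    looped[ij] = y[ij].sum()

# Whole-array version: reduce over the last axis in one call.
vectorized = y.sum(axis=-1)

print(vectorized)
```

Both versions give the same (2, 2) result; the second one simply leaves the looping to compiled code.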
You could do something ugly like this:
for i in range(len(your_array)):
    for j in range(len(your_array[i])):
        print(your_array[i][j])
Think of it as having arrays within an array. So within array v you have array a, which in turn contains the triplets b.
import numpy as np
na = np.array
v=na([[[27, 27, 28], [27, 14, 28]], [[14, 5, 4],[ 5, 6, 14]]])
for a in v:
    for b in a:
        print(b)
Output:
[27 27 28]
[27 14 28]
[14 5 4]
[ 5 6 14]
Alternatively you could do the following,
v2 = [b for a in v for b in a]
Now all your triplets are stored in v2
[array([27, 27, 28]),
array([27, 14, 28]),
array([14, 5, 4]),
array([ 5, 6, 14])]
and you can access them like the elements of a 1D array, e.g.
print(v2[0])
gives
[27 27 28]
Another alternative (useful for arbitrary dimensionality of the array containing the n-tuples):
a_flat = a.ravel()
n = 3
# step by n so that consecutive slices do not overlap
[a_flat[i:i+n] for i in range(0, len(a_flat), n)]
or in one line (slower):
[a.ravel()[i:i+n] for i in range(0, a.size, n)]
or for further usage within a loop:
for i in range(0, a.size, n):
    print(a.ravel()[i:i+n])
Reshape the array A (whose shape is (n1, n2, 3)) to an array B of shape (n1 * n2, 3), and iterate through B. Note that B is just a view of A: the two share the same data block in memory and differ only in the array header that records the shape, so changing values in B also changes A. The code is below:
a = np.array([[[27, 27, 28],[27, 14, 28]],
              [[14, 5, 4],[ 5, 6, 14]]])
b = a.reshape((-1, 3))
for last_d in b:
    x, y, z = last_d  # separate names, so a and b are not rebound inside the loop
    # do something with last_d or x, y, z
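To make the view behaviour concrete, here is a minimal sketch. It assumes a freshly created, contiguous array like the one in the question; for non-contiguous arrays reshape may have to copy instead. The names shares, changed and c are only illustrative.

```python
import numpy as np

a = np.array([[[27, 27, 28], [27, 14, 28]],
              [[14, 5, 4], [5, 6, 14]]])
b = a.reshape(-1, 3)

# For a contiguous array, reshape returns a view onto the same buffer.
shares = np.shares_memory(a, b)

# Writing through the view is visible in the original array.
b[0, 0] = -1
changed = a[0, 0, 0]

# Copy explicitly when the triplets must be independent of a.
c = a.reshape(-1, 3).copy()
c[0, 0] = 99

print(shares, changed, a[0, 0, 0])
```

If you only read the triplets, the view is the cheaper option; the copy matters only when you modify them and want a left untouched.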
I have a numpy array named "a":
a = numpy.array([
[[1, 2, 3], [11, 22, 33]],
[[4, 5, 6], [44, 55, 66]],
])
I want to print the following (in this exact format):
1 2 3
11 22 33
4 5 6
44 55 66
To accomplish this, I wrote the following:
for i in range(len(a)):
    sub = a[i]
    for j in range(len(sub)):
        a1 = sub[j][0]
        a2 = sub[j][1]
        a3 = sub[j][2]
        print(a1, a2, a3)
The output is:
1 2 3
11 22 33
4 5 6
44 55 66
I would like to vectorize my solution (if possible) and discard the for loop. I understand that this problem might not benefit from vectorization. In reality (for work-related purposes), the array "a" has 52 elements and each element contains hundreds of arrays stored inside. I'd like to solve a basic/trivial case and move onto a more advanced, realistic case.
Also, I know that Numpy arrays were not meant to be iterated through.
I could have used Python lists to accomplish the following, but I really want to vectorize this (if possible, of course).
You could use np.apply_along_axis, which applies a function to the 1-D slices of an array along an arbitrary axis. Apply it on axis=2 to get the desired result.
Using print directly as the callback:
>>> np.apply_along_axis(print, 2, a)
[1 2 3]
[11 22 33]
[4 5 6]
[44 55 66]
Or with a lambda wrapper:
>>> np.apply_along_axis(lambda r: print(' '.join([str(x) for x in r])), 2, a)
1 2 3
11 22 33
4 5 6
44 55 66
In [146]: a = numpy.array([
...: [[1, 2, 3], [11, 22, 33]],
...: [[4, 5, 6], [44, 55, 66]],
...: ])
...:
In [147]: a
Out[147]:
array([[[ 1, 2, 3],
[11, 22, 33]],
[[ 4, 5, 6],
[44, 55, 66]]])
A proper "vectorized" numpy output is:
In [148]: a.reshape(-1,3)
Out[148]:
array([[ 1, 2, 3],
[11, 22, 33],
[ 4, 5, 6],
[44, 55, 66]])
You could also convert that to a list of lists:
In [149]: a.reshape(-1,3).tolist()
Out[149]: [[1, 2, 3], [11, 22, 33], [4, 5, 6], [44, 55, 66]]
But you want to print without the standard numpy formatting (or list formatting).
That requires iteration, which is easy:
In [150]: for row in a.reshape(-1,3):
...:     print(*row)
...:
1 2 3
11 22 33
4 5 6
44 55 66
Since your desired output is a print, or at least "unformatted" strings, there's no "vectorized", i.e. whole-array, option. You have to iterate on each line!
np.savetxt creates a csv output by iterating on rows and writing a format tuple, e.g. f.write(fmt%tuple(row)).
In [155]: np.savetxt('test', a.reshape(-1,3), fmt='%d')
In [156]: cat test
1 2 3
11 22 33
4 5 6
44 55 66
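If the goal is a single write rather than one print call per row, a sketch like the following builds the whole text first. It still iterates over the rows (inside the generator expression), so it is a convenience rather than vectorization; the name text is just illustrative.

```python
import numpy as np

a = np.array([
    [[1, 2, 3], [11, 22, 33]],
    [[4, 5, 6], [44, 55, 66]],
])

# One string per row of the (n1 * n2, 3) view, then one string for everything.
text = "\n".join(" ".join(map(str, row)) for row in a.reshape(-1, 3))
print(text)
```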
To get that exact output without iterating, try this:
print(str(a.tolist()).replace('], [', '\n').replace('[', '').replace(']', '').replace(',', ''))
Here is the example:
import numpy as np
a = np.array([[1,2],[3,4],[5,6]])
b = np.array([[7,8],[9,10],[11,12],[12,13]])
What I want is to add each row of a to every row of b and then sum the elements of each result. For example, [1,2] should be added to every row of b: 1+7=8, 2+8=10, 8+10=18; 1+9=10, 2+10=12, 10+12=22; and so on. The result should look like [[18,22,26,28...],[22,26,....],[26,30....]]
My question is how to do that. I know numpy can be more efficient than a loop, but how do I express this calculation with matrices?
I believe this does what you want:
>>> a = np.array([[1,2],[3,4],[5,6]])
>>> b = np.array([[7,8],[9,10],[11,12],[12,13]])
>>> np.sum(a, axis=1)[:,None] + np.sum(b, axis=1)[None,:]
array([[18, 22, 26, 28],
[22, 26, 30, 32],
[26, 30, 34, 36]])
You can use list comprehensions:
import numpy as np
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])
[[sum(i) + sum(j) for j in b] for i in a]
Output:
[[18, 22, 26, 28], [22, 26, 30, 32], [26, 30, 34, 36]]
This can be done succinctly as follows:
np.einsum('ijk-> ij', a[:,None,:]+b)
Let me explain each step. It combines the einsum and broadcasting concepts of numpy.
Step 1: a[:,None,:] reshapes matrix a to shape (3,1,2). The middle axis of length 1 is what allows broadcasting.
Step 2: a[:,None,:] + b broadcasts a against b and adds them, giving a matrix of shape (3,4,2).
Step 3: np.einsum('ijk-> ij', a[:,None,:]+b) sums over the last axis of the matrix obtained in the previous step.
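The three steps above can be checked one at a time. This is just a sketch that prints the intermediate shapes; the names a3, pairwise, result and alt are mine, not part of the answer.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7, 8], [9, 10], [11, 12], [12, 13]])

a3 = a[:, None, :]                        # step 1: shape (3, 1, 2)
pairwise = a3 + b                         # step 2: broadcast to shape (3, 4, 2)
result = np.einsum('ijk->ij', pairwise)   # step 3: sum over the last axis

# Same values without building the (3, 4, 2) intermediate:
alt = a.sum(axis=1)[:, None] + b.sum(axis=1)[None, :]

print(a3.shape, pairwise.shape, result.shape)
print(result)
```

For large inputs the second form is probably preferable, since it never materializes the pairwise array.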
I want to extract some members from many large numpy arrays. A simple example is
A = np.arange(36).reshape(6,6)
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
I want to extract the members in a shifted windows in each row with minimum stride 2 and maximum stride 4. For example, in the first row, I would like to have
[2, 3, 4] # A[i,i+2:i+4+1] where i == 0
In the second row, I want to have
[9, 10, 11] # A[i,i+2:i+4+1] where i == 1
In the third row, I want to have
[16, 17, 0]
so the full result is
[[2, 3, 4],
[9, 10, 11],
[16, 17, 0],
[23, 0, 0]]
I want to know efficient ways to do this. Thanks.
You can extract values from an array by providing a list of indices for each dimension. For example, if you want the second diagonal, you can use arr[np.arange(0, len(arr)-1), np.arange(1, len(arr))]
For your example, I would do something like the code below. I did not account for different strides, though; to do so, you can change how the index lists are created. If you struggle with adding the stride functionality, leave a comment and I'll edit this answer.
import numpy as np
def slide(arr, window_len = 3, start = 0):
    # pad array to get zeros out of bounds
    arr_padded = np.zeros((arr.shape[0], arr.shape[1]+window_len-1))
    arr_padded[:arr.shape[0], :arr.shape[1]] = arr
    # compute the number of window moves
    repeats = min(arr.shape[0], arr.shape[1]-start)
    # create index lists
    idx0 = np.arange(repeats).repeat(window_len)
    idx1 = np.concatenate(
        [np.arange(start+i, start+i+window_len)
         for i in range(repeats)])
    return arr_padded[idx0, idx1].reshape(-1, window_len)
A = np.arange(36).reshape(6,6)
print(f'A =\n{A}')
print(f'B =\n{slide(A, start = 2)}')
output:
A =
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]
[30 31 32 33 34 35]]
B =
[[ 2. 3. 4.]
[ 9. 10. 11.]
[16. 17. 0.]
[23. 0. 0.]]
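The index lists can also be built by broadcasting instead of repeat and concatenate. The following is a sketch of that variant, not the answer's exact method; shifted_windows and its fill parameter are hypothetical names, and the padded array keeps the input dtype so the result stays integer.

```python
import numpy as np

def shifted_windows(arr, start=2, window_len=3, fill=0):
    """Row i yields arr[i, i+start : i+start+window_len], padded with fill."""
    n_rows, n_cols = arr.shape
    # Only rows whose window still starts inside the array are kept.
    n_out = min(n_rows, n_cols - start)
    # Pad on the right so out-of-bounds positions read the fill value.
    padded = np.full((n_rows, n_cols + window_len), fill, dtype=arr.dtype)
    padded[:, :n_cols] = arr
    rows = np.arange(n_out)[:, None]               # shape (n_out, 1)
    cols = rows + start + np.arange(window_len)    # shape (n_out, window_len)
    return padded[rows, cols]

A = np.arange(36).reshape(6, 6)
B = shifted_windows(A, start=2, window_len=3)
print(B)
```

Here start plays the role of the minimum offset (2) and start + window_len - 1 the maximum offset (4) from the question.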
This is a follow up on this question.
From a 2d array, create another 2d array composed of randomly selected values from original array (values not shared among rows) without using a loop
I am looking for a way to create a 2D array whose rows are randomly selected unique values (non-repeating) from another row, without using a loop.
Here is a way to do it using a loop.
pool = np.random.randint(0, 30, size=[4,5])
seln = np.empty([4,3], int)
for i in range(0, pool.shape[0]):
    seln[i] = np.random.choice(pool[i], 3, replace=False)
print('pool = ', pool)
print('seln = ', seln)
pool = [[ 1 11 29 4 13]
[29 1 2 3 24]
[ 0 25 17 2 14]
[20 22 18 9 29]]
seln = [[ 8 12 0]
[ 4 19 13]
[ 8 15 24]
[12 12 19]]
Here is a method that does not use a loop; however, it can select the same value multiple times in each row.
pool = np.random.randint(0, 30, size=[4,5])
print(pool)
array([[ 4, 18, 0, 15, 9],
[ 0, 9, 21, 26, 9],
[16, 28, 11, 19, 24],
[20, 6, 13, 2, 27]])
# New array shape
new_shape = (pool.shape[0],3)
# Indices where to randomly choose from
ix = np.random.choice(pool.shape[1], new_shape)
array([[0, 3, 3],
[1, 1, 4],
[2, 4, 4],
[1, 2, 1]])
ixs = (ix.T + range(0,np.prod(pool.shape),pool.shape[1])).T
array([[ 0, 3, 3],
[ 6, 6, 9],
[12, 14, 14],
[16, 17, 16]])
pool.flatten()[ixs].reshape(new_shape)
array([[ 4, 15, 15],
[ 9, 9, 9],
[11, 24, 24],
[ 6, 13, 6]])
I am looking for a method that does not use a loop and in which a value, once selected from a row, cannot be selected again.
Here is a way without explicit looping. However, it requires generating an array of random numbers of the size of the original array. That said, the generation is done using compiled code so it should be pretty fast. The chance of generating two identical numbers is essentially zero, and even then argsort still returns a valid permutation of each row's indices.
m,n = 4,5
pool = np.random.randint(0, 30, size=[m,n])
new_width = 3
mask = np.argsort(np.random.rand(m,n))<new_width
pool[mask].reshape(m, new_width)
How it works:
We generate a random array of floats and argsort it. By default, argsort works along the last axis, which for a 2d array is axis 1, so each row of the result is a permutation of 0,...,n-1: the i,j entry is the column index of the j-th smallest value in the i-th row.
We then mark the entries of this array that are less than new_width. Each row contains the numbers 0,...,n-1 in a random order, so exactly new_width of them are less than new_width. This means each row of mask has exactly new_width entries that are True, and the rest are False (a comparison between an ndarray and a scalar is applied element-wise).
Finally, the boolean mask is applied to the original data to grab new_width entries from each row.
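A closely related variant, sketched here under my own naming, keeps the first new_width columns of the argsorted array as column indices and gathers with np.take_along_axis. Unlike the mask, which returns the chosen entries in their original left-to-right order, this returns them in random order. The seed and the arange-based pool are assumptions made only so that repeats are easy to spot.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
m, n, new_width = 4, 5, 3
pool = np.arange(m * n).reshape(m, n)  # distinct values make repeats easy to spot

# Each row of order is a random permutation of the column indices 0..n-1.
order = np.argsort(rng.random((m, n)), axis=1)

# Keep the first new_width column indices per row and gather those entries.
cols = order[:, :new_width]
seln = np.take_along_axis(pool, cols, axis=1)

print(seln)
```

Because each row of cols holds distinct column indices, no position in a row can be picked twice.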
You could also use np.vectorize for your loop solution, although that is just shorthand for a loop.
What is the Pythonic way to get a list of diagonal elements in a matrix passing through entry (i,j)?
For example, given a matrix like:
[1 2 3 4 5]
[6 7 8 9 10]
[11 12 13 14 15]
[16 17 18 19 20]
[21 22 23 24 25]
and an entry, say, (1,3) (representing element 9) how can I get the elements in the diagonals passing through 9 in a Pythonic way? Basically, [3,9,15] and [5,9,13,17,21] both.
Using np.diagonal with a little offset logic.
import numpy as np
lst = np.array([[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
i, j = 1, 3
major = np.diagonal(lst, offset=(j - i))
print(major)
[ 3  9 15]
minor = np.diagonal(np.rot90(lst), offset=-lst.shape[1] + (j + i) + 1)
print(minor)
[ 5  9 13 17 21]
The indices i and j are the row and column. By specifying the offset, numpy knows from where to begin selecting elements for the diagonal.
For the major diagonal, you want to start collecting from 3 in the first row. So you subtract the current row index from the current column index to find the correct column index in the 0th row. The minor diagonal works similarly, except that the array is first rotated by 90° and the same process is applied.
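The offset logic can be wrapped in a small helper. This sketch uses np.fliplr rather than np.rot90 for the minor diagonal, which is a substitution on my part; the function name diagonals_through is hypothetical. Mirroring the columns moves entry (i, j) to column n_cols - 1 - j, so the same "column minus row" rule gives the offset.

```python
import numpy as np

def diagonals_through(mat, i, j):
    """Return the two diagonals of a 2-D array that pass through (i, j)."""
    mat = np.asarray(mat)
    n_cols = mat.shape[1]
    # Major diagonal: offset is column index minus row index.
    major = np.diagonal(mat, offset=j - i)
    # Minor diagonal: in the mirrored array, (i, j) sits at column n_cols - 1 - j.
    minor = np.diagonal(np.fliplr(mat), offset=(n_cols - 1 - j) - i)
    return major, minor

lst = np.arange(1, 26).reshape(5, 5)
major, minor = diagonals_through(lst, 1, 3)
print(major)
print(minor)
```

Since only the number of columns enters the offset, the helper should also work for non-square matrices.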
Another alternative, which ravels the array and works for a square matrix of shape (n, n):
array = np.array([[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
x, y = 1, 3
a_mod = array.ravel()
size = array.shape[0]
if y >= x:
    diag = a_mod[y-x:(x+size-y)*size:size+1]
else:
    diag = a_mod[(x-y)*size::size+1]
if x-(size-1-y) >= 0:
    reverse_diag = array[:, ::-1].ravel()[(x-(size-1-y))*size::size+1]
else:
    # the anti-diagonal meets row 0 at column x+y
    reverse_diag = a_mod[x+y:(x+y)*size+1:size-1]
# diag --> [ 3 9 15]
# reverse_diag --> [ 5 9 13 17 21]
The correctness of the resulting arrays should be checked further. This can be extended to handle matrices of other shapes, e.g. (n, m).
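For checking a slicing-based implementation, a slow but obvious reference is handy. The sketch below (diagonals_reference is a hypothetical name) collects the cells directly from the defining conditions: column minus row is constant on one diagonal, and row plus column is constant on the other. It makes no assumption that the matrix is square.

```python
import numpy as np

def diagonals_reference(mat, x, y):
    """Brute-force diagonals through (x, y), listed from the top row down."""
    mat = np.asarray(mat)
    n_rows, n_cols = mat.shape
    # Main-direction diagonal: cells where column - row == y - x.
    diag = [mat[r, r + (y - x)] for r in range(n_rows)
            if 0 <= r + (y - x) < n_cols]
    # Anti-diagonal: cells where row + column == x + y.
    reverse_diag = [mat[r, (x + y) - r] for r in range(n_rows)
                    if 0 <= (x + y) - r < n_cols]
    return np.array(diag), np.array(reverse_diag)

array = np.arange(1, 26).reshape(5, 5)
diag, reverse_diag = diagonals_reference(array, 1, 3)
print(diag)
print(reverse_diag)
```

Looping over every (x, y) and comparing against the sliced version is a quick way to catch off-by-one errors in the start and stop indices.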