How to swap array's columns if condition is satisfied - python

I have an Nx3 numpy array:
A = [[01,02,03]
[11,12,13]
[21,22,23]]
I need an array where the second and third columns are swapped if the sum of the second and third numbers is greater than 20:
[[01,02,03]
[11,13,12]
[21,23,22]]
Is it possible to achieve this without a loop?
UPDATE:
So, the story behind this is that I want to swap colors in an RGB image, namely green and blue, but not yellow; that is my condition. Empirically, I found the condition to be abs(green - blue) > 15 && (blue > green)
swapped = np.array(img).reshape(img.shape[0] * img.shape[1], img.shape[2])
idx = ((np.abs(swapped[:,1] - swapped[:,2]) < 15) & (swapped[:, 2] < swapped[:, 1]))
swapped[idx, 1], swapped[idx, 2] = swapped[idx, 2], swapped[idx, 1]
plt.imshow(swapped.reshape(img.shape[0], img.shape[1], img.shape[2]))
This actually works, but only partially. The first column will be swapped, but the second one will be overwritten.
# tested in python3
a = np.array([[1,2,3],[11,12,13],[21,22,23]])
a[:,1], a[:,2] = a[:,2], a[:,1]
array([[ 1, 3, 3],
[11, 13, 13],
[21, 23, 23]])
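The overwrite shown above happens because a[:,1] and a[:,2] on the right-hand side are basic slices, i.e. views into a, so the first assignment changes what the second one reads. Below is a minimal sketch of two ways around it, assuming the same small example array (the names b and mask are only illustrative). Indexing with a list or a boolean mask is advanced indexing and returns copies, so the tuple swap is safe in that case.

```python
import numpy as np

a = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]])

# Unconditional swap: the list index on the right-hand side makes a copy.
b = a.copy()
b[:, [1, 2]] = b[:, [2, 1]]
print(b)

# Conditional swap: a boolean mask is also advanced indexing, so
# a[mask, 2] and a[mask, 1] are copies taken before any assignment runs.
mask = a[:, 1] + a[:, 2] > 20
a[mask, 1], a[mask, 2] = a[mask, 2], a[mask, 1]
print(a)
```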

Here's one way with masking -
# Get 1D mask of length same as the column length of array and with True
# values at places where the combined sum is > 20
m = A[:,1] + A[:,2] > 20
# Get the masked elements off the second column
tmp = A[m,2]
# Assign into the masked places in the third col from the
# corresponding masked places in second col.
# Note that this won't change `tmp` because `tmp` isn't a view into
# the third col, but holds a separate memory space
A[m,2] = A[m,1]
# Finally assign into the second col from tmp
A[m,1] = tmp
Sample run -
In [538]: A
Out[538]:
array([[ 1, 2, 3],
[11, 12, 13],
[21, 22, 23]])
In [539]: m = A[:,1] + A[:,2] > 20
...: tmp = A[m,2]
...: A[m,2] = A[m,1]
...: A[m,1] = tmp
In [540]: A
Out[540]:
array([[ 1, 2, 3],
[11, 13, 12],
[21, 23, 22]])
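If the original array should stay untouched, the same mask can be combined with np.where. This is only a sketch of an alternative, not part of the answer above; the name swapped is illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]])
m = A[:, 1] + A[:, 2] > 20

# Where the mask is True take the reversed column pair, elsewhere the original pair.
swapped = A.copy()
swapped[:, 1:3] = np.where(m[:, None], A[:, [2, 1]], A[:, [1, 2]])
print(swapped)
```

Here m[:, None] turns the 1D mask into a column so that it broadcasts across both swapped columns.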

How about using np.where along with "fancy" indexing, and np.flip to swap the elements?
In [145]: A
Out[145]:
array([[ 1, 2, 3],
[11, 12, 13],
[21, 22, 23]])
# extract matching sub-array
In [146]: matches = A[np.where(np.sum(A[:, 1:], axis=1) > 20)]
In [147]: matches
Out[147]:
array([[11, 12, 13],
[21, 22, 23]])
# swap elements and update the original array using "boolean" indexing
In [148]: A[np.where(np.sum(A[:, 1:], axis=1) > 20)] = np.hstack((matches[:, :1], np.flip(matches[:, 1:], axis=1)))
In [149]: A
Out[149]:
array([[ 1, 2, 3],
[11, 13, 12],
[21, 23, 22]])
One more approach based on @Divakar's suggestion would be to:
First get the indices which are nonzero for the condition specified (here
sum of the elements in second and third column > 20)
In [70]: idx = np.flatnonzero(np.sum(A[:, 1:3], axis=1) > 20)
Then create an open mesh using np.ix_
In [71]: gidx = np.ix_(idx,[1,2])
# finally update the original array `A`
In [72]: A[gidx] = A[gidx][:,::-1]
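For the RGB use case from the question's update, here is a hedged sketch. It assumes img is a uint8 array of shape (height, width, 3), which the question does not state, and it uses a random stand-in image. With uint8 data, green - blue wraps around instead of going negative, so the sketch casts to a signed type before comparing. The condition is the one written in the question's prose (difference above 15 and blue greater than green).

```python
import numpy as np

# Hypothetical stand-in for the real image (assumed uint8, H x W x 3).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Cast to a signed type so that green - blue cannot wrap around.
g = img[..., 1].astype(np.int16)
b = img[..., 2].astype(np.int16)
mask = (np.abs(g - b) > 15) & (b > g)  # 2D boolean mask over pixels

# Write into a copy, reading from the untouched original.
swapped = img.copy()
swapped[mask, 1] = img[mask, 2]
swapped[mask, 2] = img[mask, 1]
```

The 2D mask indexes pixels directly, so no reshape to (H*W, 3) is needed.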

Related

How can I add values to an array with numpy?

In this code I'm trying to compute a sum of values and add it to an array named array_values, but it doesn't work; it only prints []
array_values = ([])
value = 0.0
for a in range(0, 8):
    for b in range (1, 5):
        value = value + float(klines[a][b])
        #print(value)
    np.append(array_values, value)#FIX array_values.append(value)
    print("adding: ",value)
    value = 0.0
print(array_values)
Does this solve your problem?
import numpy as np
array_values = ([])
value = 0.0
for a in range(0, 8):
    for b in range (1, 5):
        value = value + float(klines[a][b])
        #print(value)
    array_values = np.append(array_values, value)
    print("adding: ",value)
    value = 0.0
print(array_values)
np.append returns an ndarray.
A copy of arr with values appended to axis. Note that append does not occur in-place: a new array is allocated and filled. If axis is None, out is a flattened array.
Check the return section to understand better
https://numpy.org/doc/stable/reference/generated/numpy.append.html
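A small sketch of that behaviour, using illustrative names (arr, result) that are not from the question:

```python
import numpy as np

arr = np.array([])
result = np.append(arr, 1.5)  # returns a new array

print(arr.size)   # arr itself is unchanged and still empty
print(result)     # the appended value lives only in the returned array
```

This is why the return value has to be assigned back, as in array_values = np.append(array_values, value).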
Assuming klines is a 2d numeric dtype array:
In [231]: klines = np.arange(1,13).reshape(4,3)
In [232]: klines
Out[232]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
we can simply sum across each row with:
In [233]: klines.sum(axis=1)
Out[233]: array([ 6, 15, 24, 33])
the equivalent using your style of iteration:
In [234]: alist = []
...: value = 0
...: for i in range(4):
...:     for j in range(3):
...:         value += klines[i,j]
...:     alist.append(value)
...:     value = 0
...:
In [235]: alist
Out[235]: [6, 15, 24, 33]
Use of np.append is slower and harder to get right.
Even if klines is a list of lists, the sums can be easily done with:
In [236]: [sum(row) for row in klines]
Out[236]: [6, 15, 24, 33]
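Applied to the ranges in the question (rows 0 to 7, columns 1 to 4), the same idea might look like the sketch below. The klines data here is a made-up stand-in, assumed to be a list of lists of numeric strings, since the question calls float() on each entry.

```python
import numpy as np

# Hypothetical stand-in for klines: 8 rows, 6 string-valued columns.
klines = [[str(r * 10 + c) for c in range(6)] for r in range(8)]

# Convert once, slice rows 0..7 and columns 1..4, then sum each row.
array_values = np.asarray(klines, dtype=float)[:8, 1:5].sum(axis=1)
print(array_values)
```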

How to efficiently shuffle some values of a numpy array while keeping their relative order?

I have a numpy array and a mask specifying which entries from that array to shuffle while keeping their relative order. Let's have an example:
In [2]: arr = np.array([5, 3, 9, 0, 4, 1])
In [4]: mask = np.array([True, False, False, False, True, True])
In [5]: arr[mask]
Out[5]: array([5, 4, 1]) # These entries shall be shuffled inside arr, while keeping their order.
In [6]: np.where(mask==True)
Out[6]: (array([0, 4, 5]),)
In [7]: shuffle_array(arr, mask) # I'm looking for an efficient realization of this function!
Out[7]: array([3, 5, 4, 9, 0, 1]) # See how the entries 5, 4 and 1 haven't changed their order.
I've written some code that can do this, but it's really slow.
import numpy as np
def shuffle_array(arr, mask):
    perm = np.arange(len(arr)) # permutation array
    n = mask.sum()
    if n > 0:
        old_true_pos = np.where(mask == True)[0] # old positions for which mask is True
        old_false_pos = np.where(mask == False)[0] # old positions for which mask is False
        new_true_pos = np.random.choice(perm, n, replace=False) # draw new positions
        new_true_pos.sort()
        new_false_pos = np.setdiff1d(perm, new_true_pos)
        new_pos = np.hstack((new_true_pos, new_false_pos))
        old_pos = np.hstack((old_true_pos, old_false_pos))
        perm[new_pos] = perm[old_pos]
    return arr[perm]
To make things worse, I actually have two large matrices A and B with shape (M,N). Matrix A holds arbitrary values, while each row of matrix B is the mask which to use for shuffling one corresponding row of matrix A according to the procedure that I outlined above. So what I want is shuffled_matrix = row_wise_shuffle(A, B).
The only way I have so far found to do it is via my shuffle_array() function and a for loop.
Can you think of any numpy'onic way to accomplish this task avoiding loops? Thank you so much in advance!
For 1d case:
import numpy as np
a = np.arange(8)
b = np.array([1,1,1,1,0,0,0,0])
# Get ordered values
ordered_values = a[np.where(b==1)]
# We'll shuffle both arrays
shuffled_ix = np.random.permutation(a.shape[0])
a_shuffled = a[shuffled_ix]
b_shuffled = b[shuffled_ix]
# Replace the values with correct order
a_shuffled[np.where(b_shuffled==1)] = ordered_values
a_shuffled # Notice that 0, 1, 2, 3 preserves order.
>>>
array([0, 1, 2, 6, 3, 4, 7, 5])
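Note that the snippet above also shuffles the unmasked values. If the unmasked values should keep their relative order too, as in the question's expected output, one possible sketch is to draw a random set of new positions for the masked entries and fill both groups in order. The rng parameter is an addition for reproducibility and is not part of the question's signature.

```python
import numpy as np

def shuffle_array(arr, mask, rng=None):
    """Move masked entries to random positions; both groups keep their order."""
    rng = np.random.default_rng() if rng is None else rng
    mask = np.asarray(mask, dtype=bool)
    # Draw as many new positions as there are masked entries.
    new_mask = np.zeros(arr.size, dtype=bool)
    new_mask[rng.choice(arr.size, size=int(mask.sum()), replace=False)] = True
    out = np.empty_like(arr)
    out[new_mask] = arr[mask]      # masked values, original order
    out[~new_mask] = arr[~mask]    # remaining values, original order
    return out

arr = np.array([5, 3, 9, 0, 4, 1])
mask = np.array([True, False, False, False, True, True])
result = shuffle_array(arr, mask, rng=np.random.default_rng(0))
print(result)
```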
for 2d case, columnwise shuffle (along axis=1):
import numpy as np
a = np.arange(24).reshape(4,6)
b = np.array([[0,0,0,0,1,1], [1,1,1,0,0,0], [1,1,1,1,0,0], [0,0,1,1,0,0]])
# The code below works for column shuffle (i.e. axis=1).
# Get ordered values
i,j = np.where(b==1)
values = a[i, j]
values
# We'll shuffle both arrays for axis=1
# taken from https://stackoverflow.com/questions/5040797/shuffling-numpy-array-along-a-given-axis
idx = np.random.rand(*a.shape).argsort(axis=1)
a_shuffled = np.take_along_axis(a,idx,axis=1)
b_shuffled = np.take_along_axis(b,idx,axis=1)
# Replace the values with correct order
a_shuffled[np.where(b_shuffled==1)] = values
# Get the result
a_shuffled # see that 4,5 | 6,7,8 | 12,13,14,15 | 20, 21 preserves order
>>>
array([[ 4, 1, 0, 3, 2, 5],
[ 9, 6, 7, 11, 8, 10],
[12, 13, 16, 17, 14, 15],
[23, 20, 19, 22, 21, 18]])
for the 2d case with a rowwise shuffle (along axis=0), we can use the same code: first transpose the arrays, shuffle, then transpose back:
import numpy as np
a = np.arange(24).reshape(4,6)
b = np.array([[0,0,0,0,1,1], [1,1,1,0,0,0], [1,1,1,1,0,0], [0,0,1,1,0,0]])
# The code below works for column shuffle (i.e. axis=1).
# As you said rowwise, we first transpose
at = a.T
bt = b.T
# Get ordered values
i,j = np.where(bt==1)
values = at[i, j]
values
# We'll shuffle both arrays for axis=1
# taken from https://stackoverflow.com/questions/5040797/shuffling-numpy-array-along-a-given-axis
idx = np.random.rand(*at.shape).argsort(axis=1)
at_shuffled = np.take_along_axis(at,idx,axis=1)
bt_shuffled = np.take_along_axis(bt,idx,axis=1)
# Replace the values with correct order
at_shuffled[np.where(bt_shuffled==1)] = values
# Get the result
a_shuffled = at_shuffled.T
a_shuffled # see that 6,12 | 7, 13 | 8,14,20 | 15, 21 preserves order
>>>
array([[ 6, 7, 2, 3, 10, 17],
[18, 19, 8, 15, 16, 23],
[12, 13, 14, 21, 4, 5],
[ 0, 1, 20, 9, 22, 11]])
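For the matrix case described in the question (each row of B is the mask for the matching row of A), a loop-free sketch is given below. It is one possible reading of the requirement: within every row, masked values land on random positions, and both masked and unmasked values keep their relative order. The function name follows the question; the rng parameter is an assumption added for reproducibility.

```python
import numpy as np

def row_wise_shuffle(A, B, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    B = np.asarray(B, dtype=bool)
    # Shuffle each row of the mask independently to get the new masked positions.
    idx = rng.random(B.shape).argsort(axis=1)
    new_B = np.take_along_axis(B, idx, axis=1)
    out = np.empty_like(A)
    # Boolean indexing walks the array row by row, and every row has the same
    # number of True values in B and new_B, so the values stay in their own row.
    out[new_B] = A[B]
    out[~new_B] = A[~B]
    return out

A = np.arange(24).reshape(4, 6)
B = np.array([[0,0,0,0,1,1], [1,1,1,0,0,0], [1,1,1,1,0,0], [0,0,1,1,0,0]])
shuffled = row_wise_shuffle(A, B, rng=np.random.default_rng(0))
print(shuffled)
```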

Create matrix with numpy meshgrid

I want to create a matrix with M rows and N columns. The increment along the columns is always 1, whilst the increment along the rows is a constant value, c. For example, to create this matrix:
The number of rows is 4, the number of columns is 2, and the shift between rows is c = 8. One way to perform this could be:
# Indices of columns
coord_x = np.arange(0, 2)
# Indices of rows
coord_y = np.arange(1, 37, 9)
# Creates 2 matrices with the coordinates
x, y = np.meshgrid(coord_x, coord_y)
# To perform the shift between columns
idx_left = x + y
And the output is:
print(idx_left)
[[ 1 2]
[10 11]
[19 20]
[28 29]]
Can I perform this without the addition idx_left = x + y? I've already looked at other functions, but I can't find any that handles a shift along both the rows and the columns...
Stride tricks
You can use np.lib.stride_tricks for this purpose.
arr = np.arange(1,100)
shape = (4,2)
strides = (arr.strides[0]*9,arr.strides[0]*1) #8 bytes with 9 steps on axis=0, 8bytes with 1 step on axis=1
np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
array([[ 1, 2],
[10, 11],
[19, 20],
[28, 29]])
Another example with 2 shift on axis=0 and 3 shift on axis=1.
arr = np.arange(1,100)
shape = (4,2)
strides = (arr.strides[0]*2,arr.strides[0]*3) #8bytes * 2shift on axis=0, 8bytes*3shift on axis=1
np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
array([[ 1, 4],
[ 3, 6],
[ 5, 8],
[ 7, 10]])
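as_strided does not check that the requested shape and strides stay inside the source buffer, so a wrong value can silently read unrelated memory. For the first example (column step 1), a safer sketch that also returns a view is a plain reshape followed by a slice. The variable names below are illustrative.

```python
import numpy as np

rows, cols, row_step = 4, 2, 9

# One block of row_step values per output row, then keep the first cols of each.
arr = np.arange(1, rows * row_step + 1)
grid = arr.reshape(rows, row_step)[:, :cols]
print(grid)
```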
Broadcasting
You could get the same result without meshgrid by relying on broadcasting alone -
#Your original example
coord_x = np.arange(0, 2, 1) #start, stop, step
coord_y = np.arange(1, 37, 9)
coord_x[None,:] + coord_y[:,None]
array([[ 1, 2],
[10, 11],
[19, 20],
[28, 29]])
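The broadcasting idea can be wrapped in a small helper. This is a sketch only; shifted_grid and its parameter names are hypothetical and not part of NumPy.

```python
import numpy as np

def shifted_grid(n_rows, n_cols, start=0, row_step=1, col_step=1):
    """Return an (n_rows, n_cols) integer grid with fixed row and column steps."""
    rows = start + row_step * np.arange(n_rows)   # first value of each row
    cols = col_step * np.arange(n_cols)           # offsets within a row
    return rows[:, None] + cols[None, :]          # broadcast to 2D

print(shifted_grid(4, 2, start=1, row_step=9))
print(shifted_grid(4, 2, start=1, row_step=2, col_step=3))
```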
Linspace
You could use linspace if you know the first and last rows. So, in your case, you can create a 4-row array from (1,2) to (28,29). Note that the result is a float array.
np.linspace((1,2),(28,29),4)
array([[ 1., 2.],
[10., 11.],
[19., 20.],
[28., 29.]])
Mgrid
Mgrid is more convenient than mesh grid for your purpose. You can do -
np.mgrid[0:2:1, 1:37:9].sum(0).T
array([[ 1, 2],
[10, 11],
[19, 20],
[28, 29]])

Splitting arrays depending on unique values in an array

I currently have two arrays, one of which has several repeated values and another with unique values.
Eg array 1 : a = [1, 1, 2, 2, 3, 3]
Eg array 2 : b = [10, 11, 12, 13, 14, 15]
I was developing a code in python that looks at the first array and distinguishes the elements that are all the same and remembers the indices. A new array is created that contains the elements of array b at those indices.
Eg: As array 'a' has three unique values at positions 1,2... 3,4... 5,6, then three new arrays would be created such that it contains the elements of array b at positions 1,2... 3,4... 5,6. Thus, the result would be three new arrays:
b1 = [10, 11]
b2 = [12, 13]
b3 = [14, 15]
I have managed to develop a code, however, it only works when there are exactly three unique values in array 'a'. If there are more or fewer unique values in array 'a', the code has to be modified by hand.
import itertools
import numpy as np
import matplotlib.tri as tri
import sys
a = [1, 1, 2, 2, 3, 3]
b = [10, 10, 20, 20, 30, 30]
b_1 = []
b_2 = []
b_3 = []
unique = []
for vals in a:
    if vals not in unique:
        unique.append(vals)
if len(unique) != 3:
    sys.exit("More than 3 'a' values - check dimension")
for j in range(0,len(a)):
    if a[j] == unique[0]:
        b_1.append(b[j])
    elif a[j] == unique[1]:
        b_2.append(b[j])
    elif a[j] == unique[2]:
        b_3.append(b[j])
    else:
        sys.exit("More than 3 'a' values - check dimension")
print (b_1)
print (b_2)
print (b_3)
I was wondering if there is perhaps a more elegant way to perform this task such that the code is able to cope with an n number of unique values.
Well given that you are also using numpy, here's one way using np.unique. You can set return_index=True to get the indices of the unique values, and use them to split the array b with np.split:
a = np.array([1, 1, 2, 2, 3, 3])
b = np.array([10, 11, 12, 13, 14, 15])
u, s = np.unique(a, return_index=True)
np.split(b,s[1:])
Output
[array([10, 11]), array([12, 13]), array([14, 15])]
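The split above relies on equal values of a sitting next to each other in sorted order. If a is not sorted, one possible extension (a sketch, not part of the answer above) is to sort both arrays by a with a stable argsort first, so that values of b keep their original order within each group.

```python
import numpy as np

a = np.array([3, 1, 2, 1, 3, 2])
b = np.array([10, 11, 12, 13, 14, 15])

order = np.argsort(a, kind="stable")            # groups equal keys together
u, s = np.unique(a[order], return_index=True)   # s holds the start of each group
groups = np.split(b[order], s[1:])
print(u)
print(groups)
```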
You can use the function groupby():
from itertools import groupby
from operator import itemgetter
a = [1, 1, 2, 2, 3, 3]
b = [10, 11, 12, 13, 14, 15]
[[i[1] for i in g] for _, g in groupby(zip(a, b), key=itemgetter(0))]
# [[10, 11], [12, 13], [14, 15]]
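groupby only merges consecutive equal keys, so it also assumes that equal values of a are adjacent. A plain-Python sketch that does not need that assumption collects the values in a dictionary keyed by the value of a (the name groups is illustrative):

```python
from collections import defaultdict

a = [1, 2, 1, 3, 2, 3]
b = [10, 11, 12, 13, 14, 15]

groups = defaultdict(list)
for key, value in zip(a, b):
    groups[key].append(value)   # one list per distinct value in a

print(dict(groups))
```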

Efficient numpy indexing: Take first N rows of every block of M rows

x = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])
I want to grab first 2 rows of array x from every block of 5, result should be:
x[fancy_indexing] = [1,2, 6,7, 11,12]
It's easy enough to build up an index like that using a for loop.
Is there a one-liner slicing trick that will pull it off? Points for simplicity here.
Approach #1 Here's a vectorized one-liner using boolean-indexing -
x[np.mod(np.arange(x.size),M)<N]
Approach #2 If you are going for performance, here's another vectorized approach using NumPy strides -
n = x.strides[0]
shp = (x.size//M,N)
out = np.lib.stride_tricks.as_strided(x, shape=shp, strides=(M*n,n)).ravel()
Sample run -
In [61]: # Inputs
...: x = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])
...: N = 2
...: M = 5
...:
In [62]: # Approach 1
...: x[np.mod(np.arange(x.size),M)<N]
Out[62]: array([ 1, 2, 6, 7, 11, 12])
In [63]: # Approach 2
...: n = x.strides[0]
...: shp = (x.size//M,N)
...: out=np.lib.stride_tricks.as_strided(x,shape=shp,strides=(M*n,n)).ravel()
...:
In [64]: out
Out[64]: array([ 1, 2, 6, 7, 11, 12])
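The boolean-mask idea from Approach #1 carries over to a 2D array when whole rows are wanted. In this sketch X is a made-up example array; the mask is built from the row index only.

```python
import numpy as np

M, N = 5, 2
X = np.arange(30).reshape(15, 2)   # hypothetical 15 x 2 input

# Keep a row when its position inside its block of M rows is below N.
row_mask = np.arange(X.shape[0]) % M < N
first_rows = X[row_mask]
print(first_rows)
```

Unlike the strides version, this also works when the number of rows is not a multiple of M.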
I first thought you need this to work for 2d arrays due to your phrasing of "first N rows of every block of M rows", so I'll leave my solution as this.
You could work some magic by reshaping your array into 3d:
M = 5 # size of blocks
N = 2 # number of columns to keep from each block
x = np.arange(3*4*M).reshape(4,-1) # (4,3*M)-shaped dummy input
x = x.reshape(x.shape[0],-1,M)[:,:,:N].reshape(x.shape[0],-1) # (4,3*N)-shaped output
This will extract every column according to your preference. In order to use it for your 1d case you'd need to make your 1d array into a 2d one using x = x[None,:].
Reshape the array to multiple rows of five columns then take (slice) the first two columns of each row.
>>> x
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
>>> x.reshape(x.shape[0] // 5, 5)[:,:2]
array([[ 1, 2],
[ 6, 7],
[11, 12]])
Or
>>> x.reshape(x.shape[0] // 5, 5)[:,:2].flatten()
array([ 1, 2, 6, 7, 11, 12])
>>>
It only works with 1-d arrays that have a length that is a multiple of five.
import numpy as np
x = np.array(range(1, 16))
y = np.vstack([x[0::5], x[1::5]]).T.ravel()
y
# => array([ 1, 2, 6, 7, 11, 12])
Taking the first N rows of every block of M rows in the array [1, 2, ..., K]:
import numpy as np
K = 30
M = 5
N = 2
x = np.array(range(1, K+1))
y = np.vstack([x[i::M] for i in range(N)]).T.ravel()
y
# => array([ 1, 2, 6, 7, 11, 12, 16, 17, 21, 22, 26, 27])
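Notice that .T is a cheap operation: it returns a view and only changes the shape and strides of the array. .ravel() returns a view only when the data is already contiguous in the requested order; for the transposed array here it makes a single copy of the (small) result.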
If you insist on getting your slice using fancy indexing:
import numpy as np
K = 30
M = 5
N = 2
x = np.array(range(1, K+1))
fancy_indexing = [i*M+n for i in range(len(x)//M) for n in range(N)]
x[fancy_indexing]
# => array([ 1, 2, 6, 7, 11, 12, 16, 17, 21, 22, 26, 27])
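The same index list can be built without a Python loop by broadcasting the block starts against the offsets inside a block. This is a sketch using the same K, M and N as above; block_starts and offsets are illustrative names.

```python
import numpy as np

K, M, N = 30, 5, 2
x = np.arange(1, K + 1)

block_starts = np.arange(0, x.size, M)   # 0, 5, 10, ...
offsets = np.arange(N)                   # 0, 1
fancy_indexing = (block_starts[:, None] + offsets).ravel()
selected = x[fancy_indexing]
print(selected)
```

Like the list comprehension, this assumes the array length is a multiple of M; otherwise the last block can produce indices past the end.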
