Modifying entire row of an array based on a condition using Numpy

I have an array:
xNew = np.array([[0.50,0.25],[-0.4,-0.2],[0.60,0.80],[1.20,1.90],[-0.10,0.60],[0.10,1.2]])
and another array:
x = np.array([[0.55,0.34],[0.45,0.26],[0.14,0.29],[0.85,0.89],[0.27,0.78],[0.45,0.05]])
If any element in a row of xNew is smaller than 0 or larger than 1, that entire row should be replaced by the corresponding row in x. The desired output is:
xNew = np.array([[0.50,0.25],[0.45,0.26],[0.60,0.80],[0.85,0.89],[0.27,0.78],[0.45,0.05]])
I am looking for an efficient way to accomplish this using numpy functions.
Thanks!

You can use advanced (boolean mask) indexing:
idx = ((xNew<0)|(xNew>1)).any(-1)
xNew[idx]=x[idx]
output:
[[0.5 0.25]
[0.45 0.26]
[0.6 0.8 ]
[0.85 0.89]
[0.27 0.78]
[0.45 0.05]]
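To make the masking step easier to follow, here is a minimal sketch that splits it into its parts. The names out_of_range, bad_rows and result are illustrative, not from the answer, and the np.where call is a swapped-in, non-mutating variant (it leaves xNew untouched), not the in-place assignment shown above.

```python
import numpy as np

xNew = np.array([[0.50, 0.25], [-0.4, -0.2], [0.60, 0.80],
                 [1.20, 1.90], [-0.10, 0.60], [0.10, 1.2]])
x = np.array([[0.55, 0.34], [0.45, 0.26], [0.14, 0.29],
              [0.85, 0.89], [0.27, 0.78], [0.45, 0.05]])

# Element-wise test: True wherever a value lies outside [0, 1]
out_of_range = (xNew < 0) | (xNew > 1)

# Reduce along the last axis: one boolean per row
bad_rows = out_of_range.any(axis=-1)

# Non-mutating variant (illustrative): broadcast the row mask over the
# columns and pick from x where the row is flagged, else from xNew
result = np.where(bad_rows[:, None], x, xNew)

print(bad_rows)
print(result)
```

If overwriting xNew is acceptable, the in-place form xNew[idx] = x[idx] is shorter and avoids allocating a new array.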

for index, y in enumerate(xNew):
    if np.any(np.greater(y, [1, 1])) or np.any(np.less(y, [0, 0])):
        xNew[index] = x[index]

Related

How to scale and print an array based on its minimum and maximum value?

I'm trying to scale the following NumPy array based on its minimum and maximum values.
array = [[17405.051 17442.4 17199.6 17245.65 ]
[17094.949 17291.75 17091.15 17222.75 ]
[17289. 17294.9 17076.551 17153. ]
[17181.85 17235.1 17003.9 17222. ]]
Formula used is:
m=(x-xmin)/(xmax-xmin)
wherein m is an individually scaled item, x is an individual item, xmax is the highest value and xmin is the smallest value of the array.
My question is how do I print the scaled array?
P.S. - I can't use MinMaxScaler as I need to scale a given number (outside the array) by plugging it in the mentioned formula with xmin & xmax of the given array.
I tried scaling the individual items by iterating over the array but I'm unable to put together the scaled array.
I'm new to NumPy, any suggestions would be welcome.
Thank you.

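Use the methods ndarray.min() and ndarray.max(), or np.ptp(), which returns the range (max - min) of the values in the array. The ndarray.ptp() method form was removed in NumPy 2.0, where np.ptp(ar) is the equivalent: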
>>> ar = np.array([[17405.051, 17442.4, 17199.6, 17245.65 ],
... [17094.949, 17291.75, 17091.15, 17222.75 ],
... [17289., 17294.9, 17076.551, 17153. ],
... [17181.85, 17235.1, 17003.9, 17222. ]])
>>> min_val = ar.min()
>>> range_val = np.ptp(ar)
>>> (ar - min_val) / range_val
array([[0.91482554, 1. , 0.44629418, 0.55131129],
[0.2076374 , 0.65644242, 0.19897377, 0.4990878 ],
[0.65017104, 0.663626 , 0.16568073, 0.34002281],
[0.40581528, 0.527252 , 0. , 0.49737742]])
I think you should learn more about the basic operation of numpy.

import numpy as np
array_list = [[17405.051, 17442.4, 17199.6, 17245.65 ],
[17094.949, 17291.75, 17091.15, 17222.75 ],
[17289., 17294.9, 17076.551, 17153., ],
[17181.85, 17235.1, 17003.9, 17222. ]]
# Convert list into numpy array
array = np.array(array_list)
# Create empty list
scaled_array_list=[]
for x in array:
    m = (x - np.min(array))/(np.max(array)-np.min(array))
    scaled_array_list.append(m)
# Convert list into numpy array
scaled_array = np.array(scaled_array_list)
scaled_array
My version iterates over the array, as you described.
You can also put everything in a function and reuse it in the future:
def scaler(array_to_scale):
    # Compute the global minimum and maximum once
    min_val = np.min(array_to_scale)
    max_val = np.max(array_to_scale)
    # Create empty list
    scaled_array_list = []
    for x in array_to_scale:
        m = (x - min_val) / (max_val - min_val)
        scaled_array_list.append(m)
    # Convert list into numpy array
    scaled_array = np.array(scaled_array_list)
    return scaled_array
# Here is our input
array_list = [[17405.051, 17442.4, 17199.6, 17245.65 ],
[17094.949, 17291.75, 17091.15, 17222.75 ],
[17289., 17294.9, 17076.551, 17153., ],
[17181.85, 17235.1, 17003.9, 17222. ]]
# Convert list into numpy array
array = np.array(array_list)
scaler(array)
Output:
array([[0.91482554, 1. , 0.44629418, 0.55131129],
[0.2076374 , 0.65644242, 0.19897377, 0.4990878 ],
[0.65017104, 0.663626 , 0.16568073, 0.34002281],
[0.40581528, 0.527252 , 0. , 0.49737742]])
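The P.S. in the question asks for a way to scale a number from outside the array using the array's own xmin and xmax. A minimal sketch of that, assuming a helper named scale_value (hypothetical, not from any answer) and an arbitrary example value of 17300.0:

```python
import numpy as np

ar = np.array([[17405.051, 17442.4, 17199.6, 17245.65],
               [17094.949, 17291.75, 17091.15, 17222.75],
               [17289., 17294.9, 17076.551, 17153.],
               [17181.85, 17235.1, 17003.9, 17222.]])

# Global extremes of the reference array
xmin = ar.min()
xmax = ar.max()

def scale_value(value, xmin, xmax):
    """Apply m = (x - xmin) / (xmax - xmin) to a scalar or an array."""
    return (value - xmin) / (xmax - xmin)

# The same function scales the whole array (vectorized) ...
scaled = scale_value(ar, xmin, xmax)
print(scaled)

# ... and a single number that is not part of the array
# (17300.0 is an arbitrary value chosen for illustration)
outside = scale_value(17300.0, xmin, xmax)
print(outside)
```

Because the formula is applied element-wise, no explicit loop is needed. A value below xmin or above xmax simply maps outside the interval [0, 1].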

normalize multi dimensional numpy array using the last value along axis 1

I have the following numpy array :
A = np.array([[1,2,3,4,5],
[15,25,35,45,55]])
I would like to create a new array with the same shape by dividing each row by its last element.
The desired output would be:
B = np.array([[0.2,0.4,0.6,0.8,1],
[0.27272727,0.45454545,0.63636364,0.81818182,1]])
Any ideas?

Slice the last element while keeping the dimensions and divide:
B = A/A[:,[-1]] # slice with [] to keep the dimensions
or, better, to avoid an unnecessary copy:
B = A/A[:,-1,None]
output:
array([[0.2 , 0.4 , 0.6 , 0.8 , 1. ],
[0.27272727, 0.45454545, 0.63636364, 0.81818182, 1. ]])
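The reason the dimensions have to be kept can be seen from the shapes involved. The following is a small sketch; the variable names last_1d, last_list and last_none are illustrative only.

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [15, 25, 35, 45, 55]])

# Plain integer index drops the axis: shape (2,)
last_1d = A[:, -1]

# Indexing with a list keeps the axis: shape (2, 1)
last_list = A[:, [-1]]

# Integer index plus None re-inserts the axis: shape (2, 1)
last_none = A[:, -1, None]

print(last_1d.shape, last_list.shape, last_none.shape)

# A has shape (2, 5); a (2, 1) divisor broadcasts across the 5 columns,
# so every row is divided by its own last element
B = A / last_none
print(B)
```

Dividing by the 1-D version, A / A[:, -1], would try to align the length-2 divisor with the 5 columns and fail with a broadcasting error.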

You mean this?
B = np.array([[A[i][j]/A[i][len(A[i])-1] for j in range(0,len(A[i]))] for i in range(0,len(A))])

You can achieve this using:
[list(map(lambda i: i / a[-1], a)) for a in A]
Result:
[[0.2, 0.4, 0.6, 0.8, 1.0], [0.2727272727272727, 0.45454545454545453, 0.6363636363636364, 0.8181818181818182, 1.0]]

Adding to @mozway's answer: it seems to be faster to take the last column first and then add an axis, for instance:
B = A/A[:,-1][:,None]
The original answer included a benchmark plot supporting this.

Extract from numpy array with coordinates of points

I'm currently writing some code in which I have to extract points from a numpy array.
For example, given [[1,1], [0.6,0.6], [0,0]], the extracted points [x,y] must satisfy x >= 0.5 and y >= 0.5.
I've tried to use numpy.extract with the condition arr[0]>=0.5 & arr[1]>=0.5, but that does not seem to work.
It applies the condition to all the elements, and I just want it applied to the points inside my array.
Thanks in advance!

You can use multiple conditions to slice an array as follows:
import numpy as np
a = np.array([[1, 1] , [0.6, 0.6], [0, 0]])
new = a[(a[:, 0] >= 0.5) & (a[:, 1] >= 0.5)]
Results:
array([[1. , 1. ],
[0.6, 0.6]])
The first condition filters on column 0 and the second condition filters on column 1. Only rows where both conditions are met will be in the results.
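As a sketch of why the attempt in the question fails and the version above works: arr[0] selects the first row (one point), whereas a[:, 0] selects the first column (all x values). Also, & binds more tightly than >=, so each comparison needs its own parentheses. The names x_ok, y_ok, mask and same below are illustrative.

```python
import numpy as np

a = np.array([[1, 1], [0.6, 0.6], [0, 0]])

# One boolean per point for each coordinate
x_ok = a[:, 0] >= 0.5   # column 0 holds the x values
y_ok = a[:, 1] >= 0.5   # column 1 holds the y values

# Element-wise AND of the two per-row conditions
mask = x_ok & y_ok
print(mask)

# Boolean mask indexing keeps only the rows where mask is True
selected = a[mask]
print(selected)

# Equivalent when every column uses the same threshold
same = a[np.all(a >= 0.5, axis=1)]
```

The np.all form is convenient when the threshold is shared; the explicit per-column form is easier to adapt when x and y need different conditions.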

I would do it the following way. First, look for the rows fulfilling the condition:
import numpy as np
a = np.array([[1,1] , [0.6,0.6], [0,0]])
rows = np.apply_along_axis(lambda x:x[0]>=0.5 and x[1]>=0.5,1,a)
then use it for indexing:
out = a[rows]
print(out)
output:
[[1. 1. ]
[0.6 0.6]]

It can be solved using a Python list comprehension:
import numpy as np
p = [[1,1] , [0.6,0.6], [0,0]]
result = np.array([x for x in p if x[0]>=0.5 and x[1]>=0.5])
You can also try this:
p = np.array(p)
result = p[np.all(p>=0.5, axis=1)]

Fastest way to add two arrays to create a matrix with python [duplicate]

This question already has answers here:
Subtract all pairs of values from two arrays
(2 answers)
Closed 4 years ago.
I have two numpy arrays:
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([3,4])
I would like to take every point from the points_1 array and subtract the whole points_2 array from it in order to get a matrix.
I would like to get
[[-1.5,-2.5]
[-0.5,-1.5]
[-2 , -3]
[0 , -1]]
I know there is a way with iteration
points = [x - points_2 for x in points_1]
points = np.array(points)
However, this option is not fast enough. In reality I am using much bigger arrays.
Is there a faster way?
Thanks!

You just have to give points_2 a different shape, so that it has an extra dimension; then broadcasting does what you expect.
So do not use points_2 = np.array([3, 4]) but points_2 = np.array([[3],[4]]):
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([[3],[4]])
points = (points_1 - points_2).transpose()
print(points)
results in:
[[-1.5 -2.5]
[-0.5 -1.5]
[-2. -3. ]
[ 0. -1. ]]

If you don't need the whole array at once, you can use generators and benefit from lazy evaluation:
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([3,4])
def get_points():
    def get_points_internal():
        for p1 in points_1:
            for p2 in points_2:
                yield [p1 - p2]
    x = len(points_1) * len(points_2)
    points_1d = get_points_internal()
    for i in range(0, int(x/2)):
        yield [next(points_1d), next(points_1d)]
points = get_points()

Make use of numpy's broadcasting feature. This will provide the following:
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([3,4])
points = points_1[:, None] - points_2
print(points)
Output:
[[-1.5 -2.5]
[-0.5 -1.5]
[-2. -3. ]
[ 0. -1. ]]
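It works through broadcasting: the None index inserts a new axis of length 1, so points_1[:, None] has shape (4, 1), and the subtraction is repeated along that axis against points_2 of shape (2,), giving a (4, 2) result. For more information, see the NumPy documentation on broadcasting.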
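The shapes involved can be inspected directly. This is a small sketch; the names column and outer are illustrative, and the comparison with np.subtract.outer is only a sanity check that both spellings give the same matrix.

```python
import numpy as np

points_1 = np.array([1.5, 2.5, 1, 3])
points_2 = np.array([3, 4])

# None (same as np.newaxis) turns the (4,) vector into a (4, 1) column
column = points_1[:, None]
print(column.shape)

# (4, 1) - (2,) broadcasts to (4, 2): every point_1 minus every point_2
points = column - points_2
print(points.shape)
print(points)

# The outer-subtraction ufunc method produces the same matrix
outer = np.subtract.outer(points_1, points_2)
print(np.array_equal(points, outer))
```

No Python-level loop runs here, which is why this scales much better than the list comprehension for large inputs.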

You can do it in one line :
np.subtract.outer(points_1,points_2)
This is vectorized, so it is very fast.

You need to use the transposed matrix:
points_1-np.transpose([points_2])
and, for your result:
np.transpose(points_1-np.transpose([points_2]))

Compare two numpy arrays by first Column and create a third numpy array by concatenating two arrays

I have two 2D numpy arrays which are used to plot simulation results.
The first column of both arrays a and b contains the time intervals and the second column contains the data to be plotted. The two arrays have different shapes: a is (500,2) and b is (600,2). I want to compare these two arrays by their first column and create a third array containing the matches found on the first column of a. If no match is found, add 0 in the third column.
Is there any numpy trick to do this?
For instance:
a=[[0.002,0.998],
[0.004,0.997],
[0.006,0.996],
[0.008,0.995],
[0.010,0.993]]
b= [[0.002,0.666],
[0.004,0.665],
[0.0041,0.664],
[0.0042,0.664],
[0.0043,0.664],
[0.0044,0.663],
[0.0045,0.663],
[0.0005,0.663],
[0.006,0.663],
[0.0061,0.662],
[0.008,0.661]]
expected output
c= [[0.002,0.998,0.666],
[0.004,0.997,0.665],
[0.006,0.996,0.663],
[0.008,0.995,0.661],
[0.010,0.993, 0 ]]

I can quickly think of the solution as
import numpy as np
a = np.array([[0.002, 0.998],
[0.004, 0.997],
[0.006, 0.996],
[0.008, 0.995],
[0.010, 0.993]])
b = np.array([[0.002, 0.666],
[0.004, 0.665],
[0.0041, 0.664],
[0.0042, 0.664],
[0.0043, 0.664],
[0.0044, 0.663],
[0.0045, 0.663],
[0.0005, 0.663],
[0.0006, 0.663],
[0.00061, 0.662],
[0.0008, 0.661]])
c = []
for row in a:
    index = np.where(b[:,0] == row[0])[0]
    if np.size(index) != 0:
        c.append([row[0], row[1], b[index[0], 1]])
    else:
        c.append([row[0], row[1], 0])
print(c)
As pointed out in the comments above, there seems to be a data entry error

import numpy as np
i = np.intersect1d(a[:,0], b[:,0])
overlap = np.vstack([i, a[np.in1d(a[:,0], i), 1], b[np.in1d(b[:,0], i), 1]]).T
underlap = np.setdiff1d(a[:,0], b[:,0])
underlap = np.vstack([underlap, a[np.in1d(a[:,0], underlap), 1], underlap*0]).T
fast_c = np.vstack([overlap, underlap])
This works by taking the intersection of the first column of a and b using intersect1d, and then using in1d to cross-reference that intersection with the second columns.
vstack stacks the elements of the input vertically, and the transpose is needed to get the right dimensions (very fast operation).
Then find times in a that are not in b using setdiff1d, and complete the result by putting 0s in the third column.
This prints out
array([[ 0.002, 0.998, 0.666],
[ 0.004, 0.997, 0.665],
[ 0.006, 0.996, 0. ],
[ 0.008, 0.995, 0. ],
[ 0.01 , 0.993, 0. ]])
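An alternative that keeps the rows of a in their original order is to look up each time of a in a sorted copy of b. The sketch below uses np.searchsorted, which is a swapped-in technique and not what the answer above does. It uses the b values as listed in the question, assumes the times in b are unique, and relies on exact floating-point equality, which holds here only because the times are typed in as identical literals. The names order, pos, found and third are illustrative.

```python
import numpy as np

a = np.array([[0.002, 0.998], [0.004, 0.997], [0.006, 0.996],
              [0.008, 0.995], [0.010, 0.993]])
b = np.array([[0.002, 0.666], [0.004, 0.665], [0.0041, 0.664],
              [0.0042, 0.664], [0.0043, 0.664], [0.0044, 0.663],
              [0.0045, 0.663], [0.0005, 0.663], [0.006, 0.663],
              [0.0061, 0.662], [0.008, 0.661]])

# Sort b by time so that a binary search can be used
order = np.argsort(b[:, 0])
b_sorted = b[order]

# For every time in a, find where it would sit in the sorted times of b
pos = np.searchsorted(b_sorted[:, 0], a[:, 0])
pos = np.clip(pos, 0, len(b_sorted) - 1)  # keep indices in range

# A match exists only if the time found at that position is identical
found = b_sorted[pos, 0] == a[:, 0]

# Take b's value where matched, 0 otherwise, and append as third column
third = np.where(found, b_sorted[pos, 1], 0.0)
c = np.column_stack([a, third])
print(c)
```

For times that come out of a computation rather than being typed in, comparing with a tolerance (for example via np.isclose) is likely safer than ==.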

The following works both for numpy arrays and simple python lists.
c = [[*x, y[1]] for x in a for y in b if x[0] == y[0]]
d = [[*x, 0] for x in a if x[0] not in [y[0] for y in b]]
c.extend(d)
Someone braver than I am could try to make this one line.
