I'm currently writing some code, and I need to extract points from a numpy array.
For example: [[1,1], [0.6,0.6], [0,0]], where each extracted point [x,y] must satisfy x >= 0.5 and y >= 0.5.
I've tried to use numpy extract with the condition arr[0]>=0.5 & arr[1]>=0.5, but that does not seem to work.
It applied the condition to all the elements, and I just want it to apply to the points inside my array.
Thanks in advance!
You can use multiple conditions to slice an array as follows:
import numpy as np
a = np.array([[1, 1] , [0.6, 0.6], [0, 0]])
new = a[(a[:, 0] >= 0.5) & (a[:, 1] >= 0.5)]
Results:
array([[1. , 1. ],
[0.6, 0.6]])
The first condition filters on column 0 and the second condition filters on column 1. Only rows where both conditions are met will be in the results.
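To make the difference from the attempt in the question concrete, here is a minimal sketch of the same mask built step by step. It relies on two facts: a[0] selects the first row while a[:, 0] selects the first column, and & binds more tightly than >=, which is why each comparison needs its own parentheses (or a call such as np.logical_and). The intermediate names x_ok, y_ok, mask and selected are only illustrative.

```python
import numpy as np

a = np.array([[1, 1], [0.6, 0.6], [0, 0]])

# a[0] is the first ROW ([1., 1.]); a[:, 0] is the first COLUMN (all x values).
x_ok = a[:, 0] >= 0.5
y_ok = a[:, 1] >= 0.5

# For boolean arrays, np.logical_and(x_ok, y_ok) is equivalent to x_ok & y_ok.
mask = np.logical_and(x_ok, y_ok)

# Indexing with a boolean mask keeps only the rows where the mask is True.
selected = a[mask]

print(mask)
print(selected)
```

The mask has one entry per point, so it can be reused to select the matching rows of any other array with the same number of rows.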
I would do it the following way: first, look for the rows fulfilling the condition:
import numpy as np
a = np.array([[1,1] , [0.6,0.6], [0,0]])
rows = np.apply_along_axis(lambda x:x[0]>=0.5 and x[1]>=0.5,1,a)
then use it for indexing:
out = a[rows]
print(out)
output:
[[1. 1. ]
[0.6 0.6]]
It can be solved using a Python list comprehension.
import numpy as np
p = [[1,1] , [0.6,0.6], [0,0]]
result = np.array([x for x in p if x[0]>=0.5 and x[1]>=0.5])
You can read more about list comprehensions in the Python documentation.
You can also try this:
p = np.array(p)
result = p[np.all(p >= 0.5, axis=1)]
Given three lists, e.g.
a = [0.4, 0.6, 0.8]
b = [0.3, 0.2, 0.5]
c = [0.1, 0.6, 0.12]
I want to generate a confusion matrix, which essentially applies a function (e.g. the correlation) between each of the combinations of the lists.
Essentially the calculations then look like this:
confusion_matrix = np.array([
[1,
scipy.stats.pearsonr(a, b)[0],
scipy.stats.pearsonr(a, c)[0]],
[scipy.stats.pearsonr(b, a)[0],
1,
scipy.stats.pearsonr(b, c)[0]],
[scipy.stats.pearsonr(c, a)[0],
scipy.stats.pearsonr(c, b)[0],
1]
])
Does a Python function exist that can generate such a matrix automatically, without spelling out every element? If it could also generate a heatmap from the matrix, that would be even better.
You can write a list comprehension:
import numpy as np
from scipy.stats import pearsonr
matrix = [a, b, c]
n = len(matrix)  # number of lists, not the length of any one list
np.array([
    [1 if i1 == i2 else pearsonr(matrix[i1], matrix[i2])[0]
     for i2 in range(n)] for i1 in range(n)
])
This outputs:
[[ 1. 0.65465367 0.03532591]
[ 0.65465367 1. -0.73233089]
[ 0.03532591 -0.73233089 1. ]]
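For the specific case of Pearson correlation, NumPy can build the whole matrix in one call. The sketch below uses np.corrcoef, which treats each row of its input as one variable; it does not cover arbitrary functions, only correlation. The name corr is just illustrative.

```python
import numpy as np

a = [0.4, 0.6, 0.8]
b = [0.3, 0.2, 0.5]
c = [0.1, 0.6, 0.12]

# Each list is one variable (one row), so the result is a 3x3 matrix
# of pairwise Pearson correlation coefficients with ones on the diagonal.
corr = np.corrcoef([a, b, c])

print(np.round(corr, 4))
```

Assuming matplotlib is installed, the matrix can then be shown as a heatmap with matplotlib.pyplot.imshow(corr) followed by matplotlib.pyplot.colorbar().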
I have an array:
xNew = np.array([[0.50,0.25],[-0.4,-0.2],[0.60,0.80],[1.20,1.90],[-0.10,0.60],[0.10,1.2]])
and another array:
x = np.array([[0.55,0.34],[0.45,0.26],[0.14,0.29],[0.85,0.89],[0.27,0.78],[0.45,0.05]])
If any element in a row of xNew is smaller than 0 or larger than 1, that row should be entirely replaced by the corresponding row of x. The desired output is:
xNew = np.array([[0.50,0.25],[0.45,0.26],[0.60,0.80],[0.85,0.89],[0.27,0.78],[0.45,0.05]])
I am looking for an efficient way to accomplish this using numpy functions.
Thanks!
You can use advanced indexing:
idx = ((xNew<0)|(xNew>1)).any(-1)
xNew[idx]=x[idx]
output:
[[0.5 0.25]
[0.45 0.26]
[0.6 0.8 ]
[0.85 0.89]
[0.27 0.78]
[0.45 0.05]]
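If xNew should not be modified in place, the same row mask can be combined with np.where to build a new array instead. This is a sketch of an alternative, not a replacement for the indexing approach; the names out_of_range and result are illustrative.

```python
import numpy as np

xNew = np.array([[0.50, 0.25], [-0.4, -0.2], [0.60, 0.80],
                 [1.20, 1.90], [-0.10, 0.60], [0.10, 1.2]])
x = np.array([[0.55, 0.34], [0.45, 0.26], [0.14, 0.29],
              [0.85, 0.89], [0.27, 0.78], [0.45, 0.05]])

# One boolean per row: True if any element of that row lies outside [0, 1].
out_of_range = ((xNew < 0) | (xNew > 1)).any(axis=1)

# out_of_range[:, None] has shape (6, 1), so it broadcasts across both columns:
# flagged rows are taken from x, all other rows from xNew.
result = np.where(out_of_range[:, None], x, xNew)

print(result)
```

Here xNew keeps its original values and result holds the repaired rows.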
for index, y in enumerate(xNew):
    if np.any(np.greater(y, [1, 1])) or np.any(np.less(y, [0, 0])):
        xNew[index] = x[index]
I have two numpy arrays:
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([3,4])
I would like to take every point in the points_1 array and subtract the whole points_2 array from it, in order to get a matrix.
I would like to get:
[[-1.5,-2.5]
[-0.5,-1.5]
[-2 , -3]
[0 , -1]]
I know there is a way with iteration
points = [x - points_2 for x in points_1]
points = np.array(points)
However this option is not fast enough. In reality I am using much bigger arrays.
Is there a faster way?
Thanks!
You just have to choose points_2 "better" (here, "better" means giving it a second dimension), and then it works as you expect:
so do not use points_2 = np.array([3, 4]) but points_2 = np.array([[3],[4]]):
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([[3],[4]])
points = (points_1 - points_2).transpose()
print(points)
results in:
[[-1.5 -2.5]
[-0.5 -1.5]
[-2. -3. ]
[ 0. -1. ]]
If you don't need the whole array at once, you can use generators and benefit from lazy evaluation:
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([3,4])
def get_points():
    def get_points_internal():
        # Yield the differences one at a time, row by row
        for p1 in points_1:
            for p2 in points_2:
                yield p1 - p2
    points_1d = get_points_internal()
    # Group the flat stream into rows of len(points_2) values
    for _ in range(len(points_1)):
        yield [next(points_1d) for _ in range(len(points_2))]
points = get_points()
Make use of numpy's broadcasting feature. This will provide the following:
import numpy as np
points_1 = np.array([1.5,2.5,1,3])
points_2 = np.array([3,4])
points = points_1[:, None] - points_2
print(points)
Output:
[[-1.5 -2.5]
[-0.5 -1.5]
[-2. -3. ]
[ 0. -1. ]]
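It works because indexing with None inserts a new axis of length 1, so points_1[:, None] has shape (4, 1). Broadcasting then stretches that axis against points_2, which has shape (2,), and the subtraction produces a (4, 2) result. For more information, see the NumPy documentation on broadcasting.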
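The shapes involved can be checked directly. The following sketch prints them at each step and confirms that the broadcast result matches np.subtract.outer; the names col and diff are illustrative.

```python
import numpy as np

points_1 = np.array([1.5, 2.5, 1, 3])
points_2 = np.array([3, 4])

# Indexing with None adds an axis of length 1: shape (4,) becomes (4, 1).
col = points_1[:, None]
print(col.shape)

# (4, 1) minus (2,) broadcasts to (4, 2): every point_1 minus every point_2.
diff = col - points_2
print(diff.shape)

# The same table of pairwise differences, computed with the ufunc's outer method.
print(np.array_equal(diff, np.subtract.outer(points_1, points_2)))
```

No Python-level loop runs here, which is what makes this approach suitable for much larger arrays.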
You can do it in one line :
np.subtract.outer(points_1,points_2)
This is vectorized, so it is very fast.
You need to use the transposed matrix.
points_1-np.transpose([points_2])
and for your result
np.transpose(points_1-np.transpose([points_2]))
For machine learning, I'm applying the Parzen window algorithm.
I have an array (m,n). I would like to check on each row if any of the values is > 0.5 and if each of them is, then I would return 0, otherwise 1.
I would like to know if there is a way to do this without a loop thanks to numpy.
You can use np.all with axis=1 on a boolean array.
import numpy as np
arr = np.array([[0.8, 0.9], [0.1, 0.6], [0.2, 0.3]])
print(np.all(arr>0.5, axis=1))
>> [True False False]
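The question asks for 0 or 1 rather than booleans. Under one reading of it (0 for a row whose values are all above 0.5, 1 otherwise), the boolean result can be mapped to labels with np.where. This is a sketch of that interpretation only; the name labels is illustrative.

```python
import numpy as np

arr = np.array([[0.8, 0.9], [0.1, 0.6], [0.2, 0.3]])

# True for rows in which every value exceeds 0.5
all_above = np.all(arr > 0.5, axis=1)

# Map True -> 0 and False -> 1, one label per row
labels = np.where(all_above, 0, 1)

print(labels)
```

If the intended test is "any value above 0.5" instead, np.all can be swapped for np.any without changing the rest.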
import numpy as np
# Value Initialization
a = np.array([0.75, 0.25, 0.50])
y_predict = np.zeros((1, a.shape[0]))
#If the value is greater than 0.5, the value is 1; otherwise 0
y_predict = (a > 0.5).astype(float)
I have an array (m,n). I would like to check on each row if any of the values is > 0.5
That will be stored in b:
import numpy as np
a = np.array([[0.8, 0.9], [0.1, 0.6], [0.2, 0.3]])  # example data; use your own array of shape (m, n)
b = np.any(a > 0.5, axis=1)
and if each of them is, then I would return 0, otherwise 1.
I'm assuming you mean 'and if this is the case for all rows'. In this case:
c = 1 - 1 * np.all(b)
c contains your return value, either 0 or 1.
I have two 2d numpy arrays which are used to plot simulation results.
The first column of both arrays a and b contains the time intervals and the second column contains the data to be plotted. The two arrays have different shapes a(500,2) b(600,2). I want to compare these two numpy arrays by first column and create a third array with matches found on the first column of a. If no match is found add 0 to third column.
Is there any numpy trick to do this?
For instance:
a=[[0.002,0.998],
[0.004,0.997],
[0.006,0.996],
[0.008,0.995],
[0.010,0.993]]
b= [[0.002,0.666],
[0.004,0.665],
[0.0041,0.664],
[0.0042,0.664],
[0.0043,0.664],
[0.0044,0.663],
[0.0045,0.663],
[0.0005,0.663],
[0.006,0.663],
[0.0061,0.662],
[0.008,0.661]]
expected output
c= [[0.002,0.998,0.666],
[0.004,0.997,0.665],
[0.006,0.996,0.663],
[0.008,0.995,0.661],
[0.010,0.993, 0 ]]
I can quickly think of the solution as
import numpy as np
a = np.array([[0.002, 0.998],
[0.004, 0.997],
[0.006, 0.996],
[0.008, 0.995],
[0.010, 0.993]])
b = np.array([[0.002, 0.666],
[0.004, 0.665],
[0.0041, 0.664],
[0.0042, 0.664],
[0.0043, 0.664],
[0.0044, 0.663],
[0.0045, 0.663],
[0.0005, 0.663],
[0.0006, 0.663],
[0.00061, 0.662],
[0.0008, 0.661]])
c = []
for row in a:
    # Indices of rows in b whose time value equals this row's time value
    index = np.where(b[:, 0] == row[0])[0]
    if np.size(index) != 0:
        c.append([row[0], row[1], b[index[0], 1]])
    else:
        c.append([row[0], row[1], 0])
print(c)
As pointed out in the comments above, there seems to be a data entry error.
import numpy as np
i = np.intersect1d(a[:,0], b[:,0])
overlap = np.vstack([i, a[np.in1d(a[:,0], i), 1], b[np.in1d(b[:,0], i), 1]]).T
underlap = np.setdiff1d(a[:,0], b[:,0])
underlap = np.vstack([underlap, a[np.in1d(a[:,0], underlap), 1], underlap*0]).T
fast_c = np.vstack([overlap, underlap])
This works by taking the intersection of the first column of a and b using intersect1d, and then using in1d to cross-reference that intersection with the second columns.
vstack stacks the elements of the input vertically, and the transpose is needed to get the right dimensions (very fast operation).
Then find times in a that are not in b using setdiff1d, and complete the result by putting 0s in the third column.
This prints out
array([[ 0.002, 0.998, 0.666],
[ 0.004, 0.997, 0.665],
[ 0.006, 0.996, 0. ],
[ 0.008, 0.995, 0. ],
[ 0.01 , 0.993, 0. ]])
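Exact equality on floating-point time stamps can miss matches when the two series were produced by different computations. A possible variation, sketched below, compares the time columns pairwise with np.isclose and keeps the rows in the order of a. The shortened b (using the 0.006 and 0.008 entries from the question), the tolerance atol=1e-9, and the names match, has_match, first_match and third are all assumptions made for illustration. The comparison builds an m-by-n boolean array, which is fine for shapes such as (500, 2) and (600, 2) but grows quickly for very large inputs.

```python
import numpy as np

a = np.array([[0.002, 0.998], [0.004, 0.997], [0.006, 0.996],
              [0.008, 0.995], [0.010, 0.993]])
# Shortened example data (assumption): a few rows of b from the question
b = np.array([[0.002, 0.666], [0.004, 0.665], [0.0041, 0.664],
              [0.006, 0.663], [0.008, 0.661]])

# match[i, j] is True when the time in a[i] is (nearly) equal to the time in b[j]
match = np.isclose(a[:, 0][:, None], b[:, 0][None, :], rtol=0, atol=1e-9)

has_match = match.any(axis=1)       # does row i of a have a partner in b?
first_match = match.argmax(axis=1)  # index of the first partner (0 if none)

# Take b's value where a partner exists, otherwise fill in 0
third = np.where(has_match, b[first_match, 1], 0.0)
c = np.column_stack([a, third])

print(c)
```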
The following works both for numpy arrays and simple python lists.
c = [[*x, y[1]] for x in a for y in b if x[0] == y[0]]
d = [[*x, 0] for x in a if x[0] not in [y[0] for y in b]]
c.extend(d)
Someone braver than I am could try to make this one line.