How to set threshold values in a numpy array? - python

I have an array of values and I want to set specific values to integers. Anything below 0.95 set to 0, anything above 1.6 set to 2. How can I set everything between 0.95 and 1.6 to 1?
n1_binary = np.where(n1_img_resize < 0.95, 0, n1_img_resize)
n1_binary = np.where(n1_binary > 1.6, 2, n1_binary)

Like this in one line using np.where:
n1_binary = np.where((n1_binary > 0.95) & (n1_binary <= 1.6), 1, n1_binary)
Check the example below:
In [652]: a = np.array([0.99, 1.23, 1.7, 9])
In [653]: a = np.where((a > 0.95) & (a <= 1.6), 1, a)
In [654]: a
Out[654]: array([1. , 1. , 1.7, 9. ])
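The full three-way mapping can also be done in one call with np.select (a sketch; the sample array here is made up):

```python
import numpy as np

a = np.array([0.5, 1.0, 1.2, 1.7, 2.0])
# conditions are checked in order; default=1 covers the middle range
result = np.select([a < 0.95, a > 1.6], [0, 2], default=1)
print(result.tolist())  # [0, 1, 1, 2, 2]
```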

Try this:
a = np.array([0.3, 5, 1.2, 2])
a[a < 0.95] = 0
a[a > 1.6] = 2
a[(a >= 0.95) & (a <= 1.6)] = 1
This is very clear and concise, and says exactly what you are doing. a is now:
[0.0, 2.0, 1.0, 2.0]

Related

how to use pandas list to solve linear equation using numpy

I was hoping to find out how to take two pandas lists and solve for x.
list_a = [-1, 1, 1, -1, 1, 1, 1, -1, 0, 1]
list_b = [0.0, 0.0, 1.75, -1.125, 1.0, 0.5, 0.0, -1.25, 1.375, -0.125]
for each entry in the list I would like to compute the following:
x + list_b = list_a
which would then return a list of 10 elements with the result for x.
Any help is appreciated!
If you convert list_a and list_b to numpy.array, you can solve for x using subtraction, which numpy performs elementwise:
>>> import numpy as np
>>> list_a = np.array([-1, 1, 1, -1, 1, 1, 1, -1, 0, 1])
>>> list_b = np.array([0.0, 0.0, 1.75, -1.125, 1.0, 0.5, 0.0, -1.25, 1.375, -0.125])
>>> x = list_a - list_b
>>> x
array([-1. , 1. , -0.75 , 0.125, 0. , 0.5 , 1. , 0.25 , -1.375, 1.125])
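Since the question mentions pandas, the same elementwise subtraction also works directly on pandas Series (a sketch using the question's data):

```python
import pandas as pd

list_a = pd.Series([-1, 1, 1, -1, 1, 1, 1, -1, 0, 1])
list_b = pd.Series([0.0, 0.0, 1.75, -1.125, 1.0, 0.5, 0.0, -1.25, 1.375, -0.125])
# subtraction is elementwise and aligned by index
x = list_a - list_b
print(x.tolist())  # [-1.0, 1.0, -0.75, 0.125, 0.0, 0.5, 1.0, 0.25, -1.375, 1.125]
```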

numpy first occurrence in array for an array of reference values

Given a threshold alpha and a numpy array a, there are multiple possibilities for finding the first index i such that a[i] > alpha; see Numpy first occurrence of value greater than existing value:
numpy.searchsorted(a, alpha)+1
numpy.argmax(a > alpha)
In my case, alpha can be either a scalar or an array of arbitrary shape. I'd like to have a function get_lowest that works in both cases:
alpha = 1.12
arr = numpy.array([0.0, 1.1, 1.2, 3.0])
get_lowest(arr, alpha) # 2
alpha = numpy.array([1.12, -0.5, 2.7])
arr = numpy.array([0.0, 1.1, 1.2, 3.0])
get_lowest(arr, alpha) # [2, 0, 3]
Any hints?
You can use broadcasting:
In [9]: arr = array([ 0. , 1.1, 1.2, 3. ])
In [10]: alpha = array([ 1.12, -0.5 , 2.7 ])
In [11]: np.argmax(arr > np.atleast_2d(alpha).T, axis=1)
Out[11]: array([2, 0, 3])
To collapse multidimensional arrays, you can use np.squeeze, but you might have to do something special if you want a Python float in your first case:
def get_lowest(arr, alpha):
    b = np.argmax(arr > np.atleast_2d(alpha).T, axis=1)
    b = np.squeeze(b)
    if np.size(b) == 1:
        return float(b)
    return b
searchsorted actually does the trick:
np.searchsorted(a, alpha)
The axis argument to argmax helps out; this
np.argmax(numpy.add.outer(alpha, -a) < 0, axis=-1)
does the trick. Indeed
import numpy as np
a = np.array([0.0, 1.1, 1.2, 3.0])
alpha = 1.12
np.argmax(np.add.outer(alpha, -a) < 0, axis=-1) # 2
np.searchsorted(a, alpha) # 2
alpha = np.array([1.12, -0.5, 2.7])
np.argmax(np.add.outer(alpha, -a) < 0, axis=-1) # [2 0 3]
np.searchsorted(a, alpha) # [2 0 3]
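Putting that together, a sketch of get_lowest that covers both the scalar and the array case, assuming arr is sorted ascending (side='right' makes the comparison strict):

```python
import numpy as np

def get_lowest(arr, alpha):
    # arr must be sorted ascending; side='right' skips entries equal to
    # alpha, so this is the first index where arr[i] > alpha, for scalar
    # and array alpha alike
    return np.searchsorted(arr, alpha, side='right')

arr = np.array([0.0, 1.1, 1.2, 3.0])
print(get_lowest(arr, 1.12))                         # 2
print(get_lowest(arr, np.array([1.12, -0.5, 2.7])))  # [2 0 3]
```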

Using Python numpy where condition to change entries below a certain value

Here's my array:
import numpy as np
a = np.array([0, 5.0, 0, 5.0, 5.0])
Is it possible to use numpy.where in some way to add a value x to all those entries in a that are less than l?
So something like:
a = a[np.where(a < 5).add(2.5)]
Should return:
array([2.5, 5.0, 2.5, 5.0, 5.0])
a = np.array([0., 5., 0., 5., 5.])
a[np.where(a < 5)] += 2.5
in case you really want to use where or just
a[a < 5] += 2.5
which I usually use for this kind of operation.
You could use np.where to create the array of additions and then simply add to a -
a + np.where(a < l, 2.5, 0)
Sample run -
In [16]: a = np.array([1, 5, 4, 5, 5])
In [17]: l = 5
In [18]: a + np.where(a < l, 2.5, 0)
Out[18]: array([ 3.5, 5. , 6.5, 5. , 5. ])
Given that you probably need to change the dtype (from int to float) you need to create a new array. A simple way without explicit .astype or np.where calls is multiplication with a mask:
>>> b = a + (a < 5) * 2.5
>>> b
array([ 2.5, 5. , 2.5, 5. , 5. ])
with np.where this can be changed to a simple expression (using the else-condition, third argument, in where):
>>> a = np.where(a < 5, a + 2.5, a)
>>> a
array([ 2.5, 5. , 2.5, 5. , 5. ])
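A note on the dtype point above: on an integer array, the in-place variant refuses the float addition in recent NumPy, so you have to promote to float first (a sketch):

```python
import numpy as np

a = np.array([1, 5, 4, 5, 5])      # integer dtype
try:
    a[a < 5] += 2.5                # in-place float add on an int array
except TypeError:
    pass                           # recent NumPy refuses this unsafe cast

b = a.astype(float)                # promote first, then modify in place
b[b < 5] += 2.5
print(b.tolist())  # [3.5, 5.0, 6.5, 5.0, 5.0]
```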
a += np.where(a < 1, 2.5, 0)
where will return the second argument wherever the condition (first argument) is satisfied and the third argument otherwise.
You can use a boolean mask as an index. Boolean operations, such as a < 1, return such an array.
>>> a<1
array([False, False, False, False, False], dtype=bool)
you can use it as
>>> a[a<1] += 1
The a<1 part selects only the items in a that match the condition. You can operate on this part only then.
If you want to keep a trace of your selection, you can proceed in two steps.
>>> mask = a>1
>>> a[mask] += 1
Also, you can count the items matching the conditions:
>>> print(np.sum(mask))

Digitize numpy array

I have two vectors:
time_vec = np.array([0.2,0.23,0.3,0.4,0.5,...., 28....])
values_vec = np.array([500,200,220,250,200,...., 218....])
time_vec.shape == values_vec.shape
Now, I want to bin the values into 0.5 second intervals and take the mean of each bin. So for example
value_vec = np.array(mean_of(500,200,220,250,200), mean_of(next values in next 0.5 second interval))
Is there any numpy method I am missing which bin and take mean of the bins?
You can use np.ufunc.reduceat. You just need to find the breaking points, i.e. the indices where floor(t / .5) changes:
say for:
>>> t
array([ 0. , 0.025 , 0.2125, 0.2375, 0.2625, 0.3375, 0.475 , 0.6875, 0.7 , 0.7375, 0.8 , 0.9 ,
0.925 , 1.05 , 1.1375, 1.15 , 1.1625, 1.1875, 1.1875, 1.225 ])
>>> b
array([ 0.8144, 0.3734, 1.4734, 0.6307, -0.611 , -0.8762, 1.6064, 0.3863, -0.0103, -1.6889, -0.4328, -0.7373,
1.7856, 0.8938, -1.1574, -0.4029, -0.4352, -0.4412, -1.7819, -0.3298])
the break points are:
>>> i = np.r_[0, 1 + np.nonzero(np.diff(np.floor(t / .5)))[0]]
>>> i
array([ 0, 7, 13])
and the sum over each interval is:
>>> np.add.reduceat(b, i)
array([ 3.411 , -0.6975, -3.6545])
and the mean is the sum divided by the length of each interval:
>>> np.add.reduceat(b, i) / np.diff(np.r_[i, len(b)])
array([ 0.4873, -0.1162, -0.5221])
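The steps above can be wrapped into one small helper (a sketch; bin_mean and width are made-up names, and t is assumed sorted ascending):

```python
import numpy as np

def bin_mean(t, b, width=0.5):
    # group boundaries occur where floor(t / width) changes value
    i = np.r_[0, 1 + np.nonzero(np.diff(np.floor(t / width)))[0]]
    sums = np.add.reduceat(b, i)          # sum of each group
    counts = np.diff(np.r_[i, len(b)])    # size of each group
    return sums / counts

t = np.array([0.1, 0.2, 0.6, 0.7, 1.2])
b = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
print(bin_mean(t, b).tolist())  # [2.0, 6.0, 9.0]
```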
You can pass a weights= parameter to np.histogram to compute the summed values within each time bin, then normalize by the bin count:
# 0.5 second time bins to average within
tmin = time_vec.min()
tmax = time_vec.max()
bins = np.arange(tmin - (tmin % 0.5), tmax - (tmax % 0.5) + 0.5, 0.5)
# summed values within each bin
bin_sums, edges = np.histogram(time_vec, bins=bins, weights=values_vec)
# number of values within each bin
bin_counts, edges = np.histogram(time_vec, bins=bins)
# average value within each bin
bin_means = bin_sums / bin_counts
You can use np.bincount that is supposedly pretty efficient for such binning operations. Here's an implementation based on it to solve our case -
# Find indices where 0.5 intervals shifts onto next ones
A = time_vec*2
idx = np.searchsorted(A,np.arange(1,int(np.ceil(A.max()))),'right')
# Setup ID array such that all 0.5 intervals are ID-ed same
out = np.zeros((A.size),dtype=int)
out[idx[idx < A.size]] = 1
ID = out.cumsum()
# Finally use bincount to sum and count elements of same IDs
# and thus get mean values per ID
mean_vec = np.bincount(ID,values_vec)/np.bincount(ID)
Sample run -
In [189]: time_vec
Out[189]:
array([ 0.2 , 0.23, 0.3 , 0.4 , 0.5 , 0.7 , 0.8 , 0.92, 0.95,
1. , 1.11, 1.5 , 2. , 2.3 , 2.5 , 4.5 ])
In [190]: values_vec
Out[190]: array([36, 11, 93, 32, 72, 75, 26, 28, 77, 31, 60, 77, 76, 32, 6, 85])
In [191]: ID
Out[191]: array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 5], dtype=int32)
In [192]: mean_vec
Out[192]: array([ 48.8, 47.4, 68.5, 76. , 19. , 85. ])
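Since the title mentions digitize, np.digitize can also build the per-sample bin IDs directly and feed them to np.bincount (a sketch with made-up sample data):

```python
import numpy as np

time_vec = np.array([0.2, 0.23, 0.3, 0.4, 0.45, 0.7, 0.8])
values_vec = np.array([500., 200., 220., 250., 200., 100., 300.])

# bin edges every 0.5 seconds, then one bin ID per sample
edges = np.arange(0, time_vec.max() + 0.5, 0.5)
ids = np.digitize(time_vec, edges) - 1
# per-bin sums divided by per-bin counts
means = np.bincount(ids, weights=values_vec) / np.bincount(ids)
print(means.tolist())  # [274.0, 200.0]
```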

matlab ismember function in python

Although similar questions have been raised a couple of times, I still cannot replicate MATLAB's ismember function in Python. In particular, I want to use this function in a loop, comparing a whole matrix to an element of another matrix in each iteration. Wherever the same value occurs, I want to print 1, and 0 in any other case.
Let's say that I have the following matrices
d = np.reshape(np.array([ 2.25, 1.25, 1.5 , 1. , 0. , 1.25, 1.75, 0. , 1.5 , 0. ]),(1,10))
d_unique = np.unique(d)
then I have
d_unique
array([ 0. , 1. , 1.25, 1.5 , 1.75, 2.25])
Now I want to iterate like
J = np.zeros(np.size(d_unique))
for i in range(len(d_unique)):
    J[i] = np.sum(ismember(d, d_unique[i]))
so as to take as an output:
J = [3,1,2,2,1,1]
Does anybody have any idea? Many thanks in advance.
In contrast to other answers, numpy has the built-in numpy.in1d for doing that.
Usage in your case:
bool_array = numpy.in1d(array1, array2)
Note: It also accepts lists as inputs.
EDIT (2021):
numpy now recommends using np.isin instead of np.in1d: np.isin preserves the shape of its input array, while np.in1d returns a flattened output.
To answer your question, I guess you could define an ismember function similar to:
def ismember(d, k):
    return [1 if i == k else 0 for i in d]
But I am not familiar with numpy, so a little adjustment may be in order.
I guess you could also use Counter from collections:
>>> from collections import Counter
>>> a = [2.25, 1.25, 1.5, 1., 0., 1.25, 1.75, 0., 1.5, 0. ]
>>> Counter(a)
Counter({0.0: 3, 1.25: 2, 1.5: 2, 2.25: 1, 1.0: 1, 1.75: 1})
>>> Counter(a).keys()
[2.25, 1.25, 0.0, 1.0, 1.5, 1.75]
>>> c = Counter(a)
>>> [c[i] for i in sorted(c.keys())]
[3, 1, 2, 2, 1, 1]
Once again, not numpy, you will probably have to do some list(d) somewhere.
Try the following function:
def ismember(A, B):
    return [np.sum(a == B) for a in A]
This should very much behave like the corresponding MATLAB function.
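For the specific counting loop in the question, np.unique with return_counts=True produces J in a single call (a sketch using the question's data):

```python
import numpy as np

d = np.array([2.25, 1.25, 1.5, 1., 0., 1.25, 1.75, 0., 1.5, 0.])
# counts are returned in the order of the sorted unique values,
# which matches d_unique from the question
d_unique, J = np.unique(d, return_counts=True)
print(J.tolist())  # [3, 1, 2, 2, 1, 1]
```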
Try the ismember library from pypi.
pip install ismember
Example:
# Import library
from ismember import ismember
import numpy as np
# data (as numpy arrays, so the index lookups below work)
d = np.array([2.25, 1.25, 1.5, 1., 0., 1.25, 1.75, 0., 1.5, 0.])
d_unique = np.array([0., 1., 1.25, 1.5, 1.75, 2.25])
# Lookup
Iloc,idx = ismember(d, d_unique)
# Iloc is boolean defining existence of d in d_unique
print(Iloc)
# [[True True True True True True True True True True]]
# indexes of d_unique that exists in d
print(idx)
# array([5, 2, 3, 1, 0, 2, 4, 0, 3, 0], dtype=int64)
print(d_unique[idx])
array([2.25, 1.25, 1.5 , 1. , 0. , 1.25, 1.75, 0. , 1.5 , 0. ])
print(d[Iloc])
array([2.25, 1.25, 1.5 , 1. , 0. , 1.25, 1.75, 0. , 1.5 , 0. ])
# These vectors will match
d[Iloc]==d_unique[idx]
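If you'd rather avoid the extra dependency, here is a pure-numpy sketch of the two-output ismember (exact matches only; the ismember below is our own helper, not the pypi one):

```python
import numpy as np

def ismember(a, b):
    # loc marks which elements of a appear in b; idx maps each of
    # those elements to its position in b (assumes exact matches)
    a, b = np.asarray(a), np.asarray(b)
    order = np.argsort(b)
    loc = np.isin(a, b)
    idx = order[np.searchsorted(b[order], a[loc])]
    return loc, idx

d = [2.25, 1.25, 1.5, 1., 0., 1.25, 1.75, 0., 1.5, 0.]
d_unique = [0., 1., 1.25, 1.5, 1.75, 2.25]
loc, idx = ismember(d, d_unique)
print(idx.tolist())  # [5, 2, 3, 1, 0, 2, 4, 0, 3, 0]
```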
