Find the number closest to zero in an array - python

I have a bunch of numpy arrays that contain both positive and negative numbers, and I want to find the number closest to zero in each array, not the smallest number. I also want to retain the sign of that number. Example below:
array1 = np.array([5, 3.2, -1.2, -0.2, 7])
array2 = np.array([19, -20, -4.7, 6, 9, 42])
array3 = np.array([4, 0.3, -9, 8, 6, 14])
Ideal output would be something that gives me the number closest to zero, so for each array respectively it would be:
"Closest to zero for array 1:" -0.2
"Closest to zero for array 2:" -4.7
"Closest to zero for array 3:" 0.3
Is there any way to do this?

One way without numpy; using min with abs:
for arr in [array1, array2, array3]:
    print(arr, min(arr, key=abs))
Output:
[ 5. 3.2 -1.2 -0.2 7. ] -0.2
[ 19. -20. -4.7 6. 9. 42. ] -4.7
[ 4. 0.3 -9. 8. 6. 14. ] 0.3

A combination of argmin and abs:
>>> for array in (array1, array2, array3):
...     print(array, array[np.argmin(np.abs(array))])
[ 5. 3.2 -1.2 -0.2 7. ] -0.2
[ 19. -20. -4.7 6. 9. 42. ] -4.7
[ 4. 0.3 -9. 8. 6. 14. ] 0.3

# Start from the first element itself (not its absolute value), so the
# sign is preserved if array1[0] turns out to be the closest to zero.
min1 = array1[0]
for i in array1:
    if abs(i) < abs(min1):
        min1 = i
print("Closest to zero for array 1: " + str(min1))

If you are trying to get the element-wise minimum of two arrays, use np.minimum:
np.minimum([2, 3, 4], [1, 5, 2])
np.minimum(np.eye(2), [0.5, 2])  # broadcasting
reference: https://numpy.org/doc/stable/reference/generated/numpy.minimum.html
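Note that np.minimum answers a different question than the original post: it compares two arrays position by position, while "closest to zero" reduces a single array. A small sketch of the difference:

```python
import numpy as np

# np.minimum is element-wise: it compares two arrays position by position
pairwise = np.minimum([2, 3, 4], [1, 5, 2])
print(pairwise)  # [1 3 2]

# "Closest to zero" reduces a single array to one value, keeping its sign
arr = np.array([5, 3.2, -1.2, -0.2, 7])
closest = arr[np.abs(arr).argmin()]
print(closest)  # -0.2
```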

myList = [4, 1, 88, 44, 3, -1, -7, -19, -0.5, -0.2]

def compute_closest_to_zero(myList):
    positive = []
    negative = []
    if len(myList) == 0:
        print('0')
    else:
        for i in myList:
            if i >= 0:
                positive.append(i)
                #print(positive)
            else:
                negative.append(i)
                #print(negative)
        #print(min(positive))
        if min(positive) + max(negative) < 0:
            print(min(positive))
        else:
            print(max(negative))
    return
compute_closest_to_zero(myList)
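Note that the function above assumes the list contains at least one positive and one negative number; min(positive) or max(negative) raises ValueError otherwise. A shorter sketch that handles those edge cases, reusing the min(..., key=abs) idiom from the first answer:

```python
def closest_to_zero(values):
    # Empty input: fall back to 0, mirroring the print('0') branch above
    if not values:
        return 0
    # key=abs compares magnitudes but returns the signed original value,
    # so all-positive and all-negative lists work as well
    return min(values, key=abs)

print(closest_to_zero([4, 1, 88, 44, 3, -1, -7, -19, -0.5, -0.2]))  # -0.2
```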

Related

Formatting multidimensional numpy arrays to find the sum between 2 values - Python

The function below is meant to sum the second-column values of Numbers (Numbers[:,1]) whose first-column values (Numbers[:,0]) fall between two consecutive elements of the limit arrays limit1-limit3. For the first bin, no value of Numbers[:,0] lies between 0 and 2 (the first two elements of limit1), so the result is 0. For the second bin, 3 and 4 in Numbers[:,0] fall between 2 and 5 of limit1, so the corresponding second-column values are summed: 1 + 3 = 4. How could I implement this in the function below?
def formating(a, b, c):
    # Formating goes here
    x = np.sort(c)
    # digitize
    l = np.digitize(a, x)
    # output:
    result = np.bincount(l, weights=b)
    return result[1:len(b)]

Numbers = np.array([[3, 1], [4, 3], [5, 3], [7, 11], [8, 9], [10, 20], [20, 45]])
limit1 = np.array([0, 2, 5, 12, 15])
limit2 = np.array([0, 2, 5, 12])
limit3 = np.array([0, 2, 5, 12, 15, 22])
result1 = formating(Numbers[:, 0], Numbers[:, 1], limit1)
result2 = formating(Numbers[:, 0], Numbers[:, 1], limit2)
result3 = formating(Numbers[:, 0], Numbers[:, 1], limit3)
Expected Output
result1: [ 0. 4. 43. 0. ]
result2: [ 0. 4. 43. ]
result3: [ 0. 4. 43. 0. 45.]
Current Output
result1: [ 0. 4. 43. 0. 45.]
result2: [ 0. 4. 43. 45.]
result3: [ 0. 4. 43. 0. 45.]
This:
return result[1:len(b)]
should be
return result[1:len(c)]
Your return vector is dependent on the length of your bins, not your input data.
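Putting the fix together, a quick check against the asker's expected output (reusing the arrays from the question):

```python
import numpy as np

def formating(a, b, c):
    # Bucket each value of a into the sorted bin edges c,
    # then sum the matching weights b per bucket
    x = np.sort(c)
    l = np.digitize(a, x)
    result = np.bincount(l, weights=b)
    return result[1:len(c)]  # the fix: slice by the number of bin edges

Numbers = np.array([[3, 1], [4, 3], [5, 3], [7, 11], [8, 9], [10, 20], [20, 45]])
limit1 = np.array([0, 2, 5, 12, 15])
result1 = formating(Numbers[:, 0], Numbers[:, 1], limit1)
print(result1)  # [ 0.  4. 43.  0.]
```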

Python, compare n by m numpy array with n sized numpy array

I'm doing a programming project and I'm hard stuck for some reason.
gradeList = [-3, 0, 2, 4, 7, 10, 12]
for i1 in range(np.size(grades, 1) - 1):
    for i2 in range(np.size(grades, 0) - 1):
        for i3 in range(len(gradeList) - 1):
            if grades[i1, i2] != gradeList[i3]:
                print(grades[i1, i2])
                print(i1, i2, i3)
                print("This is an error" + str(grades[i1, i2]))
            else:
                print("FINE")
I'm trying to check each value in the n by m array against each value in my gradeList, and eventually I want to print the positions of the grades in the n by m array that are not in the gradeList. I get the following error:
IndexError: index 3 is out of bounds for axis 1 with size 3
My grades array:
grades = np.array([[7., 7., 4.], [12., 10., 10.], [-3., 7., 2.], [10., 12., 12.],
                   [12., 12., 12.], [10., 12., 12.], [-3.8, 2.2, 11.], [20., 12.6, 100.],
                   [4., -3., 7.], [10., 10., 10.], [4., -3., 7.], [10., 10., 10.],
                   [10., 10., 10.], [12., 12., 12.], [-3., -3., -3.], [20., 12.6, 100.]])
I think the problem lies there:
# i1 => [0,1]
# i2 => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
# i3 => [0, 1, 2, 3, 4, 5]
When you call grades[i1,i2] you can get grades[0,3], and it's out of bounds, since that axis has only three elements.
I guess the solution is to change grades[i1,i2] to grades[i2,i1] everywhere it appears (three times).
You've swapped the definitions of i1 and i2, which causes the error you're getting. Here's how to fix your code:
for i1 in range(grades.shape[0]):
    for i2 in range(grades.shape[1]):
        for i3 in range(len(gradeList)):
            if grades[i1, i2] != gradeList[i3]:
                print(grades[i1, i2])
                print(i1, i2, i3)
                print("This is an error" + str(grades[i1, i2]))
            else:
                print("FINE")
In the above code, grades.shape[0] is equivalent to np.size(grades, 0) in your original code; grades.shape[0] is the more commonly used idiom.
Additionally, I've removed all of the -1 adjustments from the definition of your ranges. If you have those there it will prevent your loops from reaching the last values in your arrays. The behavior of range is that it will stop one value before it reaches the maximum value you set it to.
For example, list(range(len(gradeList))) will return the complete set of indices of gradeList:
[0, 1, 2, 3, 4, 5, 6]
whereas list(range(len(gradeList) - 1)) will omit the last index:
[0, 1, 2, 3, 4, 5]
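As a side note (not part of the original answer), the triple loop can be avoided entirely: np.isin builds a boolean mask of valid grades in one call, and np.argwhere yields the positions the asker ultimately wants. A sketch on a trimmed-down grades array:

```python
import numpy as np

gradeList = [-3, 0, 2, 4, 7, 10, 12]
grades = np.array([[7., 7., 4.],
                   [12., 10., 10.],
                   [-3.8, 2.2, 11.]])

valid = np.isin(grades, gradeList)   # True where the grade is a legal value
bad_positions = np.argwhere(~valid)  # (row, col) of every invalid grade
print(bad_positions)   # rows/cols: [[2 0] [2 1] [2 2]]
print(grades[~valid])  # the offending values: [-3.8  2.2 11. ]
```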

Finding percentage change with Numpy

I'm writing a function to find the percentage change using Numpy and function calls. So far what I got is:
def change(a, b):
    answer = (np.subtract(a[b+1], a[b])) / a[b+1] * 100
    return answer

print(change(a, 0))
"a" is the array I have made and b will be the index/numbers I am trying to calculate.
For example:
My array is
[[1, 2, 3, 5, 7],
 [1, 4, 5, 6, 7],
 [5, 8, 9, 10, 32],
 [3, 5, 6, 13, 11]]
How would I calculate the percentage change between 1 and 2 (= 0.5), or 1 and 4 (= 0.75), or 5 and 7, and so on?
Note: I know how mathematically to get the change, I'm not sure how to do this in python/ numpy.
If I understand correctly, that you're trying to find percent change in each row, then you can do:
>>> np.diff(a) / a[:,1:] * 100
Which gives you:
array([[ 50. , 33.33333333, 40. , 28.57142857],
[ 75. , 20. , 16.66666667, 14.28571429],
[ 37.5 , 11.11111111, 10. , 68.75 ],
[ 40. , 16.66666667, 53.84615385, -18.18181818]])
I know you have asked this question with Numpy in mind and got answers above:
import numpy as np
np.diff(a) / a[:,1:]
I attempted to solve this with Pandas, for those who have the same question but want to use Pandas instead of NumPy:
import pandas as pd
data = [[1, 2, 3, 4, 5],
        [1, 4, 5, 6, 7],
        [5, 8, 9, 10, 32],
        [3, 5, 6, 13, 11]]
df = pd.DataFrame(data)
df_change = df.rolling(1, axis=1).sum().pct_change(axis=1)
print(df_change)
I suggest to simply shift the array. The computation basically becomes a one-liner.
import numpy as np
arr = np.array(
    [
        [1, 2, 3, 5, 7],
        [1, 4, 5, 6, 7],
        [5, 8, 9, 10, 32],
        [3, 5, 6, 13, 11],
    ]
)
# Percentage change from row to row
pct_chg_row = arr[1:] / arr[:-1] - 1
[[ 0. 1. 0.66666667 0.2 0. ]
[ 4. 1. 0.8 0.66666667 3.57142857]
[-0.4 -0.375 -0.33333333 0.3 -0.65625 ]]
# Percentage change from column to column
pct_chg_col = arr[:, 1::] / arr[:, 0:-1] - 1
[[ 1. 0.5 0.66666667 0.4 ]
[ 3. 0.25 0.2 0.16666667]
[ 0.6 0.125 0.11111111 2.2 ]
[ 0.66666667 0.2 1.16666667 -0.15384615]]
You could easily generalize the task, so that you are not limited to compute the change from one row/column to another, but be able to compute the change for n rows/columns.
n = 2
pct_chg_row_generalized = arr[n:] / arr[:-n] - 1
[[4. 3. 2. 1. 3.57142857]
[2. 0.25 0.2 1.16666667 0.57142857]]
pct_chg_col_generalized = arr[:, n:] / arr[:, :-n] - 1
[[2. 1.5 1.33333333]
[4. 0.5 0.4 ]
[0.8 0.25 2.55555556]
[1. 1.6 0.83333333]]
If the output array must have the same shape as the input array, you need to make sure to insert the appropriate number of np.nan.
out_row = np.full_like(arr, np.nan, dtype=float)
out_row[n:] = arr[n:] / arr[:-n] - 1
[[ nan nan nan nan nan]
[ nan nan nan nan nan]
[4. 3. 2. 1. 3.57142857]
[2. 0.25 0.2 1.16666667 0.57142857]]
out_col = np.full_like(arr, np.nan, dtype=float)
out_col[:, n:] = arr[:, n:] / arr[:, :-n] - 1
[[ nan nan 2. 1.5 1.33333333]
[ nan nan 4. 0.5 0.4 ]
[ nan nan 0.8 0.25 2.55555556]
[ nan nan 1. 1.6 0.83333333]]
Finally, a small function for the general 2D case might look like this:
def np_pct_chg(arr: np.ndarray, n: int = 1, axis: int = 0) -> np.ndarray:
    out = np.full_like(arr, np.nan, dtype=float)
    if axis == 0:
        out[n:] = arr[n:] / arr[:-n] - 1
    elif axis == 1:
        out[:, n:] = arr[:, n:] / arr[:, :-n] - 1
    return out
The accepted answer is close but incorrect if you're trying to take % difference from left to right.
You should get the following percent difference:
1,2,3,5,7 --> 100%, 50%, 66.66%, 40%
check for yourself: https://www.calculatorsoup.com/calculators/algebra/percent-change-calculator.php
Going by what Josmoor98 said, you can use np.diff(a) / a[:,:-1] * 100 to get the percent difference from left to right, which will give you the correct answer.
array([[100. , 50. , 66.66666667, 40. ],
[300. , 25. , 20. , 16.66666667],
[ 60. , 12.5 , 11.11111111, 220. ],
[ 66.66666667, 20. , 116.66666667, -15.38461538]])
import numpy as np
a = np.array([[1,2,3,5,7],
[1,4,5,6,7],
[5,8,9,10,32],
[3,5,6,13,11]])
np.array([(i[:-1]/i[1:]) for i in a])
Combine all your arrays, then make a data frame from them:
df = pd.DataFrame(data=<array you made>)
Use the pct_change() function on the dataframe. It will calculate the % change for all rows in the dataframe.
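A minimal sketch of those steps, assuming the 4x5 array from the question; note the constructor is pd.DataFrame, and axis=1 gives the column-to-column change within each row using the conventional (new - old) / old formula:

```python
import numpy as np
import pandas as pd

a = np.array([[1, 2, 3, 5, 7],
              [1, 4, 5, 6, 7],
              [5, 8, 9, 10, 32],
              [3, 5, 6, 13, 11]])

df = pd.DataFrame(a)
# axis=1 -> change from column to column within each row;
# the first column is NaN because it has no left neighbour
df_change = df.pct_change(axis=1)
print(df_change)  # first row: NaN, 1.0, 0.5, 0.666..., 0.4
```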

Finding 1000 linear interpolated values between every number of a list in Python

I am a beginner in Python and am stuck on a problem. I have two lists of 60 floating-point numbers; let's call them start and end. The numbers in the lists are not in increasing or decreasing order.
start = []  # 60 floating-point numbers
end = []    # 60 floating-point numbers
I would like to find 1000 interpolated values between start[0] and end[0] and repeat the process for all 60 values of the lists. How do I go about it?
You can do this with a list comprehension using numpy.linspace:
import numpy as np
[np.linspace(first, last, 1000) for first, last in zip(start, end)]
As a small example (with fewer values)
>>> start = [1, 5, 10]
>>> end = [2, 10, 20]
>>> [np.linspace(first, last, 5) for first, last in zip(start, end)]
[array([ 1. , 1.25, 1.5 , 1.75, 2. ]),
array([ 5. , 6.25, 7.5 , 8.75, 10. ]),
array([ 10. , 12.5, 15. , 17.5, 20. ])]
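As a side note, newer NumPy (1.16+) accepts array endpoints in np.linspace directly, so the list comprehension can be avoided; axis=1 lays the interpolated values out along the rows. A sketch with the same toy lists (use num=1000 for the real 60-element data):

```python
import numpy as np

start = [1, 5, 10]
end = [2, 10, 20]

# One (len(start), num) array in a single call, no Python-level loop
result = np.linspace(np.asarray(start), np.asarray(end), 5, axis=1)
print(result.shape)  # (3, 5)
print(result[0])     # [1.   1.25 1.5  1.75 2.  ]
```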

numpy how find local minimum in neighborhood on 1darray

I've got a list of sorted samples. They're sorted by their sample time, where each sample is taken one second after the previous one.
I'd like to find the minimum value in a neighborhood of a specified size.
For example, given a neighborhood size of 2 and the following samples:
samples = [ 5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2 ]
I'd expect the following output: [5, 2, 5, 2]
What would be the best way to achieve this in numpy/scipy?
Edited: Explained the reasoning behind the min values:
5 - the 2-number window next to it is [12.3, 12.3]; 5 is smaller
2 - to the left [12.3, 7], to the right [6, 9]; 2 is the min
5 - to the left [9, 10], to the right [9, 17]; 5 is the min
Notice that 9 isn't a min, as there's a 2-sized window to both its left and right containing a smaller value (2).
Use scipy's argrelextrema:
>>> import numpy as np
>>> from scipy.signal import argrelextrema
>>> data = np.array([ 5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2 ])
>>> radius = 2 # number of elements to the left and right to compare to
>>> argrelextrema(data, np.less, order=radius)
(array([4, 8]),)
Which suggest that numbers at position 4 and 8 (2 and 5) are the smallest ones in within a 2 size neighbourhood. The numbers at boundaries (5 and 2) are not detected since argrelextrema only supports clip or wrap boundary conditions. As for your question, I guess you are interested in them too. To detect them, it is easy to add reflect boundary conditions first:
>>> new_data = np.pad(data, radius, mode='reflect')
>>> new_data
array([ 12.3, 12.3, 5. , 12.3, 12.3, 7. , 2. , 6. , 9. ,
10. , 5. , 9. , 17. , 2. , 17. , 9. ])
With the data now having the appropriate boundary conditions, we can apply the previous extrema detector:
>>> arg_minimas = argrelextrema(new_data, np.less, order=radius)[0] - radius
>>> arg_minimas
array([ 0, 4, 8, 11])
Which returns the positions where the local extrema (minimum in this case since np.less) happens in a sliding window of radius=2.
NOTE the -radius, which corrects for the +radius index offset introduced by padding the array with np.pad.
EDIT: if you are interested in the values and not the positions, it is straightforward:
>>> data[arg_minimas]
array([ 5., 2., 5., 2.])
It seems you are basically finding local minima in a sliding window, where the end of the previous window acts as the start of the next. For such a specific problem, here is a vectorized approach that uses broadcasting -
import numpy as np
# Inputs
N = 2
samples = [ 5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2 ]
# Convert input list to a numpy array
S = np.asarray(samples)
# Calculate the number of Infs to be appended at the end
append_endlen = int(2*N*np.ceil((S.size+1)/(2*N))-1 - S.size)
# Append Infs at the start and end of the input array
S1 = np.concatenate((np.repeat(np.inf, N), S, np.repeat(np.inf, append_endlen)), 0)
# Number of sliding windows
num_windows = int((S1.size-1)/(2*N))
# Get windowed values from input array into rows.
# Thus, get minimum from each row to get the desired local minimum.
indexed_vals = S1[np.arange(num_windows)[:,None]*2*N + np.arange(2*N+1)]
out = indexed_vals.min(1)
Sample runs
Run # 1: Original input data
In [105]: S # Input array
Out[105]:
array([ 5. , 12.3, 12.3, 7. , 2. , 6. , 9. , 10. , 5. ,
9. , 17. , 2. ])
In [106]: N # Window radius
Out[106]: 2
In [107]: out # Output array
Out[107]: array([ 5., 2., 5., 2.])
Run # 2: Modified input data, Window radius = 2
In [101]: S # Input array
Out[101]:
array([ 5. , 12.3, 12.3, 7. , 2. , 6. , 9. , 10. , 5. ,
9. , 17. , 2. , 0. , -3. , 7. , 99. , 1. , 0. ,
-4. , -2. ])
In [102]: N # Window radius
Out[102]: 2
In [103]: out # Output array
Out[103]: array([ 5., 2., 5., -3., -4., -4.])
Run # 3: Modified input data, Window radius = 3
In [97]: S # Input array
Out[97]:
array([ 5. , 12.3, 12.3, 7. , 2. , 6. , 9. , 10. , 5. ,
9. , 17. , 2. , 0. , -3. , 7. , 99. , 1. , 0. ,
-4. , -2. ])
In [98]: N # Window radius
Out[98]: 3
In [99]: out # Output array
Out[99]: array([ 5., 2., -3., -4.])
>>> import numpy as np
>>> a = np.array(samples)
>>> [a[max(i-2,0):i+2].min() for i in range(1, a.size)]
[5.0, 2.0, 2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 2.0]
As Divakar pointed out in the comments, this is what a sliding window yields. If you want to remove duplicates, that can be done separately
This will look through each window, find the minimum value, and add it to a list if the window's minimum value isn't equal to the most recently added value.
samples = [5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2]
neighborhood = 2
minima = []
for i in range(len(samples)):
    window = samples[max(0, i - neighborhood):i + neighborhood + 1]
    windowMin = min(window)
    if minima == [] or windowMin != minima[-1]:
        minima.append(windowMin)
This gives the output you described:
print(minima)
> [5, 2, 5, 2]
However, #imaluengo's answer is better since it will include both of two consecutive equal minimum values if they have different indices in the original list!
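For what it's worth, on NumPy 1.20+ the windowing used in the vectorized answer above can also be written with sliding_window_view. This sketch is not part of any original answer; it reproduces the same stride-2N scheme (windows of length 2N+1 whose starts are 2N apart, padded with +inf):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

samples = np.array([5, 12.3, 12.3, 7, 2, 6, 9, 10, 5, 9, 17, 2])
N = 2  # neighborhood radius

# Pad with +inf (N in front, enough at the back) so every window is complete
end_pad = int(2 * N * np.ceil((samples.size + 1) / (2 * N)) - 1 - samples.size)
S1 = np.concatenate([np.full(N, np.inf), samples, np.full(end_pad, np.inf)])

# Windows of length 2N+1 whose starts are 2N apart, then one min per window
out = sliding_window_view(S1, 2 * N + 1)[::2 * N].min(axis=1)
print(out)  # [5. 2. 5. 2.]
```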
