Find index of the first value that is below/above a threshold - python

I have a list with a series of random floats that go from negative to positive, like:
values = [0.001, 0.05, 0.09, 0.1, 0.4, 0.8, 0.9, 0.95, 0.99]
I want the index of the value where the list first crosses a given threshold. For example, if I want the closest value less than 0.1 I would get an index of 2, and if I want the first value greater than 0.9 I'd get 7.
I have a find_nearest method that I am using but since this dataset is randomized, this is not ideal.
EDIT: Figured out a solution.
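# last index whose value is below 0.1 (search from the end, then map back to the original index)
low = len(values) - 1 - next(i for i, v in enumerate(reversed(values)) if v < 0.1)
# first index whose value is above 0.9
high = next(i for i, v in enumerate(values) if v > 0.9)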
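Both generator expressions raise StopIteration when no value crosses the threshold. A minimal sketch that returns None instead, using the two-argument form of next(). The helper names first_index_above and last_index_below are made up for illustration:

```python
values = [0.001, 0.05, 0.09, 0.1, 0.4, 0.8, 0.9, 0.95, 0.99]

def first_index_above(values, threshold):
    """Index of the first value greater than threshold, or None if there is none."""
    return next((i for i, v in enumerate(values) if v > threshold), None)

def last_index_below(values, threshold):
    """Index of the last value less than threshold, or None if there is none."""
    backwards = range(len(values) - 1, -1, -1)
    return next((i for i in backwards if values[i] < threshold), None)

print(last_index_below(values, 0.1))   # 2
print(first_index_above(values, 0.9))  # 7
print(first_index_above(values, 2.0))  # None
```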

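If the values list is sorted and gets long, you may want the bisect module from the standard library.
bisect_left and bisect_right can stand in for the < and > tests: bisect_left(values, x) - 1 is the index of the last value below x, and bisect_right(values, x) is the index of the first value above x.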
import bisect
values = [0.001, 0.05, 0.09, 0.1, 0.4, 0.8, 0.9, 0.95, 0.99]
bisect.bisect_left(values, .1)
Out[226]: 3
bisect.bisect_right(values, .1)
Out[227]: 4
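To turn those insertion points into the two indices from the question, something along these lines should work. It assumes the list is sorted in ascending order, since bisect gives meaningless results otherwise. The names low and high simply mirror the question:

```python
import bisect

values = [0.001, 0.05, 0.09, 0.1, 0.4, 0.8, 0.9, 0.95, 0.99]

# bisect_left returns the first position whose value is >= 0.1,
# so the element just before it is the last one strictly below 0.1.
low = bisect.bisect_left(values, 0.1) - 1

# bisect_right returns the first position whose value is > 0.9.
high = bisect.bisect_right(values, 0.9)

print(low, high)  # 2 7
```

A result of low == -1 or high == len(values) means no element satisfies the test, so check for those before indexing.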

Related

Change one weight in a list and adjust all other weights accordingly so that the sum of the list is 1.0 again

I have a list of weights which all have a value range between 0.0 and 1.0. The sum of the values in list should be always 1.0.
Now I would like to write a function in which I can change one weight from the list by a certain value (positive or negative). The remaining weights of the list should be adjusted evenly, so that the sum of the list is 1.0 again at the end.
Example:
weights = [0.5, 0.2, 0.2, 0.1]
If I increase the second entry of the list by 0.3, the resulting list should look like this:
weights = [0.4, 0.5, 0.1, 0.0]
I've tried with the following function:
def change_weight(weights, index, value):
    result = []
    weight_to_change = weights[index] + value
    weights.pop(index)
    for i, weight in enumerate(weights):
        if i == index:
            result.append(weight_to_change)
        result.append(weight - value/len(weights))
    return result
This works perfectly for the example above:
weights = [0.5, 0.2, 0.2, 0.1]
print(change_weight(weights, 1, 0.3))
# like expected: [0.4, 0.5, 0.1, 0.0]
However, if I want to change the second weight by 0.5, the last element of the list gets a negative value:
weights = [0.5, 0.2, 0.2, 0.1]
print(change_weight(weights, 1, 0.5))
results in [0.33, 0.7, 0.03, -0.07]
However, I do not want any negative values in the list. Such values should instead be set to 0.0 and the remainder added or subtracted evenly to the other values.
Does anyone have an idea how I can implement this?
Here is an implementation of @RemiCuingnet's idea:
def change_weight(weights, index, value):
    new_weight = weights[index] + value
    old_sum = sum(w for i,w in enumerate(weights) if i != index)
    new_weights = []
    for i,w in enumerate(weights):
        if i == index:
            new_weights.append(new_weight)
        else:
            new_weights.append(w*(1-new_weight)/old_sum)
    return new_weights
For example
print(change_weight([0.5, 0.2, 0.2, 0.1],1,.3))
print(change_weight([0.5, 0.2, 0.2, 0.1],1,.5))
Output:
[0.3125, 0.5, 0.12500000000000003, 0.06250000000000001]
[0.18750000000000006, 0.7, 0.07500000000000002, 0.03750000000000001]
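The function above rescales the other weights proportionally, which is why its output differs from the question's expected [0.4, 0.5, 0.1, 0.0]. If the even adjustment with clamping at zero described in the question is wanted instead, a minimal sketch could look like the following. The name change_weight_evenly and the 1e-12 tolerance are assumptions for illustration, and the sketch does not check that the changed weight itself stays between 0.0 and 1.0:

```python
def change_weight_evenly(weights, index, value):
    """Shift weights[index] by value and spread -value evenly over the others.

    Entries that would drop below 0.0 are clamped to 0.0, and the part they
    could not absorb is spread evenly over the entries that are still positive.
    """
    result = list(weights)            # work on a copy, leave the input untouched
    result[index] += value
    remaining = -value                # amount still to be added to the other entries
    active = [i for i in range(len(result)) if i != index]
    while active and abs(remaining) > 1e-12:
        share = remaining / len(active)
        remaining = 0.0
        still_active = []
        for i in active:
            new = result[i] + share
            if new < 0:
                remaining += new      # negative leftover, redistributed next round
                result[i] = 0.0
            else:
                result[i] = new
                still_active.append(i)
        active = still_active
    return result

print(change_weight_evenly([0.5, 0.2, 0.2, 0.1], 1, 0.3))  # close to [0.4, 0.5, 0.1, 0.0]
print(change_weight_evenly([0.5, 0.2, 0.2, 0.1], 1, 0.5))  # close to [0.3, 0.7, 0.0, 0.0]
```

Each round either finishes the redistribution or removes at least one clamped entry, so the loop ends after at most len(weights) rounds.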

Python: Optimize weights in portfolio

I have the following dataframe with weights:
df = pd.DataFrame({'a': [0.1, 0.5, 0.1, 0.3], 'b': [0.2, 0.4, 0.2, 0.2], 'c': [0.3, 0.2, 0.4, 0.1],
'd': [0.1, 0.1, 0.1, 0.7], 'e': [0.2, 0.1, 0.3, 0.4], 'f': [0.7, 0.1, 0.1, 0.1]})
and then I normalize each row using:
df = df.div(df.sum(axis=1), axis=0)
I want to optimize the normalized weights of each row such that no weight is less than 0 or greater than 0.4.
If the weight is greater than 0.4, it will be clipped to 0.4 and the additional weight will be distributed to the other entries in a pro-rata fashion (meaning the second largest weight will receive more weight so it gets close to 0.4, and if there is any remaining weight, it will be distributed to the third and so on).
Can this be done using the "optimize" function?
Thank you.
UPDATE: I would also like to set a minimum bound for the weights. In my original question, the minimum weight bound was automatically considered as zero, however, I would like to set a constraint such that the minimum weight is at least equal to 0.05, for example.
Unfortunately, I can only find a loop solution to this problem. When you trim off the excess weight and redistribute it proportionally, entries that were under the limit may go over it, so they have to be trimmed as well. The cycle keeps repeating until no value is overweight. The same goes for underweight entries.
import numpy as np
import pandas as pd
# The original data frame. No normalization yet
df = pd.DataFrame(
    {
        "a": [0.1, 0.5, 0.1, 0.3],
        "b": [0.2, 0.4, 0.2, 0.2],
        "c": [0.3, 0.2, 0.4, 0.1],
        "d": [0.1, 0.1, 0.1, 0.7],
        "e": [0.2, 0.1, 0.3, 0.4],
        "f": [0.7, 0.1, 0.1, 0.1],
    }
)
def ensure_min_weight(row: np.ndarray, min_weight: float):
    # Modifies row in place: raise underweight entries to min_weight and take
    # the missing weight proportionally from the others, until none is underweight
    while True:
        underweight = row < min_weight
        if not underweight.any():
            break
        missing_weight = min_weight * underweight.sum() - row[underweight].sum()
        row[~underweight] -= missing_weight / row[~underweight].sum() * row[~underweight]
        row[underweight] = min_weight
def ensure_max_weight(row: np.ndarray, max_weight: float):
    # Modifies row in place: clip overweight entries to max_weight and give
    # the excess proportionally to the others, until none is overweight
    while True:
        overweight = row > max_weight
        if not overweight.any():
            break
        excess_weight = row[overweight].sum() - (max_weight * overweight.sum())
        row[~overweight] += excess_weight / row[~overweight].sum() * row[~overweight]
        row[overweight] = max_weight
values = df.to_numpy()
normalized = values / values.sum(axis=1)[:, None]
min_weight = 0.15 # just for fun
max_weight = 0.4
for i in range(len(values)):
    row = normalized[i]  # a view, so the in-place changes land in normalized
    ensure_min_weight(row, min_weight)
    ensure_max_weight(row, max_weight)
# Normalized weight
assert np.isclose(normalized.sum(axis=1), 1).all(), "Normalized weight must sum up to 1"
assert ((min_weight <= normalized) & (normalized <= max_weight)).all(), f"Normalized weight must be between {min_weight} and {max_weight}"
print(pd.DataFrame(normalized, columns=df.columns))
# Raw values
# values = normalized * values.sum(axis=1)[:, None]
# print(pd.DataFrame(values, columns=df.columns))
Note that this algorithm will run into an infinite loop if your min_weight and max_weight are infeasible: try min_weight = 0.4 and max_weight = 0.5 with the six columns above, where the minimums alone already sum to 2.4. You should handle these errors in the two ensure functions.
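One way to catch such inputs before entering either loop is a small feasibility check. A row of n weights can only sum to 1 within the bounds when n * min_weight <= 1 <= n * max_weight. The sketch below assumes that condition is all you want to verify. The function name check_weight_bounds and the tolerance tol are hypothetical, not part of the answer above:

```python
def check_weight_bounds(n_weights, min_weight, max_weight, tol=1e-12):
    """Raise ValueError if n_weights values in [min_weight, max_weight] cannot sum to 1."""
    if not 0 <= min_weight <= max_weight:
        raise ValueError("need 0 <= min_weight <= max_weight")
    if n_weights * min_weight > 1 + tol:
        # Even with every weight at its minimum the row would exceed 1
        raise ValueError("min_weight is too large for this many weights")
    if n_weights * max_weight < 1 - tol:
        # Even with every weight at its maximum the row would fall short of 1
        raise ValueError("max_weight is too small for this many weights")

check_weight_bounds(6, 0.15, 0.4)      # fine: 0.9 <= 1 <= 2.4
try:
    check_weight_bounds(6, 0.4, 0.5)   # infeasible: 6 * 0.4 = 2.4 > 1
except ValueError as exc:
    print(exc)
```

Calling it once with the number of columns before the row loop is enough, since every row has the same length.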

How to get elements from a specific range out of a list?

Does anybody have an idea how to get the elements in a list whose values fall within a specific (from - to) range?
I need a loop to check if a list contains elements in a specific range, and if there are any, I need the biggest one to be saved in a variable.
Example:
list = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
# range (0.5 - 0.58)
# biggest = 0.56
You could use a filtered comprehension to get only those elements in the range you want, then find the biggest of them using the built-in max():
lst = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
biggest = max([e for e in lst if 0.5 < e < 0.58])
# biggest = 0.56
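Note that max() raises ValueError on an empty sequence, which is what happens when nothing falls in the range. Since the question asks to check whether there are any such elements, a small variation using the default argument of max() (available since Python 3.4) may help. The inclusive bounds and the names low and high are assumptions here:

```python
lst = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
low, high = 0.5, 0.58

# default=None is returned when the generator yields nothing
biggest = max((e for e in lst if low <= e <= high), default=None)
print(biggest)  # 0.56

# No element lies between 0.7 and 0.8, so the default comes back
nothing = max((e for e in lst if 0.7 <= e <= 0.8), default=None)
print(nothing)  # None
```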
As an alternative to other answers, you can also use filter and lambda:
lst = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
biggest = max(filter(lambda x: 0.5 < x < 0.58, lst))
I suppose a normal if check would be faster, but I'll give this just for completeness.
Also, you should not use list = ... as a variable name, because it shadows the built-in list type in Python.
You could also go about it a step at a time, as the approach may aid in debugging.
I used numpy in this case, which is also a helpful tool to put in your tool belt.
This should run as is:
import numpy as np
l = [0.5, 0.56, 0.34, 0.45, 0.53, 0.6]
a = np.array(l)
low = 0.5
high = 0.58
index_low = (a < high)
print(index_low)
a_low = a[index_low]
print(a_low)
index_in_range = (a_low >= low)
print(index_in_range)
a_in_range = a_low[index_in_range]
print(a_in_range)
a_max = a_in_range.max()
print(a_max)

np.ceil in python doesn't work in for loop

I am quite new to Python and I have a problem with the np.ceil function. When I do np.ceil(10/0.1), I get 100, which is what I expect. However, when I do it in a for loop:
interval = np.arange(0.01,0.2,0.01)
for i in interval:
    print(np.ceil(10/i))
I obtain the right results for all values of i, except for i=0.1. For this I get 101 instead of 100. Can someone tell me why this is happening? Thank you!
It has nothing to do with being in a variable or not. The values are simply not the same:
In [9]: interval
Out[9]:
array([ 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09,
0.1 , 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19])
In [10]: x = interval[9]
In [11]: x
Out[11]: 0.099999999999999992
In [12]: i = 0.1
In [13]: x == i
Out[13]: False
Note, neither number is exactly 0.1 since that number cannot be represented exactly using binary floating point. Also note:
In [14]: type(x), type(i)
Out[14]: (numpy.float64, float)
Although, that isn't as relevant.
You can force another float representation by rounding it.
This should do the trick in your specific case.
import numpy as np
interval = np.arange(0.01,0.2,0.01)
for i in interval:
    if i == 0.1:
        print('this will miss')
    if i == interval[9]:
        print('this will hit')
    j = round(i, 3)  # rounding removes the accumulated floating-point error
    print(np.ceil(10/i), np.ceil(10/j))
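The same effect can be reproduced with plain Python floats, which may make the cause easier to see. The sketch below assumes the tenth grid point is effectively built as start + 9 * step. That assumption reproduces the value shown in Out[11] above, but it is not a statement about NumPy internals. The second half shows an alternative to rounding: building the grid from integers, so that each point is the float closest to k/100.

```python
import math

# Assumption: mimic the tenth grid point as start + 9 * step
x = 0.01 + 9 * 0.01
print(x == 0.1)            # False
print(math.ceil(10 / x))   # 101

# Grid built from integers: each point is the float nearest to k / 100
grid = [k / 100 for k in range(1, 20)]
print(grid[9] == 0.1)            # True
print(math.ceil(10 / grid[9]))   # 100
```

With NumPy the second approach would read np.arange(1, 20) / 100.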

Plotting several graphs with values extracted from one array

I have a NumPy array, let's say one with 4 rows and 6 (always an even number) columns:
m = np.round(np.random.rand(4, 6), 2)
array([[ 0.99, 0.48, 0.05, 0.26, 0.92, 0.44],
[ 0.81, 0.54, 0.19, 0.38, 0.5 , 0.02],
[ 0.11, 0.96, 0.04, 0.69, 0.78, 0.31],
[ 0.5 , 0.53, 0.94, 0.77, 0.6 , 0.75]])
I now want to plot graphs according to the column pairs, in this case
Graph 1: x-values=m[:,1] and y-values=m[:,0]
Graph 2: x-values=m[:,3] and y-values=m[:,2]
Graph 3: x-values=m[:,5] and y-values=m[:,4]
The first two columns are basically a pair of values, the next two are another pair of values and the last two also are a pair of values.
All the graphs should be in the same plot!
I need a general solution for plotting multiple graphs like this with an undefined but EVEN number of columns of the array. Something like a loop!
Hope somebody can help me :)
You can loop over the column pairs:
import matplotlib.pyplot as plt
i=1
while i<len(m[0]):
    x = m[:,i]    # odd column holds the x-values
    y = m[:,i-1]  # the even column before it holds the y-values
    plt.plot(x,y)
    plt.savefig('placeholderName_%d.png' % i)
    plt.close()
    i=i+2
Note that I'm starting at 1 and incrementing by two, which matches the example you presented.
This gives terrible results with the m array you specified, but if it was just a sample and your data is more realistic, the following should do:
for i in range(m.shape[1] // 2):
    plt.figure()
    # x-values come from the odd column, y-values from the even one, as in the question
    plt.plot(m[:, 2 * i + 1], m[:, 2 * i])
If you want all the plots on the same figure, just move the plt.figure() out of the loop:
plt.figure()
for i in range(m.shape[1] // 2):
    plt.plot(m[:, 2 * i + 1], m[:, 2 * i])
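The index arithmetic is the part that is easy to get backwards, so it may help to isolate it. Below is a small sketch of a helper that lists the (x column, y column) pairs from the question for any even number of columns. The name column_pairs is hypothetical:

```python
def column_pairs(n_columns):
    """Return (x_column, y_column) index pairs: (1, 0), (3, 2), (5, 4), ..."""
    if n_columns % 2:
        raise ValueError("expected an even number of columns")
    return [(x, x - 1) for x in range(1, n_columns, 2)]

print(column_pairs(6))  # [(1, 0), (3, 2), (5, 4)]
```

Each pair can then be passed to a single plot call, for example plt.plot(m[:, x_col], m[:, y_col]) for every x_col, y_col in column_pairs(m.shape[1]).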
