Chop a list into a list of lists [duplicate] - python

list_of_numbers = [10.0, 12.0, 14.0, 16.0, 15.0, 13.0, 10.0, 9.2, 11.7, 14.8, 16.5, 14.8, 13.8, 10.2, 9.5, 13.0, 14.4, 17.2, 15.4, 12.5, 12.1, 10.0, 12.4, 11.9, 16.8, 15.6, 14.6, 10.4, 10.4, 11.0, 12.2, 18.8, 13.9, 12.0, 6.8, 11.2, 9.4, 12.6, 15.5, 14.0, 11.2, 12.3, 14.3, 11.7, 13.9, 13.4, 21.4, 13.7, 12.6]
From this list I want to create a list of lists with 7 elements each. How do I do that? (There are 49 elements in the list, so I want to end up with a list of 7 lists. The order should remain the same as in list_of_numbers.)

You can use a simple combination of slicing and list comprehension:
result = [list_of_numbers[i:i+7]
          for i in range(0, len(list_of_numbers), 7)]

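You can use zip with iter as follows (in Python 3, wrap the call in list() because zip returns an iterator; note that each chunk comes back as a tuple):
list(zip(*[iter(list_of_numbers)]*7))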
Output:
[(10.0, 12.0, 14.0, 16.0, 15.0, 13.0, 10.0), (9.2, 11.7, 14.8, 16.5, 14.8, 13.8, 10.2), (9.5, 13.0, 14.4, 17.2, 15.4, 12.5, 12.1), (10.0, 12.4, 11.9, 16.8, 15.6, 14.6, 10.4), (10.4, 11.0, 12.2, 18.8, 13.9, 12.0, 6.8), (11.2, 9.4, 12.6, 15.5, 14.0, 11.2, 12.3), (14.3, 11.7, 13.9, 13.4, 21.4, 13.7, 12.6)]
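One difference between the two answers is worth noting: slicing keeps a shorter final chunk, while the zip idiom silently drops leftover elements when the length is not a multiple of the chunk size. A minimal sketch on a made-up 10-element list (the helper names chunks_by_slicing and chunks_by_zip are illustrative, not taken from the answers):

```python
def chunks_by_slicing(items, size):
    """Split items into lists of `size`; the last chunk may be shorter."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def chunks_by_zip(items, size):
    """Group items into tuples of `size`; leftover items are dropped."""
    # The same iterator object is repeated `size` times, so zip pulls
    # `size` consecutive items for every tuple it builds.
    return list(zip(*[iter(items)] * size))


numbers = list(range(10))
print(chunks_by_slicing(numbers, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(chunks_by_zip(numbers, 4))      # [(0, 1, 2, 3), (4, 5, 6, 7)]
```

With the 49-element list from the question and a chunk size of 7 there is no remainder, so both approaches return the same seven groups.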

Related

python, create a series of lists from two other lists with index

Hello, and thank you in advance for your help. I've been trying to learn Python on my own over the last few months.
I have two lists of lists:
countries_list = [['Canada'], ['China'], ['Finland'], ...]
ratios = [[10.2, 10.3, 11.4, 12.0], [8.2, 8.1, 9.0, 9.1], [15.4, 15.5, 15.8, 16.0], ...]
I want to merge the lists together according to their indices. For example, countries_list[0] = ['Canada'] and ratios[0] = [10.2, 10.3, 11.4, 12.0]. I want to use the indices to create this final list:
final_list = [[10.2, 10.3, 11.4, 12.0, 'Canada'], [8.2, 8.1, 9.0, 9.1,'China'], [15.4, 15.5, 15.8, 16.0, 'Finland']...]
This is the code I've come up with for now:
rows_list = []
for countries in countries_list:
    for ratio_list in ratios:
        current_ratios = []
        for r in ratio_list:
            current_ratios.append(r)
        current_ratios.append(countries)
        rows_list.append(current_ratios)
print(rows_list)
This is the output:
[[9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Eswatini'], [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Bahamas'], [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Jamaica'], [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Chad'], [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Kenya'], [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Mali'], [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Guyana'] ...]
As you can see, it is kinda close to the desired outcome, but the ratios are always the same. The nested loops are very confusing to me and I find myself wondering what the ordering is and just what's happening here in general.
You can use zip() + list-comprehension:
countries_list = [["Canada"], ["China"], ["Finland"]]
ratios = [
[10.2, 10.3, 11.4, 12.0],
[8.2, 8.1, 9.0, 9.1],
[15.4, 15.5, 15.8, 16.0],
]
out = [[*r, *c] for c, r in zip(countries_list, ratios)]
print(out)
Prints:
[
[10.2, 10.3, 11.4, 12.0, "Canada"],
[8.2, 8.1, 9.0, 9.1, "China"],
[15.4, 15.5, 15.8, 16.0, "Finland"],
]
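To see why the nested loops in the question repeat data, it may help to compare how many rows each approach produces. Nested loops visit every (country, ratio list) combination, while zip walks both lists in lockstep. A small sketch using shortened, made-up ratio lists:

```python
countries_list = [["Canada"], ["China"], ["Finland"]]
ratios = [[10.2, 10.3], [8.2, 8.1], [15.4, 15.5]]

# Nested loops: every country is combined with every ratio list (3 * 3 rows).
nested = [r + c for c in countries_list for r in ratios]

# zip: the i-th country is paired only with the i-th ratio list (3 rows).
paired = [r + c for c, r in zip(countries_list, ratios)]

print(len(nested))  # 9
print(paired)       # [[10.2, 10.3, 'Canada'], [8.2, 8.1, 'China'], [15.4, 15.5, 'Finland']]
```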

Python plot function in interval

from pylab import *
def x(t):
    if 0 <= t < 8:
        return(2*t)
    elif 8 <= t < 20:
        return(t**3)
t = arange(5.0, 20, 0.3)
print([i for i in t])
Output is
[5.0, 5.3, 5.6, 5.8999999999999995, 6.199999999999999, 6.499999999999999, 6.799999999999999, 7.099999999999999, 7.399999999999999, 7.699999999999998, 7.999999999999998, 8.299999999999997, 8.599999999999998, 8.899999999999999, 9.199999999999998, 9.499999999999996, 9.799999999999997, 10.099999999999998, 10.399999999999997, 10.699999999999996, 10.999999999999996, 11.299999999999997, 11.599999999999996, 11.899999999999995, 12.199999999999996, 12.499999999999996, 12.799999999999995, 13.099999999999994, 13.399999999999995, 13.699999999999996, 13.999999999999995, 14.299999999999994, 14.599999999999994, 14.899999999999995, 15.199999999999994, 15.499999999999993, 15.799999999999994, 16.099999999999994, 16.39999999999999, 16.699999999999992, 16.999999999999993, 17.299999999999994, 17.599999999999994, 17.89999999999999, 18.199999999999992, 18.499999999999993, 18.79999999999999, 19.09999999999999, 19.39999999999999, 19.699999999999992, 19.999999999999993]
What I want is
[5.0, 5.3, 5.6, 5.9, 6.2, 6.5, 6.8, 7.1, 7.4, 7.7, 8.0, so on]
When it gets to 8.0, my value is 7.999999999999998, which is less than 8, so x takes the wrong branch.
I want exactly 8.0 so that I can plot the function:
plot(t, array([x(i) for i in t]))
I guess a simple rounding off is all you need.
Change the last line to this:
print([round(i,1) for i in t])
Output:
[5.0, 5.3, 5.6, 5.9, 6.2, 6.5, 6.8, 7.1, 7.4, 7.7, 8.0, 8.3, 8.6, 8.9, 9.2, 9.5, 9.8, 10.1, 10.4, 10.7, 11.0, 11.3, 11.6, 11.9, 12.2, 12.5, 12.8, 13.1, 13.4, 13.7, 14.0, 14.3, 14.6, 14.9, 15.2, 15.5, 15.8, 16.1, 16.4, 16.7, 17.0, 17.3, 17.6, 17.9, 18.2, 18.5, 18.8, 19.1, 19.4, 19.7]
So in your case the code becomes something like:
from pylab import *
def x(t):
    if 0 <= t < 8:
        return(2*t)
    elif 8 <= t < 20:
        return(t**3)
t = arange(5.0, 20, 0.3)
t = [round(i,1) for i in t]
print(t)
Now you can use this t and get the following plot:
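Rounding after the fact works. Another option, sketched here as an assumption rather than as part of the answer, is to build the grid from integers and divide once per point, so that no error accumulates from step to step. Dividing the integer 80 by 10 gives exactly 8.0, so the t < 8 test behaves as intended:

```python
def x(t):
    """Piecewise function from the question."""
    if 0 <= t < 8:
        return 2 * t
    elif 8 <= t < 20:
        return t ** 3


# Tenths as integers: 50, 53, ..., 197, then one division per point.
t = [k / 10 for k in range(50, 200, 3)]
values = [x(i) for i in t]

print(t[9], t[10], t[11])  # 7.7 8.0 8.3
print(values[10])          # 512.0
```

Because the range stops before 200, the grid never reaches 20.0, for which x has no branch and would return None.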

max and min values in pandas data frame

I have a pandas dataframe which shows hourly temperature readings in 1990, as shown below:
Date and time Dry bulb temperature
0 1990-01-01 00:00:00 8.2
1 1990-01-01 01:00:00 8.1
2 1990-01-01 02:00:00 8.3
3 1990-01-01 03:00:00 8.5
4 1990-01-01 04:00:00 8.8
... ... ...
8755 1990-12-31 19:00:00 3.0
8756 1990-12-31 20:00:00 2.6
8757 1990-12-31 21:00:00 2.8
8758 1990-12-31 22:00:00 4.2
8759 1990-12-31 23:00:00 2.0
I want to calculate the max dry bulb temperature every 24 hours and get the corresponding date and time. How would I go about this?
So far I have:
o = []
for i in range(0, len(Dataframe['Dry bulb temperature']), 24):
    ymax = np.max(Dataframe['Dry bulb temperature'][i:i+24])
    o.append(ymax)
print(o)
which gives the max temp every 24 hours as follows:
[9.7, 9.9, 8.4, 10.4, 11.2, 12.0, 10.5, 10.7, 11.9, 12.0, 11.5, 11.4, 10.2, 10.9, 13.6, 11.5, 9.6, 10.9, 10.8, 12.3, 12.3, 12.2, 11.5, 7.9, 12.7, 6.0, 9.4, 8.2, 9.8, 10.6, 9.6, 8.8, 10.8, 8.6, 11.9, 11.7, 12.2, 13.8, 12.5, 10.8, 13.2, 8.2, 7.4, 12.1, 12.4, 8.6, 7.7, 12.3, 13.3, 12.3, 13.1, 12.0, 12.7, 11.5, 12.7, 12.5, 12.5, 8.7, 13.2, 7.7, 9.0, 10.1, 10.6, 10.9, 11.9, 11.4, 13.3, 12.2, 15.0, 14.1, 13.1, 12.9, 13.7, 12.7, 12.7, 16.3, 14.9, 12.8, 11.8, 14.2, 11.5, 11.7, 10.4, 10.1, 9.9, 9.6, 10.6, 12.7, 16.0, 15.3, 14.4, 14.2, 8.6, 7.0, 9.8, 11.6, 12.6, 11.1, 12.3, 12.2, 14.8, 15.2, 11.3, 12.1, 12.0, 12.3, 11.5, 10.8, 10.0, 11.7, 15.3, 12.9, 17.0, 17.6, 18.9, 14.2, 13.3, 14.9, 17.8, 20.6, 21.9, 24.1, 26.8, 25.4, 24.9, 23.5, 16.4, 14.9, 13.8, 14.2, 17.7, 17.9, 16.8, 15.7, 16.3, 18.9, 19.4, 18.3, 14.5, 17.6, 18.8, 18.1, 21.9, 18.2, 14.7, 14.9, 19.4, 20.0, 14.9, 18.9, 16.8, 17.6, 15.8, 14.6, 17.0, 15.6, 16.4, 15.0, 13.9, 18.5, 22.7, 16.4, 16.8, 15.6, 16.7, 19.0, 19.0, 17.2, 17.6, 18.7, 17.4, 15.5, 18.2, 17.8, 18.5, 21.9, 19.7, 21.2, 16.6, 17.3, 16.5, 16.3, 17.2, 18.5, 18.1, 17.3, 16.9, 21.3, 22.6, 17.5, 18.9, 21.9, 26.2, 26.5, 24.7, 25.3, 24.2, 23.3, 22.6, 23.1, 27.6, 30.2, 27.2, 22.1, 19.7, 22.6, 21.1, 23.8, 24.7, 22.1, 22.4, 23.7, 26.9, 29.2, 32.3, 30.0, 21.4, 22.2, 22.0, 23.0, 21.2, 22.6, 23.4, 24.9, 22.6, 19.7, 21.1, 18.9, 18.6, 22.0, 22.2, 19.4, 20.5, 24.8, 24.1, 27.0, 24.8, 25.1, 21.2, 22.6, 20.1, 18.3, 18.8, 20.6, 25.6, 22.1, 18.8, 17.7, 16.7, 18.4, 17.9, 20.2, 21.8, 20.6, 20.5, 21.0, 21.3, 19.6, 18.1, 17.4, 18.8, 16.0, 15.8, 15.9, 16.0, 14.4, 15.3, 16.4, 18.3, 17.3, 18.8, 17.3, 19.2, 16.0, 16.9, 16.4, 15.7, 19.7, 16.5, 14.0, 14.5, 14.7, 17.7, 15.2, 19.8, 18.6, 17.8, 18.0, 16.2, 16.7, 17.1, 17.7, 16.6, 16.1, 13.3, 16.3, 14.8, 14.8, 12.5, 12.8, 13.6, 10.2, 14.0, 12.9, 11.4, 10.7, 10.3, 10.4, 8.7, 9.7, 10.4, 11.0, 13.4, 13.9, 12.9, 16.3, 16.2, 13.1, 14.1, 15.8, 15.3, 12.0, 11.9, 9.7, 9.1, 6.7, 8.8, 7.4, 5.4, 7.9, 7.3, 6.3, 7.6, 8.1, 7.3, 6.6, 9.0, 10.0, 7.4, 4.7, 
9.6, 4.0, 3.3, 7.0, 9.7, 10.1, 5.4, 3.4, 3.7, 5.0, 2.3, 3.6, 6.9, 9.4, 12.1, 11.4, 10.1, 10.2, 9.7, 13.7, 7.3, 11.5, 9.4, 9.6, 9.0]
I want to get the corresponding dates for each max temperature in the form:
[9.7,1990-01-02 03:00:00],...,etc.
You can use this:
df['Date and time'] = pd.to_datetime(df['Date and time'])
df1 = df.set_index('Date and time').resample('D')['Dry bulb temperature'].agg(['max', 'min'])
It gives you this output for the visible data in your question:
max min
Date and time
1990-01-01 8.8 8.1
1990-12-31 4.2 2.0
If you really want the result as a list you can use this afterwards:
df1.reset_index().to_numpy()
[array([Timestamp('1990-01-01 00:00:00'), 8.8, 8.1], dtype=object),
array([Timestamp('1990-12-31 00:00:00'), 4.2, 2.0], dtype=object)]
To get the exact datetime of max value per day you can try this:
df2 = df.set_index('Date and time')
df2.loc[df2.groupby(df2.index.dayofyear).idxmax().iloc[:, 0]]
Dry_bulb_temperature
Date_and_time
1990-01-01 04:00:00 8.8
1990-12-31 22:00:00 4.2
You can try this one:
from datetime import timedelta
day = min(df['Date and time'])
max_day = max(df['Date and time'])
results = list()
while day <= max_day:
    # rows that fall within the current 24-hour window
    temp = df[(df['Date and time'] >= day) & (df['Date and time'] < day + timedelta(1))]
    # row with the max temperature (idxmax returns an index label, so use .loc)
    row = df.loc[temp['Dry bulb temperature'].idxmax()]
    results.append([row['Dry bulb temperature'], row['Date and time']])
    day += timedelta(1)
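For comparison, the same per-day lookup can be sketched without pandas, assuming the readings are available as (timestamp, temperature) pairs. The sample readings below are made up for illustration and the helper name daily_max is hypothetical:

```python
from datetime import datetime


def daily_max(readings):
    """Return [temperature, timestamp] for the hottest reading of each day."""
    best = {}  # calendar date -> (timestamp, temperature) of the day's maximum
    for stamp, temp in readings:
        day = stamp.date()
        # Keep the first reading that reaches the day's maximum.
        if day not in best or temp > best[day][1]:
            best[day] = (stamp, temp)
    return [[temp, stamp] for _, (stamp, temp) in sorted(best.items())]


readings = [
    (datetime(1990, 1, 1, 3), 8.5),
    (datetime(1990, 1, 1, 4), 8.8),
    (datetime(1990, 1, 2, 0), 7.9),
    (datetime(1990, 1, 2, 1), 9.7),
]
for temp, stamp in daily_max(readings):
    print(temp, stamp)
# 8.8 1990-01-01 04:00:00
# 9.7 1990-01-02 01:00:00
```

The result has the [temperature, timestamp] shape asked for in the question.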

Scaling down matplotlib y-axis values

I have written the following code (don't worry about the constants; they are just for the plotting):
percentage = [1.11, 1.63, 0.356, 0.808, 0.0459, 0.355, 0.133, 0.156, 0.0445, 0.631, 0.179, 0.226, 0.0272, 0.201, 0.177, 0.177, 0.224, 0.271, 0.176, 0.279, 0.302, 0.476, 0.397, 0.571, 0.491, 0.872, 1.08, 1.09, 1.23, 1.75, 1.96, 1.96, 1.68, 1.88, 1.57, 1.71, 1.09, 1.06, 1.05, 0.978, 0.724, 0.763, 0.691, 0.897, 0.817, 0.944, 0.825, 0.872, 0.911, 0.911, 0.895, 0.894, 0.823, 0.822, 0.838, 0.766, 0.766, 1.00, 1.01, 1.12, 1.14, 1.11, 1.57, 1.29, 1.69, 1.92, 1.99, 2.02, 2.04, 2.34, 2.45, 2.41, 2.44, 2.21, 2.13, 2.14, 1.89, 1.74, 1.53, 1.25, 1.31, 1.34, 1.38, 1.14, 1.00, 0.882, 0.826, 0.929, 0.580, 0.444, 0.293, 0.880, 0.618, 1.40, 0.538, 1.07]
result = dispatch_evs_arrival(1000, percentage)
samples = create_hist_value(result)
laws = pdf.create_distribution(2)
# Execute the algorithm
em.em_algorithm(samples, laws)
bins = []
i = 0
while i <= 96:
    bins.append(i*0.25)
    i = i + 1
matplotlib.rcParams.update({'font.size': 18})
# Plotting the graph.
plt.hist(samples, bins=bins, density=True, color='r', alpha=0.5, histtype='bar', ec='black')  # density replaces the normed argument, which newer Matplotlib versions no longer accept
plt.xlabel("Time of day - 15 min. resolution")
plt.ylabel("Probability in %")
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 10000)
plt.xlim(0, 25)
p = norm.pdf(x, laws[0].mean, laws[0].std_deviation)
p1 = norm.pdf(x, laws[1].mean, laws[1].std_deviation)
plt.plot(x, laws[0].weight * p + laws[1].weight * p1, 'k', linewidth=2)
plt.xticks(np.arange(0, 25, 2.0))
plt.yticks(np.arange(0, 0.13, 0.04))
#plt.plot(x, 'k', linewidth=2)
plt.grid()
plt.show()
This plots the following graph:
I would like to scale down the y-axis tick values by dividing them by 4, which would give me the same result overall. Is it possible to do this properly with matplotlib?
Edit: applying iCart's answer gives me this:
That is not what I want. I would like exactly the same result as in the first diagram, but with 0.03, 0.02 and 0.01 instead of 0.12, 0.08 and 0.04. I am pretty sure it should be possible, as the overall shape will not change.
Here are the samples:
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25,
0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.5, 0.5, 0.5, 0.75,
0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 1.25, 1.25, 1.25,
1.5, 1.75, 1.75, 2.25, 2.25, 2.25, 2.25, 2.25, 2.25, 2.5,
2.75, 2.75, 3.25, 3.25, 3.5, 3.75, 4.0, 4.0, 4.25, 4.25,
4.5, 4.75, 4.75, 5.0, 5.0, 5.0, 5.25, 5.25, 5.25, 5.25,
5.5, 5.5, 5.5, 5.75, 5.75, 5.75, 5.75, 5.75, 6.0, 6.0,
6.0, 6.0, 6.25, 6.25, 6.25, 6.25, 6.25, 6.25, 6.25, 6.25,
6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5,
6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75,
7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0,
7.0, 7.0, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25,
7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.5,
7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5,
7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.75, 7.75,
7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75,
7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 8.0, 8.0, 8.0,
8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0,
8.0, 8.0, 8.0, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25,
8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25,
8.25, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5,
8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.75, 8.75, 8.75, 8.75,
8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75,
8.75, 8.75, 8.75, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0,
9.0, 9.0, 9.0, 9.25, 9.25, 9.25, 9.25, 9.25, 9.25, 9.25,
9.25, 9.25, 9.25, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5,
9.5, 9.5, 9.5, 9.75, 9.75, 9.75, 9.75, 9.75, 9.75, 9.75,
9.75, 9.75, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.25,
10.25, 10.25, 10.25, 10.25, 10.25, 10.25, 10.5, 10.5, 10.5, 10.5,
10.5, 10.5, 10.75, 10.75, 10.75, 10.75, 10.75, 10.75, 10.75, 10.75,
11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.25, 11.25,
11.25, 11.25, 11.25, 11.25, 11.25, 11.25, 11.25, 11.5, 11.5, 11.5,
11.5, 11.5, 11.5, 11.5, 11.5, 11.75, 11.75, 11.75, 11.75, 11.75,
11.75, 11.75, 11.75, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0,
12.0, 12.0, 12.25, 12.25, 12.25, 12.25, 12.25, 12.25, 12.25, 12.25,
12.25, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5, 12.75,
12.75, 12.75, 12.75, 12.75, 12.75, 12.75, 12.75, 13.0, 13.0, 13.0,
13.0, 13.0, 13.0, 13.0, 13.0, 13.25, 13.25, 13.25, 13.25, 13.25,
13.25, 13.25, 13.25, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5,
13.5, 13.75, 13.75, 13.75, 13.75, 13.75, 13.75, 13.75, 14.0, 14.0,
14.0, 14.0, 14.0, 14.0, 14.0, 14.25, 14.25, 14.25, 14.25, 14.25,
14.25, 14.25, 14.25, 14.25, 14.25, 14.5, 14.5, 14.5, 14.5, 14.5,
14.5, 14.5, 14.5, 14.5, 14.5, 14.75, 14.75, 14.75, 14.75, 14.75,
14.75, 14.75, 14.75, 14.75, 14.75, 14.75, 15.0, 15.0, 15.0, 15.0,
15.0, 15.0, 15.0, 15.0, 15.0, 15.0, 15.0, 15.25, 15.25, 15.25,
15.25, 15.25, 15.25, 15.25, 15.25, 15.25, 15.25, 15.25, 15.5, 15.5,
15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5,
15.5, 15.5, 15.5, 15.75, 15.75, 15.75, 15.75, 15.75, 15.75, 15.75,
15.75, 15.75, 15.75, 15.75, 15.75, 16.0, 16.0, 16.0, 16.0, 16.0,
16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0,
16.0, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25,
16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25,
16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5,
16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.75,
16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75,
16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 17.0,
17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0,
17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.25,
17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25,
17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25,
17.25, 17.25, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5,
17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5,
17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.75, 17.75, 17.75, 17.75,
17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75,
17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75,
18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0,
18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0,
18.0, 18.0, 18.0, 18.0, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25,
18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25,
18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.5, 18.5, 18.5, 18.5,
18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5,
18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.75, 18.75, 18.75,
18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75,
18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 19.0, 19.0,
19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0,
19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.25, 19.25, 19.25, 19.25,
19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25,
19.25, 19.25, 19.25, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5,
19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.75, 19.75,
19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75,
20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0,
20.0, 20.0, 20.0, 20.25, 20.25, 20.25, 20.25, 20.25, 20.25, 20.25,
20.25, 20.25, 20.25, 20.25, 20.25, 20.25, 20.5, 20.5, 20.5, 20.5,
20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.75,
20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75,
21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0,
21.25, 21.25, 21.25, 21.25, 21.25, 21.25, 21.25, 21.25, 21.5, 21.5,
21.5, 21.5, 21.5, 21.5, 21.5, 21.5, 21.75, 21.75, 21.75, 21.75,
21.75, 21.75, 21.75, 21.75, 21.75, 22.0, 22.0, 22.0, 22.0, 22.0,
22.25, 22.25, 22.25, 22.25, 22.5, 22.5, 22.75, 22.75, 22.75, 22.75,
22.75, 22.75, 22.75, 22.75, 23.0, 23.0, 23.0, 23.0, 23.0, 23.0,
23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25,
23.25, 23.25, 23.25, 23.5, 23.5, 23.5, 23.5, 23.5, 23.75, 23.75,
23.75, 23.75, 23.75, 23.75, 23.75, 23.75, 23.75, 23.75]
The array of scaled samples is the same, but with all values divided by 4, which causes a shift on the x-axis.
plt.plot(x, (laws[0].weight * p + laws[1].weight * p1)/4, 'k', linewidth=2)
plt.xticks(np.arange(0, 25, 2.0))
plt.yticks(np.arange(0, 0.03, 0.01))
What is the result if you change these lines in your code as shown above and also add this?
plt.ylim(0, 0.03)
If you want to scale down the actual values, you can divide the numpy array directly:
In [1]: import numpy as np
In [2]: np.arange(0, 0.13, 0.04)
Out[2]: array([ 0. , 0.04, 0.08, 0.12])
In [3]: np.arange(0, 0.13, 0.04) / 4
Out[3]: array([ 0. , 0.01, 0.02, 0.03])
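The factor of 4 comes from the bin width: with a normalised histogram each bar shows a density, and density times the bin width (0.25) is the share of samples in that bin. A minimal pure-Python sketch of that arithmetic on a made-up sample list (the helper bin_stats is hypothetical and only approximates how plt.hist assigns values to bins):

```python
from collections import Counter


def bin_stats(samples, width=0.25):
    """Map each bin's left edge to (density, probability)."""
    n = len(samples)
    counts = Counter(int(s // width) for s in samples)
    stats = {}
    for index, count in sorted(counts.items()):
        probability = count / n          # share of samples in the bin
        density = probability / width    # bar height of a normalised histogram
        stats[index * width] = (density, probability)
    return stats


for edge, (density, probability) in bin_stats([0.0, 0.0, 0.25, 0.5]).items():
    print(edge, density, probability)
# 0.0 2.0 0.5
# 0.25 1.0 0.25
# 0.5 1.0 0.25
```

If the per-bin probability is what should appear on the y-axis, one possible route is to skip the normalisation and pass per-sample weights of 1/len(samples) through the weights argument of plt.hist; the fitted curve would then need the same division by 4 to stay aligned with the bars.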

How to improve the speed of my selection process, python

Edit: Due to errors in my code, I updated the question with my oldest, but working, code.
I get a list of speed recordings from a database, and I want to find the max speed in that list. Sounds easy enough, but I have some requirements for any max speed to count:
If the max speed is over a certain level, it has to have more than a certain number of records to be recognized as maximum speed. The reason for this logic is that I want the max speed under normal conditions, not just an error or one time occurrence. I also have a constraint that a speed has to be over a certain limit to be counted, for the same reason.
Here is the example on a speed array:
v = [8.0, 1.3, 0.7, 0.8, 0.9, 1.1, 14.9, 14.0, 14.1, 14.2, 14.3, 13.8, 13.9, 13.7, 13.6, 13.5, 13.4, 15.7, 15.8, 15.0, 15.3, 15.4, 15.5, 15.6, 15.2, 12.8, 12.7, 12.6, 8.7, 8.8, 8.6, 9.0, 8.5, 8.4, 8.3, 0.1, 0.0, 16.4, 16.5, 16.7, 16.8, 17.0, 17.1, 17.8, 17.7, 17.6, 17.4, 17.5, 17.3, 17.9, 18.2, 18.3, 18.1, 18.0, 18.4, 18.5, 18.6, 19.0, 19.1, 18.9, 19.2, 19.3, 19.9, 20.1, 19.8, 20.0, 19.7, 19.6, 19.5, 20.2, 20.3, 18.7, 18.8, 17.2, 16.9, 11.5, 11.2, 11.3, 11.4, 7.1, 12.9, 14.4, 13.1, 13.2, 12.5, 12.1, 12.2, 13.0, 0.2, 3.6, 7.4, 4.6, 4.5, 4.3, 4.0, 9.4, 9.6, 9.7, 5.8, 5.7, 7.3, 2.1, 0.4, 0.3, 16.1, 11.9, 12.0, 11.7, 11.8, 10.0, 10.1, 9.8, 15.1, 14.7, 14.8, 10.2, 10.3, 1.2, 9.9, 1.9, 3.4, 14.6, 0.6, 5.1, 5.2, 7.5, 19.4, 10.7, 10.8, 10.9, 0.5, 16.3, 16.2, 16.0, 16.6, 12.4, 11.0, 1.7, 1.6, 2.4, 11.6, 3.9, 3.8, 14.5, 11.1]
This is my code to find what I define as the true maximum speed:
from collections import Counter
# excerpt from inside a function; speeds is the list of recordings and np is numpy
while max(speeds)>30:
    speeds.remove(max(speeds))
nwsp = []
for s in speeds:
    nwsp.append(np.floor(s))
count = Counter(nwsp)
while speeds and max(speeds)>14 and count[np.floor(max(speeds))]<10:
    speeds.remove(max(speeds))
while speeds and max(speeds)<5:
    speeds.remove(max(speeds))
if speeds:
    print(max(speeds))
    return max(speeds)
else:
    return False
Result with v as shown above: 19.9
The reason I build nwsp is that it doesn't matter to me if, for example, 19.6 is only found 9 times: if any other number with the same integer part, for example 19.7, is found 3 times as well, then 19.6 will be valid.
How can I rewrite/optimize this code so the selection process is quicker? I already removed the max(speeds) and instead sorted the list and referenced the largest element using speeds[-1].
Sorry for not adding any unit to my speeds.
Your code is just slow because you call max and remove over and over and over again and each of those calls costs time proportional to the length of the list. Any reasonable solution will be much faster.
If you know that False can't happen, then this suffices:
speeds = [8.0, 1.3, 0.7, 0.8, 0.9, 1.1, 14.9, 14.0, 14.1, 14.2, 14.3, 13.8, 13.9, 13.7, 13.6, 13.5, 13.4, 15.7, 15.8, 15.0, 15.3, 15.4, 15.5, 15.6, 15.2, 12.8, 12.7, 12.6, 8.7, 8.8, 8.6, 9.0, 8.5, 8.4, 8.3, 0.1, 0.0, 16.4, 16.5, 16.7, 16.8, 17.0, 17.1, 17.8, 17.7, 17.6, 17.4, 17.5, 17.3, 17.9, 18.2, 18.3, 18.1, 18.0, 18.4, 18.5, 18.6, 19.0, 19.1, 18.9, 19.2, 19.3, 19.9, 20.1, 19.8, 20.0, 19.7, 19.6, 19.5, 20.2, 20.3, 18.7, 18.8, 17.2, 16.9, 11.5, 11.2, 11.3, 11.4, 7.1, 12.9, 14.4, 13.1, 13.2, 12.5, 12.1, 12.2, 13.0, 0.2, 3.6, 7.4, 4.6, 4.5, 4.3, 4.0, 9.4, 9.6, 9.7, 5.8, 5.7, 7.3, 2.1, 0.4, 0.3, 16.1, 11.9, 12.0, 11.7, 11.8, 10.0, 10.1, 9.8, 15.1, 14.7, 14.8, 10.2, 10.3, 1.2, 9.9, 1.9, 3.4, 14.6, 0.6, 5.1, 5.2, 7.5, 19.4, 10.7, 10.8, 10.9, 0.5, 16.3, 16.2, 16.0, 16.6, 12.4, 11.0, 1.7, 1.6, 2.4, 11.6, 3.9, 3.8, 14.5, 11.1]
from collections import Counter
count = Counter(map(int, speeds))
print(max(s for s in speeds
          if 5 <= s <= 30 and (s <= 14 or count[int(s)] >= 10)))
If the False case can happen, this would be one way:
speeds = [8.0, 1.3, 0.7, 0.8, 0.9, 1.1, 14.9, 14.0, 14.1, 14.2, 14.3, 13.8, 13.9, 13.7, 13.6, 13.5, 13.4, 15.7, 15.8, 15.0, 15.3, 15.4, 15.5, 15.6, 15.2, 12.8, 12.7, 12.6, 8.7, 8.8, 8.6, 9.0, 8.5, 8.4, 8.3, 0.1, 0.0, 16.4, 16.5, 16.7, 16.8, 17.0, 17.1, 17.8, 17.7, 17.6, 17.4, 17.5, 17.3, 17.9, 18.2, 18.3, 18.1, 18.0, 18.4, 18.5, 18.6, 19.0, 19.1, 18.9, 19.2, 19.3, 19.9, 20.1, 19.8, 20.0, 19.7, 19.6, 19.5, 20.2, 20.3, 18.7, 18.8, 17.2, 16.9, 11.5, 11.2, 11.3, 11.4, 7.1, 12.9, 14.4, 13.1, 13.2, 12.5, 12.1, 12.2, 13.0, 0.2, 3.6, 7.4, 4.6, 4.5, 4.3, 4.0, 9.4, 9.6, 9.7, 5.8, 5.7, 7.3, 2.1, 0.4, 0.3, 16.1, 11.9, 12.0, 11.7, 11.8, 10.0, 10.1, 9.8, 15.1, 14.7, 14.8, 10.2, 10.3, 1.2, 9.9, 1.9, 3.4, 14.6, 0.6, 5.1, 5.2, 7.5, 19.4, 10.7, 10.8, 10.9, 0.5, 16.3, 16.2, 16.0, 16.6, 12.4, 11.0, 1.7, 1.6, 2.4, 11.6, 3.9, 3.8, 14.5, 11.1]
from collections import Counter
count = Counter(map(int, speeds))
valids = [s for s in speeds
          if 5 <= s <= 30 and (s <= 14 or count[int(s)] >= 10)]
print(max(valids) if valids else False)
Or sort and use next, which can take your False as default:
speeds = [8.0, 1.3, 0.7, 0.8, 0.9, 1.1, 14.9, 14.0, 14.1, 14.2, 14.3, 13.8, 13.9, 13.7, 13.6, 13.5, 13.4, 15.7, 15.8, 15.0, 15.3, 15.4, 15.5, 15.6, 15.2, 12.8, 12.7, 12.6, 8.7, 8.8, 8.6, 9.0, 8.5, 8.4, 8.3, 0.1, 0.0, 16.4, 16.5, 16.7, 16.8, 17.0, 17.1, 17.8, 17.7, 17.6, 17.4, 17.5, 17.3, 17.9, 18.2, 18.3, 18.1, 18.0, 18.4, 18.5, 18.6, 19.0, 19.1, 18.9, 19.2, 19.3, 19.9, 20.1, 19.8, 20.0, 19.7, 19.6, 19.5, 20.2, 20.3, 18.7, 18.8, 17.2, 16.9, 11.5, 11.2, 11.3, 11.4, 7.1, 12.9, 14.4, 13.1, 13.2, 12.5, 12.1, 12.2, 13.0, 0.2, 3.6, 7.4, 4.6, 4.5, 4.3, 4.0, 9.4, 9.6, 9.7, 5.8, 5.7, 7.3, 2.1, 0.4, 0.3, 16.1, 11.9, 12.0, 11.7, 11.8, 10.0, 10.1, 9.8, 15.1, 14.7, 14.8, 10.2, 10.3, 1.2, 9.9, 1.9, 3.4, 14.6, 0.6, 5.1, 5.2, 7.5, 19.4, 10.7, 10.8, 10.9, 0.5, 16.3, 16.2, 16.0, 16.6, 12.4, 11.0, 1.7, 1.6, 2.4, 11.6, 3.9, 3.8, 14.5, 11.1]
from collections import Counter
count = Counter(map(int, speeds))
print(next((s for s in reversed(sorted(speeds))
            if 5 <= s <= 30 and (s <= 14 or count[int(s)] >= 10)),
           False))
Instead of Counter, you could also use groupby:
speeds = [8.0, 1.3, 0.7, 0.8, 0.9, 1.1, 14.9, 14.0, 14.1, 14.2, 14.3, 13.8, 13.9, 13.7, 13.6, 13.5, 13.4, 15.7, 15.8, 15.0, 15.3, 15.4, 15.5, 15.6, 15.2, 12.8, 12.7, 12.6, 8.7, 8.8, 8.6, 9.0, 8.5, 8.4, 8.3, 0.1, 0.0, 16.4, 16.5, 16.7, 16.8, 17.0, 17.1, 17.8, 17.7, 17.6, 17.4, 17.5, 17.3, 17.9, 18.2, 18.3, 18.1, 18.0, 18.4, 18.5, 18.6, 19.0, 19.1, 18.9, 19.2, 19.3, 19.9, 20.1, 19.8, 20.0, 19.7, 19.6, 19.5, 20.2, 20.3, 18.7, 18.8, 17.2, 16.9, 11.5, 11.2, 11.3, 11.4, 7.1, 12.9, 14.4, 13.1, 13.2, 12.5, 12.1, 12.2, 13.0, 0.2, 3.6, 7.4, 4.6, 4.5, 4.3, 4.0, 9.4, 9.6, 9.7, 5.8, 5.7, 7.3, 2.1, 0.4, 0.3, 16.1, 11.9, 12.0, 11.7, 11.8, 10.0, 10.1, 9.8, 15.1, 14.7, 14.8, 10.2, 10.3, 1.2, 9.9, 1.9, 3.4, 14.6, 0.6, 5.1, 5.2, 7.5, 19.4, 10.7, 10.8, 10.9, 0.5, 16.3, 16.2, 16.0, 16.6, 12.4, 11.0, 1.7, 1.6, 2.4, 11.6, 3.9, 3.8, 14.5, 11.1]
from itertools import groupby
groups = (list(group) for _, group in groupby(reversed(sorted(speeds)), int))
print(next((s[0] for s in groups
            if 5 <= s[0] <= 30 and (s[0] <= 14 or len(s) >= 10)),
           False))
Just in case all of these look odd to you, here's one close to your original. Just looking at the speeds from fastest to slowest and returning the first that matches the requirements:
def f(speeds):
    count = Counter(map(int, speeds))
    for speed in reversed(sorted(speeds)):
        if 5 <= speed <= 30 and (speed <= 14 or count[int(speed)] >= 10):
            return speed
    return False
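A Python 3 variant of the same idea that skips the sort and keeps the thresholds as parameters might look like this. It is only a sketch: the function name true_max_speed and the parameter names are assumptions, and it returns None rather than False when nothing qualifies.

```python
from collections import Counter


def true_max_speed(speeds, low=5, high=30, trusted=14, min_count=10):
    """Largest speed in [low, high] that is either at most `trusted`
    or shares its integer part with at least `min_count` readings."""
    count = Counter(int(s) for s in speeds)
    valid = (
        s for s in speeds
        if low <= s <= high and (s <= trusted or count[int(s)] >= min_count)
    )
    # One pass for counting and one for the maximum: linear in len(speeds).
    return max(valid, default=None)


# Made-up data: ten readings between 16.0 and 16.9, one isolated 21.7.
sample = [3.0, 9.5, 13.2, 21.7] + [(160 + i) / 10 for i in range(10)]
print(true_max_speed(sample))       # 16.9
print(true_max_speed([1.0, 2.0]))   # None
```

The isolated 21.7 is rejected because only one reading has integer part 21, while 16.9 is accepted because ten readings share integer part 16.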
Btw, your definition of "the true maximum speed" seems rather odd to me. How about just looking at a certain percentile? Maybe like this:
print(sorted(speeds)[len(speeds) * 9 // 10])
I'm not sure if this is faster, but it is shorter, and I think it achieves your requirements. It uses Counter.
from collections import Counter
import math
def valid(item):
    speed, count = item
    return speed <= 30 and (speed <= 13 or count >= 10)
speeds = [4,3,1,3,4,5,6,7,14,16,18,19,20,34,5,4,3,2,12,58,14,14,14]
speeds = map(math.floor,speeds)
counts = Counter(speeds)
max_valid_speed = max(filter(valid,counts.items()))
Result: max_valid_speed == (12,1)
Using your sort idea we can start at the end of the list at the numbers less than 30, returning on the first number that matched the criteria or returning False:
from collections import Counter
def f(speeds):
    # get speeds that satisfy the range
    rev = [speed for speed in speeds if 5 <= speed < 30]
    rev.sort(reverse=True)
    c = Counter((int(v) for v in rev))
    for speed in rev:
        # will hit highest numbers first
        # so return first that matches
        if speed > 14 and c[int(speed)] > 9 or speed < 15:
            return speed
    # we did not find any speed that matched our requirement
    return False
Output for your list v:
In [70]: f(v)
Out[70]: 19.9
Without sorting you could use a dict. Which approach is best depends on what your data is like; this one works for all cases, including an empty list:
from collections import defaultdict
def f_dict(speeds):
    d = defaultdict(lambda: defaultdict(lambda: 0, {}))
    for speed in speeds:
        key = int(speed)
        d[key]["count"] += 1
        if speed > d[key]["speed"]:
            d[key]["speed"] = speed
    filt = max(filter(lambda x: (15 <= x[0] < 30 and
               x[1]["count"] > 9 or x[0] < 15), d.items()), default=False)
    return filt[1]["speed"] if filt else False
Output:
In [95]: f_dict(v)
Out[95]: 19.9
