Scaling down matplotlib y-axis values - python
I have written the following code (ignore the constants; they are only there for the plotting):
percentage = [1.11, 1.63, 0.356, 0.808, 0.0459, 0.355, 0.133, 0.156, 0.0445, 0.631, 0.179, 0.226, 0.0272, 0.201, 0.177, 0.177, 0.224, 0.271, 0.176, 0.279, 0.302, 0.476, 0.397, 0.571, 0.491, 0.872, 1.08, 1.09, 1.23, 1.75, 1.96, 1.96, 1.68, 1.88, 1.57, 1.71, 1.09, 1.06, 1.05, 0.978, 0.724, 0.763, 0.691, 0.897, 0.817, 0.944, 0.825, 0.872, 0.911, 0.911, 0.895, 0.894, 0.823, 0.822, 0.838, 0.766, 0.766, 1.00, 1.01, 1.12, 1.14, 1.11, 1.57, 1.29, 1.69, 1.92, 1.99, 2.02, 2.04, 2.34, 2.45, 2.41, 2.44, 2.21, 2.13, 2.14, 1.89, 1.74, 1.53, 1.25, 1.31, 1.34, 1.38, 1.14, 1.00, 0.882, 0.826, 0.929, 0.580, 0.444, 0.293, 0.880, 0.618, 1.40, 0.538, 1.07]
result = dispatch_evs_arrival(1000, percentage)
samples = create_hist_value(result)
laws = pdf.create_distribution(2)
# Execute the algorithm
em.em_algorithm(samples, laws)
bins = []
i = 0
while i <= 96:
    bins.append(i*0.25)
    i = i + 1
matplotlib.rcParams.update({'font.size': 18})
# Plotting the graph.
plt.hist(samples, bins=bins, normed=1, color='r', alpha=0.5, histtype='bar', ec='black')
plt.xlabel("Time of day - 15 min. resolution")
plt.ylabel("Probability in %")
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 10000)
plt.xlim(0, 25)
p = norm.pdf(x, laws[0].mean, laws[0].std_deviation)
p1 = norm.pdf(x, laws[1].mean, laws[1].std_deviation)
plt.plot(x, laws[0].weight * p + laws[1].weight * p1, 'k', linewidth=2)
plt.xticks(np.arange(0, 25, 2.0))
plt.yticks(np.arange(0, 0.13, 0.04))
#plt.plot(x, 'k', linewidth=2)
plt.grid()
plt.show()
This produces the following graph:
I would like to scale down the y-axis tick values by dividing them by 4, which should leave the overall picture unchanged. Is it possible to do this properly with matplotlib?
Edit: applying iCart's answer gives me this:
Which is not what I want. I would like the exact same results as in the first diagram, but having 0.03, 0.02 and 0.01 instead of having 0.12, 0.08 and 0.04. I am pretty sure it should be possible, as the overall shape will not change.
Here are the samples:
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25,
0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.5, 0.5, 0.5, 0.75,
0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 0.75, 1.25, 1.25, 1.25,
1.5, 1.75, 1.75, 2.25, 2.25, 2.25, 2.25, 2.25, 2.25, 2.5,
2.75, 2.75, 3.25, 3.25, 3.5, 3.75, 4.0, 4.0, 4.25, 4.25,
4.5, 4.75, 4.75, 5.0, 5.0, 5.0, 5.25, 5.25, 5.25, 5.25,
5.5, 5.5, 5.5, 5.75, 5.75, 5.75, 5.75, 5.75, 6.0, 6.0,
6.0, 6.0, 6.25, 6.25, 6.25, 6.25, 6.25, 6.25, 6.25, 6.25,
6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5,
6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75, 6.75,
7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0, 7.0,
7.0, 7.0, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25,
7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.25, 7.5,
7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5,
7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.75, 7.75,
7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75,
7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 7.75, 8.0, 8.0, 8.0,
8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0, 8.0,
8.0, 8.0, 8.0, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25,
8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25, 8.25,
8.25, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5,
8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.75, 8.75, 8.75, 8.75,
8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75, 8.75,
8.75, 8.75, 8.75, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0,
9.0, 9.0, 9.0, 9.25, 9.25, 9.25, 9.25, 9.25, 9.25, 9.25,
9.25, 9.25, 9.25, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5,
9.5, 9.5, 9.5, 9.75, 9.75, 9.75, 9.75, 9.75, 9.75, 9.75,
9.75, 9.75, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.25,
10.25, 10.25, 10.25, 10.25, 10.25, 10.25, 10.5, 10.5, 10.5, 10.5,
10.5, 10.5, 10.75, 10.75, 10.75, 10.75, 10.75, 10.75, 10.75, 10.75,
11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.0, 11.25, 11.25,
11.25, 11.25, 11.25, 11.25, 11.25, 11.25, 11.25, 11.5, 11.5, 11.5,
11.5, 11.5, 11.5, 11.5, 11.5, 11.75, 11.75, 11.75, 11.75, 11.75,
11.75, 11.75, 11.75, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0,
12.0, 12.0, 12.25, 12.25, 12.25, 12.25, 12.25, 12.25, 12.25, 12.25,
12.25, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5, 12.5, 12.75,
12.75, 12.75, 12.75, 12.75, 12.75, 12.75, 12.75, 13.0, 13.0, 13.0,
13.0, 13.0, 13.0, 13.0, 13.0, 13.25, 13.25, 13.25, 13.25, 13.25,
13.25, 13.25, 13.25, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5, 13.5,
13.5, 13.75, 13.75, 13.75, 13.75, 13.75, 13.75, 13.75, 14.0, 14.0,
14.0, 14.0, 14.0, 14.0, 14.0, 14.25, 14.25, 14.25, 14.25, 14.25,
14.25, 14.25, 14.25, 14.25, 14.25, 14.5, 14.5, 14.5, 14.5, 14.5,
14.5, 14.5, 14.5, 14.5, 14.5, 14.75, 14.75, 14.75, 14.75, 14.75,
14.75, 14.75, 14.75, 14.75, 14.75, 14.75, 15.0, 15.0, 15.0, 15.0,
15.0, 15.0, 15.0, 15.0, 15.0, 15.0, 15.0, 15.25, 15.25, 15.25,
15.25, 15.25, 15.25, 15.25, 15.25, 15.25, 15.25, 15.25, 15.5, 15.5,
15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5, 15.5,
15.5, 15.5, 15.5, 15.75, 15.75, 15.75, 15.75, 15.75, 15.75, 15.75,
15.75, 15.75, 15.75, 15.75, 15.75, 16.0, 16.0, 16.0, 16.0, 16.0,
16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0, 16.0,
16.0, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25,
16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25, 16.25,
16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5,
16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.5, 16.75,
16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75,
16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 16.75, 17.0,
17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0,
17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.0, 17.25,
17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25,
17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25, 17.25,
17.25, 17.25, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5,
17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.5,
17.5, 17.5, 17.5, 17.5, 17.5, 17.5, 17.75, 17.75, 17.75, 17.75,
17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75,
17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75, 17.75,
18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0,
18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0,
18.0, 18.0, 18.0, 18.0, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25,
18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.25,
18.25, 18.25, 18.25, 18.25, 18.25, 18.25, 18.5, 18.5, 18.5, 18.5,
18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5,
18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.5, 18.75, 18.75, 18.75,
18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75,
18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 18.75, 19.0, 19.0,
19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.0,
19.0, 19.0, 19.0, 19.0, 19.0, 19.0, 19.25, 19.25, 19.25, 19.25,
19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25, 19.25,
19.25, 19.25, 19.25, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5,
19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.5, 19.75, 19.75,
19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75, 19.75,
20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0, 20.0,
20.0, 20.0, 20.0, 20.25, 20.25, 20.25, 20.25, 20.25, 20.25, 20.25,
20.25, 20.25, 20.25, 20.25, 20.25, 20.25, 20.5, 20.5, 20.5, 20.5,
20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.5, 20.75,
20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75, 20.75,
21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0, 21.0,
21.25, 21.25, 21.25, 21.25, 21.25, 21.25, 21.25, 21.25, 21.5, 21.5,
21.5, 21.5, 21.5, 21.5, 21.5, 21.5, 21.75, 21.75, 21.75, 21.75,
21.75, 21.75, 21.75, 21.75, 21.75, 22.0, 22.0, 22.0, 22.0, 22.0,
22.25, 22.25, 22.25, 22.25, 22.5, 22.5, 22.75, 22.75, 22.75, 22.75,
22.75, 22.75, 22.75, 22.75, 23.0, 23.0, 23.0, 23.0, 23.0, 23.0,
23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25, 23.25,
23.25, 23.25, 23.25, 23.5, 23.5, 23.5, 23.5, 23.5, 23.75, 23.75,
23.75, 23.75, 23.75, 23.75, 23.75, 23.75, 23.75, 23.75]
The array of scaled samples is the same, but with all values divided by 4, which causes a shift on the x-axis.
What is the result if you change these lines in your code:
plt.plot(x, (laws[0].weight * p + laws[1].weight * p1)/4, 'k', linewidth=2)
plt.xticks(np.arange(0, 25, 2.0))
plt.yticks(np.arange(0, 0.03, 0.01))
and add this one?
plt.ylim(0, 0.03)
If you want to scale down the actual values, you can divide the numpy array directly:
In [1]: import numpy as np
In [2]: np.arange(0, 0.13, 0.04)
Out[2]: array([ 0. , 0.04, 0.08, 0.12])
In [3]: np.arange(0, 0.13, 0.04) / 4
Out[3]: array([ 0. , 0.01, 0.02, 0.03])
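A note on why the factor is exactly 4: a normalized histogram draws a density (count divided by sample size and by bin width), and the bins in the question are 0.25 wide, so each bar is 4 times the probability of its bin. The sketch below checks that relation in plain Python on a small made-up sample; the names `samples`, `bin_width` and `n_bins` are illustrative and the data is not the asker's.

```python
# Hypothetical samples on a 15-minute (0.25 h) grid, not the asker's data.
samples = [0.0, 0.25, 0.25, 0.5, 0.5, 0.5, 0.75, 0.75]
bin_width = 0.25
n_bins = 4  # bins [0, 0.25), [0.25, 0.5), [0.5, 0.75), [0.75, 1.0]

# Count how many samples fall into each bin.
counts = [0] * n_bins
for s in samples:
    index = min(int(s / bin_width), n_bins - 1)
    counts[index] += 1

# Density, as a normalized histogram draws it: the bar areas sum to 1.
density = [c / (len(samples) * bin_width) for c in counts]

# Probability per bin: the bar heights sum to 1.
probability = [c / len(samples) for c in counts]

# Each density value is 1 / bin_width = 4 times the bin probability.
print(density)      # heights on the original y-axis
print(probability)  # heights after dividing by 4

# With matplotlib, passing per-sample weights instead of normalizing
# should draw the probabilities directly (assumption: same bins as above):
#   weights = [1.0 / len(samples)] * len(samples)
#   plt.hist(samples, bins=bins, weights=weights)
```

If the histogram is drawn this way, the fitted curve would need the same factor (multiply the mixture density by the bin width) so that bars and curve stay on one scale.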
Related
Python: How to compute multiple x intercept given two array?
I'm working on a project to visualize data but I've encountered an issue about finding multiple x intercept (maybe one, maybe at least two). Given that x = np.array([ 3. , 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5. , 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 6. , 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7. , 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8. , 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9, 9. , 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 10. , 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 11. , 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7, 11.8, 11.9, 12. , 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, 12.7, 12.8, 12.9, 13. , 13.1, 13.2, 13.3, 13.4, 13.5, 13.6, 13.7, 13.8, 13.9, 14. , 14.1, 14.2, 14.3, 14.4, 14.5, 14.6, 14.7, 14.8, 14.9, 15. , 15.1, 15.2, 15.3, 15.4, 15.5, 15.6, 15.7, 15.8, 15.9, 16. , 16.1, 16.2, 16.3, 16.4, 16.5, 16.6, 16.7, 16.8, 16.9, 17. , 17.1, 17.2, 17.3, 17.4, 17.5, 17.6, 17.7, 17.8, 17.9, 18. , 18.1, 18.2, 18.3, 18.4, 18.5, 18.6, 18.7, 18.8, 18.9, 19. , 19.1, 19.2, 19.3, 19.4, 19.5, 19.6, 19.7, 19.8, 19.9, 20. , 20.1, 20.2, 20.3, 20.4, 20.5, 20.6, 20.7, 20.8, 20.9, 21. , 21.1, 21.2, 21.3, 21.4, 21.5, 21.6, 21.7, 21.8, 21.9, 22. , 22.1, 22.2, 22.3, 22.4, 22.5, 22.6, 22.7, 22.8, 22.9, 23. , 23.1, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.9, 24. , 24.1, 24.2, 24.3, 24.4, 24.5, 24.6, 24.7, 24.8, 24.9, 25. , 25.1, 25.2, 25.3, 25.4, 25.5, 25.6, 25.7, 25.8, 25.9, 26. 
, 26.1, 26.2, 26.3, 26.4, 26.5, 26.6, 26.7, 26.8, 26.9]) y = np.array([ 28250., 27750., 27250., 26750., 26250., 25750., 25250., 24750., 24250., 23750., 23250., 22750., 22250., 21750., 21250., 20750., 20250., 19750., 19250., 18750., 18250., 17750., 17250., 16750., 16250., 15750., 15250., 14750., 14250., 13750., 13250., 12750., 12250., 11750., 11250., 10750., 10250., 9750., 9250., 8750., 8250., 7750., 7250., 6750., 6250., 5750., 5250., 4750., 4250., 3750., 3250., 2750., 2250., 1750., 1250., 750., 250., -250., -750., -1250., -1750., -2250., -2750., -3250., -3750., -4250., -4750., -5250., -5750., -6250., -6750., -7250., -7750., -8250., -8750., -9250., -9750., -10250., -10750., -11250., -11750., -12250., -12750., -13250., -13750., -14250., -14750., -15250., -15750., -16250., -16750., -17250., -17750., -18250., -18750., -19250., -19750., -20250., -20750., -21250., -21750., -22250., -22750., -23250., -23750., -24250., -24750., -25250., -25750., -26250., -26750., -27250., -27750., -28250., -28750., -29250., -29750., -30250., -30750., -31250., -31750., -31250., -30750., -30250., -29750., -29250., -28750., -28250., -27750., -27250., -26750., -26250., -25750., -25250., -24750., -24250., -23750., -23250., -22750., -22250., -21750., -21250., -20750., -20250., -19750., -19250., -18750., -18250., -17750., -17250., -16750., -16250., -15750., -15250., -14750., -14250., -13750., -13250., -12750., -12250., -11750., -11250., -10750., -10250., -9750., -9250., -8750., -8250., -7750., -7250., -6750., -6250., -5750., -5250., -4750., -4250., -3750., -3250., -2750., -2250., -1750., -1250., -750., -250., 250., 750., 1250., 1750., 2250., 2750., 3250., 3750., 4250., 4750., 5250., 5750., 6250., 6750., 7250., 7750., 8250., 8750., 9250., 9750., 10250., 10750., 11250., 11750., 12250., 12750., 13250., 13750., 14250., 14750., 15250., 15750., 16250., 16750., 17250., 17750., 18250., 18750., 19250., 19750., 20250., 20750., 21250., 21750., 22250., 22750., 23250., 23750., 24250., 24750., 25250., 25750., 
26250., 26750., 27250., 27750.]) The concept is finding the x value while the corresponding y value is 0, could you help me to figure it out? Thanks!
Your data looks like this and you want to get the x-intercepts:
One simple option using only numpy is to check whenever y changes sign:
s = np.sign(y)
# if you want to check the intercept with y=n
# use s = np.sign(y-n) instead
x[np.r_[s[:-1]!=s[1:], [False]]]
output:
array([ 8.6, 21.3])
NB. This works well here because your data is dense. If that is not the case, you might want to take the points before and after the sign change and use their mean:
s = np.sign(y)
mask = s[:-1]!=s[1:]
np.c_[x[np.r_[mask, [False]]],  # point before
      x[np.r_[[False], mask]],  # point after
     ].mean(1)
# array([ 8.65, 21.35])
Visual output:
Try np.where:
coordinates_where_y_is_zero = np.where(y == 0)
print(coordinates_where_y_is_zero)
corresponding_x = x[coordinates_where_y_is_zero]
print(corresponding_x)
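An exact test such as `y == 0` only finds samples that are exactly zero, and the sign-change approach reports a neighbouring sample or a midpoint. When the samples are unevenly spaced or the values on either side of the crossing differ in size, linear interpolation between the two samples may give a closer estimate. The function below is a minimal pure-Python sketch of that idea; `zero_crossings`, `xs` and `ys` are hypothetical names, and it assumes the curve is roughly straight between neighbouring samples.

```python
def zero_crossings(x, y):
    """Estimate the x positions where y crosses zero by linear interpolation."""
    roots = []
    for i in range(len(y) - 1):
        y0, y1 = y[i], y[i + 1]
        if y0 == 0:
            # A sample that is exactly zero is an intercept itself.
            roots.append(x[i])
        elif y0 * y1 < 0:
            # The straight line through (x[i], y0) and (x[i+1], y1)
            # reaches zero at this x.
            roots.append(x[i] - y0 * (x[i + 1] - x[i]) / (y1 - y0))
    if len(y) > 0 and y[-1] == 0:
        roots.append(x[-1])
    return roots


# Small made-up example with two sign changes.
xs = [3.0, 4.0, 5.0, 6.0]
ys = [500.0, -500.0, -250.0, 750.0]
crossings = zero_crossings(xs, ys)
print(crossings)
```

For the data in the question, y goes from 250 to -250 between x = 8.6 and x = 8.7, so interpolation and the midpoint agree on 8.65 there.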
python, create a series of lists from two other lists with index
Hello, and thank you in advance for your help. I've been trying to learn Python on my own over the last few months.
I have two lists of lists:
countries_list = [['Canada'], ['China'], ['Finland'], ...]
ratios = [[10.2, 10.3, 11.4, 12.0], [8.2, 8.1, 9.0, 9.1], [15.4, 15.5, 15.8, 16.0], ...]
I want to merge the lists together according to their indices. For example, countries_list[0] = ['Canada'] and ratios[0] = [10.2, 10.3, 11.4, 12.0]. I want to use the indices to create this final list:
final_list = [[10.2, 10.3, 11.4, 12.0, 'Canada'], [8.2, 8.1, 9.0, 9.1, 'China'], [15.4, 15.5, 15.8, 16.0, 'Finland'], ...]
This is the code I've come up with for now:
final_list = []
for countries in countries_list:
    for ratio_list in ratios:
        current_ratios = []
        for r in ratio_list:
            current_ratios.append(r)
        current_ratios.append(countries)
        rows_list.append(current_ratios)
print(rows_list)
This is the output:
[[9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Eswatini'],
 [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Bahamas'],
 [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Jamaica'],
 [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Chad'],
 [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Kenya'],
 [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Mali'],
 [9.8, 10.3, 10.9, 11.4, 12.0, 12.6, 14.8, 19.2, 25.2, 'Guyana'] ...]
As you can see, it is fairly close to the desired outcome, but the ratios are always the same. The nested loops are very confusing to me, and I find myself wondering what the ordering is and what is happening here in general.
You can use zip() + list-comprehension:
countries_list = [["Canada"], ["China"], ["Finland"]]
ratios = [
    [10.2, 10.3, 11.4, 12.0],
    [8.2, 8.1, 9.0, 9.1],
    [15.4, 15.5, 15.8, 16.0],
]
out = [[*r, *c] for c, r in zip(countries_list, ratios)]
print(out)
Prints:
[
    [10.2, 10.3, 11.4, 12.0, "Canada"],
    [8.2, 8.1, 9.0, 9.1, "China"],
    [15.4, 15.5, 15.8, 16.0, "Finland"],
]
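To see why the nested loops in the question behave differently from zip(), it may help to count the rows each one produces. The sketch below uses a shortened, two-country version of the data; `nested` and `paired` are illustrative names only.

```python
countries_list = [["Canada"], ["China"]]
ratios = [[10.2, 10.3], [8.2, 8.1]]

# Nested loops visit every (country, ratio_list) combination,
# so 2 countries and 2 ratio lists give 2 * 2 = 4 rows.
nested = [r + c for c in countries_list for r in ratios]

# zip() walks both lists in lockstep and pairs items that share an index,
# so it gives one row per country: 2 rows in total.
paired = [r + c for c, r in zip(countries_list, ratios)]

print(len(nested))
print(paired)
```

The nested version pairs each country with every ratio list, which is why the output in the question contains mismatched rows.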
max and min values in pandas data frame
I have a pandas dataframe which shows hourly temperature readings in 1990, as shown below: Date and time Dry bulb temperature 0 1990-01-01 00:00:00 8.2 1 1990-01-01 01:00:00 8.1 2 1990-01-01 02:00:00 8.3 3 1990-01-01 03:00:00 8.5 4 1990-01-01 04:00:00 8.8 ... ... ... 8755 1990-12-31 19:00:00 3.0 8756 1990-12-31 20:00:00 2.6 8757 1990-12-31 21:00:00 2.8 8758 1990-12-31 22:00:00 4.2 8759 1990-12-31 23:00:00 2.0 I want to calculate the max dry bulb temperature every 24 hours and get the corresponding date and time. How would I go about this? So far I have: o=[] for i in range(0, len(Dataframe['Dry bulb temperature']), 24): ymax = np.max(Dataframe['Dry bulb temperature'][i:i+24]) o.append(ymax) print(o) which gives the max temp every 24 hours as follows: [9.7, 9.9, 8.4, 10.4, 11.2, 12.0, 10.5, 10.7, 11.9, 12.0, 11.5, 11.4, 10.2, 10.9, 13.6, 11.5, 9.6, 10.9, 10.8, 12.3, 12.3, 12.2, 11.5, 7.9, 12.7, 6.0, 9.4, 8.2, 9.8, 10.6, 9.6, 8.8, 10.8, 8.6, 11.9, 11.7, 12.2, 13.8, 12.5, 10.8, 13.2, 8.2, 7.4, 12.1, 12.4, 8.6, 7.7, 12.3, 13.3, 12.3, 13.1, 12.0, 12.7, 11.5, 12.7, 12.5, 12.5, 8.7, 13.2, 7.7, 9.0, 10.1, 10.6, 10.9, 11.9, 11.4, 13.3, 12.2, 15.0, 14.1, 13.1, 12.9, 13.7, 12.7, 12.7, 16.3, 14.9, 12.8, 11.8, 14.2, 11.5, 11.7, 10.4, 10.1, 9.9, 9.6, 10.6, 12.7, 16.0, 15.3, 14.4, 14.2, 8.6, 7.0, 9.8, 11.6, 12.6, 11.1, 12.3, 12.2, 14.8, 15.2, 11.3, 12.1, 12.0, 12.3, 11.5, 10.8, 10.0, 11.7, 15.3, 12.9, 17.0, 17.6, 18.9, 14.2, 13.3, 14.9, 17.8, 20.6, 21.9, 24.1, 26.8, 25.4, 24.9, 23.5, 16.4, 14.9, 13.8, 14.2, 17.7, 17.9, 16.8, 15.7, 16.3, 18.9, 19.4, 18.3, 14.5, 17.6, 18.8, 18.1, 21.9, 18.2, 14.7, 14.9, 19.4, 20.0, 14.9, 18.9, 16.8, 17.6, 15.8, 14.6, 17.0, 15.6, 16.4, 15.0, 13.9, 18.5, 22.7, 16.4, 16.8, 15.6, 16.7, 19.0, 19.0, 17.2, 17.6, 18.7, 17.4, 15.5, 18.2, 17.8, 18.5, 21.9, 19.7, 21.2, 16.6, 17.3, 16.5, 16.3, 17.2, 18.5, 18.1, 17.3, 16.9, 21.3, 22.6, 17.5, 18.9, 21.9, 26.2, 26.5, 24.7, 25.3, 24.2, 23.3, 22.6, 23.1, 27.6, 30.2, 27.2, 22.1, 19.7, 22.6, 21.1, 23.8, 24.7, 22.1, 
22.4, 23.7, 26.9, 29.2, 32.3, 30.0, 21.4, 22.2, 22.0, 23.0, 21.2, 22.6, 23.4, 24.9, 22.6, 19.7, 21.1, 18.9, 18.6, 22.0, 22.2, 19.4, 20.5, 24.8, 24.1, 27.0, 24.8, 25.1, 21.2, 22.6, 20.1, 18.3, 18.8, 20.6, 25.6, 22.1, 18.8, 17.7, 16.7, 18.4, 17.9, 20.2, 21.8, 20.6, 20.5, 21.0, 21.3, 19.6, 18.1, 17.4, 18.8, 16.0, 15.8, 15.9, 16.0, 14.4, 15.3, 16.4, 18.3, 17.3, 18.8, 17.3, 19.2, 16.0, 16.9, 16.4, 15.7, 19.7, 16.5, 14.0, 14.5, 14.7, 17.7, 15.2, 19.8, 18.6, 17.8, 18.0, 16.2, 16.7, 17.1, 17.7, 16.6, 16.1, 13.3, 16.3, 14.8, 14.8, 12.5, 12.8, 13.6, 10.2, 14.0, 12.9, 11.4, 10.7, 10.3, 10.4, 8.7, 9.7, 10.4, 11.0, 13.4, 13.9, 12.9, 16.3, 16.2, 13.1, 14.1, 15.8, 15.3, 12.0, 11.9, 9.7, 9.1, 6.7, 8.8, 7.4, 5.4, 7.9, 7.3, 6.3, 7.6, 8.1, 7.3, 6.6, 9.0, 10.0, 7.4, 4.7, 9.6, 4.0, 3.3, 7.0, 9.7, 10.1, 5.4, 3.4, 3.7, 5.0, 2.3, 3.6, 6.9, 9.4, 12.1, 11.4, 10.1, 10.2, 9.7, 13.7, 7.3, 11.5, 9.4, 9.6, 9.0] I want to get the corresponding dates for each max temperature in the form: [9.7,1990-01-02 03:00:00],...,etc.
You can use this:
df['Date and time'] = pd.to_datetime(df['Date and time'])
df1 = df.set_index('Date and time').resample('D')['Dry bulb temperature'].agg(['max', 'min'])
It gives you this output for the visible data in your question:
               max  min
Date and time
1990-01-01     8.8  8.1
1990-12-31     4.2  2.0
If you really want the result as a list, you can use this afterwards:
df1.reset_index().to_numpy()
[array([Timestamp('1990-01-01 00:00:00'), 8.8, 8.1], dtype=object),
 array([Timestamp('1990-12-31 00:00:00'), 4.2, 2.0], dtype=object)]
To get the exact datetime of the max value per day, you can try this:
df2 = df.set_index('Date and time')
df2.loc[df2.groupby(df2.index.dayofyear).idxmax().iloc[:, 0]]
                     Dry_bulb_temperature
Date_and_time
1990-01-01 04:00:00                   8.8
1990-12-31 22:00:00                   4.2
You can try this one:
from datetime import timedelta

day = min(df['Date and time'])
max_day = max(df['Date and time'])
results = list()
while day <= max_day:
    # small part of the dataframe: the rows for one day
    temp = df[(df['Date and time'] >= day) & (df['Date and time'] < day + timedelta(1))]
    # row with the max temperature (idxmax returns an index label, so use .loc)
    row = df.loc[temp['Dry bulb temperature'].idxmax()]
    results.append([row['Dry bulb temperature'], row['Date and time']])
    day += timedelta(1)
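The question asks for pairs of the form [max temperature, timestamp]. The sketch below shows the underlying idea without pandas: group the readings by calendar day and keep the warmest reading of each day. The `readings` list is a hand-typed subset of the rows visible in the question, and the variable names are illustrative.

```python
from datetime import datetime

# (timestamp, dry bulb temperature) pairs copied from rows shown in the question.
readings = [
    (datetime(1990, 1, 1, 3), 8.5),
    (datetime(1990, 1, 1, 4), 8.8),
    (datetime(1990, 12, 31, 22), 4.2),
    (datetime(1990, 12, 31, 23), 2.0),
]

# Keep the warmest (temperature, timestamp) seen so far for each date.
daily_max = {}
for when, temperature in readings:
    best = daily_max.get(when.date())
    if best is None or temperature > best[0]:
        daily_max[when.date()] = (temperature, when)

# One [temperature, timestamp] pair per day, in order of first appearance.
result = [[temperature, when] for temperature, when in daily_max.values()]
print(result)
```

On ties this keeps the earliest reading of the day, since a later reading only replaces the stored one when it is strictly warmer.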
How to improve the speed of my selection process, python
Edit: Due to errors in my code, I have replaced it with my oldest, but working, version.
I get a list of speed recordings from a database, and I want to find the max speed in that list. That sounds easy enough, but I have some requirements for a max speed to count:
If the max speed is over a certain level, it has to have more than a certain number of records to be recognized as the maximum speed. The reason for this logic is that I want the max speed under normal conditions, not just an error or a one-time occurrence. I also have the constraint that a speed has to be over a certain limit to be counted, for the same reason.
Here is an example speed array:
v = [8.0, 1.3, 0.7, 0.8, 0.9, 1.1, 14.9, 14.0, 14.1, 14.2, 14.3, 13.8, 13.9, 13.7, 13.6, 13.5, 13.4, 15.7, 15.8, 15.0, 15.3, 15.4, 15.5, 15.6, 15.2, 12.8, 12.7, 12.6, 8.7, 8.8, 8.6, 9.0, 8.5, 8.4, 8.3, 0.1, 0.0, 16.4, 16.5, 16.7, 16.8, 17.0, 17.1, 17.8, 17.7, 17.6, 17.4, 17.5, 17.3, 17.9, 18.2, 18.3, 18.1, 18.0, 18.4, 18.5, 18.6, 19.0, 19.1, 18.9, 19.2, 19.3, 19.9, 20.1, 19.8, 20.0, 19.7, 19.6, 19.5, 20.2, 20.3, 18.7, 18.8, 17.2, 16.9, 11.5, 11.2, 11.3, 11.4, 7.1, 12.9, 14.4, 13.1, 13.2, 12.5, 12.1, 12.2, 13.0, 0.2, 3.6, 7.4, 4.6, 4.5, 4.3, 4.0, 9.4, 9.6, 9.7, 5.8, 5.7, 7.3, 2.1, 0.4, 0.3, 16.1, 11.9, 12.0, 11.7, 11.8, 10.0, 10.1, 9.8, 15.1, 14.7, 14.8, 10.2, 10.3, 1.2, 9.9, 1.9, 3.4, 14.6, 0.6, 5.1, 5.2, 7.5, 19.4, 10.7, 10.8, 10.9, 0.5, 16.3, 16.2, 16.0, 16.6, 12.4, 11.0, 1.7, 1.6, 2.4, 11.6, 3.9, 3.8, 14.5, 11.1]
This is my code to find what I define as the true maximum speed:
from collections import Counter
while max(speeds)>30:
    speeds.remove(max(speeds))
nwsp = []
for s in speeds:
    nwsp.append(np.floor(s))
count = Counter(nwsp)
while speeds and max(speeds)>14 and count[np.floor(max(speeds))]<10:
    speeds.remove(max(speeds))
while speeds and max(speeds)<5:
    speeds.remove(max(speeds))
if speeds:
    print(max(speeds))
    return max(speeds)
else:
    return False
Result with v as shown above: 19.9
The reason I build nwsp is that it doesn't matter to me if, for example, 19.6 is only found 9 times: if any other number within the same integer, for example 19.7, is found 3 times as well, then 19.6 is valid.
How can I rewrite/optimize this code so the selection process is quicker? I already removed the max(speeds) calls and instead sorted the list and referenced the largest element using speeds[-1].
Sorry for not adding any unit to my speeds.
Your code is slow simply because you call max and remove over and over again, and each of those calls costs time proportional to the length of the list. Any reasonable solution will be much faster.
All snippets below assume that speeds holds the example list v from your question.
If you know that False can't happen, then this suffices:
from collections import Counter
count = Counter(map(int, speeds))
print(max(s for s in speeds if 5 <= s <= 30 and (s <= 14 or count[int(s)] >= 10)))
If the False case can happen, this would be one way:
from collections import Counter
count = Counter(map(int, speeds))
valids = [s for s in speeds if 5 <= s <= 30 and (s <= 14 or count[int(s)] >= 10)]
print(max(valids) if valids else False)
Or sort and use next, which can take your False as default:
count = Counter(map(int, speeds))
print(next((s for s in reversed(sorted(speeds)) if 5 <= s <= 30 and (s <= 14 or count[int(s)] >= 10)), False))
Instead of Counter, you could also use groupby:
from itertools import groupby
groups = (list(group) for _, group in groupby(reversed(sorted(speeds)), int))
print(next((s[0] for s in groups if 5 <= s[0] <= 30 and (s[0] <= 14 or len(s) >= 10)), False))
Just in case all of these look odd to you, here's one close to your original. It just looks at the speeds from fastest to slowest and returns the first that matches the requirements:
def f(speeds):
    count = Counter(map(int, speeds))
    for speed in reversed(sorted(speeds)):
        if 5 <= speed <= 30 and (speed <= 14 or count[int(speed)] >= 10):
            return speed
    return False
Btw, your definition of "the true maximum speed" seems rather odd to me. How about just looking at a certain percentile? Maybe like this:
print(sorted(speeds)[len(speeds) * 9 // 10])
I'm not sure if this is faster, but it is shorter, and I think it achieves your requirements. It uses Counter.
from collections import Counter
import math

def valid(item):
    speed, count = item
    return speed <= 30 and (speed <= 13 or count >= 10)

speeds = [4,3,1,3,4,5,6,7,14,16,18,19,20,34,5,4,3,2,12,58,14,14,14]
speeds = map(math.floor, speeds)
counts = Counter(speeds)
max_valid_speed = max(filter(valid, counts.items()))
Result: max_valid_speed == (12,1)
Using your sort idea, we can start at the end of the list with the numbers less than 30, returning the first number that matches the criteria, or False if none does:
from collections import Counter

def f(speeds):
    # get speeds that satisfy the range
    rev = [speed for speed in speeds if 5 <= speed < 30]
    rev.sort(reverse=True)
    c = Counter((int(v) for v in rev))
    for speed in rev:
        # will hit highest numbers first
        # so return first that matches
        if speed > 14 and c[int(speed)] > 9 or speed < 15:
            return speed
    # we did not find any speed that matched our requirement
    return False
Output for your list v:
In [70]: f(v)
Out[70]: 19.9
Without sorting you could use a dict. Which approach is best depends on what your data looks like. This one works for all cases, including an empty list:
from collections import defaultdict

def f_dict(speeds):
    d = defaultdict(lambda: defaultdict(lambda: 0, {}))
    for speed in speeds:
        key = int(speed)
        d[key]["count"] += 1
        if speed > d[key]["speed"]:
            d[key]["speed"] = speed
    filt = max(filter(lambda x: (15 <= x[0] < 30 and x[1]["count"] > 9 or x[0] < 15), d.items()), default=False)
    return filt[1]["speed"] if filt else False
Output:
In [95]: f_dict(v)
Out[95]: 19.9
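The answers above share one idea: count the readings per whole-number bucket once, then filter in a single pass instead of calling max and remove repeatedly. The function below is a minimal Python 3 sketch of that idea with the thresholds as parameters. The name `true_max_speed` and the parameter names are hypothetical; the defaults 5, 30, 14 and 10 are taken from the snippets above, not from any stated specification.

```python
from collections import Counter


def true_max_speed(speeds, low=5, high=30, threshold=14, min_count=10):
    """Return the largest plausible speed, or False if none qualifies."""
    # Number of readings per whole-number bucket, e.g. 19.6 and 19.7 -> 19.
    count = Counter(int(s) for s in speeds)
    valid = [
        s for s in speeds
        if low <= s <= high and (s <= threshold or count[int(s)] >= min_count)
    ]
    return max(valid) if valid else False


# Small made-up example: the two 16.x readings are too rare to count.
sample = [3.0, 12.5, 16.2, 16.4]
print(true_max_speed(sample))               # 16.x rejected, falls back to 12.5
print(true_max_speed(sample, min_count=2))  # 16.x now accepted
```

Counting is one pass over the list and filtering is another, so the cost grows linearly with the number of readings.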
Chop a list into a list of lists [duplicate]
This question already has answers here: How do I split a list into equally-sized chunks?
list_of_numbers = [10.0, 12.0, 14.0, 16.0, 15.0, 13.0, 10.0, 9.2, 11.7, 14.8, 16.5, 14.8, 13.8, 10.2, 9.5, 13.0, 14.4, 17.2, 15.4, 12.5, 12.1, 10.0, 12.4, 11.9, 16.8, 15.6, 14.6, 10.4, 10.4, 11.0, 12.2, 18.8, 13.9, 12.0, 6.8, 11.2, 9.4, 12.6, 15.5, 14.0, 11.2, 12.3, 14.3, 11.7, 13.9, 13.4, 21.4, 13.7, 12.6]
Out of this list I want to create a list of lists with 7 elements each. How do I do that? There are 49 elements in the list, so I want to create a list of 7 lists. The order should remain the same as in list_of_numbers.
You can use a simple combination of slicing and list comprehension:
result = [list_of_numbers[i:i+7] for i in range(0, len(list_of_numbers), 7)]
You can use zip with iter as follows (in Python 3, zip returns an iterator, so wrap it in list() to see the result):
list(zip(*[iter(list_of_numbers)]*7))
Output:
[(10.0, 12.0, 14.0, 16.0, 15.0, 13.0, 10.0),
 (9.2, 11.7, 14.8, 16.5, 14.8, 13.8, 10.2),
 (9.5, 13.0, 14.4, 17.2, 15.4, 12.5, 12.1),
 (10.0, 12.4, 11.9, 16.8, 15.6, 14.6, 10.4),
 (10.4, 11.0, 12.2, 18.8, 13.9, 12.0, 6.8),
 (11.2, 9.4, 12.6, 15.5, 14.0, 11.2, 12.3),
 (14.3, 11.7, 13.9, 13.4, 21.4, 13.7, 12.6)]
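The two approaches agree when the list length is a multiple of the chunk size, as it is for the 49 numbers here, but they differ when it is not. The sketch below uses a made-up list of 10 numbers to show the difference; `chunks`, `numbers`, `by_slicing` and `by_zip` are illustrative names.

```python
def chunks(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


numbers = list(range(1, 11))  # hypothetical data: 1..10

# Slicing keeps the short final chunk.
by_slicing = chunks(numbers, 3)

# zip() stops as soon as one of its inputs runs out, so the leftover
# element (10) is silently dropped; it also yields tuples, not lists.
by_zip = list(zip(*[iter(numbers)] * 3))

print(by_slicing)
print(by_zip)
```

If leftover elements matter, the slicing version is the safer choice of the two.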