When I put the following data in a histogram, the result is not plotted correctly. For example, there is a weight equal to 5 (near the end), but it is not plotted.
How can I solve this?
import matplotlib.pyplot as plt
a=[]
for i in range(1000):
    a.append(i*0.001)
b=[2.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 5.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 2.0, 0.0, 1.0, 0.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 1.0, 0.0, 1.0, 1.0, 2.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 3.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 2.0, 2.0, 0.0, 0.0, 0.0, 1.0, 3.0, 0.0, 0.0, 1.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 2.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 
0.0, 1.0, 2.0, 2.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 2.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 2.0, 1.0, 0.0, 2.0, 0.0, 2.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 2.0, 3.0, 2.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 3.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 0.0, 1.0, 1.0, 2.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 2.0, 3.0, 1.0, 0.0, 
1.0, 0.0, 0.0, 1.0, 0.0, 3.0, 0.0, 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 2.0, 1.0, 2.0, 0.0, 0.0, 2.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 3.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 3.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 5.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
plt.hist(a,1000, weights=b)
plt.show()
See the documentation of the matplotlib.pyplot.hist function.
The first parameter, in your case a, specifies the values you plot.
The weights parameter says how much every value from a contributes to the accumulated weight.
Not sure why you are using a and b the way you do, but if you run this:
plt.hist(b,1000)
You will see the value 5.
You can also use the third parameter, the range of the histogram.
Use (0, 5) to show values in this range no matter what you have in a:
plt.hist(a, 1000, (0,5), weights=b)
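One way to check whether the weight of 5 is lost in the binning or only in the drawing is to compute the histogram without plotting it. The sketch below uses numpy.histogram with the same bins and weights arguments; the arrays are a small hypothetical stand-in for the question's a and b, not the original data.

```python
import numpy as np

# Hypothetical stand-in for the question's data: 1000 positions, mostly zero weights.
positions = np.arange(1000) * 0.001
weights = np.zeros(1000)
weights[13] = 5.0    # a weight of 5 near the start
weights[990] = 5.0   # a weight of 5 near the end

# Same binning arguments as plt.hist(a, 1000, weights=b), but no drawing.
counts, edges = np.histogram(positions, bins=1000, weights=weights)

# Indices of the bins that hold the largest accumulated weight.
heaviest_bins = np.flatnonzero(counts == counts.max())
print(counts.max(), heaviest_bins)
```

If the largest count is 5 here, the value is present in the histogram, and the missing bar may simply be too narrow to render when 1000 bars share a few hundred pixels.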
I'm trying to run 3 optimizations in a for loop and store the results in one dataframe.
After each optimization (each iteration of the for loop), I append lists of results, so I am able to get all the results in one list. However, when I try to convert the list to a dataframe, I get one row for each optimization and multiple values in each cell, corresponding to the variable name and the optimization number, like this:
Date = []
results = []
for idx, df in enumerate([df0,df1,df2]):
    model = ConcreteModel()
    model.T = Set(initialize=df.hour.tolist(), ordered=True)
    ...
    # Solve model
    solver = SolverFactory('glpk')
    solver.solve(model)
    Date = list(df['Date'])
    results.append([Date, model.Ein.get_values().values(), model.Eout.get_values().values(),
                    model.Z.get_values().values(), model.NES.get_values().values(),
                    model.L.get_values().values()])
df_results = pd.DataFrame(results)
df_results.rename(columns = {0: 'Date', 1: 'Ein', 2:'Eout', 3:'Z', 4:'NES', 5:'L'}, inplace = True)
df_results
## The output of the df is:
Date Ein
0 [2019-01-01, 2019-01-01, 2019-01-01, 2019-01-0... (0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, ... (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... (0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, ... (0.0, 0.0, -100.0, -100.0, -100.0, 0.0, -100.0... (16231.0, 16051.0, 15806.0, 15581.0, 15610.0, ...
1 [2019-01-16, 2019-01-16, 2019-01-16, 2019-01-1... (0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, ... (0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, ... (0.0, 1.0, 1.0, 1.0, 1.0, 0.5, 1.5, 2.5, 3.5, ... (0.0, -100.0, 0.0, 0.0, 0.0, 50.0, -100.0, -10... (17643.0, 18654.0, 20462.0, 20448.0, 20305.0, ...
2 [2019-01-31, 2019-01-31, 2019-01-31, 2019-01-3... (0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, ... (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, ... (0.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0, 3.0, 3.0, ... (0.0, 0.0, -100.0, 0.0, -100.0, -100.0, -100.0... (22155.0, 22184.0, 21510.0, 21193.0, 20884.0, ...
#The output of the list named results is:
[[['2019-01-01',
'2019-01-01',
'2019-01-01',
...
'2019-01-15',
'2019-01-15',
'2019-01-15',
'2019-01-15',
'2019-01-15',
'2019-01-15',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16'],
dict_values([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
...
-1.11022302462516e-16, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.11022302462516e-16, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.166666666666667, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.333333333333333, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.666666666666667, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]),
dict_values([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25,
...
0.333333333333333, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.166666666666667, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.833333333333333, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.166666666666667, 0.0, 0.0, 0.0, 0.666666666666667, 0.333333333333333, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.0]),
dict_values([0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 4.0, 3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 4.5, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 3.0, 3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0, 4.0, 3.0, 3.0, 3.0,
...
0.142857142857143, 0.142857142857143, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0, 4.0, 3.0, 3.0, 3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.33333333333333, 1.33333333333333, 0.666666666666667, 0.666666666666667, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, 0.00390625, 0.001953125, 0.0009765625, 0.00048828125, 0.00048828125]),
[['2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
'2019-01-16',
...
Is it because each result in the for loop has a different dictionary? How could I get my results in this form:
Date Ein Eout Z NES L
0 2019-01-01 1.0 0.0 1.0 -100.0 16231.0
1 2019-01-01 1.0 1.0 0.0 100.0 16051.0,
...
Each iteration appends one list of whole sequences to results, which creates a list of lists of the wrong dimension: one row per optimization instead of one row per time step. I hope this solution works for you:
df_results = pd.DataFrame(zip(Date, model.Ein.get_values().values(), model.Eout.get_values().values(),
                              model.Z.get_values().values(), model.NES.get_values().values(),
                              model.L.get_values().values()))
Let me know if it doesn't.
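To keep the rows from all three optimizations rather than only the last model solved, one option is to build a small DataFrame per iteration and concatenate them at the end. The sketch below assumes that approach and uses hypothetical dictionaries (runs) in place of the Pyomo models; in the question's loop the column values would come from calls such as model.Ein.get_values().values().

```python
import pandas as pd

# Hypothetical stand-ins for the per-optimization outputs (not real solver results).
runs = [
    {"Date": ["2019-01-01", "2019-01-01"],
     "Ein": {1: 0.0, 2: 1.0}, "Eout": {1: 0.0, 2: 0.0}},
    {"Date": ["2019-01-16", "2019-01-16"],
     "Ein": {1: 1.0, 2: 0.0}, "Eout": {1: 0.0, 2: 0.5}},
]

frames = []
for run in runs:
    # One row per time step: each column is a flat list of equal length.
    frames.append(pd.DataFrame({
        "Date": run["Date"],
        "Ein": list(run["Ein"].values()),
        "Eout": list(run["Eout"].values()),
    }))

# Stack the per-run frames vertically and renumber the rows.
df_results = pd.concat(frames, ignore_index=True)
print(df_results)
```

Naming the columns in the dictionary also removes the need for the separate rename step.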
I have to run long-duration soak tests and capture 3 datasets (before the run, during the run, and after the run), plot them, and analyze the plots manually.
All the datasets span a very large range (0 to 10^5). So, when I plot this data using matplotlib's bar function, the bars for the smaller values are too small to be analyzed.
import matplotlib
matplotlib.use('Agg')
import sys,os,argparse,json,string,numpy
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
bx = ('smmpg_b1024k', 'smmpg_b10k', 'smmpg_b11k', 'smmpg_b128', 'smmpg_b128k', 'smmpg_b12k', 'smmpg_b13k', 'smmpg_b14k', 'smmpg_b15k', 'smmpg_b160', 'smmpg_b16k', 'smmpg_b17k', 'smmpg_b18k', 'smmpg_b192', 'smmpg_b192k', 'smmpg_b19k', 'smmpg_b1k', 'smmpg_b20k', 'smmpg_b21k', 'smmpg_b224', 'smmpg_b22k', 'smmpg_b23k', 'smmpg_b24k', 'smmpg_b256', 'smmpg_b256k', 'smmpg_b25k', 'smmpg_b26k', 'smmpg_b27k', 'smmpg_b288', 'smmpg_b28k', 'smmpg_b29k', 'smmpg_b2k', 'smmpg_b30k', 'smmpg_b31k', 'smmpg_b32', 'smmpg_b320', 'smmpg_b320k', 'smmpg_b32k', 'smmpg_b33k', 'smmpg_b34k', 'smmpg_b352', 'smmpg_b35k', 'smmpg_b36k', 'smmpg_b37k', 'smmpg_b384', 'smmpg_b384k', 'smmpg_b38k', 'smmpg_b39k', 'smmpg_b3k', 'smmpg_b40k', 'smmpg_b416', 'smmpg_b41k', 'smmpg_b42k', 'smmpg_b43k', 'smmpg_b448', 'smmpg_b448k', 'smmpg_b44k', 'smmpg_b45k', 'smmpg_b46k', 'smmpg_b47k', 'smmpg_b480', 'smmpg_b48k', 'smmpg_b49k', 'smmpg_b4k', 'smmpg_b50k', 'smmpg_b512', 'smmpg_b512k', 'smmpg_b51k', 'smmpg_b52k', 'smmpg_b53k', 'smmpg_b544', 'smmpg_b54k', 'smmpg_b55k', 'smmpg_b56k', 'smmpg_b576', 'smmpg_b576k', 'smmpg_b57k', 'smmpg_b58k', 'smmpg_b59k', 'smmpg_b5k', 'smmpg_b608', 'smmpg_b60k', 'smmpg_b61k', 'smmpg_b62k', 'smmpg_b63k', 'smmpg_b64', 'smmpg_b640', 'smmpg_b640k', 'smmpg_b64k', 'smmpg_b672', 'smmpg_b6k', 'smmpg_b704', 'smmpg_b704k', 'smmpg_b736', 'smmpg_b768', 'smmpg_b768k', 'smmpg_b7k', 'smmpg_b800', 'smmpg_b832', 'smmpg_b832k', 'smmpg_b864', 'smmpg_b896', 'smmpg_b896k', 'smmpg_b8k', 'smmpg_b928', 'smmpg_b96', 'smmpg_b960', 'smmpg_b960k', 'smmpg_b992', 'smmpg_b9k', 'smmpg_ccb', 'smmpg_msb', 'smmpg_twomb', 'total-pages', 'total-size')
before = (0.0, 2.0, 2.0, 4.0, 8.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.0, 4.0, 44.0, 76.0, 6.0, 2.0, 2.0, 2.0, 18.0, 2.0, 18.0, 30.0, 32.0, 2.0, 12.0, 2.0, 170.0, 0.0, 4.0, 2.0, 0.0, 24.0, 0.0, 2.0, 10.0, 2.0, 12.0, 2.0, 36.0, 0.0, 2.0, 0.0, 0.0, 0.0, 12.0, 22.0, 2.0, 0.0, 272.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 4.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 8.0, 2.0, 0.0, 2.0, 2.0, 6.0, 0.0, 0.0, 0.0, 34.0, 2.0, 0.0, 2.0, 0.0, 2.0, 92.0, 2.0, 0.0, 2.0, 2.0, 40.0, 2.0, 0.0, 2.0, 2.0, 0.0, 14.0, 2.0, 4.0, 2.0, 2.0, 2.0, 0.0, 18.0, 2.0, 28.0, 4.0, 0.0, 2.0, 2.0, 6.0, 214.0, 26226.0, 13813.0, 27626.0)
intermediate = (0.0, 2.0, 2.0, 4.0, 8.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.0, 4.0, 44.0, 76.0, 6.0, 2.0, 2.0, 2.0, 18.0, 2.0, 18.0, 30.0, 32.0, 2.0, 12.0, 2.0, 170.0, 0.0, 4.0, 2.0, 0.0, 24.0, 0.0, 2.0, 10.0, 2.0, 12.0, 2.0, 36.0, 0.0, 2.0, 0.0, 0.0, 0.0, 12.0, 22.0, 2.0, 0.0, 272.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 4.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 8.0, 2.0, 0.0, 2.0, 2.0, 6.0, 0.0, 0.0, 0.0, 34.0, 2.0, 0.0, 2.0, 0.0, 2.0, 92.0, 2.0, 0.0, 2.0, 2.0, 40.0, 2.0, 0.0, 2.0, 2.0, 0.0, 14.0, 2.0, 4.0, 2.0, 2.0, 2.0, 0.0, 18.0, 2.0, 28.0, 4.0, 0.0, 2.0, 2.0, 6.0, 214.0, 26226.0, 13813.0, 27626.0)
after = (0.0, 2.0, 2.0, 4.0, 8.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.0, 4.0, 44.0, 76.0, 6.0, 2.0, 2.0, 2.0, 18.0, 2.0, 18.0, 30.0, 32.0, 2.0, 12.0, 2.0, 170.0, 0.0, 4.0, 2.0, 0.0, 24.0, 0.0, 2.0, 10.0, 2.0, 12.0, 2.0, 36.0, 0.0, 2.0, 0.0, 0.0, 0.0, 12.0, 22.0, 2.0, 0.0, 272.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 4.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 8.0, 2.0, 0.0, 2.0, 2.0, 6.0, 0.0, 0.0, 0.0, 34.0, 2.0, 0.0, 2.0, 0.0, 2.0, 92.0, 2.0, 0.0, 2.0, 2.0, 40.0, 2.0, 0.0, 2.0, 2.0, 0.0, 14.0, 2.0, 4.0, 2.0, 2.0, 2.0, 0.0, 18.0, 2.0, 28.0, 4.0, 0.0, 2.0, 2.0, 6.0, 214.0, 26226.0, 13813.0, 27626.0)
x_locations= numpy.arange(len(bx))
width=0.27
fig = plt.figure(figsize=(50, 20))
ax = fig.add_subplot(111)
before_test_mempools_bar = ax.bar(x_locations, list(before), width, color='r')
intermediate_test_mempools_bar = ax.bar(x_locations + width, list(intermediate), width, color='g')
after_test_mempools_bar = ax.bar(x_locations + width *2,list(after), width, color='b')
ax.set_ylabel('Memory')
ax.set_xticks(x_locations + width)
ax.set_xticklabels(bx,rotation=90)
ax.legend((before_test_mempools_bar[0],intermediate_test_mempools_bar[0],after_test_mempools_bar[0]),('BEFORE','INTERMEDIATE','AFTER'))
fig.savefig("plot.png")
plt.close()
The above code produces the following plot:
Goal:
My goal is to fit all the data into a plot that is visually clear, so that any tester on the team can analyze it.
Currently, it's hard to see what has happened to the smaller values.
One possible approach would be normalization, but I'm not sure whether the original values would be preserved.
Any possible solutions are appreciated.
Transcribing @Alexander Reynold's comment into an answer:
Use a logarithmic y-axis, i.e. instead of plot() use semilogy() – You can change the base depending on what the dynamic range you need to display is.
I didn't know that the bar function already has a parameter to change the scale of the y-axis.
After adding the log=True argument to all the bar calls as below,
before_test_mempools_bar = ax.bar(x_locations, list(before_test_mempools), width, color='r',log=True)
intermediate_test_mempools_bar = ax.bar(x_locations + width, list(intermediate_test_mempools), width, color='g',log=True)
after_test_mempools_bar = ax.bar(x_locations + width *2,list(after_test_mempools), width, color='b',log=True)
my plot looks much nicer now and is easy to analyze.
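For reference, here is a minimal self-contained sketch of the log=True approach. The labels and values are hypothetical and much shorter than the real dataset, and the figure is written to an in-memory buffer only so the example runs anywhere; swap the buffer for a filename such as "plot.png" in practice.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, shortened data spanning several orders of magnitude.
labels = ["pool_a", "pool_b", "pool_c", "total"]
before = np.array([2.0, 44.0, 272.0, 27626.0])
after = np.array([4.0, 40.0, 300.0, 27700.0])

x = np.arange(len(labels))
width = 0.4

fig, ax = plt.subplots()
# log=True switches the y-axis of the bar plot to a logarithmic scale.
bars_before = ax.bar(x, before, width, color="r", log=True, label="BEFORE")
bars_after = ax.bar(x + width, after, width, color="b", log=True, label="AFTER")
ax.set_ylabel("Memory")
ax.set_xticks(x + width / 2)
ax.set_xticklabels(labels, rotation=90)
ax.legend()

buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```

Note that a value of 0 has no position on a logarithmic axis, so categories that are exactly 0 will show no bar.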
If I may, I think your problem is not technical; rather, you haven't thought enough about what you want to show and what you want people to look at. The graphic you're showing seems to have a lot of "noise", i.e. areas of the graphic that give little or no information.
So, even if you only provided simulated data, there seems to be room for improvement to make a much more readable and "to the point" visualization.
For example you could:
Remove uninteresting information (maybe the categories at 0.0, or those that haven't changed?).
Regroup some categories (what about creating new aggregated categories, or showing the data in a totally different way, with values on the x axis and category names on the y axis?).
Also, maybe you're putting together different kinds of things (shouldn't the last 3 bx categories, 'smmpg_twomb', 'total-pages' and 'total-size', be put in a graph of their own?).
Use a data structure like pandas' DataFrame to better handle and clean your data in order to do all three of the previous suggestions.
It's just a few suggestions but maybe it will help.
Here is an example of what you could do, just to illustrate:
import matplotlib
matplotlib.use('Agg')
import sys,os,argparse,json,string,numpy
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd  # needed for pd.DataFrame below
bx = ('smmpg_b1024k', 'smmpg_b10k', 'smmpg_b11k', 'smmpg_b128', 'smmpg_b128k', 'smmpg_b12k', 'smmpg_b13k',
'smmpg_b14k', 'smmpg_b15k', 'smmpg_b160', 'smmpg_b16k', 'smmpg_b17k', 'smmpg_b18k', 'smmpg_b192',
'smmpg_b192k', 'smmpg_b19k', 'smmpg_b1k', 'smmpg_b20k', 'smmpg_b21k', 'smmpg_b224', 'smmpg_b22k',
'smmpg_b23k', 'smmpg_b24k', 'smmpg_b256', 'smmpg_b256k', 'smmpg_b25k', 'smmpg_b26k', 'smmpg_b27k',
'smmpg_b288', 'smmpg_b28k', 'smmpg_b29k', 'smmpg_b2k', 'smmpg_b30k', 'smmpg_b31k', 'smmpg_b32',
'smmpg_b320', 'smmpg_b320k', 'smmpg_b32k', 'smmpg_b33k', 'smmpg_b34k', 'smmpg_b352', 'smmpg_b35k',
'smmpg_b36k', 'smmpg_b37k', 'smmpg_b384', 'smmpg_b384k', 'smmpg_b38k', 'smmpg_b39k', 'smmpg_b3k',
'smmpg_b40k', 'smmpg_b416', 'smmpg_b41k', 'smmpg_b42k', 'smmpg_b43k', 'smmpg_b448', 'smmpg_b448k',
'smmpg_b44k', 'smmpg_b45k', 'smmpg_b46k', 'smmpg_b47k', 'smmpg_b480', 'smmpg_b48k', 'smmpg_b49k',
'smmpg_b4k', 'smmpg_b50k', 'smmpg_b512', 'smmpg_b512k', 'smmpg_b51k', 'smmpg_b52k', 'smmpg_b53k',
'smmpg_b544', 'smmpg_b54k', 'smmpg_b55k', 'smmpg_b56k', 'smmpg_b576', 'smmpg_b576k', 'smmpg_b57k',
'smmpg_b58k', 'smmpg_b59k', 'smmpg_b5k', 'smmpg_b608', 'smmpg_b60k', 'smmpg_b61k', 'smmpg_b62k',
'smmpg_b63k', 'smmpg_b64', 'smmpg_b640', 'smmpg_b640k', 'smmpg_b64k', 'smmpg_b672', 'smmpg_b6k',
'smmpg_b704', 'smmpg_b704k', 'smmpg_b736', 'smmpg_b768', 'smmpg_b768k', 'smmpg_b7k', 'smmpg_b800',
'smmpg_b832', 'smmpg_b832k', 'smmpg_b864', 'smmpg_b896', 'smmpg_b896k', 'smmpg_b8k', 'smmpg_b928',
'smmpg_b96', 'smmpg_b960', 'smmpg_b960k', 'smmpg_b992', 'smmpg_b9k', 'smmpg_ccb', 'smmpg_msb',
'smmpg_twomb', 'total-pages', 'total-size')
before = (0.0, 2.0, 2.0, 4.0, 8.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.0, 4.0, 44.0, 76.0, 6.0, 2.0, 2.0, 2.0, 18.0, 2.0, 18.0, 30.0, 32.0, 2.0, 12.0, 2.0, 170.0, 0.0, 4.0, 2.0, 0.0, 24.0, 0.0, 2.0, 10.0, 2.0, 12.0, 2.0, 36.0, 0.0, 2.0, 0.0, 0.0, 0.0, 12.0, 22.0, 2.0, 0.0, 272.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 4.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 8.0, 2.0, 0.0, 2.0, 2.0, 6.0, 0.0, 0.0, 0.0, 34.0, 2.0, 0.0, 2.0, 0.0, 2.0, 92.0, 2.0, 0.0, 2.0, 2.0, 40.0, 2.0, 0.0, 2.0, 2.0, 0.0, 14.0, 2.0, 4.0, 2.0, 2.0, 2.0, 0.0, 18.0, 2.0, 28.0, 4.0, 0.0, 2.0, 2.0, 6.0, 214.0, 26226.0, 13813.0, 27626.0)
intermediate = (0.0, 2.0, 2.0, 4.0, 8.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.0, 4.0, 44.0, 76.0, 6.0, 2.0, 2.0, 2.0, 18.0, 2.0, 18.0, 30.0, 32.0, 2.0, 12.0, 2.0, 170.0, 0.0, 4.0, 2.0, 0.0, 24.0, 0.0, 2.0, 10.0, 2.0, 12.0, 2.0, 36.0, 0.0, 2.0, 0.0, 0.0, 0.0, 12.0, 22.0, 2.0, 0.0, 272.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 4.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 8.0, 2.0, 0.0, 2.0, 2.0, 6.0, 0.0, 0.0, 0.0, 34.0, 2.0, 0.0, 2.0, 0.0, 2.0, 92.0, 2.0, 0.0, 2.0, 2.0, 40.0, 2.0, 0.0, 2.0, 2.0, 0.0, 14.0, 2.0, 4.0, 2.0, 2.0, 2.0, 0.0, 18.0, 2.0, 28.0, 4.0, 0.0, 2.0, 2.0, 6.0, 214.0, 26226.0, 13813.0, 27626.0)
after = (0.0, 2.0, 2.0, 4.0, 8.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.0, 4.0, 44.0, 76.0, 6.0, 2.0, 2.0, 2.0, 18.0, 2.0, 18.0, 30.0, 32.0, 2.0, 12.0, 2.0, 170.0, 0.0, 4.0, 2.0, 0.0, 24.0, 0.0, 2.0, 10.0, 2.0, 12.0, 2.0, 36.0, 0.0, 2.0, 0.0, 0.0, 0.0, 12.0, 22.0, 2.0, 0.0, 272.0, 2.0, 4.0, 2.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 4.0, 0.0, 2.0, 2.0, 2.0, 0.0, 0.0, 8.0, 2.0, 0.0, 2.0, 2.0, 6.0, 0.0, 0.0, 0.0, 34.0, 2.0, 0.0, 2.0, 0.0, 2.0, 92.0, 2.0, 0.0, 2.0, 2.0, 40.0, 2.0, 0.0, 2.0, 2.0, 0.0, 14.0, 2.0, 4.0, 2.0, 2.0, 2.0, 0.0, 18.0, 2.0, 28.0, 4.0, 0.0, 2.0, 2.0, 6.0, 214.0, 26226.0, 13813.0, 27626.0)
# Put your data in a DataFrame:
df = pd.DataFrame({'before': before,
'intermediate': intermediate,
'after': after, 'bx': bx,
'x_locations': numpy.arange(len(bx))
})
# Filter out the aggregate categories - you can put them in another graph!
df_filt_cat = df.loc[(df.bx != 'smmpg_twomb') & (df.bx != 'total-pages') & (df.bx != 'total-size')]
# Drop categories that stay at 0 all the way (keep a row if any of the three values is non-zero)
df_filt_zero = df_filt_cat.loc[(df_filt_cat.before != 0) | (df_filt_cat.intermediate != 0) | (df_filt_cat.after != 0)]
x_locations= numpy.arange(len(bx))
width=0.27
fig = plt.figure(figsize=(50, 20))
ax = fig.add_subplot(111)
before_test_mempools_bar = ax.bar(df_filt_zero.x_locations, df_filt_zero.before, width, color='r')
intermediate_test_mempools_bar = ax.bar(df_filt_zero.x_locations + width, df_filt_zero.intermediate, width, color='g')
after_test_mempools_bar = ax.bar(df_filt_zero.x_locations + width *2, df_filt_zero.after, width, color='b')
ax.set_ylabel('Memory')
ax.set_xticks(x_locations + width)
ax.set_xticklabels(bx,rotation=90)
ax.legend((before_test_mempools_bar[0],intermediate_test_mempools_bar[0],after_test_mempools_bar[0]),('BEFORE','INTERMEDIATE','AFTER'))
# just to show the result I commented this line
#fig.savefig("plot.png")
# and put this one instead:
plt.show()
It obviously still needs improvement but it's already a bit more readable.
While attempting to combine dense and sparse data with scipy.sparse.hstack, I'm occasionally running into the error:
Traceback (most recent call last):
  File "hstack_error.py", line 3, in <module>
    X = scipy.sparse.hstack(hstack_parts)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/construct.py", line 263, in hstack
    return bmat([blocks], format=format, dtype=dtype)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/construct.py", line 329, in bmat
    raise ValueError('blocks must have rank 2')
ValueError: blocks must have rank 2
Minimal code to reproduce this is:
import scipy.sparse
hstack_parts = [[[0.17968359700312667, -0.23497267759562843, 5.5625, 12.0, 12.0, -0.3514978725245902, 4.562932312249999, 7.578125000000001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.43775723232977204, -0.04553734061930783, 4.486910994764398, 12.0, 12.0, -0.33614476914571956, 2.8162986569528794, 4.74869109947644, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.403883732290472, -0.04826958105646641, 1.7142857142857142, 12.0, 12.0, -0.32207319092531883, 0.933412042503896, 1.851948051948052, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.29203081876806636, -0.11020036429872503, 1.5376623376623375, 12.0, 12.0, -0.31131701908652093, 0.964088085825974, 1.851948051948052, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.30639528566925406, -0.08743169398907111, 1.505, 12.0, 12.0, -0.3014608089744991, 0.917490079365, 1.745, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [1.138331763811077, 0.0, 3.2350000000000003, 12.0, 12.0, -0.5323457206576151, 0.9805158730150001, 3.2350000000000003, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [1.0770851496658955, -0.002941176470588277, 3.2375, 12.0, 12.0, -0.5199720995117647, 1.0401185770749999, 3.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [1.0152399481191077, -0.002941176470588277, 3.1140776699029122, 12.0, 12.0, -0.5052406417111764, 1.0414827890558251, 3.126213592233009, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.961141824125552, -0.0029359953024075576, 2.643776824034335, 12.0, 12.0, -0.4915900561438638, 0.8579874128476395, 2.6545064377682404, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, 1.0], [0.9079651211907968, -0.004110393423370539, 1.726688102893891, 12.0, 12.0, -0.4780357379095714, 0.4291079394533763, 1.7379421221864957, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, 1.0], [0.8545562907561834, -0.010569583088667041, 1.6746031746031749, 12.0, 12.0, -0.46648671607163833, 0.4421795595714286, 
1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, 1.0], [0.824431155068869, -0.005871990604815115, 1.687301587301587, 12.0, 12.0, -0.4551024813223723, 0.4729531338222223, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, 1.0], [0.7862017310261765, -0.007633587786259692, 1.6825396825396823, 12.0, 12.0, -0.44442646372108047, 0.5018122734650794, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, 1.0], [0.7565618927494311, -0.007633587786259692, 1.6825396825396823, 12.0, 12.0, -0.43505183830416916, 0.5271535228063493, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.7120208806607795, -0.013505578391074599, 1.6666666666666667, 12.0, 12.0, -0.4237836507997651, 0.5576134010920637, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.6783481869678059, -0.013505578391074599, 1.6666666666666667, 12.0, 12.0, -0.4122230242395773, 0.5888637932063492, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.6499276254106391, -0.010569583088667041, 1.6746031746031749, 12.0, 12.0, -0.4003188978273635, 0.6210427253968255, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0], [0.6213577120617446, -0.008807985907222673, 1.6793650793650792, 12.0, 12.0, -0.38866543347034654, 0.6525440742857141, 1.7031746031746033, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, 1.0, 1.0], [0.6018164150167221, -0.005284791544333521, 1.602150537634409, 12.0, 12.0, -0.3790079817322373, 0.624499857311828, 1.6159754224270355, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, 1.0, 1.0], [0.569013826241389, -0.007046388725778097, 1.5621212121212122, 12.0, 12.0, -0.3671479532765708, 0.6329500538939395, 1.5803030303030305, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, 1.0, 1.0], [0.5431497867155388, -0.005871990604815115, 1.5651515151515152, 12.0, 12.0, -0.3557799651379918, 0.6622829081363637, 1.5803030303030305, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 
0.0, 5.0, -1.0, 1.0, 1.0], [0.5210546429944948, -0.002348796241926171, 1.5170370370370367, 12.0, 12.0, -0.3441056122783324, 0.654797247837037, 1.522962962962963, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, 1.0, 1.0], [0.4957918898245967, -0.0017615971814445763, 1.4045261669024045, 12.0, 12.0, -0.33263550256605995, 0.607527212347949, 1.4087694483734088, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, -1.0, -1.0, -1.0]]]
scipy.sparse.hstack(hstack_parts)
What does this error mean, and how do I fix my data so it no longer occurs?
The parts you are trying to join are not sparse matrix objects but ordinary dense matrix objects. You can construct sparse matrices out of the contents like so:
x_sparse = scipy.sparse.coo_matrix(hstack_parts[0])
y_sparse = scipy.sparse.coo_matrix(hstack_parts[1])
z_sparse = scipy.sparse.hstack([x_sparse, y_sparse])
To reclaim a dense representation, you can use:
z = z_sparse.todense()
Here's documentation on sparse.coo_matrix to help you determine if it's appropriate for your problem:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html#scipy.sparse.coo_matrix
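The conversion step can be sketched on a small example that mixes one dense block and one block that is already sparse. The block contents below are hypothetical; also note that the minimal example in the question appears to hold a single block, so there would be no hstack_parts[1] to convert there.

```python
import numpy as np
import scipy.sparse

# Hypothetical blocks: a dense nested list and an existing sparse matrix,
# both with the same number of rows (required for horizontal stacking).
dense_part = [[1.0, 0.0, 2.0],
              [0.0, 0.0, 3.0]]
sparse_part = scipy.sparse.csr_matrix(np.eye(2))

# Convert every block to a sparse matrix explicitly before stacking.
blocks = [scipy.sparse.coo_matrix(dense_part), sparse_part]
stacked = scipy.sparse.hstack(blocks)

# Back to a dense NumPy array for inspection.
dense_result = stacked.toarray()
print(stacked.shape)
print(dense_result)
```

Converting each part up front means the call does not depend on how a particular SciPy version treats plain lists.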
I'm trying to use scipy.optimize.curve_fit to estimate the frequency and phase of an on/off sequence.
This is the code I'm using:
from numpy import *
from scipy import optimize
row = array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0,])
def fit_func(x, a, b, c, d):
    return c * sin(a * x + b) + d
p0 = [(pi/10.0), 5.0, row.std(), row.mean()]
result = optimize.curve_fit(fit_func, arange(len(row)), row, p0)
print(result)
This works. But on some rows, even though they seem perfectly ok, it fails.
Example of failing row:
row = array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,])
The error is:
RuntimeError: Optimal parameters not found: Both actual and predicted relative reductions in the sum of squares are at most 0.000000 and the relative error between two consecutive iterates is at most 0.000000
Which tells me very little about what's happened.
A quick test shows that varying the parameters in p0 will cause that row to succeed... and others to fail. Why is that?
I tried both rows of data that you provided, and both worked fine for me. I'm using SciPy 0.8.0rc3. What version are you using?
Another thing that might help is to set c and d to fixed values, since they really should be the same every time. I set c to 0.6311786 and d to 0.5.
You could also use an FFT with zero padding and quadratic fitting around the peak to find the frequency if you want another method. Really, any pitch estimation method is applicable, since you are looking for the fundamental frequency.
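The FFT suggestion could look roughly like the sketch below. The function name estimate_frequency, the pad_factor parameter, and the synthetic square wave are all assumptions made for illustration; this is one simple way to do zero padding plus quadratic (parabolic) peak interpolation, not the only one.

```python
import numpy as np

def estimate_frequency(signal, pad_factor=8):
    """Estimate the dominant frequency in cycles per sample (hypothetical helper)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the DC offset (the 'd' term)
    n_fft = pad_factor * len(x)           # zero padding gives a finer frequency grid
    spectrum = np.abs(np.fft.rfft(x, n=n_fft))

    # Highest peak, skipping the first and last bins so both neighbours exist.
    k = int(np.argmax(spectrum[1:-1])) + 1
    left, centre, right = spectrum[k - 1], spectrum[k], spectrum[k + 1]

    # Fit a parabola through the three points and take its vertex.
    denom = left - 2.0 * centre + right
    offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return (k + offset) / n_fft

# Hypothetical on/off sequence: 10 samples on, 10 samples off.
row = ((np.arange(100) % 20) < 10).astype(float)
freq = estimate_frequency(row)
angular = 2 * np.pi * freq   # comparable to parameter 'a' in fit_func
print(freq, angular)
```

The resulting angular frequency could then serve as the starting value for a in p0, which may make curve_fit less sensitive to the initial guess.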