I am plotting some data from a CSV. I recently edited the date range of the CSV file, but the values are the same. Before, the data was being plotted simultaneously with 2 other data sets, and the legend had only one entry for this data set. After editing the CSV file, the legend now displays the label 3 times, but the data is otherwise graphed correctly. I have tried removing the other two data sets from the plot, using numpoints=1, and ensuring nothing is in a for loop (none of this code uses one). Additionally, I made sure there weren't 3 versions of the data saved in the same directory. Any suggestions on why this is happening and how to fix it? I'm including my plotting code in case something in it is wrong.
plt.plot(date_range,ice_extent1,color='red',label='MASIE')
plt.xlabel("Date (yyyy/mm)")
plt.ylabel("Sea Ice Extent (10^6 km^2)")
plt.title("Sea Ice Extent")
plt.legend()
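One possible cause, offered as an assumption since the CSV itself is not shown, is that ice_extent1 became two-dimensional after the edit (for example three columns). plt.plot then draws one line per column and gives each the same label. The sketch below collapses repeated labels before building the legend; unique_legend_entries and plot_with_single_legend_entry are hypothetical helper names, not part of matplotlib.

```python
def unique_legend_entries(handles, labels):
    """Keep only the first handle for each distinct label, preserving order."""
    seen = set()
    kept_handles, kept_labels = [], []
    for handle, label in zip(handles, labels):
        if label not in seen:
            seen.add(label)
            kept_handles.append(handle)
            kept_labels.append(label)
    return kept_handles, kept_labels


def plot_with_single_legend_entry(date_range, ice_extent):
    """Plot as in the question, then show each label once in the legend."""
    # Imported here so the helper above stays dependency-free.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(date_range, ice_extent, color="red", label="MASIE")
    handles, labels = ax.get_legend_handles_labels()
    ax.legend(*unique_legend_entries(handles, labels))
    return fig, ax
```

If the legend still shows three entries, checking the shape of the plotted array (for instance with numpy.shape(ice_extent1)) would confirm or rule out this guess.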
I'm working on a dataset of SMS records [datetime_entry, sms_sent] and I was looking to copy a really effective trend visual from a well cited Electricity demand study. Does anyone know the name of this plot, or the implementation of something similar in Python (as I'm not sure this was done in Python).
I know how to subplot the 4 charts after splitting the data by quarter, I'm just stumped on the plot type and stylization.
This is what matplotlib calls an eventplot.
Essentially, each vertical line represents an occurrence of a MWh demand during that specific hour. So each row in the plot should have as many vertical lines as there are days in that quarter.
While it works in this plot for these data, relying on the combination of alpha level and data density can be unreliable as the data change, because the number of overlapping points is not readily visible. You can also create a similar visualization using hist2d, where you manually specify your bins.
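A minimal sketch of the eventplot idea for the SMS data, assuming the records are (datetime, sms_sent) pairs. The names group_by_hour, plot_hourly_events and example_records are made up for illustration, and the alpha and line-length values are arbitrary starting points.

```python
from datetime import datetime

# Hypothetical sample records: (datetime_entry, sms_sent)
example_records = [
    (datetime(2020, 1, 1, 9, 0), 5),
    (datetime(2020, 1, 2, 9, 30), 7),
    (datetime(2020, 1, 1, 17, 0), 2),
]


def group_by_hour(records):
    """Return 24 lists; list h holds the sms_sent values seen during hour h."""
    rows = [[] for _ in range(24)]
    for timestamp, sms_sent in records:
        rows[timestamp.hour].append(sms_sent)
    return rows


def plot_hourly_events(records, alpha=0.2):
    """Draw one row per hour with a vertical tick at each observed value."""
    import matplotlib.pyplot as plt

    rows = group_by_hour(records)
    fig, ax = plt.subplots()
    ax.eventplot(rows, orientation="horizontal",
                 lineoffsets=list(range(24)), linelengths=0.8, alpha=alpha)
    ax.set_xlabel("sms_sent")
    ax.set_ylabel("Hour of day")
    return fig, ax
```

For the four-panel layout, the same function could be called once per quarter after splitting the records.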
I am doing three plots:
Line plot, Boxplot and Histogram.
All these plots don't really need hoverinfo. Also these plots don't really need to plot each point of the data.
As you can see, the plots are very simple. However, when dealing with huge data (30 million observations), the resulting HTML weighs 5 MB, which is a lot because there are 100 more plots like this.
At the moment I have made some optimizations...
When saving to html I put these parameters:
fig.to_html( include_plotlyjs="cdn", full_html=False)
This reduces the plot size a lot, but it is not enough.
I have also tried in the Line plot specifying this parameter line = {"simplify":True} and hoverinfo = "skip". However, file size is almost the same.
Any help/ workaround is appreciated
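One workaround worth trying, sketched here as an assumption rather than a confirmed fix, is to aggregate the data before it reaches the figure, so the HTML embeds a few hundred numbers instead of every observation. The helpers bin_counts and downsample_mean are hypothetical names; on 30 million rows a vectorized equivalent such as numpy.histogram would be much faster than this pure-Python version.

```python
def bin_counts(values, n_bins):
    """Return (edges, counts) for equal-width bins spanning min..max of values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Guard against zero width; clamp the maximum into the last bin.
        idx = 0 if width == 0 else min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


def downsample_mean(values, chunk):
    """Replace each run of `chunk` consecutive values with its mean."""
    out = []
    for start in range(0, len(values), chunk):
        block = values[start:start + chunk]
        out.append(sum(block) / len(block))
    return out
```

The counts could then be drawn as a bar chart in place of the histogram trace, and the downsampled series in place of the full line, so that only the aggregated values are serialized.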
EDIT - I was being stupid, and trying to plot strings. I converted to int and plotted again fine. Thanks to ImportanceOfBeingErnest for the hint.
I have data from 3 sensors which I want to plot, using matplotlib
Each array is of different length, and I plot them using the following line of code
plt.plot(s_1,'r',s_3,'b',s_4,'g')
plt.show()
This produces the following graph
As you can see, the green trace is not correct, and the y-axis scale is off (there is a 6 after the 21).
I'm really not sure what the problem is here.
When I plot the data individually, they are fine:
It is the last one in this series that is plotted strangely in the graph with all three at once.
To be clear, I don't understand why separately the graphs plot fine, but when the three are printed in one plot the y-axis gets messed up.
Any advice around what the issue with the three-in-one plot is would be great.
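As the EDIT at the top of this question says, the values were strings. Matplotlib treats strings as categories and places them on the axis in order of first appearance, which is how a 6 can end up after a 21. A small sketch of the conversion, where to_numbers and plot_sensors are hypothetical helper names:

```python
def to_numbers(values):
    """Convert strings such as '21' or '6.5' to floats; numbers pass through."""
    return [float(v) for v in values]


def plot_sensors(s_1, s_3, s_4):
    """Plot the three sensor traces on a shared numeric y-axis."""
    import matplotlib.pyplot as plt

    plt.plot(to_numbers(s_1), 'r', to_numbers(s_3), 'b', to_numbers(s_4), 'g')
    plt.show()
```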
This is not a duplicate, because existing answers to similar questions don't describe exactly what I need.
Matplotlib has great formatters inside and I love to use them:
ax.xaxis.set_major_locator(matplotlib.dates.MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
They let me plot such stock market charts:
This is what I need, but it has 1 issue: weekends. They are present on x axis and make my chart a little ugly.
Other questions about this issue advise creating a custom formatter and show examples of such formatters. But none of them produce the pretty formatting that matplotlib does:
May19, Jun19, Jul19...
I mean this line of code:
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter('%b%y'))
My question is: please help me format the x axis like matplotlib does (May19, Jun19, Jul19...) without creating gaps for weekends when the stock market is closed.
What you could almost always do is something similar to what Nic Wanavit suggested.
Manually set your labels, depending on what you need on your axis.
Especially in this case, the plot looks a bit ugly because there are timespans with no actual data (the weekends in this case), so pyplot simply connects the neighbouring points across the corresponding length of the x-axis.
What you can do then is plot your data equally spaced, which is correct if the data is daily; otherwise consider interpolating it using e.g. pandas' built-in interpolation.
To keep pyplot from automatically using the index, I had to do this:
df['plotidx'] = list(range(len(df['close'])))
Here all the closing values for the stock are stored in a column named 'close', obviously.
You plot this correspondingly.
Then you can obtain all the ticks created via
labels = [item.get_text() for item in ax.get_xticklabels()]
Adjust them as desired with
labels[i] = string_for_the_label_no_i
Then get them back on the graph using
ax.xaxis.set_ticklabels(labels)
You then need to somewhat "update" the plot. Also keep in mind that, as the documentation also says, resizing a lot could leave the labels in strange locations.
It is some kind of a workaround, but it worked fine for me because it feels natural to plot data equally spaced next to each other rather than making up some data for the weekends.
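The approach above can be sketched as follows, assuming the dates are datetime.date objects for trading days only. Instead of editing the auto-generated labels, this variant places one tick at the first trading day of each month and labels it in the '%b%y' style from the question; month_ticks, plot_without_weekends and example_dates are hypothetical names.

```python
from datetime import date

# Hypothetical trading days (weekends already absent from the data)
example_dates = [date(2019, 5, 30), date(2019, 5, 31), date(2019, 6, 3),
                 date(2019, 6, 4), date(2019, 7, 1)]


def month_ticks(dates, fmt="%b%y"):
    """Return (positions, labels) for the first trading day of each month."""
    positions, labels = [], []
    previous = None
    for i, d in enumerate(dates):
        key = (d.year, d.month)
        if key != previous:
            positions.append(i)
            labels.append(d.strftime(fmt))
            previous = key
    return positions, labels


def plot_without_weekends(dates, close):
    """Plot closing prices against an integer index so weekends leave no gap."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(range(len(close)), close)
    positions, labels = month_ticks(dates)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    return fig, ax
```

Because the tick positions are fixed integers, the labels stay attached to the right data points when the figure is resized.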
Greets
To set the x ticks, assuming that you have the dates in the dataframe column df['dates']:
ax.xaxis.set_ticks(df['dates'])
I’ve been working on bokeh plots and I’m trying to plot a line graph taking values from a database. But the plot traces back to the initial point and I don’t want that. I want a plot which starts at one point and stops at a certain point (and does not circle back). I’ve tried plotting it in other tools like SQLite browser and Excel and the plot seems OK, which means I must be doing something wrong with the bokeh code and that the data points themselves are not in error.
I’ve attached the images for reference and the line of code doing the line plot. Is there something I’ve missed?
>>> image = fig.line("x", "y", color=color, source=something)
(Assume x and y are integer values and I’ve specified x and y ranges as DataRange1d(bounds=(0,None)))
Bokeh does not "auto-close" lines. You can see this is the case by looking at any number of examples in the docs and repository, but here is one in particular:
http://docs.bokeh.org/en/latest/docs/gallery/stocks.html
Bokeh's .line method will only "close up" if that is what is in the data (i.e., if the last point in the data is a repeat of the first point). I suggest you actually inspect the data values in source.data and I believe you will find this to be the case. Then the question is why is that the case and how to prevent it from doing that, but that is not really a Bokeh question.
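A quick way to run the check suggested above, written as a sketch on plain Python sequences. It assumes the columns are named "x" and "y" as in the question; closes_on_itself and drop_closing_point are hypothetical helpers, not Bokeh functions.

```python
def closes_on_itself(xs, ys):
    """True if the last point repeats the first, which makes the line loop back."""
    return len(xs) > 1 and xs[0] == xs[-1] and ys[0] == ys[-1]


def drop_closing_point(xs, ys):
    """Return copies of xs and ys without a trailing repeat of the first point."""
    if closes_on_itself(xs, ys):
        return list(xs[:-1]), list(ys[:-1])
    return list(xs), list(ys)
```

With a ColumnDataSource, the inputs would be source.data["x"] and source.data["y"]. Dropping the point only hides the symptom, so it is still worth finding out why the query returns the first row again.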