Is there a way to add grid lines for every month? I have plots like this:
I have tried tickCount and {'interval':'month', 'step':1} but neither works.
I have been trying to make a figure using plotly that combines multiple figures together. In order to do this, I have been trying to use the make_subplots function, but I have found it very difficult to have the plots added in such a way that they are properly formatted. I can currently make singular plots (as seen directly below):
However, whenever I try to combine these singular plots using make_subplots, I end up with this:
This figure has the subplots set up completely wrong, since I need each of the four subplots to contain data pertaining to the four methods (A, B, C, and D). In other words, I would like to have four subplots that look like my singular plot example above.
I have set up the code in the following way:
for sequence in sequences:
    #process for making sequence profile is done here
    sequence_df = pd.DataFrame(sequence_profile)
    row_number=1
    grand_figure = make_subplots(rows=4, cols=1)
    #there are four groups per sequence, so the grand figure should have four subplots in total
    for group in sequence_df["group"].unique():
        figure_df_group = sequence_df[(sequence_df["group"]==group)]
        figure_df_group.sort_values("sample", ascending=True, inplace=True)
        figure = px.line(figure_df_group, x = figure_df_group["sample"], y = figure_df_group["intensity"], color= figure_df_group["method"])
        figure.update_xaxes(title= "sample")
        figure.update_traces(mode='markers+lines')
        #note: the next line fails, since data must be extracted from the figure, hence why it is commented out
        #grand_figure.append_trace(figure, row = row_number, col=1)
        figure.update_layout(title_text="{} Profile Plot".format(sequence))
        grand_figure.append_trace(figure.data[0], row = row_number, col=1)
        row_number+=1
        figure.write_image(os.path.join(output_directory+"{}_profile_plot_subplots_in_{}.jpg".format(sequence, group)))
    grand_figure.write_image(os.path.join(output_directory+"grand_figure_{}_profile_plot_subplots.jpg".format(sequence)))
I have tried following directions (like for example, here: ValueError: Invalid element(s) received for the 'data' property) but I was unable to get my figures added as is as subplots. At first it seemed like I needed to use the graph object (go) module in plotly (https://plotly.com/python/subplots/), but I would really like to keep the formatting/design of my current singular plot. I just want the plots to be conglomerated in groups of four. However, when I try to add the subplots like I currently do, I need to use the data property of the figure, which causes the design of my scatter plot to be completely messed up. Any help for how I can ameliorate this problem would be great.
Ok, so I found a solution here. Rather than using the make_subplots function, I just instead exported all the figures onto an .html file (Plotly saving multiple plots into a single html) and then converted it into an image (HTML to IMAGE using Python). This isn't exactly the approach I would have preferred to have, but it does work.
UPDATE
I have found that Plotly Express offers another solution: px.line accepts facet parameters (facet_row and facet_col) that set up multiple subplots within one plot. My code is set up like this; unlike the code above, the dataframe does not need to be iterated over in a for loop by group:
sequence_df = pd.DataFrame(sequence_profile)
figure = px.line(sequence_df, x = sequence_df["sample"], y = sequence_df["intensity"], color= sequence_df["method"], facet_col= sequence_df["group"])
Although it still needs more formatting, my plot now looks like this, which works much better for my purposes:
Matplotlib has some pretty sophisticated code for figuring out how to show labels, but sometimes it crowds the labels more than looks good in presentations. Is there any way to tweak it?
For example, suppose we're plotting something against date:
import matplotlib.dates
import matplotlib.pyplot as plt
import numpy as np
figure = plt.figure(figsize=(8,1))
ax = plt.gca()
ax.set_xlim(xmin=np.datetime64('2010'), xmax=np.datetime64('2020-04-01'))
We get an x-axis like this:
But supposing we want it to show more spaced years, like this:
We can kludge it in any given case by editing the labels 'mechanically'. E.g.:
ax.set_xticks([tick for i, tick in enumerate(ax.get_xticks()) if i%2==0]) # Every other year.
ax.xaxis.set_major_formatter(matplotlib.dates.DateFormatter("%Y"))
But that's fragile, and it breaks whenever the x limits change.
Is there any way to force more spacing in the tick setup algorithm?
Oh! Found the matplotlib source code and it led me to AutoDateLocator:
ax.xaxis.set_major_locator(matplotlib.dates.AutoDateLocator(maxticks=8))
The corresponding locator for non-dates is MaxNLocator.
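A short sketch of both locators side by side, assuming a date axis like the one in the question and a hypothetical numeric axis from 0 to 1000. The Agg backend and the nbins=5 value are choices made for this example only.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

fig, (date_ax, num_ax) = plt.subplots(2, 1, figsize=(8, 2))

# Date axis: cap the number of major ticks the automatic date locator may pick.
date_ax.set_xlim(dt.datetime(2010, 1, 1), dt.datetime(2020, 4, 1))
date_ax.xaxis.set_major_locator(mdates.AutoDateLocator(maxticks=8))
date_ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))

# Numeric axis: MaxNLocator limits the number of intervals between ticks.
num_ax.set_xlim(0, 1000)
num_ax.xaxis.set_major_locator(MaxNLocator(nbins=5))

date_ticks = date_ax.get_xticks()
num_ticks = num_ax.get_xticks()
```

Because the locators recompute the ticks from the current view limits, this keeps working when the x limits change, unlike the manual set_xticks approach.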
I am making a program, which gives me a graph of users.
But once the number of dates grows too large, it gives me something like this:
https://i.stack.imgur.com/iwSdc.png
I want to save the plots stretched (wider), so that the dates can be read properly.
Is there any parameter for this in plot.savefig('name')?
For example this data
Option 1
You could rotate tick labels using:
plt.xticks(rotation=90)
Option 2
Or, if the x-axis holds dates, let matplotlib rotate and align the labels automatically:
fig.autofmt_xdate()
You can try the following code:
plt.xticks(rotation=90)
It will print your x-axis values vertically.
Hope this helps!
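On the original question of saving a wider image: as far as I know, savefig has no parameter that stretches the plot; the saved size comes from the figure size (figsize, in inches) multiplied by dpi. A minimal sketch that widens the figure with the number of dates before saving is shown below. The data, the 0.25 inch per label, and the output file name are all assumptions for illustration.

```python
import datetime as dt
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: one user count per day for 120 days.
dates = [dt.date(2020, 1, 1) + dt.timedelta(days=i) for i in range(120)]
users = [100 + i for i in range(120)]

# Allow roughly 0.25 inch of width per date label, with a minimum of 8 inches.
width_in = max(8, 0.25 * len(dates))
fig, ax = plt.subplots(figsize=(width_in, 4))

# Plotting the dates as strings gives one tick label per date.
ax.plot([d.isoformat() for d in dates], users)
ax.tick_params(axis="x", labelrotation=90)

# Hypothetical output location; any writable path works.
output_path = os.path.join(tempfile.gettempdir(), "users_wide.png")
fig.savefig(output_path, dpi=100, bbox_inches="tight")
```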
I've been working with matplotlib.pyplot to plot some data over date ranges, but have been running across some weird behavior, not too different from this question.
The primary difference between my issue and that one (aside from the suggested fix not working) is they refer to different locators (WeekdayLocator() in my case, AutoDateLocator() in theirs.) As some background, here's what I'm getting:
The expected and typical result, where my data is displayed with a reasonable date range:
And the very occasional result, where the data is given some ridiculous range of about 5 years (from what I can see):
I did some additional testing with a generic matplotlib.pyplot.plot and it seemed to be unrelated to using a subplot, or just creating the plot using the module directly.
plt.plot(some plot)
vs.
fig = plt.figure(...)
sub = fig.add_subplot(...)
sub.plot(some plot)
From what I could find, the odd behavior only happens when the data set has only one point (and therefore only a single date to plot). The outrageous number of ticks is caused by the WeekdayLocator() which, for some reason, attempts to generate 1635 ticks for the x-axis date range (from about 2013 to 2018) based on this error output:
RuntimeError: RRuleLocator estimated to generate 1635 ticks from
2013-07-11 19:23:39+00:00 to 2018-01-02 00:11:39+00:00: exceeds Locator.MAXTICKS * 2 (20)
(This was from some experimenting with the WeekdayLocator().MAXTICKS member set to 10)
I then tried changing the Locator based on how many date points I had to plot:
# If all the entries in the plot dictionary have <= 1 data point to plot
if all(len(times[comp]) <= 1 for comp in times.keys()):
    sub.xaxis.set_major_locator(md.DayLocator())
else:
    sub.xaxis.set_major_locator(md.WeekdayLocator())
This worked for the edge cases where I'd have a line with 2 points and a line with 1 (or just a point) and wanted the normal ticking, since it didn't get messed up, but it only sort of fixed my problem:
Now I don't have a silly amount of tick marks, but my date range is still 5 years! (Side Note: I also tried using an HourLocator(), but it attempted to generate almost 40,000 tick marks...)
So I guess my question is this: is there some way to rein in the date range explosion when only having one date to plot, or am I at the mercy of a strange bug with Matplotlib's date plotting methods?
What I would like to have is something similar to the first picture, where the date range goes from a little before the first date and a little after the last date. Even if Matplotlib were to fill up the axis range to about match the frequency of ticks in the first image, I would expect it to only span the course of a month or so, not five whole years.
Edit:
Forgot to mention that the range explosion also appears to occur regardless of which Locator I use. Plotting with zero points just results in a blank x-axis (due to no date range at all), a single point gives me the described huge date range, and multiple points/lines gives the expected date ranges.
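One workaround that may help is to set the x-limits explicitly from the data, padding by a fixed amount on each side, so the range no longer depends on how matplotlib autoscales a single date. This is a sketch under assumptions: pad_date_limits is a hypothetical helper, the 3-day padding is arbitrary, and the single data point is invented.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as md
import matplotlib.pyplot as plt


def pad_date_limits(ax, dates, pad=dt.timedelta(days=3)):
    """Set the x-limits to [min(dates) - pad, max(dates) + pad]; skip empty input."""
    if not dates:
        return
    ax.set_xlim(min(dates) - pad, max(dates) + pad)


# Hypothetical data set with a single date.
dates = [dt.datetime(2015, 10, 7, 12, 0)]
values = [42]

fig, sub = plt.subplots()
sub.plot(dates, values, "o")
pad_date_limits(sub, dates)
sub.xaxis.set_major_locator(md.DayLocator())
sub.xaxis.set_major_formatter(md.DateFormatter("%Y-%m-%d"))

# Matplotlib stores date limits as floating-point days.
low, high = sub.get_xlim()
```

With one point this gives a six-day window around it; with several points the same helper pads the first and last date, which matches the "a little before and a little after" range described above.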
I'm quite new to Python, pandas DataFrames and Seaborn. When I was trying to understand Seaborn better, particularly sns.lmplot, I came across a difference between two figures made of the same data, that I thought were supposed to look alike, and I wonder why that is.
Data: My data is a pandas DataFrame that has 454 rows and 19 columns. The data relevant to this question includes 4 columns and looks something like this:
Columns: Av_density; pred2; LOC; Year;
Variable type: Continuous variable; Continuous variable; Categorical variable 1...4;Categorical 2012...2014
There are no missing data points.
My aim is to draw a 2x2 figure panel describing the relationship between Av_density and pred2 separately for each LOC(=location) with years marked with different colours. I call seaborn with:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid")
np.random.seed(sum(map(ord, "linear_categorical")))
(Side point: for some reason calling "linear_quantitative" does not work, i.e. I get a "File "stdin", line 2
sns.lmplot("Av_density", "pred2", Data, col="LOC", hue="YEAR", col_wrap=2);
^
SyntaxError: invalid syntax")
Figure method 1, FacetGrid + scatter:
sur=sns.FacetGrid(Data,col="LOC", col_wrap=2,hue="YEAR")
sur.map(plt.scatter, "Av_density", "pred2" );
plt.legend()
This produces a nice scatter of the data accurately. You can see the picture here: https://drive.google.com/file/d/0B7h2wsx9mUBScEdUbGRlRk5PV1E/view?usp=sharing
Figure method 2, sns.lmplot:
sns.lmplot("Av_density", "pred2", Data, col="LOC", hue="YEAR", col_wrap=2);
This produces the figure panel divided by LOC accurately, with Years in different colours, but the scatter of the data points does not look right. Instead, it looks like lmplot has linearised the data points, and lost the original scatter points that it is supposed to be drawing in addition to the regression lines.
You can see the figure here: https://drive.google.com/file/d/0B7h2wsx9mUBSRkN5ZXhBeW9ob1E/view?usp=sharing
My data produces only three points per location per year, and I was first wondering if this is what makes the "mistake" in lmplot datapoint. Optimally I would have a shorter line describing the trend between years instead of a proper regression, but I have not figured out the code to this yet.
But before tackling that issue, I would really like to know if there is something I am doing wrong that I can fix, or if this is an issue of lmplot trying to handle my data?
Any help, comments and ideas on this are warmly welcome!
-TA-
Ps. I'm running Python 2.7.8 with Spyder 2.3.4
EDIT: I get shorter "trend lines" with the first method by adding:
sur.map(plt.plot,"Av_density", "pred2" );
Still would like to know what is messing the figure with lmplot.
The issue is probably only that the added regression line is messing up the y-axis, so that the variability in the data cannot be seen.
Try resetting the y-axis based on the variability in your original plot to see if they show the same thing, in your case e.g.
fig1 = sns.lmplot("Av_density", "pred2", Data, col="LOC", hue="YEAR", col_wrap=2);
fig1.set(ylim=(-0.03, 0.05))
plt.show()
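The same idea can be sketched with y-limits derived from the data rather than typed in by hand. Note that newer seaborn releases expect keyword arguments (data=, x=, y=) for lmplot, so the call is written that way here. The dataframe below is a made-up stand-in for Data; only the column names follow the question, and ci=None and the 10% margin are choices made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for Data: 4 locations x 3 years x 3 points each.
rng = np.random.default_rng(0)
records = [
    {"LOC": loc, "YEAR": year, "Av_density": float(d),
     "pred2": float(rng.normal(0.01, 0.01))}
    for loc in (1, 2, 3, 4)
    for year in (2012, 2013, 2014)
    for d in (1, 2, 3)
]
Data = pd.DataFrame(records)

grid = sns.lmplot(data=Data, x="Av_density", y="pred2",
                  col="LOC", hue="YEAR", col_wrap=2, ci=None)

# Base the y-limits on the observed values plus a 10% margin, so the
# regression lines cannot stretch the axis and hide the scatter.
pad = 0.1 * (Data["pred2"].max() - Data["pred2"].min())
ylim = (Data["pred2"].min() - pad, Data["pred2"].max() + pad)
grid.set(ylim=ylim)
```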