Adjusting date tick labels through Python matplotlib - python

I would like to make the x tick labels readable by reducing how often a tick is drawn. I found an example that addresses this and adapted it, which gave me the following code:
from matplotlib import pyplot as plt, dates as mdates
# Blank subplots
fig, axs = plt.subplots(4, 3, sharex='col', sharey='row', figsize=(6, 3), dpi=140)
# Loop through each chart in the subplot
for count, ax in enumerate(axs.reshape(-1)):
    ax.plot(dfN[count]["Tarih"], dfN[count]["PM10"])
    ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 7)))
    ax.xaxis.set_minor_locator(mdates.MonthLocator())
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%b'))
    for label in ax.get_xticklabels(which='major'):
        label.set(rotation=30, horizontalalignment='right')
plt.show()
However, this introduces a further problem: the axis does not start at my actual dates. It starts at 1970.
Any suggestions? Thanks.
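A common cause of a date axis that starts at 1970 is that the x values are not real datetimes (for example, they are strings), so the date formatter ends up interpreting small plain numbers as days since the 1970 epoch. Whether that applies here is an assumption, because the dtype of dfN[count]["Tarih"] is not shown. The following is a minimal sketch with made-up data (the frame below is a hypothetical stand-in for one dfN[count]) that converts the column with pd.to_datetime before plotting:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Hypothetical stand-in for one dfN[count] frame: dates stored as text.
df = pd.DataFrame({
    "Tarih": pd.date_range("2019-01-01", periods=500, freq="D").strftime("%Y-%m-%d"),
    "PM10": range(500),
})

# Convert the text column to real datetimes before plotting.
df["Tarih"] = pd.to_datetime(df["Tarih"], format="%Y-%m-%d")

fig, ax = plt.subplots()
ax.plot(df["Tarih"], df["PM10"])
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 7)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%b"))
fig.canvas.draw()

# The left axis limit should now sit near the first real date, not near 1970.
first_year = mdates.num2date(ax.get_xlim()[0]).year
```

If the axis still starts at 1970 after the conversion, the cause is probably elsewhere (for instance an axes that received no data), so treat this only as a first check.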


seaborn lineplot set x-axis scale interval for visibility [duplicate]

Here's how I plot this figure:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
plt.savefig('trip537685', dpi=600)
The x-axis is not readable despite setting plt.xticks(rotation=90). How do I change the scale so that it is readable?
As you have not provided the data, I have taken some random data of ~1500 rows with the date as text in DD-MM-YYYY format. First, because the column is text, convert it to datetime using to_datetime(), then plot it. As @JohanC said, that alone should give you a fairly good result. If you still need to adjust it, use set_major_locator() and set_major_formatter(). I have used an interval of 3 months here, but you can adjust it as you see fit. Hope this helps.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
df = pd.read_csv('austin_weather.csv')
df.rename(columns={'Date': 'timestamp', 'TempHighF': 'speed'}, inplace=True)
# Parse the text dates so matplotlib treats the x-axis as dates
df['timestamp'] = pd.to_datetime(df['timestamp'], format="%d-%m-%Y")
plt.figure(1, figsize=(20, 8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
# One major tick every 3 months, labelled like "Jan-2014"
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b-%Y'))
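If a fixed 3-month interval is not what you want, matplotlib can also pick the tick spacing itself. This is a swapped-in alternative, not what the answer above uses: a small sketch with AutoDateLocator and ConciseDateFormatter from matplotlib.dates (available in matplotlib 3.1 and later), on made-up daily data rather than the CSV above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

# Made-up daily series, roughly the size mentioned above (~1500 rows).
start = datetime(2014, 1, 1)
timestamps = [start + timedelta(days=i) for i in range(1500)]
speed = np.random.default_rng(0).integers(40, 100, size=len(timestamps))

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(timestamps, speed)

# Let matplotlib choose the spacing, capped at about 10 major ticks.
locator = mdates.AutoDateLocator(minticks=4, maxticks=10)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
fig.canvas.draw()

n_ticks = len(ax.get_xticks())
```

The same two set_major_* calls should work on the Axes returned by sns.lineplot, since seaborn returns a regular matplotlib Axes.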
It seems you have so many data points plotted that the x tick labels overlap at this font size.
If you don't need every x tick displayed, you can set the tick locations with xticks, passing an array so that only every nth tick is shown.
Data preparation:
Just strings for the x-axis labels as an example.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
import string
def random_string():
    return ''.join(random.choices(string.ascii_lowercase +
                                  string.digits, k=7))
size = 1000
x_list = []
for i in range(size):
    x_list.append(random_string())
y = np.random.randint(low=0, high=50, size=size)
df = pd.DataFrame(list(zip(x_list, y)),
columns =['timestamp', 'speed'])
Plot with a lot of datapoints for reference:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
plt.show()
Plot with reduced xticks:
plt.figure(1, figsize = (20,8))
ax = sns.lineplot(data=df, x=df['timestamp'], y=df['speed'])
plt.xticks(rotation=90)
plt.title('Trip 543365 timeline', fontsize=22)
plt.ylabel('GPS speed', fontsize=18)
plt.xlabel('Timestamp', fontsize=16,)
every_nth_xtick = 50
plt.xticks(np.arange(0, len(x_list)+1, every_nth_xtick))
plt.show()
To cross check you can add:
print(x_list[0])
print(x_list[50])
print(x_list[100])
Just make sure you run this in the same session as the plot, so that x_list still holds the same random strings that were plotted.
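The cross-check can also be done programmatically. The sketch below uses plain matplotlib instead of seaborn and deterministic made-up labels (id0000, id0001, ...) instead of random strings, so both are assumptions for illustration. It keeps every 50th tick and reads back the labels that are actually shown.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Unique string labels: on a categorical axis each one gets position 0, 1, 2, ...
labels = [f"id{i:04d}" for i in range(1000)]
y = np.random.default_rng(0).integers(0, 50, size=len(labels))

fig, ax = plt.subplots(figsize=(20, 8))
ax.plot(labels, y)

every_nth_xtick = 50
# Keep a tick only at positions 0, 50, 100, ...
ax.set_xticks(np.arange(0, len(labels), every_nth_xtick))
ax.tick_params(axis="x", rotation=90)
fig.canvas.draw()

# Labels that remain on the axis after thinning.
shown = [tick.get_text() for tick in ax.get_xticklabels()]
```

Each remaining tick still carries the label of the data point at that position, so shown[1] is the label of the 51st point.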

Matplotlib: Plot on double y-axis plot misaligned

I'm trying to plot two datasets into one plot with matplotlib. One of the two plots is misaligned by 1 on the x-axis.
This MWE pretty much sums up the problem. What do I have to adjust to bring the box-plot further to the left?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
titles = ["nlnd", "nlmd", "nlhd", "mlnd", "mlmd", "mlhd", "hlnd", "hlmd", "hlhd"]
plotData = pd.DataFrame(np.random.rand(25, 9), columns=titles)
failureRates = pd.DataFrame(np.random.rand(9, 1), index=titles)
color = {'boxes': 'DarkGreen', 'whiskers': 'DarkOrange', 'medians': 'DarkBlue',
'caps': 'Gray'}
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
plotData.plot.box(ax=ax1, color=color, sym='+')
failureRates.plot(ax=ax2, color='b', legend=False)
ax1.set_ylabel('Seconds')
ax2.set_ylabel('Failure Rate in %')
plt.xlim(-0.7, 8.7)
ax1.set_xticks(range(len(titles)))
ax1.set_xticklabels(titles)
fig.tight_layout()
fig.show()
Actual result: note that there are only 8 box plots instead of 9 and that they start at index 1.
The issue is a mismatch between how box() and plot() work: box() starts at x-position 1, while plot() uses the index of the dataframe (which by default starts at position 0). There are only 8 box plots because the 9th is cut off by plt.xlim(-0.7, 8.7). There are several easy ways to fix this. As @Sheldore's answer indicates, you can explicitly set the positions for the boxplot. Another way is to make the index of the failureRates dataframe start at 1 when you construct it, i.e.
failureRates = pd.DataFrame(np.random.rand(9, 1), index=range(1, len(titles)+1))
Note that you need not specify the xticks or the xlim for the MCVE in the question, but you may need to for your complete code.
You can specify the positions on the x-axis where you want the box plots to be drawn. Since you have 9 boxes, use the following:
plotData.plot.box(ax=ax1, color=color, sym='+', positions=range(9))
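To see the alignment in isolation, here is a minimal sketch that calls Axes.boxplot directly (an assumption made to keep the example small; DataFrame.plot.box forwards positions to it) and draws the line on a twin axis at the same x positions 0 to 8. The variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

titles = ["nlnd", "nlmd", "nlhd", "mlnd", "mlmd", "mlhd", "hlnd", "hlmd", "hlhd"]
rng = np.random.default_rng(0)
plot_data = rng.random((25, len(titles)))   # one column per box
failure_rates = rng.random(len(titles))     # one value per box
x = np.arange(len(titles))                  # shared x positions 0..8

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

# Put the boxes at 0..8 instead of the default 1..9.
bp = ax1.boxplot(plot_data, positions=x)
ax2.plot(x, failure_rates, color='b')

ax1.set_xticks(x)
ax1.set_xticklabels(titles)
ax1.set_xlim(-0.7, 8.7)
ax1.set_ylabel('Seconds')
ax2.set_ylabel('Failure Rate in %')

# Centre of each median line, i.e. where each box really sits on the x-axis.
median_centers = [float(np.mean(m.get_xdata())) for m in bp["medians"]]
```

With both artists on the same positions, the original plt.xlim(-0.7, 8.7) no longer cuts off the last box.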

Matplotlib Subplot Datetime X-Axis Ticks Not Working As Intended

I'm attempting to plot many plots, here's a sample of how the data is organized:
My intention is to build a series of subplots for either hours or days (say 7 days in a week, or 24 hours in a day) using Google Analytics data. My index consists of datetime objects.
Here's an example of how a single plot looks, when the axis is done correctly.
from datetime import datetime, date, timedelta
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import matplotlib.dates as dates
#creating our graph and declaring our locator/formatters used in axis labelling.
hours = dates.HourLocator(interval=2)
hours_ = dates.DateFormatter('%I %p')
el = datetime(year=2016, day=1, month=3, hour=0)
fig, ax = plt.subplots(ncols = 1, nrows= 1)
fig.set_size_inches(18.5, 10.5)
fig.tight_layout()
ax.set_title(el.strftime('%a, %m/%d/%y'))
ax.plot(df_total.loc[el:el+timedelta(hours=23, minutes=59),:].index,
df_total.loc[el:el+timedelta(hours=23, minutes=59),:].hits, '-')
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(hours_)
fig.show()
As you can see, the x axis looks good, working as intended with the right ticks/date labels.
However, when I try and run the same plot on a subplot series, I'm running into the following error. Here's my code:
fig, ax = plt.subplots(ncols = 3, nrows= 2)
fig.set_size_inches(18.5, 10.5)
fig.tight_layout()
nrows=2
ncols=3
count = 0
for row in range(nrows):
    for column in range(ncols):
        el = cleaned_date_range[count]
        ax[row][column].set_title(el.strftime('%a, %m/%d/%y'))
        ax[row][column].xaxis.set_major_locator(hours)
        ax[row][column].xaxis.set_major_formatter(hours_)
        ax[row][column].plot(df_total.loc[el:el+timedelta(hours=23,minutes=59),:].index, df_total.loc[el:el+timedelta(hours=23,minutes=59),:].hits)
        count += 1
        if count == 7:
            break
However, that yields a very odd plot with mislabelled axes.
I experimented with adding an additional row to see if the labels were just being covered up for lack of vertical space,
but was confronted with the same behavior: only the last subplot's axis appears to work, and the rest do not.
Any insight would be appreciated!
The answer is in the following GitHub issue, raised a few years ago, about set_major_locator() and set_major_formatter():
https://github.com/matplotlib/matplotlib/issues/1086/
To quote Eric:
"You are missing something, but it is something that is quite non-intuitive and easy to miss: Locators can't be shared among axes. The set_major_locator() method assigns its axis to that Locator, overwriting any axis that was previously assigned."
So the solution is to instantiate a new dates.MinuteLocator and dates.DateFormatter object for each new axes, e.g.:
for ax in list_of_axes:
    minutes = dates.MinuteLocator(interval=5)
    minutes_ = dates.DateFormatter('%I:%M %p')
    ax.xaxis.set_major_locator(minutes)
    ax.xaxis.set_major_formatter(minutes_)
I've experimented and it looks like you don't need to reference the dates.Locator and dates.Formatter objects after the plot so it's ok to just re-instantiate with each loop using the same name. (I could be wrong here though!)
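Applied to a 2x3 grid like the one in the question, the idea can be sketched as follows. The hourly data is made up and stands in for df_total, and HourLocator(interval=6) is chosen only to keep the labels sparse. Each axes gets its own locator and formatter inside the loop.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

# Made-up hourly data for six days (a stand-in for df_total).
start = datetime(2016, 3, 1)
times = [start + timedelta(hours=i) for i in range(6 * 24)]
hits = np.random.default_rng(0).integers(0, 100, size=len(times))

fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(12, 6))
for day, ax in enumerate(axs.flat):
    one_day = slice(day * 24, (day + 1) * 24)
    ax.plot(times[one_day], hits[one_day])
    ax.set_title(times[one_day][0].strftime('%a, %m/%d/%y'))
    # New objects on every iteration: a locator is bound to a single axis.
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=6))
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%I %p'))
fig.canvas.draw()

# Each axes should report a distinct locator that points back at its own x-axis.
locators = [ax.xaxis.get_major_locator() for ax in axs.flat]
```

Reusing the same names inside the loop is fine here because each axis keeps its own reference to the objects it was given.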
I had the same missing subplot datetime x-axis tick marks issue. The following code, which is quite similar to the OP's, seems to work, see the attached figure. However, I'm using matplotlib 3.1.0, perhaps the issue has been addressed in this version? But I do have one observation: if I enable fig.autofmt_xdate() for the second subplot, the first subplot datetime x-axis will not display.
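from matplotlib.dates import MonthLocator, DateFormatter
# Set the default figure size before creating the figure so that it takes effect
plt.rcParams['figure.figsize'] = (width, height)
fig = plt.figure()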
plt.subplots_adjust(wspace=0.25, hspace=0.2)
ax = fig.add_subplot(2,1,1)
ax.xaxis.set_major_locator(MonthLocator(bymonthday=1))
ax.xaxis.set_major_formatter(DateFormatter('%Y-%b'))
ax.plot(df1['DATE'], df1['Movement'], '-')
plt.ylabel(r'$D$', fontsize=18)
plt.xticks(fontsize=12)
plt.yticks(fontsize=16)
plt.legend(fontsize=16, frameon=False)
fig.autofmt_xdate()
ax = fig.add_subplot(2,1,2)
ax.xaxis.set_major_locator(MonthLocator(bymonthday=1))
ax.xaxis.set_major_formatter(DateFormatter('%Y-%b'))
ax.plot(df2['DATE'], df2['Movement'], '-')
#plt.ylabel(r'$D$', fontsize=18)
plt.xticks(fontsize=16)
plt.yticks(fontsize=16)
plt.legend(fontsize=16, frameon=False)
#fig.autofmt_xdate()
plt.show()

matplotlib scatter plot change distance in x-axis

I want to plot some Data with Matplotlib scatter plot.
I used the following code to plot the Data as a scatter with using the same axes for the different subplots.
import numpy as np
import matplotlib.pyplot as plt
epsilon= np.array([1,2,3,4,5])
f, (ax1, ax2, ax3, ax4) = plt.subplots(4, sharex= True, sharey=True)
ax1.scatter(epsilon, mean_percent_100_0, color='r', label='Totaldehnung= 0.000')
ax1.scatter(epsilon, mean_percent_100_03, color='g',label='Totaldehnung= 0.003')
ax1.scatter(epsilon, mean_percent_100_05, color='b',label='Totaldehnung= 0.005')
ax1.set_title('TOR_R')
ax2.scatter(epsilon, mean_percent_111_0,color='r')
ax2.scatter(epsilon, mean_percent_111_03,color='g')
ax2.scatter(epsilon, mean_percent_111_05,color='b')
ax3.scatter(epsilon, mean_percent_110_0,color='r')
ax3.scatter(epsilon, mean_percent_110_03,color='g')
ax3.scatter(epsilon, mean_percent_110_05,color='b')
ax4.scatter(epsilon, mean_percent_234_0,color='r')
ax4.scatter(epsilon, mean_percent_234_03,color='g')
ax4.scatter(epsilon, mean_percent_234_05,color='b')
# Fine-tune figure; make subplots close to each other and hide x ticks for
# all but bottom plot.
f.subplots_adjust(hspace=0.13)
plt.setp([a.get_xticklabels() for a in f.axes[:-1]], visible=False)
plt.locator_params(axis = 'y', nbins = 4)
ax1.grid()
ax2.grid()
ax3.grid()
ax4.grid()
plt.show()
Now I want an x-axis with less space between the points. I tried to change the range, but it did not work. Can someone help me?
To bring the x ticks closer together, you may have to set the dimensions of the figure.
Since the figure is already created in your case, set its size with the set_size_inches method of the figure object.
Add the following line before plt.show() (the figure object in your code is named f):
f.set_size_inches(2, 8)
I hope this is what you are trying to do.
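As a rough, self-contained sketch of that resize (the y values are random placeholders for the mean_percent_* arrays, which are not shown in the question):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

epsilon = np.array([1, 2, 3, 4, 5])
rng = np.random.default_rng(0)

fig, axes = plt.subplots(4, sharex=True, sharey=True)
for ax in axes:
    # Placeholder data standing in for the mean_percent_* arrays.
    ax.scatter(epsilon, rng.random(len(epsilon)), color='r')
    ax.grid()

# A narrow, tall figure squeezes the x positions closer together.
fig.set_size_inches(2, 8)
width, height = fig.get_size_inches()
```

The data coordinates do not change. Only the physical width available to the x-axis shrinks, which is why the points move closer together.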

matplotlib: Creating two (stacked) subplots with SHARED X axis but SEPARATE Y axis values

I am using matplotlib 1.2.x and Python 2.6.5 on Ubuntu 10.0.4. I am trying to create a SINGLE plot that consists of a top plot and a bottom plot.
The X axis is the date of the time series. The top plot contains a candlestick plot of the data, and the bottom plot should consist of a bar type plot - with its own Y axis (also on the left - same as the top plot). These two plots should NOT OVERLAP.
Here is a snippet of what I have done so far.
datafile = r'/var/tmp/trz12.csv'
r = mlab.csv2rec(datafile, delimiter=',', names=('dt', 'op', 'hi', 'lo', 'cl', 'vol', 'oi'))
mask = (r["dt"] >= datetime.date(startdate)) & (r["dt"] <= datetime.date(enddate))
selected = r[mask]
plotdata = zip(date2num(selected['dt']), selected['op'], selected['cl'], selected['hi'], selected['lo'], selected['vol'], selected['oi'])
# Setup charting
mondays = WeekdayLocator(MONDAY) # major ticks on the mondays
alldays = DayLocator() # minor ticks on the days
weekFormatter = DateFormatter('%b %d') # Eg, Jan 12
dayFormatter = DateFormatter('%d') # Eg, 12
monthFormatter = DateFormatter('%b %y')
# every Nth month
months = MonthLocator(range(1,13), bymonthday=1, interval=1)
fig = pylab.figure()
fig.subplots_adjust(bottom=0.1)
ax = fig.add_subplot(111)
ax.xaxis.set_major_locator(months)#mondays
ax.xaxis.set_major_formatter(monthFormatter) #weekFormatter
ax.format_xdata = mdates.DateFormatter('%Y-%m-%d')
ax.format_ydata = price
ax.grid(True)
candlestick(ax, plotdata, width=0.5, colorup='g', colordown='r', alpha=0.85)
ax.xaxis_date()
ax.autoscale_view()
pylab.setp( pylab.gca().get_xticklabels(), rotation=45, horizontalalignment='right')
# Add volume data
# Note: the code below OVERWRITES the bottom part of the first plot
# it should be plotted UNDERNEATH the first plot - but somehow, that's not happening
fig.subplots_adjust(hspace=0.15)
ay = fig.add_subplot(212)
volumes = [ x[-2] for x in plotdata]
ay.bar(range(len(plotdata)), volumes, 0.05)
pylab.show()
I have managed to display the two plots using the code above, however, there are two problems with the bottom plot:
It COMPLETELY OVERWRITES the bottom part of the first (top) plot, almost as though the second plot was drawing on the same 'canvas' as the first plot. I can't see where or why that is happening.
It OVERWRITES the existing X axis with its own indices. The X axis values (dates) should be SHARED between the two plots.
What am I doing wrong in my code? Can someone spot what is causing the 2nd (bottom) plot to overwrite the first (top) plot, and how can I fix this?
Here is a screenshot of the plot created by the code above:
[[Edit]]
After modifying the code as suggested by hwlau, this is the new plot. It is better than the first in that the two plots are separate, however the following issues remain:
The X axis should be SHARED by the two plots (i.e. the X axis should be shown only for the 2nd [bottom] plot)
The Y values for the 2nd plot seem to be formatted incorrectly
I think these issues should be quite easy to resolve. However, my matplotlib fu is not great at the moment, as I have only recently started programming with matplotlib. Any help will be much appreciated.
There seem to be a couple of problems with your code:
If you were using fig.add_subplot with the full signature of subplot(nrows, ncols, plotNum), it may have been more apparent that your first plot was asking for 1 row and 1 column and the second plot was asking for 2 rows and 1 column. Hence your first plot fills the whole figure. Rather than fig.add_subplot(111) followed by fig.add_subplot(212), use fig.add_subplot(211) followed by fig.add_subplot(212).
Sharing an axis should be done in the add_subplot command using sharex=first_axis_instance.
I have put together an example which you should be able to run:
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.dates as mdates
import datetime as dt
n_pts = 10
dates = [dt.datetime.now() + dt.timedelta(days=i) for i in range(n_pts)]
ax1 = plt.subplot(2, 1, 1)
ax1.plot(dates, range(10))
ax2 = plt.subplot(2, 1, 2, sharex=ax1)
ax2.bar(dates, range(10, 20))
# Now format the x axis. This *MUST* be done after all sharex commands are run.
# put no more than 10 ticks on the date axis.
ax1.xaxis.set_major_locator(mticker.MaxNLocator(10))
# format the date in our own way.
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
# rotate the labels on both date axes
for label in ax1.xaxis.get_ticklabels():
    label.set_rotation(30)
for label in ax2.xaxis.get_ticklabels():
    label.set_rotation(30)
# tweak the subplot spacing to fit the rotated labels correctly
plt.subplots_adjust(hspace=0.35, bottom=0.125)
plt.show()
Hope that helps.
You should change this line:
ax = fig.add_subplot(111)
to
ax = fig.add_subplot(211)
The original command means that there is one row and one column, so the first plot occupies the whole figure. Your second plot, fig.add_subplot(212), therefore covers the lower part of the first one.
Edit
If you don't want the gap between the two plots, use subplots_adjust() to change the spacing between the subplots.
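A minimal sketch of that adjustment, assuming two stacked subplots created with plt.subplots (made-up data, and hspace=0 chosen here to remove the gap completely):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import datetime as dt
import matplotlib.pyplot as plt

n_pts = 10
dates = [dt.datetime(2013, 1, 1) + dt.timedelta(days=i) for i in range(n_pts)]

# Two stacked plots sharing the x-axis.
fig, (top, bottom) = plt.subplots(2, sharex=True)
top.plot(dates, range(n_pts))
bottom.bar(dates, range(10, 20))

# hspace is the vertical gap as a fraction of the average axes height.
fig.subplots_adjust(hspace=0)

# Distance between the bottom edge of the top plot and the top edge of the bottom plot.
gap = top.get_position().y0 - bottom.get_position().y1
```

A small positive value such as hspace=0.05 leaves a thin separation instead of letting the two plots touch.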
The example from @Pelson, simplified.
import matplotlib.pyplot as plt
import datetime as dt
#Two subplots that share one x axis
fig,ax=plt.subplots(2,sharex=True)
#plot data
n_pts = 10
dates = [dt.datetime.now() + dt.timedelta(days=i) for i in range(n_pts)]
ax[0].bar(dates, range(10, 20))
ax[1].plot(dates, range(10))
#rotate and format the dates on the x axis
fig.autofmt_xdate()
The subplots sharing an x-axis are created in one line, which is convenient when you want more than two subplots:
fig, ax = plt.subplots(number_of_subplots, sharex=True)
To format the date correctly on the x axis, we can simply use fig.autofmt_xdate()
For additional information, see the shared axis demo and the date demo from the pylab examples.
This example ran on Python3, matplotlib 1.5.1
