matplotlib subplot axis label - python

I have a subplot whose x-axis should be labelled with voltages. In the CSV data, the voltage values increase from 0 to 30 and then decrease from 30 back to 0.
When I use this code, it gives me this plot:
ax2.plot(df_raw.index, df_raw.loc[:,"data_column"])
When I use the code below, I get the plot shown below:
ax2.plot(df_raw.loc[:,"voltage"], df_raw.loc[:,"data_column"])
What I really want is shown below:

Try to set the label manually:
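import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'vol': list(range(101)) + list(range(99,0,-1)),
                   'val': [0]*10 + [1]*180 +[0]*10})
fig, ax = plt.subplots()
ax.plot(df.index, df.val)
# Label each tick with the voltage at that row. reindex() gives NaN (shown as 0)
# for ticks outside the data instead of raising a KeyError on recent pandas.
ax.set_xticklabels(df.vol.reindex(ax.get_xticks())
                   .fillna(0).astype(int))
plt.show()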
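An alternative to overwriting the tick labels once is to let a formatter look up the voltage for whatever ticks matplotlib chooses, so the labels stay correct after zooming or resizing. The following is a minimal sketch using the same mock data; the helper name voltage_label is made up for illustration, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter

# Same mock data as in the answer above: voltage ramps up, then back down
df = pd.DataFrame({'vol': list(range(101)) + list(range(99, 0, -1)),
                   'val': [0]*10 + [1]*180 + [0]*10})

def voltage_label(x, pos=None):
    """Return the voltage stored at row x, or '' if the tick is outside the data."""
    i = int(round(x))
    return str(df['vol'].iloc[i]) if 0 <= i < len(df) else ''

fig, ax = plt.subplots()
ax.plot(df.index, df['val'])  # plot against the row index, as before
ax.xaxis.set_major_formatter(FuncFormatter(voltage_label))
fig.canvas.draw()  # render once so the formatter is applied
```

Because the formatter is called for every tick, there is no need to keep tick positions and labels in sync by hand.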

Related

Matplotlib not showing correct and desired x-axis

I have a barplot I am trying to plot without the x-axis ticks overlapping. I have settled on an angle of 45 degrees, and a max. number of ticks of 50, as this is about the max. of what can be shown without overlapping (IF the ticks are tilted at 45 degrees).
However, in my attempts I ran into the problem of Matplotlib not setting the x-axis to what I desire, whatever I try. I need to plot multiple datasets, for all of which the time runs from -15.8 through somewhere around 1200-1800.
I tried several solutions I found online, but all to no avail. The code below does not work, as it does not show the correct ticks. The range stops well before the last number in the timepoints list.
import numpy as np
from matplotlib import pyplot as plt
# Mock data
timepoints = list(np.arange(-15.8, 1276.2, 4))
patient_counts = np.random.randint(300, 600, len(timepoints))
x_tick_pos = [i + 0.5 for i in range(len(timepoints))]
# Plot barplot
fig, ax = plt.subplots(figsize=(16, 10))
ax.bar(x_tick_pos, patient_counts, align='center', width=1.0)
# Set x axis ticks
ax.set_xticklabels(timepoints, rotation=45)
ax.locator_params(axis='x', nbins=20)
plt.show()
Clearly, the x-axis does not come close to the expected values.
EDIT
To expand, this question is a follow-up from this thread. The code based on the answer in that question is as follows
# Plot barplot
fig, ax = plt.subplots(figsize=(16, 10))
ax.bar(x_tick_pos, patient_counts, align='center', width=1.0)
# Set x axis ticks
ax.set_xticks(x_tick_pos)
ax.set_xticklabels(x_ticks, rotation=45)
This appears to set the right x-ticks, except that they overlap a lot, which is why I want at most 50 ticks to show:
This might be a simple case of fixing the x_tick_pos list expression. In your mock example, if we print them out ...
x_tick_pos = [i + 0.5 for i in range(len(timepoints))]
print(x_tick_pos[:5], x_tick_pos[-5:])
... we get what your figure reflects:
[0.5, 1.5, 2.5, 3.5, 4.5] [318.5, 319.5, 320.5, 321.5, 322.5]
Changing the assignment to
x_tick_pos = [i + 0.5 for i in timepoints]
would appear to give the expected ticks.
The issue is that the tick positions are written so that they line up with another graph above this one, as per this post.
There are two solutions:
Forget about positioning the ticks relative to another graph, if this bar plot is plotted standalone.
Reset the ticks after plotting the bar plot to give them the correct labels:
# Plot barplot
fig, ax = plt.subplots(figsize=(16, 10))
ax.bar(x_tick_pos, patient_counts, align='center', width=1.0)
# Set x axis ticks
ticks_step = max(1, len(missings_df.index) // 50) # 50 here is roughly the max. nr of ticks
x_ticks = [missings_df.index[i] for i in range(0, len(missings_df.index), ticks_step)]
x_tick_pos = [i + 0.5 for i in range(0, len(missings_df.index), ticks_step)]
ax.set_xticks(x_tick_pos)
ax.set_xticklabels(x_ticks, rotation=45)
This correctly plots the x-axis:
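The same idea can be tried on the mock data from the question, which does not need missings_df. This is only a sketch: bar_pos, max_ticks and step are names chosen here, and the step is rounded up (rather than down as above) so that the tick count never exceeds the limit.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Mock data from the question
timepoints = np.arange(-15.8, 1276.2, 4)
patient_counts = np.random.randint(300, 600, len(timepoints))
bar_pos = np.arange(len(timepoints)) + 0.5  # one bar per timepoint

max_ticks = 50
# Rounding up guarantees at most max_ticks ticks; the max() guards short inputs
step = max(1, int(np.ceil(len(timepoints) / max_ticks)))

fig, ax = plt.subplots(figsize=(16, 10))
ax.bar(bar_pos, patient_counts, align='center', width=1.0)
# Set positions and labels together, using the same slice for both
ax.set_xticks(bar_pos[::step])
ax.set_xticklabels([f"{t:.1f}" for t in timepoints[::step]], rotation=45)
```

Setting the positions and the labels from the same slice is what keeps each label attached to the bar it describes.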

Garbled x-axis labels in matplotlib subplots

I am querying COVID-19 data and building a dataframe of day-over-day changes for one of the data points (positive test results) where each row is a day, each column is a state or territory (there are 56 altogether). I can then generate a chart for every one of the states, but I can't get my x-axis labels (the dates) to behave like I want. There are two problems which I suspect are related. First, there are too many labels -- usually matplotlib tidily reduces the label count for readability, but I think the subplots are confusing it. Second, I would like the labels to read vertically; but this only happens on the last of the plots. (I tried moving the rotation='vertical' inside the for block, to no avail.)
The dates are the same for all the subplots, so -- this part works -- the x-axis labels only need to appear on the bottom row of the subplots. Matplotlib is doing this automatically. But I need fewer of the labels, and for all of them to align vertically. Here is my code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# get current data
all_states = pd.read_json("https://covidtracking.com/api/v1/states/daily.json")
# convert the YYYYMMDD date to a datetime object
all_states[['gooddate']] = all_states[['date']].applymap(lambda s: pd.to_datetime(str(s), format = '%Y%m%d'))
# 'positive' is the cumulative total of COVID-19 test results that are positive
all_states_new_positives = all_states.pivot_table(index = 'gooddate', columns = 'state', values = 'positive', aggfunc='sum')
all_states_new_positives_diff = all_states_new_positives.diff()
fig, axes = plt.subplots(14, 4, figsize = (12,8), sharex = True )
plt.tight_layout
for i , ax in enumerate(axes.ravel()):
    # get the numbers for the last 28 days
    x = all_states_new_positives_diff.iloc[-28 :].index
    y = all_states_new_positives_diff.iloc[-28 : , i]
    ax.set_title(y.name, loc='left', fontsize=12, fontweight=0)
    ax.plot(x,y)
plt.xticks(rotation='vertical')
plt.subplots_adjust(left=0.5, bottom=1, right=1, top=4, wspace=2, hspace=2)
plt.show();
Suggestions:
Increase the height of the figure.
fig, axes = plt.subplots(14, 4, figsize = (12,20), sharex = True)
Rotate all the labels:
fig.autofmt_xdate(rotation=90)
Use tight_layout at the end instead of subplots_adjust:
fig.tight_layout()
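Put together, the three suggestions might look like the sketch below. It uses random numbers instead of the COVID tracking API, so the column names (S00 to S55) and the values are made up; only the layout calls are the point.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the day-over-day dataframe: 28 days x 56 "states"
dates = pd.date_range("2020-04-01", periods=28, freq="D")
states = [f"S{i:02d}" for i in range(56)]  # hypothetical column names
diff = pd.DataFrame(np.random.randint(0, 500, (28, 56)),
                    index=dates, columns=states)

# Taller figure so 14 rows have room
fig, axes = plt.subplots(14, 4, figsize=(12, 20), sharex=True)
for ax, state in zip(axes.ravel(), diff.columns):
    ax.plot(diff.index, diff[state])
    ax.set_title(state, loc='left', fontsize=12)

fig.autofmt_xdate(rotation=90)  # rotate the date labels on the bottom row
fig.tight_layout()              # called (with parentheses) after plotting
```

Note that tight_layout has to be called with parentheses; a bare plt.tight_layout, as in the question, does nothing.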

Editing the labels and position of the axis ticks on a seaborn heatmap results in an empty plot

I am trying to plot a seaborn heatmap with custom locations and labels on both axes. The dataframe looks like this:
Dataframe
I can plot this normally with seaborn.heatmap:
fig, ax = plt.subplots(figsize=(8, 8))
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
ax=ax)
plt.show()
Normal heatmap
I have a list of positions I'd like to set as the xticks (binned_chrom_genome_pos):
[1000000, 248000000, 491000000, 690000000, 881000000, 1062000000, 1233000000, 1392000000, 1538000000, 1679000000, 1814000000, 1948000000, 2081000000, 2195000000, 2301000000, 2402000000, 2490000000, 2569000000, 2645000000, 2709000000, 2772000000, 2819000000, 2868000000, 3023000000]
However, when I try to modify the xticks, the plot becomes empty:
plt.xticks(binned_chrom_genome_pos)
Modified heatmap
I also noticed that the x-axis labels do not correspond to the ticks specified.
Could someone assist me in plotting this properly?
Why the code does what it does
ax.get_xticks() returns the positions of the ticks. You can see that they are between 0.5 and 3000. These values refer to the index of your data. Large values, set by plt.xticks, or ax.set_xticks are still interpreted as data indices. So, if you have 10 rows of data, and set xticks to [0, 1000], the data in your figure will only occupy 1% of the x-range, hence disappearing. I am not sure if I am making myself clear, so I will give an example with synthetic data:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
#generating data
dic = {a:np.random.randint(0,1000,100) for a in range(0,1000000, 10000)}
genome_freq = pd.DataFrame(dic, index=range(0,1000000, 10000))
#plotting heatmaps
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
ax=ax1)
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
ax=ax2)
old_ticks = ax2.get_xticks()
print(np.min(old_ticks), np.max(old_ticks), len(old_ticks)) # prints 0.5 99.5 34
ax2.set_xticks([0,300]) # setting xticks with values way larger than your index squishes your data
plt.show()
What can be done to fix it
So, what you want to do, is to change the xticks based on the size of your data, and then overwrite xticklabels:
Given the new labels from your question:
new_labels = [1000000, 248000000, 491000000, 690000000, 881000000, 1062000000, 1233000000, 1392000000, 1538000000, 1679000000, 1814000000, 1948000000, 2081000000, 2195000000, 2301000000, 2402000000, 2490000000, 2569000000, 2645000000, 2709000000, 2772000000, 2819000000, 2868000000, 3023000000]
len(new_labels) # returns 24
fig, ax = plt.subplots(figsize=(4, 4))
sns.heatmap(genome_freq.applymap(lambda x: np.log10(x+1)),
ax=ax)
So, now we want 24 evenly spaced xticks between the former minimum and the former maximum. We can use np.linspace to achieve that:
old_ticks = ax.get_xticks()
new_ticks = np.linspace(np.min(old_ticks), np.max(old_ticks), len(new_labels))
ax.set_xticks(new_ticks)
ax.set_xticklabels(new_labels)
plt.show()
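Evenly spaced ticks only line up with the data if the labelled positions are themselves evenly spaced. If the heatmap columns are sorted genomic bins, another option is to place each tick on the column that contains the position. The sketch below assumes the columns are bin start coordinates in ascending order; bin_starts is a made-up example with 1 Mb bins, and only the first four positions from the question are used.

```python
import numpy as np

# Hypothetical column labels: start coordinate of each 1 Mb bin
bin_starts = np.arange(0, 3_100_000_000, 1_000_000)
# First few positions from the question's list
chrom_starts = np.array([1_000_000, 248_000_000, 491_000_000, 690_000_000])

# Index of the bin (heatmap column) containing each position
col_idx = np.searchsorted(bin_starts, chrom_starts, side='right') - 1
# Heatmap cell j spans [j, j + 1] on the x-axis, so its centre is j + 0.5
tick_pos = col_idx + 0.5
print(tick_pos)
```

The result can then be passed to ax.set_xticks(tick_pos) followed by ax.set_xticklabels(chrom_starts), in the same way as new_ticks and new_labels above.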

Matplotlib: Plot on double y-axis plot misaligned

I'm trying to plot two datasets into one plot with matplotlib. One of the two plots is misaligned by 1 on the x-axis.
This MWE pretty much sums up the problem. What do I have to adjust to bring the box-plot further to the left?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
titles = ["nlnd", "nlmd", "nlhd", "mlnd", "mlmd", "mlhd", "hlnd", "hlmd", "hlhd"]
plotData = pd.DataFrame(np.random.rand(25, 9), columns=titles)
failureRates = pd.DataFrame(np.random.rand(9, 1), index=titles)
color = {'boxes': 'DarkGreen', 'whiskers': 'DarkOrange', 'medians': 'DarkBlue',
'caps': 'Gray'}
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
plotData.plot.box(ax=ax1, color=color, sym='+')
failureRates.plot(ax=ax2, color='b', legend=False)
ax1.set_ylabel('Seconds')
ax2.set_ylabel('Failure Rate in %')
plt.xlim(-0.7, 8.7)
ax1.set_xticks(range(len(titles)))
ax1.set_xticklabels(titles)
fig.tight_layout()
fig.show()
Actual result. Note that it's only 8 box plots instead of 9 and that they start at index 1.
The issue is a mismatch between how box() and plot() work: box() starts at x-position 1, while plot() uses the index of the dataframe (which defaults to starting at 0). Only 8 boxes are visible because the 9th is cut off by plt.xlim(-0.7, 8.7). There are several easy ways to fix this. As @Sheldore's answer indicates, you can explicitly set the positions for the boxplot. Alternatively, change the index of the failureRates dataframe to start at 1 when constructing it, i.e.
failureRates = pd.DataFrame(np.random.rand(9, 1), index=range(1, len(titles)+1))
Note that you need not specify the xticks or the xlim for the MCVE in the question, but you may need to for your complete code.
You can specify the positions on the x-axis where you want to have the box plots. Since you have 9 boxes, use the following which generates the figure below
plotData.plot.box(ax=ax1, color=color, sym='+', positions=range(9))
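The effect of positions can be checked with plain matplotlib, without pandas. This is a small sketch with random data; it calls Axes.boxplot directly rather than the DataFrame.plot.box wrapper used above, and the variable names are chosen here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(25, 9)  # 9 columns -> 9 boxes
rates = np.random.rand(9)     # one failure rate per box

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
# Without positions, the boxes would be drawn at x = 1..9
ax1.boxplot(data, positions=range(9))
# The line is drawn at x = 0..8, so it now lines up with the boxes
ax2.plot(range(9), rates, color='b')
print(ax1.get_xticks())
```

With both artists on the same x positions, no extra xlim call is needed to make all nine boxes visible.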

matplotlib: Creating two (stacked) subplots with SHARED X axis but SEPARATE Y axis values

I am using matplotlib 1.2.x and Python 2.6.5 on Ubuntu 10.0.4. I am trying to create a SINGLE plot that consists of a top plot and a bottom plot.
The X axis is the date of the time series. The top plot contains a candlestick plot of the data, and the bottom plot should consist of a bar type plot - with its own Y axis (also on the left - same as the top plot). These two plots should NOT OVERLAP.
Here is a snippet of what I have done so far.
datafile = r'/var/tmp/trz12.csv'
r = mlab.csv2rec(datafile, delimiter=',', names=('dt', 'op', 'hi', 'lo', 'cl', 'vol', 'oi'))
mask = (r["dt"] >= datetime.date(startdate)) & (r["dt"] <= datetime.date(enddate))
selected = r[mask]
plotdata = zip(date2num(selected['dt']), selected['op'], selected['cl'], selected['hi'], selected['lo'], selected['vol'], selected['oi'])
# Setup charting
mondays = WeekdayLocator(MONDAY) # major ticks on the mondays
alldays = DayLocator() # minor ticks on the days
weekFormatter = DateFormatter('%b %d') # Eg, Jan 12
dayFormatter = DateFormatter('%d') # Eg, 12
monthFormatter = DateFormatter('%b %y')
# every Nth month
months = MonthLocator(range(1,13), bymonthday=1, interval=1)
fig = pylab.figure()
fig.subplots_adjust(bottom=0.1)
ax = fig.add_subplot(111)
ax.xaxis.set_major_locator(months)#mondays
ax.xaxis.set_major_formatter(monthFormatter) #weekFormatter
ax.format_xdata = mdates.DateFormatter('%Y-%m-%d')
ax.format_ydata = price
ax.grid(True)
candlestick(ax, plotdata, width=0.5, colorup='g', colordown='r', alpha=0.85)
ax.xaxis_date()
ax.autoscale_view()
pylab.setp( pylab.gca().get_xticklabels(), rotation=45, horizontalalignment='right')
# Add volume data
# Note: the code below OVERWRITES the bottom part of the first plot
# it should be plotted UNDERNEATH the first plot - but somehow, that's not happening
fig.subplots_adjust(hspace=0.15)
ay = fig.add_subplot(212)
volumes = [ x[-2] for x in plotdata]
ay.bar(range(len(plotdata)), volumes, 0.05)
pylab.show()
I have managed to display the two plots using the code above, however, there are two problems with the bottom plot:
It COMPLETELY OVERWRITES the bottom part of the first (top) plot - almost as though the second plot was drawing on the same 'canvas' as the first plot - I can't see where/why that is happening.
It OVERWRITES the existing X axis with its own indices; the X axis values (dates) should be SHARED between the two plots.
What am I doing wrong in my code?. Can someone spot what is causing the 2nd (bottom) plot to overwrite the first (top) plot - and how can I fix this?
Here is a screenshot of the plot created by the code above:
[[Edit]]
After modifying the code as suggested by hwlau, this is the new plot. It is better than the first in that the two plots are separate, however the following issues remain:
The X axis should be SHARED by the two plots (i.e. the X axis should be shown only for the 2nd [bottom] plot)
The Y values for the 2nd plot seem to be formatted incorrectly
I think these issues should be quite easy to resolve. However, my matplotlib fu is not great at the moment, as I have only recently started programming with matplotlib. Any help will be much appreciated.
There seem to be a couple of problems with your code:
If you were using fig.add_subplot with the full signature add_subplot(nrows, ncols, plotNum), it may have been more apparent that your first plot was asking for 1 row and 1 column while the second plot was asking for 2 rows and 1 column. Hence your first plot fills the whole figure. Rather than fig.add_subplot(111) followed by fig.add_subplot(212), use fig.add_subplot(211) followed by fig.add_subplot(212).
Sharing an axis should be done in the add_subplot command using sharex=first_axis_instance.
I have put together an example which you should be able to run:
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.dates as mdates
import datetime as dt
n_pts = 10
dates = [dt.datetime.now() + dt.timedelta(days=i) for i in range(n_pts)]
ax1 = plt.subplot(2, 1, 1)
ax1.plot(dates, range(10))
ax2 = plt.subplot(2, 1, 2, sharex=ax1)
ax2.bar(dates, range(10, 20))
# Now format the x axis. This *MUST* be done after all sharex commands are run.
# put no more than 10 ticks on the date axis.
ax1.xaxis.set_major_locator(mticker.MaxNLocator(10))
# format the date in our own way.
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
# rotate the labels on both date axes
for label in ax1.xaxis.get_ticklabels():
    label.set_rotation(30)
for label in ax2.xaxis.get_ticklabels():
    label.set_rotation(30)
# tweak the subplot spacing to fit the rotated labels correctly
plt.subplots_adjust(hspace=0.35, bottom=0.125)
plt.show()
Hope that helps.
You should change this line:
ax = fig.add_subplot(111)
to
ax = fig.add_subplot(211)
The original command means that there is one row and one column, so it occupies the whole figure. Your second graph, fig.add_subplot(212), therefore covers the lower part of the first graph.
Edit
If you don't want the gap between the two plots, use subplots_adjust() to change the spacing between the subplots.
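As a rough illustration of removing the gap, the sketch below stacks two subplots that share the x-axis and sets hspace to 0. The data is made up, and ax_top and ax_bottom are names chosen here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Two stacked subplots (2 rows, 1 column) sharing the x-axis
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, sharex=True)
ax_top.plot(range(10))
ax_bottom.bar(range(10), range(10, 20))

# hspace is the vertical gap between subplots, as a fraction of their height
fig.subplots_adjust(hspace=0)
```

A small positive value such as hspace=0.05 keeps the plots visually separate while still looking like one chart.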
The example from @Pelson, simplified.
import matplotlib.pyplot as plt
import datetime as dt
#Two subplots that share one x axis
fig,ax=plt.subplots(2,sharex=True)
#plot data
n_pts = 10
dates = [dt.datetime.now() + dt.timedelta(days=i) for i in range(n_pts)]
ax[0].bar(dates, range(10, 20))
ax[1].plot(dates, range(10))
#rotate and format the dates on the x axis
fig.autofmt_xdate()
The subplots sharing an x-axis are created in one line, which is convenient when you want more than two subplots:
fig, ax = plt.subplots(number_of_subplots, sharex=True)
To format the date correctly on the x axis, we can simply use fig.autofmt_xdate()
For additional information, see the shared axis demo and date demo from the pylab examples.
This example ran on Python3, matplotlib 1.5.1
