Plot two datasets at same position based on their index - python

I'm trying to plot two datasets (called Height and Temperature) on different y axes.
Both datasets have the same length.
Both datasets are linked together by a third dataset, RH.
I have tried to use matplotlib to plot the data using twinx(), but I am struggling to align both datasets on the same plot.
Here is the plot I want to align.
The horizontal black line on the figure marks the 0°C level found from Height, and was used to test whether the two datasets would be aligned when plotted. They are not: there is a noticeable difference between the black line and the 0°C tick from Temperature.
Rather than the two y axes changing independently from each other I would like to plot each index from Height and Temperature at the same y position on the plot.
Here is the code that I used to create the plot:
import numpy as np
import matplotlib.pyplot as plt
#Define number of subplots sharing y axis
f, ax1 = plt.subplots()
ax1.minorticks_on()
ax1.grid(which='major',axis='both',c='grey')
#Set axis parameters
ax1.set_ylabel('Height $(km)$')
ax1.set_ylim([np.nanmin(Height), np.nanmax(Height)])
#Plot RH
ax1.plot(RH, Height, label='Original', lw=0.5)
#Raw strings keep the backslashes in the mathtext labels intact
ax1.set_xlabel(r'RH $(\%)$')
ax2 = ax1.twinx()
ax2.plot(RH, Temperature, label='Original', lw=0.5, c='black')
ax2.set_ylabel(r'Temperature ($^\circ$C)')
ax2.set_ylim([np.nanmin(Temperature), np.nanmax(Temperature)])
Any help on this would be amazing. Thanks.

Maybe the atmosphere is wrong. :)
It sounds like you are trying to align the two y axes at particular values. Why are you doing this? The relationship of Height vs. Temperature is non-linear, so I think you are setting the stage for a confusing graph. Any particular line you plot can only be interpreted against one vertical axis.
If needed, I think you will be forced to "do some math" on the limits of the y axes. This link may be helpful:
align scales
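For what it's worth, the "math" on the limits could look something like the sketch below. It assumes both y axes are linear, and aligned_limits and its parameters are made-up names for illustration, not part of matplotlib. It picks limits for the second axis so that one chosen value (say 0°C) sits at the same position on the plot as a chosen value on the first axis (the height of the 0°C level), while still covering all the data. Because Height vs. Temperature is non-linear, only that one pair of values is guaranteed to line up.

```python
def aligned_limits(lim1, v1, data2, v2, invert=False):
    """Return (bottom, top) limits for a second linear axis.

    lim1   : (bottom, top) limits of the first axis
    v1     : value on the first axis used as the anchor
    data2  : (min, max) of the data drawn on the second axis
    v2     : value on the second axis that should sit level with v1
    invert : if True, the second axis decreases upwards
    """
    lo1, hi1 = lim1
    # Fractional position of the anchor on the first axis (0 = bottom, 1 = top).
    f = (v1 - lo1) / (hi1 - lo1)
    if not 0.0 < f < 1.0:
        raise ValueError("v1 must lie strictly inside lim1")

    dmin, dmax = data2
    # Data extent that has to fit below and above the anchor on the plot.
    below, above = v2 - dmin, dmax - v2
    if invert:
        below, above = above, below

    # Smallest axis span that keeps all of the data visible.
    span = max(below / f, above / (1.0 - f))
    if span <= 0:
        raise ValueError("data2 has no extent around v2")

    sign = -1.0 if invert else 1.0
    return v2 - sign * f * span, v2 + sign * (1.0 - f) * span


# Hypothetical numbers: heights from 0 to 10 km with the 0 degree level at 4 km,
# temperatures from -60 to 20 degrees.
bottom, top = aligned_limits((0.0, 10.0), 4.0, (-60.0, 20.0), 0.0)
```

With the plot from the question this would be used roughly as ax2.set_ylim(*aligned_limits(ax1.get_ylim(), h0, (np.nanmin(Temperature), np.nanmax(Temperature)), 0.0)), where h0 (an assumed name) is the height of the 0°C level. Pass invert=True if the temperature axis should decrease upwards, as temperature usually does with height.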

Related

How can I make a scatter plot by averaging over points within an equally spaced grid and with a colorbar?

I have some data, where each data point has x and y coordinates and a magnitude assigned to it. I am currently plotting a scatter plot with the colours representing the magnitude of the points.
However, I would now like to group the data into a set of larger "pixels", which are illustrated in the plot below using the dashed grid (i.e. equally spaced square markers of size 0.2*0.2), where the magnitude is given by the average of the magnitudes of the points within the "pixel".
Is there a way to use a scatter plot to do this simply? Or do I need to manipulate the data myself to give this output beforehand?
fig,ax = plt.subplots()
sc = ax.scatter(x_coord, y_coord, s=100, c=magnitude, marker='s')
cbar = fig.colorbar(sc,ax=ax)
cbar.set_label('Magnitude',rotation=90)
ax.set_xlabel('x-position')
ax.set_ylabel('y-position')
Zooming into a part of the plot this gives me:

Add labels ONLY to SELECTED data points in seaborn scatter plot

I have created a seaborn scatter plot and added a trendline to it. I have some datapoints that fall very far away from the trendline (see the ones highlighted in yellow) so I'd like to add data labels only to these points, NOT to all the datapoints in the graph.
Does anyone know what's the best way to do this?
So far I've found answers to "how to add labels to ALL data points" (see this link) but this is not my case.
In the accepted answer to the question that you reference you can see that the way they add labels to all data points is by looping over the data points and calling .text(x, y, string) on the axes. You can find the documentation for this method here (seaborn is implemented on top of matplotlib). You'll have to call this method for the selected points.
In your specific case I don't know exactly what formula you want to use to find your outliers but to literally get the ones beyond the limits of the yellow rectangle that you've drawn you could try the following:
for x, y in zip(xarr, yarr):
    if x < 5 and y > 5.5:
        ax.text(x+0.01, y, 'outlier', horizontalalignment='left', size='medium', color='black')
Where xarr is your x-values, yarr your y-values and ax the returned axes from your call to seaborn.
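If you would rather pick the points by their distance from a trendline than by a fixed rectangle, one possible rule is sketched below. This is only an assumption about what "far away" could mean: it fits a straight least-squares line (which may differ from the trendline seaborn drew) and keeps the points whose residual is larger than k standard deviations of all residuals. The name far_from_trendline and the default k=2.0 are made up for this example.

```python
from statistics import mean, pstdev


def far_from_trendline(xs, ys, k=2.0):
    """Return indices of points whose residual from a least-squares
    straight line exceeds k standard deviations of all residuals."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    cutoff = k * pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) > cutoff]


# Made-up data: points on y = 2x + 1, with one pushed far off the line.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[5] += 10
selected = far_from_trendline(xs, ys)
```

You would then loop over selected and call ax.text(xs[i], ys[i], 'outlier') for each index, exactly as in the loop above.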

How do I plot more than one set of bars per axis on a bar plot in python?

I currently use the align='edge' parameter and positive/negative widths in pyplot.bar() to plot the bar data of one metric to each axis. However, if I try to plot a second set of data to one axis, it covers the first set. Is there a way for pyplot to automatically space this data correctly?
lns3 = ax[1].bar(bucket_df.index,bucket_df.original_revenue,color='c',width=-0.4,align='edge')
lns4 = ax[1].bar(bucket_df.index,bucket_df.revenue_lift,color='m',bottom=bucket_df.original_revenue,width=-0.4,align='edge')
lns5 = ax3.bar(bucket_df.index,bucket_df.perc_first_priced,color='grey',width=0.4,align='edge')
lns6 = ax3.bar(bucket_df.index,bucket_df.perc_revenue_lift,color='y',width=0.4,align='edge')
This is what it looks like when I show the plot:
The data shown in yellow completely covers the data in grey. I'd like it to be shown next to the grey data.
Is there any easy way to do this? Thanks!
The first argument to the bar() plotting method is an array of the x-coordinates for your bars. Since you pass the same x-coordinates they will all overlap. You can get what you want by staggering the bars by doing something like this:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(10) # define your x-coordinates
width = 0.1 # set a width for your bars
offset = 0.15 # define an offset to separate each set of bars
fig, ax = plt.subplots() # define your figure and axes objects
# y1 and y2 are your two sets of bar heights (same length as x)
ax.bar(x, y1, width=width) # plot the first set of bars
ax.bar(x + offset, y2, width=width) # plot the second set of bars
Since you have a few sets of data to plot, it makes more sense to make the code a bit more concise (assume y_vals is a list containing the y-coordinates you'd like to plot, bucket_df.original_revenue, bucket_df.revenue_lift, etc.). Then your plotting code could look like this:
for i, y in enumerate(y_vals):
    ax.bar(x + i * offset, y, width=width)
If you want to plot more sets of bars you can decrease the width and offset accordingly.
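Rather than tuning the width and offset by hand, both can be derived from the number of sets. The sketch below is one way to do that; grouped_bar_layout and the 0.8 group width are assumptions made for this example, not something pyplot provides. It returns a bar width and one offset per set so that the bars sit side by side and the whole group stays centred on its x tick.

```python
def grouped_bar_layout(n_sets, group_width=0.8):
    """Return (bar_width, offsets) for n_sets bars placed side by side,
    centred on each x tick and together spanning group_width."""
    bar_width = group_width / n_sets
    # Offsets of the bar centres relative to the tick, symmetric around 0.
    offsets = [(i - (n_sets - 1) / 2) * bar_width for i in range(n_sets)]
    return bar_width, offsets


bar_width, offsets = grouped_bar_layout(4)
```

The loop then becomes ax.bar(x + offsets[i], y, width=bar_width) with the default align='center'.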

align grid lines on two plots

I have 2 subplots in matplotlib in Python. They are stacked on top of each other.
I want to have gridlines on each plot, which I have done successfully. But each plot has a different x axis and, therefore, the vertical grid lines of the top plot are not aligned with those of the bottom plot.
I would like the grid lines of the top plot to be in the same position on the x axis as they are on the bottom plot i.e. the vertical grid lines in both plots should be aligned.
I imagine that I can tell my grid lines exactly where to be, and so I could achieve my goal by adjusting the lines until they match as well as possible.
I just hoped that there might be some easier way that would just allow me to align the gridlines on both plots.
Edit:
I don't think the shared axis stuff is quite what I want.
My top and bottom plot have very different scales, so when I share the axes, it shifts the scaling too. For example, say my top plot has data that runs from 0-100 on the x axis and on the bottom plot the data runs from 0-50. When I share the axis, the top plot only shows data from 0-50, which I don't want it to.
I want it to show from 0-100 as it did before, but just want it to share the axis and gridlines from the other plot.
You could use LinearLocator:
from matplotlib.ticker import LinearLocator
Then, on each of your x-axes (or only on one of them), call:
N = 6 # Set number of gridlines you want to have in each graph
ax1.xaxis.set_major_locator(LinearLocator(N))
ax2.xaxis.set_major_locator(LinearLocator(N))
Or get the number of ticks from your source axis and set it on target axis:
N = len(source_ax.xaxis.get_major_ticks()) # number of major ticks on the source axis
target_ax.xaxis.set_major_locator(LinearLocator(N))
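The reason this lines the grids up is that LinearLocator(N) spreads N ticks evenly across whatever limits the axis currently has, so two axes of the same physical width get their ticks at the same relative positions even when their scales differ. The sketch below imitates that spacing in plain Python for the 0-100 and 0-50 example from the question. It is not matplotlib's own code, and linear_ticks and fractions are made-up names.

```python
def linear_ticks(vmin, vmax, n):
    """n evenly spaced tick values from vmin to vmax, both ends included."""
    step = (vmax - vmin) / (n - 1)
    return [vmin + i * step for i in range(n)]


def fractions(ticks):
    """Position of each tick as a fraction of the axis length."""
    lo, hi = ticks[0], ticks[-1]
    return [(t - lo) / (hi - lo) for t in ticks]


top_ticks = linear_ticks(0, 100, 6)     # top plot: data from 0 to 100
bottom_ticks = linear_ticks(0, 50, 6)   # bottom plot: data from 0 to 50
same_positions = fractions(top_ticks) == fractions(bottom_ticks)
```

Note that the tick labels follow the axis limits, so they will not necessarily be round numbers.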

Matplotlib: make x-axis longer

In Matplotlib I need to draw a graph with points on the x-axis on each integer between 1 and 5000 and on the y-axis only in a very limited range.
Matplotlib automatically compacts everything to let all the data fit on a (landscape) page. In my case I would like the x-axis to be as large as possible so that all points are clearly visible. Right now there's just a thick coloured line as opposed to scattered points.
How can I do this?
(I'm saving to pdf, if that helps)
You can always try to specify the dimensions (in inches) of the figure you are creating. Something along the following lines might help:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(20, 2)) # 20 inches wide, 2 inches tall
ax = fig.add_subplot(111)
ax.plot(x, y)
The figsize takes a tuple of width, height in inches.
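To get a rough idea of how wide the figure has to be, you can work backwards from how much horizontal room each point should get. The sketch below is plain arithmetic based on 1 inch = 72 points. figure_width_inches, the 2-point spacing and the 1-inch allowance for margins are all assumptions made for this example, and the real plotting area also depends on the subplot margins.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def figure_width_inches(n_points, spacing_pt, margins_in=1.0):
    """Figure width that gives each of n_points roughly spacing_pt points
    of horizontal room, plus a fixed allowance for the margins."""
    return n_points * spacing_pt / POINTS_PER_INCH + margins_in


# 5000 x positions with about 2 points of room each.
width_in = figure_width_inches(5000, 2.0)
```

The result would be passed as the first element of figsize, e.g. plt.figure(figsize=(width_in, 2)).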
