Pandas: let two lines auto scale to view the trend clearly - python

Suppose I have a DataFrame df with two columns, 100cos(x) and sin(x). If I plot both on one graph, it is hard to see the trend of the second compared with the first. How can I autoscale the second one? I just want to view the trend.
Here is the code to illustrate my point:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x = np.arange(0,4*np.pi,0.02)
df = pd.DataFrame({'Sin':np.sin(x), 'Cos':100*np.cos(x)},index=x )
df.Sin.plot()
df.Cos.plot()
plt.show()
I want the blue line to fill almost the whole picture as well. Please do not just multiply it by a number; the point is that the scaling should be automatic, as if I had plotted the blue line alone in the figure.

In case you are too lazy to type much, consider the following solution, which does not need any extra line; just replace df.Cos.plot() by df.Cos.plot(ax=plt.gca().twinx(), color="C1").
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x = np.arange(0,4*np.pi,0.02)
df = pd.DataFrame({'Sin':np.sin(x), 'Cos':100*np.cos(x)},index=x )
df.Sin.plot()
df.Cos.plot(ax=plt.gca().twinx(), color="C1")
plt.show()
However, I would strongly recommend looking closer at the matplotlib example cited in epattaro's answer to understand what's happening.

An easy solution is to use two different y scales, one on each side. Here is an example from the matplotlib documentation:
import numpy as np
import matplotlib.pyplot as plt
fig, ax1 = plt.subplots()
t = np.arange(0.01, 10.0, 0.01)
s1 = np.exp(t)
ax1.plot(t, s1, 'b-')
ax1.set_xlabel('time (s)')
# Make the y-axis label, ticks and tick labels match the line color.
ax1.set_ylabel('exp', color='b')
ax1.tick_params('y', colors='b')
ax2 = ax1.twinx()
s2 = np.sin(2 * np.pi * t)
ax2.plot(t, s2, 'r.')
ax2.set_ylabel('sin', color='r')
ax2.tick_params('y', colors='r')
fig.tight_layout()
plt.show()
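For the DataFrame from the question, pandas can also create the second y-axis itself through the secondary_y argument of DataFrame.plot. The following is a minimal sketch, assuming the same Sin and Cos columns as in the question; the names left_limits and right_limits are only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.arange(0, 4 * np.pi, 0.02)
df = pd.DataFrame({'Sin': np.sin(x), 'Cos': 100 * np.cos(x)}, index=x)

# Columns listed in secondary_y are drawn against a second y-axis on the right,
# so each line is autoscaled independently of the other.
ax = df.plot(secondary_y=['Cos'])
ax.set_ylabel('Sin')
ax.right_ax.set_ylabel('Cos')  # pandas exposes the right-hand axes as ax.right_ax

# Illustrative check: the two axes end up with very different ranges.
left_limits = ax.get_ylim()
right_limits = ax.right_ax.get_ylim()
```

Calling plt.show() afterwards displays the figure. pandas marks the secondary column in the legend with "(right)", which helps when the line colors alone are not enough to tell the scales apart.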

Related

Why doesn't the show() function in matplotlib.pyplot work more than once for the same Axes object?

Out of pure curiosity, I would love to know why the final plt.show() does not display both plots on ax. Only the first plt.show() seems to do anything, because only the plot of y = sin(x) shows up. Here is the code sample:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.linspace(1, 100, 10)
ax.plot(x, np.sin(x))
plt.show()
ax.plot(x, x)
plt.show()
Appreciate any help on this, because it bugs me to not understand why this is the case, even after a lot of searches. PS: I know that the code is useless and dumb, but I would still like to know for future use.
Your code
## load libraries
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
x = np.linspace(1, 100, 10)
## assign first plot
ax.plot(x, np.sin(x))
#plt.show()
## assign second plot
ax.plot(x, x)
## render the plots
plt.show()
Reasons why plt.show() didn't 'show' more than once:
1. You are using subplots.
2. Your plots are on the same axes.
3. plt.show() displays all open figures. Your ax.plot(x, np.sin(x)) is shown and the figure is then closed. The second plot is on the same ax, so it is not shown anymore.
Documentation: matplotlib.pyplot.show()
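If both stages really need their own window, one option is to build a fresh figure for each stage instead of reusing the closed one. This is only a sketch under that assumption; make_figure is a hypothetical helper, not part of matplotlib.

```python
import matplotlib.pyplot as plt
import numpy as np


def make_figure(curves):
    """Build a new figure containing every (x, y) pair in curves."""
    fig, ax = plt.subplots()
    for x, y in curves:
        ax.plot(x, y)
    return fig, ax


x = np.linspace(1, 100, 10)

# Stage 1: a figure with the sine curve only.
fig1, ax1 = make_figure([(x, np.sin(x))])

# Stage 2: a second, independent figure with both curves.
fig2, ax2 = make_figure([(x, np.sin(x)), (x, x)])
```

In a script, a plt.show() call placed after each make_figure call would then display one window per stage, because every call starts from a figure that is still open.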
[Alternate]
If, however, you call plt.plot() separately (without subplots axes), you get two plots, each with its own dimensions.
PS: the code below works in Jupyter (mybinder).
## load libraries
import matplotlib.pyplot as plt1
import numpy as np
x = np.linspace(1, 100, 10)
## first plot
plt1.plot(x, np.sin(x))
## render first plot
plt1.show()
Followed by
## second plot
plt1.plot(x, x)
## render the plot
plt1.show()
Clarity from the documentation: matplotlib.pyplot is a state-based interface to matplotlib
[Updated]
@JohanC: I am lumping together the explanatory code and the alternate code.
Explanations 1, 2, and 3 remain.
The alternate code remains. Alternatively,
the OP can put the two plots on their own axes and have plt.show() display each.
I didn't intend to include this before. However, for completeness:
## load libraries
import matplotlib.pyplot as plt
import numpy as np
## tuple of desired two axes
## unpack AxesSubplot on two rows
fig, (ax1, ax2) = plt.subplots(nrows=2)
## assign variable
x = np.linspace(1, 100, 10)
## assign first plot
ax1.plot(x, np.sin(x))
#plt.show()
## assign second plot
ax2.plot(x, x)
## render the plots
## Rendering might not be 'smooth' in Jupyter
plt.show()

How can I add jitter to my seaborn and matplotlib plots?

I am trying to add jitter to my plots using seaborn and matplotlib. I am getting mixed information from what I am reading online. Some sources say that custom code is needed, while others show it as being as simple as jitter = True. Is there another library or something else that I should be importing that I am not aware of? Below is the code that I am running and trying to add jitter to:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
filename = 'https://library.startlearninglabs.uw.edu/DATASCI410/Datasets/JitteredHeadCount.csv'
headcount_df = pd.read_csv(filename)
headcount_df.describe()
%matplotlib inline
ax = plt.figure(figsize=(12, 6)).gca() # define axis
headcount_df.plot.scatter(x = 'Hour', y = 'TablesOpen', ax = ax, alpha = 0.2)
# auto_price.plot(kind = 'scatter', x = 'city-mpg', y = 'price', ax = ax)
ax.set_title('Hour vs TablesOpen') # Give the plot a main title
ax.set_ylabel('TablesOpen')# Set text for y axis
ax.set_xlabel('Hour')
ax = sns.kdeplot(headcount_df.loc[:, ['TablesOpen', 'Hour']], shade = True, cmap = 'PuBu')
headcount_df.plot.scatter(x = 'Hour', y = 'TablesOpen', ax = ax, jitter = True)
ax.set_title('Hour vs TablesOpen') # Give the plot a main title
ax.set_ylabel('TablesOpen')# Set text for y axis
ax.set_xlabel('Hour')
I receive the error AttributeError: 'PathCollection' object has no property 'jitter' when trying to add the jitter. Any help or more information on this would be much appreciated.
To add jitter to a scatter plot, first get a handle to the collection that contains the scatter dots. When a scatter plot is just created on an ax, ax.collections[-1] will be the desired collection.
Calling get_offsets() on the collection gets all the xy coordinates of the dots. Add some small random number to each of them. As in this case all coordinates are integers, adding a random number between 0 and 1 spreads the dots out evenly.
In this case the number of dots is very large. To better see where the dots are concentrated, they can be made very small (marker=',', linewidth=0, s=1) and very transparent (e.g. alpha=0.1).
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
filename = 'https://library.startlearninglabs.uw.edu/DATASCI410/Datasets/JitteredHeadCount.csv'
headcount_df = pd.read_csv(filename)
fig, ax = plt.subplots(figsize=(12, 6))
headcount_df.plot.scatter(x='Hour', y='TablesOpen', marker=',', linewidth=0, s=1, alpha=.1, color='crimson', ax=ax)
dots = ax.collections[-1]
offsets = dots.get_offsets()
jittered_offsets = offsets + np.random.uniform(0, 1, offsets.shape)
dots.set_offsets(jittered_offsets)
ax.set_title('Hour vs TablesOpen') # Give the plot a main title
ax.set_ylabel('TablesOpen') # Set text for y axis
ax.set_xlabel('Hour')
ax.set_xticks(range(25))
ax.autoscale(enable=True, tight=True)
plt.tight_layout()
plt.show()
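The jitter step described above can be wrapped in a small reusable helper. The sketch below uses synthetic integer data instead of the CSV file, and jitter_last_scatter is a hypothetical name chosen for illustration; it assumes the scatter plot is the most recently added collection on the axes.

```python
import matplotlib.pyplot as plt
import numpy as np


def jitter_last_scatter(ax, width=1.0, rng=None):
    """Shift each dot of the newest scatter on ax by a random amount in [0, width)."""
    rng = np.random.default_rng() if rng is None else rng
    dots = ax.collections[-1]                  # the collection holding the scatter dots
    offsets = np.asarray(dots.get_offsets())   # (n, 2) array of xy coordinates
    jittered = offsets + rng.uniform(0, width, offsets.shape)
    dots.set_offsets(jittered)
    return jittered


# Synthetic stand-in for the integer-valued Hour and TablesOpen columns.
rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=500)
tables = rng.integers(0, 10, size=500)
original = np.column_stack([hours, tables]).astype(float)

fig, ax = plt.subplots()
ax.scatter(hours, tables, s=1, alpha=0.3)
jittered = jitter_last_scatter(ax, rng=rng)
```

Because the helper only touches the collection's offsets, the underlying data stays unchanged; only the drawn positions move.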
As there are a huge number of points, drawing the 2D kde takes a long time. The time can be reduced by taking a random sample from the rows. Note that to draw a 2D kde, the latest versions of Seaborn want each column as a separate parameter.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
filename = 'https://library.startlearninglabs.uw.edu/DATASCI410/Datasets/JitteredHeadCount.csv'
headcount_df = pd.read_csv(filename)
fig, ax = plt.subplots(figsize=(12, 6))
N = 5000
rand_sel_df = headcount_df.iloc[np.random.choice(range(len(headcount_df)), N)]
# seaborn 0.11+ takes the columns as x= and y= keywords; fill= replaces the deprecated shade=
ax = sns.kdeplot(x=rand_sel_df['Hour'], y=rand_sel_df['TablesOpen'], fill=True, cmap='PuBu', ax=ax)
ax.set_title('Hour vs TablesOpen')
ax.set_xticks(range(25))
plt.tight_layout()
plt.show()
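The row sampling can also be done with pandas directly. As a hedged alternative to the np.random.choice call above, DataFrame.sample draws rows without replacement by default, so no row appears twice. The DataFrame here is synthetic and demo_df is an illustrative name.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for headcount_df with the two columns used in the kde plot.
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    'Hour': rng.integers(0, 24, size=100_000),
    'TablesOpen': rng.integers(0, 10, size=100_000),
})

N = 5000
# Draw N distinct rows; random_state makes the selection reproducible.
sample_df = demo_df.sample(n=N, random_state=0)
```

The resulting sample_df can then be passed to the kde plot in place of rand_sel_df.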

Generate several plots in separate figures in one Python script

First of all, I apologize if this question has already been asked and answered. I haven't found anything specific about this, so if you have, please share it and I will delete this post.
What I would like to do is simply generate several plots one after another, each in a separate figure, in Python. I have an exercise sheet where a) is to plot a Poisson distribution, b) is to plot a binomial distribution, and so on with c) and d), and I would like the plots to be gathered in the same script but in separate figures.
I tried something as simple as creating a sin(x) and a cos(x) plot one after another, but it didn't work: the sin and cos were displayed in the same plot. My code was:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = plt.plot(np.sin(x))
ax2 = plt.plot(np.cos(x))
ax1.set_xlabel('Time (s)')
ax1.set_title('sin')
ax1.legend()
ax2.set_xlabel('Time (s)')
ax2.set_title('cos')
ax2.legend()
plt.show()
Could anyone help me?
How about this?
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 100)
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212, sharex=ax1)
ax1.plot(np.sin(x))
ax2.plot(np.cos(x))
plt.show()
I suggest you should read a simple tutorial about subplots.
EDIT:
To create separate figures:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 100)
plt.figure()
plt.plot(np.sin(x))
plt.figure()
plt.plot(np.cos(x))
plt.show()
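The question also set titles, axis labels, and legends. A minimal sketch of the same separate-figure idea in the object-oriented style, where each plt.subplots() call returns its own figure and axes, could look as follows. The names fig_sin, ax_sin, fig_cos, and ax_cos are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

# First figure: sin(x) with its own title, label and legend.
fig_sin, ax_sin = plt.subplots()
ax_sin.plot(x, np.sin(x), label='sin')
ax_sin.set_xlabel('Time (s)')
ax_sin.set_title('sin')
ax_sin.legend()

# Second figure: cos(x), completely independent of the first.
fig_cos, ax_cos = plt.subplots()
ax_cos.plot(x, np.cos(x), label='cos')
ax_cos.set_xlabel('Time (s)')
ax_cos.set_title('cos')
ax_cos.legend()
```

A single plt.show() at the end of the script then opens both figures. Setting labels on the axes objects avoids the problem in the question, where ax1 and ax2 were the return values of plt.plot (lists of lines), not axes.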

Histogram at specific coordinates inside axes

What I want to achieve with Python 3.6 is something like this:
Obviously made in Paint and missing some ticks on the x-axis. Is something like this possible? Essentially, can I control exactly where to plot a histogram (and with what orientation)?
I specifically want them to be on the same axes just like the figure above and not on separate axes or subplots.
fig = plt.figure()
ax2Handler = fig.gca()
ax2Handler.scatter(np.array(np.arange(0,len(xData),1)), xData)
ax2Handler.hist(xData,bins=60,orientation='horizontal',normed=True)
This and other approaches (such as inverting the axes) gave me no results. xData is loaded from a pandas DataFrame.
# This also doesn't work as intended
fig = plt.figure()
axHistHandler = fig.gca()
axScatterHandler = fig.gca()
axHistHandler.invert_xaxis()
axHistHandler.hist(xData,orientation='horizontal')
axScatterHandler.scatter(np.array(np.arange(0,len(xData),1)), xData)
A. using two axes
There is simply no reason not to use two different axes. The plot from the question can easily be reproduced with two different axes:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")
xData = np.random.rand(1000)
fig,(ax,ax2)= plt.subplots(ncols=2, sharey=True)
fig.subplots_adjust(wspace=0)
ax2.scatter(np.linspace(0,1,len(xData)), xData, s=9)
ax.hist(xData,bins=60,orientation='horizontal',density=True)  # density replaces the removed normed argument
ax.invert_xaxis()
ax.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax2.tick_params(axis="y", left=0)
plt.show()
B. using a single axes
Just for the sake of answering the question: In order to plot both in the same axes, one can shift the bars by their length towards the left, effectively giving a mirrored histogram.
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")
xData = np.random.rand(1000)
fig,ax= plt.subplots(ncols=1)
fig.subplots_adjust(wspace=0)
ax.scatter(np.linspace(0,1,len(xData)), xData, s=9)
xlim1 = ax.get_xlim()
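_,__,bars = ax.hist(xData,bins=60,orientation='horizontal',density=True)  # density replaces the removed normed argument
for bar in bars:
    # move each bar left by its own width so it ends at x = 0
    bar.set_x(-bar.get_width())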
xlim2 = ax.get_xlim()
ax.set_xlim(-xlim2[1],xlim1[1])
plt.show()
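A different way to get the same mirrored effect on a single axes, shown here only as a sketch, is to compute the histogram with np.histogram and draw the bins with ax.barh using negative widths. This swaps the bar-shifting loop for a direct draw; x_data is a random stand-in for the xData column from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x_data = rng.random(1000)  # stand-in for xData

fig, ax = plt.subplots()
ax.scatter(np.linspace(0, 1, len(x_data)), x_data, s=9)

# Compute the histogram separately, then draw each bin as a horizontal bar
# whose width is negative, so it extends to the left of x = 0.
counts, edges = np.histogram(x_data, bins=60, density=True)
bars = ax.barh(edges[:-1], -counts, height=np.diff(edges), align='edge')
```

Since the bars are drawn at their final position, the axes limits are autoscaled to include both the histogram and the scatter without a manual set_xlim call.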
You might be interested in seaborn jointplots:
# Import and fake data
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(2,1000)
# actual plot
jg = sns.jointplot(x=data[0], y=data[1], marginal_kws={"bins":100})  # recent seaborn requires x= and y= keywords
jg.ax_marg_x.set_visible(False) # remove the top axis
plt.subplots_adjust(top=1.15) # fill the empty space
produces this:
See more examples of bivariate distribution representations, available in Seaborn.

How to plot figures in subplots (Matplotlib)

I understand there are various ways to plot multiple graphs in one figure. One such way is using axes, e.g.
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot([range(8)])
ax.plot(...)
Since I have a function that beautifies my graphs and subsequently returns a figure, I would like to use that figure to be plotted in my subplots. It should look similar to this:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(figure1) # where figure is a plt.figure object
ax.plot(figure2)
This does not work but how can I make it work? Is there a way to put figures inside subplots or a workaround to plot multiple figures in one overall figure?
Any help on this is much appreciated.
Thanks in advance for your comments.
If the goal is just to customize individual subplots, why not change your function so that it modifies the current figure on the fly rather than returning a figure? With matplotlib and seaborn, can you just change the plot settings as the plots are being drawn?
import numpy as np
import matplotlib.pyplot as plt
plt.figure()
x1 = np.linspace(0.0, 5.0)
x2 = np.linspace(0.0, 2.0)
y1 = np.cos(2 * np.pi * x1) * np.exp(-x1)
y2 = np.cos(2 * np.pi * x2)
plt.subplot(2, 1, 1)
plt.plot(x1, y1, 'ko-')
plt.title('A tale of 2 subplots')
plt.ylabel('Damped oscillation')
import seaborn as sns
plt.subplot(2, 1, 2)
plt.plot(x2, y2, 'r.-')
plt.xlabel('time (s)')
plt.ylabel('Undamped')
plt.show()
Perhaps I don't understand your question entirely. Is this 'beautification' function complex?...
A possible solution is
import matplotlib.pyplot as plt
# Create two subplots horizontally aligned (one row, two columns)
fig, ax = plt.subplots(1,2)
# Note that ax is now an array consisting of the individual axis
ax[0].plot(data1)
ax[1].plot(data2)
However, for this to work, data1 and data2 need to be data, not figures. If you have a function which already plots the data for you, I would recommend adding an axis argument to your function. For example:
def my_plot(data, ax=None):
    if ax is None:
        # your previous code
        pass
    else:
        # your modified code which plots directly to the axis
        # for example:
        ax.plot(data)
Then you can plot it like
import matplotlib.pyplot as plt
# Create two subplots horizontally aligned
fig, ax = plt.subplots(2)
# Note that ax is now an array consisting of the individual axis
my_plot(data1,ax=ax[0])
my_plot(data2,ax=ax[1])
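The same pattern applies to a styling function: instead of returning a figure, it can accept an axes and style it in place. The sketch below assumes a hypothetical beautify helper and made-up styling choices, since the original beautification function is not shown.

```python
import numpy as np
import matplotlib.pyplot as plt


def beautify(ax, title):
    """Apply illustrative styling to an existing axes and return it."""
    ax.set_title(title)
    ax.grid(True, alpha=0.3)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    return ax


x = np.linspace(0, 2 * np.pi, 100)

# One row, two columns; axes is an array of the individual axes.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].plot(x, np.sin(x))
axes[1].plot(x, np.cos(x))

beautify(axes[0], 'sin')
beautify(axes[1], 'cos')
```

Because the function works on whatever axes it receives, the same code serves a standalone plot (pass the axes from plt.subplots()) and any cell of a subplot grid.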
