I'm making a density plot with matplotlib and would also like to draw a rug plot under it. A good example of a density plot is here: How to create a density plot in matplotlib?
But I couldn't find a good example of a rug plot. In R it can be done easily with rug(data).
You can plot markers at each datapoint.
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
sample = np.hstack((np.random.randn(30), np.random.randn(20)+5))
density = stats.gaussian_kde(sample)
fig, ax = plt.subplots(figsize=(8,4))
x = np.arange(-6,12,0.1)
ax.plot(x, density(x))
ax.plot(sample, [0.01]*len(sample), '|', color='k')
You can find an example here!
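# Assumes np, plt and stats are imported as in the answer above; x1 is sample data assumed for illustration
x1 = np.array([-7, -5, 1, 4, 5], dtype=float)
kde1 = stats.gaussian_kde(x1)  # default bandwidth (Scott's rule)
kde2 = stats.gaussian_kde(x1, bw_method='silverman')
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x1, np.zeros(x1.shape), 'b+', ms=20) # rug plot
x_eval = np.linspace(-10, 10, num=200)
ax.plot(x_eval, kde1(x_eval), 'k-', label="Scott's Rule")
ax.plot(x_eval, kde2(x_eval), 'r-', label="Silverman's Rule")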
Seems to be the core of it!
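You can also use seaborn's distplot, which wraps the histogram, KDE and rug in a single call. Figures made by seaborn are also prettier by default. (distplot is deprecated since seaborn 0.11; kdeplot and rugplot cover the same ground in newer versions.)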
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sample = np.hstack((np.random.randn(30), np.random.randn(20)+5))
fig, ax = plt.subplots(figsize=(8,4))
sns.distplot(sample, rug=True, hist=False, rug_kws={"color": "g"},
kde_kws={"color": "k", "lw": 3})
plt.show()
Here's the answer for people just looking for a rugplot to use on a matplotlib axis: you can use a seaborn function.
import seaborn as sns
sns.rugplot(xdata, height=0.025, ax=ax, color='k')
This looks much nicer than a pure-matplotlib kludge because the rug is aligned to (flush with) the x-axis.
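For anyone who would rather stay in plain matplotlib, here is a minimal sketch of one way to get ticks that sit flush on the x-axis. It assumes the same kind of sample as the first answer and uses ax.vlines with ax.get_xaxis_transform(), so x is read in data coordinates and y in axes coordinates. The 5% tick height and the Agg backend are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = np.hstack((rng.normal(size=30), rng.normal(loc=5, size=20)))
density = stats.gaussian_kde(sample)

fig, ax = plt.subplots(figsize=(8, 4))
x = np.linspace(-6, 12, 200)
ax.plot(x, density(x))

# x is in data coordinates, y is in axes coordinates (0 = bottom edge,
# 1 = top edge), so each tick starts on the x-axis and spans 5% of the
# axes height regardless of the y-limits.
rug = ax.vlines(sample, 0, 0.05, color="k",
                transform=ax.get_xaxis_transform())
fig.canvas.draw()
```

In an interactive session you would drop the matplotlib.use("Agg") line and call plt.show() instead of fig.canvas.draw().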
I'm working with data that has 3 plotting parameters: x, y, c. How do you create a custom color value for a scatter plot?
Extending this example I'm trying to do:
import matplotlib
import matplotlib.pyplot as plt
cm = matplotlib.cm.get_cmap('RdYlBu')
colors=[cm(1.*i/20) for i in range(20)]
xy = range(20)
plt.subplot(111)
colorlist=[colors[x//2] for x in xy] #actually some other non-linear relationship
plt.scatter(xy, xy, c=colorlist, s=35, vmin=0, vmax=20)
plt.colorbar()
plt.show()
but the result is TypeError: You must first set_array for mappable
From the matplotlib docs on scatter:
cmap is only used if c is an array of floats
So colorlist needs to be a list of floats rather than a list of tuples as you have it now.
plt.colorbar() wants a mappable object, like the collection that plt.scatter() returns (a PathCollection in current matplotlib).
vmin and vmax can then control the limits of your colorbar. Things outside vmin/vmax get the colors of the endpoints.
How does this work for you?
import matplotlib.pyplot as plt
cm = plt.get_cmap('RdYlBu')
xy = range(20)
z = xy
sc = plt.scatter(xy, xy, c=z, vmin=0, vmax=20, s=35, cmap=cm)
plt.colorbar(sc)
plt.show()
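The question mentions a non-linear relationship between the value and its color. One hedged way to sketch that is to keep c as plain floats and move the non-linearity into a norm. PowerNorm with gamma=0.5 is only an illustrative choice here, not something from the question, and z is assumed to be equal to x.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import PowerNorm

xy = np.arange(20)
z = xy.astype(float)  # value to color by (assumed equal to x here)

# PowerNorm(gamma=0.5) maps value -> ((value - vmin) / (vmax - vmin)) ** 0.5,
# so the limits go into the norm instead of scatter's vmin/vmax arguments.
norm = PowerNorm(gamma=0.5, vmin=0, vmax=20)

fig, ax = plt.subplots()
sc = ax.scatter(xy, xy, c=z, s=35, cmap="RdYlBu", norm=norm)
fig.colorbar(sc, ax=ax)  # the colorbar picks up the same norm
fig.canvas.draw()
```

Because the scatter still receives floats, the colormap and the colorbar stay linked, which is what the list of RGBA tuples in the question prevented.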
Here is the OOP way of adding a colorbar:
fig, ax = plt.subplots()
im = ax.scatter(x, y, c=c)
fig.colorbar(im, ax=ax)
If you're looking to scatter by two variables and color by the third, Altair can be a great choice.
Creating the dataset
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame(40*np.random.randn(10, 3), columns=['A', 'B','C'])
Altair plot
import altair as alt
alt.Chart(df).mark_circle().encode(x='A', y='B', color='C').properties(width=200, height=150)
I am generating a KDE plot and adding the data points as a scatter plot as well. I am using the vline marker "|" for this scatter plot. How can I increase the thickness of this marker? Increasing s=200 to s=1000 increases the height as well. Is there a way to change the thickness without changing the height?
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
numberList = np.random.rand(20)
ax = sns.kdeplot(numberList)
ax = sns.scatterplot(x=numberList, y=0.1, marker='|', s=200)
plt.show()
I found a solution. Adding a parameter linewidth=3 helped.
ax = sns.scatterplot(x=numberList, y=0.1, marker="|", s=200, linewidth=3)
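The same separation of height and thickness can be sketched in plain matplotlib, where markersize controls the length of the "|" marker and markeredgewidth its stroke width. The sizes below are arbitrary and the data is a stand-in for numberList.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
number_list = rng.random(20)  # stand-in for numberList from the question

fig, ax = plt.subplots()
# markersize sets the height of the "|" marker (in points);
# markeredgewidth sets its thickness (in points), independently of height.
(ticks,) = ax.plot(number_list, np.full_like(number_list, 0.1), "|",
                   markersize=15, markeredgewidth=3, color="k")
fig.canvas.draw()
```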
I want to specify the color of a line of fit within the seaborn package for an array of x and y data. Instead all I can figure out is how to change the color and shading for the kernel density function. How can I change the color for a gaussian fit? I.e. the lines below should be red and blue. It would also be great to shade in the function like the "shade":True argument.
import seaborn as sns
sns.distplot(x,kde_kws={"shade":True}, kde=False, fit=stats.gamma, hist=None, color="red", label="label 1");
sns.distplot(y,kde_kws={"shade":True}, kde=False, fit=stats.gamma, hist=None, color="blue", label="label 2");
To change the color of the fitted curve, set the fit_kws argument. fit_kws does not support shading, though. You can still shade the area under the fitted curve with a few extra lines of code, but I think that belongs in an answer to the other question you posted.
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
sns.set()
np.random.seed(0)
x = np.random.randn(100)
y = np.random.normal(loc=6.0, scale=1, size=(50,))
ax = sns.distplot(x, fit_kws={"color":"red"}, kde=False,
fit=stats.gamma, hist=None, label="label 1");
ax = sns.distplot(y, fit_kws={"color":"blue"}, kde=False,
fit=stats.gamma, hist=None, label="label 2");
plt.show(block=False)
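As for the shading, here is a rough sketch of how it could be done by hand, assuming you are willing to fit the gamma distribution yourself with scipy and draw it with matplotlib instead of going through distplot. The evaluation grid and the alpha value are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

np.random.seed(0)
x = np.random.randn(100)
y = np.random.normal(loc=6.0, scale=1, size=(50,))

fig, ax = plt.subplots()
for data, color, label in [(x, "red", "label 1"), (y, "blue", "label 2")]:
    params = stats.gamma.fit(data)  # returns (shape, loc, scale)
    grid = np.linspace(data.min() - 1, data.max() + 1, 200)
    pdf = stats.gamma.pdf(grid, *params)
    ax.plot(grid, pdf, color=color, label=label)        # fitted curve
    ax.fill_between(grid, pdf, color=color, alpha=0.3)  # shaded area under it
ax.legend()
fig.canvas.draw()
```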
The result of the code is shown below:
What I want to achieve with Python 3.6 is something like this:
Obviously made in Paint and missing some ticks on the x-axis. Is something like this possible? Essentially, can I control exactly where to plot a histogram (and with what orientation)?
I specifically want them to be on the same axes just like the figure above and not on separate axes or subplots.
fig = plt.figure()
ax2Handler = fig.gca()
ax2Handler.scatter(np.array(np.arange(0,len(xData),1)), xData)
ax2Handler.hist(xData,bins=60,orientation='horizontal',density=True)
This and other approaches (such as inverting the axes) gave me no results. xData is loaded from a pandas DataFrame.
# This also doesn't work as intended
fig = plt.figure()
axHistHandler = fig.gca()
axScatterHandler = fig.gca()
axHistHandler.invert_xaxis()
axHistHandler.hist(xData,orientation='horizontal')
axScatterHandler.scatter(np.array(np.arange(0,len(xData),1)), xData)
A. using two axes
There is simply no reason not to use two different axes. The plot from the question can easily be reproduced with two different axes:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")
xData = np.random.rand(1000)
fig,(ax,ax2)= plt.subplots(ncols=2, sharey=True)
fig.subplots_adjust(wspace=0)
ax2.scatter(np.linspace(0,1,len(xData)), xData, s=9)
ax.hist(xData,bins=60,orientation='horizontal',density=True)
ax.invert_xaxis()
ax.spines['right'].set_visible(False)
ax2.spines['left'].set_visible(False)
ax2.tick_params(axis="y", left=0)
plt.show()
B. using a single axes
Just for the sake of answering the question: In order to plot both in the same axes, one can shift the bars by their length towards the left, effectively giving a mirrored histogram.
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")
xData = np.random.rand(1000)
fig,ax= plt.subplots(ncols=1)
fig.subplots_adjust(wspace=0)
ax.scatter(np.linspace(0,1,len(xData)), xData, s=9)
xlim1 = ax.get_xlim()
_,__,bars = ax.hist(xData,bins=60,orientation='horizontal',density=True)
for bar in bars:
    bar.set_x(-bar.get_width())
xlim2 = ax.get_xlim()
ax.set_xlim(-xlim2[1],xlim1[1])
plt.show()
You might be interested in seaborn jointplots:
# Import and fake data
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(2,1000)
# actual plot
jg = sns.jointplot(x=data[0], y=data[1], marginal_kws={"bins":100})
jg.ax_marg_x.set_visible(False) # remove the top axis
plt.subplots_adjust(top=1.15) # fill the empty space
produces this:
See more examples of bivariate distribution representations, available in Seaborn.
I am drawing two subplots with Matplotlib, essentially as follows:
subplot(211); imshow(a); scatter(..., ...)
subplot(212); imshow(b); scatter(..., ...)
Can I draw lines between those two subplots? How would I do that?
The solutions from the other answers are suboptimal in many cases, as they only work if no changes are made to the plot after the points are calculated.
A better solution would use the specially designed ConnectionPatch:
import matplotlib.pyplot as plt
from matplotlib.patches import ConnectionPatch
import numpy as np
fig = plt.figure(figsize=(10,5))
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
x,y = np.random.rand(100),np.random.rand(100)
ax1.plot(x,y,'ko')
ax2.plot(x,y,'ko')
i = 10
xy = (x[i],y[i])
con = ConnectionPatch(xyA=xy, xyB=xy, coordsA="data", coordsB="data",
axesA=ax2, axesB=ax1, color="red")
ax2.add_artist(con)
ax1.plot(x[i],y[i],'ro',markersize=10)
ax2.plot(x[i],y[i],'ro',markersize=10)
plt.show()
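Closer to the layout in the question (two stacked subplots with imshow and scatter), the same idea might look like the sketch below. The images and the pixel positions are placeholders, and adding the patch with fig.add_artist rather than to one of the axes is my own choice, made so the line is not tied to either axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import ConnectionPatch

rng = np.random.default_rng(0)
a, b = rng.random((20, 20)), rng.random((20, 20))  # placeholder images
points = [(3, 4), (10, 15), (17, 8)]               # placeholder (column, row) positions

fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, figsize=(4, 8))
ax_top.imshow(a)
ax_bottom.imshow(b)

connections = []
for px, py in points:
    ax_top.scatter(px, py, color="red")
    ax_bottom.scatter(px, py, color="red")
    # Both end points are given in the data coordinates of their own axes,
    # so the line follows the points if limits or layout change later.
    con = ConnectionPatch(xyA=(px, py), coordsA="data", axesA=ax_top,
                          xyB=(px, py), coordsB="data", axesB=ax_bottom,
                          color="red")
    fig.add_artist(con)  # figure-level artist
    connections.append(con)
fig.canvas.draw()
```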
You could use fig.lines. It lets you add any line to your figure. Figure lines sit at a higher level than axes lines, so you don't need an axes to draw them.
This example marks the same point on the two axes. It's necessary to be careful with the coordinate system, but the transform does all the hard work for you.
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
fig = plt.figure(figsize=(10,5))
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
x,y = np.random.rand(100),np.random.rand(100)
ax1.plot(x,y,'ko')
ax2.plot(x,y,'ko')
i = 10
transFigure = fig.transFigure.inverted()
coord1 = transFigure.transform(ax1.transData.transform([x[i],y[i]]))
coord2 = transFigure.transform(ax2.transData.transform([x[i],y[i]]))
line = matplotlib.lines.Line2D((coord1[0],coord2[0]),(coord1[1],coord2[1]),
transform=fig.transFigure)
fig.lines = line,
ax1.plot(x[i],y[i],'ro',markersize=20)
ax2.plot(x[i],y[i],'ro',markersize=20)
plt.show()
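The coordinate conversion used above can be pulled out into a small helper. This is only a sketch: data_to_figure is a hypothetical name, and the fig.canvas.draw() call before converting is there because, as far as I know, recent matplotlib versions apply autoscaling lazily, so the data transform may not reflect the final limits until the figure has been drawn once.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def data_to_figure(ax, point):
    """Convert a data-space point of `ax` to figure-fraction coordinates."""
    display = ax.transData.transform(point)                      # data -> pixels
    return ax.figure.transFigure.inverted().transform(display)   # pixels -> figure


fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 5))
ax1.plot([0, 1], [0, 1], "ko")
ax2.plot([0, 1], [0, 1], "ko")

fig.canvas.draw()  # settle axis limits and layout before converting

p1 = data_to_figure(ax1, (0.5, 0.5))
p2 = data_to_figure(ax2, (0.5, 0.5))
```

The resulting coordinates are only valid for the layout at the moment of conversion; if the figure is resized or the limits change afterwards, they have to be recomputed.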
I'm not sure if this is exactly what you are looking for, but here is a simple trick to plot across subplots.
import matplotlib.pyplot as plt
import numpy as np
ax1=plt.figure(1).add_subplot(211)
ax2=plt.figure(1).add_subplot(212)
x_data=np.linspace(0,10,20)
ax1.plot(x_data, x_data**2,'o')
ax2.plot(x_data, x_data**3, 'o')
ax3 = plt.figure(1).add_subplot(111)
ax3.plot([5,5],[0,1],'--')
ax3.set_xlim([0,10])
ax3.axis("off")
plt.show()