How can I draw a graph showing the distribution of a group of numbers, using curves filled with different colours, in Python? An example graph is shown below. Can I do that with Matplotlib or other packages?
Pandas and Matplotlib are what you are looking for.
If you have seaborn, this can be done using distplot or, more specifically, kdeplot, as shown below.
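import numpy as np
import seaborn as sns

a = np.random.normal(0.5, 0.5, 1000)  # 1000 samples with mean 0.5 and std 0.5
sns.distplot(a)  # histogram plus KDE curve; deprecated since seaborn 0.11 (see histplot/displot)
# or kdeplot
sns.kdeplot(a, shade=True)  # filled KDE curve; newer seaborn versions use fill=True instead of shade=True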
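If the goal is a single density curve whose regions are filled with different colours, a minimal sketch with plain Matplotlib and SciPy might look like the following. The choice of quartiles as band edges, the colour list, and the output file name are assumptions made for illustration; the original example graph may use different regions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
a = rng.normal(0.5, 0.5, 1000)  # same kind of sample as in the answer above

# Estimate the density on a regular grid spanning the data
kde = gaussian_kde(a)
xs = np.linspace(a.min(), a.max(), 400)
density = kde(xs)

# Assumed banding: one colour per quartile of the data
edges = np.quantile(a, [0.0, 0.25, 0.5, 0.75, 1.0])
colours = ["tab:blue", "tab:orange", "tab:green", "tab:red"]

fig, ax = plt.subplots()
ax.plot(xs, density, color="black")
for lo, hi, colour in zip(edges[:-1], edges[1:], colours):
    mask = (xs >= lo) & (xs <= hi)
    # Fill the area between the curve and zero for this band only
    ax.fill_between(xs[mask], density[mask], color=colour, alpha=0.6)
fig.savefig("filled_kde.png")  # hypothetical output file name
```

Swapping the quantile list for any other set of x positions changes where the colours switch.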
I'm trying to visualize the frequency of items, imported from reports, in an array using Python. I'm new to drawing graphs, so how can I do it, using any module?
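To draw graphs in Python, you can do the following:
import matplotlib.pyplot as plt  # pyplot is the plotting interface, not the top-level matplotlib package
plt.plot(array_1, array_2)  # array_1: x values, array_2: y values
plt.show()
I highly recommend checking out the Matplotlib docs.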
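For the frequency question specifically, one possible approach is to count the items first and then draw a bar chart. This is only a sketch: the `items` list is hypothetical stand-in data, since the question does not show the actual array, and the output file name is likewise an assumption.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

# Hypothetical data standing in for the items imported from reports
items = ["apple", "pear", "apple", "kiwi", "pear", "apple"]

# Count how often each item occurs, most frequent first
counts = Counter(items)
labels, freqs = zip(*counts.most_common())

fig, ax = plt.subplots()
ax.bar(labels, freqs)
ax.set_xlabel("item")
ax.set_ylabel("frequency")
fig.savefig("item_frequency.png")  # hypothetical output file name
```

For numeric data with many distinct values, `ax.hist(values)` may be a better fit than counting exact matches.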
I've used seaborn plots several times while following an online course. The plots shown in the course look quite different from the ones on my computer. Is this because of something in the code or in the graphics setup?
Plot on my computer:
Original plot
Supposing the code being run is exactly the same, the reason would be that you are using a newer version of seaborn than the online course. Since seaborn 0.8, the default seaborn style is no longer applied automatically on import.
To have your graphics appear in the same manner as in the online tutorial, you may call:
import seaborn as sns
sns.set()
I would like to make a letter value plot using seaborn from a dataset that is too large to load into memory. Normally, I would do this:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_pickle('results_filename')
sns.lvplot(data=df, x='independent_variable', y='dependent_variable', hue='categorical_variable')
plt.show() # or savefig(...)
But my PC starts thrashing when results_filename is more than a few GB in size.
I can calculate the KDEs or histograms and save those to disk instead of the observations themselves.
I do that often when plotting stuff with pure matplotlib.
But is it possible to use sns.lvplot (or something similar) without passing the observations themselves to it?
I can't figure out how to plot a normal distribution for the blue bar plot in this graph.
The y-axis is made from a list like this: [0,0,1,34,.....,34,2,0,0]
And the x-axis is just: np.arange(len(list_above))
I've tried several things, but they all produced a single vertical line.
So how can I plot a normal distribution for the blue bar plot?
Use matplotlib.pyplot (as plt) and NumPy (as np), together with from scipy.stats import norm:
plt.plot(dataset, norm.pdf(dataset, np.mean(dataset), np.std(dataset)))  # dataset is assumed to hold the sorted x values
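Because the list in the question holds bar heights (counts) rather than raw observations, a possible cause of the vertical line is taking the mean and standard deviation of the counts themselves. A hedged sketch of one alternative is to compute the mean and standard deviation of the x positions, weighted by the counts, and scale the density to the bar heights. The `counts` array below is hypothetical, since the question elides the real values, and a bin width of 1 is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm

# Hypothetical bar heights standing in for the list in the question
counts = np.array([0, 0, 1, 4, 12, 25, 34, 25, 12, 4, 1, 0, 0])
x = np.arange(len(counts))

# Mean and standard deviation of the x positions, weighted by the counts
mean = np.average(x, weights=counts)
std = np.sqrt(np.average((x - mean) ** 2, weights=counts))

# Scale the density by the total count so it matches the bar heights (bin width 1)
curve = norm.pdf(x, mean, std) * counts.sum()

fig, ax = plt.subplots()
ax.bar(x, counts, color="tab:blue")
ax.plot(x, curve, color="black")
fig.savefig("bars_with_normal.png")  # hypothetical output file name
```

For a smoother curve, evaluate `norm.pdf` on a denser grid such as `np.linspace(x.min(), x.max(), 200)` instead of on `x`.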
Here's the result of a scatter plot using Matplotlib
And now here's the result of calling the scatter plot function using Pandas
Is there a bug in the Pandas scatter function, or is it supposed to work like this?
I think the grey area you see is the boundary of each point. Use the argument edgecolors='none' or edgecolors='black' to get the same result as you get with matplotlib (see also http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.scatter)
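A small side-by-side sketch of that suggestion, assuming random data in a DataFrame with hypothetical columns `x` and `y`. It relies on pandas forwarding extra keyword arguments such as `edgecolors` to Matplotlib's scatter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data; the question does not show the real DataFrame
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.random(200), "y": rng.random(200)})

fig, (ax_mpl, ax_pd) = plt.subplots(1, 2, figsize=(8, 4))

# Plain Matplotlib scatter without point outlines
ax_mpl.scatter(df["x"], df["y"], edgecolors="none")
ax_mpl.set_title("matplotlib")

# Pandas scatter with the same keyword passed through to Matplotlib
df.plot.scatter(x="x", y="y", ax=ax_pd, edgecolors="none")
ax_pd.set_title("pandas")

fig.savefig("scatter_compare.png")  # hypothetical output file name
```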