Comparing distribution plots for better visualisation [duplicate] - python

This question already has answers here:
Seaborn data visualization misunderstanding of densities?
(1 answer)
How to do KDE(kernel density estimation) independently with seaborn?
(1 answer)
Seaborn - displot normalize KDEs for two different sample batches
(1 answer)
How to plot many kdeplots on one figure in python
(1 answer)
Closed 7 months ago.
How can I plot multiple distributions in one figure? I have a "Quantity" column in 5 dataframes, at nation, region, division, state and DMA level, and the dataframes differ a lot in length and in scale.
I used this code:
sns.displot(data=data_vert, x='dev_median_qty', hue='level', kind='kde', fill=True, palette=sns.color_palette('bright')[:5], height=5, aspect=2.5)
plt.xlim(-5, 25)
And I got this graph:
I want the area under each curve to be one, i.e. every level should have the same area under its curve without changing the shape of its distribution, so that the graph is easier to read and the levels are easier to compare.
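A note on why the areas differ so much: by default seaborn scales each conditional density by its share of the observations, so that all curves together integrate to one. displot(kind='kde') and kdeplot accept common_norm=False to normalise each hue level independently instead. The sketch below illustrates the difference with a hand-rolled Gaussian KDE in NumPy. The helper names, the bandwidth rule and the synthetic samples are assumptions made for illustration, not seaborn's internals.

```python
import numpy as np


def gaussian_kde_curve(samples, num_points=512):
    """Evaluate a simple Gaussian KDE on a grid (illustrative helper)."""
    samples = np.asarray(samples, dtype=float)
    # Scott's rule of thumb for a 1-D bandwidth (an assumption of this sketch).
    bandwidth = samples.std(ddof=1) * len(samples) ** (-1 / 5)
    grid = np.linspace(samples.min() - 5 * bandwidth,
                       samples.max() + 5 * bandwidth, num_points)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (
        len(samples) * bandwidth * np.sqrt(2 * np.pi))
    return grid, density


def area_under(grid, density):
    """Trapezoidal area under a curve."""
    return float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid)))


rng = np.random.default_rng(0)
# Two synthetic "levels" with very different sizes and scales.
levels = {"nation": rng.normal(5, 1, size=50),
          "DMA": rng.normal(10, 10, size=5000)}
total = sum(len(values) for values in levels.values())

independent_areas = {}
common_areas = {}
for name, values in levels.items():
    grid, density = gaussian_kde_curve(values)
    # Independent normalisation: each curve integrates to about 1.
    independent_areas[name] = area_under(grid, density)
    # Common normalisation: each curve is scaled by its share of the rows.
    common_areas[name] = area_under(grid, density * len(values) / total)
```

With the call from the question, this would amount to adding common_norm=False to sns.displot(...); the shape of each curve is unchanged, only its vertical scale.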

Related

Get bin size values in seaborn charts logscale [duplicate]

This question already has answers here:
Python - Use bins from one sns.histplot() for another / Extract bin information from an sns.histpllot()
(1 answer)
access to bin counts in seaborn distplot
(1 answer)
How can I extract the bins from seaborn's KDE distplot object?
(2 answers)
Closed 2 months ago.
I made a chart with seaborn and I would like to retrieve the bin size values.
Because my bins have a constant width on a logarithmic scale, their sizes in data units differ. Any ideas?
Code used: sns.displot(productDF, x="Area", hue="Slice", hue_order=sliceList, bins=50, log_scale=True, col="Slice", col_wrap=2, col_order=sliceList)
Here is an example of my chart:
I checked the docs, but seaborn doesn't seem to return this information.
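One way to get the sizes is to recompute the edges yourself. The sketch below assumes that, with log_scale=True, the bins are spaced evenly in log10 between the smallest and largest value; that is my understanding of seaborn's behaviour, but it is not checked against a particular version. The function name log_bin_edges and the synthetic "Area" values are made up for the example.

```python
import numpy as np


def log_bin_edges(values, bins=50):
    """Return bin edges equally spaced in log10, mapped back to data units."""
    log_values = np.log10(np.asarray(values, dtype=float))
    return 10 ** np.histogram_bin_edges(log_values, bins=bins)


rng = np.random.default_rng(1)
# Synthetic, strictly positive stand-in for the "Area" column.
area = rng.lognormal(mean=3.0, sigma=1.0, size=1000)

edges = log_bin_edges(area, bins=50)
widths = np.diff(edges)                # bin sizes in data units, growing to the right
log_widths = np.diff(np.log10(edges))  # constant on the log scale
```

If the values actually drawn are needed instead, the bars of each facet should also be reachable through the Axes' patches list, where every rectangle exposes get_x() and get_width().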

Seaborn FacetGrid - How to get % instead count? [duplicate]

This question already has answers here:
How to plot percentage with seaborn distplot / histplot / displot
(3 answers)
Plot a horizontal line on a given plot
(7 answers)
Box around text in matplotlib
(3 answers)
Closed 3 months ago.
I'm trying to cluster my data with KMeans.
This is what my dataset looks like:
With this dataset, I apply FacetGrid this way:
for c in data:
    grid = sns.FacetGrid(data, col='Clusters')
    grid.map(plt.hist, c)
    grid.set_xticklabels(rotation=90)
Output:
For all features.
This works, but the FacetGrid only shows feature value vs. count for each cluster.
This information is not very relevant to me, since the clusters all have different sizes.
E.g. the Customer Age plot for Cluster 1 is much higher than the one for Cluster 0, simply because Cluster 1 has more elements.
What I need:
I need a way to compare each column of the plot relative to its total.
E.g
I'd like to see:
For each cluster and each feature.
Is it possible?
Thank you.
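A minimal sketch of one way to do this with plain matplotlib: give every observation a weight of 100 / n within its own cluster, so the bars of each panel sum to 100. The cluster data below is synthetic and the shared bin edges are a choice of this example, not something from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in for one feature split by cluster label (assumed data).
clusters = {0: rng.normal(35, 8, size=200),
            1: rng.normal(50, 10, size=1500)}

# Shared bin edges so the panels are directly comparable.
edges = np.histogram_bin_edges(np.concatenate(list(clusters.values())), bins=10)

fig, axes = plt.subplots(1, len(clusters), sharey=True, figsize=(8, 3))
percent_heights = {}
for ax, (label, values) in zip(axes, clusters.items()):
    # Each observation contributes 100 / n, so the bars of a panel sum to 100.
    weights = np.full(len(values), 100.0 / len(values))
    heights, _, _ = ax.hist(values, bins=edges, weights=weights)
    percent_heights[label] = heights
    ax.set_title(f"Cluster {label}")
    ax.set_xlabel("Customer Age")
axes[0].set_ylabel("% of cluster")
fig.tight_layout()
```

In recent seaborn versions, histplot and displot should give the same result with stat="percent" together with common_norm=False, which normalises each cluster separately.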

Visualize a binary vector [duplicate]

This question already has answers here:
Using pandas value_counts and matplotlib
(1 answer)
how to sort the result by pandas.value_counts
(1 answer)
Unnormalized histogram plots in Seaborn are not centered on X-axis
(1 answer)
Differences between seaborn histogram, countplot and distplot
(1 answer)
Closed 8 months ago.
I have a binary column in a pandas dataframe. I want to visualize it, just to see how many 0s and 1s there are. I used displot:
Plot = sns.displot(data = data, x = 'stroke', color = 'm')
Plot.fig.suptitle('Stroke numbers in data', size=15, y=1.12);
This did the job, but it's very ugly. How do I make the x-axis show only 0 and 1?
I think this is a good solution:
data["stroke"].value_counts(sort=False).plot.bar(rot=0)
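The same idea without pandas, as a small sketch: count the two values and draw one bar per category, so the axis has only "0" and "1" on it. The stroke array here is made-up data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

stroke = np.array([0, 0, 1, 0, 1, 0, 0, 0])  # made-up binary column

# minlength=2 guarantees a count for both 0 and 1, even if one is absent.
counts = np.bincount(stroke, minlength=2)

fig, ax = plt.subplots()
bars = ax.bar(["0", "1"], counts, color="m")  # string labels give a categorical axis
ax.set_ylabel("Count")
ax.set_title("Stroke numbers in data")
```

If staying with seaborn is preferred, sns.countplot or passing discrete=True to displot should also give one bar per value.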

Change size of the value-axis on matplotlib plot [duplicate]

This question already has answers here:
Matplotlib y axis values are not ordered [duplicate]
(1 answer)
Difference in plotting with different matplotlib versions
(1 answer)
Closed 10 months ago.
The values on the y axis of this plot are too clustered, as seen where I labelled 1 in the picture.
The only way to make the numbers slightly more visible is to reduce the value of labelsize= in the tick_params() method, but that makes the values so small that they are unreadable.
Do I have to plot all the points in the range of my list rainfall in line 16, or can I specify which labels I would like to place?
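Tick positions can be chosen explicitly with Axes.set_yticks. The sketch below also converts the values to floats first, on the assumption (suggested by the linked duplicates, not confirmed by the question) that rainfall was read in as strings, in which case matplotlib treats every distinct value as its own category. The readings are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up rainfall readings, stored as text the way a CSV reader might return them.
rainfall_raw = ["12.5", "80.1", "43.0", "150.2", "7.9", "99.4"]
rainfall = [float(value) for value in rainfall_raw]  # numeric axis, not categories

fig, ax = plt.subplots()
ax.plot(range(len(rainfall)), rainfall, marker="o")

# Place labels only where wanted, here every 50 units up to the maximum.
chosen_ticks = np.arange(0, max(rainfall) + 50, 50)
ax.set_yticks(chosen_ticks)
ax.tick_params(axis="y", labelsize=10)
```

Any list of positions works in place of chosen_ticks, so the label size can stay readable.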

matplotlib hist() fails (sum of bars not equal one) with Density=True and weights [duplicate]

This question already has answers here:
Plot a histogram such that bar heights sum to 1 (probability)
(6 answers)
Plot a histogram such that the total height equals 1
(5 answers)
Closed 1 year ago.
I am stuck when working with plt.hist(density=True), although I have set weights.
My goal is to get a histogram in which the sum of the bars (y-axis) equals 1. It is a widely discussed issue [1][2]; however, I can't derive the right solution. Here is my code:
data = np.array(data).astype("float32")
weights = np.ones_like(data)/float(len(data))
n, bins, patches = plt.hist(x=data, density=True, bins=20, color='#0504aa', weights=weights,
                            alpha=0.7, rwidth=0.85)
which creates:
The bars obviously do not sum to one. Does anybody have a solution for my issue? Maybe it has already been posted somewhere, but I am not able to find it.
Greetings
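The likely cause is that density=True normalises the area of the histogram to 1, not the sum of the bar heights, and it overrides the effect of the 1/len(data) weights. The sketch below shows both variants with np.histogram, which (as far as I know) plt.hist uses underneath with the same density and weights semantics. The data is a synthetic stand-in.

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(size=500).astype("float32")  # synthetic stand-in
weights = np.ones_like(data) / float(len(data))

# density=True: the weights are renormalised away and the *area* is 1.
density_heights, edges = np.histogram(data, bins=20, density=True, weights=weights)
area = float(np.sum(density_heights * np.diff(edges)))

# Weights only: each bar is the fraction of samples in its bin, heights sum to 1.
fraction_heights, _ = np.histogram(data, bins=20, weights=weights)
height_sum = float(fraction_heights.sum())
```

So for the plot in the question, keeping weights=weights and dropping density=True should give bars whose heights sum to 1.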
