How to create a histogram with an aggregated dataset [duplicate]

This question already has answers here:
Ploting with seaborn histplot
(2 answers)
Plotting a histogram from pre-counted data in Matplotlib
(6 answers)
Histogram from data which is already binned, I have bins and frequency values
(3 answers)
Closed last month.
I have a dataset new_products that records the number of months since each product launched. I aggregated that data so that I have 'since_debut' and 'count', which give the number of products that debuted 1, 2, 3, ..., 60 months ago. I am having trouble creating a histogram with seaborn.
df =
since_debut  count
1            1784
2            7345
3            11111
4            13255
sns.histplot(data=df, x="since_debut", y="count", bins=30, kde=True)
ValueError: Could not interpret value `since_debut` for parameter `x`
I'm unsure what is throwing this error and why seaborn can't interpret the aggregated data. Any help or advice is appreciated.

Since your dataset is already aggregated, shouldn't you use something like barplot?
sns.barplot(data=df, x="since_debut", y="count")
countplot is meant for the original, unaggregated data; it does the counting along one of the axes itself.

Related

How to get the median value in boxplot with seaborn and custom coordinate lists [duplicate]

This question already has answers here:
Getting values in Seaborn boxplot
(2 answers)
Extract outliers from Seaborn Boxplot
(2 answers)
Labeling boxplot in seaborn with median value
(3 answers)
Closed last month.
This post was edited and submitted for review last month and failed to reopen the post:
Original close reason(s) were not resolved
I am passing two lists of coordinates obtained by selectively appending them from my original data frame. I use them to create a boxplot in which the median of those values is drawn.
df = pd.read_csv("path.csv")
# get data from the dataframe depending on certain conditions and store it in xValues and yValues
...
g = sns.JointGrid(x=xValues, y=yValues)
g.plot_joint(sns.scatterplot, size=0.1, color='b', linewidth=0)
g.plot_marginals(sns.boxplot, width=0.3, color='b',notch=True, showcaps=False, medianprops={"color": "coral"})
Boxplot will have to compute the median value in order to show it on the graph, so is there any way to get that numerical value?
This question is different from Labeling boxplot in seaborn with median value, Extract outliers from Seaborn Boxplot and Getting values in Seaborn boxplot because those 3 questions use columns directly from the data frame. If you read my question carefully, the data I pass to the boxplot function is an extract of each of the columns of the data frame.
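One hedged way to get the number without touching the plot: the median line of a boxplot is the 50th percentile of whatever values were passed in, so it can be computed from the same lists. The values below are hypothetical stand-ins for the xValues and yValues built in the question.

```python
import numpy as np

# Hypothetical stand-ins for the lists extracted from the data frame.
xValues = [1.0, 2.0, 2.5, 4.0, 10.0]
yValues = [3.0, 1.0, 7.0, 5.0]

# Compute the medians directly from the lists that are passed to JointGrid.
x_median = float(np.median(xValues))
y_median = float(np.median(yValues))

print(x_median, y_median)  # 2.5 4.0
```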

Add bar labels to pandas groupby stacked bar chart [duplicate]

This question already has answers here:
How to plot and annotate a grouped bar chart
(1 answer)
Labeling a Bar Graph Created from a Grouped Pandas DataFrame where there's a NaN Category
(2 answers)
Plot groupby percentage dataframe
(2 answers)
How to annotate grouped bar plot with percent by hue/legend group
(1 answer)
Closed 11 days ago.
Is it possible to add bar values to a pandas groupby plot?
df.groupby("Responded Year-Month")["Net New Record"].value_counts().unstack(level=1).plot.bar(
    stacked=True,
    title="Responses by Month & Net New Record",
    ylabel="Responses",
    xlabel="Month",
    rot=0,
    color=nnr_colors)
I checked the docs and could not find any reference to values in the bars in either the df.plot.bar docs or the df.plot docs.
As an example, it is very easy to accomplish something similar with a pie plot using autopct:
df.groupby("Channel")["Responded"].sum().plot.pie(
    autopct='%1.0f%%',
    title="Responses by Channel"
)
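A possible approach, sketched under assumptions: pandas returns a matplotlib Axes from plot.bar, and matplotlib 3.4 or newer provides Axes.bar_label, which can annotate each bar container. The data frame below is a small made-up stand-in that only reuses the column names from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "Responded Year-Month": ["2022-01", "2022-01", "2022-01",
                             "2022-02", "2022-02", "2022-02"],
    "Net New Record": ["Yes", "No", "Yes", "No", "No", "Yes"],
})

counts = (df.groupby("Responded Year-Month")["Net New Record"]
            .value_counts()
            .unstack(level=1))

# plot.bar returns the Axes the bars were drawn on.
ax = counts.plot.bar(stacked=True, rot=0)

# Each column of the stacked chart is one container of bars;
# label_type="center" writes the segment value inside each segment.
for container in ax.containers:
    ax.bar_label(container, label_type="center")
```

If some month/category combinations are missing, unstack leaves NaN in those cells, so those segments may need separate handling.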

Get bin size values in seaborn charts logscale [duplicate]

This question already has answers here:
Python - Use bins from one sns.histplot() for another / Extract bin information from an sns.histpllot()
(1 answer)
access to bin counts in seaborn distplot
(1 answer)
How can I extract the bins from seaborn's KDE distplot object?
(2 answers)
Closed 2 months ago.
I made a chart with seaborn and I would like to retrieve the bin size values.
As my bins have constant width on a logarithmic scale, their sizes differ in data units. Any ideas?
Code used: sns.displot(productDF, x="Area", hue="Slice", hue_order=sliceList, bins=50, log_scale=True, col="Slice", col_wrap=2, col_order=sliceList)
Here is an example of my chart:
I checked the docs, but seaborn doesn't seem to return any such info.
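A rough sketch of how the bin sizes could be reconstructed outside seaborn. It rests on an assumption about seaborn's behavior: with log_scale=True, 50 equal-width bins are laid out over log10 of the data (across all slices, given the default common bins) and then mapped back to data units. The area array is a hypothetical stand-in for productDF["Area"].

```python
import numpy as np

# Hypothetical stand-in for productDF["Area"] (strictly positive values).
rng = np.random.default_rng(0)
area = rng.lognormal(mean=3.0, sigma=1.0, size=1000)

# Equal-width bin edges in log10 space ...
log_edges = np.histogram_bin_edges(np.log10(area), bins=50)

# ... mapped back to the original units.
edges = 10 ** log_edges

# Bin sizes in data units; they grow from left to right.
widths = np.diff(edges)
```

An alternative that reads the values from the drawn chart is to loop over the axes of the returned FacetGrid and inspect each bar in ax.patches with get_x() and get_width().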

Seaborn FacetGrid - How to get % instead count? [duplicate]

This question already has answers here:
How to plot percentage with seaborn distplot / histplot / displot
(3 answers)
Plot a horizontal line on a given plot
(7 answers)
Box around text in matplotlib
(3 answers)
Closed 3 months ago.
I'm trying to cluster my data with KMeans.
This is what my dataset looks like:
With this dataset, I apply FacetGrid this way:
for c in data:
    grid = sns.FacetGrid(data, col='Clusters')
    grid.map(plt.hist, c)
    grid.set_xticklabels(rotation=90)
Output:
For all features.
This works, but the FacetGrid only shows feature value vs. count for each cluster...
This information is not very relevant to me, since the clusters have different sizes ('len').
E.g. the Customer Age counts for Cluster 1 are much higher than those for Cluster 0, since Cluster 1 has more elements.
What I need:
I need a way to compare each column of the plot relative to its total.
E.g
I'd like to see:
For each cluster and each feature.
Is it possible?
Thank you.

Visualize a binary vector [duplicate]

This question already has answers here:
Using pandas value_counts and matplotlib
(1 answer)
how to sort the result by pandas.value_counts
(1 answer)
Unnormalized histogram plots in Seaborn are not centered on X-axis
(1 answer)
Differences between seaborn histogram, countplot and distplot
(1 answer)
Closed 8 months ago.
I have a binary column in a pandas dataframe. I want to visualize it, just to see how many 0s and 1s there are. I used displot:
Plot = sns.displot(data = data, x = 'stroke', color = 'm')
Plot.fig.suptitle('Stroke numbers in data', size=15, y=1.12);
This did the job, but it's very ugly. How do I make it show only 0 and 1?

I think this is a good solution:
data["stroke"].value_counts(sort=False).plot.bar(rot=0)
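A small variation on the same idea, using made-up data in place of the real stroke column: sorting the counts by index makes the 0-then-1 order of the bars explicit, whatever order the values appear in.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Made-up binary column standing in for data["stroke"].
data = pd.DataFrame({"stroke": [1, 0, 0, 1, 0, 0, 0]})

# Count each value, then order the result by the value itself (0, then 1).
counts = data["stroke"].value_counts().sort_index()

# One bar per distinct value, so the x-axis shows only 0 and 1.
ax = counts.plot.bar(rot=0)
```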
