how to plot in pandas categorical data

I have this kind of dataframe:
  animal  age    where
0    dog    1   indoor
1    cat    4   indoor
2  horse    3  outdoor
I would like to present a bar plot in which:
y axis is age, x axis is animal, and the animals are grouped in adjacent bars with different colors.
Thanks

This should do the trick:
import pandas as pd
df = pd.DataFrame({"animal": ["dog", "cat", "horse"], "age": [1, 4, 3], "where": ["indoor", "indoor", "outdoor"]})
df
  animal  age    where
0    dog    1   indoor
1    cat    4   indoor
2  horse    3  outdoor
# legend=False hides the legend; ax.legend("") would only draw an empty one
ax = df.plot.bar(x="animal", y="age", color=["b", "r", "g"], legend=False)
ax.set_ylabel("Age")
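
If the colour list ends up applied per column rather than per bar (as far as I know this has varied across pandas versions), calling matplotlib's Axes.bar directly sidesteps the question. A minimal sketch using the same frame; the colour choices are arbitrary:

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"animal": ["dog", "cat", "horse"],
                   "age": [1, 4, 3],
                   "where": ["indoor", "indoor", "outdoor"]})

# One colour per animal, handed straight to matplotlib.
colors = ["b", "r", "g"]

fig, ax = plt.subplots()
bars = ax.bar(df["animal"], df["age"], color=colors)
ax.set_xlabel("animal")
ax.set_ylabel("Age")
```

Call plt.show() afterwards when running this as a script rather than in a notebook.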

Another easy way: set the column you want on the x axis as the index and plot. By default, the numeric columns end up on the y axis.
import matplotlib.pyplot as plt
df.set_index('animal').plot(kind='bar')
plt.ylabel('age')

Related

How to select different sets of variables (i.e. value counts for a specific country) from a groupby df for a 2,2 subplot

From my original data frame, I used the group-by to create the new df as shown below, which has the natural disaster subtype counts for each country.
However, I'm unsure how to, for example, select 4 specific countries and set them as variables in a 2 by 2 plot.
The X-axis will be the disaster subtype name, with the Y being the value count, however, I can't quite figure out the right code to select this information.
This is how I grouped the countries:
c_grp = df_geo.groupby(['Country'])
c_val = pd.DataFrame(c_grp['Disaster Subtype'].value_counts())
c_val = c_val.rename(columns={'Disaster Subtype': 'Num of Disaster'})
c_val.head(40)
Output:
Country         Disaster Subtype
Afghanistan     Riverine flood            45
                Ground movement           33
                Flash flood               32
                Avalanche                 19
                Drought                    8
                Bacterial disease          7
                Convective storm           6
                Landslide                  6
                Cold wave                  5
                Viral disease              5
                Mudslide                   3
                Severe winter conditions   2
                Forest fire                1
                Locust                     1
                Parasitic disease          1
Albania         Ground movement           16
                Riverine flood             8
                Severe winter conditions   3
                Convective storm           2
                Flash flood                2
                Heat wave                  2
                Avalanche                  1
                Coastal flood              1
                Drought                    1
                Forest fire                1
                Viral disease              1
Algeria         Ground movement           21
                Riverine flood            20
                Flash flood                8
                Bacterial disease          2
                Cold wave                  2
                Forest fire                2
                Coastal flood              1
                Drought                    1
                Heat wave                  1
                Landslide                  1
                Locust                     1
American Samoa  Tropical cyclone           4
                Flash flood                1
                Tsunami                    1
However, let's say I want to select four of these countries and draw four plots, one per country, each showing the number of disasters of each type in that country. I know I would need something along the lines of what's below, but I'm unsure how to set the x and y variables for each. If there is a more efficient way to set the variables or plot, that would be great. Usually I would just use loc or iloc, but here I need to be more specific with the selection.
fig, ax = plt.subplots(2, 2, figsize=(16, 10))
X1 = c_val.loc['Country'] == 'Afghanistan' #This doesn't work, just need something similar
y1 = c_val.loc['Num of Disasters']
X2 =
y2 =
X3 =
y3 =
X4 =
y4 =
ax[0,0].bar(X1,y1,width=.4, color=['#A2BDF2'])
ax[0,1].bar(X2,y2,width=.4,color=['#A2BDF2'])
ax[1,0].bar(X3,y3,width=.4,color=['#A2BDF2'])
ax[1,1].bar(X4,y4,width=.4,color=['#A2BDF2'])

IIUC, a simple way is to use catplot from the seaborn package:
# Python env: pip install seaborn
# Anaconda env: conda install seaborn
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# df is assumed to hold the counts as plain columns, e.g. df = c_val.reset_index()
g = sns.catplot(x='Disaster Subtype', y='Num of Disaster', col='Country',
                data=df, col_wrap=2, kind='bar')
g.set_xticklabels(rotation=90)
g.tight_layout()
plt.show()
Update
How can I select the specific countries to be plotted in each subplot?
subdf = df.loc[df['Country'].isin(['Albania', 'Algeria'])]
g = sns.catplot(x='Disaster Subtype', y='Num of Disaster', col='Country',
                data=subdf, col_wrap=2, kind='bar')
...
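
If you would rather stay with the 2 by 2 matplotlib grid from the question, selecting on the first index level gives you x and y for each country in one go. A rough sketch, assuming the counts are indexed by (Country, Disaster Subtype) as in the output above; the df_geo rows here are a made-up stand-in, not the real data:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small stand-in for df_geo (invented rows, for illustration only).
df_geo = pd.DataFrame({
    "Country": ["Afghanistan"] * 3 + ["Albania"] * 2 + ["Algeria"] * 2 + ["American Samoa"],
    "Disaster Subtype": ["Flash flood", "Flash flood", "Drought",
                         "Heat wave", "Heat wave",
                         "Locust", "Drought",
                         "Tsunami"],
})

# Series indexed by (Country, Disaster Subtype), like the grouped output.
counts = df_geo.groupby("Country")["Disaster Subtype"].value_counts()

countries = ["Afghanistan", "Albania", "Algeria", "American Samoa"]
fig, axes = plt.subplots(2, 2, figsize=(16, 10))

# axes.flat walks the grid row by row, so each country gets one panel.
for ax, country in zip(axes.flat, countries):
    per_country = counts.loc[country]  # drops the Country level
    ax.bar(per_country.index, per_country.values, width=0.4, color="#A2BDF2")
    ax.set_title(country)
    ax.tick_params(axis="x", labelrotation=90)

fig.tight_layout()
```

The loop replaces the X1/y1 ... X4/y4 variables: per_country.index is the x and per_country.values is the y for each panel.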

Plot against dummy variables and grouped values

This is some values of the table I have
   country colour ...
1    Spain    red
2      USA   blue
3   Greece  green
4    Italy  white
5      USA    red
6      USA   blue
7    Spain    red
I want to be able to group the countries together and plot it where the country is in the x axis and the total number of 'colours' is calculated for each country. For example, country USA has 2 blues and 1 red, Spain has 2 reds etc. I want this in a bar chart form. I would like this to be done using either matplotlib or seaborn.
I would assume I would have to use dummy variables for the 'colours' column but I'm not sure how to plot against a grouped column and dummy variables.
Much appreciated if you could show and explain the process. Thank you.

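Try with crosstab. It counts how often each colour occurs for each country, so no dummy variables are needed, and plotting the result draws one group of bars per country:
pd.crosstab(df['country'], df['colour']).plot.bar()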
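
To see what crosstab builds before it is plotted, here is a small sketch using the rows from the question (only the two columns shown there; the axis label is my own choice):

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Spain", "USA", "Greece", "Italy", "USA", "USA", "Spain"],
    "colour": ["red", "blue", "green", "white", "red", "blue", "red"],
})

# Rows are countries, columns are colours, cells are occurrence counts.
counts = pd.crosstab(df["country"], df["colour"])
print(counts)

# One group of bars per country, one bar per colour.
ax = counts.plot.bar()
ax.set_ylabel("count")
```

The USA row of counts holds 2 under blue and 1 under red, which matches the example in the question.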

Grouped bar chart for categories by month/year

I'm trying to use Plotly to create a stacked or grouped bar chart that has month/year on the x-axis and values on the y-axis. The data frame looks like this:
category    value  date
apple       4      10/2020
banana      3      10/2020
apple       2      10/2020
strawberry  1      11/2020
banana      4      11/2020
apple       9      11/2020
banana      4      12/2020
apple       7      12/2020
strawberry  4      12/2020
banana      8      12/2020
...
Assuming that newer dates will come through, and that more categories can be added, I'm trying to create a grouped bar chart that is also scrollable along the x-axis (date).
I tried this to create the grouped bar chart but it ends up being a stacked bar chart instead:
import plotly.graph_objects as go
fig_3_a = go.Figure(
    data=[go.Bar(
        x=df['date'],
        y=df['value'],
        text=df['category'],
        textposition='auto',
        orientation='v',
    )],
    layout=go.Layout(barmode='group'))
I would like something like this instead, where the different categories can possibly be assigned a different color, and the x-axis being the month/day and the y-axis being the value. Here, gender==category and x-axis==month/year. Also would need to add the scrolling for the x-axis to see all the month/year:

You can do it simply with plotly.express.
import plotly.express as px
fig = px.bar(df, x='date', y='value', color='category', barmode='group')
fig.show()
If you want to do it with the go.Bar class, you need to add a separate trace for each category.

Have a dataframe but need to make a barplot in python

Hi, I have a very big dataframe; below is a snapshot. I want to calculate the TARGET % split across the various worker types and plot it as a bar graph (see attached picture).
            Worker type  TARGET
0               Working       1
1         State servant       0
2             Pensioner       1
3               Working       0
4  Commercial associate       1
5         State servant       0
6  Commercial associate       0
7             Pensioner       1
8               Working       1
9               Working       0

Try:
import matplotlib.pyplot as plt
# Count the rows per worker type first; the raw text column cannot be plotted directly
ax = df['Worker type'].value_counts().plot(kind='bar', title="Worker Type", figsize=(15, 10), legend=True, fontsize=12)
ax.set_xlabel("Worker", fontsize=12)
ax.set_ylabel("Count", fontsize=12)
plt.show()

try this:
df.groupby('Worker type').count().plot.bar(y='TARGET')
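
Both snippets above plot counts. If what you are after is the percentage, a sketch along these lines may be closer. I am assuming "target %" means the share of rows with TARGET equal to 1 within each worker type; the axis labels are my own:

```python
import pandas as pd

df = pd.DataFrame({
    "Worker type": ["Working", "State servant", "Pensioner", "Working",
                    "Commercial associate", "State servant",
                    "Commercial associate", "Pensioner", "Working", "Working"],
    "TARGET": [1, 0, 1, 0, 1, 0, 0, 1, 1, 0],
})

# Assumption: "target %" is the share of TARGET == 1 per worker type.
# For a 0/1 column the group mean is exactly that share.
target_pct = df.groupby("Worker type")["TARGET"].mean().mul(100)

ax = target_pct.plot.bar()
ax.set_xlabel("Worker type")
ax.set_ylabel("TARGET = 1 (%)")
```

If you instead want each worker type's share of all TARGET == 1 rows, divide the per-group sum by df["TARGET"].sum() rather than taking the mean.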

How do I make a heatmap with seaborn module in python using the pandas dataframe given below?

This is a dataframe of countries and the count of cars each country has.
It's preferred to have countries on the left/y axis and cars as bottom/x axis.

Simply set the index as country and plot the heatmap via sns.heatmap.
Here is the code:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'country': ['us', 'france', 'spain', 'italy', 'germany'],
                   'corvette': [2, 0, 2, 11, 0],
                   'ford': [0, 1, 10, 0, 10],
                   'toyota': [1, 10, 0, 1, 1]})
df.set_index(['country'], inplace=True)
print(df)  #1
ax = sns.heatmap(df, cmap='coolwarm')
plt.show()  #2
OUTPUT: #1
         corvette  ford  toyota
country
us              2     0       1
france          0     1      10
spain           2    10       0
italy          11     0       1
germany         0    10       1
OUTPUT: #2 (heatmap of the table above)
