I am trying to plot a density chart. Below you can see the data and the chart code.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = {'type_sale':[100,200,400,400,200,400,300,200,210,300],
'bool':[0,1,0,1,1,0,1,1,0,1],
}
df1 = pd.DataFrame(data, columns = ['type_sale',
'bool'])
df1['bool']= df1['bool'].astype('int32')
I tried the command below, but it is not working. Can anybody help me solve this problem?
plot_density_chart(df1[['type_sale', 'bool']], "bool", 'type_sale',
category_var="type_sale", title='prevalence',
xlabel='Type_sale', logx="Yes", vline=None,
save_figure_name = 'type_sale_prevalence.pdf')
You can use seaborn to plot the density chart:
import seaborn as sns
g = sns.FacetGrid(df1,hue='bool')
g = g.map(sns.kdeplot,'type_sale',fill=True,alpha=0.3)
g.add_legend()
g.fig.suptitle('Prevalence', fontsize=16)
g.axes[0,0].set_xlabel('Type_sale')
Which gives you the figure:
If you want a logarithmic x-axis, add this:
g.axes[0,0].set_xscale('log')
I have to chart data from a CSV file in my directory. I am learning Python from samples online. The problem is that I can't find any way to show all the x-axis labels.
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = [7.50, 3.50]
plt.rcParams["figure.autolayout"] = True
pathcsv = r'D:\iPython\csvfile\samplecsv2.csv'
df = pd.read_csv(pathcsv)
df.set_index('Names').plot()
plt.show()
You can do that by using set_xticks to show a tick for each country and set_xticklabels to set the names as labels. The updated code is below.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams["figure.figsize"] = [7.50, 3.50]
plt.rcParams["figure.autolayout"] = True
pathcsv = r'D:\iPython\csvfile\samplecsv2.csv'
df = pd.read_csv(pathcsv)  # load the data before plotting
ax = df.set_index('Names').plot()
ax.set_xticks(np.arange(len(df))) #Show ticks for each country
ax.set_xticklabels(df.Names) #Show labels as in df.Names
plt.show()
Output graph
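If you want to try this without the CSV file, here is a small self-contained sketch. The DataFrame contents and the "Score" column are made up for illustration; only the "Names" column name comes from the code above. Rotating the labels is optional but often helps when there are many of them.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the contents of the CSV file
df = pd.DataFrame({
    "Names": ["Ann", "Bob", "Cid", "Dee", "Eve", "Fay", "Gus", "Hal"],
    "Score": [3, 5, 2, 8, 6, 7, 4, 9],
})

# With a string index, pandas plots the points at positions 0..n-1
ax = df.set_index("Names").plot()

# Place one tick at every position and label it with the matching name
ax.set_xticks(np.arange(len(df)))
ax.set_xticklabels(df["Names"], rotation=45, ha="right")
```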
I have a dataset as below:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
disease_type = list(np.random.choice(['TB','P'],100))
gender = list(np.random.choice(['M','F'],100))
dict = { 'Disease Type': disease_type ,'Gender':gender }
dt = pd.DataFrame(dict)
I would like to generate a bar chart using pyplot that shows the different disease types by gender, something like the image below:
I understand that I can do a groupby as below:
dt = dt.groupby(['Gender'], as_index=False).count()
But I don't know how to feed it to pyplot.
I tried the following code for visualization, but it did not work for me:
fig= plt.Figure(figsize=(10,10))
ax = fig.add_axes([0.1,0.1,0.8,0.8])
ax.bar(height=dt['Disease Type'])
plt.show()
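Two things stand out in the attempt above: plt.Figure (capital F) creates a figure that pyplot does not manage, so plt.show() has nothing to display, and ax.bar needs x positions as well as heights. A minimal sketch of one possible approach, assuming a count of each disease type per gender is what is wanted, is to build the table with pd.crosstab and let pandas draw the grouped bars. The fixed seed and the "Count" axis label are additions for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
dt = pd.DataFrame({
    "Disease Type": rng.choice(["TB", "P"], 100),
    "Gender": rng.choice(["M", "F"], 100),
})

# Rows are genders, columns are disease types, cells are counts
counts = pd.crosstab(dt["Gender"], dt["Disease Type"])

# pandas draws one group of bars per row and one bar per column
ax = counts.plot(kind="bar", rot=0, figsize=(6, 4))
ax.set_ylabel("Count")
ax.legend(title="Disease Type")
```

Call plt.show() afterwards to display the chart.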
I want to change the labels [2,3,4,5] from my pie chart and instead have them say [Boomer, Gen X, Gen Y, Gen Z] respectively. I can't seem to find a direct way of doing this without changing the dataframe. Is there any way to do this by working through the code I have?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
data = df.groupby("Q10_Ans")["Q4_Agree"].count()
pie, ax = plt.subplots(figsize=[10,6])
labels = data.keys()
plt.pie(x=data, autopct="%.1f%%", explode=[0.05]*4, labels=labels, pctdistance=0.5)
plt.title("Generations that agree data visualization will help with job prospects", fontsize=14);
pie.savefig("DeliveryPieChart.png")
How about changing the line
labels = data.keys()
to
labels = ['Boomer','Gen X','Gen Y','Gen Z']
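A hard-coded list relies on the groups coming out in exactly that order. A slightly more defensive sketch, assuming the group keys are the integers 2 to 5, looks each key up in a dictionary instead. The counts below are made up, and generation_names is a hypothetical name used only for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical counts keyed by the numeric codes 2..5,
# standing in for df.groupby("Q10_Ans")["Q4_Agree"].count()
data = pd.Series([12, 30, 41, 17], index=[2, 3, 4, 5], name="Q4_Agree")

# Map each code to its display name; unknown codes fall back to the code itself
generation_names = {2: "Boomer", 3: "Gen X", 4: "Gen Y", 5: "Gen Z"}
labels = [generation_names.get(code, str(code)) for code in data.index]

fig, ax = plt.subplots(figsize=(10, 6))
ax.pie(
    data,
    labels=labels,
    autopct="%.1f%%",
    explode=[0.05] * len(data),
    pctdistance=0.5,
)
```

This leaves the dataframe itself untouched; only the labels passed to the pie chart change.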
I don't know the structure of your data, so I made some sample data and created a pie chart from it. Please adapt your code accordingly.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# data = df.groupby("Q10_Ans")["Q4_Agree"].count()
data = pd.DataFrame({'Q10_Ans':['Boomer','Gen X','Gen Y','Gen Z'],'Q4_Agree':[2,3,4,5]})
fig, ax = plt.subplots(figsize=[10,6])
labels = data['Q10_Ans']
ax.pie(x=data['Q4_Agree'], autopct="%.1f%%", explode=[0.05]*4, labels=labels, pctdistance=0.5)
ax.set_title("Generations that agree data visualization will help with job prospects", fontsize=14);
plt.savefig("DeliveryPieChart.png")
Here I am trying to separate the data by whether the passenger is male or not, plotting Age on the x-axis and Fare on the y-axis. I want to display two labels in the legend that differentiate male and female with their respective colors. Can anyone help me do this?
Code:
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('https://sololearn.com/uploads/files/titanic.csv')
df['male']=df['Sex']=='male'
sc1= plt.scatter(df['Age'],df['Fare'],c=df['male'])
plt.legend()
plt.show()
You could use the seaborn library, which builds on top of matplotlib, to do exactly this. You can make a scatter plot of 'Age' vs 'Fare' and colour-code it by 'Sex' just by passing the hue parameter to sns.scatterplot, as follows:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure()
# No need to call plt.legend, seaborn will generate the labels and legend
# automatically.
# Pass x and y by keyword; recent seaborn versions reject them as positional arguments.
sns.scatterplot(data=df, x='Age', y='Fare', hue='Sex')
plt.show()
Seaborn generates nicer plots with less code and more functionality.
You can install seaborn from PyPI using pip install seaborn.
Refer: Seaborn docs
The PathCollection.legend_elements method can be used to steer how many legend entries are created and how they are labeled.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('https://sololearn.com/uploads/files/titanic.csv')
df['male'] = df['Sex']=='male'
sc1= plt.scatter(df['Age'], df['Fare'], c=df['male'])
# Handles come back in sorted order of the values in c: False (female) first, then True (male)
plt.legend(handles=sc1.legend_elements()[0], labels=['female', 'male'])
plt.show()
See the Legend guide and Scatter plots with a legend for reference.
This can be achieved by splitting the data into two separate dataframes and then setting a label for each one.
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv('https://sololearn.com/uploads/files/titanic.csv')
subset1 = df[(df['Sex'] == 'male')]
subset2 = df[(df['Sex'] != 'male')]
plt.scatter(subset1['Age'], subset1['Fare'], label = 'Male')
plt.scatter(subset2['Age'], subset2['Fare'], label = 'Female')
plt.legend()
plt.show()
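The same idea extends to any number of categories by looping over a groupby instead of writing one subset per value. The sketch below uses a few made-up rows in place of the Titanic CSV so that it runs without a download; only the column names come from the code above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows standing in for the Titanic data
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Age": [22.0, 38.0, 26.0, 35.0, 54.0, 27.0],
    "Fare": [7.25, 71.28, 7.93, 8.05, 51.86, 11.13],
})

fig, ax = plt.subplots()

# One scatter call per category, so each gets its own colour and legend entry
for sex, group in df.groupby("Sex"):
    ax.scatter(group["Age"], group["Fare"], label=sex)

ax.set_xlabel("Age")
ax.set_ylabel("Fare")
ax.legend(title="Sex")
```

groupby sorts the keys, so the legend entries appear in alphabetical order.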
I have a scatter plot I'm working with, and for some reason I'm not seeing all the x values on my graph.
#%%
from pandas import DataFrame, read_csv
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
file = r"re2.csv"
df = pd.read_csv(file)
#sns.set(rc={'figure.figsize':(11.7,8.27)})
g = sns.FacetGrid(df, col='city')
g.map(plt.scatter, 'type', 'price').add_legend()
This is an image of a small subset of my plots. You can see that Res is displaying; the middle bar should display Con and the last one Mlt. These are all defined in the type column of my data set but are not displaying.
Any clue how to fix?
Python is doing what you tell it to do. If you want to generate more interesting plots, pick different features, presumably ones that make more sense for plotting. See the generic example below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="darkgrid")
tips = sns.load_dataset("tips")
sns.relplot(x="total_bill", y="tip", hue="smoker", data=tips);
Personally, I like plotly plots, which are dynamic, more than I like seaborn plots.
https://plotly.com/python/line-and-scatter/