Matplotlib plot without linear ordering - python

Is it possible to draw a matplotlib chart, without using the pandas plot method, that does not put the values on the left axis in linear (numeric) order?
df = pd.DataFrame({
'x':[3,0,5],
'y':[10,4,20]
})
Chart made with the help of DataFrame:
plt.barh(df['x'],df['y'])
Without dataframe:
x = [3,0,5]
y= [10,4,20]
plt.barh(x,y)
Both give me the same result.
Matplotlib output chart:
df.plot.barh('x','y')
Pandas output chart:
I would like to get this kind of output using plain numbers, not numbers converted to str as in:
plt.barh(['3','0','5'],[10,4,20])
Is it possible? How could I get it?

You can use the index of the dataframe as y parameter and use the x values of the dataframe as tick_label:
plt.barh(df.index, width=df['y'], tick_label=df['x'])
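As a minimal, self-contained sketch of this answer: the bars are positioned by the default integer index (0, 1, 2), and the x values only serve as labels, so they keep their original order. The Agg backend line and the names fig, ax, bars and labels are my additions, there only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"x": [3, 0, 5], "y": [10, 4, 20]})

fig, ax = plt.subplots()
# Bars sit at positions 0, 1, 2 (the default RangeIndex of df) and are
# labelled with the x values in their original, unsorted order.
bars = ax.barh(df.index, width=df["y"], tick_label=df["x"])

fig.canvas.draw()  # make sure tick labels are populated before reading them
labels = [tick.get_text() for tick in ax.get_yticklabels()]
print(labels)
```

The numbers stay numeric in the DataFrame; matplotlib only converts them to text when it draws the tick labels.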

Related

Seaborn Catplot not showing text labels in x axis

I am trying to make a bar catplot from a long dataframe that has three columns: property, value and playlist. It was originally a wide dataframe that I converted to long format using pd.melt(). The problem is that whenever I plot it with a bar catplot, the categorical data on the x axis, which corresponds to the property column, shows up as numbers instead of text labels.
Here is an image of how my dataframe looks:
Here is my code:
#bar catplot
bar_catplot = sns.catplot(
kind="bar", x="property", y="value", hue="playlist", legend=True, data=long_frame2
)
bar_catplot_figure = bar_catplot.fig
catplot_render = mpld3.fig_to_html(bar_catplot_figure)
And this is how the plot currently looks:
Thank you in advance!

Using seaborn how do I plot a column which has 70+ categories

I am trying to plot a column from a dataframe. There are about 8500 rows and the Assignment group column has about 70+ categories. How do I plot this visually using seaborn to get some meaningful output?
nlp_data['Assignment group'].hist(figsize=(17,7))
I used the hist() method to plot it.
You can use a heatmap for such data:
seaborn.heatmap
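The answer does not say how to shape a single categorical column for a heatmap. One possible reading, sketched below, is to count the rows per category and draw the counts as a one-column heatmap. The toy nlp_data frame, the group names and the "tickets" column name are hypothetical stand-ins, not data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the question's nlp_data (about 8500 rows, 70+ groups).
nlp_data = pd.DataFrame(
    {"Assignment group": ["GRP_0"] * 5 + ["GRP_1"] * 3 + ["GRP_2"] * 2 + ["GRP_3"]}
)

# Count rows per category (sorted from most to least frequent), then turn the
# counts into a one-column DataFrame so that heatmap gets 2-D input.
counts = nlp_data["Assignment group"].value_counts()
counts_frame = counts.to_frame(name="tickets")

fig, ax = plt.subplots(figsize=(4, 6))
sns.heatmap(counts_frame, annot=True, fmt="d", ax=ax)
```

With 70+ categories the figure height would need to grow so that each row label stays readable.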

Smoothing the curve in a line plot - Values interval x axis

I'm trying to recreate the following plot:
With an online tool I could create the dataset (135 data points) which I saved in a CSV file with the following structure:
Year,Number of titles available
1959,1.57480315
1959,1.57480315
1959,1.57480315
...
1971,221.4273356
1971,215.2494175
1971,211.5426666
I created a Python file with the following code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('file.csv')
df.plot.line(x='Year', y='Number of titles available')
plt.show()
and I'm getting the following plot:
What can I do to get a smooth line like in the original plot?
How can I have the same values in the x axis like in the original plot?
EDIT: I reworked the data set and formatted the dates properly; the plot is now better.
This is how the data set looks now:
Date,Number of available titles
1958/07/31,2.908816952
1958/09/16,3.085527674
1958/11/02,4.322502727
1958/12/19,5.382767059
...
1971/04/13,221.6766907
1971/05/30,215.4918154
1971/06/26,211.7808903
This is the plot I can get with the same code posted above:
The question now is: how can I have the same date range as in the original plot (1958 - mid 1971)?
Try taking the mean of the values grouped by year. This smooths out the discontinuities within each year by replacing them with a single average value. If that does not help, you can apply one of the many available smoothing filters (a rolling mean, for example).
df.groupby('Year').mean().plot(kind='line')
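A small sketch of both suggestions, using made-up numbers in place of the CSV from the question. The column names follow the first version of the data set; the rolling mean with a window of 3 is just one assumed choice of filter, not something the answer prescribes.

```python
import pandas as pd

# Hypothetical miniature of the CSV (the values are invented for illustration).
df = pd.DataFrame({
    "Year": [1959, 1959, 1960, 1960, 1961, 1961],
    "Number of titles available": [1.0, 3.0, 10.0, 14.0, 30.0, 34.0],
})

# Step 1: one averaged value per year removes the repeated x values.
yearly = df.groupby("Year")["Number of titles available"].mean()

# Step 2 (optional): a centred rolling mean as an example smoothing filter.
smoothed = yearly.rolling(window=3, center=True, min_periods=1).mean()

print(yearly)
print(smoothed)
# yearly.plot(kind="line") or smoothed.plot(kind="line") would then draw the curve.
```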

Seaborn barplot - column values without estimator parameter

I am a beginner in seaborn plotting and noticed that sns.barplot computes the height of each bar using a parameter called estimator.
Is there a way for the barplot to show each individual value of a column instead of a statistical summary computed through the estimator parameter?
For instance, I have the following dataframe:
data = [["2019/oct",10],["2019/oct",20],["2019/oct",30],["2019/oct",40],["2019/nov",20],["2019/dec",30]]
df = pd.DataFrame(data, columns=['Period', 'Observations'])
I would like to plot all values from the Period "2019/oct" (10, 20, 30 and 40), but the bar chart returns the average of these values (25) for the period "2019/oct":
sns.barplot(x='Period',y='Observations',data=df,ci=None)
How can I bring all column values to the chart?
barplot combines values that share the same x, unless they have a different hue. If you want to keep the separate values for "2019/oct", you could create a new column that assigns each of them a different hue:
data = [["2019/oct",10],["2019/oct",20],["2019/oct",30],["2019/oct",40],["2019/nov",20],["2019/dec",30]]
df = pd.DataFrame(data, columns=['Period', 'Observations'])
df['subgroup'] = df.groupby('Period').cumcount()+1
sns.barplot(x='Period',y='Observations',hue='subgroup',data=df,ci=None)
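To see why this works, it helps to look at the helper column on its own. The sketch below uses only pandas and the data from the question; cumcount numbers the rows within each Period, so every (Period, subgroup) pair is unique and barplot has nothing left to aggregate.

```python
import pandas as pd

data = [["2019/oct", 10], ["2019/oct", 20], ["2019/oct", 30],
        ["2019/oct", 40], ["2019/nov", 20], ["2019/dec", 30]]
df = pd.DataFrame(data, columns=["Period", "Observations"])

# Number the rows inside each Period: 1, 2, 3, 4 for oct; 1 for nov; 1 for dec.
df["subgroup"] = df.groupby("Period").cumcount() + 1

print(df)
```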

Using a Pandas dataframe index as values for x-axis in matplotlib plot

I have a time series in a Pandas dataframe with a number of columns which I'd like to plot. Is there a way to set the x-axis to always use the index from a dataframe?
When I use the .plot() method from Pandas the x-axis is formatted correctly. However, when I pass my dates and the column(s) I'd like to plot directly to matplotlib, the graph doesn't plot correctly. Thanks in advance.
plt.plot(site2.index.values, site2['Cl'])
plt.show()
FYI: site2.index.values produces this (I've cut out the middle part for brevity):
array([
'1987-07-25T12:30:00.000000000+0200',
'1987-07-25T16:30:00.000000000+0200',
'2010-08-13T02:00:00.000000000+0200',
'2010-08-31T02:00:00.000000000+0200',
'2010-09-15T02:00:00.000000000+0200'
],
dtype='datetime64[ns]')
It seems the issue was that I had .values. Without it (i.e. site2.index) the graph displays correctly.
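A minimal sketch of that fix, passing the index itself rather than index.values. The site2 frame below is a hypothetical stand-in with invented Cl values and timezone-naive dates; the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for site2 (the Cl values are invented).
site2 = pd.DataFrame(
    {"Cl": [1.2, 0.8, 1.5]},
    index=pd.to_datetime(["1987-07-25 12:30", "1987-07-25 16:30", "2010-08-13 02:00"]),
)

fig, ax = plt.subplots()
# Pass the DatetimeIndex directly so matplotlib treats the x values as dates.
(line,) = ax.plot(site2.index, site2["Cl"])
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```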
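You can use plt.xticks to set the x-axis tick locations and labels. Plot against integer positions, then label those positions with the index values:
plt.plot(range(len(site2)), site2['Cl'].values)
plt.xticks(range(len(site2)), site2.index)  # locations, labels
plt.show()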
see the documentation for more details: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xticks
That's built right into the plot() method.
You can use yourDataFrame.plot(use_index=True) to put the DataFrame index on the x-axis.
Read More Here: https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.plot.html
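A short sketch of the pandas route. The site2 frame is again a hypothetical stand-in with invented values. Note that use_index=True is already the default for DataFrame.plot, so it is spelled out here only to make the intent explicit.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import pandas as pd

# Hypothetical stand-in for site2 (the Cl values are invented).
site2 = pd.DataFrame(
    {"Cl": [1.2, 0.8, 1.5]},
    index=pd.to_datetime(["1987-07-25 12:30", "1987-07-25 16:30", "2010-08-13 02:00"]),
)

# pandas puts the DatetimeIndex on the x-axis and returns a matplotlib Axes,
# which can be customised further with the usual matplotlib calls.
ax = site2.plot(y="Cl", use_index=True)
ax.set_ylabel("Cl")
```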
If, like me, you want matplotlib to select a 'sensible' set of ticks, one approach is to plot against integer positions, keep the default tick locations that correspond to a row, and relabel them with the matching index values:
fig, ax = plt.subplots()
ax.plot(range(len(site2)), site2['Cl'].values)  # plot against integer positions
x_ticks = [int(t) for t in ax.get_xticks() if t in range(len(site2))]  # default ticks that map to a row
ax.set_xticks(x_ticks)
ax.set_xticklabels(site2.index[x_ticks])  # label each tick with its index value
