The figure produced by the Python code below cuts off part of the axis labels. How can I avoid this? Did I miss a parameter in the sns call, or is this due to how I've set up my PyCharm IDE?
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv('gm_2008_region.csv')
df = df.drop('Region', axis=1)
plt.figure()
sns.heatmap(df.corr(), square=True, cmap='RdYlGn')
plt.show()
This is the resulting figure:
The .csv file can be found here.
Try adding plt.subplots_adjust(bottom=0.28) as follows:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv('gm_2008_region.csv')
df = df.drop('Region', axis=1)
plt.figure()
sns.heatmap(df.corr(), square=True, cmap='RdYlGn')
plt.subplots_adjust(bottom=0.28)
plt.show()
Giving you:
Alternatively, you can enlarge the figure by passing figsize to plt.figure, for example:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv('gm_2008_region.csv')
df = df.drop('Region', axis=1)
plt.figure(figsize=(12, 8))
sns.heatmap(df.corr(), square=True, cmap='RdYlGn')
plt.show()
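Another option, sketched here under assumptions, is to let matplotlib work out the margins instead of hard-coding bottom=0.28. The sketch uses random data with made-up long column names in place of gm_2008_region.csv, and plain imshow as a stand-in for sns.heatmap so that it does not depend on seaborn. The calls of interest are fig.tight_layout() and the bbox_inches='tight' argument of savefig; the output path is hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Synthetic stand-in for the CSV: six columns with deliberately long names
rng = np.random.default_rng(0)
cols = [f"a_rather_long_column_name_{i}" for i in range(6)]
df = pd.DataFrame(rng.normal(size=(50, 6)), columns=cols)

fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(df.corr(), cmap="RdYlGn", vmin=-1, vmax=1)
ax.set_xticks(range(len(cols)))
ax.set_xticklabels(cols, rotation=90)
ax.set_yticks(range(len(cols)))
ax.set_yticklabels(cols)
fig.colorbar(im, ax=ax)

# Ask matplotlib to make room for the tick labels inside the figure
fig.tight_layout()

# bbox_inches="tight" additionally crops or expands the saved image to the drawn content
out_path = os.path.join(tempfile.gettempdir(), "heatmap_tight.png")
fig.savefig(out_path, bbox_inches="tight")
plt.close(fig)
```

With seaborn, the same two calls should apply after sns.heatmap(...), since it draws onto an ordinary matplotlib Axes.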
I am trying to create a histogram for each variable in my dataset and then save each one as a PNG file.
My code is as follows:
import pandas as pd
import matplotlib.pyplot as plt
x=combined_databook.groupby('x_1').hist()
x.figure.savefig("x.png")
I keep getting "AttributeError: 'Series' object has no attribute 'figure'"
Use matplotlib to create the figure and axes objects, then tell pandas which axes to plot on using the ax argument. Finally, use matplotlib (via the fig object) to save the figure.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Sample Data (3 groups, normally distributed)
df = pd.DataFrame({'gp': np.random.choice(list('abc'), 1000),
'data': np.random.normal(0, 1, 1000)})
fig, ax = plt.subplots()
df.groupby('gp').hist(ax=ax, ec='k', grid=False, bins=20, alpha=0.5)
fig.savefig('your_fig.png', dpi=200)
your_fig.png
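If one PNG per group is wanted rather than a single overlaid figure, a loop over the groups is one way to do it. This is a minimal sketch, assuming the same kind of sample data as above; the output directory and the hist_<group>.png file names are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Sample data (3 groups, normally distributed), seeded for reproducibility
rng = np.random.default_rng(0)
df = pd.DataFrame({"gp": rng.choice(list("abc"), 300),
                   "data": rng.normal(0, 1, 300)})

out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = []
for name, group in df.groupby("gp"):
    # One figure per group, drawn with matplotlib's own hist
    fig, ax = plt.subplots()
    ax.hist(group["data"], bins=20, ec="k", alpha=0.5)
    ax.set_title(f"gp = {name}")
    path = os.path.join(out_dir, f"hist_{name}.png")
    fig.savefig(path, dpi=100)
    plt.close(fig)  # release the figure so memory does not grow with the group count
    saved.append(path)
```

Saving through the fig object avoids the AttributeError from the question, because the loop never relies on the return value of groupby(...).hist().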
Instead of using *.hist(), I would use matplotlib.pyplot.hist() and save through the figure object. The example below shows the save pattern with a simple line plot:
import matplotlib.pyplot as plt
import numpy as np
y = [10, 20, 30, 40, 100, 200, 300, 400, 1000, 2000]
x = np.arange(10)
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(x, y, label='y = Values')
plt.title('my plot')
ax.legend()
# Saving before plt.show() is the safer order: once the window is closed,
# pyplot no longer tracks the figure
fig.savefig('tada.png')
plt.show()
I have a pandas DataFrame that looks like this:
import pandas as pd
dt = pd.DataFrame({'var':[1,1,1,2,2,3,3,3,3,3]})
And I am creating a dist plot like this:
import seaborn as sns
fig = sns.distplot(dt['var'], norm_hist=False, kde=False, bins=3).get_figure()
And then I am saving this plot to a PDF:
from matplotlib.backends.backend_pdf import PdfPages
pdf = PdfPages('foo.pdf')
pdf.savefig(fig, height=10, width=18, dpi=500, bbox_inches='tight', pad_inches=0.5)
plt.close()
How can I change the title and the x-axis label of the plot?
I think you can use pyplot
Try:
plt.xlabel("x-axis")
plt.title("title")
import seaborn as sns
import matplotlib.pyplot as plt
fig = sns.distplot(dt['var'], norm_hist=False, kde=False, bins=3).get_figure()
plt.title("something")
plt.xlabel("something")
plt.ylabel("something")
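from matplotlib.backends.backend_pdf import PdfPages
# savefig has no height/width arguments; set the figure size instead (assuming inches were intended)
fig.set_size_inches(18, 10)
pdf = PdfPages('foo.pdf')
pdf.savefig(fig, dpi=500, bbox_inches='tight', pad_inches=0.5)
pdf.close()  # finalize the PDF file
plt.close()  # comment this out if you want the plot displayed in Jupyter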
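The same result can be sketched with the Axes methods set_title, set_xlabel and set_ylabel, which avoid relying on pyplot's notion of the "current" axes. The sketch below is an assumption-laden variant: it uses matplotlib's ax.hist in place of sns.distplot, the label texts and the output path are made up, and the figure size is set in inches through figsize.

```python
import os
import tempfile

import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

dt = pd.DataFrame({"var": [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]})
pdf_path = os.path.join(tempfile.gettempdir(), "foo_labeled.pdf")  # hypothetical path

# Size is given in inches when the figure is created
fig, ax = plt.subplots(figsize=(18, 10))
ax.hist(dt["var"], bins=3)

# Title and axis labels are set on the Axes object itself
ax.set_title("Distribution of var")
ax.set_xlabel("var")
ax.set_ylabel("count")

# The context manager closes (finalizes) the PDF on exit
with PdfPages(pdf_path) as pdf:
    pdf.savefig(fig, bbox_inches="tight", pad_inches=0.5)
plt.close(fig)
```

Because a seaborn plotting call such as distplot returns a matplotlib Axes, the same set_title and set_xlabel calls should work on its return value as well.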
I have a time series that I plot together with its autocorrelation:
import pandas as pd
from statsmodels.graphics.tsaplots import plot_acf
import matplotlib.pyplot as plt
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/wwwusage.csv', names=['value'], header=0)
fig, axes = plt.subplots(2, sharex=True)
axes[0].plot(df.value)
plot_acf(df.value, ax=axes[1])
plt.show()
This produces this plot, but I expected this plot.
If I use the plain acf function instead of plot_acf, I get a few more values in the plot, but still not all of them:
import pandas as pd
from statsmodels.tsa.stattools import acf
import matplotlib.pyplot as plt
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/wwwusage.csv', names=['value'], header=0)
fig, axes = plt.subplots(2, sharex=True)
axes[0].plot(df.value)
axes[1].plot(acf(df.value))
plt.show()
Why is that? I use the same variable df.value in both plots.
Edit:
If I use pandas, I get this plot, which doesn't seem right. I would really like to use the first function I mentioned, since it gives the best visualisation:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/wwwusage.csv', names=['value'], header=0)
fig, axes = plt.subplots(2, sharex=True)
axes[0].plot(df.value)
df_value_acf = [df.value.autocorr(i) for i in range(1,len(df.value))]
axes[1].plot(df_value_acf)
plt.show()
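A possible explanation, offered as an assumption since the plots are not reproduced here: plot_acf and acf only compute a limited number of lags unless told otherwise (through their lags and nlags arguments respectively), while sharex=True stretches the lag axis to the full length of the series, so the correlogram occupies only the left part of the panel. The sketch below illustrates the idea with NumPy only. The sample_acf helper is hypothetical, the random-walk series is a synthetic stand-in for wwwusage.csv, and the normalisation (full-series sum of squares in the denominator) is my assumption about the usual default.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def sample_acf(x, nlags):
    """Sample autocorrelation for lags 0..nlags (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)  # lag-0 sum of squares, used for every lag
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(nlags + 1)])


# Synthetic stand-in for the 'value' column: a random walk of 100 points
rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=100))

# Request every available lag explicitly instead of relying on a default
acf_vals = sample_acf(series, nlags=len(series) - 1)

# No sharex: time index and lag are different quantities
fig, axes = plt.subplots(2)
axes[0].plot(series)
axes[0].set_xlabel("time index")
lags = np.arange(len(acf_vals))
axes[1].vlines(lags, 0, acf_vals)
axes[1].axhline(0, color="k", linewidth=0.8)
axes[1].set_xlabel("lag")
fig.tight_layout()
```

Estimates at very large lags rest on only a handful of observation pairs, which may be why the pandas-based plot over all lags looks erratic.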
I'm trying to plot dictionary data with matplotlib in Python 3.6 on macOS.
I want the keys of the dict to be shown as tick labels, but they are not showing up.
My code is below:
import pandas as pd
import glob
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
%matplotlib inline
figure(num=None, figsize=(500, 100), dpi=80, facecolor='w', edgecolor='k')
D = info_dict
x = list(D.keys())
y = list(D.values())
plt.bar(x,y)
plt.xticks(range(len(D)), list(D.values()), rotation='vertical')
plt.margins(0.2)
plt.subplots_adjust(bottom=0.15)
plt.show()
The resulting plot looks like this:
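Two things in the code above stand out: plt.xticks is given list(D.values()) as labels rather than the keys, and figsize=(500, 100) at dpi=80 asks for an image of 40000 by 8000 pixels, in which normal-sized tick labels would be tiny. The following is a minimal sketch of a version that labels the bars with the keys; the contents of info_dict are made up, since the original dictionary is not shown.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical stand-in for info_dict
info_dict = {"alpha": 3, "beta": 7, "gamma": 5}

labels = list(info_dict.keys())
heights = list(info_dict.values())
positions = list(range(len(labels)))

# A moderate figure size keeps the tick labels legible
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(positions, heights)

# Put one tick under each bar and label it with the dictionary key
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation="vertical")
fig.tight_layout()
```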
I'm trying to find a way to make the row heights of a pandas DataFrame plot table fit their content. If that is not possible, is there an alternative way to draw this kind of plot?
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
fig, ax = plt.subplots(1, 1,figsize =(8,2))
df = pd.DataFrame(np.round(np.random.rand(5, 3),2), columns=['a', 'b', 'c'],index=['1\na','2\na','3\na','4\na','5\na'])
ax.get_xaxis().set_visible(False) # Hide Ticks
df.plot(table=True, ax=ax)
fig.dpi = 600
Just remove every "\na" from your DataFrame index list and your problem should be solved.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
fig, ax = plt.subplots(1, 1,figsize =(8,2))
df = pd.DataFrame(np.round(np.random.rand(5, 3),2), columns=['a', 'b', 'c'],index=['1','2','3','4','5'])
ax.get_xaxis().set_visible(False) # Hide Ticks
df.plot(table=True, ax=ax)
fig.dpi = 600
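If the two-line labels need to stay, one alternative is to draw the table separately with pandas.plotting.table and enlarge its rows with the matplotlib Table.scale method. This is a sketch under assumptions: the factor 2 and the bottom margin of 0.4 are guesses that suit two-line labels at this figure size, and the data is transposed by hand to mimic the layout that df.plot(table=True) produces.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from pandas.plotting import table

rng = np.random.default_rng(0)
df = pd.DataFrame(np.round(rng.random((5, 3)), 2),
                  columns=["a", "b", "c"],
                  index=["1\na", "2\na", "3\na", "4\na", "5\na"])

fig, ax = plt.subplots(1, 1, figsize=(8, 4))
ax.get_xaxis().set_visible(False)  # hide ticks, as in the question
df.plot(ax=ax)                     # plot without the built-in table

# Draw the table below the axes; df.T puts the index labels in the header row
tbl = table(ax, df.T, loc="bottom")

# Double every row height so the two-line header labels fit
height_before = tbl.get_celld()[(0, 0)].get_height()
tbl.scale(1, 2)
height_after = tbl.get_celld()[(0, 0)].get_height()

# Leave room under the axes for the taller table
fig.subplots_adjust(bottom=0.4)
```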