In seaborn, how to increase the size of a chart and save it as an image? - python

In python3 and pandas I have this dataframe:
gastos_anuais.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 2 columns):
ano 5 non-null int64
valor_pago 5 non-null float64
dtypes: float64(1), int64(1)
memory usage: 280.0 bytes
gastos_anuais.reset_index()
index ano valor_pago
0 0 2014 13,082,008,854.37
1 3 2017 9,412,069,205.73
2 2 2016 7,617,420,559.22
3 1 2015 7,470,391,492.24
4 4 2018 7,099,199,179.11
I did a pointplot chart:
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
sns.pointplot(x='ano', y='valor_pago', data=gastos_anuais)
plt.xticks(rotation=65)
plt.grid(True, linestyle="--")
plt.title("Gastos Destinados pelo Governo Federal (2014-2018)\n")
plt.xlabel("Anos")
plt.ylabel("Em bilhões de R$")
plt.show()
It worked. But I would like to:
Increase the size of the chart that appears on the screen
Save the chart as an image file, for example a .jpeg
Understand why '1e10' appears below the title of the graph
Please, does anyone know how I can do it?

Increase the size of the chart that appears on the screen
Add sns.set(rc={'figure.figsize':(w, h)}) before plotting. For example:
sns.set(rc={'figure.figsize':(20, 5)})
Save as jpg
Keep a reference to the plot, get the figure and save it:
p = sns.pointplot(x='ano', y='valor_pago', data=gastos_anuais)
plt.xticks(rotation=65)
#...
# All your other customizations with `plt`
#...
fig = p.get_figure()
fig.savefig("gastos_anuais.jpg")
What is the 1e10 in the corner?
It is the scale of the axis. It means that the values shown on the y axis must be multiplied by 10^10 to recover the actual values of the data.
If you want to remove it, you can use:
plt.ticklabel_format(style='plain', axis='y')
The tick labels will then show the full values, which are very long and crowd the image, so you will probably want to rescale the data as well (for example, divide it by 1e9 so the axis reads in billions).
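The three points above can be combined into one short sketch. It assumes the figure size is set through `plt.subplots(figsize=...)` rather than `sns.set`, rebuilds the dataframe from the values shown in the question, and introduces a hypothetical `valor_bilhoes` column and `out_path` variable for illustration. It is one possible way to do it, not the only one.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Data copied from the question
gastos_anuais = pd.DataFrame({
    "ano": [2014, 2017, 2016, 2015, 2018],
    "valor_pago": [13082008854.37, 9412069205.73, 7617420559.22,
                   7470391492.24, 7099199179.11],
})

# Rescale to billions so the axis no longer needs the 1e10 offset label
# ("valor_bilhoes" is a name made up for this example)
gastos_anuais["valor_bilhoes"] = gastos_anuais["valor_pago"] / 1e9

# Create the figure with an explicit size (width, height in inches)
fig, ax = plt.subplots(figsize=(12, 5))
sns.pointplot(x="ano", y="valor_bilhoes", data=gastos_anuais, ax=ax)
ax.set_title("Gastos Destinados pelo Governo Federal (2014-2018)\n")
ax.set_xlabel("Anos")
ax.set_ylabel("Em bilhões de R$")
ax.grid(True, linestyle="--")

# Save before calling plt.show(); the file extension selects the format
out_path = os.path.join(tempfile.gettempdir(), "gastos_anuais.jpg")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
```

Passing `ax=ax` keeps the seaborn plot on the figure you created, so the size and the saved file refer to the same object.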

Related

How to display only certain bins according to bin height with a pyplot histogram [duplicate]

I have used the pandas value_counts function to provide counts of unique values:
CountStatus = pd.value_counts(df['scstatus'].values, sort=True)
Output:
200 133809
304 7217
404 2176
302 740
500 159
403 4
301 1
dtype: int64
I now want to plot these values using matplotlib i.e plt.barh(CountStatus), however I keep getting the error: ValueError: incompatible sizes: argument 'width' must be length 7 or scalar.
I'm guessing this may have something to do with the left hand column being an index column. Is there a way around this to obtain a horizontal bar chart? Do I need to convert it or specify something else in the function?
Update
pandas.Series.value_counts is a Series method
Plot with pandas.Series.plot with kind='bar' or kind='barh'
import seaborn as sns
# test data, loads a pandas dataframe
df = sns.load_dataset('planets')
# display(df.head(3))
method number orbital_period mass distance year
0 Radial Velocity 1 269.300 7.10 77.40 2006
1 Radial Velocity 1 874.774 2.21 56.95 2008
2 Radial Velocity 1 763.000 2.60 19.84 2011
# plot value_counts of Series
ax = df.method.value_counts().plot(kind='barh')
ax.set_xscale('log')
Original Answer
I think you can use barh:
CountStatus.plot.barh()
Sample:
CountStatus = df['scstatus'].value_counts(sort=True)
print(CountStatus)
AAC 8
AA 7
ABB 4
dtype: int64
CountStatus.plot.barh()
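If you prefer to call matplotlib directly, the error in the question comes from `barh` needing both the bar positions and the bar widths, while only one argument was passed. A minimal sketch, assuming the counts from the question are rebuilt by hand into a Series named `count_status` (a name chosen for this example):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the question's output; the index holds the status codes
count_status = pd.Series(
    [133809, 7217, 2176, 740, 159, 4, 1],
    index=[200, 304, 404, 302, 500, 403, 301],
)

fig, ax = plt.subplots()
# First argument: bar positions (status codes as category labels)
# Second argument: bar widths (the counts)
bars = ax.barh(count_status.index.astype(str), count_status.to_numpy())
ax.set_xscale("log")  # the counts span several orders of magnitude
ax.invert_yaxis()     # largest count at the top
```

Converting the index to strings makes matplotlib treat the status codes as categories instead of spreading them along a numeric axis.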

How to visualize columns of a Python dataframe as a plot?

I have a dataframe that looks like below:
DateTime ID Temperature
2019-03-01 18:36:01 3 21
2019-04-01 18:36:01 3 21
2019-18-01 08:30:01 2 18
2019-12-01 18:36:01 2 12
I would like to visualize this as a plot, with DateTime on the x axis, Temperature on the y axis, and the IDs as hue. I tried the code below, but I need to see the Temperature distribution for every point more clearly. Is there any other visualization technique?
x= df['DateTime'].values
y= df['Temperature'].values
hue=df['ID'].values
plt.scatter(x, y,hue,color = "red")
You can try:
df.set_index('DateTime').plot()
Or, with point markers and a larger figure:
df.set_index('DateTime').plot(style="x-", figsize=(15, 10))
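To get one color per ID, which is what the hue in the question asks for, a possible sketch is to draw one line per group. It assumes the timestamps are in year-day-month order (the value `2019-18-01` only parses that way) and rebuilds the sample rows by hand.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "DateTime": ["2019-03-01 18:36:01", "2019-04-01 18:36:01",
                 "2019-18-01 08:30:01", "2019-12-01 18:36:01"],
    "ID": [3, 3, 2, 2],
    "Temperature": [21, 21, 18, 12],
})

# Assumption: the strings are year-day-month, since month 18 does not exist
df["DateTime"] = pd.to_datetime(df["DateTime"], format="%Y-%d-%m %H:%M:%S")

fig, ax = plt.subplots(figsize=(10, 4))
# One line per ID; matplotlib cycles colors automatically
for id_value, group in df.sort_values("DateTime").groupby("ID"):
    ax.plot(group["DateTime"], group["Temperature"],
            marker="o", label=f"ID {id_value}")
ax.set_xlabel("DateTime")
ax.set_ylabel("Temperature")
ax.legend()
```

Note that in `plt.scatter(x, y, hue, color="red")` the third positional argument is the marker size, not a hue, which is why the original attempt did not separate the IDs.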

Seaborn distplot only whole numbers

How can I make a distplot with seaborn to only have whole numbers?
My data is an array of numbers between 0 and ~18. I would like to plot the distribution of the numbers.
Impressions
0 210
1 1084
2 2559
3 4378
4 5500
5 5436
6 4525
7 3329
8 2078
9 1166
10 586
11 244
12 105
13 51
14 18
15 5
16 3
dtype: int64
Code I'm using:
sns.distplot(Impressions,
# bins=np.arange(Impressions.min(), Impressions.max() + 1),
# kde=False,
axlabel=False,
hist_kws={'edgecolor':'black', 'rwidth': 1})
plt.xticks = range(current.Impressions.min(), current.Impressions.max() + 1, 1)
Plot looks like this:
What I'm expecting:
The xlabels should be whole numbers
Bars should touch each other
The kde line should simply connect the top of the bars. By the looks of it, the current one assumes to have 0s between (x, x + 1), hence why the downward spike (This isn't required, I can turn off kde)
Am I using the correct tool for the job or distplot shouldn't be used for whole numbers?
Your problem can be solved with the code below:
import seaborn as sns # for data visualization
import numpy as np # for numeric computing
import matplotlib.pyplot as plt # for data visualization
arr = np.array([1,2,3,4,5,6,7,8,9])
sns.distplot(arr, bins = arr, kde = False)
plt.xticks(arr)
plt.show()
This way, you can plot a histogram using seaborn's sns.distplot() function.
Note: whatever data you pass to bins and plt.xticks() should be in ascending order.
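One detail worth checking with integer data is where the bin edges fall. A common approach is to put the edges at half-integers so each bar is centered on a whole number and the bars touch. The sketch below uses plain matplotlib and made-up Poisson data (`impressions` is illustrative, not the asker's data). In seaborn 0.11 and later, where `distplot` is deprecated, `sns.histplot(data, discrete=True)` should give a similar result.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up integer data standing in for the real array
rng = np.random.default_rng(0)
impressions = rng.poisson(lam=5, size=1000)

# Edges at -0.5, 0.5, 1.5, ... so every integer gets its own bin
edges = np.arange(impressions.min(), impressions.max() + 2) - 0.5

fig, ax = plt.subplots()
counts, _, _ = ax.hist(impressions, bins=edges, edgecolor="black")

# Whole-number tick labels; note this calls the method instead of assigning to it
ax.set_xticks(np.arange(impressions.min(), impressions.max() + 1))
```

In the question, `plt.xticks = range(...)` assigns to the function instead of calling it, so it has no effect on the plot and overwrites `plt.xticks` for the rest of the session.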

Pandas: Histogram Plotting

I have a dataframe with dates (datetime) in python. How can I plot a histogram with 30 min bins from the occurrences using this dataframe?
starttime
1 2016-09-11 00:24:24
2 2016-08-28 00:24:24
3 2016-07-31 05:48:31
4 2016-09-11 00:23:14
5 2016-08-21 00:55:23
6 2016-08-21 01:17:31
.............
989872 2016-10-29 17:31:33
989877 2016-10-02 10:00:35
989878 2016-10-29 16:42:41
989888 2016-10-09 07:43:27
989889 2016-10-09 07:42:59
989890 2016-11-05 14:30:59
I have tried looking at examples from Plotting series histogram in Pandas and A per-hour histogram of datetime using Pandas. But they seem to be using a bar plot which is not what I need. I have attempted to create the histogram using temp.groupby([temp["starttime"].dt.hour, temp["starttime"].dt.minute]).count().plot(kind="hist") giving me the results as shown below
If possible, I would like the x axis to display the time (e.g. 07:30:00).
I think you need a bar plot. To get an axis with times, the simplest approach is to convert the datetimes to strings with strftime:
temp = temp.resample('30min', on='starttime').count()
ax = temp.groupby(temp.index.strftime('%H:%M')).sum().plot(kind="bar")
# hide some tick labels so the axis stays readable
spacing = 2
visible = ax.xaxis.get_ticklabels()[::spacing]
for label in ax.xaxis.get_ticklabels():
    if label not in visible:
        label.set_visible(False)
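An alternative sketch of the same idea floors each timestamp to its 30-minute slot and counts the occurrences per time of day. It uses only the first six rows shown in the question, and the names `half_hour` and `counts` are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# First six rows copied from the question
temp = pd.DataFrame({"starttime": pd.to_datetime([
    "2016-09-11 00:24:24", "2016-08-28 00:24:24", "2016-07-31 05:48:31",
    "2016-09-11 00:23:14", "2016-08-21 00:55:23", "2016-08-21 01:17:31",
])})

# Round each timestamp down to its 30-minute slot and keep only the time of day
half_hour = temp["starttime"].dt.floor("30min").dt.strftime("%H:%M")

# Count occurrences per slot, ordered by time
counts = half_hour.value_counts().sort_index()

ax = counts.plot(kind="bar", width=1.0, edgecolor="black")
ax.set_xlabel("time of day")
ax.set_ylabel("occurrences")
```

Setting `width=1.0` makes the bars touch, so the bar plot reads like a histogram. Slots with no occurrences are simply absent here; reindexing `counts` over the full list of 48 half-hour labels would show them as empty bars.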

how to plot two barh in one axis in pyqtgraph?

I need something like this:
demo data:
bottom10
Out[12]:
0 -9.823127e+08
1 -8.069270e+08
2 -6.030317e+08
3 -5.709379e+08
4 -5.224355e+08
5 -4.755464e+08
6 -4.095561e+08
7 -3.989287e+08
8 -3.885740e+08
9 -3.691114e+08
Name: amount, dtype: float64
top10
Out[13]:
0 9.360520e+08
1 9.078776e+08
2 6.603838e+08
3 4.967611e+08
4 4.409362e+08
5 3.914972e+08
6 3.547471e+08
7 3.538894e+08
8 3.368558e+08
9 3.189895e+08
Name: amount, dtype: float64
The same question for matplotlib is here: how to plot two barh in one axis
But pyqtgraph has no ax.twiny(). Is there another way?
I found an item class, BarGraphItem, which is not listed in the official documentation (PyQtGraph's Widgets List). It can be rotated with rotate() to produce horizontal bars like matplotlib's barh. It's not perfect, but it works!
import pyqtgraph as pg
import pandas as pd
import numpy as np
bottom10 = pd.DataFrame({'amount':-np.sort(np.random.rand(10))})
top10 = pd.DataFrame({'amount':np.sort(np.random.rand(10))[::-1]})
maxtick=max(top10.amount.max(),-bottom10.amount.min())*1.3
win1 = pg.plot()
axtop=pg.BarGraphItem(x=range(len(top10)),height=top10.amount,width=0.6,brush='r')
axtop.rotate(-90)
win1.addItem(axtop)
axbt=pg.BarGraphItem(x=range(len(top10)),height=-bottom10.amount,y0=maxtick+bottom10.amount,width=0.6,brush='g')
axbt.rotate(-90)
win1.addItem(axbt)
