I have imported a file from Excel into Python. I am using pandas and matplotlib. The Excel spreadsheet contains two columns, one for discharge and one for frequency. I would like to make a histogram of the data and put the discharge on the x-axis and frequency on the y-axis.
Since you already have the frequency for each discharge value, i.e. the histogram bins are already computed, you can simply draw a bar chart from your data.
df.plot.bar(x='discharge', y='frequency', rot=0)
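A minimal sketch of that call on a small frame. The column names `discharge` and `frequency` come from the question; the numbers are invented for illustration and would normally come from the imported spreadsheet.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values standing in for the data read from Excel
df = pd.DataFrame({
    "discharge": [10, 20, 30, 40, 50],
    "frequency": [3, 7, 12, 6, 2],
})

# One bar per row: discharge supplies the x labels, frequency the bar heights
ax = df.plot.bar(x="discharge", y="frequency", rot=0)
ax.set_xlabel("discharge")
ax.set_ylabel("frequency")
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end.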
Related
I have a very large DataFrame that I want to plot. I created a smaller one for demonstration here:
How can I plot this DataFrame? I would like the columns on the x-axis, with each bar showing the sum of that column's values.
I am trying to plot a column from a dataframe. There are about 8500 rows and the Assignment group column has about 70+ categories. How do I plot this visually using seaborn to get some meaningful output?
nlp_data['Assignment group'].hist(figsize=(17,7))
I used the hist() method to plot
You can use a heatmap for such data:
seaborn.heatmap
This may be a very stupid question, but when plotting a Pandas DataFrame using .plot() it is very quick and produces a graph with an appropriate index. As soon as I try to change this to a bar chart, it just seems to lose all formatting and the index goes wild. Why is this the case? And is there an easy way to just plot a bar chart with the same format as the line chart?
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['Date'] = pd.date_range(start='2012-01-01', end='2018-12-31')  # ISO dates avoid day/month ambiguity
df['Value'] = np.random.randint(low=5, high=100, size=len(df))
df.set_index('Date', inplace=True)
df.plot()
plt.show()
df.plot(kind='bar')
plt.show()
Update:
For comparison, if I put the data into Excel and create a line plot and a bar ('column') plot, Excel converts the plot instantly and keeps the axis labels as they were for the line plot. If I try to produce many (thousands of) bar charts in Python from years of daily data, it takes a long time. Is there an equivalent way of doing this Excel transformation in Python?
Pandas bar plots are categorical in nature, i.e. each bar is a separate category and gets its own label. Plotting numeric bar plots (in the same manner as line plots) is not currently possible with pandas.
In contrast matplotlib bar plots are numerical if the input data is numbers or dates. So
plt.bar(df.index, df["Value"])
produces a bar chart whose x-axis is a true date axis.
Note, however, that because there are 2557 data points in your dataframe, distributed over only a few hundred pixels, not all bars are actually drawn. Conversely, if you want every bar to be shown, each one needs to be at least one pixel wide in the final image. With 5% margins on each side, this means your figure needs to be more than 2800 pixels wide, or be saved in a vector format.
So rather than showing daily data, maybe it makes sense to aggregate to monthly or quarterly data first.
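As a rough sketch of that aggregation, assuming the same kind of daily data as in the question: the values are resampled to one number per month and drawn with matplotlib's numeric bar plot. Taking the mean is an arbitrary choice here (a sum or maximum may suit your data better), and the bar width of 20 days is just a value that leaves a visible gap between months.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Daily random data, as in the question
idx = pd.date_range(start="2012-01-01", end="2018-12-31", freq="D")
df = pd.DataFrame({"Value": np.random.randint(low=5, high=100, size=len(idx))}, index=idx)

# Aggregate to one mean per calendar month, labelled by the month's first day
monthly = df["Value"].resample("MS").mean()

fig, ax = plt.subplots(figsize=(10, 4))
# On a date axis the width is measured in days
ax.bar(monthly.index, monthly.values, width=20, align="edge")
ax.set_ylabel("Mean monthly value")
```

This reduces 2557 bars to 84, which fits comfortably in an ordinary figure.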
The default .plot() connects all your data points with straight lines and produces a line plot.
On the other hand, .plot(kind='bar') draws each data point as a discrete bar with its own tick label. To get properly formatted labels on the x-axis, you will have to modify the tick labels after plotting.
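One way to adjust the tick labels afterwards is sketched below. It relies on pandas placing the bars at integer positions 0, 1, 2, ... and keeps a tick only where a new month starts. A single year of data is used to keep the example quick; the label format and rotation are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# One year of daily random data
idx = pd.date_range(start="2018-01-01", end="2018-12-31", freq="D")
df = pd.DataFrame({"Value": np.random.randint(low=5, high=100, size=len(idx))}, index=idx)

ax = df.plot(kind="bar", width=1.0, legend=False)

# Bars sit at positions 0..n-1; keep a tick only on the first day of each month
positions = [i for i, day in enumerate(df.index) if day.day == 1]
labels = [df.index[i].strftime("%b %Y") for i in positions]
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45, ha="right")
```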
I have a dataset and I want to find out how the values of several numeric columns differ across two groups ('group' is a column that takes the value 'high' or 'low').
I want to draw several bar plots with a layout and aesthetics similar to Seaborn's FacetGrid or PairGrid. Each plot will have a different y variable but the same x-axis (the group variable).
This is what I have so far:
sns.catplot(x='group', y='Number of findings (total)', kind="bar",
palette="muted", data=df)
But I would like to write a loop that replaces my y variable with a different column each time. How can I do that?
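A possible sketch of such a loop, calling `sns.catplot` once per column and keeping each returned grid in a dictionary. Only `group` and `Number of findings (total)` are names from the question; the other two columns and all of the values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical data: the last two column names are invented placeholders
df = pd.DataFrame({
    "group": ["high", "low", "high", "low", "high", "low"],
    "Number of findings (total)": [12, 5, 9, 4, 14, 6],
    "Score A": [0.8, 0.3, 0.7, 0.4, 0.9, 0.2],
    "Score B": [30, 22, 28, 25, 33, 20],
})

# Every column except the grouping variable becomes the y variable of one plot
value_columns = [col for col in df.columns if col != "group"]

grids = {}
for col in value_columns:
    # Each call creates its own figure; the FacetGrid is stored for later use
    grids[col] = sns.catplot(x="group", y=col, kind="bar", data=df)
```

Each grid can then be saved separately, for example with `grids[col].savefig(...)`.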
I have a large dataset with more than 20k rows and about 6 columns.
I can plot a graph without trouble when the values are in columns, but this case is different because each series is stored in a row.
Input:
Title,2016,2015,2014,2013,2012
aaa,45,76,23,765,65
bbb,245,633,35,75,654
ccc,74,74,23,764,864
ddd,34,63,235,63,244
I am trying to plot an individual bar/line/scatter graph for every row, with:
graph title: the name from the Title column
x-axis: the year headers 2016, 2015, 2014, 2013, 2012
y-axis: the values in that row
Each row's plot should be saved automatically to its own file, using the title as the file name.
I have tried Bokeh and Seaborn, but I am not able to plot one row after another.
I appreciate the help. Thanks in advance!
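One possible approach with pandas and matplotlib, sketched on the sample input above: iterate over the rows, draw one figure per row, and save it under the row's title. The output folder name `row_plots` and the PNG format are assumptions made for the example, and a bar chart is used here although `ax.plot` or `ax.scatter` would work the same way.

```python
import io
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample input from the question; a real file would be read with pd.read_csv(path)
csv_text = """Title,2016,2015,2014,2013,2012
aaa,45,76,23,765,65
bbb,245,633,35,75,654
ccc,74,74,23,764,864
ddd,34,63,235,63,244
"""
df = pd.read_csv(io.StringIO(csv_text), index_col="Title")

out_dir = Path("row_plots")  # hypothetical output folder
out_dir.mkdir(exist_ok=True)

saved = []
for title, row in df.iterrows():
    fig, ax = plt.subplots()
    # row.index holds the year headers, row.values the numbers for this title
    ax.bar(row.index, row.values)
    ax.set_title(title)
    ax.set_xlabel("Year")
    path = out_dir / f"{title}.png"
    fig.savefig(path)
    plt.close(fig)  # release the figure; important with thousands of rows
    saved.append(path)
```

Closing each figure after saving keeps memory use flat, which matters for a dataset with more than 20k rows.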