I have a very large DataFrame that I want to plot. I created a smaller one for demonstration here:
How can I plot this DataFrame? I would like the columns on the x-axis, with one bar per column showing the sum of all values in that column.
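One possible approach, sketched under assumptions: the demonstration DataFrame is not shown, so the frame below (columns `a`, `b`, `c`) is a hypothetical stand-in. The idea is to sum each column with `DataFrame.sum()` and draw the resulting Series as a bar chart.

```python
import pandas as pd

# Hypothetical stand-in for the demonstration DataFrame (not from the question)
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Sum each column; the result is a Series indexed by column name
column_totals = df.sum()

# One bar per column; the column names become the x-axis tick labels
ax = column_totals.plot.bar(rot=0)
ax.set_ylabel("sum of values")
```

Call `matplotlib.pyplot.show()` afterwards to display the figure. Summing first means only one number per column is plotted, which should stay fast even for a very large frame.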
Related
I have imported a file from Excel into Python. I am using pandas and matplotlib. The Excel spreadsheet contains two columns, one for discharge and one for frequency. I would like to make a histogram of the data and put the discharge on the x-axis and frequency on the y-axis.
Since you already have the frequency for each discharge value, i.e. all the histogram bins are present, you can just create a bar chart from your data.
df.plot.bar(x='discharge', y='frequency', rot=0)
Suppose I have a DataFrame whose index is composed of two columns, and I want to plot it:
import pandas
from matplotlib import pyplot as plot
df=pandas.DataFrame(data={'floor':[1,1,1,2,2,2,3,3],'room':[1,2,3,1,1,2,1,3],'count':[1, 1, 3,2,2,4,1,5]})
df2=df.groupby(['floor','room']).sum()
df2.plot()
plot.show()
The above example results in a plot where row numbers are used for the x-axis and there are no tick labels. Is there any facility for using the index instead?
Say I'd like the x-axis divided into even sections for the first index level, with the values for the second index level spread out inside those sections.
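A minimal sketch of one way to get that layout, assuming grouped bars are acceptable in place of points: `unstack` moves the `room` level of the index into the columns, so each floor becomes one section on the x-axis and its rooms are drawn side by side within it.

```python
import pandas as pd

df = pd.DataFrame({
    "floor": [1, 1, 1, 2, 2, 2, 3, 3],
    "room": [1, 2, 3, 1, 1, 2, 1, 3],
    "count": [1, 1, 3, 2, 2, 4, 1, 5],
})
df2 = df.groupby(["floor", "room"]).sum()

# Move the 'room' index level into the columns:
# one row per floor, one column per room (NaN where a room is absent)
per_floor = df2["count"].unstack("room")

# Grouped bars: one even section per floor, that floor's rooms side by side
ax = per_floor.plot.bar(rot=0)
ax.set_ylabel("count")
```

Note that the `groupby(...).sum()` step adds up the two rows for floor 2, room 1, so that bar shows 4. The floor values appear as tick labels because they are now the index of the plotted frame.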
I have a dataset and I want to find out how the values of several numeric columns differ between two groups ('group' is a column that takes the value 'high' or 'low').
I want to draw several bar plots using a system and aesthetics similar to Seaborn's FacetGrid or PairGrid. Each plot will have a different y variable but the same x-axis (the group variable).
This is what I have so far:
sns.catplot(x='group', y='Number of findings (total)', kind="bar",
palette="muted", data=df)
But I would like to write a loop that replaces my y variable with different variables. How can I do that?
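A rough sketch of such a loop, under assumptions: the data and the second column name (`Number of reports`) are hypothetical, and only `group` and `Number of findings (total)` come from the question. Each pass calls `sns.catplot` with a different `y` column and keeps the returned grid in a dictionary.

```python
import pandas as pd
import seaborn as sns

# Hypothetical data; only 'group' and the first y column come from the question
df = pd.DataFrame({
    "group": ["high", "low", "high", "low", "high", "low"],
    "Number of findings (total)": [5, 2, 7, 3, 6, 1],
    "Number of reports": [12, 9, 15, 8, 11, 10],  # hypothetical column
})

y_columns = ["Number of findings (total)", "Number of reports"]

# One catplot per y column; each call creates its own figure
grids = {}
for col in y_columns:
    grids[col] = sns.catplot(x="group", y=col, kind="bar", data=df)
```

Each entry in `grids` is a separate figure, so it can be saved on its own, for example with `grids[col].savefig(...)`. If all plots should share one figure instead, reshaping the frame to long form with `DataFrame.melt` and passing `col=` to `catplot` may be worth trying.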
I want a stacked histogram where the different classes are visible.
At the moment I have the histogram without classes with this code:
plt.hist(hist_matrix2.column_name)
which produces this histogram:
and another histogram with the same data, that is grouped by the classes with this code:
hist_matrix2.groupby("number").column_name.plot.hist(alpha=0.5, bins = [0,5,10,15,20,25,30], stacked = True)
which produces this histogram:
As you can see, the classes are there, but the histogram is not stacked, although the parameter is set. What can I do to stack the classes?
plt.hist has a built-in stacking flag you can set:
plt.hist(hist_matrix2.column_name, stacked=True)
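Edit, in response to your question: stacked=True only has a visible effect when plt.hist receives several datasets, so for long data (with multiple levels to stack) you first need to restructure the data into a list with one series per class: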
wide = hist_matrix2.pivot(columns='number', values='column_name')
# Pivoting creates many missing values, so drop them from each column
widelist = [wide[col].dropna() for col in wide.columns]
# Passing the list of series draws the stacked histogram
plt.hist(widelist, stacked=True)
plt.show()
I have a large dataset with more than 20k rows and about 6 columns.
I can very well plot a graph when the values are in columns. But this is a different case.
Input:
Title,2016,2015,2014,2013,2012
aaa,45,76,23,765,65
bbb,245,633,35,75,654
ccc,74,74,23,764,864
ddd,34,63,235,63,244
I am trying to plot an individual bar/line/scatter graph for every row, with:
graph title: the name from the Title column
x axis: the year headers 2016, 2015, 2014, 2013, 2012
y axis: the values in the individual row.
Each row's plot should be saved automatically to its own file, using the title as the file name.
I have tried with bokeh and Seaborn but I am not able to plot one row after another.
I'd appreciate the help. Thanks in advance!
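One way this could be done with pandas and matplotlib, as a sketch under assumptions: the sample input is read from an inline string, the output folder `row_plots` is a hypothetical name, and every title is assumed to be a valid file name. The loop iterates over the rows with `iterrows`, draws one bar chart per row, and saves it under the row's title.

```python
import io
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: figures are only written to disk
import matplotlib.pyplot as plt
import pandas as pd

csv_text = """Title,2016,2015,2014,2013,2012
aaa,45,76,23,765,65
bbb,245,633,35,75,654
ccc,74,74,23,764,864
ddd,34,63,235,63,244
"""
# Use the Title column as the index so each row is labelled by its title
df = pd.read_csv(io.StringIO(csv_text), index_col="Title")

out_dir = Path("row_plots")  # hypothetical output folder
out_dir.mkdir(exist_ok=True)

for title, row in df.iterrows():
    fig, ax = plt.subplots()
    # The row is a Series indexed by year; sort so years run left to right
    row.sort_index().plot(kind="bar", ax=ax, rot=0)
    ax.set_title(title)
    ax.set_xlabel("year")
    ax.set_ylabel("value")
    fig.savefig(out_dir / f"{title}.png")
    plt.close(fig)  # release the figure; matters with ~20k rows
```

Replacing `kind="bar"` with `kind="line"` gives a line graph instead. Closing each figure after saving keeps memory use flat across a large number of rows.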