I have two dataframes df1 and df2.
df1 has two columns, column 1 'key' with 20 items. Column 2 'df1_val' with values against each key.
df2 is similar but column 2 is called df2_val.
What's the easiest way to draw a single plot with both df1_val and df2_val, with the keys on the x-axis?
I would do it this way:
Start by renaming df1_val and df2_val to the same column name, 'value', then:
import matplotlib.pyplot as plt
fig = plt.figure()
for r in [df1, df2]:
    plt.plot(r['key'], r['value'])  # both lines are drawn on the current Axes
plt.xlim(0, <a value>)
plt.ylim(0, <a value>)
plt.show()
An alternative is to plot what you need for the first dataframe, df1, and then pass the returned Axes as ax to "force" the other plots onto the same graph:
```
ax = df1.plot()
```
And
```
df2.plot(ax=ax)
```
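For what it's worth, here is a minimal sketch of the ax approach applied to the layout described in the question. The data is made up (integer keys 0 to 19 and arbitrary values), so treat the frame contents as assumptions and swap in your own df1 and df2.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data matching the question's layout: 20 keys, one value column each.
keys = list(range(20))
df1 = pd.DataFrame({"key": keys, "df1_val": [k * 2 for k in keys]})
df2 = pd.DataFrame({"key": keys, "df2_val": [k ** 2 / 10 for k in keys]})

ax = df1.plot(x="key", y="df1_val")    # the first call creates the Axes
df2.plot(x="key", y="df2_val", ax=ax)  # the second call reuses it
ax.set_xlabel("key")
```

Call plt.show() afterwards if you are not in a notebook.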
I have a dictionary of dataframes where the key is the name of each dataframe and the value is the dataframe itself.
I am looking to iterate through the dictionary and quickly plot the top 10 rows in each dataframe. Each dataframe would have its own plot. I've attempted this with the following:
for df in dfs:
    data = dfs[df].head(n=10)
    sns.barplot(data=data, x='x_col', y='y_col', color='indigo').set_title(df)
This works, but only returns a plot for the last dataframe in the iteration. Is there a way I can modify this so that I am also able to return the subsequent plots?
By default, seaborn.barplot() draws on the current Axes. If you don't specify which Axes to use, every call in the loop draws on that same Axes, so each plot lands on top of the previous one. To avoid this, either create a new figure on each iteration or pass a different Axes through the ax argument.
import matplotlib.pyplot as plt
import seaborn as sns
for df in dfs:
    data = dfs[df].head(n=10)
    plt.figure()  # Create a new figure; the current Axes changes with it.
    sns.barplot(data=data, x='x_col', y='y_col', color='indigo').set_title(df)
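The second option (passing ax) could look roughly like the sketch below. The dfs dictionary here is a small made-up stand-in, and the one-row subplot layout is just one possible choice, not something from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the dictionary of dataframes in the question.
dfs = {
    "first": pd.DataFrame({"x_col": list("abc"), "y_col": [3, 1, 2]}),
    "second": pd.DataFrame({"x_col": list("abc"), "y_col": [5, 4, 6]}),
}

# One row of subplots, one Axes per dataframe; squeeze=False keeps `axes` 2-D
# even when the dictionary holds a single dataframe.
fig, axes = plt.subplots(1, len(dfs), figsize=(5 * len(dfs), 4), squeeze=False)

for ax, (name, frame) in zip(axes[0], dfs.items()):
    data = frame.head(n=10)
    sns.barplot(data=data, x="x_col", y="y_col", color="indigo", ax=ax)
    ax.set_title(name)

fig.tight_layout()
```

This keeps all the plots in one figure, which is handy when you want to compare them side by side.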
I'm trying to visualize a data frame I have with a stacked barchart, where the x is websites, the y is frequency and then the groups on the barchart are different groups using them.
This is the dataframe:
This is the plot created just by doing this:
web_data_roles.plot(kind='barh', stacked=True, figsize=(20,10))
As you can see, it's not what I want. I've tried changing the plot so the axes match up to the different columns of the dataframe, but it just says "no numerical data to plot". I'm not sure how to go about this anymore, so all help is appreciated.
You need to reorganise your dataframe so that each role is a column:
set_index(): initial preparation
unstack(): move role out of the index and into the columns
droplevel(): clean up the multi-index columns
import io
import matplotlib.pyplot as plt
import pandas as pd
fig, ax = plt.subplots(1, 1, figsize=[10, 5],
                       sharey=False, sharex=False, gridspec_kw={"hspace": 0.3})
df = pd.read_csv(io.StringIO("""website,role,freq
www.bbc.co.uk,director,2000
www.bbc.co.uk,technical,500
www.twitter.com,director,4000
www.twitter.com,technical,1500
"""))
df.set_index(["website","role"]).unstack(1).droplevel(0,axis=1).plot(ax=ax, kind="barh", stacked=True)
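As a possible shortcut, the same reshaping can also be written with DataFrame.pivot. The sketch below reuses the sample data from this answer and builds the wide table both ways so you can compare them; the variable names wide_unstack and wide_pivot are just illustrative.

```python
import io
import pandas as pd

df = pd.read_csv(io.StringIO("""website,role,freq
www.bbc.co.uk,director,2000
www.bbc.co.uk,technical,500
www.twitter.com,director,4000
www.twitter.com,technical,1500
"""))

# Route from the answer: index on (website, role), then move role into the columns.
wide_unstack = df.set_index(["website", "role"]).unstack(1).droplevel(0, axis=1)

# Alternative route: pivot does the same reshaping in one call.
wide_pivot = df.pivot(index="website", columns="role", values="freq")

# One row per website, one numeric column per role, ready for a stacked bar plot.
ax = wide_pivot.plot(kind="barh", stacked=True)
```

Note that pivot raises an error if a (website, role) pair appears more than once, in which case you would have to aggregate first.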
I am new to Python and I have two columns in a dataframe that I want to plot against date
plt.scatter(thing.date,thing.loc[:,['numbers','more_numbers']])
My intuition is that the above should work (because MATLAB allows this kind of thing), but it doesn't, and I'm not sure why.
Is there a way around this?
I'm hoping to plot these columns for a sequence of 4 dataframes on the same axes, so I'd like to use a command like the above so that I can colour the columns from each dataframe to make them distinctive.
Easiest is to do a loop:
fig, ax = plt.subplots()
for col in ['numbers', 'more_numbers']:
    ax.scatter(thing.date, thing[col], label=col)
    # or
    # thing.plot.scatter(x='date', y=col, label=col, ax=ax)
ax.legend()  # show the labels set above
plt.show()
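For the four-dataframe case mentioned in the question, the same loop can be nested. This is only a sketch under assumptions: the frames are generated with random numbers, and the choice of one colour per dataframe plus one marker per column is mine, not the asker's.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: four frames using the question's column names.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=10, freq="D")
frames = {
    f"frame {i}": pd.DataFrame({
        "date": dates,
        "numbers": rng.random(10) + i,
        "more_numbers": rng.random(10) - i,
    })
    for i in range(1, 5)
}

fig, ax = plt.subplots()
markers = {"numbers": "o", "more_numbers": "x"}  # one marker per column
# One colour per dataframe, taken from the default colour cycle.
for colour, (name, frame) in zip(["C0", "C1", "C2", "C3"], frames.items()):
    for col, marker in markers.items():
        ax.scatter(frame["date"], frame[col], color=colour, marker=marker,
                   label=f"{name}: {col}")
ax.legend(fontsize="small")
```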
I wish to clarify two queries in this post.
I have a pandas df like the one in the picture below.
1. Plotting problem
When I try to plot column 0 against column 1, the values get sorted.
Example: in col_0 I have values starting from 112 and going down to 0.
The values get sorted in ascending order, and the graph shows a reversed x-axis when I use the code below.
plt.plot(df.col_0, df.col_1)
What is the best way to avoid sorting the x-axis values?
2. All parameters in a single graph
I would like to plot all the params in a single plot. Except for the x-axis, all other param values are between 0 and 1 (same scale).
What would be the most Pythonic way of doing this?
Any help would be appreciated.
Try to draw the series/dataframe against the index:
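col_to_draw = [col for col in df.columns if col != 'col_0']
# if your data frame is indexed as 0,1,2,... ignore this step
tmp_df = df.reset_index()
ax = tmp_df[col_to_draw].plot(figsize=(10,6))
# keep only the ticks that are valid row positions, then label them with col_0
xtick_vals = [int(t) for t in ax.get_xticks() if 0 <= t < len(tmp_df)]
ax.set_xticks(xtick_vals)
ax.set_xticklabels(tmp_df.col_0.iloc[xtick_vals].tolist())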
Output:
I don't understand what you mean by they get sorted - does it not plot 112, 0.90178 and connect it to 110.89899, 0.90779, etc?
To share the X axis but have 2 Y axes that certain sets are plotted on, use twinx
fig, ax1 = plt.subplots()
ax1.plot(df.col_0, df.col_1)
ax2 = ax1.twinx()
ax2.plot(df.col_0, df.col_2)
re: how to plot in the order you want
I believe your intention is to actually plot these values vs. time or index. To that end, I suggest:
fig, ax1 = plt.subplots()
ax1.plot(df['Time'], df.col_0) # or df.index, df.col_0
ax2 = ax1.twinx()
ax2.plot(df['Time'], df.col_1)
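If the goal is simply to have the x-axis read from 112 down to 0, one more option (not mentioned above, so take it as a suggestion) is to keep the twinx layout and invert the axis. The data below is made up to resemble the description in the question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data shaped like the question: col_0 runs from 112 down to 0,
# the other columns stay between 0 and 1.
df = pd.DataFrame({
    "col_0": np.linspace(112, 0, 50),
    "col_1": np.linspace(0.9, 0.2, 50),
    "col_2": np.linspace(0.1, 0.8, 50),
})

fig, ax1 = plt.subplots()
ax1.plot(df.col_0, df.col_1, color="C0", label="col_1")
ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
ax2.plot(df.col_0, df.col_2, color="C1", label="col_2")

# Matplotlib places numeric x values by magnitude, so by default the axis
# reads 0 -> 112. Inverting it shows the values in the descending order they
# have in the dataframe; the twin Axes shares x, so it follows along.
ax1.invert_xaxis()
```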
I suppose this is fairly easy, but I tried for a while to get an answer without much success. I want to produce a stacked bar plot for two categories, but I have the information in two separate data frames:
This is the code:
first_babies = live[live.birthord == 1] # first dataframe
others = live[live.birthord != 1] # second dataframe
fig = figure()
ax1 = fig.add_subplot(1,1,1)
first_babies.groupby(by=['prglength']).size().plot(
    kind='bar', ax=ax1, label='first babies') # first plot
others.groupby(by=['prglength']).size().plot(kind='bar', ax=ax1, color='r',
                                             label='others') # second plot
ax1.legend(loc='best')
ax1.set_xlabel('weeks')
ax1.set_ylabel('frequency')
ax1.set_title('Histogram')
But I want something like this or as I said, a stacked bar plot in order to better distinguish between categories:
I can't use stacked=True because it doesn't work across two separate plot calls, and I can't create a new dataframe because first_babies and others don't have the same number of elements.
Thanks
First create a new column to distinguish 'first_babies':
live['first_babies'] = live['birthord'].apply(lambda x: 'first_babies' if x==1 else 'others')
You can unstack the groupby:
grouped = live.groupby(by=['prglength', 'first_babies'])
unstacked_count = grouped.size().unstack()
Now you can plot a stacked bar-plot directly:
unstacked_count.plot(kind='bar', stacked=True)
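Put together, the steps above might look like the following sketch. The live dataframe here is a tiny made-up stand-in (only the two columns used), and fill_value=0 is my addition so that pregnancy lengths seen in only one category still get a zero-height bar segment.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the `live` dataframe from the question.
live = pd.DataFrame({
    "prglength": [38, 39, 39, 40, 40, 40, 41, 39],
    "birthord":  [1,  1,  2,  1,  3,  2,  1,  1],
})

live["first_babies"] = live["birthord"].apply(
    lambda x: "first_babies" if x == 1 else "others")

# Count rows per (prglength, first_babies) pair, then move the category level
# into the columns; pairs that never occur are filled with 0.
unstacked_count = (live.groupby(["prglength", "first_babies"])
                       .size()
                       .unstack(fill_value=0))

ax = unstacked_count.plot(kind="bar", stacked=True)
ax.set_xlabel("weeks")
ax.set_ylabel("frequency")
```

Because both categories end up as columns of one table indexed by prglength, it no longer matters that the two groups have different numbers of rows.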