I've got a pandas DataFrame, df, with an index named date and the columns columnA, columnB and columnC.
I am trying to make a scatter plot with the index on the x-axis and columnA on the y-axis using the DataFrame syntax.
When I try:
df.plot(kind='scatter', x='date', y='columnA')
I am getting the error KeyError: 'date', probably because date is the index and not a column. When I try:
df.plot(kind='scatter', y='columnA')
I am getting an error:
ValueError: scatter requires and x and y column
so the index is not used on the x-axis by default. When I try:
df.plot(kind='scatter', x=df.index, y='columnA')
I am getting the error:
KeyError: "DatetimeIndex(['1818-01-01', '1818-01-02', '1818-01-03', '1818-01-04',\n
'1818-01-05', '1818-01-06', '1818-01-07', '1818-01-08',\n
'1818-01-09', '1818-01-10',\n ...\n
'2018-03-22', '2018-03-23', '2018-03-24', '2018-03-25',\n
'2018-03-26', '2018-03-27', '2018-03-28', '2018-03-29',\n
'2018-03-30', '2018-03-31'],\n
dtype='datetime64[ns]', name='date', length=73139, freq=None) not in index"
I can plot it if I use matplotlib.pyplot directly:
plt.scatter(df.index, df['columnA'])
Is there a way to plot the index on the x-axis using the DataFrame kind syntax?
This is kind of ugly (I think the matplotlib solution you used in your question is better, FWIW), but you can always create a temporary DataFrame with the index as a column using
df.reset_index()
If the index is unnamed, the new column is called 'index'; otherwise it takes the name of the index (date in the question). Assuming an unnamed index, you could use
df.reset_index().plot(kind='scatter', x='index', y='columnA')
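To make the naming rule concrete, here is a minimal sketch with made-up data (the values and dates are assumptions, only the index name matters). It shows which column name reset_index() produces for a named and for an unnamed index:

```python
import pandas as pd

# Hypothetical sample data; only the index name matters here.
values = {"columnA": [1.0, 2.0, 3.0]}

# Index named "date", as in the question.
named = pd.DataFrame(
    values, index=pd.date_range("2018-03-29", periods=3, freq="D", name="date")
)
# Same data with an unnamed index.
unnamed = pd.DataFrame(
    values, index=pd.date_range("2018-03-29", periods=3, freq="D")
)

# reset_index() moves the index into a regular column.
named_cols = list(named.reset_index().columns)      # index name is reused
unnamed_cols = list(unnamed.reset_index().columns)  # falls back to "index"

print(named_cols)
print(unnamed_cols)
```

For the DataFrame in the question this suggests passing x='date' rather than x='index' after reset_index(). Whether kind='scatter' then accepts the datetime column depends on the pandas version.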
A simpler solution would be:
df['x1'] = df.index
df.plot(kind='scatter', x='x1', y='columnA')
Just create the index variable outside of the plot statement.
At least in pandas > 1.4, the easiest way is this:
df['columnA'].plot(style=".")
This lets you mix scatter and line plots, and it keeps you within the standard pandas plot interface.
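A minimal sketch of that idea, assuming a small made-up DataFrame with a DatetimeIndex named date (the data and the Agg backend are assumptions so the snippet runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data with a DatetimeIndex named "date".
df = pd.DataFrame(
    {"columnA": [3.0, 1.0, 4.0, 1.0, 5.0], "columnB": [2.0, 7.0, 1.0, 8.0, 2.0]},
    index=pd.date_range("2018-03-27", periods=5, freq="D", name="date"),
)

# Markers only for columnA; the index is used for the x-axis automatically.
ax = df["columnA"].plot(style=".")

# A regular line for columnB, drawn on the same axes.
df["columnB"].plot(ax=ax, style="-")

# Call ax.figure.savefig(...) or matplotlib.pyplot.show() to view the result.
```

Note that this draws a line plot with the connecting line switched off, not a true scatter plot, so scatter-only options such as per-point colours are not available this way.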
I have two dataframes df1 and df2.
df1 has two columns: column 1, 'key', holds 20 items, and column 2, 'df1_val', holds a value for each key.
df2 is similar, but column 2 is called df2_val.
What's the easiest way to draw a single plot with both df1_val and df2_val, with the keys on the x-axis?
I would do it this way:
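Start by renaming df1_val and df2_val to the same name, 'value', in both DataFrames, then:
import matplotlib.pyplot as plt

fig = plt.figure()
for r in [df1, df2]:
    plt.plot(r['key'], r['value'])  # one line per DataFrame on the same figure
# Optionally set the axis limits with plt.xlim(0, <a value>) and plt.ylim(0, <a value>)
plt.show()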
An alternative is to plot the first DataFrame, df1, and pass the returned ax to "force" the other plot onto the same graph:
ax = df1.plot()
df2.plot(ax=ax)
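As a rough sketch of the ax approach with the column names from the question: the data below is made up, and it assumes both DataFrames list the same keys in the same order.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data; both frames share the same keys in the same order.
keys = ["a", "b", "c", "d"]
df1 = pd.DataFrame({"key": keys, "df1_val": [1, 3, 2, 5]})
df2 = pd.DataFrame({"key": keys, "df2_val": [2, 2, 4, 3]})

# The first call creates the axes; the second reuses them via ax=ax.
ax = df1.plot(x="key", y="df1_val")
df2.plot(x="key", y="df2_val", ax=ax)

print(len(ax.get_lines()))  # number of lines drawn on the shared axes
```

If the keys differ between the two DataFrames, aligning them first (for example by merging on key) is probably safer than relying on row order.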
I want a scatter plot where the x-axis is a datetime and the y-axis is an int. I have only a few datapoints, and they are discrete rather than continuous, so I don't want to connect them.
My DataFrame is:
df = pd.DataFrame({'datetime':[dt.datetime(2016,1,1,0,0,0), dt.datetime(2016,1,4,0,0,0),
dt.datetime(2016,1,9,0,0,0)], 'value':[10, 7, 8]})
If I use the "normal" plot, then I get a "line" figure:
df.plot(x='datetime', y='value')
But how can I plot only the dots? This gives an error:
df.plot.scatter(x='datetime', y='value')
KeyError: 'datetime'
Of course I can use some cheat to get the result I want, for example:
df.plot(x='datetime', y='value', marker='o', linewidth=0)
But I don't understand why the scatter version does not work...
Thank you for help!
A scatter plot can be drawn with the DataFrame.plot.scatter() method. It requires numeric columns for the x and y axes, which are specified with the x and y keywords. A datetime column does not count as numeric here, which is why the scatter call fails.
Alternative Approach:
In [71]: df['day'] = df['datetime'].dt.day
In [72]: df.plot.scatter(x='day', y='value')
Out[72]: <matplotlib.axes._subplots.AxesSubplot at 0x25440a1bc88>
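If the full timestamp should stay on the x-axis rather than only the day, one possible workaround (a sketch, not the only option) is to convert the datetimes to matplotlib's float date numbers first. The column name datenum is made up for illustration, and the Agg backend is an assumption so the snippet runs without a display:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import pandas as pd

df = pd.DataFrame({
    "datetime": [dt.datetime(2016, 1, 1), dt.datetime(2016, 1, 4),
                 dt.datetime(2016, 1, 9)],
    "value": [10, 7, 8],
})

# Convert the timestamps to matplotlib date numbers (floats counted in days).
# "datenum" is a hypothetical helper column.
df["datenum"] = mdates.date2num(df["datetime"].to_numpy())

# Both columns are numeric now, so scatter accepts them.
ax = df.plot.scatter(x="datenum", y="value")

# Tell matplotlib to render the numeric x values as dates on the tick labels.
ax.xaxis_date()
```

The marker='o', linewidth=0 workaround from the question gives a similar picture without the extra column.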

The easiest way to plot a pandas dataframe is as described in the documentation like this:
http://pandas.pydata.org/pandas-docs/stable/visualization.html
In my case I want to create a stacked bar chart:
df2.plot(kind='bar', stacked=True);
This is all working well, but I would like to use one column of the df2 as xlabels and not simply have [1,2,3,4... etc] as labels. Is there a simple way to achieve it with an additional parameter in the plot function or do I need to do it in a more complicated way?
The plot uses the index of your DataFrame as the labels, so if you want to use a particular column, set it as your index:
df2.index = df2.labelcol
df2.plot(kind='bar', stacked=True)
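A variant that leaves df2 itself unchanged is to call set_index() inline, since it returns a new DataFrame. The sketch below uses made-up data; the column names other than labelcol are assumptions:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data; "labelcol" holds the desired x labels.
df2 = pd.DataFrame({
    "labelcol": ["north", "south", "west"],
    "apples": [3, 5, 2],
    "pears": [1, 4, 6],
})

# set_index returns a new frame, so df2 keeps its original integer index.
ax = df2.set_index("labelcol").plot(kind="bar", stacked=True)

# The bar tick labels now come from labelcol.
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
print(tick_labels)
```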
I have a time series in a pandas DataFrame with a number of columns which I'd like to plot. Is there a way to set the x-axis to always use the index from a DataFrame?
When I use the .plot() method from pandas the x-axis is formatted correctly. However, when I pass my dates and the column(s) I'd like to plot directly to matplotlib, the graph doesn't plot correctly. Thanks in advance.
plt.plot(site2.index.values, site2['Cl'])
plt.show()
FYI: site2.index.values produces this (I've cut out the middle part for brevity):
array([
'1987-07-25T12:30:00.000000000+0200',
'1987-07-25T16:30:00.000000000+0200',
'2010-08-13T02:00:00.000000000+0200',
'2010-08-31T02:00:00.000000000+0200',
'2010-09-15T02:00:00.000000000+0200'
],
dtype='datetime64[ns]')
It seems the issue was that I had .values. Without it (i.e. site2.index) the graph displays correctly.
You can use plt.xticks to set the tick locations and labels on the x-axis. Try:
x = range(len(site2))  # plot against row positions
plt.xticks(x, site2.index.values)  # location, labels
plt.plot(x, site2['Cl'].values)
plt.show()
see the documentation for more details: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xticks
That's built right into the plot() method.
You can use yourDataFrame.plot(use_index=True) to put the DataFrame index on the x-axis.
Read More Here: https://pandas.pydata.org/pandas-docs/version/0.23/generated/pandas.DataFrame.plot.html
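A small sketch of the difference, using made-up values for site2 (the data and the Agg backend are assumptions). Since use_index=True is the default, the contrast is easiest to see by turning it off:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data standing in for site2.
site2 = pd.DataFrame(
    {"Cl": [1.2, 0.8, 1.5]},
    index=pd.to_datetime(
        ["1987-07-25 12:30", "2010-08-13 02:00", "2010-09-15 02:00"]
    ),
)

# Default behaviour: the DatetimeIndex supplies the x values.
ax_index = site2.plot(y="Cl", use_index=True)

# With use_index=False the rows are plotted against their position instead.
ax_pos = site2.plot(y="Cl", use_index=False)
x_positions = [float(x) for x in ax_pos.get_lines()[0].get_xdata()]
print(x_positions)
```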
If, like me, you want matplotlib to select a 'sensible' scale, one way to solve this is to plot against the row positions and then relabel matplotlib's default ticks with the matching values from the DataFrame index. Code:
fig, ax = plt.subplots()
ax.plot(site2['Cl'].values)  # x is the row position 0..len(site2)-1
x_ticks = ax.get_xticks()  # use matplotlib default xticks
x_ticks = [int(t) for t in x_ticks if t in range(len(site2))]  # keep ticks that map to a row
ax.set_xticks(x_ticks)
ax.set_xticklabels(site2.index[x_ticks].astype(str))