matplotlib bar chart with overlapping dates - python

I am plotting a simple bar chart using pandas/matplotlib. The x-axis is a datetime index. There are so many datapoints that the labels overlap. Is there an easy solution for this problem, no matter if I have daily, weekly, monthly, or yearly data?
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
idx = pd.date_range("2015-01-01", "2021-09-30", freq="b")
data = np.random.randn(len(idx))
df = pd.DataFrame(data={"returns": data}, index=idx)
df.plot(kind="bar")
plt.show()

Use DateFormatter to custom the xaxis but let Matplotlib handle the figure rather than Pandas:
import matplotlib.dates as mdates
# ...
fig, ax = plt.subplots(figsize=(15, 7))
ax.bar(df.index, df['returns'])
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))

Related

Weird tick spacing when plotting with time on x-axis

I'm plotting a pandas time series, which work OK when plotting the timestamps on the x-axis...
import pandas as pd
import numpy as np
ts = pd.date_range("2020-01-01", "2020-01-02", freq='15T', closed="left")
vals = np.random.rand(len(ts))
pd.Series(vals, ts).plot()
...but which gives me a very unnatural tick spacing when plotting only the time part on the x-axis:
pd.Series(vals, ts.time).plot()
How can I turn the x-axis of this second plot into something more human-readable?
Use matplotlib and format xaxis:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
plt.plot(ts, vals)
myFmt = mdates.DateFormatter('%H:%M')
plt.gca().xaxis.set_major_formatter(myFmt)

Plotting more than 10K data point using Seaborn for x-axis as timestamp

I am trying to plot more than 10k data points, where I want to plot a data properties versus Timestamp. But on the x-axis the timestamps are overlapping and not visible.
How can I reduce the amount of labels on the x-axis, so that they are legible?
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
sns.set_style("whitegrid")
data = pd.read_csv('0912Testday4.csv',header=2)
for i in data.columns:
if i!='TIMESTAMP':
sns.lineplot(x="TIMESTAMP",y=i,data = data)
plt.title(f"{i} vs TIMESTAMP")
plt.show()
Example plot demonstrating the problem:
Update:TIMESTAMP was in string format by converting into datatime format it resolves the problem.
data['TIMESTAMP'] = pd.to_datetime(data['TIMESTAMP'])
Update:TIMESTAMP was in string format by converting into datetime format it resolves the problem.
data['TIMESTAMP'] = pd.to_datetime(data['TIMESTAMP'])
Please make sure that TIMESTAMP is a datetime object. This should not happen when the x axis is a datetime. (You can use pd.to_datetime to convert int, float, str, and ... to datetime.)
If TIMESTAMP is a datetime, you can use the autofmt_xdate() method:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots() # Create a figure and a set of subplots.
sns.set_style("whitegrid")
data = pd.read_csv('0912Testday4.csv',header=2)
# Use the following line if the TIMESTAMP is not a datetime.
# (You may need to change the format from "%Y-%m-%d %H:%M:%S+00:00".)
# data['TIMESTAMP'] = pd.to_datetime(data.TIMESTAMP, format="%Y-%m-%d %H:%M:%S+00:00")
for i in data.columns:
if i!='TIMESTAMP':
sns.lineplot(x="TIMESTAMP", y=i, data=data, ax=ax)
fig.autofmt_xdate() # rotate and right align date ticklabels
plt.title(f"{i} vs TIMESTAMP")
plt.show()
I didn't encounter such problem with sns.lineplot
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
# example data
time_stamps = pd.date_range('2019-01-01', '2020-01-01', freq='H')
vals =[np.random.randint(0, 1000) for i in time_stamps]
data_df = pd.DataFrame()
data_df['time'] = time_stamps
data_df['value'] = vals
print(data_df.shape)
# plotting
fig, ax = plt.subplots()
sns.lineplot(x='time', y='value', data=data_df)
plt.show()
sns automatically selects the x ticks and x labels.
alternatively, you can use ax.set_xticks and ax.set_xlabels to set the x ticks and x labels manually.
Also you may use fig.autofmt_xdate() to rotate the x labels

Seaborn Barplot and Formatting Dates on X-Axis

I am currently working on visualizing datasets with Seaborn and Pandas. I have some time-dependent data that I would like to graph in bar charts.
However, I am battling with two issues in Seaborn:
Formatting dates on the x-axis
Only showing a handful of dates (as
it doesn't make sense to have every day labeled on a 6 month graph)
I have found a solution for my issues in normal Matplotlib, which is:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(
data={'dt':dates, 'val': np.random.randn(N)}
)
fig, ax = plt.subplots(figsize=(10, 6))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
ax.bar(df['dt'], df['val'], width=25, align='center')
However, I already have most of my graphs done in Seaborn, and I would like to stay consistent. Once I convert the previous code into Seaborn, I lose the ability to format the dates:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(
data={'dt':dates, 'val': np.random.randn(N)}
)
fig, ax = plt.subplots(1,1)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%y-%m'))
sns.barplot(x='dt', y='val', data=df)
fig.autofmt_xdate()
When I run the code, the date format remains unchanged and I can't locate any dates with DateLocator.
Is there any way for me to format my X-Axis for dates in Seaborn in a way similar to Matplotlib with DateLocator and DateFormatter?
No, you cannot use seaborn.barplot in conjunction with matplotlib.dates ticking. The reason is that the ticks for seaborn barplots are at integer positions (0,1,..., N-1). So they cannot be interpreted as dates.
You have three options:
Use seaborn, and loop through the labels and set them to anything you want
Not use seaborn and have the advantages (and disadvantages) of matplotlib.dates tickers available.
Change the format in the dataframe prior to plotting.
Tested in python 3.10, pandas 1.5.0, matplotlib 3.5.2, seaborn 0.12.0
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(data={'dates': dates, 'val': np.random.randn(N)})
# change the datetime format in the dataframe prior to plotting
df.dates = df.dates.dt.strftime('%Y-%m')
fig, ax = plt.subplots(1,1)
sns.barplot(x='dates', y='val', data=df)
xticks = ax.get_xticks()
xticklabels = [x.get_text() for x in ax.get_xticklabels()]
_ = ax.set_xticks(xticks, xticklabels, rotation=90)
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(data={'dates': dates, 'val': np.random.randn(N)})
df.dates = df.dates.dt.strftime('%Y-%m')
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(x='dates', y='val', data=df)
xticks = ax.get_xticks()
xticklabels = [x.get_text() if not i%2 == 0 else '' for i, x in enumerate(ax.get_xticklabels())]
_ = ax.set_xticks(xticks, xticklabels)

Changing the tick frequency on the x-axis

I am trying to plot a bar chart with the date vs the price of a crypto currency from a dataframe and have 731 daily samples. When i plot the graph i get the image as seen below. Due to the amount of dates the x axis is unreadable and i would like to make it so it only labels the 1st of every month on the x-axis.
This is the graph i currently have: https://imgur.com/a/QVNn4Zp
I have tried using other methods i have found online both in stackoverflow and other sources such as youtube but had no success.
This is the Code i have so far to plot the bar chart.
df.plot(kind='bar',x='Date',y='Price in USD (at 00:00:00 UTC)',color='red')
plt.show()
One option is to plot a numeric barplot with matplotlib.
Matplotlib < 3.0
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'Date': idx, 'A':np.random.random(365)})
fig, ax = plt.subplots()
dates = mdates.date2num(df["Date"].values)
ax.bar(dates, df["A"], width=1)
loc = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
plt.show()
Matplotlib >= 3.0
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
pd.plotting.register_matplotlib_converters()
start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'Date': idx, 'A':np.random.random(365)})
fig, ax = plt.subplots()
ax.bar(df["Date"], df["A"], width=1)
plt.show()
Further options:
For other options see Pandas bar plot changes date format

candlestick plot from pandas dataframe, replace index by dates

This code gives plot of candlesticks with moving averages but the x-axis is in index, I need the x-axis in dates.
What changes are required?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_finance import candlestick2_ohlc
#date format in data-> dd-mm-yyyy
nif = pd.read_csv('data.csv')
#nif['Date'] = pd.to_datetime(nif['Date'], format='%d-%m-%Y', utc=True)
mavg = nif['Close'].ewm(span=50).mean()
mavg1 = nif['Close'].ewm(span=13).mean()
fg, ax1 = plt.subplots()
cl = candlestick2_ohlc(ax=ax1,opens=nif['Open'],highs=nif['High'],lows=nif['Low'],closes=nif['Close'],width=0.4, colorup='#77d879', colordown='#db3f3f')
mavg.plot(ax=ax1,label='50_ema')
mavg1.plot(color='k',ax=ax1, label='13_ema')
plt.legend(loc=4)
plt.subplots_adjust(left=0.09, bottom=0.20, right=0.94, top=0.90, wspace=0.2, hspace=0)
plt.show()
Output:
I also had a lot of "fun" with this in the past... Here is one way of doing it using mdates:
import pandas as pd
import pandas_datareader.data as web
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.finance import candlestick_ohlc
import matplotlib.dates as mdates
ticker = 'MCD'
start = dt.date(2014, 1, 1)
#Gathering the data
data = web.DataReader(ticker, 'yahoo', start)
#Calc moving average
data['MA10'] = data['Adj Close'].rolling(window=10).mean()
data['MA60'] = data['Adj Close'].rolling(window=60).mean()
data.reset_index(inplace=True)
data['Date']=mdates.date2num(data['Date'].astype(dt.date))
#Plot candlestick chart
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = fig.add_subplot(111)
ax3 = fig.add_subplot(111)
ax1.xaxis_date()
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
ax2.plot(data.Date, data['MA10'], label='MA_10')
ax3.plot(data.Date, data['MA60'], label='MA_60')
plt.ylabel("Price")
plt.title(ticker)
ax1.grid(True)
plt.legend(loc='best')
plt.xticks(rotation=45)
candlestick_ohlc(ax1, data.values, width=0.6, colorup='g', colordown='r')
plt.show()
Output:
Hope this helps.
Simple df:
Using plotly:
import plotly.figure_factory
fig = plotly.figure_factory.create_candlestick(df.open, df.high, df.low, df.close, dates=df.ts)
fig.show()
will automatically parse the ts column to be displayed correctly on x.
Clunky workaround here, derived from other post (if i can find again, will reference). Using a pandas df, plot by index and then reference xaxis tick labels to date strings for display. Am new to python / matplotlib, and this this solution is not so flexible, but it works basically. Also using a pd index for plotting removes the blank 'weekend' daily spaces on market price data.
Matplotlib xaxis index as dates
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_finance import candlestick2_ohlc
from mpl_finance import candlestick_ohlc
%matplotlib notebook # for Jupyter
# Format m/d/Y,Open,High,Low,Close,Adj Close,Volume
# csv data does not include NaN, or 'weekend' lines,
# only dates from which prices are recorded
DJIA = pd.read_csv('yourFILE.csv') #Format m/d/Y,Open,High,
Low,Close,Adj Close,Volume
print(DJIA.head())
fg, ax1 = plt.subplots()
cl =candlestick2_ohlc(ax=ax1,opens=DJIA['Open'],
highs=DJIA['High'],lows=DJIA['Low'],
closes=DJIA['Close'],width=0.4, colorup='#77d879',
colordown='#db3f3f')
ax1.set_xticks(np.arange(len(DJIA)))
ax1.set_xticklabels(DJIA['Date'], fontsize=6, rotation=-90)
plt.show()

Categories