Changing the tick frequency on the x-axis - python

I am trying to plot a bar chart of date vs. the price of a cryptocurrency from a dataframe with 731 daily samples. When I plot the graph I get the image shown below. Because of the number of dates, the x-axis is unreadable, and I would like it to label only the 1st of every month.
This is the graph I currently have: https://imgur.com/a/QVNn4Zp
I have tried other methods I found online, on Stack Overflow and other sources such as YouTube, but had no success.
This is the code I have so far to plot the bar chart:
df.plot(kind='bar',x='Date',y='Price in USD (at 00:00:00 UTC)',color='red')
plt.show()

One option is to plot a numeric barplot with matplotlib.
Matplotlib < 3.0
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'Date': idx, 'A':np.random.random(365)})
fig, ax = plt.subplots()
dates = mdates.date2num(df["Date"].values)
ax.bar(dates, df["A"], width=1)
loc = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
plt.show()
Matplotlib >= 3.0
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
pd.plotting.register_matplotlib_converters()
start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'Date': idx, 'A':np.random.random(365)})
fig, ax = plt.subplots()
ax.bar(df["Date"], df["A"], width=1)
plt.show()
Further options:
For other options see Pandas bar plot changes date format
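Since the question asks for a label only on the 1st of every month, here is a minimal sketch of that with a numeric matplotlib bar plot. The data is a made-up stand-in (731 random daily values, with a hypothetical column name "Price"), and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: 731 daily samples
idx = pd.date_range("2018-01-01", periods=731, freq="D")
df = pd.DataFrame({"Date": idx, "Price": np.random.random(731)})

fig, ax = plt.subplots(figsize=(12, 4))
ax.bar(df["Date"], df["Price"], width=1, color="red")

# One major tick on the 1st of each month, labelled as a full date
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()  # rotate the labels so they do not collide
```

Add plt.show() or fig.savefig(...) at the end to look at the result. With two years of data this gives roughly 25 labels; MonthLocator also takes an interval argument if that is still too dense.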

Related

How to remove the first and last minor tick month labels on matplotlib?

I want to generate a chart with the 12 months of a year as the x-axis labels, i.e. 'Jan' to 'Dec', positioned in the middle between the major ticks. I used the code from https://matplotlib.org/3.4.3/gallery/ticks_and_spines/centered_ticklabels.html to create the x-axis. The x-axis created has an additional 'Dec' on the left and 'Jan' on the right, i.e. a total of 14 labels instead of 12 (see attached image). However, only 'Jan' to 'Dec' are wanted on the chart. I would like to know how to remove the 'Dec' label on the left and 'Jan' label on the right? My google searches were only successful with solutions to remove all minor tick labels. Any help will be much appreciated.
I use the following code to generate the chart:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
df = pd.DataFrame(np.random.randint(0,100,size=(365, 2)), columns=list('AB'))
df.index = pd.date_range(start='1/1/2022', end='12/31/2022').strftime('%b-%d')
plt.figure()
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonthday=16))
ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%b'))
for tick in ax.xaxis.get_minor_ticks():
    tick.tick1line.set_markersize(0)
    tick.tick2line.set_markersize(0)
    tick.label1.set_horizontalalignment('center')
plt.plot(df['A'], linewidth=0.5, color='tab:red')
plt.show()
Try setting your x-axis limits to 0 and 365. By default matplotlib pads the axis a little beyond your data, which is where the extra labels come from. With explicit limits, the leading Dec and trailing Jan fall outside the plot and disappear.
Here I modified your code by adding one call: plt.xlim(0,365)
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
df = pd.DataFrame(np.random.randint(0,100,size=(365, 2)), columns=list('AB'))
df.index = pd.date_range(start='1/1/2022', end='12/31/2022').strftime('%b-%d')
plt.figure()
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonthday=16))
ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%b'))
for tick in ax.xaxis.get_minor_ticks():
    tick.tick1line.set_markersize(0)
    tick.tick2line.set_markersize(0)
    tick.label1.set_horizontalalignment('center')
plt.xlim(0,365)
plt.plot(df['A'], linewidth=0.5, color='tab:red')
plt.show()
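A variant of the same idea, as a sketch only: if the index is kept as real datetimes instead of being converted to strings with strftime, the limits can be given as dates rather than as 0 and 365. This is my own assumption about how one might restructure the question's code, not what the asker had. The Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd

# Same random data as in the question, but with a true DatetimeIndex
idx = pd.date_range(start="2022-01-01", end="2022-12-31")
df = pd.DataFrame(np.random.randint(0, 100, size=(365, 2)),
                  columns=list("AB"), index=idx)

fig, ax = plt.subplots()
ax.plot(df.index, df["A"], linewidth=0.5, color="tab:red")

# Clip the axis to the data so no padding month appears on either side
ax.set_xlim(idx[0], idx[-1])

# Major ticks at month boundaries (unlabelled), minor ticks mid-month (labelled)
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonthday=16))
ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%b"))
ax.tick_params(axis="x", which="minor", length=0)  # hide the minor tick marks
```

Only the twelve mid-month positions fall inside the limits, so the labels run from Jan to Dec.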

matplotlib bar chart with overlapping dates

I am plotting a simple bar chart using pandas/matplotlib. The x-axis is a datetime index. There are so many datapoints that the labels overlap. Is there an easy solution for this problem, no matter if I have daily, weekly, monthly, or yearly data?
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
idx = pd.date_range("2015-01-01", "2021-09-30", freq="b")
data = np.random.randn(len(idx))
df = pd.DataFrame(data={"returns": data}, index=idx)
df.plot(kind="bar")
plt.show()
Use DateFormatter to customize the x-axis, but let Matplotlib handle the figure rather than Pandas:
import matplotlib.dates as mdates
# ...
fig, ax = plt.subplots(figsize=(15, 7))
ax.bar(df.index, df['returns'])
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
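The question asks for something that works regardless of the data frequency. One possible sketch, assuming matplotlib 3.1 or newer (where ConciseDateFormatter is available), is to let AutoDateLocator pick the tick spacing from the visible date range instead of hard-coding YearLocator. The Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Same kind of data as in the question: business-day returns
idx = pd.date_range("2015-01-01", "2021-09-30", freq="B")
df = pd.DataFrame({"returns": np.random.randn(len(idx))}, index=idx)

fig, ax = plt.subplots(figsize=(15, 7))
ax.bar(df.index, df["returns"], width=1)

# The locator chooses years, months, days, ... from the axis range;
# the concise formatter then writes short, non-repeating labels
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))
```

Because the spacing is derived from the axis limits, the same lines should also behave sensibly for weekly, monthly, or yearly data.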

How to use time as x axis for a scatterplot with seaborn?

I have a simple dataframe with the time as index and dummy values as an example.
I did a simple scatter plot of it.
Simple question: How to adjust the xaxis, so that all time values from 00:00 to 23:00 are visible in the xaxis? The rest of the plot is fine, it shows all the datapoints, it is just the labeling. Tried different things but didn't work out.
All my code so far is:
import pandas as pd
import seaborn as sns
import matplotlib.dates as mdates
from datetime import time
data = []
for i in range(0, 24):
    temp_list = []
    temp_list.append(time(i))
    temp_list.append(i)
    data.append(temp_list)
my_df = pd.DataFrame(data, columns=["time", "values"])
my_df.set_index(['time'],inplace=True)
my_df
fig = sns.scatterplot(my_df.index, my_df['values'])
fig.set(xlabel='time', ylabel='values')
I think you're gonna have to go down to the matplotlib level for this:
import pandas as pd
import seaborn as sns
import matplotlib.dates as mdates
from datetime import time
import matplotlib.pyplot as plt
data = []
for i in range(0, 24):
    temp_list = []
    temp_list.append(time(i))
    temp_list.append(i)
    data.append(temp_list)
df = pd.DataFrame(data, columns=["time", "values"])
df.time = pd.to_datetime(df.time, format='%H:%M:%S')
df.set_index(['time'],inplace=True)
# pass x and y by keyword; newer seaborn versions no longer accept them positionally
ax = sns.scatterplot(x=df.index, y=df["values"])
ax.set(xlabel="time", ylabel="measured values")
ax.set_xlim(df.index[0], df.index[-1])
ax.xaxis.set_major_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
ax.tick_params(axis="x", rotation=45)
I think you have two options:
Convert the time to the hour only. For that, extract the hour into a new column in your df
df['hour_'] = df['time'].apply(lambda t: t.hour)  # assumes a 'time' column of datetime.time values
and then use it as your x-axis.
If you need the time in the format you described, you may run into a visibility problem where the timestamps overlap each other. I'm using
plt.xticks(rotation=45, horizontalalignment='right')
ax.xaxis.set_major_locator(plt.MaxNLocator(12))
so first I rotate the text, then I limit the number of ticks.
Here is a full script where I used it:
sns.set()
sns.set_style("whitegrid")
sns.axes_style("whitegrid")
for k, g in df_forPlots.groupby('your_column'):
    fig = plt.figure(figsize=(10,5))
    wide_df = g[['x', 'y', 'z']]
    wide_df.set_index(['x'], inplace=True)
    ax = sns.lineplot(data=wide_df)
    plt.xticks(rotation=45, horizontalalignment='right')
    ax.yaxis.set_major_locator(plt.MaxNLocator(14))
    ax.xaxis.set_major_locator(plt.MaxNLocator(35))
    plt.title(f"your {k} in somthing{g.z.unique()}")
    plt.tight_layout()
Hope I helped.
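The first option (plotting against the hour as a plain number) could look roughly like the sketch below. It uses matplotlib's own scatter in place of seaborn to keep the example small, and the column name "hour" is made up for illustration. The Agg backend line is only there so it runs without a display.

```python
from datetime import time

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Same dummy data as in the question: one value per hour of the day
df = pd.DataFrame({"time": [time(h) for h in range(24)],
                   "values": list(range(24))})

# Extract the hour from each datetime.time object into a numeric column
df["hour"] = df["time"].apply(lambda t: t.hour)

fig, ax = plt.subplots(figsize=(10, 4))
ax.scatter(df["hour"], df["values"])

# One tick per hour, labelled HH:00 and rotated so the labels do not overlap
ax.set_xticks(range(24))
ax.set_xticklabels([f"{h:02d}:00" for h in range(24)], rotation=45, ha="right")
ax.set(xlabel="time", ylabel="values")
```

Because the x values are ordinary integers here, every hour from 00:00 to 23:00 gets its own tick without any date locator.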

Seaborn Barplot and Formatting Dates on X-Axis

I am currently working on visualizing datasets with Seaborn and Pandas. I have some time-dependent data that I would like to graph in bar charts.
However, I am battling with two issues in Seaborn:
Formatting dates on the x-axis
Only showing a handful of dates (as it doesn't make sense to have every day labeled on a 6 month graph)
I have found a solution for my issues in normal Matplotlib, which is:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(
data={'dt':dates, 'val': np.random.randn(N)}
)
fig, ax = plt.subplots(figsize=(10, 6))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
ax.bar(df['dt'], df['val'], width=25, align='center')
However, I already have most of my graphs done in Seaborn, and I would like to stay consistent. Once I convert the previous code into Seaborn, I lose the ability to format the dates:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(
data={'dt':dates, 'val': np.random.randn(N)}
)
fig, ax = plt.subplots(1,1)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%y-%m'))
sns.barplot(x='dt', y='val', data=df)
fig.autofmt_xdate()
When I run the code, the date format remains unchanged and I can't locate any dates with DateLocator.
Is there any way for me to format my X-Axis for dates in Seaborn in a way similar to Matplotlib with DateLocator and DateFormatter?
No, you cannot use seaborn.barplot in conjunction with matplotlib.dates ticking. The reason is that the ticks for seaborn barplots are at integer positions (0,1,..., N-1). So they cannot be interpreted as dates.
You have three options:
Use seaborn, and loop through the labels and set them to anything you want
Not use seaborn and have the advantages (and disadvantages) of matplotlib.dates tickers available.
Change the format in the dataframe prior to plotting.
Tested in python 3.10, pandas 1.5.0, matplotlib 3.5.2, seaborn 0.12.0
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(data={'dates': dates, 'val': np.random.randn(N)})
# change the datetime format in the dataframe prior to plotting
df.dates = df.dates.dt.strftime('%Y-%m')
fig, ax = plt.subplots(1,1)
sns.barplot(x='dates', y='val', data=df)
xticks = ax.get_xticks()
xticklabels = [x.get_text() for x in ax.get_xticklabels()]
_ = ax.set_xticks(xticks, xticklabels, rotation=90)
To show only every other label, blank out the rest:
N = 20
np.random.seed(2022)
dates = pd.date_range('1/1/2014', periods=N, freq='m')
df = pd.DataFrame(data={'dates': dates, 'val': np.random.randn(N)})
df.dates = df.dates.dt.strftime('%Y-%m')
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(x='dates', y='val', data=df)
xticks = ax.get_xticks()
xticklabels = [x.get_text() if not i%2 == 0 else '' for i, x in enumerate(ax.get_xticklabels())]
_ = ax.set_xticks(xticks, xticklabels)
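To illustrate why the date tickers do not apply, here is a small sketch that mimics the integer bar positions (0, 1, ..., N-1) with plain matplotlib and then writes date strings onto every third position by hand. It is a stand-in for what seaborn.barplot does internally, not seaborn itself; the step of 3 and the 'MS' (month start) frequency are arbitrary choices of mine. The Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

N = 20
rng = np.random.default_rng(2022)
dates = pd.date_range("2014-01-01", periods=N, freq="MS")  # month starts
df = pd.DataFrame({"dt": dates, "val": rng.standard_normal(N)})

fig, ax = plt.subplots(figsize=(10, 6))

# Bars sit at integer positions, like the categorical axis of a seaborn barplot
positions = np.arange(N)
ax.bar(positions, df["val"])

# Label every third bar with its formatted date; the other bars get no tick
step = 3
tick_labels = df["dt"].dt.strftime("%Y-%m").tolist()[::step]
ax.set_xticks(positions[::step])
ax.set_xticklabels(tick_labels, rotation=45, ha="right")
```

The labels are plain strings attached to positions 0, 3, 6, and so on, which is why DateLocator and DateFormatter have nothing to work with on such an axis.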

candlestick plot from pandas dataframe, replace index by dates

This code plots candlesticks with moving averages, but the x-axis shows the row index; I need the x-axis to show dates.
What changes are required?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_finance import candlestick2_ohlc
#date format in data-> dd-mm-yyyy
nif = pd.read_csv('data.csv')
#nif['Date'] = pd.to_datetime(nif['Date'], format='%d-%m-%Y', utc=True)
mavg = nif['Close'].ewm(span=50).mean()
mavg1 = nif['Close'].ewm(span=13).mean()
fg, ax1 = plt.subplots()
cl = candlestick2_ohlc(ax=ax1,opens=nif['Open'],highs=nif['High'],lows=nif['Low'],closes=nif['Close'],width=0.4, colorup='#77d879', colordown='#db3f3f')
mavg.plot(ax=ax1,label='50_ema')
mavg1.plot(color='k',ax=ax1, label='13_ema')
plt.legend(loc=4)
plt.subplots_adjust(left=0.09, bottom=0.20, right=0.94, top=0.90, wspace=0.2, hspace=0)
plt.show()
I also had a lot of "fun" with this in the past... Here is one way of doing it using mdates:
import pandas as pd
import pandas_datareader.data as web
import datetime as dt
import matplotlib.pyplot as plt
# matplotlib.finance was removed from matplotlib; the same function lives in mpl_finance
from mpl_finance import candlestick_ohlc
import matplotlib.dates as mdates
ticker = 'MCD'
start = dt.date(2014, 1, 1)
#Gathering the data
data = web.DataReader(ticker, 'yahoo', start)
#Calc moving average
data['MA10'] = data['Adj Close'].rolling(window=10).mean()
data['MA60'] = data['Adj Close'].rolling(window=60).mean()
data.reset_index(inplace=True)
# convert the datetime column to matplotlib's float date numbers
data['Date'] = mdates.date2num(data['Date'])
#Plot candlestick chart
fig, ax1 = plt.subplots()
# candlestick_ohlc expects rows of (time, open, high, low, close)
candlestick_ohlc(ax1, data[['Date', 'Open', 'High', 'Low', 'Close']].values, width=0.6, colorup='g', colordown='r')
# Draw the moving averages on the same axes
ax1.plot(data.Date, data['MA10'], label='MA_10')
ax1.plot(data.Date, data['MA60'], label='MA_60')
ax1.xaxis_date()
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
plt.ylabel("Price")
plt.title(ticker)
ax1.grid(True)
plt.legend(loc='best')
plt.xticks(rotation=45)
plt.show()
Hope this helps.
Given a simple df with ts, open, high, low and close columns, using plotly:
import plotly.figure_factory
fig = plotly.figure_factory.create_candlestick(df.open, df.high, df.low, df.close, dates=df.ts)
fig.show()
will automatically parse the ts column to be displayed correctly on x.
Clunky workaround here, derived from another post (if I can find it again, I will reference it). Using a pandas df, plot by index and then set the x-axis tick labels to date strings for display. I am new to python / matplotlib, and this solution is not so flexible, but it basically works. Also, using a pd index for plotting removes the blank 'weekend' gaps in daily market price data.
Matplotlib xaxis index as dates
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_finance import candlestick2_ohlc
from mpl_finance import candlestick_ohlc
# In Jupyter, enable interactive figures:
%matplotlib notebook
# CSV format: m/d/Y,Open,High,Low,Close,Adj Close,Volume
# csv data does not include NaN, or 'weekend' lines,
# only dates from which prices are recorded
DJIA = pd.read_csv('yourFILE.csv')
print(DJIA.head())
fg, ax1 = plt.subplots()
cl =candlestick2_ohlc(ax=ax1,opens=DJIA['Open'],
highs=DJIA['High'],lows=DJIA['Low'],
closes=DJIA['Close'],width=0.4, colorup='#77d879',
colordown='#db3f3f')
ax1.set_xticks(np.arange(len(DJIA)))
ax1.set_xticklabels(DJIA['Date'], fontsize=6, rotation=-90)
plt.show()
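A slightly more flexible take on the same workaround, as a sketch under several assumptions: the prices are synthetic, the candles are drawn with plain matplotlib (vlines plus bar) instead of mpl_finance so the example has no extra dependency, and index_to_date is a hypothetical helper name. The idea is to keep plotting by row index, but let a FuncFormatter look up the date for whatever ticks matplotlib picks, instead of labelling every row. The Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove it to get a window
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
import pandas as pd

# Synthetic OHLC data on business days only (no weekend rows)
dates = pd.bdate_range("2022-01-03", periods=60)
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, len(dates)).cumsum()
open_ = np.r_[close[0], close[:-1]]
df = pd.DataFrame({"Date": dates, "Open": open_, "Close": close})
df["High"] = df[["Open", "Close"]].max(axis=1) + 0.5
df["Low"] = df[["Open", "Close"]].min(axis=1) - 0.5

# Plot against the row index so weekends leave no gaps
x = np.arange(len(df))
colors = np.where(df["Close"] >= df["Open"], "#77d879", "#db3f3f").tolist()
fig, ax = plt.subplots()
ax.vlines(x, df["Low"], df["High"], colors=colors, linewidth=1)  # wicks
ax.bar(x, df["Close"] - df["Open"], bottom=df["Open"], width=0.6, color=colors)  # bodies

def index_to_date(value, pos=None):
    """Map an integer tick position back to that row's date string."""
    i = int(round(value))
    if 0 <= i < len(df):
        return df["Date"].iloc[i].strftime("%d-%m-%Y")
    return ""

# At most about 8 integer ticks, each labelled with the date of that row
ax.xaxis.set_major_locator(ticker.MaxNLocator(nbins=8, integer=True))
ax.xaxis.set_major_formatter(ticker.FuncFormatter(index_to_date))
fig.autofmt_xdate()
```

Since the formatter is called for whichever ticks are currently visible, the labels stay readable after zooming, which a fixed set_xticklabels list does not do.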
