Polar chart of yearly data with matplolib - python

I am trying to plot data for a whole year as a polar chart in matplotlib, and having trouble locating any examples of this. I managed to convert the dates from pandas according to this thread, but I can't wrap my head around (litterally) the y-axis, or theta.
This is how far I got:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
times = pd.date_range("01/01/2016", "12/31/2016")
rand_nums = np.random.rand(len(times),1)
df = pd.DataFrame(index=times, data=rand_nums, columns=['A'])
ax = plt.subplot(projection='polar')
ax.set_theta_direction(-1)
ax.set_theta_zero_location("N")
ax.plot(mdates.date2num(df.index.to_pydatetime()), df['A'])
plt.show()
which gives me this plot:
reducing the date range to understand what is going on
times = pd.date_range("01/01/2016", "01/05/2016") I get this plot:
I gather that the start of the series is between 90 and 135, but how can I 'remap' this so that my year date range starts and finishes at the north origin?

The polar plot's angle range ranges over a full circle in radiants, i.e. [0, 2π].
One would therefore need to normalize the date range to the full circle.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
times = pd.date_range("01/01/2016", "12/31/2016")
rand_nums = np.random.rand(len(times),1)
df = pd.DataFrame(index=times, data=rand_nums, columns=['A'])
ax = plt.subplot(projection='polar')
ax.set_theta_direction(-1)
ax.set_theta_zero_location("N")
t = mdates.date2num(df.index.to_pydatetime())
y = df['A']
tnorm = (t-t.min())/(t.max()-t.min())*2.*np.pi
ax.fill_between(tnorm,y ,0, alpha=0.4)
ax.plot(tnorm,y , linewidth=0.8)
plt.show()

Related

How to remove the first and last minor tick month labels on matplotlib?

I want to generate a chart with the 12 months of a year as the x-axis labels, i.e. 'Jan' to 'Dec', positioned in the middle between the major ticks. I used the code from https://matplotlib.org/3.4.3/gallery/ticks_and_spines/centered_ticklabels.html to create the x-axis. The x-axis created has an additional 'Dec' on the left and 'Jan' on the right, i.e. a total of 14 labels instead of 12 (see attached image). However, only 'Jan' to 'Dec' are wanted on the chart. I would like to know how to remove the 'Dec' label on the left and 'Jan' label on the right? My google searches were only successful with solutions to remove all minor tick labels. Any help will be much appreciated.
I use the following code to generate the chart:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
df = pd.DataFrame(np.random.randint(0,100,size=(365, 2)), columns=list('AB'))
df.index = pd.date_range(start='1/1/2022', end='12/31/2022').strftime('%b-%d')
plt.figure()
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonthday=16))
ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%b'))
for tick in ax.xaxis.get_minor_ticks():
tick.tick1line.set_markersize(0)
tick.tick2line.set_markersize(0)
tick.label1.set_horizontalalignment('center')
plt.plot(df['A'], linewidth=0.5, color='tab:red')
plt.show()
enter image description here
Try setting your x-axis limit to values between 0 and 365. Sometimes matplotlib uses values a little outside of your data. This way, the first Dec and last Jan are automatically eliminated from the plot.
Here I modified your code with 1 argument: plt.xlim(0,365)
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
df = pd.DataFrame(np.random.randint(0,100,size=(365, 2)), columns=list('AB'))
df.index = pd.date_range(start='1/1/2022', end='12/31/2022').strftime('%b-%d')
plt.figure()
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator(bymonthday=16))
ax.xaxis.set_major_formatter(ticker.NullFormatter())
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%b'))
for tick in ax.xaxis.get_minor_ticks():
tick.tick1line.set_markersize(0)
tick.tick2line.set_markersize(0)
tick.label1.set_horizontalalignment('center')
plt.xlim(0,365)
plt.plot(df['A'], linewidth=0.5, color='tab:red')
plt.show()

Changing the tick frequency on the x-axis

I am trying to plot a bar chart with the date vs the price of a crypto currency from a dataframe and have 731 daily samples. When i plot the graph i get the image as seen below. Due to the amount of dates the x axis is unreadable and i would like to make it so it only labels the 1st of every month on the x-axis.
This is the graph i currently have: https://imgur.com/a/QVNn4Zp
I have tried using other methods i have found online both in stackoverflow and other sources such as youtube but had no success.
This is the Code i have so far to plot the bar chart.
df.plot(kind='bar',x='Date',y='Price in USD (at 00:00:00 UTC)',color='red')
plt.show()
One option is to plot a numeric barplot with matplotlib.
Matplotlib < 3.0
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'Date': idx, 'A':np.random.random(365)})
fig, ax = plt.subplots()
dates = mdates.date2num(df["Date"].values)
ax.bar(dates, df["A"], width=1)
loc = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(loc)
ax.xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
plt.show()
Matplotlib >= 3.0
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
pd.plotting.register_matplotlib_converters()
start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods= 365)
df = pd.DataFrame({'Date': idx, 'A':np.random.random(365)})
fig, ax = plt.subplots()
ax.bar(df["Date"], df["A"], width=1)
plt.show()
Further options:
For other options see Pandas bar plot changes date format

datetime x-axis matplotlib labels causing uncontrolled overlap

I'm trying to plot a pandas series with a 'pandas.tseries.index.DatetimeIndex'. The x-axis label stubbornly overlap, and I cannot make them presentable, even with several suggested solutions.
I tried stackoverflow solution suggesting to use autofmt_xdate but it doesn't help.
I also tried the suggestion to plt.tight_layout(), which fails to make an effect.
ax = test_df[(test_df.index.year ==2017) ]['error'].plot(kind="bar")
ax.figure.autofmt_xdate()
#plt.tight_layout()
print(type(test_df[(test_df.index.year ==2017) ]['error'].index))
UPDATE: That I'm using a bar chart is an issue. A regular time-series plot shows nicely-managed labels.
A pandas bar plot is a categorical plot. It shows one bar for each index at integer positions on the scale. Hence the first bar is at position 0, the next at 1 etc. The labels correspond to the dataframes' index. If you have 100 bars, you'll end up with 100 labels. This makes sense because pandas cannot know if those should be treated as categories or ordinal/numeric data.
If instead you use a normal matplotlib bar plot, it will treat the dataframe index numerically. This means the bars have their position according to the actual dates and labels are placed according to the automatic ticker.
import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
datelist = pd.date_range(pd.datetime(2017, 1, 1).strftime('%Y-%m-%d'), periods=42).tolist()
df = pd.DataFrame(np.cumsum(np.random.randn(42)),
columns=['error'], index=pd.to_datetime(datelist))
plt.bar(df.index, df["error"].values)
plt.gcf().autofmt_xdate()
plt.show()
The advantage is then in addition that matplotlib.dates locators and formatters can be used. E.g. to label each first and fifteenth of a month with a custom format,
import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
datelist = pd.date_range(pd.datetime(2017, 1, 1).strftime('%Y-%m-%d'), periods=93).tolist()
df = pd.DataFrame(np.cumsum(np.random.randn(93)),
columns=['error'], index=pd.to_datetime(datelist))
plt.bar(df.index, df["error"].values)
plt.gca().xaxis.set_major_locator(mdates.DayLocator((1,15)))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%d %b %Y"))
plt.gcf().autofmt_xdate()
plt.show()
In your situation, the easiest would be to manually create labels and spacing, and apply that using ax.xaxis.set_major_formatter.
Here's a possible solution:
Since no sample data was provided, I tried to mimic the structure of your dataset in a dataframe with some random numbers.
The setup:
# imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
# A dataframe with random numbers ro run tests on
np.random.seed(123456)
rows = 100
df = pd.DataFrame(np.random.randint(-10,10,size=(rows, 1)), columns=['error'])
datelist = pd.date_range(pd.datetime(2017, 1, 1).strftime('%Y-%m-%d'), periods=rows).tolist()
df['dates'] = datelist
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)
test_df = df.copy(deep = True)
# Plot of data that mimics the structure of your dataset
ax = test_df[(test_df.index.year ==2017) ]['error'].plot(kind="bar")
ax.figure.autofmt_xdate()
plt.figure(figsize=(15,8))
A possible solution:
test_df = df.copy(deep = True)
ax = test_df[(test_df.index.year ==2017) ]['error'].plot(kind="bar")
plt.figure(figsize=(15,8))
# Make a list of empty myLabels
myLabels = ['']*len(test_df.index)
# Set labels on every 20th element in myLabels
myLabels[::20] = [item.strftime('%Y - %m') for item in test_df.index[::20]]
ax.xaxis.set_major_formatter(ticker.FixedFormatter(myLabels))
plt.gcf().autofmt_xdate()
# Tilt the labels
plt.setp(ax.get_xticklabels(), rotation=30, fontsize=10)
plt.show()
You can easily change the formatting of labels by checking strftime.org

Plotting candlestick with matplotlib for time series w/o weekend gaps

trying to plot a candlestick serie after importing datas from yahoo-finance. I'm using python 2.7
I have already a serie plotted and I want to add the same one as candlestick but I don't see how I can do that :
import matplotlib.pyplot as plt
from matplotlib.finance import candlestick2_ohlc
#Reset the index to remove Date column from index
df_ohlc = data.reset_index()
#Naming columns
df_ohlc.columns = ["Date","Open","High",'Low',"Close", "Adj Close", "Volume"]
#Normal plot
ax1 = plt.subplot()
ax1.plot(df_ohlc["Date"], df_ohlc["Close"], label = "Price", color="blue", linewidth=2.0)
#Candle plot
candlestick2_ohlc(ax1,df_ohlc['Open'],df_ohlc['High'],df_ohlc['Low'],df_ohlc['Close'],width=0.6)
If I plot candlestick alone, it looks fine but the x axis is a list of integers.
If I plot candlestick alone after converting df_ohlc["Date"] to float then reconverting to datetime, it plots the serie with the correct x axis but there are gaps on the weekend even if the serie isn't defined for these dates.
Is there a way to plot both series at the same time ? I'm planning to add more series like moving average, OLS, Bollinger etc...
You can remove weekend gaps and make human-readable dates xticklabels in this way. Note that, this script is written in python 3 and there may be some differences from python 2.
import quandl
import numpy as np
from mpl_finance import candlestick_ohlc
import matplotlib.pyplot as plt
# getting data and modifying it to remove gaps at weekends
r = quandl.get('WIKI/AAPL', start_date='2016-01-01', end_date='2017-11-10')
date_list = np.array(r.index.to_pydatetime())
plot_array = np.zeros([len(r), 5])
plot_array[:, 0] = np.arange(plot_array.shape[0])
plot_array[:, 1:] = r.iloc[:, :4]
# plotting candlestick chart
fig, ax = plt.subplots()
num_of_bars = 100 # the number of candlesticks to be plotted
candlestick_ohlc(ax, plot_array[-num_of_bars:], colorup='g', colordown='r')
ax.margins(x=0.0, y=0.1)
ax.yaxis.tick_right()
x_tick_labels = []
ax.set_xlim(right=plot_array[-1, 0]+10)
ax.grid(True, color='k', ls='--', alpha=0.2)
# setting xticklabels actual dates instead of numbers
indices = np.linspace(plot_array[-num_of_bars, 0], plot_array[-1, 0], 8, dtype=int)
for i in indices:
date_dt = date_list[i]
date_str = date_dt.strftime('%b-%d')
x_tick_labels.append(date_str)
ax.set(xticks=indices, xticklabels=x_tick_labels)
plt.show()
I really need more information about your code and your dataframe, but you can use this example to do a candlestick
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from matplotlib.finance import candlestick_ohlc
import matplotlib.dates as mdates
import datetime as dt
#Reset the index to remove Date column from index
df_ohlc = df.reset_index()
#Naming columns
df_ohlc.columns = ["Date","Open","High",'Low',"Close", "Adj Close", "Volume"]
#Converting dates column to float values
df_ohlc['Date'] = df_ohlc['Date'].map(mdates.date2num)
#Making plot
fig = plt.figure()
fig.autofmt_xdate()
ax1 = plt.subplot2grid((6,1), (0,0), rowspan=6, colspan=1)
#Converts raw mdate numbers to dates
ax1.xaxis_date()
plt.xlabel("Date")
print(df_ohlc)
#Making candlestick plot
candlestick_ohlc(ax1,df_ohlc.values,width=1, colorup='g', colordown='k',alpha=0.75)
plt.ylabel("Price")
plt.legend()
plt.show()

ploting subplot in matplotlib with pandas issue

i am try to plot subplot in matplotlib with pandas but there are issue i am facing. when i am plot subplot not show the date of stock...there is my program
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import pandas.io.data
df = pd.io.data.get_data_yahoo('goog', start=datetime.datetime(2008,1,1),end=datetime.datetime(2014,10,23))
fig = plt.figure()
r = fig.patch
r.set_facecolor('#0070BB')
ax1 = fig.add_subplot(2,1,1, axisbg='#0070BB')
ax1.grid(True)
ax1.plot(df['Close'])
ax2 = fig.add_subplot(2,1,2, axisbg='#0070BB')
ax2.plot(df['Volume'])
plt.show()
run this program own your self and solve date issue.....
When you're calling matplotlib's plot(), you are only giving it one array (e.g. df['Close'] in the first case). When there's only one array, matplotlib doesn't know what to use for the x axis data, so it just uses the index of the array. This is why your x axis shows the numbers 0 to 160: there are presumably 160 items in your array.
Use ax1.plot(df.index, df['Close']) instead, since df.index should hold the date values in your pandas dataframe.
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import pandas.io.data
df = pd.io.data.get_data_yahoo('goog', start=datetime.datetime(2008,1,1),end=datetime.datetime(2014,10,23))
fig = plt.figure()
r = fig.patch
r.set_facecolor('#0070BB')
ax1 = fig.add_subplot(2,1,1, axisbg='#0070BB')
ax1.grid(True)
ax1.plot(df.index, df['Close'])
ax2 = fig.add_subplot(2,1,2, axisbg='#0070BB')
ax2.plot(df.index, df['Volume'])
plt.show()

Categories