Relative Strength Index in python pandas - python

I am new to pandas. What is the best way to calculate the relative strength part in the RSI indicator in pandas? So far I got the following:
from pylab import *
import pandas as pd
import numpy as np
def Datapull(Stock):
    try:
        df = (pd.io.data.DataReader(Stock,'yahoo',start='01/01/2010'))
        return df
        print 'Retrieved', Stock
        time.sleep(5)
    except Exception, e:
        print 'Main Loop', str(e)
def RSIfun(price, n=14):
    delta = price['Close'].diff()
    #-----------
    dUp=
    dDown=
    RolUp=pd.rolling_mean(dUp, n)
    RolDown=pd.rolling_mean(dDown, n).abs()
    RS = RolUp / RolDown
    rsi= 100.0 - (100.0 / (1.0 + RS))
    return rsi
Stock='AAPL'
df=Datapull(Stock)
RSIfun(df)
Am I doing it correctly so far? I am having trouble with the difference part of the equation, where you separate the upward and downward moves.

It is important to note that there are various ways of defining the RSI. It is commonly defined in at least two ways: using a simple moving average (SMA) as above, or using an exponential moving average (EMA). Here's a code snippet that calculates various definitions of RSI and plots them for comparison. I'm discarding the first row after taking the difference, since it is always NaN by definition.
Note that when using EMA one has to be careful: since it includes a memory going back to the beginning of the data, the result depends on where you start! For this reason, typically people will add some data at the beginning, say 100 time steps, and then cut off the first 100 RSI values.
In the plot below, one can see the difference between the RSI calculated using SMA and EMA: the SMA one tends to be more sensitive. Note that the RSI based on EMA has its first finite value at the first time step (which is the second time step of the original period, due to discarding the first row), whereas the RSI based on SMA has its first finite value at the 14th time step. This is because, by default, a rolling mean such as s.rolling(length).mean() only returns a finite value once there are enough values to fill the window.
import datetime
from typing import Callable
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pandas_datareader.data as web
# Window length for moving average
length = 14
# Dates
start, end = '2010-01-01', '2013-01-27'
# Get data
data = web.DataReader('AAPL', 'yahoo', start, end)
# Get just the adjusted close
close = data['Adj Close']
# Define function to calculate the RSI
def calc_rsi(over: pd.Series, fn_roll: Callable) -> pd.Series:
    # Get the difference in price from previous step
    delta = over.diff()
    # Get rid of the first row, which is NaN since it did not have a previous row to calculate the differences
    delta = delta[1:]
    # Make the positive gains (up) and negative gains (down) Series
    up, down = delta.clip(lower=0), delta.clip(upper=0).abs()
    roll_up, roll_down = fn_roll(up), fn_roll(down)
    rs = roll_up / roll_down
    rsi = 100.0 - (100.0 / (1.0 + rs))
    # Avoid division-by-zero if `roll_down` is zero
    # This prevents inf and/or nan values.
    rsi[:] = np.select([roll_down == 0, roll_up == 0, True], [100, 0, rsi])
    rsi.name = 'rsi'
    # Assert range
    valid_rsi = rsi[length - 1:]
    assert ((0 <= valid_rsi) & (valid_rsi <= 100)).all()
    # Note: rsi[:length - 1] is excluded from above assertion because it is NaN for SMA.
    return rsi
# Calculate RSI using MA of choice
# Reminder: Provide ≥ `1 + length` extra data points!
rsi_ema = calc_rsi(close, lambda s: s.ewm(span=length).mean())
rsi_sma = calc_rsi(close, lambda s: s.rolling(length).mean())
rsi_rma = calc_rsi(close, lambda s: s.ewm(alpha=1 / length).mean()) # Approximates TradingView.
# Compare graphically
plt.figure(figsize=(8, 6))
rsi_ema.plot(), rsi_sma.plot(), rsi_rma.plot()
plt.legend(['RSI via EMA/EWMA', 'RSI via SMA', 'RSI via RMA/SMMA/MMA (TradingView)'])
plt.show()
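The warm-up effect described above (an EMA-based RSI depends on where the data starts) can be checked without downloading anything. The following is a minimal sketch on synthetic random-walk prices; the helper name rsi_wilder, the seed, and the 100-step offset are illustrative assumptions, not part of the answer's code.

```python
import numpy as np
import pandas as pd

def rsi_wilder(close: pd.Series, length: int = 14) -> pd.Series:
    """RSI with recursive (Wilder-style) smoothing, alpha = 1/length."""
    delta = close.diff().iloc[1:]          # drop the leading NaN
    up = delta.clip(lower=0)               # gains
    down = -delta.clip(upper=0)            # losses as positive numbers
    roll_up = up.ewm(alpha=1 / length, adjust=False).mean()
    roll_down = down.ewm(alpha=1 / length, adjust=False).mean()
    # Algebraically the same as 100 - 100 / (1 + roll_up / roll_down)
    return 100 * roll_up / (roll_up + roll_down)

# Synthetic prices (assumption: a random walk is good enough for the demo)
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 400).cumsum())

full = rsi_wilder(close)             # smoothing starts at the first row
late = rsi_wilder(close.iloc[100:])  # smoothing starts 100 rows later

# Compare the two on the rows they share
gap = (full - late).dropna().abs()
print("gap at first shared row:", gap.iloc[0])
print("gap at last shared row: ", gap.iloc[-1])
```

The gap is large on the first shared row and shrinks by a factor of (length - 1) / length per step, which is why extra leading data is usually added and then discarded.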

dUp= delta[delta > 0]
dDown= delta[delta < 0]
You also need something like:
RolUp = RolUp.reindex_like(delta, method='ffill')
RolDown = RolDown.reindex_like(delta, method='ffill')
Otherwise RS = RolUp / RolDown will not do what you want.
Edit: this seems to be a more accurate way to calculate RS:
# dUp= delta[delta > 0]
# dDown= delta[delta < 0]
# dUp = dUp.reindex_like(delta, fill_value=0)
# dDown = dDown.reindex_like(delta, fill_value=0)
dUp, dDown = delta.copy(), delta.copy()
dUp[dUp < 0] = 0
dDown[dDown > 0] = 0
RolUp = pd.rolling_mean(dUp, n)
RolDown = pd.rolling_mean(dDown, n).abs()
RS = RolUp / RolDown

My answer is tested on StockCharts sample data.
StockChart RSI info
import numpy as np
import pandas as pd
def RSI(series, period):
    delta = series.diff().dropna()
    u = delta * 0
    d = u.copy()
    u[delta > 0] = delta[delta > 0]
    d[delta < 0] = -delta[delta < 0]
    # Seed: the first smoothed value is the simple average of the first `period` gains
    u.loc[u.index[period-1]] = u.iloc[:period].mean()
    u = u.drop(u.index[:(period-1)])
    # Seed: the first smoothed value is the simple average of the first `period` losses
    d.loc[d.index[period-1]] = d.iloc[:period].mean()
    d = d.drop(d.index[:(period-1)])
    # com=period-1 with adjust=False corresponds to alpha = 1/period (Wilder smoothing)
    rs = u.ewm(com=period-1, adjust=False).mean() / \
         d.ewm(com=period-1, adjust=False).mean()
    return 100 - 100 / (1 + rs)
#sample data from StockCharts
data = pd.Series( [ 44.34, 44.09, 44.15, 43.61,
                    44.33, 44.83, 45.10, 45.42,
                    45.84, 46.08, 45.89, 46.03,
                    45.61, 46.28, 46.28, 46.00,
                    46.03, 46.41, 46.22, 45.64 ] )
print(RSI( data, 14 ))
#output
14 70.464135
15 66.249619
16 66.480942
17 69.346853
18 66.294713
19 57.915021

I too had this question and was working down the rolling_apply path that Jev took. However, when I tested my results, they didn't match up against the commercial stock charting programs I use, such as StockCharts.com or thinkorswim. So I did some digging and discovered that when Welles Wilder created the RSI, he used a smoothing technique now referred to as Wilder Smoothing. The commercial services above use Wilder Smoothing rather than a simple moving average to calculate the average gains and losses.
I'm new to Python (and Pandas), so I'm wondering if there's some brilliant way to refactor out the for loop below to make it faster. Maybe someone else can comment on that possibility.
I hope you find this useful.
More info here.
def get_rsi_timeseries(prices, n=14):
    # RSI = 100 - (100 / (1 + RS))
    # where RS = (Wilder-smoothed n-period average of gains / Wilder-smoothed n-period average of -losses)
    # Note that losses above should be positive values
    # Wilder-smoothing = ((previous smoothed avg * (n-1)) + current value to average) / n
    # For the very first "previous smoothed avg" (aka the seed value), we start with a straight average.
    # Therefore, our first RSI value will be for the n+2nd period:
    #     0: first delta is nan
    #     1:
    #     ...
    #     n: lookback period for first Wilder smoothing seed value
    #     n+1: first RSI
    # First, calculate the gain or loss from one price to the next. The first value is nan so replace with 0.
    deltas = (prices-prices.shift(1)).fillna(0)
    # Calculate the straight average seed values.
    # The first delta is always zero, so we will use a slice of the first n deltas starting at 1,
    # and filter only deltas > 0 to get gains and deltas < 0 to get losses
    seed = deltas.iloc[1:n+1]
    avg_of_gains = seed[seed > 0].sum() / n
    avg_of_losses = -seed[seed < 0].sum() / n
    # Set up pd.Series container for RSI values
    rsi_series = pd.Series(0.0, deltas.index)
    # Now calculate RSI using the Wilder smoothing method, starting with n+1 delta.
    up = lambda x: x if x > 0 else 0
    down = lambda x: -x if x < 0 else 0
    i = n+1
    for d in deltas.iloc[n+1:]:
        avg_of_gains = ((avg_of_gains * (n-1)) + up(d)) / n
        avg_of_losses = ((avg_of_losses * (n-1)) + down(d)) / n
        if avg_of_losses != 0:
            rs = avg_of_gains / avg_of_losses
            # .iloc keeps the write positional, whatever the index type
            rsi_series.iloc[i] = 100 - (100 / (1 + rs))
        else:
            rsi_series.iloc[i] = 100
        i += 1
    return rsi_series
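On the question of refactoring out the for loop: the Wilder recurrence avg = (previous avg * (n-1) + current) / n is an exponential moving average with alpha = 1/n, so pandas' ewm with adjust=False can reproduce it once the seed value is put in place. The sketch below only checks that equivalence on random data; the function names wilder_loop and wilder_ewm are illustrative, and the seeding (simple mean of the first n values) is an assumption that differs slightly from the indexing used in the function above.

```python
import numpy as np
import pandas as pd

def wilder_loop(values: np.ndarray, n: int) -> np.ndarray:
    """Wilder smoothing with an explicit loop, seeded with a simple mean."""
    out = np.full(len(values), np.nan)
    out[n - 1] = values[:n].mean()                  # seed value
    for i in range(n, len(values)):
        out[i] = (out[i - 1] * (n - 1) + values[i]) / n
    return out

def wilder_ewm(values: np.ndarray, n: int) -> np.ndarray:
    """Same smoothing via pandas ewm(alpha=1/n, adjust=False)."""
    s = pd.Series(values, dtype=float)
    seeded = s.iloc[n - 1:].copy()
    seeded.iloc[0] = s.iloc[:n].mean()              # same seed as the loop
    smooth = seeded.ewm(alpha=1 / n, adjust=False).mean()
    # Put NaN back in front so both outputs have the same length
    return smooth.reindex(s.index).to_numpy()

rng = np.random.default_rng(1)
gains = rng.random(60)                              # stand-in for a gains series
loop_result = wilder_loop(gains, 14)
ewm_result = wilder_ewm(gains, 14)
print(np.allclose(loop_result, ewm_result, equal_nan=True))
```

Applying the same smoothing to the gains and to the losses and dividing the two gives RS without a Python-level loop.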

You can use rolling(...).apply (pd.rolling_apply in older pandas versions) in combination with a subfunction to make a clean function like this:
def rsi(price, n=14):
    ''' rsi indicator '''
    gain = (price-price.shift(1)).fillna(0) # calculate price gain with previous day, first row nan is filled with 0
    def rsiCalc(p):
        # subfunction for calculating rsi for one lookback period
        avgGain = p[p>0].sum()/n
        avgLoss = -p[p<0].sum()/n
        if avgLoss == 0:
            # no losses in the window: avoid dividing by zero
            return 100.0
        rs = avgGain/avgLoss
        return 100 - 100/(1+rs)
    # run for all periods with a rolling window (pd.rolling_apply has been removed from pandas)
    return gain.rolling(n).apply(rsiCalc)

# Relative Strength Index
# RSI = 100 * Avg(PriceUp) / (Avg(PriceUp) + Avg(PriceDown))
# Where: PriceUp(t) = 1*(Price(t)-Price(t-1)) when Price(t)-Price(t-1) > 0;
#        PriceDown(t) = -1*(Price(t)-Price(t-1)) when Price(t)-Price(t-1) < 0;
# Change the formula for your own requirement
def rsi(values):
    up = values[values>0].mean()
    down = -1*values[values<0].mean()
    return 100 * up / (up + down)
stock['RSI_6D'] = stock['Momentum_1D'].rolling(center=False,window=6).apply(rsi)
stock['RSI_12D'] = stock['Momentum_1D'].rolling(center=False,window=12).apply(rsi)
Here Momentum_1D = P(t) - P(t-1), where P is the closing price and t is the date.
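For readers comparing this with the other answers: the form 100 * up / (up + down) is the same quantity as 100 - 100 / (1 + RS). A short derivation, writing U for the average up move and D for the average down move (both positive; the symbols U and D are introduced here only for illustration):

```latex
\begin{aligned}
\mathrm{RS}  &= \frac{U}{D},\\
\mathrm{RSI} &= 100 - \frac{100}{1+\mathrm{RS}}
              = 100\cdot\frac{\mathrm{RS}}{1+\mathrm{RS}}
              = 100\cdot\frac{U/D}{(D+U)/D}
              = 100\cdot\frac{U}{U+D}.
\end{aligned}
```

The identity holds for any way of averaging, so the two forms differ only in how U and D are computed (simple mean, Wilder smoothing, and so on).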

You can get a massive speed up of Bill's answer by using numba. 100 loops of 20k row series( regular = 113 seconds, numba = 0.28 seconds ). Numba excels with loops and arithmetic.
import numpy as np
import numba as nb
@nb.jit(fastmath=True, nopython=True)
def calc_rsi( array, deltas, avg_gain, avg_loss, n ):
    # Use Wilder smoothing method
    up = lambda x: x if x > 0 else 0
    down = lambda x: -x if x < 0 else 0
    i = n+1
    for d in deltas[n+1:]:
        avg_gain = ((avg_gain * (n-1)) + up(d)) / n
        avg_loss = ((avg_loss * (n-1)) + down(d)) / n
        if avg_loss != 0:
            rs = avg_gain / avg_loss
            array[i] = 100 - (100 / (1 + rs))
        else:
            array[i] = 100
        i += 1
    return array
def get_rsi( array, n = 14 ):
    deltas = np.append([0],np.diff(array))
    # Seed values: simple averages of the first n gains and losses
    avg_gain = np.sum(deltas[1:n+1].clip(min=0)) / n
    avg_loss = -np.sum(deltas[1:n+1].clip(max=0)) / n
    array = np.empty(deltas.shape[0])
    array.fill(np.nan)
    array = calc_rsi( array, deltas, avg_gain, avg_loss, n )
    return array
rsi = get_rsi( prices, 14 ) # prices: a NumPy array or a pandas Series

def rsi_Indictor(close,n_days):
    n = n_days
    rsi_series = pd.DataFrame(close)
    idx = rsi_series.index
    # Change = close[i] - close[i-1]
    rsi_series["Change"] = (rsi_series["Close"] - rsi_series["Close"].shift(1)).fillna(0)
    # Upward movement
    rsi_series["Upword Movement"] = (rsi_series["Change"][rsi_series["Change"] >0])
    rsi_series["Upword Movement"] = rsi_series["Upword Movement"].fillna(0)
    # Downward movement
    rsi_series["Downword Movement"] = (abs(rsi_series["Change"])[rsi_series["Change"] <0]).fillna(0)
    rsi_series["Downword Movement"] = rsi_series["Downword Movement"].fillna(0)
    # Average upward movement
    # The first value is the mean of the first n upward movements.
    rsi_series["Average Upword Movement"] = 0.00
    rsi_series.loc[idx[n], "Average Upword Movement"] = rsi_series["Upword Movement"].iloc[1:n+1].mean()
    # From the second value onwards
    for i in range(n+1,len(rsi_series),1):
        #print(rsi_series["Average Upword Movement"][i-1],rsi_series["Upword Movement"][i])
        rsi_series.loc[idx[i], "Average Upword Movement"] = (rsi_series["Average Upword Movement"].iloc[i-1]*(n-1)+rsi_series["Upword Movement"].iloc[i])/n
    # Average downward movement
    # The first value is the mean of the first n downward movements.
    rsi_series["Average Downword Movement"] = 0.00
    rsi_series.loc[idx[n], "Average Downword Movement"] = rsi_series["Downword Movement"].iloc[1:n+1].mean()
    # From the second value onwards
    for i in range(n+1,len(rsi_series),1):
        #print(rsi_series["Average Downword Movement"][i-1],rsi_series["Downword Movement"][i])
        rsi_series.loc[idx[i], "Average Downword Movement"] = (rsi_series["Average Downword Movement"].iloc[i-1]*(n-1)+rsi_series["Downword Movement"].iloc[i])/n
    # Relative strength
    rsi_series["Relative Strength"] = (rsi_series["Average Upword Movement"]/rsi_series["Average Downword Movement"]).fillna(0)
    # RSI
    rsi_series["RSI"] = 100 - 100/(rsi_series["Relative Strength"]+1)
    return rsi_series.round(2)
For More Information

To add to the answers above, you can also do this with the finta package.
ref: https://github.com/peerchemist/finta/tree/master/examples
import pandas as pd
from finta import TA
import matplotlib.pyplot as plt
ohlc = pd.read_csv("C:\\WorkSpace\\Python\\ta-lib\\intraday_5min_IBM.csv", index_col="timestamp", parse_dates=True)
ohlc['RSI']= TA.RSI(ohlc)

It is not really necessary to calculate the mean, because after they are divided, you only need to calculate the sum, so we can use Series.cumsum ...
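def rsi(serie, n):
    diff_serie = serie.diff()
    cumsum_incr = diff_serie.where(lambda x: x.gt(0), 0).cumsum()
    cumsum_decr = diff_serie.where(lambda x: x.lt(0), 0).abs().cumsum()
    # Sums over the last n differences, taken as differences of the cumulative sums
    sum_incr = cumsum_incr - cumsum_incr.shift(n)
    sum_decr = cumsum_decr - cumsum_decr.shift(n)
    rs_serie = sum_incr.div(sum_decr)
    rsi = rs_serie.mul(100).div(rs_serie.add(1)).fillna(0)
    return rsi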

Less code here but seems to work for me:
df['Change'] = (df['Close'].shift(-1)-df['Close']).shift(1)
df['ChangeAverage'] = df['Change'].rolling(window=2).mean()
df['ChangeAverage+'] = df.apply(lambda x: x['ChangeAverage'] if x['ChangeAverage'] > 0 else 0,axis=1).rolling(window=14).mean()
df['ChangeAverage-'] = df.apply(lambda x: x['ChangeAverage'] if x['ChangeAverage'] < 0 else 0,axis=1).rolling(window=14).mean()*-1
df['RSI'] = 100-(100/(1+(df['ChangeAverage+']/df['ChangeAverage-'])))

Related

Performing a cumulative sum with the loop variable itself

I want my code to loop so that, every time it performs the calculation, it does a cumulative sum over my variable delta_omega, i.e. for every calculation it takes the previous values in the delta_omega array, adds them together, and uses that value to perform the calculation again, and so on. I'm not sure how to go about this, as I also want to plot the results.
import numpy as np
import matplotlib.pyplot as plt
delta_omega = np.linspace(-900*10**6, -100*10**6, m) #Hz - range of frequencies
i = 0
while i<len(delta_omega):
    delta = delta_omega[i] - (k*v_cap) + (mu_eff*B)/hbar
    p_ee = (s0*L/2) / (1 + s0 + (2*delta/L)**2) #population of the excited state
    R = L * p_ee # scattering rate
    F = hbar*k*(R) #scattering force on atoms
    a = F/m_Rb #acceleration assumed constant
    vf_slower = (v_cap**2 - (2*a*z0))**0.5 #velocity at the end of the slower
    t_d = 1/a * (v_cap - vf_slower) #time taken during slower
    # -------- After slower --------
    da = 0.1 #(m) distance from end of slower to the middle of the MOT
    vf_MOT = (vf_slower**2 - (2*a*da))**0.5 #(m/s) - velocity of the particles at MOT center
    t_a = da/vf_MOT #(s) time taken after slower
    r0 = 0.01 #MOT capture radius
    vr_max = r0/(t_b+t_d+t_a) #maximum transversal velocity
    vz_max = (v_cap**2 + 2*a_max*z0)**0.5 #m/s - maximum axial velocity
    # -------- Flux of atoms captured --------
    P = 10**(4.312-(4040/T)) #vapour pressure for liquid phase (use 4.857 for solid phase)
    A = 5*10**-4 #area of the oven aperture
    n = P/(k_b*T) #atomic number density
    f_oven = ((n*A)/4) * (2/(np.pi)**0.5) * ((2*k_b*T)/m_Rb)**0.5
    f = f_oven * (1 - np.exp(-vr_max**2/vp**2))*(1 - np.exp(-vz_max**2/vp**2))
    i+=1
plt.plot(delta_omega, f)
A simple cumulative sum can be done by defining the variable outside the loop and adding to it on each pass:
i = 0
x = 0
while i < 10:
    x = x + 5 #do your work on the cumulative value here
    i += 1
print("cumulative sum: {}".format(x))
So define a variable that will hold the cumulative sum and, on every iteration, add to it.
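Since the question works on a NumPy array and wants to plot the results, every intermediate running total is needed, not just the final one. A minimal sketch of two equivalent ways to get them, using made-up values in place of delta_omega:

```python
import numpy as np

# Illustrative input (assumption: stands in for the delta_omega array)
values = np.array([5.0, 5.0, 5.0, 5.0])

# Vectorised: element k holds values[0] + ... + values[k]
running = np.cumsum(values)

# Same result with an explicit accumulator, storing every intermediate total
total = 0.0
loop_running = []
for v in values:
    total += v
    loop_running.append(total)

print(running)
print(loop_running)
```

Either array can then be used as the input of the per-step calculation and passed to plt.plot together with the results, provided the results are also stored per step rather than overwritten.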

Piecewise objective functions using Pyomo

I'm currently trying to use Pyomo to solve a battery dispatch problem, i.e. given demand, solar generation, the price to buy from the grid and the price to sell back to the grid, when and how much should the battery charge or discharge?
I am new to Pyomo and I have tried to use the following code.
'''
import pyomo.environ as pyomo
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# A piecewise example
# We can bound the X with min and max
# Xmin = -1, Xmax = 1
#
#
#          / Y * SP,  0 <= Y <= 1
# X(Y) =  |
#          \ Y * P , -1 <= Y <= 0
# We consider a flat price for purchasing electricity
df = pd.read_csv('optimal_dispatch_flatprice.csv').iloc[:,1:]
P = df.iloc[:,2] #Price to buy (fixed)
S = df.iloc[:,1] #Solar output
L = df.iloc[:,0] #Demand (load)
SP = df.iloc[:,4] #Price to sell (fixed)
T = len(df)
#Z : charge of battery at time t (how much is in the battery)
Zmin = 0.0
Zmax = 12
#Qt = amount the battery (dis)/charges at time t
Qmin = -5.0
Qmax = 5.0
RANGE_POINTS = {-1.0:-2.4, 0.0:0.0, 1.0:13.46}
def f(model,x):
    return RANGE_POINTS[x]
model = pyomo.ConcreteModel()
# The index sets must exist before they are used to declare variables
times = range(T)
times_plus_1 = range(T+1)
model.Y = pyomo.Var(times, domain=pyomo.Reals)
model.X = pyomo.Var()
# Decisions variables
model.Q = pyomo.Var(times, domain=pyomo.Reals) # how much to (dis)/charge
model.Z = pyomo.Var(times_plus_1, domain=pyomo.NonNegativeReals) # SoB
# constraints
model.cons = pyomo.ConstraintList()
model.cons.add(model.Z[0] == 0)
for t in times:
    model.cons.add(pyomo.inequality(Qmin, model.Q[t], Qmax))
    model.cons.add(pyomo.inequality(Zmin, model.Z[t], Zmax))
    model.cons.add(model.Z[t+1] == model.Z[t] - model.Q[t])
    model.cons.add(model.Y[t] == L[t]- S[t] - model.Q[t])
model.cons = pyomo.Piecewise(model.X,model.Y, # range and domain variables
                             pw_pts=[-1,0,1] ,
                             pw_constr_type='EQ',
                             f_rule=f)
model.cost = pyomo.Objective(expr = model.X, sense=pyomo.minimize)
'''
I get the error "'IndexedVar' object has no attribute 'lb'".
I think this refers to the fact that model.Y is indexed by times.
Can anyone explain how to set the problem up?
Since one of the variables is indexed, you need to provide the index set as the first argument to Piecewise. E.g., Piecewise(times,model.X,model.Y,...

Is it possible to loop to a certain value and carry on further calculations with this value?

I am new here and new in programming, so excuse me if the question is not formulated clearly enough.
For a uni assignment, my labpartner and I are programming a predator-prey system.
In this predator-prey system, there is a certain load factor 'W0'.
We want to find a load factor W0, accurate to 5 significant digits, for which there will never be fewer than 250 predators (wnum[1] in our code), and we need the code to carry on with further calculations using the value found. Here is what we've tried so far, but Python does not seem to give any response:
# Import important stuff and settings
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
print ('Results of Group 4')
def W0():
    W0 = 2.0
    while any(wnum[1])<250:
        W0 = W0-0.0001
    return W0
def W(t):
    if 0 <= t < 3/12:
        Wt = 0
    elif 3/12 <= t <= 8/12:
        Wt = W0
    elif 8/12 < t < 1:
        Wt = 0
    else:
        Wt = W(t - 1)
    return Wt
# Define the right-hand-side function
def rhsf(t,y):
    y1 = y[0]
    y2 = y[1]
    f1 = (2-2*10**-3*y2)*y1-W(t)*y1
    f2 = (-3.92+7*10**-3*y1)*y2
    return np.array([f1,f2])
# Define one step of the RK4 method
def RK4Step(tn,wn,Dt,f):
    # tn = current time
    # wn = known approximation at time tn
    # Dt = the time step to use
    # f = the right-hand-side function to use
    # wnplus1 = the new approximation at time tn+Dt
    k1 = Dt*f(tn,wn)
    k2 = Dt*f(tn+0.5*Dt,wn+0.5*k1)
    k3 = Dt*f(tn+0.5*Dt,wn+0.5*k2)
    k4 = Dt*f(tn+Dt,wn+k3)
    wnplus1 = wn + 1/6*(k1 +2*k2 +2*k3 +k4)
    return wnplus1
# Define the complete RK4 method
def RK4Method(t0,tend,Dt,f,y0):
    # t0 = initial time of simulation
    # tend = final time of simulation
    # Dt = the time step to use
    # f = the right-hand-side function to use
    # y0 = the initial values
    # calculate the number of time steps to take
    N = int(np.round((tend-t0)/Dt))
    # make the list of times t which we want the solution
    time = np.linspace(t0,tend,num=N+1)
    # make sure Dt matches with the number of time steps
    Dt = (tend-t0)/N
    # Allocate memory for the approximations
    # row i represents all values of variable i at all times
    # column j represents all values of all variables at time t_j
    w = np.zeros((y0.size,N+1))
    # store the (given) initial value
    w[:,0] = y0
    # Perform all time steps
    for n,tn in enumerate(time[:-1]):
        w[:,n+1] = RK4Step(tn,w[:,n],Dt,f)
    return time, w
# Set all known values and settings
t0 = 0.0
tend = 10.0
y0 = np.array([600.0,1000.0])
Dt = 0.5/(2**7)
# Execute the method
tnum, wnum = RK4Method(t0,tend,Dt,rhsf,y0)
# Make a nice table
alldata = np.concatenate(([tnum],wnum),axis=0).transpose()
table = pd.DataFrame(alldata,columns=['t','y1(t)','y2(t)'])
print('\nA nice table of the simulation:\n')
print(table)
# Make a nice picture
plt.close('all')
plt.figure()
plt.plot(tnum,wnum[0,:],label='$y_1$',marker='o',linestyle='-')
plt.plot(tnum,wnum[1,:],label='$y_2$',marker='o',linestyle='-')
plt.xlabel('$t$')
plt.ylabel('$y(t)$')
plt.title('Simulation')
plt.legend()
# Do an error computation
# Execute the method again with a doubled time step
tnum2, wnum2 = RK4Method(t0,tend,2.0*Dt,rhsf,y0)
# Calculate the global truncation errors at the last simulated time
errors = (wnum[:,-1] - wnum2[:,-1])/(2**4-1)
print('\nThe errors are ',errors[0],' for y1 and ',errors[1],' for y2 at time t=',tnum[-1])
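One possible structure for the search described in the question is to wrap "run the simulation for a given W0 and report the smallest predator count" in a function, and to step W0 down until that function reaches the threshold. The sketch below shows only that control flow. The function min_predators is a hypothetical stand-in (a made-up linear formula), not the RK4 simulation from the post, and the sketch assumes, as the attempted loop does, that lowering W0 raises the minimum predator count.

```python
def min_predators(w0: float) -> float:
    # Hypothetical stand-in: in the real program this would run the RK4
    # simulation with load factor w0 and return wnum[1].min().
    return 400.0 - 100.0 * w0

def find_w0(start: float = 2.0, target: float = 250.0, step: float = 0.0001) -> float:
    """Step w0 down from `start` until the minimum predator count reaches `target`."""
    k = 0
    w0 = start
    while min_predators(w0) < target:
        k += 1
        # Recompute from the start value to avoid accumulating rounding error
        w0 = round(start - k * step, 4)
    return w0

w0_found = find_w0()
print("W0 =", w0_found, "-> minimum predators:", min_predators(w0_found))

# The value found is an ordinary float and can be used in later calculations,
# e.g. by passing it on to the functions that need the load factor.
```

The key difference from the attempt above is that the loop condition re-evaluates the model for each candidate W0; a condition that only inspects an already computed wnum never changes, so the loop either never runs or never ends.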

Angle Interpolation

I am trying to interpolate angles that are given in a list:
Dir DirOffset
0 109.6085
30 77.5099
60 30.5287
90 -10.2748
120 -75.359
150 -147.6015
180 -162.7055
210 21.0103
240 3.5502
270 -11.5475
300 -39.8371
330 -109.5473
360 109.6085
I have written code to interpolate the angle (it repeatedly takes the mean of the two bracketing angles until it reaches the value to interpolate), but it takes a long time. Does anyone have faster and shorter code for this?
from cmath import rect, phase
from math import radians, degrees, sqrt
#Calculate the mean of angles in List
def mean_angle(degArray):
    return degrees(phase(sum(rect(1, radians(d)) for d in degArray)/len(degArray)))
#Calculate Interpolation Angle
def Interpolate_angle(Dir, DirOffset, ValuetoInterpolate):
    #Create Lower and Higher bin of ValuetoInterpolate
    DirLBin = round(float(ValuetoInterpolate)/30,0)*30
    DirHBin = round(float(ValuetoInterpolate+15)/30,0)*30
    #Check if the ValuetoInterpolate lies between Lower and Higher bin
    if DirLBin == DirHBin:
        DirLBin = DirHBin-30
        if DirLBin <= ValuetoInterpolate <= DirHBin:
            DBin = [float(DirLBin), float(DirHBin)]
            Doff = [DirOffset[Dir.index(DirLBin)], DirOffset[Dir.index(DirHBin)]]
        else:
            DirHBin = DirLBin+30
            DBin = [float(DirLBin), float(DirHBin)]
            Doff = [DirOffset[Dir.index(DirLBin)], DirOffset[Dir.index(DirHBin)]]
    else:
        DBin = [float(DirLBin), float(DirHBin)]
        Doff = [DirOffset[Dir.index(DirLBin)], DirOffset[Dir.index(DirHBin)]]
    #Run 50 iterations to calculate the mean of angle and find the ValuetoInterpolate
    for i in range(51):
        DMean = mean_angle(DBin)
        DOMean = mean_angle(Doff)
        if DMean < 0 :
            DMean = 360+DMean
        if DBin[0] <= ValuetoInterpolate <=DMean:
            DBin = [float(DBin[0]), float(DMean)]
            Doff = [float(Doff[0]), float(DOMean)]
        else:
            DBin = [float(DMean), float(DBin[1])]
            Doff = [float(DOMean), float(Doff[1])]
    return DOMean
Dir = list(range(0,370,30))
DirOffset = [109.6085,77.5099,30.5287,-10.2748,-75.359,-147.6015,-162.7055,21.0103,3.5502,-11.5475,-39.8371,-109.5473,109.6085]
ValuetoInterpolate = 194.4
print(Interpolate_angle(Dir, DirOffset, ValuetoInterpolate))
I found a solution to the question above by searching other answers on Stack Overflow, and modified it a little to fit my requirement. It might be useful for someone else who needs it.
I interpolate the degrees with the function below for each directional bin (0, 30, 60, ..., 360; 360 and 0 degrees are the same), store the results in a dictionary, create a pandas DataFrame from it, and left join that with the main DataFrame for further processing.
InterpolateDegrees(109.6085, 77.5099)
will return the interpolated array of DirectionOffset for directions 0 to 30 degrees at an interval of 0.1 (0.0, 0.1, 0.2, 0.3, ..., 29.7, 29.8, 29.9)
import numpy as np
from math import fabs
def InterpolateDegrees(start, end, BinSector=12):
    BinAngle = 360/BinSector
    amount = np.arange(0,1,(1/(BinAngle/0.1)))
    dif = fabs(end-start)
    if dif >180:
        if end>start:
            start+=360
        else:
            end+=360
    #Interpolate it
    value = (start + ((end-start)*amount))
    #Wrap it
    rzero = 360
    Arr = np.where((value>=0) & (value<=360), (value), (value % rzero))
    return Arr
Here is a Pandas/Numpy based solution for interpolating an angle series with NaN data.
import pandas as pd
import numpy as np
def interpolate_degrees(series: pd.Series) -> pd.Series:
    # I don't want to modify in place
    series = series.copy()
    # convert to radians
    a = np.radians(series)
    # unwrap if not nan
    a[~np.isnan(a)] = np.unwrap(a[~np.isnan(a)])
    series.update(a)
    # interpolate unwrapped values
    interpolated = series.interpolate()
    # wrap 0 - 360 (2*pi)
    wrapped = (interpolated + 2*np.pi) % (2 * np.pi)
    # convert back to degrees
    degrees = np.degrees(wrapped)
    series.update(degrees)
    return series
Usage:
angle = [350, np.nan, 355, np.nan, 359, np.nan, 1, np.nan, 10]
df = pd.DataFrame(data=angle, columns=['angle'])
df['interpolated'] = interpolate_degrees(df.angle)

Average True Range and Exponential Moving Average Functions on PandasDataSeries needed

I am stuck while calculating the Average True Range (ATR) of a Series.
ATR is basically an exponential moving average of the True Range (TR).
TR is the maximum of:
Method 1: Current High less the current Low
Method 2: Current High less the previous Close (absolute value)
Method 3: Current Low less the previous Close (absolute value)
Pandas doesn't have a built-in EMA function. Instead it has EWMA, which is an exponentially weighted moving average.
If someone can help calculate the EMA, that will also be good enough.
def ATR(df,n):
    df['H-L']=abs(df['High']-df['Low'])
    df['H-PC']=abs(df['High']-df['Close'].shift(1))
    df['L-PC']=abs(df['Low']-df['Close'].shift(1))
    df['TR']=df[['H-L','H-PC','L-PC']].max(axis=1)
    # pd.ewma has been removed from pandas; Series.ewm(...).mean() is the equivalent call
    df['ATR_' + str(n)] = df['TR'].ewm(span = n, min_periods = n).mean()
    return df
The above code doesn't give an error, but it doesn't give the correct values either. I compared it with ATR values calculated manually in Excel on the same data series, and the values were different.
ATR excel formula-
Current ATR = [(Prior ATR x 13) + Current TR] / 14
- Multiply the previous 14-day ATR by 13.
- Add the most recent day's TR value.
- Divide the total by 14
This is the dataseries I used as a sample
start='2016-1-1'
end='2016-10-30'
auro=web.DataReader('AUROPHARMA.NS','yahoo',start,end)
You do need to use ewma
See here: An exponential moving average (EMA) is a type of moving average that is similar to a simple moving average, except that more weight is given to the latest data.
Read more: Exponential Moving Average (EMA) http://www.investopedia.com/terms/e/ema.asp#ixzz4ishZbOGx
I don't think your Excel formula is right... Here is a manual way to calculate an EMA in Python:
def exponential_average(values, window):
    weights = np.exp(np.linspace(-1.,0.,window))
    weights /= weights.sum()
    a = np.convolve(values, weights) [:len(values)]
    a[:window]=a[window]
    return a
scipy.signal.lfilter could help you.
scipy.signal.lfilter(b, a, x, axis=-1,zi=None)
The filter function is implemented as a direct II transposed structure. This means that the filter implements:
a[0]*y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[M]*x[n-M]
- a[1]*y[n-1] - ... - a[N]*y[n-N]
If we normalize the above formula, we obtain the following one:
y[n] = b'[0]*x[n] + b'[1]*x[n-1] + ... + b'[M]*x[n-M]
- a'[1]*y[n-1] - ... - a'[N]*y[n-N]
where b'[i] = b[i]/a[0], i = 0,1,...,M; a'[j] = a[j]/a[0],j = 1,2,...,N
and a'[0] = 1
Exponential Moving Average formula:
y[n] = alpha*x[n] + (1-alpha)*y[n-1]
So to apply scipy.signal.lfilter, by the formula above we can set a and b as below:
a[0] = 1, a[1] = -(1-alpha)
b[0] = alpha
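My implementation is below; I hope it helps.
import numpy as np
from scipy import signal as sig
def ema(values, window_size):
    values = np.asarray(values, dtype=float)
    alpha = 2./ (window_size + 1)
    a = np.array([1, alpha - 1.])
    b = np.array([alpha])
    # lfilter_zi gives the steady state for a unit step; scale it by the
    # first sample so the output starts at values[0]
    zi = sig.lfilter_zi(b, a) * values[0]
    y, _ = sig.lfilter(b, a, values, zi=zi)
    return y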
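As a cross-check on the spreadsheet recurrence quoted in the question, Current ATR = (Prior ATR x 13 + Current TR) / 14 has the same form as the EMA formula above, y[n] = alpha*x[n] + (1-alpha)*y[n-1], with alpha = 1/14 rather than 2/(14+1). A minimal pandas sketch under that reading follows; the function name atr_wilder and the sample bars are made up for illustration, and the series is seeded with the first TR value, which is an assumption (a spreadsheet that seeds with the mean of the first 14 TR values will differ on early rows).

```python
import pandas as pd

def atr_wilder(df: pd.DataFrame, n: int = 14) -> pd.Series:
    """Average True Range with recursive smoothing, alpha = 1/n."""
    prev_close = df["Close"].shift(1)
    # True Range: the largest of the three candidate ranges on each row
    true_range = pd.concat(
        [
            df["High"] - df["Low"],
            (df["High"] - prev_close).abs(),
            (df["Low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)
    # adjust=False gives y[t] = (1 - alpha) * y[t-1] + alpha * x[t];
    # with alpha = 1/n this equals (prior * (n - 1) + current) / n.
    return true_range.ewm(alpha=1 / n, adjust=False).mean()

# Made-up bars, only to exercise the function
bars = pd.DataFrame(
    {
        "High": [10.0, 11.0, 12.0, 11.0, 13.0],
        "Low": [9.0, 9.5, 10.0, 10.0, 11.0],
        "Close": [9.5, 10.5, 11.0, 10.5, 12.0],
    }
)
atr = atr_wilder(bars, n=3)
print(atr)
```

With span=n, as in the question's code, pandas uses alpha = 2/(n+1), which is one plausible reason the values did not match the spreadsheet.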
