Questions on pandas moving average

I am a beginner with Python and pandas. I am having difficulty building a volatility-adjusted moving average, so I need your help.
A volatility-adjusted moving average is a moving average whose period is not static but is adjusted dynamically according to volatility.
What I'd like to code is:
1. Get stock data from Yahoo Finance (monthly close).
2. Calculate monthly volatility times some constant, and use the result as the dynamic moving-average period.
3. Calculate the dynamic moving average.
I tried the code below, but it fails and I don't know what the problem is. If you can spot the problem or suggest better code, please let me know.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import pandas_datareader.data as web
def price(stock, start):
    price = web.DataReader(name=stock, data_source='yahoo', start=start)['Adj Close']
    price = price / price[0]
    a = price.resample('M').last().to_frame()
    a.columns = ['price']
    return a
a = price('SPY','2000-01-01')
a['volperiod'] = round(a.rolling(12).std()*100)*2
for i in range(len(a.index)):
    k = a['price'].rolling(int(a['volperiod'][i])).mean()
    a['ma'][i] = k[i]
print(a)

First of all, you need to call pct_change on the price so that the volatility is computed on returns rather than on price levels.
My solution:
def price(stock, start):
    price = web.DataReader(name=stock, data_source='yahoo', start=start)['Adj Close']
    return price.div(price.iat[0]).resample('M').last().to_frame('price')
a = price('SPY','2000-01-01')
# dynamic period: 12-month rolling std of monthly returns, scaled by 200
v = a.pct_change().rolling(12).std().dropna().mul(200).astype(int)
def dyna_mean(x):
    # average the x.price rows that precede the current row
    end = a.index.get_loc(x.name)
    start = end - x.price
    return a.price.iloc[start:end].mean()
pd.concat([a.price, v.price, v.apply(dyna_mean, axis=1)],
          axis=1, keys=['price', 'vol', 'mean'])
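
As a self-contained illustration of the same idea, here is a minimal sketch that runs on synthetic prices rather than a Yahoo download. The function name `dynamic_moving_average`, its parameters, and the random test series are assumptions made for this example. Unlike `dyna_mean` above, the window here includes the current row, and the period is floored at `min_period` so it can never be zero.

```python
import numpy as np
import pandas as pd

def dynamic_moving_average(price, vol_window=12, scale=200.0, min_period=1):
    """Moving average whose lookback follows rolling return volatility (hypothetical helper)."""
    returns = price.pct_change()
    vol = returns.rolling(vol_window).std()
    # Lookback length per row: scaled volatility, rounded, never below min_period.
    period = (vol * scale).round().clip(lower=min_period)
    values = price.to_numpy(dtype=float)
    ma = np.full(len(values), np.nan)
    for i, n in enumerate(period.to_numpy()):
        if np.isnan(n):
            continue  # volatility is not available yet during the warm-up rows
        start = max(0, i - int(n) + 1)
        ma[i] = values[start:i + 1].mean()  # window ends at, and includes, row i
    return pd.DataFrame({"price": price, "period": period, "ma": ma}, index=price.index)

# Synthetic monthly prices (assumption: random returns, not real SPY data).
rng = np.random.default_rng(0)
monthly = pd.Series(100 * np.cumprod(1 + rng.normal(0.005, 0.04, 60)))
result = dynamic_moving_average(monthly)
```

The explicit loop is slower than a vectorized rolling call, but pandas rolling windows have a fixed size, so a per-row window length is easiest to express this way.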

Related

How to create a column that returns the slope of a moving average?

On a DataFrame that contains the price of bitcoin, I want to measure the strength of a trend by displaying, on each row, the angle of the slope of a moving average (calculated over 20 periods).
A moving average allows you to analyze a time series, removing transient fluctuations in order to highlight longer term trends.
To calculate a simple 20-period moving average for trading purposes, we take the last 20 closing prices, add them together and divide the result by 20.
I started by trying to use the linregress function of scipy but I get the exception "len() of unsized object" that I could not solve:
from scipy.stats import linregress
x = df.iloc[-1, 8] # -1:last row, 8: sma20
y = df['sma20']
df['slope_deg'] = df.apply(linregress(x, y))
I then used the atan function of the math module but the result returned is always nan, whatever the row is:
import math
df['sma20'] = df['Close'].rolling(20).mean()
slope=((df['sma20'][0]-df['sma20'][20])/20)
df['slope_deg'] = math.atan(slope) * 180 / math.pi
... or always 45:
import math
df['sma20'] = df['Close'].rolling(20).mean()
df['slope_deg'] = math.atan(1) * 180 / math.pi
df
Here is an example of code with the date as an index, the price used to calculate the moving average, and the moving average (over 5 periods for the example):
df = pd.DataFrame({'date': np.tile(pd.date_range('1/1/2011', periods=25, freq='D'), 4),
                   'price': np.random.randn(100).cumsum() + 10})
# sma5 is added after construction because it depends on the 'price' column
df['sma5'] = df['price'].rolling(5).mean()
df.head(10)
Can someone help me to create a column that returns the slope of a moving average?

OK, I computed the 20-day SMA. I am not so sure about the slope part, since you didn't clearly specify what you need.
I am assuming slope values, in degrees, as follows:
arctan( (PriceToday - Price20daysAgo)/ 20 )
Here is the code:
EDIT 1: simplified the 'slope' code and adapted it following @Oliver's suggestion.
import numpy as np
import pandas as pd
import yfinance as yf
btc = yf.download('BTC-USD', period='1y')
btc['sma20'] = btc['Adj Close'].rolling(20).mean()
# diff() is the one-day change in sma20; use diff(20) for the change over the full 20-day window
btc['slope'] = np.degrees(np.arctan(btc['sma20'].diff()/20))
btc = btc[['Adj Close','sma20','slope']].dropna()
Output:
btc
               Adj Close         sma20      slope
Date
2021-03-15  55907.199219  51764.509570  86.767651
2021-03-16  56804.902344  52119.488086  86.775283
2021-03-17  58870.894531  52708.340234  88.054732
2021-03-18  57858.921875  53284.298242  88.011217
2021-03-19  58346.652344  53892.208203  88.115671
...                  ...           ...        ...
2022-02-19  40122.156250  41560.807227  79.715989
2022-02-20  38431.378906  41558.219922  -7.371144
2022-02-21  37075.281250  41474.820312 -76.514600
2022-02-22  38286.027344  41541.472461  73.297321
2022-02-23  38748.464844  41621.165625  75.911862
As you can see, the slope value means little as it is. That's because the variation in price over a 20-day span is far greater than 20 units, the value representing the time window you chose to use.
Plotting prices and sma20 vs date:
btc[['Adj Close','sma20']].plot(figsize=(14,7));
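
One possible way to make the angle less dependent on the price scale is to measure the slope as a percentage change of the moving average per period instead of in raw price units. The sketch below is only an illustration under that assumption: the helper name `sma_slope_degrees`, the `relative` flag, and the synthetic price series are made up for this example and are not part of the answer above.

```python
import numpy as np
import pandas as pd

def sma_slope_degrees(close, window=20, relative=False):
    """Return the SMA and the angle of its per-period slope (hypothetical helper)."""
    sma = close.rolling(window).mean()
    change = sma.diff()  # slope in price units per period
    if relative:
        # Slope in percent of the previous SMA value per period.
        change = change / sma.shift(1) * 100
    slope_deg = np.degrees(np.arctan(change))
    return pd.DataFrame({"sma": sma, "slope_deg": slope_deg})

# A price that rises by exactly 1 unit per period has an SMA slope of 1,
# so the raw angle is 45 degrees once the window is full.
close = pd.Series(np.arange(1.0, 101.0))
raw = sma_slope_degrees(close, window=20)
pct = sma_slope_degrees(close, window=20, relative=True)
```

Either way, the angle depends on the units chosen for the two axes, so it is only comparable between rows or assets that use the same convention.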

How to calculate daily evapotranspiration with the Hargreaves-Samani equation using Python?

I have ten years of daily weather data, including maximum temperature (Tmax), minimum temperature (Tmin), rainfall and solar radiation (Ra).
First, I would like to calculate evapotranspiration (ETo) for each day using the following equation:
ETo=0.0023*(((Tmax+Tmin)/2)+17.8)*sqrt(Tmax-Tmin)*Ra
Then I want to calculate the monthly and yearly averages of all parameters (Tmax, Tmin, Rainfall, Ra and ETo) and write them to an Excel file.
I have written some parts. Could you help me complete it? I think it may need a loop.
import numpy as np
import pandas as pd
import math as mh
# load the weather data file
data_file = pd.read_excel(r'weather data.xlsx', sheet_name='city_1')
# defining time
year = data_file['Year']
month = data_file['month']
day = data_file['day']
# defining weather parameters
Tmax = data_file.loc[:,'Tmax']
Tmin = data_file.loc[:,'Tmin']
Rainfall = data_file.loc[:,'Rainfall']
Ra = data_file.loc[:,'Ra']
# adjusting time to start at zero
year = year-year[0]
month=month-month[0]
day=day-day[0]
# calculation process for estimation of evapotranspiration
ET0=0.0023*(((Tmax+Tmin)/2)+17.8)*(mh.sqrt(Tmax-Tmin))*Ra

Looks like you've got one data row (record) per day.
Since you already have Tmax, Tmin, Rainfall, and Ra in each row, you could add an ET0 column with a row-wise calculation like this:
data_file['ET0'] = data_file.apply(lambda x: 0.0023*(((x.Tmax+x.Tmin)/2)+17.8)*(mh.sqrt(x.Tmax-x.Tmin))*x.Ra, axis=1)
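
A vectorized variant avoids the row-wise lambda, and the monthly and yearly averages asked for in the question can be sketched with groupby. The column names follow the question's code (`Year`, `month`, `Tmax`, `Tmin`, `Rainfall`, `Ra`); the helper names `add_et0` and `period_means` and the tiny sample frame are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

def add_et0(df):
    """Return a copy of df with a daily ET0 column (equation from the question)."""
    out = df.copy()
    t_mean = (out["Tmax"] + out["Tmin"]) / 2
    # np.sqrt works element-wise on a whole column, unlike math.sqrt.
    out["ET0"] = 0.0023 * (t_mean + 17.8) * np.sqrt(out["Tmax"] - out["Tmin"]) * out["Ra"]
    return out

def period_means(df):
    """Monthly and yearly means of the weather columns (hypothetical helper)."""
    cols = ["Tmax", "Tmin", "Rainfall", "Ra", "ET0"]
    monthly = df.groupby(["Year", "month"])[cols].mean()
    yearly = df.groupby("Year")[cols].mean()
    return monthly, yearly

# Tiny made-up sample standing in for the Excel sheet.
sample = pd.DataFrame({
    "Year": [2020, 2020, 2020, 2021],
    "month": [1, 1, 2, 1],
    "day": [1, 2, 1, 1],
    "Tmax": [30.0, 28.0, 25.0, 31.0],
    "Tmin": [14.0, 12.0, 16.0, 15.0],
    "Rainfall": [0.0, 2.0, 5.0, 0.0],
    "Ra": [10.0, 11.0, 12.0, 10.5],
})
daily = add_et0(sample)
monthly, yearly = period_means(daily)
```

The two result frames could then be written out with `monthly.to_excel(...)` and `yearly.to_excel(...)`, which requires an Excel writer engine such as openpyxl to be installed.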

Own RSI calculation differs from Altcoin Tradingview RSI ... Does anyone know why?

I've tried several ways to reproduce the RSI shown on Tradingview. The funny thing is that my own calculated RSI matches, for example, the Bitcoin RSI perfectly, but when I try to calculate the RSI for altcoins, it is different. I have tried different EMAs/RMAs, an Excel recreation and of course Python. I even tried XRSIs (e.g. RSI = 0.6 RSI-XRP + 0.4 RSI-BTC), but never got the same result.
Does anyone know how Tradingview calculates the altcoin RSIs?
Thank you in advance,
Best regards,
Domi

The calculation of the RSI should be the same for any kind of data in Tradingview. In Pinescript the RSI can be calculated from scratch as follows:
pine_rsi(x, y) =>
    u = max(x - x[1], 0) // upward change
    d = max(x[1] - x, 0) // downward change
    rs = rma(u, y) / rma(d, y)
    res = 100 - 100 / (1 + rs)
    res
If the results differ, it might be due to rounding errors; another cause might be the use of data different from that provided by Tradingview.
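
For comparison in Python, the Pine function above can be sketched with pandas. This assumes that `rma` is Wilder's smoothing, i.e. an exponential average with alpha = 1/length; here it is approximated with `ewm(alpha=1/length, adjust=False)`, whose seeding of the first values may differ from Tradingview's, so early rows in particular may not match exactly. The function name `rsi` and the synthetic inputs are illustrative.

```python
import numpy as np
import pandas as pd

def rsi(close, length=14):
    """RSI with Wilder-style smoothing (a sketch, not Tradingview's exact code)."""
    delta = close.diff()
    up = delta.clip(lower=0)        # upward change, 0 when the price fell
    down = (-delta).clip(lower=0)   # downward change, 0 when the price rose
    avg_up = up.ewm(alpha=1 / length, adjust=False, min_periods=length).mean()
    avg_down = down.ewm(alpha=1 / length, adjust=False, min_periods=length).mean()
    rs = avg_up / avg_down
    return 100 - 100 / (1 + rs)

# Sanity checks on synthetic data: a steadily rising series pins the RSI at 100,
# a steadily falling one at 0.
rising = rsi(pd.Series(np.arange(1.0, 41.0)))
falling = rsi(pd.Series(np.arange(40.0, 0.0, -1.0)))
rng = np.random.default_rng(1)
noisy = rsi(pd.Series(100 + rng.normal(0, 1, 200).cumsum()))
```

Because the smoothing is recursive, the value on any bar depends on how much history precedes it, so comparing against a chart is only meaningful when both sides start from the same first bar and the same price source.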

They are using a smoothed RSI formula. I checked it against the one year chart which uses daily bars.
import yfinance as yf
import talib as ta
#get data
ticker = yf.Ticker("BTC-USD")
period = '10y'
interval = '1d'
data = ticker.history(interval=interval, period= period)
df = data.reset_index()
df = df.rename(columns={"index": "Date"})
df['RSI_Ta'] = ta.RSI(df['Close'], timeperiod=14)
df
It is the same as the Yahoo data somehow, which is strange because I buy and sell as a market maker at higher and lower prices almost every day.

How to implement a normality-check function in python?

I am building a Monte Carlo simulation in order to study the behaviour of a set of 1000 iterations. Every simulation produces an output graph from a pandas DataFrame, saved as a PNG by matplotlib.pyplot. I am not sure that every output follows a normal distribution, even though an article I read claims that it does, so I'd like to understand how to check it.
I've found something in this link, but I didn't understand which method is best or how to implement it.
Here's the code:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('whitegrid')
avg = 1
std_dev = .1
num_reps = 500
num_simulations = 1000
# generate a list of percentages that will replicate our historical normal distribution
# two decimal places in order to make it very easy to see the boundaries
pct_to_target = np.random.normal(avg, std_dev, num_reps).round(2)
# input of historical data
sales_target_values = [75_000, 100_000, 200_000, 300_000, 400_000, 500_000]
sales_target_prob = [.3, .3, .2, .1, .05, .05]
sales_target = np.random.choice(sales_target_values, num_reps, p=sales_target_prob)
# build up a pandas dataframe
df = pd.DataFrame(index=range(num_reps), data={'Pct_To_Target': pct_to_target,
                                               'Sales_Target': sales_target})
df['Sales'] = df['Pct_To_Target'] * df['Sales_Target']
# Here is what our new dataframe looks like
print("how our dataframe looks like")
print(df)
# Return the commission rate based on the Excel table
def calc_commission_rate(x):
    if x <= .90:
        return .02
    if x <= .99:
        return .03
    else:
        return .04
# create our commission rate and multiply it times sales
df['Commission_Rate'] = df['Pct_To_Target'].apply(calc_commission_rate)
df['Commission_Amount'] = df['Commission_Rate'] * df['Sales']
print(df)
# Define a list to keep all the results from each simulation that we want to analyze
all_stats = []
# Loop through many simulations
for i in range(num_simulations):
    # Choose random inputs for the sales targets and percent to target
    sales_target = np.random.choice(sales_target_values, num_reps, p=sales_target_prob)
    pct_to_target = np.random.normal(avg, std_dev, num_reps).round(2)
    # Build the dataframe based on the inputs and number of reps
    df = pd.DataFrame(index=range(num_reps), data={'Pct_To_Target': pct_to_target,
                                                   'Sales_Target': sales_target})
    # Back into the sales number using the percent to target rate
    df['Sales'] = df['Pct_To_Target'] * df['Sales_Target']
    # Determine the commissions rate and calculate it
    df['Commission_Rate'] = df['Pct_To_Target'].apply(calc_commission_rate)
    df['Commission_Amount'] = df['Commission_Rate'] * df['Sales']
    #print(df)
    # We want to track sales, commission amounts and sales targets over all the simulations
    all_stats.append([df['Sales'].sum().round(0),
                      df['Commission_Amount'].sum().round(0),
                      df['Sales_Target'].sum().round(0)])
results_df = pd.DataFrame.from_records(all_stats, columns=['Sales',
                                                           'Commission_Amount',
                                                           'Sales_Target'])
results_df.describe().style.format('{:,}')
print(results_df)
results_df['Commission_Amount'].plot(kind='hist', title="Total Commission Amount")
plt.savefig('graph.png')
# results_df['Sales'].plot(kind='hist')
# plt.savefig('graph2.png')
print(results_df)
I'd like to add a function that checks whether the output distribution is Gaussian (normal), because I am not sure that it actually is on every run.
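
One common way to write such a check is to run standard normality tests from `scipy.stats` on a result column. The sketch below uses the Shapiro-Wilk test (`scipy.stats.shapiro`) and the D'Agostino-Pearson test (`scipy.stats.normaltest`); the function name `check_normality`, the 0.05 threshold, and the rule of requiring both tests to pass are assumptions for illustration, not a definitive recipe.

```python
import numpy as np
from scipy import stats

def check_normality(sample, alpha=0.05):
    """Run two normality tests and report their p-values (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    _, p_shapiro = stats.shapiro(sample)        # Shapiro-Wilk test
    _, p_dagostino = stats.normaltest(sample)   # D'Agostino-Pearson test
    return {
        "shapiro_p": float(p_shapiro),
        "dagostino_p": float(p_dagostino),
        # A small p-value is evidence against normality; a large one is not proof of it.
        "looks_normal": bool(p_shapiro > alpha and p_dagostino > alpha),
    }

# Made-up samples: one drawn from a normal distribution, one strongly skewed.
rng = np.random.default_rng(0)
normal_report = check_normality(rng.normal(80_000, 5_000, 1000))
skewed_report = check_normality(rng.exponential(5_000, 1000))
```

In the simulation above this would be called as `check_normality(results_df['Commission_Amount'])`. A histogram or a Q-Q plot alongside the p-values is usually worth a look too, since with many samples the tests can flag deviations that are too small to matter.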

Momentum portfolio(trend following) quant simulation on pandas

I am trying to construct a trend-following momentum portfolio strategy based on the S&P 500 index (monthly data).
I used Kaufman's fractal efficiency ratio to filter out whipsaw signals
(http://etfhq.com/blog/2011/02/07/kaufmans-efficiency-ratio/)
I succeeded in coding it, but the code is very clumsy, so I need advice on how to improve it.
Strategy
1. Get data for the S&P 500 index from Yahoo Finance.
2. Calculate Kaufman's efficiency ratio over lookback period X (multiplied by 1 if close > close(n), otherwise 0).
3. Average the values from step 2 over lookback periods 1 to 12. This gives the monthly asset allocation ratio; 1 - asset allocation ratio is held in cash (3% per year).
I am having difficulty averaging the efficiency ratios for periods 1 to 12. I know it should be a simple task with a for loop, but I failed.
I need more concise and refined code. Can anybody help me?
a['meanfractal'] is the part that bothers me in the code below.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import pandas_datareader.data as web
def price(stock, start):
    price = web.DataReader(name=stock, data_source='yahoo', start=start)['Adj Close']
    return price.div(price.iat[0]).resample('M').last().to_frame('price')
a = price('SPY','2000-01-01')
def fractal(a,p):
    a['direction'] = np.where(a['price'].diff(p)>0,1,0)
    a['abs'] = a['price'].diff(p).abs()
    a['volatility'] = a.price.diff().abs().rolling(p).sum()
    a['fractal'] = a['abs'].values/a['volatility'].values*a['direction'].values
    return a['fractal']
def meanfractal(a):
    a['meanfractal']= (fractal(a,1).values+fractal(a,2).values+fractal(a,3).values+fractal(a,4).values+fractal(a,5).values+fractal(a,6).values+fractal(a,7).values+fractal(a,8).values+fractal(a,9).values+fractal(a,10).values+fractal(a,11).values+fractal(a,12).values)/12
    a['portfolio1'] = (a.price/a.price.shift(1).values*a.meanfractal.shift(1).values+(1-a.meanfractal.shift(1).values)*1.03**(1/12)).cumprod()
    a['portfolio2'] = ((a.price/a.price.shift(1).values*a.meanfractal.shift(1).values+1.03**(1/12))/(1+a.meanfractal.shift(1))).cumprod()
    a=a.dropna()
    # .ix was removed from pandas; .iloc[0] selects the first row by position
    a=a.div(a.iloc[0])
    return a[['price','portfolio1','portfolio2']].plot()
print(a)
plt.show()

You could simplify further by collecting the fractal series for each lookback period up to p in a DataFrame, rather than computing each series separately, as shown:
def fractal(a, p):
    df = pd.DataFrame()
    for count in range(1,p+1):
        a['direction'] = np.where(a['price'].diff(count)>0,1,0)
        a['abs'] = a['price'].diff(count).abs()
        a['volatility'] = a.price.diff().abs().rolling(count).sum()
        a['fractal'] = a['abs']/a['volatility']*a['direction']
        df = pd.concat([df, a['fractal']], axis=1)
    return df
Then you could assign the repeated operations to variables, which avoids recomputing them.
def meanfractal(a, l=12):
    a['meanfractal']= pd.DataFrame(fractal(a, l)).sum(1,skipna=False)/l
    mean_shift = a['meanfractal'].shift(1)
    price_shift = a['price'].shift(1)
    factor = 1.03**(1/l)
    a['portfolio1'] = (a['price']/price_shift*mean_shift+(1-mean_shift)*factor).cumprod()
    a['portfolio2'] = ((a['price']/price_shift*mean_shift+factor)/(1+mean_shift)).cumprod()
    a.dropna(inplace=True)
    # .ix was removed from pandas; .iloc[0] selects the first row by position
    a = a.div(a.iloc[0])
    return a[['price','portfolio1','portfolio2']].plot()
Resulting plot obtained by calling:
meanfractal(a)
Note: If speed is not a major concern, you could perform the operations via the built-in pandas methods instead of converting the columns to their corresponding numpy array values.
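
The averaging step can also be written without mutating the input frame. The sketch below is a minimal variant on synthetic prices: the helper names `efficiency_ratio` and `mean_efficiency_ratio` and the test series are assumptions for illustration, while the formula (net move over path length, zeroed when the net move is not positive) follows the code above.

```python
import numpy as np
import pandas as pd

def efficiency_ratio(price, p):
    """Directional efficiency ratio over lookback p (hypothetical helper)."""
    net = price.diff(p)                        # net move over p periods
    path = price.diff().abs().rolling(p).sum() # sum of absolute one-period moves
    # Multiplying by the boolean mask keeps NaN in the warm-up rows
    # and sets the ratio to 0 where the net move is not positive.
    return net.abs() / path * (net > 0).astype(float)

def mean_efficiency_ratio(price, max_p=12):
    """Average the ratio over lookbacks 1..max_p; NaN until all are available."""
    ratios = pd.concat(
        [efficiency_ratio(price, p) for p in range(1, max_p + 1)], axis=1
    )
    return ratios.mean(axis=1, skipna=False)

# Synthetic checks: a straight uptrend is perfectly efficient (ratio 1),
# a straight downtrend is filtered out (ratio 0).
uptrend = mean_efficiency_ratio(pd.Series(np.arange(1.0, 31.0)))
downtrend = mean_efficiency_ratio(pd.Series(np.arange(30.0, 0.0, -1.0)))
```

Returning new Series instead of writing helper columns into `a` keeps the price frame clean, so the allocation series can be assigned in one line, for example `a['meanfractal'] = mean_efficiency_ratio(a['price'])`.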
