expectation and variance of future stock price under binary tree - python

Probably an over-simplified model for a stock price: each day, the price either goes up by a factor of 1.05 with probability 0.6 or goes down by a factor of 1/1.05 with probability 0.4. So this is a non-symmetric binary tree. How can I analytically calculate the expectation and variance of the stock price on a future date, say day 100? Also, is there a Python module that handles a binary tree model like this? I would appreciate code that implements this.
Best regards

import random as r
s = 100 # starting value
^^Initial conditions. Simulating one day on the stock market:
def day(stock_value): #One day in the stock market
    k = r.uniform(0,1)
    if k < 0.6:
        output = 1.05*stock_value
    else:
        output = stock_value/1.05
    return(output)
Simulating 100 days on the stock market:
for j in range(100): #simulates 100 days in the stock market
    s = day(s)
print(s)
Simulating 100 days 1000 times:
data = []
for i in range(1000):
    s = [100]
    for j in range(100):
        s.append(day(s[j]))
    data.append(s)
Converting the data to only consider the last day:
def mnnm(mat): #Makes an mxn matrix into an nxm matrix
    out = []
    for j in range(len(mat[0])):
        out.append([])
    for j in range(len(mat[0])):
        for m in range(len(mat)):
            out[j].append(mat[m][j])
    return(out)
data = mnnm(data)
data = data[-1]
Taking a mean average:
def lst_avg(lst): #Returns the average of a list
    output = 0
    for j in range(len(lst)):
        output+= lst[j]/len(lst)
    return(output)
mean = lst_avg(data)
Variance:
import numpy as np
for h in range(len(data)):
    data[h] = data[h]**2
mean_square = lst_avg(data)
variance = np.fabs(mean_square - mean**2)
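The simulate, transpose and average steps above can be condensed with the standard library. This is a minimal sketch under the same model (factor 1.05 with probability 0.6, otherwise 1/1.05); the function name `simulate_final_price`, the seed and the trial count are illustrative assumptions, not part of the answer above.

```python
import random
import statistics

def simulate_final_price(start=100.0, days=100, p_up=0.6, factor=1.05, rng=random):
    """Return the price after `days` steps of the up/down model."""
    price = start
    for _ in range(days):
        if rng.random() < p_up:
            price *= factor      # up-move
        else:
            price /= factor      # down-move
    return price

rng = random.Random(12345)       # fixed seed so the run is repeatable
finals = [simulate_final_price(rng=rng) for _ in range(2000)]

sample_mean = statistics.fmean(finals)
# Population variance, i.e. the sample analogue of E[X^2] - E[X]^2
sample_variance = statistics.pvariance(finals)
print(sample_mean, sample_variance)
```

Only the final price of each run is kept, so no transposition step is needed. The sample values fluctuate from seed to seed, so they should only be expected to land near the theoretical ones.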

The theoretical expected value after 1 day is (assuming the value on day 0 is A)
A * 0.6 * 1.05 + A * 0.4/1.05
And after 100 days it's
A * (0.63 + 0.380952...)**100, so...
(in the following I use 1 as stock price on day 0.)
p1 = 0.6
p2 = 0.4
x1 = 1.05
x2 = 1/1.05
initial_value = 1
no_of_days = 100
# 1 day
expected_value_after_1_day = initial_value * ( p1*x1 + p2*x2)
print (expected_value_after_1_day, 'is the expected value of price after 1 day')
ex_squared_value_1_day = initial_value * (p1*x1**2 + p2*x2**2)
# variance can be calculated as follows
variance_day_1 = ex_squared_value_1_day - expected_value_after_1_day**2
# or an alternative calculation, summing the squares of the differences from the mean
alt_variance_day_1 = p1 * (x1 - expected_value_after_1_day) ** 2 + p2 * (x2 - expected_value_after_1_day) ** 2
print ('Variance after one day is', variance_day_1)
# 100 days
expected_value_n_days = initial_value * (p1*x1 + p2*x2) ** no_of_days
ex_squared_value_n_days = initial_value * (p1*x1**2 + p2*x2**2) ** no_of_days
ex_value_n_days_squared = expected_value_n_days ** 2
variance_n_days = ex_squared_value_n_days - ex_value_n_days_squared
print(expected_value_n_days, 'is the expected value of price after {} days'.format(no_of_days))
print(ex_squared_value_n_days, 'is the expected value of the square of the price after {} days'.format(no_of_days))
print(ex_value_n_days_squared, 'is the square of the expected value of the price after {} days'.format(no_of_days))
print(variance_n_days, 'is the variance after {} days'.format(no_of_days))
It probably looks a bit old-school, hope you don't mind!
Output
1.0109523809523808 is the expected value of price after 1 day
Variance after one day is 0.0022870748299321786
2.972144657651404 is the expected value of price after 100 days
11.046309656223373 is the expected value of the square of the price after 100 days
8.833643866005783 is the square of the expected value of the price after 100 days
2.2126657902175904 is the variance after 100 days
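As a cross-check on the closed-form expressions, the same moments can be obtained by summing over the binomial distribution of the number of up-moves: after n days with k up-moves the price is the start value times 1.05**(2k - n). This is a sketch only; the function name `exact_moments` and its parameters are hypothetical.

```python
from math import comb

def exact_moments(n_days=100, p_up=0.6, up=1.05, start=1.0):
    """Mean and variance of the price after n_days, summed over the binomial tree."""
    mean = 0.0
    mean_sq = 0.0
    for k in range(n_days + 1):                  # k = number of up-moves
        prob = comb(n_days, k) * p_up**k * (1 - p_up)**(n_days - k)
        price = start * up**(2 * k - n_days)     # k ups and n_days - k downs (down = 1/up)
        mean += prob * price
        mean_sq += prob * price**2
    return mean, mean_sq - mean**2

mean, var = exact_moments()
print(mean, var)
```

Both numbers should agree with the 100-day expected value and variance printed above, up to floating-point rounding.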

Related

inf answer when using while loops in python to calculate balances

I have to write a program that calculates Balkcom's and Brissie's balances using while-loop and then print the first year where Balkcom's balance surpasses Brissie's
The problem is that when I run the program it gives me inf as the answers for both balances, is there a way to fix this or is there another way to do it using while-loops?
Important info:
Balkcom initial deposit: 1
Brissie initial deposit: 100000
Balkcom interest rate: 5%
Brissie interest rate: 4%
#Define constant & variables
BALKCOM_INI_DEPOSIT = 1
BALKCOM_INT_RATE = 1.05
BRISSIE_INI_DEPOSIT = 100000
BRISSIE_INT_RATE = 104000
balkcom_balance = BALKCOM_INI_DEPOSIT * BALKCOM_INT_RATE
brissie_balance = BRISSIE_INI_DEPOSIT * BRISSIE_INT_RATE
year = 0
sys.set_int_max_str_digits(100000)
while balkcom_balance < brissie_balance:
    balkcom_balance = balkcom_balance * BALKCOM_INT_RATE
    brissie_balance = brissie_balance * BRISSIE_INT_RATE
    year = year + 1
print("Year:", year, " " + "Balkcom balance:", balkcom_balance, "Brissie balance:", brissie_balance)
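A likely cause of the inf values is the Brissie rate: multiplying by 104000 every year overflows a float long before Balkcom catches up. The sketch below assumes the intended multiplier is 1.04 (4% interest) and that both balances start at the initial deposits in year 0; the constant names `BALKCOM_GROWTH` and `BRISSIE_GROWTH` are illustrative.

```python
BALKCOM_INI_DEPOSIT = 1
BRISSIE_INI_DEPOSIT = 100000
BALKCOM_GROWTH = 1.05   # 5% interest expressed as a yearly multiplier
BRISSIE_GROWTH = 1.04   # 4% interest as a multiplier (assumed intent; the question has 104000)

balkcom_balance = BALKCOM_INI_DEPOSIT
brissie_balance = BRISSIE_INI_DEPOSIT
year = 0

# Compound both balances until Balkcom's is strictly larger.
while balkcom_balance <= brissie_balance:
    balkcom_balance *= BALKCOM_GROWTH
    brissie_balance *= BRISSIE_GROWTH
    year += 1

print("Year:", year, "Balkcom balance:", balkcom_balance, "Brissie balance:", brissie_balance)
```

With plain floats the balances stay finite, so the `sys.set_int_max_str_digits` call is not needed in this version.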

Find the maximum product sales with planning schedules

I'm working on a dynamic programming problem, although I'm not quite sure whether it really is dynamic programming, since the moving average M depends on the previous M. Efficiency does not matter. The problem requires selling a product over T time periods and maximizing the total actual sale amount. The total number of products is N, and I plan to sell n_0, n_1, ..., n_{T-1} products over the periods, with ∑ n_i = N.
In short, the question asks for the optimal schedule n_0, n_1, ..., n_{T-1} with ∑ n_i = N that maximizes ∑ S_i.
The actual sale amount S_i depends on the current moving average M and the current n_i.
Assume that α = 0.001 and π = 0.5.
Initialize M = 0. Then for i = 0, 1, ..., T−1:
Compute the new M_i = ⌈0.5 * (M_{i-1} + n_i)⌉
At time i we sell S_i = ⌈(1 − α * M_i^π) * n_i⌉ products
Continue this process until the last time period. For example, assuming we already know n_i for all periods, the trading looks like this:
import math
import numpy as np
M = 0
T = 4
N = 10000
alpha = 1e-3
pi = 0.5
S = np.zeros(T,dtype='i')
n = np.array([5000,1000,2000,2000])
print(n)
total = 0
for i in range(T):
    M = math.ceil(0.5*(M + n[i]))  # update the moving average first
    S[i] = math.ceil((1 - alpha*M**pi)*n[i])  # actual sale uses the updated M
    total += S[i]
    print('at time %d, M = %d and we sell %d products' %(i,M,S[i]))
print('total sold =', total)
My idea is to keep track of the state, indexed by the time period t, the number n of products left, and the moving average m, and to store the actual sale in a high-dimensional matrix. I think the range of the moving average is just [0, n]. I'm still confused about how to program it. Could someone provide ideas on how to fix the problems in my program? Thank you very much.
Below is some of my crude code, but the output is a little strange.
def DPtry(N,T,alpha,pi,S):
    schedule = np.zeros(T)
    M = 0
    for n in range(0,N+1):
        for m in range(0,n+1):
            S[T-1,n,m] = math.ceil((1 - alpha*m**pi)*n)
    for k in range(1,T):
        t = T - k - 1
        print("t = ",t)
        for n in range(0,N+1):
            for m in range(0,n+1):
                best = -1
                for plan in range(0,n+1):
                    salenow = math.ceil((1 - alpha*m**pi)*plan)
                    M = math.ceil(0.5*(m + plan))
                    salelater = S[t+1,n-plan,M]
                    candidate = salenow + salelater
                    if candidate > best:
                        best = candidate
                S[t,n,m] = best
    print(S[0,N,0])
N = 100
T = 5
pi = .5
alpha = 1e-3
S = np.zeros((T,N+1,N+1))
DPtry(N,T,alpha,pi,S)
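One way to organize the same state (period t, products left, moving average m) is a memoized top-down recursion. This is a rough sketch, not a verified fix for the code above: the names `step` and `best_total` are hypothetical, and it assumes, as in the problem statement, that the moving average is updated first and the updated value is then used for the current sale.

```python
import math
from functools import lru_cache

ALPHA = 1e-3
PI = 0.5

def step(m, plan):
    """Apply one period: update the moving average, then compute the actual sale."""
    new_m = math.ceil(0.5 * (m + plan))
    sold = math.ceil((1 - ALPHA * new_m ** PI) * plan)
    return new_m, sold

def best_total(N, T):
    """Maximum total actual sale when N products are planned over T periods."""
    @lru_cache(maxsize=None)
    def solve(t, left, m):
        if t == T - 1:
            # Last period: everything still left has to be planned now.
            return step(m, left)[1]
        best = -1
        for plan in range(left + 1):
            new_m, sold = step(m, plan)
            best = max(best, sold + solve(t + 1, left - plan, new_m))
        return best
    return solve(0, N, 0)

print(best_total(50, 4))
```

The number of cached states grows roughly like T * N * N, so this is only practical for moderate N. Recovering the schedule itself would require also storing the best `plan` for each state.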

Calculating monthly growth percentage from cumulative total growth

I am trying to calculate a constant for month-to-month growth rate from an annual growth rate (goal) in Python.
My question has arithmetic similarities to this question, but was not completely answered.
For example, if total annual sales for 2018 are $5,600,000.00 and I have an expected 30% increase for the next year, I would expect total annual sales for 2019 to be $7,280,000.00.
BV_2018 = 5600000.00
Annual_GR = 0.3
EV_2019 = (BV_2018 * Annual_GR) + BV_2018
I am using the last month of 2018 to forecast the first month of 2019
Last_Month_2018 = 522000.00
Month_01_2019 = (Last_Month_2018 * CONSTANT) + Last_Month_2018
For the second month of 2019 I would use
Month_02_2019 = (Month_01_2019 * CONSTANT) + Month_01_2019
...and so on and so forth
The cumulative sum of Month_01_2019 through Month_12_2019 needs to be equal to EV_2019.
Does anyone know how to go about calculating the constant in Python? I am familiar with the np.cumsum function, so that part is not an issue. My problem is I cannot solve for the constant I need.
Thank you in advance and please do not hesitate to ask for further clarification.
More clarification:
# get beginning value (BV)
BV = 522000.00
# get desired end value (EV)
EV = 7280000.00
We are trying to get from BV to EV (which is a cumulative sum) by calculating the cumulative sum of the [12] monthly totals. Each monthly total will have a % increase from the previous month that is constant across months. It is this % increase that I want to solve for.
Keep in mind, BV is the last month of the previous year. It is from BV that our forecast (i.e., Months 1 through 12) will be calculated. So, I'm thinking that it makes sense to go from BV to the EV plus the BV. Then, just remove BV and its value from the list, giving us EV as the cumulative total of Months 1 through 12.
I imagine using this constant in a function like this:
def supplier_forecast_calculator(sales_at_cost_prior_year, sales_at_cost_prior_month, year_pct_growth_expected):
    """
    Calculates monthly supplier forecast
    Example:
    monthly_forecast = supplier_forecast_calculator(sales_at_cost_prior_year = 5600000,
                                                    sales_at_cost_prior_month = 522000,
                                                    year_pct_growth_expected = 0.30)
    monthly_forecast.all_metrics
    """
    # get monthly growth rate
    monthly_growth_expected = CONSTANT
    # get first month sales at cost
    month1_sales_at_cost = (sales_at_cost_prior_month*monthly_growth_expected)+sales_at_cost_prior_month
    # instantiate lists
    month_list = ['Month 1'] # for months
    sales_at_cost_list = [month1_sales_at_cost] # for sales at cost
    # start loop
    for i in list(range(2,13)):
        # Append month to list
        month_list.append(str('Month ') + str(i))
        # get sales at cost and append to list
        month1_sales_at_cost = (month1_sales_at_cost*monthly_growth_expected)+month1_sales_at_cost
        # append month1_sales_at_cost to sales at cost list
        sales_at_cost_list.append(month1_sales_at_cost)
    # add total to the end of month_list
    month_list.insert(len(month_list), 'Total')
    # add the total to the end of sales_at_cost_list
    sales_at_cost_list.insert(len(sales_at_cost_list), np.sum(sales_at_cost_list))
    # put the metrics into a df
    all_metrics = pd.DataFrame({'Month': month_list,
                                'Sales at Cost': sales_at_cost_list}).round(2)
    # return the df
    return all_metrics
Let r = 1 + monthly_rate. Then the problem we are trying to solve is
r + ... + r**12 = EV/BV. We can use numpy to get the numeric solution. This should be relatively fast in practice. We solve the polynomial equation r + ... + r**12 - EV/BV = 0 and recover the monthly rate from r. There will be twelve complex roots, but only one of them is real and positive, and that is the one we want.
import numpy as np
# get beginning value (BV)
BV = 522000.00
# get desired end value (EV)
EV = 7280000.00
def get_monthly(BV, EV):
    coefs = np.ones(13)
    coefs[-1] -= EV / BV + 1
    # there will be a unique positive real root
    roots = np.roots(coefs)
    return roots[(roots.imag == 0) & (roots.real > 0)][0].real - 1
rate = get_monthly(BV, EV)
print(rate)
# 0.022913299846925694
Some comments:
roots.imag == 0 may be problematic in some cases since roots uses a numeric algorithm. As an alternative, we can pick a root with the least imaginary part (in absolute value) among all roots with a positive real part.
We can use the same method to get rates for other time intervals. For example, for weekly rates, we can replace 13 == 12 + 1 with 52 + 1.
The above polynomial has a solution by radicals, as outlined here.
Update on performance. We could also frame this as a fixed point problem, i.e. look for a fixed point of the function
x = R * x ** (1/13) - R + 1, where R = EV/BV + 1
The fixed point x will be equal to (1 + rate)**13.
The following pure-Python implementation is roughly four times faster than the above numpy version on my machine.
TOLERANCE = 1e-12  # stopping threshold (this value is an assumption; it was left undefined)
def get_monthly_fix(BV, EV, periods=12):
    n = periods + 1      # 13 terms: 1 + r + ... + r**periods
    ratio = EV / BV + 1  # value of the 13-term sum
    r = guess = ratio
    while True:
        r = ratio * r ** (1 / n) - ratio + 1
        if abs(r - guess) < TOLERANCE:
            return r ** (1 / n) - 1
        guess = r
We can make this run even faster with the help of numba.jit.
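As an independent cross-check that needs neither numpy nor a convergence argument, the same equation r + ... + r**12 = EV/BV can be solved by bisection, since the left-hand side increases with r for r > 0. This is a sketch; the function name `monthly_rate_bisect` and the tolerance are assumptions.

```python
def monthly_rate_bisect(BV, EV, periods=12, tol=1e-12):
    """Solve BV * (r + r**2 + ... + r**periods) = EV for r = 1 + rate by bisection."""
    target = EV / BV

    def series_sum(r):
        return sum(r ** k for k in range(1, periods + 1))

    lo, hi = 0.0, 1.0
    while series_sum(hi) < target:   # grow the bracket until it contains the root
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if series_sum(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2 - 1

rate = monthly_rate_bisect(522000.00, 7280000.00)
print(rate)
```

The printed rate should agree with the numpy result above to many decimal places.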
I am not sure if this works (tell me if it doesn't) but try this.
def get_value(start, end, times, trials=100, _amount=None, _last=-1, _increase=None):
    #don't call with _amount, _last, or _increase! Only start, end and times
    if _amount is None:
        _amount = start / times
    if _increase is None:
        _increase = start / times
    attempt = 1
    for n in range(times):
        attempt = (attempt * _amount) + attempt
    if attempt > end:
        if _last != 0:
            _increase /= 2
            _last = 0
        _amount -= _increase
    elif attempt < end:
        if _last != 1:
            _increase /= 2
            _last = 1
        _amount += _increase
    else:
        return _amount
    if trials <= 0:
        return _amount
    return get_value(start, end, times, trials=trials-1,
                     _amount=_amount, _last=_last, _increase=_increase)
Tell me if it works.
Used like this:
get_value(522000.00, 7280000.00, 12)

Recursive function for CPI

Having a play around trying to better understand recursion. I want to make a function that shows the CPI increase for a particular year given a starting amount.
Assuming the starting amount is 100000 and the CPI rate is 5%, then f(0) = 100000, f(1) = 5000, f(2) = 5250, etc. I want to return the CPIincrease column from the table below:
rate = 0.05

n     TotalCPI       CPIincrease
0     100000
1     105000         5000
2     110250         5250
3     115762.5       5512.5
4     121550.625     5788.125
5     127628.1563    6077.53125
6     134009.5641    6381.407812
7     140710.0423    6700.478203
8     147745.5444    7035.502113
9     155132.8216    7387.277219
10    162889.4627    7756.64108
So far I have the TotalCPI column from the table above:
def CPIincreases(n):
    if n<=0:
        return initial
    else:
        return (CPIincreases(n-1))*(1+CPIrate)
initial = 100000
CPIrate = 0.05
print(CPIincreases(1),CPIincreases(2),CPIincreases(3),CPIincreases(4))
output: 105000.0 110250.0 115762.5 121550.625
Now I'm lost, because the output shows that I should be adding in
CPIincreases(n) - CPIincreases(n-1)
somewhere.
Any help is greatly appreciated, even if it's to say this function is not possible.
Cheers
The function you have created calculates the total value of the lump sum over time, and as you have pointed out, you can call it twice (once with year and once with year - 1) and take the difference to get your answer.
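The difference approach just described can be sketched as follows. The function names `total_cpi` and `cpi_increase_by_difference` are illustrative, and the parameters are passed explicitly instead of being read from globals.

```python
def total_cpi(year, initial, rate):
    """Total value of the lump sum after `year` years (recursive)."""
    if year <= 0:
        return initial
    return total_cpi(year - 1, initial, rate) * (1 + rate)

def cpi_increase_by_difference(year, initial, rate):
    """Increase during `year`, as the difference of two consecutive totals."""
    if year <= 0:
        return 0
    return total_cpi(year, initial, rate) - total_cpi(year - 1, initial, rate)

for year in range(4):
    print(year, cpi_increase_by_difference(year, 100000, 0.05))
```

This walks the recursion twice per call, which is fine for small year counts but wasteful compared with a single recursive pass.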
If you really want to do this recursively in one go, we need to think through the base cases:
Year 0: At the beginning there is no interest
    return 0
Year 1: After the first year the change is just the initial amount times the interest rate
    return initial * rate
Year 2+: From this year on, we make the same as last year, plus the interest on that interest
    return last_year + rate * last_year
Or just: return last_year * (1 + rate)
Now we can put it all together:
def cpi_increase(year, initial, rate):
    if year == 0:
        return 0
    if year == 1:
        return initial * rate
    return (1 + rate) * cpi_increase(year - 1, initial, rate)
If we print this out, we can see that the values match up:
initial = 100000
rate = 0.05
for year in range(11):
    print('{year:<5} {total:<21} {cpi_increase}'.format(
        year=year,
        total=initial * (1 + rate) ** year,
        cpi_increase=cpi_increase(year, initial, rate)
    ))
The values:
0     100000.0              0
1     105000.0              5000.0
2     110250.0              5250.0
3     115762.50000000001    5512.5
4     121550.62500000003    5788.125
5     127628.15625000003    6077.53125
6     134009.56406250005    6381.407812500001
7     140710.04226562506    6700.478203125001
8     147745.5443789063     7035.502113281251
9     155132.8215978516     7387.2772189453135
10    162889.4626777442     7756.64107989258
Thinking through our base cases also shows how to create the direct calculation. At year y we have applied the (1 + rate) multiplication y - 1 times and the base (initial * rate) once. This gives us:
def cpi_increase_direct(year, initial, rate):
    if year <= 0:
        return 0
    return initial * rate * (1 + rate) ** (year - 1)
I like how Jon's answer is more elaborate. Here's my code. I've tried to make the variable names self-explanatory, but I'll briefly describe them as well.
total_cpi: first column
cpi_increase: second column
cpi_rate: CPIrate
If we need to solve it in one recursive function, we can do so only by passing state variables along:
def calculate_cpi_increase(total_cpi, cpi_increase, year):
    if year == 0:
        return total_cpi, cpi_increase
    else:
        return calculate_cpi_increase(total_cpi*(1+cpi_rate), total_cpi*cpi_rate, year-1)
cpi_rate = 0.05
calculate_cpi_increase(100000, 0, 10)
result: (162889.46267774416, 7756.641079892579)
First of all, you should not call a recursive function separately for every value that you want to print. For example, if you call F(2), F(3), F(4) and F(5), you repeat the calculation of F(2) four times, as every other call needs it.
You should also avoid global variables; you can use my simple approach and encapsulate them in another function. This code generates not just one value but the full table, as a list of Python tuples. Each tuple holds two values: the total and the increment for that iteration. The full table is printed by another function.
def CPITable(n, initial = 100000, CPIrate = 0.05):
    def CPITableRecurse(n):
        if n<=0:
            return [(initial,0)]
        else:
            # recurse on the inner function so that initial and CPIrate are kept
            CPI = CPITableRecurse(n-1)
            inc = CPI[-1][0] * CPIrate
            CPI.append((CPI[-1][0] + inc , inc ))
            return CPI
    return CPITableRecurse(n)
def printTable(table):
    i = 0
    for line in table:
        print ( str(i) + " %5.2f %5.2f" % line)
        i += 1
printTable(CPITable(6))
#output:
# 0 100000.00 0.00
# 1 105000.00 5000.00
# 2 110250.00 5250.00
# 3 115762.50 5512.50
# 4 121550.62 5788.12
# 5 127628.16 6077.53
# 6 134009.56 6381.41
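The point about repeated calculations can also be addressed by caching the recursive function rather than building a table. This is a sketch using `functools.lru_cache`; the factory name `make_total_cpi` is hypothetical, and the parameters are captured in a closure to avoid globals.

```python
from functools import lru_cache

def make_total_cpi(initial=100000, cpi_rate=0.05):
    """Build a memoised recursive total-CPI function for the given parameters."""
    @lru_cache(maxsize=None)
    def total_cpi(n):
        if n <= 0:
            return initial
        return total_cpi(n - 1) * (1 + cpi_rate)
    return total_cpi

total_cpi = make_total_cpi()
# Each total is computed once; later calls reuse the cached value.
increases = [total_cpi(n) - total_cpi(n - 1) for n in range(1, 6)]
print(increases)
```

Calling `total_cpi.cache_info()` afterwards shows how many calls were served from the cache.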

Standard deviation of combinations of dices

I am trying to find stdev for a sequence of numbers that were extracted from combinations of dice (30) that sum up to 120. I am very new to Python, so this code makes the console freeze because the numbers are endless and I am not sure how to fit them all into a smaller, more efficient function. What I did is:
found all possible combinations of 30 dice;
filtered combinations that sum up to 120;
multiplied all items in the list within result list;
tried extracting standard deviation.
Here is the code:
import itertools
import numpy
dice = [1,2,3,4,5,6]
subset = itertools.product(dice, repeat = 30)
result = []
for x in subset:
    if sum(x) == 120:
        result.append(x)
my_result = numpy.product(result, axis = 1).tolist()
std = numpy.std(my_result)
print(std)
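Note that D(X) = E(X^2) - E(X)^2, so you can solve this problem analytically with the following recurrences. Over all ways for i dice to sum to N, f is the sum of the products, g is the sum of the squared products, and h is the number of ways.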
f[i][N] = sum(k*f[i-1][N-k]) (1<=k<=6)
g[i][N] = sum(k^2*g[i-1][N-k])
h[i][N] = sum(h[i-1][N-k])
f[1][k] = k ( 1<=k<=6)
g[1][k] = k^2 ( 1<=k<=6)
h[1][k] = 1 ( 1<=k<=6)
Sample implementation:
import numpy as np
Nmax = 120
nmax = 30
min_value = 1
max_value = 6
f = np.zeros((nmax+1, Nmax+1), dtype ='object')
g = np.zeros((nmax+1, Nmax+1), dtype ='object') # the intermediate results will be really huge, to keep them accurate we have to utilize python big-int
h = np.zeros((nmax+1, Nmax+1), dtype ='object')
for i in range(min_value, max_value+1):
    f[1][i] = i
    g[1][i] = i**2
    h[1][i] = 1
for i in range(2, nmax+1):
    for N in range(1, Nmax+1):
        f[i][N] = 0
        g[i][N] = 0
        h[i][N] = 0
        for k in range(min_value, max_value+1):
            if N - k < 0:
                continue # impossible sum; a negative index would wrap around to the end of the row
            f[i][N] += k*f[i-1][N-k]
            g[i][N] += (k**2)*g[i-1][N-k]
            h[i][N] += h[i-1][N-k]
result = np.sqrt(float(g[nmax][Nmax]) / h[nmax][Nmax] - (float(f[nmax][Nmax]) / h[nmax][Nmax]) ** 2)
# result = 32128174994365296.0
You are asking for an unfiltered list of length 6^30 ≈ 2*10^23, which is impossible to handle as such.
There are two possibilities that can be combined:
(1) Think more about the problem up front, e.g. about how to sample only those combinations with sum 120.
(2) Do a Monte Carlo simulation instead, i.e. don't enumerate all combinations, but draw only a few thousand random ones to obtain a representative sample that determines the std accurately enough.
For now, I only apply (2), giving the brute force code:
import random
N = 30 # number of dice
M = 100000 # number of samples
S = 120 # required sum
result = [[random.randint(1,6) for _ in range(N)] for _ in range(M)]
result = [s for s in result if sum(s) == S]
Now, that result should be comparable to your result before using numpy.product ... that part I couldn't follow, though...
OK, if you are after the standard deviation of the product of the 30 dice, that is what your code does. Then I need 1 000 000 samples to get roughly reproducible values for the std (1 digit). That takes my PC about 20 seconds, still considerably less than 1 million years :-D.
Is a number like 3.22*10^16 what you are looking for?
Edit after comments:
Well, sampling the frequency of numbers instead gives only 6 independent variables - even 4 actually, by substituting in the constraints (sum = 120, total number = 30). My current code looks like this:
import itertools
import numpy
def p2(b, s):
    return 2**b * 3**s[0] * 4**s[1] * 5**s[2] * 6**s[3]
hits = range(31)
subset = itertools.product(hits, repeat=4) # only 3,4,5,6 frequencies
product = []
permutations = []
for s in subset:
    b = 90 - (2*s[0] + 3*s[1] + 4*s[2] + 5*s[3]) # 2 frequency
    a = 30 - (b + sum(s)) # 1 frequency
    if 0 <= b <= 30 and 0 <= a <= 30:
        product.append(p2(b, s))
        permutations.append(1) # TODO: Replace 1 with possible permutations
# convert to float first: the products exceed the 64-bit integer range
print(numpy.std(numpy.array(product, dtype=float))) # TODO: calculate std manually, considering permutations
This computes in about 1 second, but the confusing part is that I get as a result 1.28737023733e+17. Either my previous approaches or this one has a bug - or both.
Sorry - not that easy: The sampling is not of the same probability - that is the problem here. Each sample has a different number of possible combinations, giving its weight, which has to be considered before taking the std-deviation. I have drafted that in the code above.
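The weighting described here can be sketched by giving each face-count vector its multinomial coefficient, i.e. the number of ordered rolls that share those counts. This is an illustrative completion of the TODOs, not the answerer's final code; the function name `weighted_product_std` is hypothetical, and exact integer and `Fraction` arithmetic is used to avoid overflow.

```python
from fractions import Fraction
from itertools import product
from math import factorial, sqrt

def weighted_product_std(n_dice=30, target=120):
    """Std of the product of faces over all ordered rolls of n_dice summing to target."""
    weight_total = 0   # number of ordered rolls
    sum_p = 0          # sum of products over those rolls
    sum_p2 = 0         # sum of squared products
    for c3, c4, c5, c6 in product(range(n_dice + 1), repeat=4):
        # The two constraints (count = n_dice, sum = target) fix c2 and c1.
        c2 = (target - n_dice) - (2 * c3 + 3 * c4 + 4 * c5 + 5 * c6)
        c1 = n_dice - (c2 + c3 + c4 + c5 + c6)
        if c1 < 0 or c2 < 0:
            continue
        # Multinomial coefficient: orderings that share this face-count vector.
        weight = factorial(n_dice)
        for c in (c1, c2, c3, c4, c5, c6):
            weight //= factorial(c)
        value = 2**c2 * 3**c3 * 4**c4 * 5**c5 * 6**c6
        weight_total += weight
        sum_p += weight * value
        sum_p2 += weight * value * value
    if weight_total == 0:
        raise ValueError("no roll of n_dice dice sums to target")
    mean = Fraction(sum_p, weight_total)
    variance = Fraction(sum_p2, weight_total) - mean**2
    return sqrt(variance)

# Small case that is easy to verify by brute force; call with the
# defaults (30 dice, sum 120) for the problem in the question.
print(weighted_product_std(4, 14))
```

For the default arguments the loop visits 31**4 count vectors, which should finish in about a second, and the result can be compared against the figures quoted in the answers above.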
