How to implement exponential smoothing manually with Python? - python

This is my first question here and I'm also new to Python (without a CS background, I must add) as well!
I'm trying to implement triple exponential smoothing to make predictions. My data is AIS data, and I'm focusing specifically on SOG (Speed Over Ground) values. The mathematical approach I'm following is the Triple Exponential Smoothing Model.
I've still only followed the basics of Python and I'm struggling to figure out the iteration part. What I expect, however, is to read data from a CSV (which includes Time and SOG) and forecast the Speed values, so I can compare the predicted and real values.
Here is the example/test data table that I'm using atm.
I tried coding the equation part (shown below) and I know it is beyond sloppy. But I didn't want to come here without anything.
alpha = 0.9
m = 3

def test(x_current, ssv_previous, dsv_previous, tsv_previous):
    # ssv = single smoothing value (s'(t-1) and s'(t))
    ssv_current = (alpha * x_current) + ((1 - alpha) * ssv_previous)
    # dsv = double smoothing value (s''(t-1) and s''(t))
    dsv_current = (alpha * ssv_current) + ((1 - alpha) * dsv_previous)
    # tsv = triple smoothing value (s'''(t-1) and s'''(t))
    tsv_current = (alpha * dsv_current) + ((1 - alpha) * tsv_previous)
    at = (3 * ssv_current) - (3 * dsv_current) + tsv_current
    # slope term: alpha / (2 * (1 - alpha)^2) multiplies the bracket
    bt = (alpha / (2 * ((1 - alpha) ** 2))) * (((6 - 5 * alpha) * ssv_current) - ((10 - 8 * alpha) * dsv_current)
                                               + ((4 - 3 * alpha) * tsv_current))
    ct = ((alpha ** 2) / ((1 - alpha) ** 2)) * (ssv_current - (2 * dsv_current) + tsv_current)
    ft = at + (m * bt) + (0.5 * (m ** 2) * ct)  # mth predicted value at time t
    # return the forecast and the smoothed values needed for the next step
    return ft, ssv_current, dsv_current, tsv_current
I know both my question and piece of code seem trash, but I look forward to learning from this community. I've only worked with MatLab before and any tip here would really help me.
TIA!
EDIT: I realized my post does not convey what I really want. Basically, I want the code to read through the speed values one-by-one and iterate through it and print the predicted value.

A very basic iterator would be:
import csv
with open('datafile.csv', 'r', newline='') as datafile:
    csv_file = csv.reader(datafile)
    for row in csv_file:
        print(row)
Each row is a list of strings holding the values of one line of the file.
Refer: CSV Library reference
You could do the same with pandas as well.
import pandas as pd
df = pd.read_csv('datafile.csv')
Now you don't need to iterate. Do the calculation on entire columns at once and pandas will compute the result for every row.
e.g.
df['total'] = df['a'] + df['b']
Just like that
Refer: Pandas
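To connect the row-by-row reading with the smoothing recursion from the question, here is a minimal sketch in plain Python. It assumes a CSV with a column named SOG (the in-memory sample, the helper names read_sog and brown_forecasts, and starting all three smoothed series at the first observation are assumptions for illustration). The slope coefficient is written as alpha / (2 (1 - alpha)^2), the form in which the estimate recovers the slope of a straight-line series.

```python
import csv
import io


def read_sog(fileobj, column="SOG"):
    """Read one numeric column from an open CSV file (column name is an assumption)."""
    return [float(row[column]) for row in csv.DictReader(fileobj)]


def brown_forecasts(values, alpha=0.9, m=3):
    """Return the m-step-ahead forecast made at each time step."""
    # Assumed initialisation: all three smoothed series start at the first value.
    s1 = s2 = s3 = values[0]
    forecasts = []
    for x in values:
        s1 = alpha * x + (1 - alpha) * s1    # single smoothing
        s2 = alpha * s1 + (1 - alpha) * s2   # double smoothing
        s3 = alpha * s2 + (1 - alpha) * s3   # triple smoothing
        a = 3 * s1 - 3 * s2 + s3
        b = alpha / (2 * (1 - alpha) ** 2) * (
            (6 - 5 * alpha) * s1 - (10 - 8 * alpha) * s2 + (4 - 3 * alpha) * s3
        )
        c = alpha ** 2 / (1 - alpha) ** 2 * (s1 - 2 * s2 + s3)
        forecasts.append(a + m * b + 0.5 * m ** 2 * c)
    return forecasts


# Made-up sample standing in for the real CSV file.
sample = io.StringIO("Time,SOG\n0,10.0\n1,10.5\n2,11.0\n3,11.5\n4,12.0\n")
speeds = read_sog(sample)
predicted = brown_forecasts(speeds)
for t, (x, f) in enumerate(zip(speeds, predicted)):
    print(f"t={t}: SOG={x}, forecast for t+3={f:.3f}")
```

For a real file, pass open('datafile.csv', newline='') to read_sog instead of the in-memory sample. The forecast made at step t can then be compared with the observed value at step t + m.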

Related

Implement a system of stochastic ODEs using python

I want to add noise to a system of ODEs from Ramos et al. (2021), a kind of SIR model:
system
I implemented the Milstein scheme for the relevant equations:
# Create Brownian Motion
np.random.seed(1)
dS = np.sqrt(dt) * np.random.randn(tmax)
dE = np.sqrt(dt) * np.random.randn(tmax)
dI = np.sqrt(dt) * np.random.randn(tmax)
dIu = np.sqrt(dt) * np.random.randn(tmax)
dDu = np.sqrt(dt) * np.random.randn(tmax)
dHR = np.sqrt(dt) * np.random.randn(tmax)
dHD = np.sqrt(dt) * np.random.randn(tmax)
dB = [dS, dE, dI, dIu, dDu, dHR, dHD]
sigma = [0.5, 0, 0, 0, 0, 0, 0]
# Brief definition of systemf function evaluating the second term at time t for each variant i
for i in range(nvariants):
    newe = S[0]*(mbetae[i]*E[i] + mbetai[i]*I[i] + mbetaiu[i]*Iu[i] + mbetahr[i]*HR[i] + mbetahd[i]*HD[i])/totalpop
    newi = gammae * E[i]
    newhid = gammai * I[i]
    newhiu = gammaiu * Iu[i]
    newr = gammahr * HR[i]
    newd = gammahd * HD[i]
    newq = gammaq * Q[i]
    neweS = neweS + newe
    fE[i] = newe - newi
    fI[i] = newi - newhid
    fIu[i] = (1-theta[i]-omegau)*newhid - newhiu
    fHR[i] = p[i]*(theta[i]-fatrate[i])*newhid - newr
    fHD[i] = fatrate[i]*newhid - newd
    fQ[i] = (1-p[i])*(theta[i]-fatrate[i])*newhid + newr - newq
fS[0] = - neweS - (vjRK[int(mt.floor(t))])  # fS.insert(0,-neweS -(vjRK[mt.floor(t)]))
return [fS, fE, fI, fIu, fHR, fHD, fQ]
for t in range(delayini, tmax-1):
    fsyseval = systemf(t, states[t], beta[2*t], gamma[2*t], frate[2*t], theta[2*t], p[2*t], omegau[2*t], vjsum)
    # Running the scheme
    for s in range(numstates):
        for i in range(nvariants):
            states[t+1][s][i] = states[t][s][i] + fsyseval[s][i]*dt + sigma[s]*dB[s][t]*states[t][s][i] + 0.5*sigma[s]**2 * states[t][s][i] * (dB[s][t] ** 2 - dt)
The problem is that when I plot each variable (susceptible, infected, ...), the results look very strange and have nothing to do with the deterministic model: I see no fluctuations, and the shape is not even close to the deterministic one, which makes no sense. So I thought that maybe I did not implement the stochastic scheme correctly and missed something.
Now I want to know whether my implementation of the stochasticity is correct (and if so, why the results show no fluctuations despite the high noise level).
If not, how can I add the stochastic part correctly?
Thank you in advance for your help.
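For reference, the update used in the loop above has the Milstein form for a scalar SDE with multiplicative noise, dX = f(X) dt + sigma X dW. Below is a minimal one-variable sketch of that update, not the full model. The function name milstein_path, the toy drift decay and all parameter values are hypothetical.

```python
import math
import random


def milstein_path(x0, drift, sigma, dt, n_steps, rng):
    """Simulate dX = drift(X) dt + sigma * X dW with the Milstein scheme."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment ~ N(0, dt)
        x_next = (x + drift(x) * dt
                  + sigma * x * dw
                  + 0.5 * sigma ** 2 * x * (dw ** 2 - dt))
        path.append(x_next)
    return path


def decay(x):
    # Hypothetical toy drift, chosen only for illustration.
    return -0.5 * x


noisy = milstein_path(1.0, decay, sigma=0.5, dt=0.01, n_steps=1000, rng=random.Random(1))
quiet = milstein_path(1.0, decay, sigma=0.0, dt=0.01, n_steps=1000, rng=random.Random(1))
print(noisy[-1], quiet[-1])
```

With sigma = 0 the scheme reduces to the explicit Euler step of the deterministic equation, which gives a quick sanity check against the deterministic model. The size of the random kick per step is roughly sigma * x * sqrt(dt), so it depends on dt and on the current value of the state the noise is applied to.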

How do i incorporate this formula into my python program?

I am new to Python and trying to figure out a way to input numbers into a mathematical formula.
Mathematical formula as shown:
I wrote the following code based on the formula above, but I'm not sure why it doesn't work.
def cumulative_function():
    alpha = 0.05
    fx_value = 5
    variable_a = -100
    k = 2
    cdf_formula = alpha * (fx_value * (a) + fx_value * (k - 2 * (alpha) + fx * (k - alpha) + fx_value * (value_x)))
    return cdf_formula

cdf = cumulative_function()
print(f"value is: {cdf}")
*EDIT:
How do I insert the infinity ('...') in the formula to my program?
Would appreciate anyone who can help me out:)
fx(a) does not mean fx * (a); fx is a function, and a is the value you pass to it.
For example, let fx(x) = x*2 + x.
Then fx(a) is a*2 + a,
and fx(5) is 5*2 + 5 = 10 + 5 = 15.
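In Python this means defining fx with def and calling it with an argument. A minimal sketch using the toy function from the example (the actual fx of the formula is not shown, so this one is only a stand-in):

```python
def fx(x):
    # Toy stand-in: fx(x) = x*2 + x
    return x * 2 + x


a = 5
print(fx(a))      # prints 15
print(fx(a + 1))  # any expression can be passed as the argument
```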

Python - Multiprocessing with multiple for-loops

I know there are other questions asked concerning this topic so I'm sorry I have to ask it again, but I cannot get it to work since I'm quite new to this topic.
I have four nested for-loops in which certain algebraic calculations are done (matrix operations, for example). These calculations take too much time to complete, so I was hoping I could speed this up with multiprocessing.
The code is given below. I simulated the ranges and matrix sizes here, but in my code these ranges are really used (so it's not strange that it takes so long). You should be able to run it directly by copy-pasting the code.
import numpy as np
from scipy.linalg import fractional_matrix_power
import math

# Lists for the loop (and one value)
x_list = np.arange(0, 32, 1)
y_list = np.arange(0, 32, 1)
a_list = np.arange(0, 501, 1)
b_list = np.arange(0, 501, 1)
c_list = np.arange(0, 64, 1)
d_number = 32

# Matrices (np.complex128 is the type np.complex_ was an alias for)
Y = np.arange(2048).reshape(32, 64)
g = np.asmatrix(np.empty([d_number, 1], dtype=np.complex128))
A = np.empty([len(a_list), len(b_list), len(c_list)], dtype=np.complex128)
A_in = np.empty([len(a_list), len(b_list)], dtype=np.complex128)

for ai in range(len(a_list)):
    for bi in range(len(b_list)):
        for ci in range(len(c_list)):
            f_k_i = c_list[ci]
            X_i = np.asmatrix([Y[:, ci]]).T
            for di in range(d_number):
                r = math.sqrt((x_list[di] - a_list[ai])**2 + (y_list[di] - b_list[bi])**2 + 63**2)
                g[di, 0] = np.exp(-2 * np.pi * 1j * f_k_i * (r / 8)) / r  # g is a vector
            A[-bi, -ai, ci] = ((1 / np.linalg.norm(g)**2) * (((g.conj().T * fractional_matrix_power((X_i * X_i.conj().T), (1/5)) * g) / np.linalg.norm(g)**2)**2)).item(0)
        A_in[-bi, -ai] = (1 / len(c_list)) * sum(A[-bi, -ai, :])
What is the best way to approach this? If multiprocessing is the solution, how to implement this for my case (since I couldn't figure that out).
Thanks in advance.
One way to approach it would be to move the two inside loops into a function taking ai and bi as parameters and returning the indexes and the result. Then use multiprocessing.Pool.imap_unordered() to run the function on ai, bi pairs. Something like this (untested):
import itertools
import multiprocessing

def partial_calc(index):
    """
    This function replaces the inner two loops to calculate the value of
    A_in[-bi, -ai]. index is a tuple (ai, bi).
    """
    ai, bi = index
    for ci in range(len(c_list)):
        f_k_i = c_list[ci]
        X_i = np.asmatrix([Y[:, ci]]).T
        for di in range(d_number):
            r = math.sqrt((x_list[di] - a_list[ai])**2 + (y_list[di] - b_list[bi])**2 + 63**2)
            g[di, 0] = np.exp(-2 * np.pi * 1j * f_k_i * (r / 8)) / r  # g is a vector
        A[-bi, -ai, ci] = ((1 / np.linalg.norm(g)**2) * (((g.conj().T * fractional_matrix_power((X_i * X_i.conj().T), (1/5)) * g) / np.linalg.norm(g)**2)**2)).item(0)
    return ai, bi, (1 / len(c_list)) * sum(A[-bi, -ai, :])

def main():
    with multiprocessing.Pool(None) as p:
        # this replaces the outer two loops
        indices = itertools.product(range(len(a_list)), range(len(b_list)))
        partial_results = p.imap_unordered(partial_calc, indices)
        for ai, bi, value in partial_results:
            A_in[-bi, -ai] = value
    #... do something with A_in ...

if __name__ == "__main__":
    main()
Or put the inner three loops into the function and generate one "row" for A_in at a time. Profile it both ways and see which is faster.
The trick will be setting up the lists (a_list, b_list, etc) and the Y matrix. And that depends on their characteristics (constant, quickly/slowly calculated, large/small, etc).
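The same pattern in a small, self-contained form may make the structure easier to see. This is only a sketch: cell_value is a made-up stand-in for the expensive inner loops, and fill_grid, the grid size and the chunksize are assumptions for illustration.

```python
import itertools
import multiprocessing


def cell_value(index):
    """Toy stand-in for the expensive inner loops; index is a tuple (ai, bi)."""
    ai, bi = index
    value = sum((ai - ci) ** 2 + (bi - ci) ** 2 for ci in range(4))
    return ai, bi, value


def fill_grid(n_a, n_b, processes=None):
    """Compute every (ai, bi) cell in worker processes and collect the results."""
    grid = [[0] * n_b for _ in range(n_a)]
    indices = itertools.product(range(n_a), range(n_b))
    with multiprocessing.Pool(processes) as pool:
        # Results arrive in arbitrary order, so each one carries its own indices.
        for ai, bi, value in pool.imap_unordered(cell_value, indices, chunksize=16):
            grid[ai][bi] = value
    return grid


if __name__ == "__main__":
    print(fill_grid(3, 3))
```

The worker returns its indices together with the value because imap_unordered yields results as they finish. A larger chunksize reduces the communication overhead when each individual task is cheap.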

Using fsolve to solve a set of equations where one is a sigma function that changes every iteration

I am new to python and I am working on a finance project to solve a set of equations that enables me to go from par spread to flat spread in terms of CDS.
I have a set of data for the upfront (U) and the years (i). In the sample data below, x holds the upfront values and y the years:
x = [-0.007,-0.01,-0.009,-0.004,0.005,0.011,0.018,0.027,0.037,0.048]
y = [1,2,3,4,5,6,7,8,9,10]
Here are the 3 equations that I am trying to solve together:
U = A(s(i)-c)
L(i) = 1 - (1 - (s(i) / (1 - R)) ** i) / (1 - (1 / (s(i-1) - R)) ** (i - 1))
A = sum([((1 - L(i)) / (1 + r)) ** j for j in range(1, i+1)])
Detailed explanation:
The goal is to solve and list the results for all 10 values of variable s
1st equation is used to calculate the upfront amount, where s is unknown
2nd equation is used to calculate the hazard rate L, where R is recovery rate, s(i) is the current s term and s(i-1) is the previous s term.
Visual representation of equation 2:
3rd equation is used to calculate the annual risky annuity, the purpose of this equation is to calculate and sum the risk annuities. For example, if i=1, then there should be one term in the equation. If i=2, then there should be 2 terms in the equation where they are summed. This repeats until the 10th iteration where there are 10 values and they are summed.
Visual representation of equation 3:
To attempt to solve the problem, I wrote the following code (which doesn't run yet):
x = [-0.007,-0.01,-0.009,-0.004,0.005,0.011,0.018,0.027,0.037,0.048]
y = [1,2,3,4,5,6,7,8,9,10]
c = 0.01
r = 0.01
R = 0.4

def eqs(s, U, t, c=0.01, r=0.01, R=0.4):
    L = 1 - (1 - (s / (1 - R)) ** t) / (1 - (1 / (1 - R)) ** (t - 1))
    A = sum([((1 - L) / (1 + r)) ** j for j in range(1, i+1)])
    s = (U/A) + c
    return L, A, s

for U, t in zip(x, y):
    s = fsolve(eq1, 0.01, (U, t,))
    print(s, U, t)
Main obstacles:
I haven't found a way where I can make Equation 3 work.
I also haven't been able to pass through 2 sets of values into the for loop that then calls the function
I wasn't able to loop the previous spread value, s(i-1), back into the iteration to compute the next value
I was able to solve it manually on python by changing the third equation every iteration and inputting the previous results
I am hoping I can find some solution to my problem, thank you for your help in advance!
It took me a bit, but I think I got it. Your main problem is that you can't just code formulas that describe a complex problem, call a 'magic' fsolve function, and hope that Python will solve it for you without even defining what the unknown is.
It doesn't work that way. You have to make your problem simple enough that it can be solved with existing library functions. Python has no form of intelligence or divination.
As I said in my comments, fsolve() from scipy.optimize can only solve problems of the form f(x) = 0.
If you want to use it, you have to transform your complex problem into a simple f(x) = 0 problem.
Starting from the third line of your function, s = (U/A) + c (which is your first equation, U = A(s - c), rearranged), we can deduce that s - (U/A) - c = 0.
Given that A is a function of L and L is a function of s, if you define a function f(s) = s - (U/A) - c, then s is the solution of f(s) = 0.
This is what I did in the following code:
from scipy.optimize import fsolve

def Lambda(s, sold, R, t):
    num = (1 - s / (1 - R)) ** t
    den = (1 - sold / (1 - R)) ** (t - 1)
    return 1 - num / den

def Annuity(L, r, Aold, j):
    return Aold + ((1 - L) / (1 + r)) ** j

def f(s, U, sold, R, t, r, Aold, j):
    L = Lambda(s, sold, R, t)
    A = Annuity(L, r, Aold, j)
    return s - (U / A) - c  # c is the module-level constant defined below

x = [-0.007,-0.01,-0.009,-0.004,0.005,0.011,0.018,0.027,0.037,0.048]
y = [1,2,3,4,5,6,7,8,9,10]
c = 0.01
r = 0.01
R = 0.4
sold = 0.
Aold = 0.
for n, (U, t) in enumerate(zip(x, y)):
    j = n + 1
    print("j={},U={},t={}".format(j, U, t))
    init = 0.01  # The starting estimate for the roots of f(s) = 0.
    roots = fsolve(f, init, args=(U, sold, R, t, r, Aold, j))
    s = roots[0]
    L = Lambda(s, sold, R, t)
    A = Annuity(L, r, Aold, j)
    print("s={},L={},A={}".format(s, L, A))
    print()
    # the current results become the "previous" values of the next iteration
    sold = s
    Aold = A
It gives following outputs :
j=1,U=-0.007,t=1
s=0.00289571337037,L=0.00482618895061,A=0.985320604999
j=2,U=-0.01,t=2
s=0.00485464221105,L=0.0113452406083,A=1.94349944361
j=3,U=-0.009,t=3
s=0.00685582655826,L=0.0180633847507,A=2.86243751076
j=4,U=-0.004,t=4
s=0.00892769166807,L=0.0251666093582,A=3.73027037175
j=5,U=0.005,t=5
s=0.0111024600844,L=0.0328696834011,A=4.53531159145
j=6,U=0.011,t=6
s=0.0120640333844,L=0.0280806661972,A=5.32937116379
j=7,U=0.018,t=7
s=0.0129604367831,L=0.0305170484121,A=6.08018387787
j=8,U=0.027,t=8
s=0.0139861021632,L=0.0351929301367,A=6.77353436882
j=9,U=0.037,t=9
s=0.0149883645118,L=0.0382416644539,A=7.41726068981
j=10,U=0.048,t=10
s=0.0159931206639,L=0.041597709395,A=8.00918297693
No idea if it's correct, but it looks likely to me. I guess you got the idea now and you will be able to make some adjustments.

spectrogram by fft using python

I am trying to understand a piece of code which, in my opinion, applies a filter first and then computes the FFT.
I don't understand how it does that. Can anyone please explain it to me?
Here is the code:
import wave
import numpy as np
from numpy.fft import fft  # assumption: the original import of fft is not shown

# Parameters to create the spectrogram
N = 160000 # No. of frames in .wav file
K = 512
step = 4
wind = 0.5 * (1 - np.cos(np.array(range(K)) * 2 * np.pi / (K - 1)))  # creation of filter window (a Hann window)
ffts = []

def wav_to_floats(file):
    s = wave.open(file, 'r')
    str_sig = s.readframes(s.getnframes())
    y = np.frombuffer(str_sig, np.short)  # 16-bit integer samples
    s.close()
    return y

for file_index in range(len(label)):
    test_flag = label.iloc[file_index]['fold'] # 0 - training data, 1 - test data
    fname = label.iloc[file_index]['filename']
    #-------------from here i dont understand mainly------------
    spectogram = []
    s = wav_to_floats(essential_folder+'src_wavs/'+fname+'.wav')
    for j in range(int((step*N/K) - step)):
        vec = s[j * K//step : (j+step) * K//step] * wind
        spectogram.append(abs(fft(vec, K)[:K // 2]))
    ffts.append(np.array(spectogram))
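First of all, it reads the samples of the .wav file into a NumPy array (s = wav_to_floats(essential_folder+'src_wavs/'+fname+'.wav')). Despite the function name, the array holds 16-bit integers; they become floats when they are multiplied by the window.
After that, it cuts the signal into overlapping frames of K samples, each starting K/step samples after the previous one, and multiplies every frame element by element with the window. This is windowing with a Hann window rather than a convolution or a filter in the usual sense; it tapers the edges of each frame to reduce spectral leakage:
for j in range(int((step*N/K) - step)):
    vec = s[j * K//step : (j+step) * K//step] * wind
Finally, it computes the FFT of each frame, keeps the first K/2 bins (the spectrum of a real signal is symmetric), takes the modulus (the FFT returns complex numbers that carry both modulus and phase), and appends this vector to spectogram. Once all frames of a file are processed, the resulting array is appended to ffts.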
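The framing, windowing and FFT steps can be tried on a synthetic signal. The following is a minimal sketch rather than the original pipeline: the function name spectrogram, the 16 kHz sampling rate and the 1 kHz test tone are assumptions, and the number of frames is computed slightly differently from the loop in the question.

```python
import numpy as np


def spectrogram(signal, K=512, step=4):
    """Magnitude spectrogram from Hann-windowed frames of K samples, hop K // step."""
    hop = K // step
    window = np.hanning(K)  # 0.5 * (1 - cos(2*pi*n / (K - 1)))
    n_frames = (len(signal) - K) // hop + 1
    frames = []
    for j in range(n_frames):
        frame = signal[j * hop : j * hop + K] * window  # element-wise product
        frames.append(np.abs(np.fft.fft(frame, K)[: K // 2]))
    return np.array(frames)


# Synthetic tone instead of a .wav file: 1 kHz sampled at 16 kHz (assumed values).
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
peak_bin = int(spec[0].argmax())
print(spec.shape, peak_bin * fs / 512)
```

Each row of the result is the magnitude spectrum of one frame, and the bin spacing is fs / K, so the strongest bin of a pure tone should sit at the tone's frequency.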
