I am trying to calculate g(x_(i+2)) from the values g(x_(i+1)) and g(x_i), where i is an integer, assuming I(x) and s(x) are Gaussian functions. If x_i = 100, the summation runs from 0 to 100. I don't know how to handle g(x_i) with the subscript in Python: knowing the first and second values, we can find the third, and after n cycles we can find the nth value.
Equation:
code:
import numpy as np
from matplotlib import pyplot as p
from math import pi
def f_s(x, mu_s, sig_s):
    ss = -np.power(x - mu_s, 2) / (2 * np.power(sig_s, 2))
    return np.exp(ss) / (np.power(2 * pi, 2) * sig_s)
def f_i(x, mu_i, sig_i):
    ii = -np.power(x - mu_i, 2) / (2 * np.power(sig_i, 2))
    return np.exp(ii) / (np.power(2 * pi, 2) * sig_i)
# problems occur in this part
def g(x, m, mu_s, sig_s, mu_i, sig_i):
    for i in range(1, m): # specify the number x, x_1, x_2, x_3 ......X_m
        h = (x[i + 1] - x[i]) / e
        for n in range(0, x[i]): # calculate summation
            sum_f = (f_i(x[i], mu_i, sig_i) - f_s(x[i] - n, mu_s, sig_s) * g_x[n]) * np.conj(f_s(n + x[i], mu_s, sig_s))
        g_x[1] = 1 # initial value
        g_x[2] = 5
        g_x[i + 2] = h * sum_f + 2 * g_x[i + 1] - g_x[i]
    return g_x[i + 2]
x = np.linspace(-10, 10, 10000)
e = 1
d = 0.01
m = 1000
mu_s = 2
sig_s = 1
mu_i = 1
sig_i = 1
p.plot(x, g(x, m, mu_s, sig_s, mu_i, sig_i))
p.legend()
p.show()
result: [figure not reproduced: plot of I(x) and s(x)]
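One way to handle the subscript, sketched here under assumptions: keep g(x_0), g(x_1), ... in a list and fill it from left to right, so that g[i] plays the role of g(x_i). The equation itself is not reproduced in the text, so the sketch assumes the three-term update that appears in the code, g[i+2] = h * source + 2 * g[i+1] - g[i]. The names solve_recurrence and source are hypothetical, and the zero source term in the usage lines is a placeholder, not the summation from the question.

```python
def solve_recurrence(x, g0, g1, source):
    """Fill g[0..len(x)-1] using g[i+2] = h*source(i, g) + 2*g[i+1] - g[i].

    source(i, g) may read g[0..i+1], which are already known at step i.
    """
    g = [0.0] * len(x)
    g[0], g[1] = g0, g1          # the two known starting values
    for i in range(len(x) - 2):
        h = x[i + 1] - x[i]      # local grid spacing
        g[i + 2] = h * source(i, g) + 2 * g[i + 1] - g[i]
    return g


# Usage with a placeholder source term (zero), for which g grows linearly.
x = [0.1 * k for k in range(11)]
g = solve_recurrence(x, g0=1.0, g1=5.0, source=lambda i, g: 0.0)
print(g[:4])  # [1.0, 5.0, 9.0, 13.0]
```

Returning the whole list, rather than only the last element, is what allows the result to be plotted against x.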
I'm having trouble with the slow computation of my Python code. Based on the pycallgraph below, the bottleneck seems to be the module named miepython.miepython.mie_S1_S2 (highlighted by pink), which takes 0.47 seconds per call.
The source code for this module is as follows:
import numpy as np
from numba import njit, int32, float64, complex128
__all__ = ('ez_mie',
'ez_intensities',
'generate_mie_costheta',
'i_par',
'i_per',
'i_unpolarized',
'mie',
'mie_S1_S2',
'mie_cdf',
'mie_mu_with_uniform_cdf',
)
@njit((complex128, float64, float64[:]), cache=True)
def _mie_S1_S2(m, x, mu):
    """
    Calculate the scattering amplitude functions for spheres.

    The amplitude functions have been normalized so that when integrated
    over all 4*pi solid angles, the integral will be qext*pi*x**2.

    The units are weird, sr**(-0.5)

    Args:
        m: the complex index of refraction of the sphere
        x: the size parameter of the sphere
        mu: array of angles, cos(theta), to calculate scattering amplitudes

    Returns:
        S1, S2: the scattering amplitudes at each angle mu [sr**(-0.5)]
    """
    nstop = int(x + 4.05 * x**0.33333 + 2.0) + 1
    a = np.zeros(nstop - 1, dtype=np.complex128)
    b = np.zeros(nstop - 1, dtype=np.complex128)
    _mie_An_Bn(m, x, a, b)
    nangles = len(mu)
    S1 = np.zeros(nangles, dtype=np.complex128)
    S2 = np.zeros(nangles, dtype=np.complex128)
    nstop = len(a)
    for k in range(nangles):
        pi_nm2 = 0
        pi_nm1 = 1
        for n in range(1, nstop):
            tau_nm1 = n * mu[k] * pi_nm1 - (n + 1) * pi_nm2
            S1[k] += (2 * n + 1) * (pi_nm1 * a[n - 1]
                                    + tau_nm1 * b[n - 1]) / (n + 1) / n
            S2[k] += (2 * n + 1) * (tau_nm1 * a[n - 1]
                                    + pi_nm1 * b[n - 1]) / (n + 1) / n
            temp = pi_nm1
            pi_nm1 = ((2 * n + 1) * mu[k] * pi_nm1 - (n + 1) * pi_nm2) / n
            pi_nm2 = temp
    # calculate norm = sqrt(pi * Qext * x**2)
    n = np.arange(1, nstop + 1)
    norm = np.sqrt(2 * np.pi * np.sum((2 * n + 1) * (a.real + b.real)))
    S1 /= norm
    S2 /= norm
    return [S1, S2]
Apparently, the source code is jitted by Numba so it should be faster than it actually is. The number of iterations in for loop in this function is around 25,000 (len(mu)=50, len(a)-1=500).
Any ideas on how to speed up this computation? Is something hindering the fast computation of Numba? Or, do you think the computation is already fast enough?
[More details]
In the above, another function _mie_An_Bn is being used. This function is also jitted, and the source code is as follows:
@njit((complex128, float64, complex128[:], complex128[:]), cache=True)
def _mie_An_Bn(m, x, a, b):
    """
    Compute arrays of Mie coefficients A and B for a sphere.

    This estimates the size of the arrays based on Wiscombe's formula. The length
    of the arrays is chosen so that the error when the series are summed is
    around 1e-6.

    Args:
        m: the complex index of refraction of the sphere
        x: the size parameter of the sphere

    Returns:
        An, Bn: arrays of Mie coefficients
    """
    psi_nm1 = np.sin(x) # nm1 = n-1 = 0
    psi_n = psi_nm1 / x - np.cos(x) # n = 1
    xi_nm1 = complex(psi_nm1, np.cos(x))
    xi_n = complex(psi_n, np.cos(x) / x + np.sin(x))
    nstop = len(a)
    if m.real > 0.0:
        D = _D_calc(m, x, nstop + 1)
        for n in range(1, nstop):
            temp = D[n] / m + n / x
            a[n - 1] = (temp * psi_n - psi_nm1) / (temp * xi_n - xi_nm1)
            temp = D[n] * m + n / x
            b[n - 1] = (temp * psi_n - psi_nm1) / (temp * xi_n - xi_nm1)
            xi = (2 * n + 1) * xi_n / x - xi_nm1
            xi_nm1 = xi_n
            xi_n = xi
            psi_nm1 = psi_n
            psi_n = xi_n.real
    else:
        for n in range(1, nstop):
            a[n - 1] = (n * psi_n / x - psi_nm1) / (n * xi_n / x - xi_nm1)
            b[n - 1] = psi_n / xi_n
            xi = (2 * n + 1) * xi_n / x - xi_nm1
            xi_nm1 = xi_n
            xi_n = xi
            psi_nm1 = psi_n
            psi_n = xi_n.real
Example inputs look like this:
m = 1.336-2.462e-09j
x = 8526.95
mu = np.array([-1., -0.7500396, 0.46037385, 0.5988121, 0.67384093, 0.72468684, 0.76421644, 0.79175856, 0.81723714, 0.83962897, 0.85924182, 0.87641596, 0.89383665, 0.90708978, 0.91931481, 0.93067567, 0.94073113, 0.94961222, 0.95689496, 0.96467123, 0.97138347, 0.97791831, 0.98339434, 0.98870543, 0.99414948, 0.9975728, 0.9989995, 0.9989995, 0.9989995, 0.9989995, 0.9989995, 0.99899951, 0.99899951, 0.99899951, 0.99899951, 0.99899951, 0.99899951, 0.99899951, 0.99899951, 0.99899951, 0.99899952, 0.99899952,
               0.99899952, 0.99899952, 0.99899952, 0.99899952, 0.99899952, 0.99899952, 0.99899952, 1.])
I am focusing on _mie_S1_S2 since it appears to be the most expensive function on the provided example dataset.
First of all, you can pass fastmath=True to the JIT decorator to accelerate the computation, provided no special values like +Inf, -Inf, -0 or NaN are computed.
Then you can precompute some expensive expressions containing divisions or implicit integer-to-float conversions. Note that (2 * n + 1) / n = 2 + 1/n and (n + 1) / n = 1 + 1/n. This can be used to reduce the number of precomputed arrays, but it did not change the performance on my machine (this may vary with the target architecture). Note also that such a precomputation has a slight impact on the accuracy of the result (most of the time negligible, and sometimes better than the reference implementation).
On my machine, this strategy makes the code 4.5 times faster with fastmath=True and 2.8 times faster without.
The k-based loop can be parallelized using parallel=True and Numba's prange. However, this may not be faster on all machines (especially those with many cores) since the loop is already pretty fast.
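The precomputation idea can be checked in isolation. The following is a minimal plain-Python sketch (no Numba, so it says nothing about speed) that runs only the pi_n recurrence from the inner loop, once with the divisions inside the loop and once with tabulated factors built from the identities 2 + 1/n and 1 + 1/n. The function names pi_reference and pi_precomputed and the test values mu = 0.5, nstop = 50 are illustrative assumptions.

```python
def pi_reference(mu, nstop):
    """pi_n recurrence with the divisions done inside the loop."""
    pi_nm2, pi_nm1 = 0.0, 1.0
    values = []
    for n in range(1, nstop):
        values.append(pi_nm1)
        pi_nm2, pi_nm1 = pi_nm1, ((2 * n + 1) * mu * pi_nm1 - (n + 1) * pi_nm2) / n
    return values


def pi_precomputed(mu, nstop):
    """Same recurrence, with (2n+1)/n and (n+1)/n tabulated once."""
    inv_n = [1.0 / n for n in range(1, nstop)]
    factor2 = [2.0 + v for v in inv_n]   # (2n + 1) / n = 2 + 1/n
    factor3 = [1.0 + v for v in inv_n]   # (n + 1) / n = 1 + 1/n
    pi_nm2, pi_nm1 = 0.0, 1.0
    values = []
    for i in range(nstop - 1):
        values.append(pi_nm1)
        pi_nm2, pi_nm1 = pi_nm1, factor2[i] * mu * pi_nm1 - factor3[i] * pi_nm2
    return values


ref = pi_reference(0.5, 50)
opt = pi_precomputed(0.5, 50)
max_diff = max(abs(r - o) for r, o in zip(ref, opt))
print(max_diff)  # rounding-level difference between the two variants
```

The two variants differ only by floating-point rounding, which is the "slight impact on accuracy" mentioned above.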
Here is the final code:
import numba as nb  # needed for nb.prange below

@njit((complex128, float64, float64[:]), cache=True, parallel=True)
def _mie_S1_S2_opt(m, x, mu):
    nstop = int(x + 4.05 * x**0.33333 + 2.0) + 1
    a = np.zeros(nstop - 1, dtype=np.complex128)
    b = np.zeros(nstop - 1, dtype=np.complex128)
    _mie_An_Bn(m, x, a, b)
    nangles = len(mu)
    S1 = np.zeros(nangles, dtype=np.complex128)
    S2 = np.zeros(nangles, dtype=np.complex128)
    # precompute the n-dependent factors once instead of once per angle
    factor1 = np.empty(nstop, dtype=np.float64)
    factor2 = np.empty(nstop, dtype=np.float64)
    factor3 = np.empty(nstop, dtype=np.float64)
    for n in range(1, nstop):
        factor1[n - 1] = (2 * n + 1) / (n + 1) / n
        factor2[n - 1] = (2 * n + 1) / n
        factor3[n - 1] = (n + 1) / n
    nstop = len(a)
    for k in nb.prange(nangles):
        pi_nm2 = 0
        pi_nm1 = 1
        for n in range(1, nstop):
            i = n - 1
            tau_nm1 = n * mu[k] * pi_nm1 - (n + 1.0) * pi_nm2
            S1[k] += factor1[i] * (pi_nm1 * a[i] + tau_nm1 * b[i])
            S2[k] += factor1[i] * (tau_nm1 * a[i] + pi_nm1 * b[i])
            temp = pi_nm1
            pi_nm1 = factor2[i] * mu[k] * pi_nm1 - factor3[i] * pi_nm2
            pi_nm2 = temp
    # calculate norm = sqrt(pi * Qext * x**2)
    n = np.arange(1, nstop + 1)
    norm = np.sqrt(2 * np.pi * np.sum((2 * n + 1) * (a.real + b.real)))
    S1 /= norm
    S2 /= norm
    return [S1, S2]

%timeit -n 1000 _mie_S1_S2_opt(m, x, mu)
On my machine with 6 cores, the final optimized implementation is 12 times faster with fastmath=True and 8.8 times faster without. Note that using similar strategies in the other functions may also help to speed them up.
Code:
from scipy.integrate import odeint
import numpy as np
import matplotlib.pyplot as plt
# parameters
S = 0.0001
M = 30.03
K = 113.6561
Vr = 58
R = 8.3145
T = 298.15
Q = 0.000133
Vp = 0.000022
Mr = 36
Pvap = 1400
wf = 0.001
tr = 1200
mass = 40000
# define t
time = 14400
t = np.arange(0, time + 1, 1)
# define initial state
Cv0 = (mass / Vp) * wf # Cv(0)
Cr0 = (mass / Vp) * (1 - wf)
Cair0 = 0 # Cair(0)
# define function and solve ode
def model(x, t):
    C = x[0] # C is Cair(t)
    c = x[1] # c is Cv(t)
    a = Q + (K * S / Vr)
    b = (K * S * M) / (Vr * R * T)
    s = (K * S * M) / (Vp * R * T)
    w = (1 - wf) * 1000
    Peq = (c * Pvap) / (c + w * c * M / Mr)
    Pair = (C * R * T) / M
    dcdt = -s * (Peq - Pair)
    if t <= tr:
        dCdt = -a * C + b * Peq
    else:
        dCdt = -a * C
    return [dCdt, dcdt]
x = odeint(model, [Cair0, Cv0], t)
C = x[:, 0]
c = x[:, 1]
Now I want to determine the value of wf when I know C(0) (at t = 0) and C(tr) (at t = tr), i.e. I know C(t) at two times.
I found some links related to this (Curve Fit Parameters in Multiple ODE Function; Solving ODE with Python reversely; https://medium.com/analytics-vidhya/coronavirus-in-italy-ode-model-an-parameter-optimization-forecast-with-python-c1769cf7a511; https://kitchingroup.cheme.cmu.edu/blog/2013/02/18/Fitting-a-numerical-ODE-solution-to-data/), but I cannot get the hang of the subject.
Can I find the parameter wf from the two data points (0, C(0)) and (tr, C(tr)) and the ODE?
First, ODE solvers assume smooth right-hand-side functions, so the if t <= tr: ... statement in your code isn't going to work reliably. Two separate integrations must be done to deal with the discontinuity: integrate up to tr, then use the solution at tr as the initial condition to integrate beyond tr with the new ODE function.
But it seems like your main problem (solving for wf) only involves integrating up to tr (not beyond), so we can ignore that issue when solving for wf.
Now, I want to figure out wf value when I know C(0)(when t is 0) and C(tr)(when t is tr)(Therefore I know two kind of t and C(t)).
You can do a non-linear solve for wf:
from scipy.integrate import odeint
import numpy as np
import matplotlib.pyplot as plt
# parameters
S = 0.0001
M = 30.03
K = 113.6561
Vr = 58
R = 8.3145
T = 298.15
Q = 0.000133
Vp = 0.000022
Mr = 36
Pvap = 1400
mass = 40000
# initial condition for wf
wf_initial = 0.02
# define t
tr = 1200
t_eval = np.array([0, tr], float)
# define initial state. This is C(t = 0)
Cv0 = (mass / Vp) * wf_initial # Cv(0)
Cair0 = 0 # Cair(0)
init_cond = np.array([Cair0, Cv0], float)
# Define the final state. This is C(t = tr)
final_state = 3.94926615e-03
# define function and solve ode
def model(x, t, wf):
    C = x[0] # C is Cair(t)
    c = x[1] # c is Cv(t)
    a = Q + (K * S / Vr)
    b = (K * S * M) / (Vr * R * T)
    s = (K * S * M) / (Vp * R * T)
    w = (1 - wf) * 1000
    Peq = (c * Pvap) / (c + w * c * M / Mr)
    Pair = (C * R * T) / M
    dcdt = -s * (Peq - Pair)
    dCdt = -a * C + b * Peq
    return [dCdt, dcdt]
# define non-linear system to solve
def function(x):
    wf = x[0]
    x = odeint(model, init_cond, t_eval, args = (wf,), rtol = 1e-10, atol = 1e-10)
    return x[-1,0] - final_state
from scipy.optimize import root
sol = root(function, np.array([wf_initial]), method='lm')
print(sol.success)
wf_solution = sol.x[0]
x = odeint(model, init_cond, t_eval, args = (wf_solution,), rtol = 1e-10, atol = 1e-10)
print(wf_solution)
print(x[-1])
print(final_state)
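For the discontinuity mentioned at the start of this answer, a two-stage integration could look like the sketch below. It uses a deliberately simple toy equation, dC/dt = -a*C + b before the switch time and dC/dt = -a*C after it, rather than the full model; the values of a, b and tr and the names rhs_before and rhs_after are illustrative assumptions, not the question's parameters.

```python
import numpy as np
from scipy.integrate import odeint

a, b, tr = 0.5, 2.0, 3.0   # illustrative values only


def rhs_before(C, t):
    return [-a * C[0] + b]  # source term active for t <= tr


def rhs_after(C, t):
    return [-a * C[0]]      # source term switched off for t > tr


t1 = np.linspace(0.0, tr, 61)
t2 = np.linspace(tr, 2 * tr, 61)

C1 = odeint(rhs_before, [0.0], t1)[:, 0]
C2 = odeint(rhs_after, [C1[-1]], t2)[:, 0]   # restart from the state at tr

# Stitch the two pieces together, dropping the duplicated point at tr.
t = np.concatenate([t1, t2[1:]])
C = np.concatenate([C1, C2[1:]])
print(C1[-1], C[-1])
```

Each call sees a smooth right-hand side, and the only coupling between the two stages is the state handed over at tr.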
I have a cost function f(r, Q), which is obtained in the code below. The cost function f(r, Q) is a function of two variables r and Q. I want to plot the values of the cost function for all values of r and Q in the range given below and also find the global minimum value of f(r, Q).
The range of r and Q are respectively :
0 < r < 5000
5000 < Q < 15000
The plot should be in r, Q and f(r,Q) axis.
Code for the cost function:
from numpy import sqrt, pi, exp
from scipy import optimize
from scipy.integrate import quad
import numpy as np
mean, std = 295, 250
l = 7
m = 30
p = 15
w = 7
K = 100
c = 5
h = 0.001 # per unit per day
# defining Cumulative distribution function
def cdf(x):
    cdf_eqn = lambda t: (1 / (std * sqrt(2 * pi))) * exp(-(((t - mean) ** 2) / (2 * std ** 2)))
    cdf = quad(cdf_eqn, -np.inf, x)[0]
    return cdf
# defining Probability density function
def pdf(x):
    return (1 / (std * sqrt(2 * pi))) * exp(-(((x - mean) ** 2) / (2 * std ** 2)))
# getting the equation in place
def G(r, Q):
    return K + c * Q \
        + w * (quad(cdf, 0, Q)[0] + quad(lambda x: cdf(r + Q - x) * cdf(x), 0, r)[0]) \
        + p * (mean * l - r + quad(cdf, 0, r)[0])
def CL(r, Q):
    return (Q - r + mean * l - quad(cdf, 0, Q)[0]
            - quad(lambda x: cdf(r + Q - x) * cdf(x), 0, r)[0]
            + quad(cdf, 0, r)[0]) / mean
def I(r, Q):
    return h * (Q + r - mean * l - quad(cdf, 0, Q)[0]
                - quad(lambda x: cdf(r + Q - x) * cdf(x), 0, r)[0]
                + quad(cdf, 0, r)[0]) / 2
def f(params):
    r, Q = params
    TC = G(r, Q)/CL(r, Q) + I(r, Q)
    return TC
How can I plot this function f(r, Q) in a 3D plot and also get the global minimum (or minima) together with the values of r and Q at that point?
Additionally, I already tried using scipy.optimize.minimize to minimize the cost function f(r, Q), but the result it returns is almost the same as the initial guess passed to optimize.minimize. Here is the code for minimizing the function:
initial_guess = [2500., 10000.]
result = optimize.minimize(f, initial_guess, bounds=[(1, 5000), (5000, 15000)], tol=1e-3)
print(result)
Output:
fun: 2712.7698818644253
hess_inv: <2x2 LbfgsInvHessProduct with dtype=float64>
jac: array([-0.01195986, -0.01273293])
message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
nfev: 6
nit: 1
status: 0
success: True
x: array([ 2500.01209628, 10000.0127784 ])
The output x: array([ 2500.01209628, 10000.0127784 ]) is almost the same as the initial guess provided, so I doubt it is the real answer. Am I doing anything wrong in the minimization, or is there another way to do it? This is why I want to plot the cost function and look around for myself.
It would be great to have an interactive plot to play around with.
My answer is concerned only with plotting, but at the end I'll comment on the issue of minima and maxima.
For what you need, a 3D surface plot is, imho, overkill; I'll instead show you the use of contourf and contour to get a good idea of what is going on with your function.
First, the code. Key points:
your code, as is, cannot be executed in a vector context, so I wrote an explicit loop to compute the values,
due to Matplotlib's design, the x axis of matrix data is associated with the columns; this has to be accounted for,
the results of contour and contourf must be saved because they are needed for the labels and the color bar, respectively,
no labels or legends because I don't know what you are doing.
That said, here is the code
import matplotlib.pyplot as plt
import numpy as np
from numpy import sqrt, pi, exp
from scipy.integrate import quad
mean, std = 295, 250
l, m, p = 7, 30, 15
w, K, c = 7, 100, 5
h = 0.001 # per unit per day
# defining Cumulative distribution function
def cdf(x):
    cdf_eqn = lambda t: (1 / (std * sqrt(2 * pi))) * exp(-(((t - mean) ** 2) / (2 * std ** 2)))
    cdf = quad(cdf_eqn, -np.inf, x)[0]
    return cdf
# defining Probability density function
def pdf(x):
    return (1 / (std * sqrt(2 * pi))) * exp(-(((x - mean) ** 2) / (2 * std ** 2)))
# getting the equation in place
def G(r, Q):
    return K + c * Q \
        + w * (quad(cdf, 0, Q)[0] + quad(lambda x: cdf(r + Q - x) * cdf(x), 0, r)[0]) \
        + p * (mean * l - r + quad(cdf, 0, r)[0])
def CL(r, Q):
    return (Q - r + mean * l - quad(cdf, 0, Q)[0]
            - quad(lambda x: cdf(r + Q - x) * cdf(x), 0, r)[0]
            + quad(cdf, 0, r)[0]) / mean
def I(r, Q):
    return h * (Q + r - mean * l - quad(cdf, 0, Q)[0]
                - quad(lambda x: cdf(r + Q - x) * cdf(x), 0, r)[0]
                + quad(cdf, 0, r)[0]) / 2
# pulling it all together
def f(r, Q):
    TC = G(r, Q)/CL(r, Q) + I(r, Q)
    return TC
nr, nQ = 6, 11
r = np.linspace(0, 5000, nr)
Q = np.linspace(5000, 15000, nQ)
z = np.zeros((nr, nQ)) # r ←→ y, Q ←→ x
for i, ir in enumerate(r):
    for j, jQ in enumerate(Q):
        z[i, j] = f(ir, jQ)
    print('%2d: '%i, ','.join('%8.3f'%v for v in z[i]))
fig, ax = plt.subplots()
cf = plt.contourf(Q, r, z)
cc = plt.contour( Q, r, z, colors='k')
plt.clabel(cc)
plt.colorbar(cf, orientation='horizontal')
ax.set_aspect(1)
plt.show()
and here the results of its execution
$ python cost.py
0: 4093.654,3661.777,3363.220,3120.073,2939.119,2794.255,2675.692,2576.880,2493.283,2426.111,2359.601
1: 4072.865,3621.468,3315.193,3068.710,2887.306,2743.229,2626.065,2528.934,2447.123,2381.802,2316.991
2: 4073.852,3622.443,3316.163,3069.679,2888.275,2744.198,2627.035,2529.905,2448.095,2382.775,2317.965
3: 4015.328,3514.874,3191.722,2939.397,2758.876,2618.292,2505.746,2413.632,2336.870,2276.570,2216.304
4: 3881.198,3290.628,2947.273,2694.213,2522.845,2394.095,2293.867,2213.651,2148.026,2098.173,2047.140
5: 3616.675,2919.726,2581.890,2352.015,2208.814,2106.289,2029.319,1969.438,1921.555,1887.398,1849.850
$
I can add that global minimum and global maximum are in the corners, while there are two sub-horizontal lines of local minima (lower line) and local maxima (upper line) in the approximate regions r ≈ 1000 and r ≈ 2000.
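The statement about the corners can be checked directly on the grid. The sketch below copies the printed values into an array (rows correspond to r, columns to Q, as in the loop above) and locates the smallest and largest entries with np.argmin and np.unravel_index. It only finds extrema on this coarse 6 x 11 grid, so it is a sanity check rather than a proper global optimization.

```python
import numpy as np

r = np.linspace(0, 5000, 6)
Q = np.linspace(5000, 15000, 11)

# Values copied from the printed output above (rows: r, columns: Q).
z = np.array([
    [4093.654, 3661.777, 3363.220, 3120.073, 2939.119, 2794.255, 2675.692, 2576.880, 2493.283, 2426.111, 2359.601],
    [4072.865, 3621.468, 3315.193, 3068.710, 2887.306, 2743.229, 2626.065, 2528.934, 2447.123, 2381.802, 2316.991],
    [4073.852, 3622.443, 3316.163, 3069.679, 2888.275, 2744.198, 2627.035, 2529.905, 2448.095, 2382.775, 2317.965],
    [4015.328, 3514.874, 3191.722, 2939.397, 2758.876, 2618.292, 2505.746, 2413.632, 2336.870, 2276.570, 2216.304],
    [3881.198, 3290.628, 2947.273, 2694.213, 2522.845, 2394.095, 2293.867, 2213.651, 2148.026, 2098.173, 2047.140],
    [3616.675, 2919.726, 2581.890, 2352.015, 2208.814, 2106.289, 2029.319, 1969.438, 1921.555, 1887.398, 1849.850],
])

# Convert the flat argmin/argmax index back into (row, column) indices.
i_min, j_min = np.unravel_index(np.argmin(z), z.shape)
i_max, j_max = np.unravel_index(np.argmax(z), z.shape)

print("min", z[i_min, j_min], "at r =", r[i_min], "Q =", Q[j_min])
print("max", z[i_max, j_max], "at r =", r[i_max], "Q =", Q[j_max])
```

On this grid the minimum sits at r = 5000, Q = 15000 and the maximum at r = 0, Q = 5000, i.e. in opposite corners of the search box.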
import math
import numpy as np
S0 = 100.; K = 100.; T = 1.0; r = 0.05; sigma = 0.2
M = 100; dt = T / M; I = 500000
S = np.zeros((M + 1, I))
S[0] = S0
for t in range(1, M + 1):
    z = np.random.standard_normal(I)
    S[t] = S[t - 1] * np.exp((r - 0.5 * sigma ** 2) * dt + sigma *
                             math.sqrt(dt) * z)
C0 = math.exp(-r * T) * np.sum(np.maximum(S[-1] - K, 0)) / I
print ("European Option Value is ", C0)
It gives a value of around 10.45 as you increase the number of simulations, but using the B-S formula the value should be around 10.09. Anybody know why the code isn't giving a number closer to the formula?
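One way to cross-check the reference value is to evaluate the Black-Scholes closed form for a European call with the same parameters as the simulation. The sketch below uses only the standard library; the helper names norm_cdf and bs_call are hypothetical. With S0 = 100, K = 100, T = 1, r = 0.05 and sigma = 0.2 it evaluates to about 10.45, which agrees with the Monte Carlo estimate, so the 10.09 figure may have come from different inputs (an assumption, since the calculation behind it is not shown).

```python
import math


def norm_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S0, K, T, r, sigma):
    """Black-Scholes price of a European call without dividends."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


price = bs_call(100.0, 100.0, 1.0, 0.05, 0.2)
print(round(price, 4))  # 10.4506
```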
I'm writing a Python program that approximates a time series by sine waves.
The program uses the DFT to find the sine waves, then chooses those with the biggest amplitudes.
Here's my code:
__author__ = 'FATVVS'
import math
# Wave - (amplitude,frequency,phase)
# This class was created to sort sin waves:
# - by amplitude (set freq_sort=False)
# - by frequency (set freq_sort=True)
class Wave:
    #flag for choosing sort mode:
    # False-sort by amplitude
    # True-by frequency
    freq_sort = False
    def __init__(self, amp, freq, phase):
        self.freq = freq #frequency
        self.amp = amp #amplitude
        self.phase = phase
    def __lt__(self, other):
        if self.freq_sort:
            return self.freq < other.freq
        else:
            return self.amp < other.amp
    def __gt__(self, other):
        if self.freq_sort:
            return self.freq > other.freq
        else:
            return self.amp > other.amp
    def __le__(self, other):
        if self.freq_sort:
            return self.freq <= other.freq
        else:
            return self.amp <= other.amp
    def __ge__(self, other):
        if self.freq_sort:
            return self.freq >= other.freq
        else:
            return self.amp >= other.amp
    def __str__(self):
        s = "(amp=" + str(self.amp) + ",frq=" + str(self.freq) + ",phase=" + str(self.phase) + ")"
        return s
    def __repr__(self):
        return self.__str__()
#Discrete Fourier Transform
def dft(series: list):
    n = len(series)
    m = int(n / 2)
    real = [0 for _ in range(n)]
    imag = [0 for _ in range(n)]
    amplitude = []
    phase = []
    angle_const = 2 * math.pi / n
    for w in range(m):
        a = w * angle_const
        for t in range(n):
            real[w] += series[t] * math.cos(a * t)
            imag[w] += series[t] * math.sin(a * t)
        amplitude.append(math.sqrt(real[w] * real[w] + imag[w] * imag[w]) / n)
        phase.append(math.atan(imag[w] / real[w]))
    return amplitude, phase
#extract waves from time series
# series - time series
# num - number of waves
def get_waves(series: list, num):
    amp, phase = dft(series)
    m = len(amp)
    waves = []
    for i in range(m):
        waves.append(Wave(amp[i], 2 * math.pi * i / m, phase[i]))
    waves.sort()
    waves.reverse()
    waves = waves[0:num]#extract best waves
    print("the program found the next %s sin waves:"%(num))
    print(waves)#print best waves
    return waves
#approximation by sin waves
#series - time series
#num- number of sin waves
def sin_waves_appr(series: list, num):
    n = len(series)
    freq = get_waves(series, num)
    m = len(freq)
    model = []
    for i in range(n):
        summ = 0
        for j in range(m): #sum by sin waves
            summ += freq[j].amp * math.sin(freq[j].freq * i + freq[j].phase)
        model.append(summ)
    return model
if __name__ == '__main__':
    import matplotlib.pyplot as plt
    N = 500 # length of time series
    num = 2 # number of sin waves that we want to find
    #y - generate time series
    y = [2 * math.sin(0.05 * t + 0.5) + 0.5 * math.sin(0.2 * t + 1.5) for t in range(N)]
    model = sin_waves_appr(y, num) #generate approximation model
    ## ------------------plotting-----------------
    plt.figure(1)
    # plotting of time series and its approximation model
    plt.subplot(211)
    h_signal, = plt.plot(y, label='source timeseries')
    h_model, = plt.plot(model, label='model', linestyle='--')
    plt.legend(handles=[h_signal, h_model])
    plt.grid()
    # plotting of spectrum
    amp, _ = dft(y)
    xaxis = [2*math.pi*i / N for i in range(len(amp))]
    plt.subplot(212)
    h_freq, = plt.plot(xaxis, amp, label='spectre')
    plt.legend(handles=[h_freq])
    plt.grid()
    plt.show()
But I've got a strange result:
In the program I've created a time series from two sine waves:
y = [2 * math.sin(0.05 * t + 0.5) + 0.5 * math.sin(0.2 * t + 1.5) for t in range(N)]
And my program found wrong parameters for the sine waves:
the program found the next 2 sin waves:
[(amp=0.9998029885151699,frq=0.10053096491487339,phase=1.1411803525843616), (amp=0.24800925225626422,frq=0.40212385965949354,phase=0.346757128184013)]
I suppose that my problem is wrong scaling of the wave parameters, but I'm not sure.
There are two places where the program does scaling. The first is the creation of the waves:
for i in range(m):
    waves.append(Wave(amp[i], 2 * math.pi * i / m, phase[i]))
And the second is the scaling of the x-axis:
xaxis = [2*math.pi*i / N for i in range(len(amp))]
But my supposition may be wrong. I've tried changing the scaling many times, and it hasn't solved my problem.
What may be wrong with the code?
So, these lines I believe are wrong:
    for t in range(n):
        real[w] += series[t] * math.cos(a * t)
        imag[w] += series[t] * math.sin(a * t)
    amplitude.append(math.sqrt(real[w] * real[w] + imag[w] * imag[w]) / n)
    phase.append(math.atan(imag[w] / real[w]))
I believe it should divide by m instead of n, since you are only computing half the points. That will fix the amplitude problem. Also, the computation of imag[w] is missing a negative sign. Taking into account the atan2 fix, it would look like:
    for t in range(n):
        real[w] += series[t] * math.cos(a * t)
        imag[w] += -1 * series[t] * math.sin(a * t)
    amplitude.append(math.sqrt(real[w] * real[w] + imag[w] * imag[w]) / m)
    phase.append(math.atan2(imag[w], real[w]))
The next one is here:
    for i in range(m):
        waves.append(Wave(amp[i], 2 * math.pi * i / m, phase[i]))
The division by m is not right. amp has only half the points it should, so using the length of amp isn't right here. It should be:
    for i in range(m):
        waves.append(Wave(amp[i], 2 * math.pi * i / (m * 2), phase[i]))
Finally, your model reconstruction has a problem:
        for j in range(m): #sum by sin waves
            summ += freq[j].amp * math.sin(freq[j].freq * i + freq[j].phase)
It should use cosine instead (sine introduces a phase shift):
        for j in range(m): #sum by cos waves
            summ += freq[j].amp * math.cos(freq[j].freq * i + freq[j].phase)
When I fix all of that, the model and the DFT both make sense.
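The corrected conventions can be sanity-checked on a tone whose parameters are known in advance. The sketch below is a minimal, standalone version of one DFT bin (the helper name dft_bin is hypothetical) using the negative sign on the imaginary part, division by m = n / 2, and atan2. The test signal is an assumption chosen to sit exactly on bin 5, so amplitude and phase should come back almost exactly.

```python
import math


def dft_bin(series, w):
    """Amplitude and phase of bin w, using the corrected conventions."""
    n = len(series)
    m = n // 2
    a = 2 * math.pi * w / n
    real = sum(series[t] * math.cos(a * t) for t in range(n))
    imag = sum(-series[t] * math.sin(a * t) for t in range(n))
    return math.sqrt(real * real + imag * imag) / m, math.atan2(imag, real)


N = 100
w = 5  # the test tone sits exactly on bin 5
y = [2.0 * math.cos(2 * math.pi * w * t / N + 0.5) for t in range(N)]
amp, phase = dft_bin(y, w)
print(amp, phase)  # close to 2.0 and 0.5
```

In the question's signal the frequencies 0.05 and 0.2 rad/sample do not fall exactly on a bin for N = 500, so there the recovered amplitudes and phases can only be approximate even with the fixes applied.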