How do you recreate sigma notation using Numpy? - python

I need to plot the following function using Python, numpy and matplotlib:
for the values of N = 5, 20 and 60.
I've created a list of odd numbers using:
def odd(n):
    nums = []
    for i in range(1, 2*n, 2):
        nums.append(i)
    return nums
But I don't know how to use this in a sigma function because I need to vary my x values and sum over the function for the range of odd(n).

If you want to plot (i.e. visualise) the function for some N, then the procedure is as follows:
Generate an array of x values. In this case, ranging from -pi to pi makes most sense.
Write a loop that computes one sin() at a time, and sum the result in a different array, which we call Psi.
Finally multiply the Psi by the constant 2/(N+1).
Plot the result
import numpy as np
import matplotlib.pyplot as plt
# x is 100 equally spaced points from -pi to pi, inclusive
x = np.linspace(-np.pi, np.pi, 100)
Psi = 0*x # now Psi is an array of zeros
N = 60
# second input of range is N+1 since our index n satisfies 1 <= n < N+1
# third input makes n increment by 2 each loop instead of the default 1
for n in range(1, N+1, 2):
    # parentheses matter: -1**k is parsed as -(1**k), which is always -1
    Psi += (-1)**((n-1)//2) * np.sin(n*x)
Psi *= 2/(N+1)
plt.plot(x, Psi)

Code without pure Python loops:
def Psi(x, N=7):
    """Note: N should be odd, and x a scalar (np.sum adds up every element)."""
    _s = np.arange(1, int((N + 1) / 2) + 1)
    return 2 * np.sum(np.where(_s % 2, 1, -1) * np.sin((2 * _s - 1) * x)) / (N + 1)

This code is without loops and should work for any value of x and N.
x must be an array or list with more than 1 element
import numpy as np
from numpy import matlib
import matplotlib.pyplot as plt
def psi(x,N):
    n=np.arange(0,N,2)+1
    sigma = matlib.repmat((-1)**((n-1)/2),len(x),1).T*np.sin(matlib.repmat(n,len(x),1).T*x)
    PSI = (2/(N+1))*np.sum(sigma,axis=0)
    return PSI
x=np.linspace(0,2*np.pi,50)
N=5
y = psi(x,N)
plt.plot(x, y)
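The same sum can also be sketched with plain NumPy broadcasting instead of repmat. This assumes the function being plotted is Psi(x) = 2/(N+1) * sum over odd n <= N of (-1)^((n-1)/2) * sin(n*x), as read off the answers above (the original formula image is not available); the name psi_broadcast is hypothetical.

```python
import numpy as np

def psi_broadcast(x, N):
    """Evaluate the assumed partial sum at every point of x (hypothetical helper)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = np.arange(1, N + 1, 2)                 # odd indices 1, 3, 5, ... up to N
    signs = (-1.0) ** ((n - 1) // 2)           # +1, -1, +1, ...
    # n becomes a column and x a row, so the product has shape (len(n), len(x))
    terms = signs[:, None] * np.sin(n[:, None] * x[None, :])
    return 2.0 / (N + 1) * terms.sum(axis=0)   # sum over n, one value per x

x = np.linspace(-np.pi, np.pi, 100)
curves = {N: psi_broadcast(x, N) for N in (5, 20, 60)}
```

Each entry of curves can then be passed to plt.plot(x, curves[N], label=f'N={N}') to draw the three requested values of N on one figure.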

Related

How to plot curve with given polynomial coefficients?

using Python I have an array with coefficients from a polynomial, let's say
polynomial = [1,2,3,4]
which means the equation:
y = 4x³ + 3x² + 2x + 1
(so the array is in reversed order)
Now how do I plot this into a visual curve in the Jupyter Notebook?
There was a similar question:
Plotting polynomial with given coefficients
but I didn't understand the answer (like what is a and b?).
And what do I need to import to make this happen?
First, you have to decide the limits for x in your plot. Let's say x goes from -2 to 2. Let's also ask for a hundred points on our curve (this can be any sufficiently large number for your interval so that you get a smooth-looking curve)
Let's create that array:
lower_limit = -2
upper_limit = 2
num_pts = 100
x = np.linspace(lower_limit, upper_limit, num_pts)
Now, let's evaluate y at each of these points. Numpy has a handy polyval() that'll do this for us. Remember that it wants the coefficients ordered by highest exponent to lowest, so you'll have to reverse the polynomial list
poly_coefs = polynomial[::-1] # [4, 3, 2, 1]
y = np.polyval(poly_coefs, x)
Finally, let's plot everything:
plt.plot(x, y, '-r')
You'll need the following imports:
import numpy as np
from matplotlib import pyplot as plt
If you don't want to import numpy, you can also write vanilla python methods to do the same thing:
def linspace(start, end, num_pts):
    step = (end - start) / (num_pts - 1)
    return [start + step * i for i in range(num_pts)]
def polyval(coefs, xvals):
    yvals = []
    for x in xvals:
        y = 0
        for power, c in enumerate(reversed(coefs)):
            y += c * (x ** power)
        yvals.append(y)
    return yvals
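As a possible alternative, NumPy's newer polynomial module takes coefficients lowest power first, which matches the order of the list in the question, so no reversal is needed. A minimal sketch, reusing the question's polynomial list and the same x range as above:

```python
import numpy as np
from numpy.polynomial import polynomial as P

polynomial = [1, 2, 3, 4]              # 1 + 2x + 3x^2 + 4x^3, lowest power first
x = np.linspace(-2, 2, 100)

# numpy.polynomial.polynomial.polyval takes (x, coefficients) in ascending order
y = P.polyval(x, polynomial)

# the legacy np.polyval takes (coefficients, x) in descending order
y_legacy = np.polyval(polynomial[::-1], x)
```

Both arrays hold the same values; note that the argument order differs between the two functions.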

The normalized cross-correlation of two signals in python

I want to calculate the normalized cross-correlation function of two signals, where the x axis is the time delay and the y axis is the correlation value between -1 and 1, so I decided to use scipy.
I use the command corr = signal.correlate(s1['Strain'], s2['Strain'], mode='full')
where s1['Strain'] and s2['Strain'] are the pandas dataframe values but it doesn't return the normalized function with "x" axes as time delay.
Here is example data
s1:
Strain
0 -1.587702e-22
1 -1.425868e-22
2 -1.174897e-22
3 -8.559119e-23
4 -4.949480e-23
. .
. .
. .
s2 looks similar. I know the sampling rate of both datasets: it is 4096 kHz.
Thanks for your help.
First of all, to get the normalized coefficient (such that at lag 0 we get the Pearson correlation, assuming both signals have zero mean; otherwise subtract the mean first):
divide both signals by their standard deviation
scale by the length of the signal over which the convolution is done (shortest signal)
out = correlate(x/np.std(x), y/np.std(y), 'full') / min(len(x), len(y))
Now for the lags, from the official documentation of correlate one can read that the full output of cross-correlation is given by:
$$z[k] = (x * y)(k - N + 1) = \sum_{l=0}^{\|x\|-1} x_l \, y_{l-k+N-1}^{*}$$
where (x * y) denotes the cross-correlation, the superscript * denotes complex conjugation, and k goes from 0 to ||x|| + ||y|| - 2. N is max(len(x), len(y)).
The lags are the argument of (x * y) above, so they range from 0 - N + 1 to ||x|| + ||y|| - 2 - N + 1, which is n - 1 with n = min(len(x), len(y)).
Also, from briefly looking at the source code, I think they sometimes swap x and y if convenient (hence the min(len(x), len(y)) in the normalisation above). However, this changes the start of the lags, therefore:
N = max(len(x), len(y))
n = min(len(x), len(y))
if len(x) < len(y):
    lags = np.arange(-N + 1, n)
else:
    lags = np.arange(-n + 1, N)
Summary
Try this code on two time series whose cross-correlation you want to plot:
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import correlate
def plot_xcorr(x, y):
    "Plot cross-correlation (full) between two signals."
    N = max(len(x), len(y))
    n = min(len(x), len(y))
    if N == len(y):
        lags = np.arange(-N + 1, n)
    else:
        lags = np.arange(-n + 1, N)
    c = correlate(x / np.std(x), y / np.std(y), 'full')
    plt.plot(lags, c / n)
    plt.show()
To calculate the time delay between two signals, we need to find the cross-correlation between two signals and find the argmax.
Assuming data_1 and data_2 are samples of two signals:
import numpy as np
import pandas as pd
correlation = np.correlate(data_1, data_2, mode='same')
delay = np.argmax(correlation) - int(len(correlation)/2)
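The two answers above can be combined into one small sketch that returns both the normalized correlation and the matching delays. It uses only NumPy and assumes two equal-length 1-D signals. The function name normalized_xcorr and the fs parameter (sampling rate in Hz) are hypothetical, and the means are subtracted so that the value at lag 0 equals the Pearson coefficient.

```python
import numpy as np

def normalized_xcorr(x, y, fs=1.0):
    """Return (delays, corr) for two equal-length 1-D signals.

    fs is an assumed sampling rate in Hz; delays are in seconds (lags / fs).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("expected two 1-D signals of equal length")
    n = len(x)
    xs = (x - x.mean()) / x.std()              # zero mean, unit variance
    ys = (y - y.mean()) / y.std()
    corr = np.correlate(xs, ys, mode="full") / n
    lags = np.arange(-(n - 1), n)              # lag of each entry of corr, in samples
    return lags / fs, corr

# synthetic check: x is y delayed by 5 samples (circularly, for simplicity)
rng = np.random.default_rng(0)
y = rng.normal(size=200)
x = np.roll(y, 5)
delays, corr = normalized_xcorr(x, y, fs=1.0)
best_delay = delays[np.argmax(corr)]           # positive: x lags behind y
```

With real data, fs would be set to the actual sampling rate so that the delay axis is in seconds rather than samples.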

Implementing Bates distribution

I've been trying to plot the Bates distribution curve. The Bates distribution is the distribution of the mean of n independent standard uniform variates (from 0 to 1).
(I worked on the interval [-1;1], using a simple change of variable.)
The curve becomes unstable beyond a certain value of n, which prevents me from moving forward.
To treat the variable x as continuous, I sampled the interval with 10**6 points. Here are some examples for different n:
But for n greater than 29, the curve diverges, and the greater n, the closer the deformation caused by the divergence is to the (mean) center of the curve:
The Bates distribution of probability is defined as follows:
My code:
import numpy as np
from math import factorial as fac
samples=10**6
def combinaison(n,k): # combination of K out of N
    cnk=fac(n)/(fac(k)*fac(abs(n-k))) # fac is the factorial
    return cnk
def dens_probas(a,b,n):
    x=np.linspace(a, b, num=samples)
    y=(x-a)/(b-a)
    F=list()
    for i in range(0,len(y)):
        g=0
        for k in range(0,int(n*y[i]+1)):
            g=g+pow(-1,k)*combinaison(n,k)*pow(y[i]-k/n,n-1)
        d=(n**n/fac(n-1))*g
        F.append(d)
    return F
Any idea to correct the divergence for larger n?
The main problem is that the formula with alternating sums is extremely prone to numerical accuracy issues.
One trick to avoid the problems on the right side, is to assume the distribution is symmetric and only calculate half of it.
A straightforward accuracy optimization is to replace the factorials in the formula for combinaison by a call to scipy.special.comb. This avoids dividing very large numbers.
A smaller accuracy optimization is to calculate g for even and odd k together. At first sight the formula cannot be reduced much further, so this amounts to replacing:
for k in range(0, int(floor(n * y[i] + 1))):
    g += pow(-1, k) * combinaison(n, k) * pow(y[i] - k / n, n - 1)
with:
last_k = int(floor(n * y[i]))
for k in range(0, last_k + 1, 2): # note that k increments in steps of 2
    if k == last_k:
        g += combinaison(n, k) * (pow(y[i] - k / n, n - 1))
    else:
        g += combinaison(n, k) * (pow(y[i] - k / n, n - 1) - pow(y[i] - (k + 1)/ n, n - 1) * (n - k) / (k + 1))
Some other remarks:
The variable samples only sets the number of divisions along the x-axis. A much smaller number will suffice. (In the code below I renamed the variable to xaxis_steps.)
Using append for F will be extremely slow. It is better to create a numpy array of the correct size and then fill it in. (This also makes copying the halves easier.)
from matplotlib import pyplot as plt
import numpy as np
from scipy.special import comb
from math import factorial as fac
from math import floor
xaxis_steps = 500
def combinaison(n, k): # combination of K out of N
    return comb(n, k)
def dens_probas(a, b, n):
    x = np.linspace(a, b, num=xaxis_steps)
    y = (x - a) / (b - a)
    F = np.zeros_like(y)
    for i in range(0, (len(y)+1) // 2):
        g = 0
        for k in range(0, int(floor(n * y[i] + 1))):
            g += pow(-1, k) * combinaison(n, k) * pow(y[i] - k / n, n - 1)
        F[i] = (n ** n / fac(n - 1)) * g
        F[-i-1] = F[i] # symmetric graph
    plt.plot(x, F, label=f'n={n}')
    return F
for n in (5, 30, 50, 80, 90):
    dens_probas(-1, 1, n)
plt.legend()
plt.show()
All these optimizations together move the accuracy problem from n=30 to around n=80:
A completely different approach would be to generate a lot of uniform samples and take the means. From those samples a kde plot can be generated. The smoothness of such curves depends on the number of samples. A kde can be plotted directly via seaborn's kdeplot. You can also separately calculate the kde function, then apply it to a given x range and plot it via standard matplotlib.
import numpy as np
from matplotlib import pyplot as plt
from scipy.stats import gaussian_kde
num_samples = 10 ** 5
def dens_probas(a, b, n):
    samples = np.random.uniform(a, b, size=(num_samples, n)).mean(axis=1)
    samples = np.hstack([samples, a + b - samples]) # force symmetry; this is not strictly necessary
    return gaussian_kde(samples)
for n in (5, 30, 50, 80, 90, 200):
    kde = dens_probas(-1, 1, n)
    xs = np.linspace(-1, 1, 1000)
    F = kde(xs)
    plt.plot(xs, F, label=f'n={n}')
plt.legend()
plt.show()
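Since the trouble comes from floating-point cancellation in the alternating sum, another option that may be worth trying is to evaluate the sum in exact rational arithmetic and convert to float only at the end. The sketch below uses only the standard library. It assumes the density on [0, 1] is f(y) = n^n/(n-1)! * sum_{k=0}^{floor(n*y)} (-1)^k C(n,k) (y - k/n)^(n-1), which is the expression implemented in the question's code, and n >= 2; the name bates_pdf_exact is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def bates_pdf_exact(y, n):
    """Density of the mean of n standard uniforms at y, summed exactly (n >= 2)."""
    y = Fraction(y)                      # exact value of the float or fraction given
    if y < 0 or y > 1:
        return 0.0
    k_max = int(n * y)                   # floor, since n * y is non-negative
    total = Fraction(0)
    for k in range(k_max + 1):
        # every term is an exact rational, so the alternating signs cancel exactly
        total += (-1) ** k * comb(n, k) * (y - Fraction(k, n)) ** (n - 1)
    return float(Fraction(n ** n, factorial(n - 1)) * total)

peak_100 = bates_pdf_exact(Fraction(1, 2), 100)   # well beyond the n where floats break down
```

This is much slower per point than the floating-point version, so it is only practical with a modest number of x values (a few hundred points, as suggested above). To use it on [a, b], map x to y = (x - a) / (b - a) and divide the result by (b - a).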

Implementation of a threshold detection function in Python

I want to implement following trigger function in Python:
Input:
time vector t [n dimensional numpy vector]
data vector y [n dimensional numpy vector] (values correspond to t vector)
threshold tr [float]
Threshold type vector tr_type [m dimensional list of int values]
Output:
Threshold time vector tr_time [m dimensional list of float values]
Function:
I would like to return tr_time, which consists of the exact time values (preferably interpolated, which the code below does not do yet) at which y crosses tr (crossing means going from less than to greater than, or the other way around). The values in tr_time correspond to the tr_type vector: each element of tr_type gives the number of the crossing and whether it is an upgoing or a downgoing crossing. For example, 1 means the first time y goes from less than tr to greater than tr, and -3 means the third time y goes from greater than tr to less than tr (third time means along the time vector t).
For the moment I have the following code:
import numpy as np
import matplotlib.pyplot as plt
def trigger(t, y, tr, tr_type):
    triggermarker = np.diff(1 * (y > tr))
    positiveindices = [i for i, x in enumerate(triggermarker) if x == 1]
    negativeindices = [i for i, x in enumerate(triggermarker) if x == -1]
    triggertime = []
    for i in tr_type:
        if i >= 0:
            triggertime.append(t[positiveindices[i - 1]])
        elif i < 0:
            # -i is the crossing number, so -2 selects the second downgoing crossing
            triggertime.append(t[negativeindices[-i - 1]])
    return triggertime
t = np.linspace(0, 20, 1000)
y = np.sin(t)
tr = 0.5
tr_type = [1, 2, -2]
print(trigger(t, y, tr, tr_type))
plt.plot(t, y)
plt.grid()
Now I'm pretty new to Python so I was wondering if there is a more Pythonic and more efficient way to implement this. For example without for loops or without the need to write separate code for upgoing or downgoing crossings.
You can use two masks: the first separates the value below and above the threshold, the second uses np.diff on the first mask: if the i and i+1 value are both below or above the threshold, np.diff yields 0:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(0, 8 * np.pi, 400)
y = np.sin(t)
th = 0.5
mask = np.diff(1 * (y > th)) != 0
plt.plot(t, y, 'bx', markersize=3)
plt.plot(t[:-1][mask], y[:-1][mask], 'go', markersize=8)
Using the slice [:-1] yields the index "immediately before" the threshold crossing (you can see that in the chart). If you want the index "immediately after", use [1:] instead of [:-1].
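Building on the mask idea, here is a hedged sketch of the full trigger function from the question, including the linear interpolation the asker mentioned. The helper name crossing_times is hypothetical. A crossing is assumed to happen between sample i and i+1 whenever y > tr changes, and the crossing time is estimated on the straight line between those two samples.

```python
import numpy as np

def crossing_times(t, y, tr):
    """Return interpolated crossing times and their directions (+1 up, -1 down)."""
    change = np.diff((y > tr).astype(int))     # +1 upgoing, -1 downgoing, 0 otherwise
    idx = np.flatnonzero(change)               # index of the sample just before each crossing
    # fraction of the way from sample idx to idx + 1 at which the line hits tr
    frac = (tr - y[idx]) / (y[idx + 1] - y[idx])
    return t[idx] + frac * (t[idx + 1] - t[idx]), change[idx]

def trigger(t, y, tr, tr_type):
    """tr_type entries: k > 0 is the k-th upgoing crossing, k < 0 the |k|-th downgoing."""
    times, direction = crossing_times(t, y, tr)
    up, down = times[direction == 1], times[direction == -1]
    if any(k == 0 for k in tr_type):
        raise ValueError("crossing numbers start at 1 or -1")
    return [up[k - 1] if k > 0 else down[-k - 1] for k in tr_type]

t = np.linspace(0, 20, 2001)
y = np.sin(t)
result = trigger(t, y, 0.5, [1, 2, -2])
```

For this sine example the three values should be close to pi/6, pi/6 + 2*pi and 5*pi/6 + 2*pi. An IndexError is raised if a requested crossing does not exist in the data.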

2d sum using an array - Python

I'm trying to sum a two-dimensional function using arrays, because my for loop does not output the correct answer. I want to find (in latex) $$\sum_{i=1}^{M}\sum_{j=1}^{M_2}\cos(i)\cos(j)$$ where according to Mathematica the answer when M=5 is 1.52725. According to the for loop:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
            return s1
print(f(4))
is 0.291927.
I have thus been trying to use some code of the form:
def f1(N):
    mat3=np.zeros((N,N),complex)  # np.complex was removed from recent NumPy; the builtin complex is equivalent
    for i in range(0,len(mat3)):
        for j in range(0,len(mat3)):
            mat3[i][j]=np.cos(i+1)*np.cos(j+1)
            return sum(mat3)
which again
print(f1(4))
outputs 0.291927. Looking at the array we should find for each value of i and j a matrix of the form
mat3=[[np.cos(1)*np.cos(1),np.cos(2)*np.cos(1),...],[np.cos(2)*np.cos(1),...]...[np.cos(N+1)*np.cos(N+1)]]
so for N=4 we should have
mat3=[[np.cos(1)*np.cos(1) np.cos(2)*np.cos(1) ...] [np.cos(2)*np.cos(1) ...]...[... np.cos(5)*np.cos(5)]]
but what I actually get is the following
mat3=[[0.29192658+0.j 0.+0.j 0.+0.j ... 0.+0.j] ... [... 0.+0.j]]
or a matrix of all zeros apart from the mat3[0][0] element.
Does anybody know a correct way to do this and get the correct answer? I chose this as an example because the problem I'm trying to solve involves plotting a function which has been summed over two indices and the function that python outputs is not the same as Mathematica (i.e., a function of the form $$f(E)=\sum_{i=1}^{M}\sum_{j=1}^{M_2}F(i,j,E)$$).
The return statement is not indented correctly in your sample code: it returns immediately in the first loop iteration. Indent it at the level of the function body instead, so that both for loops finish:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
    return s1
>>> print(f(5))
1.527247272700347
I have moved your code to a more numpy-ish version:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
x = x.reshape((-1, 1))
y = y.reshape((1, -1))
mat = np.cos(x) * np.cos(y)
print(mat.sum()) # 1.5272472727003474
The trick here is to reshape x to a column and y to a row vector. If you multiply them, they are matched up like in your loop.
This should be faster, since cos() is evaluated on only 2*N values instead of 2*N*N, and it avoids Python-level loops, which are slow.
UPDATE (regarding your comment):
This pattern extends to any number of dimensions. Basically, you get something like an outer product, where every instance of x is matched up with every instance of y, z, u, k, ... along the corresponding dimensions.
It's a bit confusing to describe, so here is some more code:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
z = np.arange(N) + 1
x = x.reshape((-1, 1, 1))
y = y.reshape((1, -1, 1))
z = z.reshape((1, 1, -1))
mat = z**2 * np.cos(x) * np.cos(y)
# x along first axis
# y along second, z along third
# mat[0, 0, 0] == 1**2 * np.cos(1) * np.cos(1)
# mat[0, 4, 2] == 3**2 * np.cos(1) * np.cos(5)
If you use this for many dimensions, and big values for N, you will run into memory problems, though.
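Two further points may help, shown as a rough sketch below. First, when the summand is a product of a function of i and a function of j, as with cos(i)*cos(j), the double sum factorizes into a product of two 1-D sums, so no 2-D array is needed. Second, for a general summand F(i, j, E) the broadcasting pattern can carry E along a third axis. The function F below is a made-up placeholder, not the asker's actual function.

```python
import numpy as np

M = 5
i = np.arange(1, M + 1)
j = np.arange(1, M + 1)

# separable summand: sum_i sum_j cos(i) cos(j) == (sum_i cos(i)) * (sum_j cos(j))
separable = np.cos(i).sum() * np.cos(j).sum()

def F(i, j, E):
    """Hypothetical non-separable summand, used for illustration only."""
    return np.cos(i * j * E)

def f(E, M=5, M2=5):
    """Sum F over i = 1..M and j = 1..M2 for every value in E."""
    i = np.arange(1, M + 1)[:, None, None]     # i along axis 0
    j = np.arange(1, M2 + 1)[None, :, None]    # j along axis 1
    E = np.atleast_1d(E)[None, None, :]        # E along axis 2
    return F(i, j, E).sum(axis=(0, 1))         # one summed value per E

values = f(np.linspace(0.0, 1.0, 11))
```

The intermediate array in f has M * M2 * len(E) elements, so the memory caveat above applies here as well.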
