I have a spectrum that I want to subtract a baseline from. The spectrum data are:
1.484043000000000001e+00 1.121043091000000004e+03
1.472555999999999976e+00 1.140899658000000045e+03
1.461239999999999872e+00 1.135047851999999921e+03
1.450093000000000076e+00 1.153286499000000049e+03
1.439112000000000169e+00 1.158624877999999853e+03
1.428292000000000117e+00 1.249718872000000147e+03
1.417629999999999946e+00 1.491854857999999922e+03
1.407121999999999984e+00 2.524922362999999677e+03
1.396767000000000092e+00 4.102439940999999635e+03
1.386559000000000097e+00 4.013319579999999860e+03
1.376497999999999999e+00 3.128252441000000090e+03
1.366578000000000070e+00 2.633181152000000111e+03
1.356797999999999949e+00 2.340077147999999852e+03
1.347154999999999880e+00 2.099404540999999881e+03
1.337645999999999891e+00 2.012083983999999873e+03
1.328268000000000004e+00 2.052154540999999881e+03
1.319018999999999942e+00 2.061067871000000196e+03
1.309895999999999949e+00 2.205770507999999609e+03
1.300896999999999970e+00 2.199266602000000148e+03
1.292019000000000029e+00 2.317792235999999775e+03
1.283260000000000067e+00 2.357031494000000293e+03
1.274618000000000029e+00 2.434981689000000188e+03
1.266089999999999938e+00 2.540746337999999923e+03
1.257675000000000098e+00 2.605709472999999889e+03
1.249370000000000092e+00 2.667244141000000127e+03
1.241172999999999860e+00 2.800522704999999860e+03
I've taken only every 20th data point from the actual data file, but the general shape is preserved.
import numpy as np
import matplotlib.pyplot as plt

share = the_above_array  # the two-column (x, y) data shown above
plt.plot(share[:, 0], share[:, 1])
(plot: original spectrum)
There is a clear tail around the high x values. Assume the tail is an artifact that needs to be removed. I've tried solutions using the ALS algorithm by P. Eilers, a rubberband approach, and the peakutils package, but these either subtract the tail while creating a rise around the low x values, or fail to produce a suitable baseline.
ALS algorithm; in this example I am using lam=1E6 and p=0.001, the best parameters I was able to find manually:
# ALS approach (asymmetric least squares smoothing, P. Eilers)
from scipy import sparse
from scipy.sparse.linalg import spsolve

def baseline_als(y, lam, p, niter=10):
    L = len(y)
    # Second-order difference matrix for the smoothness penalty
    D = sparse.csc_matrix(np.diff(np.eye(L), 2))
    w = np.ones(L)
    for i in range(niter):
        W = sparse.spdiags(w, 0, L, L)
        Z = W + lam * D.dot(D.transpose())
        z = spsolve(Z, w * y)
        # Asymmetric weights: p for points above the fit, 1 - p for points below
        w = p * (y > z) + (1 - p) * (y < z)
    return z
baseline = baseline_als(share[:,1], 1E6, 0.001)
baseline_subtracted = share[:,1] - baseline
plt.plot(baseline_subtracted)
(plot: ALS baseline subtraction result)
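A small aside, not from the original post: np.diff(np.eye(L), 2) first allocates a dense L-by-L identity, which gets expensive for long spectra. The same second-difference matrix can be built directly in sparse form:

# Sparse construction of the same (L, L-2) second-difference matrix,
# avoiding the dense L-by-L intermediate
D = sparse.diags([1, -2, 1], [0, -1, -2], shape=(L, L - 2), format='csc')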
Rubberband approach:
# rubberband approach
from scipy.spatial import ConvexHull

def rubberband(x, y):
    # Find the convex hull of the (x, y) points
    v = ConvexHull(np.column_stack((x, y))).vertices
    # Rotate the convex hull vertices until they start from the lowest one
    v = np.roll(v, -v.argmin())
    # Leave only the ascending part
    v = v[:v.argmax()]
    # Create the baseline using linear interpolation between vertices
    return np.interp(x, x[v], y[v])
baseline_rubber = rubberband(share[:,0], share[:,1])
intensity_rubber = share[:,1] - baseline_rubber
plt.plot(intensity_rubber)
(plot: rubberband baseline subtraction result)
peakutils package:
# peakutils approach
import peakutils
baseline_peakutils = peakutils.baseline(share[:,1])
intensity_peakutils = share[:,1] - baseline_peakutils
plt.plot(intensity_peakutils)
(plot: peakutils baseline subtraction result)
Are there any suggestions, aside from masking the low x value data, for constructing a baseline and subtracting the tail without creating a rise in the low x values?
I found a set of similar ALS algorithms here. One of them, asymmetrically reweighted penalized least squares smoothing (arPLS), gives a slightly better fit than ALS.
# arPLS approach
from scipy.linalg import cholesky

def arpls(y, lam=1e4, ratio=0.05, itermax=100):
    r"""
    Baseline correction using asymmetrically
    reweighted penalized least squares smoothing.
    Sung-June Baek, Aaron Park, Young-Jin Ahn and Jaebum Choo,
    Analyst, 2015, 140, 250-257.
    """
    N = len(y)
    D = sparse.eye(N, format='csc')
    D = D[1:] - D[:-1]  # numpy.diff( ,2) does not work with sparse matrices; this is a workaround
    D = D[1:] - D[:-1]
    H = lam * D.T * D
    w = np.ones(N)
    for i in range(itermax):
        W = sparse.diags(w, 0, shape=(N, N))
        WH = sparse.csc_matrix(W + H)
        C = sparse.csc_matrix(cholesky(WH.todense()))
        z = spsolve(C, spsolve(C.T, w * y))
        d = y - z
        dn = d[d < 0]
        m = np.mean(dn)
        s = np.std(dn)
        # Sigmoid reweighting: points far above the baseline get ~0 weight
        wt = 1. / (1 + np.exp(2 * (d - (2 * s - m)) / s))
        # Stop once the weights have converged
        if np.linalg.norm(w - wt) / np.linalg.norm(w) < ratio:
            break
        w = wt
    return z
baseline = baseline_als(share[:,1], 1E6, 0.001)
baseline_subtracted = share[:,1] - baseline
plt.plot(baseline_subtracted, 'r', label='als')
baseline_arpls = arpls(share[:,1], 1e5, 0.1)
intensity_arpls = share[:,1] - baseline_arpls
plt.plot(intensity_arpls, label='arpls')
plt.legend()
(plot: ALS vs. arPLS baseline subtraction results)
Fortunately, the improvement is even more pronounced when using the data from the entire spectrum. Note that the parameters for the two algorithms were different. For now, I think the arPLS algorithm is as close as I can get, at least for spectra that look like this one. We'll see how robustly it fits spectra with different shapes. Of course, I am always open to suggestions or improvements!
Have a look at the RamPy library in Python, which provides various baseline-subtraction algorithms, including splines, arPLS, ALS, and polynomial functions, among others. It also offers other features, such as resampling, normalisation, and peak-fitting examples.
In your case, a simple spline function fitted before and after the peak should easily do the job. Have a look at this example Jupyter notebook.
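To illustrate that idea, here is a minimal sketch using scipy rather than RamPy's own API; the anchor windows around the peak and the smoothing factor s are assumptions for this particular data:

# Fit a smoothing spline to anchor regions on either side of the peak,
# then evaluate it over the full x range as the baseline
from scipy.interpolate import UnivariateSpline

x, y = share[:, 0], share[:, 1]
order = np.argsort(x)                  # UnivariateSpline needs increasing x
xs, ys = x[order], y[order]
anchor = (xs < 1.36) | (xs > 1.43)     # assumed windows excluding the peak
spl = UnivariateSpline(xs[anchor], ys[anchor], k=3, s=1e5)  # s: assumed smoothing
baseline_spline = spl(xs)
intensity_spline = ys - baseline_spline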
I am supposed to calculate the convolution between the Fourier-transformed image and the mask.
from scipy import fftpack
import numpy as np
import imageio
from PIL import Image, ImageDraw
import cv2
import matplotlib.pyplot as plt
import math
from scipy.ndimage.filters import convolve
input_image = Image.open('....image....')
input_image=np.array(input_image)
M,N = input_image.shape[0],input_image.shape[1]
FT_img = fftpack.fftshift(fftpack.fft2(input_image))
n = 2    # order; change this value accordingly
D0 = 60  # cut-off frequency; change this value accordingly
# Designing filter
u = np.arange(0, M)
idx = u > M/2
u[idx] = u[idx] - M
v = np.arange(0, N)
idy = v > N/2
v[idy] = v[idy] - N
V,U = np.meshgrid(v,u)
# Calculating Euclidean distance
D = np.linalg.norm(V - U)
# Determining the filtering mask
H = 1 / (1 + (D0 / D)**(2 * n))
# Convolution between the Fourier-transformed image and the mask
G = convolve(H, FT_img)
And I get "RuntimeError: filter weights array has incorrect shape." at the last line when I run this code snippet. What I understand is that H is a float and FT_img is an array, so I cannot perform a convolution on them. But I don't know how to fix that.
How can I solve this problem?
Calculate the distance D and the filter H for each (u, v); this yields an array with the same size as the input image. Multiplying that array (the filter H) with the image in the Fourier domain is equivalent to convolution in the spatial domain, and the results are as follows:
import numpy as np
import cv2
import matplotlib.pyplot as plt
# Read Image as Grayscale
img = cv2.imread('input.png', 0)
# Designing filter
#------------------------------------------------------
def butterworth_filter(shape, n=2, D0=60):
    '''
    n = 2   # order; change this value accordingly
    D0 = 60 # cut-off frequency; change this value accordingly
    '''
    M, N = shape
    # Initialize filter with zeros
    H = np.zeros((M, N))
    # Traverse through filter
    for u in range(0, M):
        for v in range(0, N):
            # Get Euclidean distance from point (u, v) to the center
            D_uv = np.sqrt((u - M / 2) ** 2 + (v - N / 2) ** 2)
            # Determine the filtering mask value
            H[u, v] = 1 / (1 + (D0 / D_uv) ** (2 * n))
    return H
#-----------------------------------------------------
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
phase_spectrumR = np.angle(fshift)
magnitude_spectrum = 20*np.log(np.abs(fshift))
# Generate Butterworth Filter
H = butterworth_filter(img.shape)
# Apply the mask by pointwise multiplication in the Fourier domain
# (equivalent to convolution in the spatial domain)
G = H * fshift
# Obtain the result by inverse-shifting and inverse-transforming
result = np.abs(np.fft.ifft2(np.fft.ifftshift(G)))
plt.subplot(222)
plt.imshow(img, cmap='gray')
plt.title('Original')
plt.axis('off')
plt.subplot(221)
plt.imshow(magnitude_spectrum, cmap='gray')
plt.title('magnitude spectrum')
plt.axis('off')
plt.subplot(223)
plt.imshow(H, "gray")
plt.title("Butterworth Filter")
plt.axis('off')
plt.subplot(224)
plt.imshow(result, "gray")
plt.title("Result")
plt.axis('off')
plt.show()
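As an aside of mine, following the same conventions as butterworth_filter above: the double loop can be vectorized with numpy broadcasting, which matters for large images.

# Vectorized equivalent of butterworth_filter: build the distance grid in
# one shot instead of looping over every pixel
def butterworth_filter_vec(shape, n=2, D0=60):
    M, N = shape
    u, v = np.meshgrid(np.arange(M) - M / 2, np.arange(N) - N / 2, indexing='ij')
    D_uv = np.sqrt(u ** 2 + v ** 2)
    with np.errstate(divide='ignore'):
        H = 1 / (1 + (D0 / D_uv) ** (2 * n))
    H[D_uv == 0] = 0  # this high-pass blocks the DC bin entirely
    return H

np.allclose(butterworth_filter(img.shape), butterworth_filter_vec(img.shape)) should hold, since both zero out the DC bin.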
I am trying to use a fast Fourier transform to extract the amplitude and phase shift of two sinusoidal waves. By experimenting, I found that the transform returned by the FFT had an amplitude that was actually N/2 times that of my actual signal (where N is the number of samples in the wave). So, to extract and plot the actual transform, I had to multiply the gain by 2/N.
The portion of the code showing this is attached below:
from scipy.fft import fft, rfft
import numpy as np
import matplotlib.pyplot as plt
N = 600 # number of sample points
d = 1.0 # time domain
f = 50 # frequency
u = 0.1 # mean inlet velocity
du = 0.1 # velocity perturbation rate
T = 1.0 / f # period
s = d/N # sample spacing
# 1st sine wave
x1 = np.linspace(0.0, d, N)
y1 = u*du* np.sin(f * 2.0*np.pi*x1)
yf1 = rfft(y1)
xf1 = np.linspace(0.0, 1.0/(2.0*s), N//2)
# 2nd sine wave
q = 0.08
dq = 0.1
phi = np.pi / 2 # phase delay (rad)
x2 = np.linspace(0.0, d, N)
y2 = q*dq* np.sin(f * 2.0*np.pi*x2 - phi)
yf2 = fft(y2)
xf2 = np.linspace(0.0, 1.0/(2.0*s), N//2)
#plt.plot(x,y)
plt.plot(xf1, 2.0/N * np.abs(yf1[0:N//2]))
plt.plot(xf2, 2.0/N * np.abs(yf2[0:N//2]))
plt.grid()
plt.show()
I cannot figure out why the FFT returns this amplitude multiplied by N/2.
A secondary problem I face is how to extract the phase shift (phi) from the two transformed waves. Any help would be appreciated.
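For reference, here is a minimal sketch of my own (reusing N, d, and f from above) illustrating both points. The scaling arises because A*sin(2*pi*f*t) splits into two complex exponentials of amplitude A/2, so the unnormalized DFT places a spike of magnitude A*N/2 at bin f; the phase can then be read off the peak bin with np.angle.

# Recover amplitude and phase from the peak FFT bin. endpoint=False samples
# an exact integer number of periods, so there is no spectral leakage.
import numpy as np
from scipy.fft import fft

N, d, f = 600, 1.0, 50
t = np.linspace(0.0, d, N, endpoint=False)
phi = np.pi / 2
y1 = 0.01 * np.sin(2.0 * np.pi * f * t)           # u*du from above
y2 = 0.008 * np.sin(2.0 * np.pi * f * t - phi)    # q*dq from above

Y1, Y2 = fft(y1), fft(y2)
k = np.argmax(np.abs(Y1[:N // 2]))                # peak bin; here k == f == 50
amp1 = 2.0 / N * np.abs(Y1[k])                    # undo the N/2 scaling -> 0.01
amp2 = 2.0 / N * np.abs(Y2[k])                    # -> 0.008
phase_shift = np.angle(Y2[k]) - np.angle(Y1[k])   # -> -phi, i.e. -pi/2

Note that endpoint-inclusive sampling, as in np.linspace(0.0, d, N), makes the window not exactly periodic and smears a little energy into neighbouring bins.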
I'm trying to fit a power law to data on a double-log scale, so I've used the curve_fit(...) function from the scipy.optimize package.
To run the fit I've implemented the following piece of code: COR_coef[i] = curve_fit(lambda x, m: c * x ** m, x, COR_IFG[:, i])[0][0]. To the best of my knowledge, curve_fit(...) should now correctly fit a power law (a straight line on a double-log plot) to my data. However, for some reason I just do not seem to get the fit right. See the attached picture for the data and its fit.
Some more context with regards to the minimum reproducible example (see below):
- The code generates random noise for simulation purposes; this is done in white_noise(...).
- This random noise is then misaligned (in a for-loop with different fractions of misalignment, according to the variable values_to_shift, so the development of the power law can be studied) and subtracted from the original noise to obtain a residual signal.
- The residual signal is the signal the power law is fitted to.
- curve_fit(...) is applied in the sim_powerlaw_coefficient(...) function.
I am aware that my residual signal shows some artifacts when the misalignment gets larger; unfortunately, I don't know how to get rid of these artifacts.
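As a point of comparison, here is a minimal sketch of my own (not part of the original approach): since a power law is a straight line on a double-log plot, the exponent can also be fitted directly in log space, where all scales carry equal weight in the least-squares cost instead of the largest values dominating.

# Straight-line fit in log-log space; the slope is the power-law exponent m.
# Assumes strictly positive x and y, as with the spectral magnitudes here.
import numpy as np

def powerlaw_exponent_loglog(x, y):
    m, log_c = np.polyfit(np.log(x), np.log(y), 1)
    return m

For the example below this would be called per column, e.g. COR_coef[i] = powerlaw_exponent_loglog(x, COR_IFG[:, i]).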
MINIMUM REPRODUCIBLE EXAMPLE
import matplotlib.pyplot as plt
import numpy as np
import numpy.fft as fft
import numpy.random as rnd
from scipy.optimize import curve_fit
plt.style.use('seaborn-darkgrid')
rnd.seed(100) # to select a random seed for creating the "random" noise
grad = -5 / 3. # slope to use for every function
c = 1 # base parameter for the powerlaw
ylim = [1e-7, 30] # range for the double log plots of the powerfrequency domains
values_to_shift = [0, 2**-11, 2**-10, 2**-9, 2**-8, 2**-7, 2**-6, 2**-5, 2**-4, 2**-3, 2**-2, 2**-1, 2**0]  # fractions of misalignment
def white_noise(n: int, N: int):
    """
    - Creates a data set of white noise with size n, N;
    - Filters this dataset with the corresponding slope;
      this slope is usually equal to -5/3 or -2/3
    - Makes sure the slope is equal to the requested slope in the double log scale.
    #param n: size of random array
    #param N: number of random arrays
    #return: white noise, filtered white noise, the filtered signal,
             the retransformed signal and the log-log slope
    """
    m = grad  # the slope itself is taken from the module-level grad
    x = np.linspace(1, n, n // 2)
    slope_loglog = c * x ** m
    whitenoise = rnd.randn(n // 2, N) + 1j * rnd.randn(n // 2, N)
    whitenoise[0, :] = 0  # zero-mean noise
    whitenoise_filtered = whitenoise * slope_loglog[:, np.newaxis]
    # Extend to a full Hermitian-style spectrum so the inverse FFT is (near-)real
    whitenoise = 2 * np.pi * np.concatenate((whitenoise, whitenoise[0:1, :], np.conj(whitenoise[-1:0:-1, :])), axis=0)
    whitenoise_filtered = 2 * np.pi * np.concatenate(
        (whitenoise_filtered, whitenoise_filtered[0:1, :], np.conj(whitenoise_filtered[-1:0:-1, :])), axis=0)
    whitenoise_signal = fft.ifft(whitenoise_filtered, axis=0)
    whitenoise_signal = np.real_if_close(whitenoise_signal)
    if np.iscomplex(whitenoise_signal).any():
        print('Warning! whitenoise_signal is complex-valued!')
    whitenoise_retransformed = fft.fft(whitenoise_signal, axis=0)
    return whitenoise, whitenoise_filtered, whitenoise_signal, whitenoise_retransformed, slope_loglog
def sim_powerlaw_coefficient(n: int, N: int, show_powerlaw=0):
    """
    #param n: Number of values in the IFG
    #param N: Number of IFG's
    #return: Returns the coefficient after subtraction of two IFG's
    """
    master = white_noise(n, N)
    slave = white_noise(n, N)
    x = np.linspace(1, n, n // 2)
    signal_IFG = master[2] - slave[2]
    noise_IFG = np.abs(fft.fft(signal_IFG, axis=0))[0:n // 2, :]
    for k in range(len(values_to_shift)):
        shift = int(np.round(values_to_shift[k] * n, 0))
        inp = signal_IFG.copy()
        # The weather model is a shifted copy of the actual signal, to better
        # understand the errors that are introduced.
        weather_model = np.roll(inp, shift, axis=0)
        WM_IFG = np.abs(fft.fft(weather_model, axis=0)[0:n // 2, :])
        signal_corrected = signal_IFG - weather_model
        COR_IFG = np.abs(fft.fft(signal_corrected, axis=0)[0:n // 2, :])
        COR_coef = np.zeros(N)
        for i in range(N):
            COR_coef[i] = curve_fit(lambda x, m: c * x ** m, x, COR_IFG[:, i])[0][0]
        plt.figure(figsize=(15, 10))
        plt.title('Corrected IFG (combined - weather model)')
        plt.loglog(COR_IFG, label='Corrected IFG')
        plt.ylim(ylim)
        plt.xlabel('log(k)')
        plt.ylabel('log(P)')
        plt.loglog(c * x ** COR_coef.mean(), '-.', label=f'COR powerlaw coef: {COR_coef.mean()}')
        plt.legend(loc=0)
        plt.tight_layout()

sim_powerlaw_coefficient(8192, 1, show_powerlaw=1)
I have written code to calculate the angle between three points using their 3D coordinates.
import numpy as np
a = np.array([32.49, -39.96,-3.86])
b = np.array([31.39, -39.28, -4.66])
c = np.array([31.14, -38.09,-4.49])
f = a-b # normalization of vectors
e = b-c # normalization of vectors
angle = dot(f, e) # calculates dot product
print degrees(cos(angle)) # calculated angle in radians to degree
Output of the code:
degree 33.4118214995
But when I used software to calculate the same angle, it gave a noticeably different result: 120 degrees. Please help.
Reference I used to write the program:
(How to calculate bond angle in protein db file?)
Your original code is pretty close. Adomas.m's answer is not very idiomatic numpy:
import numpy as np
a = np.array([32.49, -39.96,-3.86])
b = np.array([31.39, -39.28, -4.66])
c = np.array([31.14, -38.09,-4.49])
ba = a - b
bc = c - b
cosine_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
angle = np.arccos(cosine_angle)
print(np.degrees(angle))
I guess numpy is quite enough:
from numpy import *
from numpy.linalg import norm

a = array([32.49, -39.96, -3.86])
b = array([31.39, -39.28, -4.66])
c = array([31.14, -38.09, -4.49])
f = b - a
e = b - c
abVec = norm(f)
bcVec = norm(e)
abNorm = f / abVec
bcNorm = e / bcVec
res = abNorm[0] * bcNorm[0] + abNorm[1] * bcNorm[1] + abNorm[2] * bcNorm[2]
angle = arccos(res) * 180.0 / pi
print(angle)
The res can also be calculated with dot:
res = dot(abNorm, bcNorm)
For 2D, you can use this method with the math library.
import math
def getAngle(a, b, c):
    ang = math.degrees(math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0]))
    return ang + 360 if ang < 0 else ang
print(getAngle((5, 0), (0, 0), (0, 5)))
Credits: https://manivannan-ai.medium.com/find-the-angle-between-three-points-from-2d-using-python-348c513e2cd
In case you have a big list of (x,y,z) coordinates, this works:
import numpy as np

def compute_angle_between_3d_points(a, b, c):
    ba = a - b
    bc = c - b
    cosine_numerator = np.sum(ba * bc, axis=1)
    cosine_denominator_1 = np.linalg.norm(ba, axis=1)
    cosine_denominator_2 = np.linalg.norm(bc, axis=1)
    cosine_angle = cosine_numerator / (cosine_denominator_1 * cosine_denominator_2)
    angles = np.arccos(cosine_angle)
    degree_angles = np.rad2deg(angles)
    return degree_angles
Above, a, b, and c are presumed to be of shape (N_Points, 3). Something in TensorFlow or Torch would surely be faster, but there you go.
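A quick usage sketch of my own, stacking the coordinates from the original question into shape (1, 3) arrays:

# Expect roughly 120.5 degrees, consistent with the ~120 degrees
# reported by the asker's software
a = np.array([[32.49, -39.96, -3.86]])
b = np.array([[31.39, -39.28, -4.66]])
c = np.array([[31.14, -38.09, -4.49]])
print(compute_angle_between_3d_points(a, b, c))  # -> [~120.5]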