I'm trying to program the process on this image:
On the image the 2 on the right-side is mapped to bin "80" since its corresponding value on the left-side is 80. The 4 on the right-side however has a corresponding value of 10 on the left-side, and because there is no bin for 10, the 4 needs to get split into two values.
To accomplish this I am using numpy's histogram with the "weights" parameter like this:
import numpy as np
t1 = [80, 10]
t2 = [2, 4]
bins = np.arange(0, 200, 20)
h = np.histogram(t1,bins=bins,weights=t2)
The 2 gets mapped correctly, but the 4 gets mapped entirely to bin 0 (leftmost).
Output:
[4 0 0 0 2 0 0 0 0]
I think this is due to the fact that the first bin collects all directions in a range (0 to 20), instead of splitting the magnitude between neighboring bins when the direction is not exactly equal to a bin value.
So, I was wondering if anybody knows how I can rewrite this so the output will be:
[2 2 0 0 2 0 0 0 0]
Let's consider an easier task first:
Assume you would want to quantize the gradient direction (GD) as follows: floor(GD/20). You could use the following:
h = np.bincount(np.floor(GD.reshape((-1)) / 20).astype(np.int64), GM.reshape((-1)).astype(np.float64), minlength=13)
Here np.bincount simply accumulates the gradient magnitude (GM) based on the quantized gradient direction (GD). Notice that minlength controls the minimum length of the histogram; here it equals ceil(255/20) = 13.
However, you wanted soft assignment, so you have to weight the GM contribution between the two neighboring bins. You might want to try:
GD = GD.reshape((-1))
GM = GM.reshape((-1))
w = ((GD / 20) - np.floor(GD / 20)).astype(np.float64)
h1 = np.bincount(np.floor(GD / 20).astype(np.int64), GM.astype(np.float64) * (1.0-w), minlength=13)
h2 = np.bincount(np.ceil(GD / 20).astype(np.int64), GM.astype(np.float64) * w, minlength=13)
h = h1 + h2
P.S. The np.bincount documentation is worth a look: https://docs.scipy.org/doc/numpy/reference/generated/numpy.bincount.html
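The soft assignment described above can be sketched as a small standalone function. This is only a sketch under stated assumptions: the name soft_histogram and its parameters are made up for illustration, the bin values 0, 20, ..., 160 are treated as bin centers, and directions are assumed to wrap around at 180 degrees so that 180 lands in bin 0.

```python
import numpy as np

def soft_histogram(directions, magnitudes, bin_width=20, n_bins=9):
    """Split each magnitude linearly between the two nearest bins.

    Hypothetical helper: bins are centered at 0, bin_width, 2*bin_width, ...
    and indices wrap modulo n_bins (180 degrees is treated as 0 degrees).
    """
    directions = np.asarray(directions, dtype=np.float64).reshape(-1)
    magnitudes = np.asarray(magnitudes, dtype=np.float64).reshape(-1)
    position = directions / bin_width            # fractional bin position
    lower = np.floor(position).astype(np.int64)  # index of the bin below
    w_upper = position - lower                   # share for the bin above
    h = np.bincount(lower % n_bins,
                    weights=magnitudes * (1.0 - w_upper), minlength=n_bins)
    h += np.bincount((lower + 1) % n_bins,
                     weights=magnitudes * w_upper, minlength=n_bins)
    return h

h = soft_histogram([80, 10], [2, 4])
print(h)  # [2. 2. 0. 0. 2. 0. 0. 0. 0.]
```

With the data from the question, the 2 goes entirely to bin 80 and the 4 is split evenly between bins 0 and 20, which is the output the question asks for.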
Regarding Roy Jevnisek's answer: minlength should be 9, as there are 9 bins.
Also, since 180 degrees is equivalent to 0 degrees, both the first and last elements of h represent the weighted count for 0 degrees. The last element of h should therefore be added to the first element and then dropped, i.e.:
h[0] += h[-1]
h = h[:-1]
Then the HOG can be plotted by:
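import numpy as np
import matplotlib.pyplot as plt

GD = GD.reshape(-1)
GM = GM.reshape(-1)
w1 = (GD / 20) - np.floor(GD / 20)
# not ceil(GD / 20) - GD / 20: that is 0 (like w1) when GD is an exact multiple of 20
w2 = 1.0 - w1
h1 = np.bincount(np.floor(GD / 20).astype('int32'), GM * w2, minlength=9)
h2 = np.bincount(np.ceil(GD / 20).astype('int32'), GM * w1, minlength=9)
h = h1 + h2
# assumes h has 10 entries (some GD above 160), so h[-1] is the 180 degree bin
h[0] += h[-1]
h = h[:-1]
values = np.arange(h.size)  # bin indices 0..8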
plt.title('Histogram of Oriented Gradients (HOG)')
plt.bar(values, h)
plt.show()
Related
The LHS method provides sample values between 0 and 1. What if I want to set bounds, for example so that the values in one dimension range from 0 to 15? How can I do that with pyDOE in Python?
from pyDOE import *
n = 2
samples = 50
d = lhs(n, samples, criterion='center')
x1 = d[:,0]
x2 = d[:,1]
My x1 values should be between -10 and 10, and x2 should be between 1 and 20.
Multiply each data point in x1 (or x2) by the range of your bounds, e.g. 10 - (-10) = 20, and add the lower bound to it.
x1_new = [None for i in range(len(x1))]
for i,j in enumerate(x1):
    x1_new[i] = -10 + 20*j
... I think?
I figured it out
import numpy as np
import pyDOE as pyd
bounds = np.array([[-10,10],[1,20]]) # [lower, upper] for each dimension
# print(xbounds)
X = pyd.lhs(2, 100, criterion='centermaximin')
X[:,0] = (X[:,0]*(bounds[0,1]-bounds[0,0])+bounds[0,0])
X[:,1] = (X[:,1]*(bounds[1,1]-bounds[1,0])+bounds[1,0])
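The per-column scaling above generalizes to any number of dimensions with broadcasting. The following is a minimal sketch, not part of pyDOE: scale_to_bounds is a hypothetical helper name, and the unit-cube samples are drawn with NumPy's random generator purely as a stand-in for the output of pyd.lhs, so that the example runs without pyDOE installed.

```python
import numpy as np

def scale_to_bounds(unit_samples, bounds):
    """Map samples from [0, 1] to [lower, upper] separately for each column.

    unit_samples: array of shape (n_samples, n_dims) with values in [0, 1]
    bounds: array-like of shape (n_dims, 2) holding [lower, upper] rows
    """
    unit_samples = np.asarray(unit_samples, dtype=np.float64)
    bounds = np.asarray(bounds, dtype=np.float64)
    lower = bounds[:, 0]
    upper = bounds[:, 1]
    # Broadcasting applies each column's own range and offset
    return lower + unit_samples * (upper - lower)

# Stand-in for X = pyd.lhs(2, 100): any array of values in [0, 1] works here
rng = np.random.default_rng(0)
unit = rng.random((100, 2))
scaled = scale_to_bounds(unit, [[-10, 10], [1, 20]])
```

The first column of scaled then lies between -10 and 10 and the second between 1 and 20.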
I am working on finding the frequencies from a given dataset and I am struggling to understand how np.fft.fft() works. I thought I had a working script but ran into a weird issue that I cannot understand.
I have a dataset that is roughly sinusoidal and I wanted to understand what frequencies the signal is composed of. Once I took the FFT, I got this plot:
However, when I take the same dataset, slice it in half, and plot the same thing, I get this:
I do not understand why the frequency drops from 144kHz to 128kHz, since this is technically the same dataset, just with a smaller length.
I can confirm a few things:
The step size between data points is 0.001.
I have tried interpolation with little luck.
If I slice the second half of the dataset I get a different frequency as well.
If my dataset is indeed composed of both 128 and 144kHz, then why doesn't the 128 peak show up in the first plot?
What is even more confusing is that I am running a script with pure sine waves without issues:
import numpy as np
import matplotlib.pyplot as plt

T = 0.001
fs = 1 / T
def find_nearest_ind(data, value):
    return (np.abs(data - value)).argmin()
x = np.arange(0, 30, T)
ff = 0.2
y = np.sin(2 * ff * np.pi * x)
x = x[:len(x) // 2]
y = y[:len(y) // 2]
n = len(y) # length of the signal
k = np.arange(n)
T = n / fs
frq = k / T * 1e6 / 1000 # two sides frequency range
frq = frq[:len(frq) // 2] # one side frequency range
Y = np.fft.fft(y) / n # dft and normalization
Y = Y[:n // 2]
frq = frq[:50]
Y = Y[:50]
fig, (ax1, ax2) = plt.subplots(2)
ax1.plot(x, y)
ax1.set_xlabel("Time (us)")
ax1.set_ylabel("Electric Field (V / mm)")
peak_ind = find_nearest_ind(abs(Y), np.max(abs(Y)))
ax2.plot(frq, abs(Y))
ax2.axvline(frq[peak_ind], color = 'black', linestyle = '--', label = F"Frequency = {round(frq[peak_ind], 3)}kHz")
plt.legend()
plt.xlabel('Freq(kHz)')
ax1.title.set_text('dV/dX vs. Time')
ax2.title.set_text('Frequencies')
fig.tight_layout()
plt.show()
Here is a breakdown of your code, with some suggestions for improvement, and extra explanations. Working through it carefully will show you what is going on. The results you are getting are completely expected. I will propose a common solution at the end.
First set up your units correctly. I assume that you are dealing with seconds, not microseconds. You can adjust later as long as you stay consistent.
Establish the period and frequency of the sampling. This means that the Nyquist frequency for the FFT will be 500Hz:
T = 0.001 # 1ms sampling period
fs = 1 / T # 1kHz sampling frequency
Make a time domain of 30e3 points. The half domain will contain 15000 points. That implies a frequency resolution of 500Hz / 15k = 0.03333Hz.
x = np.arange(0, 30, T) # time domain
n = x.size # number of points: 30000
Before doing anything else, we can define our frequency domain right here. I prefer a more intuitive approach than the one you are using. That way you don't have to redefine T or introduce the auxiliary variable k. But as long as the results are the same, it does not really matter:
F = np.linspace(0, 1 - 1/n, n) / T # Notice F[1] = 0.03333, as predicted
Now define the signal. You picked ff = 0.2. Notice that 0.2Hz / 0.03333Hz = 6, so you would expect to see your peak in exactly bin index 6 (F[6] == 0.2). To better illustrate what is going on, let's take ff = 0.22. This will bleed the spectrum into neighboring bins.
ff = 0.22
y = np.sin(2 * np.pi * ff * x)
Now take the FFT:
Y = np.fft.fft(y) / n
maxbin = np.abs(Y).argmax() # 7
maxF = F[maxbin] # 0.23333333: This is the nearest bin
Since your frequency bins are about 0.033Hz wide, the best accuracy you can expect from picking the nearest bin is about ±0.017Hz. For your real data, which has much lower resolution, the error is much larger.
Now let's take a look at what happens when you halve the data size. Among other things, the frequency resolution becomes coarser. Now you have a maximum frequency of 500Hz spread over 7.5k samples, not 15k: the bin width grows to 0.066666Hz:
n2 = n // 2 # 15000
F2 = np.linspace(0, 1 - 1 / n2, n2) / T # F2[1] = 0.06666
Y2 = np.fft.fft(y[:n2]) / n2
Take a look what happens to the frequency estimate:
maxbin2 = np.abs(Y2).argmax() # 3
maxF2 = F2[maxbin2] # 0.2: This is the nearest bin
Hopefully, you can see how this applies to your original data. You have a resolution of ~16.1kHz per bin with the full data, and ~32.2kHz per bin with the half data. So your original result is within ~±8kHz of the true peak, while the second one is within ~±16kHz. The true frequency is therefore between about 136kHz and 144kHz. Another way to look at it is to compare the bins that you showed me:
full: 128.7 144.8 160.9
half: 96.6 128.7 160.9
When you take out exactly half of the data, you drop every other frequency bin. If your peak was originally closest to 144.8kHz, and you drop that bin, it will end up in either 128.7 or 160.9.
Note: Based on the bin numbers you show, I suspect that your computation of frq is a little off. Notice the 1 - 1/n in my linspace expression. You need that to get the right frequency axis: the last bin is (1 - 1/n) / T, not 1 / T, no matter how you compute it.
So how to get around this problem? The simplest solution is to do a parabolic fit on the three points around your peak. That is usually a sufficiently good estimator of the true frequency in the data when you are looking for essentially perfect sinusoids.
def peakF(F, Y):
    index = np.abs(Y).argmax()
    # Compute offset on normalized domain [-1, 0, 1], not F[index-1:index+2]
    y = np.abs(Y[index - 1:index + 2])
    # Vertex of the parabola through (-1, y[0]), (0, y[1]), (1, y[2]): -b / (2a)
    # This is the offset from zero, which is the scaled offset from F[index]
    vertex = 0.5 * (y[0] - y[2]) / (y[0] - 2 * y[1] + y[2])
    # F[1] is the bin resolution
    return F[index] + vertex * F[1]
In case you are wondering how I got the formula for the parabola: I solved the system with x = [-1, 0, 1] and y = Y[index - 1:index + 2]. The matrix equation is
[(-1)^2  -1  1]   [a]   [Y[index - 1]]
[  0^2    0  1] * [b] = [Y[index]    ]
[  1^2    1  1]   [c]   [Y[index + 1]]
Computing the offset using a normalized domain and scaling afterwards is almost always more numerically stable than using whatever huge numbers you have in F[index - 1:index + 2].
You can plug the results in the example into this function to see if it works:
>>> peakF(F, Y)
0.2261613409657391
>>> peakF(F2, Y2)
0.20401580936430794
As you can see, the parabolic fit gives an improvement, however slight. There is no replacement for just increasing frequency resolution through more samples though!
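The effect of halving the data on the bin spacing can be reproduced in a few lines. This is a minimal sketch that uses np.fft.rfftfreq and np.fft.rfft (the one-sided variants for real input) instead of the linspace-based axis above; the helper name nearest_bin_peak is made up for illustration, and the signal is the same 0.22Hz sine sampled every 1ms.

```python
import numpy as np

T = 0.001                       # sampling period in seconds
x = np.arange(30000) * T        # 30 seconds of samples
ff = 0.22                       # signal frequency in Hz
y = np.sin(2 * np.pi * ff * x)

def nearest_bin_peak(signal, dt):
    """Return (bin spacing, frequency of the largest one-sided FFT bin)."""
    n = signal.size
    freqs = np.fft.rfftfreq(n, d=dt)           # 0, 1/(n*dt), 2/(n*dt), ...
    spectrum = np.abs(np.fft.rfft(signal)) / n
    return freqs[1], freqs[spectrum.argmax()]

full_spacing, full_peak = nearest_bin_peak(y, T)
half_spacing, half_peak = nearest_bin_peak(y[: y.size // 2], T)
print(full_spacing, full_peak)  # about 0.0333 and 0.2333
print(half_spacing, half_peak)  # about 0.0667 and 0.2
```

The bin spacing doubles when the record length is halved, and the reported peak jumps from one side of the true 0.22Hz to the other, just like the move from 144kHz to 128kHz in the question.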
I am trying to implement a bilateral filter from the paper Fast Bilateral Filtering for the Display of High-Dynamic-Range Images. The equation (from the paper) that implements the bilateral filter is given as:
According to what I understood,
f is a Gaussian filter
g is a Gaussian filter
p is a pixel in a given image window
s is the current pixel
Ip is the intensity at the current pixel
With this, I wrote the code to implement these equations, given as :
import cv2
import numpy as np
img = cv2.imread("fish.png")
# image of width 239 and height 200
bl_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
i = cv2.magnitude(
    cv2.Sobel(bl_img, cv2.CV_64F, 1, 0, ksize=3),
    cv2.Sobel(bl_img, cv2.CV_64F, 0, 1, ksize=3)
)
f = cv2.getGaussianKernel(5, 0.1, cv2.CV_64F)
g = cv2.getGaussianKernel(5, 0.1, cv2.CV_64F)
rows, cols, _ = img.shape
filtered = np.zeros(img.shape, dtype=img.dtype)
for r in range(rows):
    for c in range(cols):
        ks = []
        for index in [-2,-1,1,2]:
            if index + c > 0 and index + c < cols-1:
                p = img[r][index + c]
                s = img[r][c]
                i_p = i[index+c]
                i_s = i[c]
                ks.append(
                    (f * (p-s)) * (g * (i_p * i_s)) # EQUATION 7
                )
        ks = np.sum(np.array(ks))
        js = []
        for index in [-2, -1, 1, 2]:
            if index + c > 0 and index + c < cols -1:
                p = img[r][index + c]
                s = img[r][c]
                i_p = i[index+c]
                i_s = i[c]
                js.append((f * (p-s)) * (g * (i_p * i_s)) * i_p) # EQUATION 6
        js = np.sum(np.asarray(js))
        js = js / ks
        filtered[r][c] = js
cv2.imwrite("f.png", filtered)
But as I run this code I get an error saying:
Traceback (most recent call last):
File "bft.py", line 33, in <module>
(f * (p-s)) * (g * (i_p * i_s))
ValueError: operands could not be broadcast together with shapes (5,3) (5,239)
Did I incorrectly implement the equations? What am I missing?
There are various issues with your code. First and foremost, the equation is interpreted incorrectly. f(p-s) means evaluating the function f at p-s, where f is the Gaussian. Likewise with g. That section of the code would look like this:
weight = gaussian(p - s, sigma_f) * gaussian(i_p - i_s, sigma_g)
ks.append(weight)
js.append(weight * i_p)
Note that the two loops can be merged; this way you avoid some duplicated computation. gaussian(x, sigma) would be a function that computes the Gaussian weight at x. You need to define two sigmas, sigma_f and sigma_g, the spatial and the tonal sigma respectively.
The second issue is in the definition of p and s. These are the coordinates of the pixel, not the value of the image at the pixel. i_p and i_s are the value of the image at those locations. p-s is basically the spatial distance between the pixel at (r,c) and the given neighbor.
The third issue is the loop over the neighborhood. The neighborhood is all pixels where gaussian(p - s, sigma_f) is not negligible. So how large the neighborhood is depends on the chosen sigma_f. You should take it at least to be ceil(2*sigma_f). Say sigma_f is 2, then you want the neighborhood to go from -4 to 4 (9 pixels). But this neighborhood is two dimensional, not one-dimensional as in your code. So you need two loops:
for ii in range(-ceil(2*sigma_f), ceil(2*sigma_f)+1):
    if 0 <= ii + c < cols:
        for jj in range(-ceil(2*sigma_f), ceil(2*sigma_f)+1):
            if 0 <= jj + r < rows:
                # compute weight here
Note that now, p-s is computed with math.sqrt(ii**2 + jj**2). But also note that the Gaussian uses x**2, so you could skip the computation of the square root by passing x**2 into your gaussian function.
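Putting the three points together, a brute-force version for a single-channel image might look like the sketch below. It is an illustration under stated assumptions rather than the paper's fast algorithm: the names gaussian and bilateral_filter are hypothetical, the Gaussian is left unnormalized because the normalization constant cancels when the weighted sum is divided by the sum of weights, and the gaussian helper takes the squared distance directly, as suggested above.

```python
import numpy as np

def gaussian(x_squared, sigma):
    """Unnormalized Gaussian weight; takes the squared distance x**2."""
    return np.exp(-x_squared / (2.0 * sigma ** 2))

def bilateral_filter(image, sigma_f, sigma_g):
    """Brute-force bilateral filter for a 2D (grayscale) image."""
    image = np.asarray(image, dtype=np.float64)
    rows, cols = image.shape
    half = int(np.ceil(2 * sigma_f))  # neighborhood half-width
    out = np.zeros_like(image)
    for r in range(rows):
        for c in range(cols):
            i_s = image[r, c]
            k = 0.0  # sum of weights (normalization term)
            j = 0.0  # weighted sum of neighbor intensities
            for jj in range(-half, half + 1):
                if not 0 <= r + jj < rows:
                    continue
                for ii in range(-half, half + 1):
                    if not 0 <= c + ii < cols:
                        continue
                    i_p = image[r + jj, c + ii]
                    weight = (gaussian(ii ** 2 + jj ** 2, sigma_f)
                              * gaussian((i_p - i_s) ** 2, sigma_g))
                    k += weight
                    j += weight * i_p
            out[r, c] = j / k  # k >= 1 because the center pixel has weight 1
    return out

# A sharp step edge should survive when sigma_g is small compared to the step
step = np.zeros((6, 6))
step[:, 3:] = 100.0
filtered_step = bilateral_filter(step, sigma_f=1.0, sigma_g=1.0)
```

The nested Python loops are slow for real images; the sketch is only meant to make the roles of p, s, i_p and i_s concrete.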
I am creating a circular mask in python as follows:
import numpy as np
def make_mask(image, radius, center=(0, 0)):
    r, c, d = image.shape
    # columns run up to c - center[1] (not r - center[1]) so the grid matches (r, c)
    y, x = np.ogrid[-center[0]:r-center[0], -center[1]:c-center[1]]
    mask = x*x + y*y <= radius*radius
    array = np.zeros((r, c))
    array[mask] = 1
    return array
This returns a mask of shape (r, c). What I would like to do is have a weighted mask where the weight is 1 at the center of the image (given by the center parameter) and decreases linearly towards the edge of the image. So there should be an added weight, calculated between 0 and 1 (0 not included), in that line. I was thinking this should be something like:
distance = (center[0] - x)**2 + (center[1] - y)**2
# weigh it inversely to distance from center
mask = (x*x + y*y) * 1.0/distance
However, this will result in divide by 0 and the mask would not be between 0 and 1 either.
First, if you want the weight to be linear, you need to take the square root of what you have for distance (i.e., what you're calling "distance" isn't the distance from the center but the square of that, so you should rename it to something like R_squared). So:
R_squared = (center[0] - x)**2 + (center[1] - y)**2 # what you have for distance
r = np.sqrt(R_squared)
Then, since r is 0 where you want the weight to be 1, subtract it from 1, and scale r so that the weight reaches 0 where you want it to. Say you want the weight to be 0 at a distance L from the center, then your equation is:
weight = 1 - r/L
Here this will be 1 where r==0 and 0 where r==L.
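As a minimal sketch of the whole idea, the function below builds the weighted mask directly. The name make_weighted_mask is hypothetical, and it assumes the np.ogrid coordinates are already measured relative to the center (as in make_mask from the question), so the distance is simply sqrt(x*x + y*y). Clipping to [0, 1] is an added assumption so that pixels farther than L get weight 0 instead of a negative value.

```python
import numpy as np

def make_weighted_mask(shape, center, L):
    """Weight 1 at `center`, falling linearly to 0 at distance L.

    shape: (rows, cols) of the mask
    center: (row, col) of the point with weight 1
    L: distance at which the weight reaches 0
    """
    rows, cols = shape
    # Coordinates relative to the center, as in make_mask from the question
    y, x = np.ogrid[-center[0]:rows - center[0], -center[1]:cols - center[1]]
    r = np.sqrt(x * x + y * y)
    return np.clip(1.0 - r / L, 0.0, 1.0)

mask = make_weighted_mask((5, 7), center=(2, 3), L=4.0)
```

Choosing L slightly larger than the distance from the center to the farthest image corner keeps every weight strictly above 0.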
I have four columns, namely x,y,z,zcosmo. The range of zcosmo is 0.0<zcosmo<0.5.
For each x,y,z, there is a zcosmo.
When x,y,z are plotted, this is how they look.
I would like to find the volume of this figure. If I slice it into 50 parts (in ascending zcosmo order), so that each part resembles a cylinder, I can add them up to get the final volume.
The volume of the sliced cylinders would be pi*r^2*h, in my case r = z/2 & h = x
The slicing for example would be like,
x,z for 0.0<zcosmo<0.01 find this volume V1. Then x,z for 0.01<zcosmo<0.02 find this volume V2 and so on until zcosmo=0.5
I know how to do this manually (which of course is time consuming) by saying:
r1 = z[np.logical_and(zcosmo>0.0,zcosmo<0.01)] / 2 #gives me z within the range 0.0<zcosmo<0.01
h1 = x[np.logical_and(zcosmo>0.0,zcosmo<0.01)] #gives me x within the range 0.0<zcosmo<0.01
V1 = math.pi*(r1**2)*(h1)
Here r1 and h1 should be r1 = ( min(z) + max(z) ) / 2.0 and h1 = max(x) - min(x), i.e. they use the max and min values so that I get one volume for each slice.
How can I write code that calculates the 50 slice volumes within the zcosmo ranges?
Use a for loop:
volumes = list()
for index in range(0, 50):
    r = z[np.logical_and(zcosmo>index * 0.01, zcosmo<index * 0.01 + 0.01)] / 2
    h = x[np.logical_and(zcosmo>index * 0.01, zcosmo<index * 0.01 + 0.01)]
    volumes.append(math.pi*(r**2)*(h))
At the end, volumes will be a list containing the volumes of the 50 cylinders.
You can use volume = sum(volumes) to get the final volume of the shape.
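Note that in the loop above r and h are arrays holding every point of a slice, so each entry of volumes is itself an array. To get a single number per slice, the arrays have to be reduced first. The sketch below does that using the definitions from the question (r = (min(z) + max(z)) / 2 and h = max(x) - min(x)); the function name slice_volumes, the use of np.digitize, and the choice to give empty slices a volume of 0 are assumptions made for illustration.

```python
import numpy as np

def slice_volumes(x, z, zcosmo, n_slices=50, zmax=0.5):
    """One cylinder volume per zcosmo slice of width zmax / n_slices."""
    x = np.asarray(x, dtype=np.float64)
    z = np.asarray(z, dtype=np.float64)
    zcosmo = np.asarray(zcosmo, dtype=np.float64)
    edges = np.linspace(0.0, zmax, n_slices + 1)
    # Slice index k for each point, with edges[k] <= zcosmo < edges[k + 1]
    idx = np.digitize(zcosmo, edges) - 1
    volumes = np.zeros(n_slices)
    for k in range(n_slices):
        sel = idx == k
        if not sel.any():
            continue  # empty slice: volume stays 0 (assumption)
        r = (z[sel].min() + z[sel].max()) / 2.0
        h = x[sel].max() - x[sel].min()
        volumes[k] = np.pi * r ** 2 * h
    return volumes

# Tiny made-up example: two points in each of the first two slices
vols = slice_volumes(x=[0, 2, 1, 5], z=[1, 3, 2, 2],
                     zcosmo=[0.001, 0.005, 0.015, 0.018])
total = vols.sum()
```

Here vols has one entry per slice, and total plays the role of sum(volumes) in the answer.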