I have a 4D (2D + slices along the z-axis + time frames) gray-scale image of the heart beating at different moments.
I would like to take the Fourier transform along the time axis (for each slice separately) and analyze the fundamental harmonic (also called the H1 component, where H stands for Harmonic), so I can determine the pixel regions (ROI) that show the strongest response at the cardiac frequency.
I'm using Python for this purpose, and I tried to do it with the following code, but I'm not sure this is the correct way, because I don't know how to determine the cutoff frequency to keep only the fundamental harmonic.
Here is a link to the image I'm dealing with.
import nibabel as nib
import numpy as np
import matplotlib.pyplot as plt

img = nib.load('patient057_4d.nii.gz')
data = img.get_fdata()  # 4D array: (rows, cols, slices, time frames)
# 2D FFT over the spatial axes for every slice/frame
f = np.fft.fft2(data, axes=(0, 1))
# Move the DC component of the FFT output to the center of the spectrum
fshift = np.fft.fftshift(f, axes=(0, 1))
fshift_orig = fshift.copy()
# logarithmic transformation
magnitude_spectrum = 20*np.log(np.abs(fshift))
# Create mask
rows, cols = data.shape[:2]
crow, ccol = rows // 2, cols // 2
# Use the mask to remove low-frequency components
dist1 = 20
dist2 = 10
fshift[crow-dist1:crow+dist1, ccol-dist1:ccol+dist1] = 0
#fshift[crow-dist2:crow+dist2, ccol-dist2:ccol+dist2] = fshift_orig[crow-dist2:crow+dist2, ccol-dist2:ccol+dist2]
# logarithmic transformation
magnitude_spectrum1 = 20*np.log(np.abs(fshift))
f_ishift = np.fft.ifftshift(fshift, axes=(0, 1))
# inverse Fourier transform
img_back = np.fft.ifft2(f_ishift, axes=(0, 1))
# get rid of the imaginary part by taking the absolute value
img_back = np.abs(img_back)
plt.figure(num='Im_Back')
plt.imshow(img_back[:, :, 2, 2], cmap='gray')
plt.show()
The solution was to take the 3D Fourier transform (2D + time) of each slice separately, then keep only the second component of the transform (H1; index 0 is the DC component) and transform it back to the spatial domain, and that's it.
The benefit of this is to detect whether something is moving along the third axis (time in my case).
import numpy.fft as FFT

for sl in range(data.shape[2]):
    # -----Fourier--H1-----------------------------------------
    # ff1[:, :, 1] is the H1 component; index 0 would be the DC component
    ff1 = FFT.fftn(data[:, :, sl, :])
    fh = np.absolute(FFT.ifftn(ff1[:, :, 1]))
    # -----Fourier--H1-----------------------------------------
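To turn this into an ROI map, one option is to stack the per-slice H1 magnitudes and threshold them. A minimal sketch, reusing data from above; the 0.5 fraction is an arbitrary threshold to tune:

import numpy as np
import numpy.fft as FFT

# H1 magnitude for every slice: (rows, cols, slices)
h1 = np.empty(data.shape[:3])
for sl in range(data.shape[2]):
    ff1 = FFT.fftn(data[:, :, sl, :])
    h1[:, :, sl] = np.absolute(FFT.ifftn(ff1[:, :, 1]))

# pixels that respond most strongly at the cardiac frequency
roi = h1 > 0.5 * h1.max()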
I want to digitize (= average out over cells) photon count data into pixels defined by a grid that specifies how they are aligned. The photon count data is stored in a 2D array. I want to split that data into cells, each of which would correspond to a pixel. The idea is basically the same as reducing an HD image to a smaller resolution. I'd like to achieve this in Python.
The digitizing function I've written:
import numpy as np

def digitize(function_data, grid_shape):
    """
    function_data: 2D array of function values of some 3D shape,
    e.g. exp(-(x^2 + y^2)) -> want to digitize this
    grid_shape: a sequence of length 2 which contains the dimensions
    of the smaller resolution
    """
    l = len(function_data)
    pixel_len_x = int(l/grid_shape[0])
    pixel_len_y = int(l/grid_shape[1])
    digitized_data = np.empty((grid_shape[0], grid_shape[1]))
    for i in range(grid_shape[0]):      # row index of pixel in smaller-resolution grid
        for j in range(grid_shape[1]):  # column index of pixel in smaller-resolution grid
            hd_pixel = []
            for k in range(pixel_len_y):
                hd_pixel.append(function_data[k][j:j*pixel_len_x])
            hd_pixel = np.ravel(hd_pixel)  # turn the 2D array into 1D to be able to compute the average
            pixel_avg = np.average(hd_pixel)
            digitized_data[i][j] = pixel_avg
    return digitized_data
In theory, this function should do what I want to achieve, but when tested it doesn't yield the expected results. Either a completed version of my function or any other method that achieves my goal would be extremely helpful.
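One way to complete this is plain block averaging with reshape; a minimal sketch, assuming each data dimension is exactly divisible by the corresponding grid dimension:

import numpy as np

def digitize(function_data, grid_shape):
    # block-average: each output pixel is the mean of a
    # (rows // grid_shape[0]) x (cols // grid_shape[1]) block of the input
    rows, cols = function_data.shape
    return (function_data
            .reshape(grid_shape[0], rows // grid_shape[0],
                     grid_shape[1], cols // grid_shape[1])
            .mean(axis=(1, 3)))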
You could also use an interpolation function if you can use SciPy. Here we use one of the gridded-data interpolating functions, RectBivariateSpline, to upsample your function, but you can find numerous examples on this and other sites.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import RectBivariateSpline as rbs
# Sampling coordinates
x = np.linspace(-2,2,20)
y = np.linspace(-2,2,30)
# Your function
f = np.exp(-(x[:,None]**2 + y**2))
# Interpolator
interp = rbs(x, y, f)
# Higher resolution coordinates
x_hd = np.linspace(x.min(), x.max(), x.size * 5)
y_hd = np.linspace(y.min(), y.max(), y.size * 5)
# New higher res function
f_hd = interp(x_hd, y_hd, grid = True)
# Some plots
fig, ax = plt.subplots(ncols = 2)
ax[0].imshow(f)
ax[1].imshow(f_hd)
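Since the goal is a smaller resolution, the same interpolator can just as well be evaluated on a coarser grid; a sketch, reusing interp from above:

# Lower resolution coordinates (downsampling with the same interpolator)
x_ld = np.linspace(x.min(), x.max(), x.size // 2)
y_ld = np.linspace(y.min(), y.max(), y.size // 2)
f_ld = interp(x_ld, y_ld, grid = True)

Note that this samples the spline rather than averaging blocks, so for noisy photon counts the block-averaging approach may be the better fit.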
I am new to image processing and am working with images like these:
In these pictures, there is more than one curve that I need to straighten out so that they look like straight lines.
Here's a quick solution. It can be improved by fitting a spline to the features rather than just a parabola (see the sketch after the result below). The algorithm shifts each row in the image individually according to the fitted parabola:
import numpy as np
from skimage import io, measure, morphology
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit

image = io.imread('curves.png', as_gray=True)
# need a binary mask of the features
mask = image == image.min()
# close holes in the features
mask = morphology.binary_closing(mask, morphology.square(3))
plt.matshow(mask, cmap='gray')
# need to get the coordinates of each feature
rp = measure.regionprops(measure.label(mask))

# going to fit a parabola to the features
def parabola(x, x0, A, y0):
    return A*(x-x0)**2 + y0

# get coords of one of the features
coords = rp[0].coords
# do the parabola fit
pop, pcov = curve_fit(parabola, coords[:,0], coords[:,1])
# generate the fit
fit = parabola(np.arange(mask.shape[0]), *pop)
# plot the fit
plt.plot(fit, np.arange(mask.shape[0]))  # invert axes
# generate a new image to shift into
out = np.empty_like(image)
# shift each row individually and add it to the out array
for i, row in enumerate(image):
    out[i] = np.roll(row, -int(round(fit[i] - pop[-1])))
plt.matshow(out, cmap='gray')
Original mask and fitted parabola:
Result:
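As mentioned above, the fit can be improved with a spline that follows the features more closely than a parabola. A sketch of that variation, reusing coords and mask from the code above; the smoothing factor s is an assumption to tune:

import numpy as np
from scipy.interpolate import UnivariateSpline

# average the column positions per row so the x values are strictly increasing
rows = np.unique(coords[:, 0])
cols = np.array([coords[coords[:, 0] == r, 1].mean() for r in rows])
# smoothing spline through the feature
spl = UnivariateSpline(rows, cols, s=len(rows))
fit = spl(np.arange(mask.shape[0]))

The row-shifting loop stays the same, with a reference column such as fit.mean() in place of pop[-1].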
I've just started to learn about the image frequency domain.
I have this function:
def fourier_transform(img):
    f = np.fft.fft2(img)
    fshift = np.fft.fftshift(f)
    magnitude_spectrum = 20*np.log(np.abs(fshift))
    return magnitude_spectrum
And I want to implement this function:
def inverse_fourier_transform(magnitude_spectrum):
    return img
But I don't know how.
My idea is to use magnitude_spectrum to get the original img.
How can I do it?
You are losing the phases here: np.abs(fshift).
np.abs returns only the magnitude of your complex data. You can separate the magnitude and phase with:
mag = np.abs(fshift)
ph = np.angle(fshift)
In theory, you could work on mag and later recombine it with the phases (mag * np.exp(1j * ph)) and reverse the FFT with np.fft.ifft2.
EDIT:
You could try this approach:
import numpy as np
import matplotlib.pyplot as plt

# single-channel image: either random data or one channel of a color image
# (the second line overrides the first)
img = np.random.random((100, 100))
img = plt.imread(r'path/to/color/img.jpg')[:, :, 0]
# should be only width and height
print(img.shape)
# do the 2D Fourier transform
fft_img = np.fft.fft2(img)
# shift the FFT to center the low frequencies
fft_img_shift = np.fft.fftshift(fft_img)
# extract magnitude and phase
mag = np.abs(fft_img_shift)
ph = np.angle(fft_img_shift)
# modify the magnitude, put your modification here
mag_mod = mag/3
# recombine magnitude and phase into a complex array
fft_img_shift_mod = mag_mod * np.exp(1j * ph)
# reverse the shift
fft_img_mod = np.fft.ifftshift(fft_img_shift_mod)
# reverse the 2D Fourier transform
img_mod = np.fft.ifft2(fft_img_mod)
# np.abs gives the magnitude of the complex result; img_mod.real gives only
# the real part. For a real input image the imaginary part of the inverse
# transform is negligible, so both give essentially the same image.
img_mod = np.abs(img_mod)
# show the differences
plt.subplot(121)
plt.imshow(img, cmap='gray')
plt.subplot(122)
plt.imshow(img_mod, cmap='gray')
plt.show()
You cannot recover the exact original image without the phase information, so you cannot use only the magnitude of the fft2.
To use the fft2 to recover the image, you just need to call numpy.fft.ifft2. See the code below:
import numpy as np
from numpy.fft import fft2, ifft2, fftshift, ifftshift

# img is your original (grayscale) image
# do the 2D Fourier transform
fft_img = fftshift(fft2(img))
# reverse the 2D Fourier transform
freq_filt_img = ifft2(ifftshift(fft_img))
freq_filt_img = np.abs(freq_filt_img)
freq_filt_img = freq_filt_img.astype(np.uint8)
Note that calling fftshift and ifftshift is not necessary if you just want to recover the original image directly, but I added them in case there is some plotting to be done in the middle or some other operation that requires the centering of the zero frequency.
The result of calling numpy.abs() or taking freq_filt_img.real (assuming positive values for each pixel) to recover the image should be the same, because the imaginary part of the ifft2 output should be really small. Of course, the complexity of numpy.abs() is O(n) while freq_filt_img.real is O(1).
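As a quick illustration of why the phase matters (a sketch, reusing img and the imports from the snippet above): reconstruct from the magnitude alone and the result bears no resemblance to the original.

# discard the phase (a real, non-negative spectrum has zero phase) and invert
mag_only_img = np.abs(ifft2(np.abs(fft2(img))))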
A research professor asked me to generate 2D spatial spectral density plots for a couple of videos. I have two problems:
How can I plot the PSD vs. the x and y axes?
I know how to generate the PSD for images, but I am unsure how to do the same for videos. I thought about getting the PSD for every frame in the video and taking the average, but I am having difficulties implementing it in Python.
Below is the code I have
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt

curr_dir = os.getcwd()
img = cv2.imread(curr_dir+'/test.jpg', 0)
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
mag = 20*np.log(np.abs(fshift))
plt.subplot(121), plt.imshow(img, cmap='gray')
plt.subplot(122), plt.imshow(mag, cmap='gray')
plt.show()
This generates something like this:
I would like to get something like this:
Any help/advice is greatly appreciated!
Since you show two 1D spectra, it would seem that you are looking for something like the following.
We read in the image, Fourier transform along one axis, and then sum the power in each bin along the other axis. Since the input is real-valued, we use rfft() so that we do not have to shift the spectrum, and we use rfftfreq() to calculate the frequency of each bin. We graph the result omitting the sometimes large signal in the 0-frequency bin (which corresponds to the baseline) so that the useful part of the spectrum appears on a convenient scale.
#!/usr/bin/python3
import cv2
import os
import math
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
curr_dir = os.getcwd()
img = cv2.imread(curr_dir+'/temp.png',0)
print( img.shape )
# Fourier Transform along the first axis
# Round up the size along this axis to an even number
n = int( math.ceil(img.shape[0] / 2.) * 2 )
# We use rfft since we are processing real values
a = np.fft.rfft(img,n, axis=0)
# Sum power along the second axis
a = a.real*a.real + a.imag*a.imag
a = a.sum(axis=1)/a.shape[1]
# Generate a list of frequencies
f = np.fft.rfftfreq(n)
# Graph it
plt.plot(f[1:],a[1:], label = 'sum of amplitudes over y vs f_x')
# Fourier Transform along the second axis
# Same steps as above
n = int( math.ceil(img.shape[1] / 2.) * 2 )
a = np.fft.rfft(img,n,axis=1)
a = a.real*a.real + a.imag*a.imag
a = a.sum(axis=0)/a.shape[0]
f = np.fft.rfftfreq(n)
plt.plot(f[1:],a[1:], label ='sum of amplitudes over x vs f_y')
plt.ylabel( 'amplitude' )
plt.xlabel( 'frequency' )
plt.yscale( 'log' )
plt.legend()
plt.savefig( 'test_rfft.png' )
#plt.show()
Applying this to the photograph posted in your question produces the following result:
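For the video part of the question, the per-frame averaging idea can be sketched with OpenCV's VideoCapture; 'test.mp4' is a hypothetical filename, and for brevity this skips the even-length padding used above:

import cv2
import numpy as np

# average the 1D power spectra (along the first axis) over all frames
cap = cv2.VideoCapture('test.mp4')
acc = None
count = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    a = np.fft.rfft(gray, axis=0)
    p = (a.real**2 + a.imag**2).sum(axis=1) / gray.shape[1]
    acc = p if acc is None else acc + p
    count += 1
cap.release()
mean_psd = acc / count  # average PSD over all frames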
I've read this post about how to use OpenCV's HOG-based pedestrian detector: How can I detect and track people using OpenCV?
I want to use HOG for detecting other types of objects in images (not just pedestrians). However, the Python binding of HOGDetectMultiScale doesn't seem to give access to the actual HOG features.
Is there any way to use Python + OpenCV to extract the HOG features directly from any image?
In Python OpenCV you can compute HOG like this:
import cv2

hog = cv2.HOGDescriptor()
im = cv2.imread(sample)  # sample: path to your image file
h = hog.compute(im)
1. Get Inbuilt Documentation: The following command in your Python console will show you the structure of the cv2.HOGDescriptor class:
import cv2
help(cv2.HOGDescriptor())
2. Example Code: Here is a snippet of code to initialize a cv2.HOGDescriptor with different parameters (the terms I used here are standard terms which are well defined in the OpenCV documentation here):
import cv2
image = cv2.imread("test.jpg",0)
winSize = (64,64)
blockSize = (16,16)
blockStride = (8,8)
cellSize = (8,8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 2.0000000000000001e-01
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,
histogramNormType,L2HysThreshold,gammaCorrection,nlevels)
#compute(img[, winStride[, padding[, locations]]]) -> descriptors
winStride = (8,8)
padding = (8,8)
locations = ((10,20),)
hist = hog.compute(image,winStride,padding,locations)
3. Reasoning: The resultant HOG descriptor will have dimension:
9 orientations x (4 corner cells that get 1 normalization + 6x4 edge cells that get 2 normalizations + 6x6 interior cells that get 4 normalizations) = 1764, as I have given only one location for hog.compute().
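A quick way to confirm this figure, using the hog object created above:

# descriptor length for a single window location
print(hog.getDescriptorSize())  # 1764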
4. One more way to initialize is from an XML file which contains all the parameter values:
hog = cv2.HOGDescriptor("hog.xml")
To get an XML file one can do the following:
hog = cv2.HOGDescriptor()
hog.save("hog.xml")
and edit the respective parameter values in the XML file.
Here is a solution that uses only OpenCV:
import numpy as np
import cv2
import matplotlib.pyplot as plt
img = cv2.cvtColor(cv2.imread("/home/me/Downloads/cat.jpg"),
cv2.COLOR_BGR2GRAY)
cell_size = (8, 8) # h x w in pixels
block_size = (2, 2) # h x w in cells
nbins = 9 # number of orientation bins
# winSize is the size of the image cropped to a multiple of the cell size
hog = cv2.HOGDescriptor(_winSize=(img.shape[1] // cell_size[1] * cell_size[1],
img.shape[0] // cell_size[0] * cell_size[0]),
_blockSize=(block_size[1] * cell_size[1],
block_size[0] * cell_size[0]),
_blockStride=(cell_size[1], cell_size[0]),
_cellSize=(cell_size[1], cell_size[0]),
_nbins=nbins)
n_cells = (img.shape[0] // cell_size[0], img.shape[1] // cell_size[1])
hog_feats = hog.compute(img)\
.reshape(n_cells[1] - block_size[1] + 1,
n_cells[0] - block_size[0] + 1,
block_size[0], block_size[1], nbins) \
.transpose((1, 0, 2, 3, 4)) # index blocks by rows first
# hog_feats now contains the gradient amplitudes for each direction,
# for each cell of its group for each group. Indexing is by rows then columns.
gradients = np.zeros((n_cells[0], n_cells[1], nbins))
# count cells (border cells appear less often across overlapping groups)
cell_count = np.full((n_cells[0], n_cells[1], 1), 0, dtype=int)
for off_y in range(block_size[0]):
for off_x in range(block_size[1]):
gradients[off_y:n_cells[0] - block_size[0] + off_y + 1,
off_x:n_cells[1] - block_size[1] + off_x + 1] += \
hog_feats[:, :, off_y, off_x, :]
cell_count[off_y:n_cells[0] - block_size[0] + off_y + 1,
off_x:n_cells[1] - block_size[1] + off_x + 1] += 1
# Average gradients
gradients /= cell_count
# Preview
plt.figure()
plt.imshow(img, cmap='gray')
plt.show()
bin = 5 # angle is 360 / nbins * direction
plt.pcolor(gradients[:, :, bin])
plt.gca().invert_yaxis()
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.show()
I have used HOG descriptor computation and visualization to understand the data layout and vectorized the loops over groups.
Despite the fact that a method exists, as said in previous answers:
hog = cv2.HOGDescriptor()
I would like to post a Python implementation you can find in OpenCV's examples directory, hoping it can be useful for understanding HOG functionality:
import cv2
import numpy as np
from numpy.linalg import norm

def hog(img):
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
    mag, ang = cv2.cartToPolar(gx, gy)
    bin_n = 16  # Number of bins
    bin = np.int32(bin_n*ang/(2*np.pi))
    bin_cells = []
    mag_cells = []
    cellx = celly = 8
    for i in range(img.shape[0] // celly):
        for j in range(img.shape[1] // cellx):
            bin_cells.append(bin[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
            mag_cells.append(mag[i*celly : i*celly+celly, j*cellx : j*cellx+cellx])
    hists = [np.bincount(b.ravel(), m.ravel(), bin_n) for b, m in zip(bin_cells, mag_cells)]
    hist = np.hstack(hists)
    # transform to the Hellinger kernel
    eps = 1e-7
    hist /= hist.sum() + eps
    hist = np.sqrt(hist)
    hist /= norm(hist) + eps
    return hist
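A quick usage sketch (the filename is an assumption):

img = cv2.imread('test.jpg', 0)  # load as grayscale
features = hog(img)
print(features.shape)  # 16 bins per 8x8 cell, concatenated over all cells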
Regards.
I would disagree with the argument of peakxu. The HOG detector in the end is "just" a rigid linear filter. Any degrees of freedom in the "object" (i.e. persons) lead to blurring in the detector, and are not actually handled by it. There is an extension of this detector using latent SVMs that does explicitly handle degrees of freedom by introducing structural constraints between independent parts (i.e. head, arms, etc.) as well as allowing for multiple appearances per object (i.e. frontal people and sideways people).
Regarding the HOG detector in OpenCV: in theory you can upload another detector to be used with the features, but you cannot, AFAIK, get the features themselves. Thus, if you have a trained detector (i.e. a class-specific linear filter) you should be able to upload that into the detector to get the fast detection performance of OpenCV. That said, it should be easy to hack the OpenCV source code to provide this access and propose this patch back to the maintainers.
I would not recommend using HOG features for detecting objects other than pedestrians. In the original HOG paper by Dalal and Triggs, they specifically mention that their detector is built around pedestrian detection, allowing for significant degrees of freedom in the limbs while using strong structural hints around the human body.
Instead, try looking at OpenCV's HaarDetectObjects. You can learn how to train your own cascades here.
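For reference, a minimal sketch with the current cv2.CascadeClassifier API; the cascade file and image path are assumptions:

import cv2

# load one of the cascades that ship with opencv-python
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
img = cv2.imread('test.jpg', 0)  # hypothetical grayscale image
# returns a list of (x, y, w, h) bounding boxes
objects = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=4)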