I have an image and I need to compute a Fourier-related transform over it called the Short-Time Fourier Transform (for extra mathematical info see: http://en.wikipedia.org/wiki/Short-time_Fourier_transform).
In order to do that I need to:
(1) place a window at the starting pixel of the image (x,y)=(M/2,M/2)
(2) Truncate the image using this window
(3) Compute the FFT of the truncated image, save results.
(4) Incrementally slide the window to the right
(5) Go to step 3, until window reaches the end of the image
However, I need to perform the aforementioned calculation in real time, and at the moment it is rather slow.
Is there any way to speed up the process?
I also include my code:
height, width = final_frame.shape
M=2
for j in range(M/2, height-M/2):
    for i in range(M/2, width-M/2):
        face_win=final_frame[j-M/2:j+M/2, i-M/2:i+M/2]
        #these steps are performed in order to speed up the FFT calculation process
        height_win, width_win = face_win.shape
        fftheight=cv2.getOptimalDFTSize(height_win)
        fftwidth=cv2.getOptimalDFTSize(width_win)
        right = fftwidth - width_win
        bottom = fftheight - height_win
        bordertype = cv2.BORDER_CONSTANT
        nimg = cv2.copyMakeBorder(face_win,0,bottom,0,right,bordertype, value = 0)
        dft = cv2.dft(np.float32(face_win),flags = cv2.DFT_COMPLEX_OUTPUT)
        dft_shift = np.fft.fftshift(dft)
        magnitude_spectrum = 20*np.log(cv2.magnitude(dft_shift[:,:,0],dft_shift[:,:,1]))
Of course the bulk of your time is going to be spent in the FFTs and other transformation code, but I took a shot at easy optimizations of the other parts.
Changes
- Frame size calculations are the same on every iteration, so move them out of the loop (~nil improvement).
- Type coercion from uint8 to float32 can be done once on the whole image rather than on each window (small but measurable improvement).
- If the window size is already the same as the optimal size (I guess it always will be if you keep M as a power of 2), then don't do the bordered copy. Just use the face_win view as-is (small but measurable improvement).
Total improvement: 26s --> 22s. Not much, but there it is.
Standalone Code (just add 1024x768.jpg)
import time
import cv2
import numpy as np
# image loading for anybody else who wants to use this
final_frame = cv2.imread('1024x768.jpg')
final_frame = cv2.cvtColor(final_frame, cv2.COLOR_BGR2GRAY)
final_frame_f32 = final_frame.astype(np.float32) # moved out of the loop
# base data
M = 4
height, width = final_frame.shape
# various calculations moved out of the loop
m_half = M//2
height_win, width_win = [2 * m_half] * 2 # can you even use odd values for M?
fftheight = cv2.getOptimalDFTSize(height_win)
fftwidth = cv2.getOptimalDFTSize(width_win)
bordertype = cv2.BORDER_CONSTANT
right = fftwidth - width_win
bottom = fftheight - height_win
start = time.time()
for j in range(m_half, height-m_half):
    for i in range(m_half, width-m_half):
        face_win = final_frame_f32[j-m_half:j+m_half, i-m_half:i+m_half]
        # only copy for border if necessary
        if (fftheight, fftwidth) == (height_win, width_win):
            nimg = face_win
        else:
            nimg = cv2.copyMakeBorder(face_win, 0, bottom, 0, right, bordertype, value=0)
        dft = cv2.dft(nimg, flags=cv2.DFT_COMPLEX_OUTPUT)
        dft_shift = np.fft.fftshift(dft)
        magnitude_spectrum = 20 * np.log(cv2.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1]))
elapsed = time.time() - start
print(elapsed)
Bugs
I fixed these in the code above, but I didn't edit your original since you may have intended it to be that way:
- You calculate nimg but then pass the original face_win to the dft.
- To be explicit about integer division, I changed M/2 etc. to M//2.
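One further option, offered as a minimal sketch rather than a measured result: since every window has the same size, the per-pixel Python loop can be replaced by a single batched FFT over all windows. The sketch assumes NumPy 1.20 or later (for sliding_window_view) and uses np.fft.fft2 in place of cv2.dft. The helper name windowed_spectra and the small eps added before the logarithm are my own assumptions, not part of the code above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def windowed_spectra(frame, M, eps=1e-12):
    """Return log-magnitude spectra of every M x M window of a 2-D frame.

    Hypothetical helper: the output has shape (H-M+1, W-M+1, M, M), and
    entry [r, c] is the spectrum of frame[r:r+M, c:c+M].
    """
    frame = np.asarray(frame, dtype=np.float32)
    # A strided view of all windows; no pixel data is copied here.
    windows = sliding_window_view(frame, (M, M))
    # fft2 transforms the last two axes, i.e. every window in one call.
    spectra = np.fft.fft2(windows)
    # Centre the zero frequency within each window.
    spectra = np.fft.fftshift(spectra, axes=(-2, -1))
    # eps (an assumption) avoids log(0) for bins that are exactly zero.
    return 20 * np.log(np.abs(spectra) + eps)


rng = np.random.default_rng(0)
demo_frame = rng.integers(0, 256, size=(32, 48)).astype(np.float32)
demo_spectra = windowed_spectra(demo_frame, M=4)
print(demo_spectra.shape)  # (29, 45, 4, 4)
```

Entry [r, c] corresponds to the loop iteration with j = r + M//2 and i = c + M//2. For a full-size frame the result can be large, since every pixel gets its own M x M spectrum, so it may be necessary to process the frame in horizontal strips.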
I've spent the past week or two making a personalized Perlin noise generator (notice I said personalized because I don't want to use other generators), but I'm not a super-skilled programmer, and it's really slow. To speed it up, I've been looking into C#, because it's close to Python and Java, which are my two best languages, and it's not C. Problem is, I programmed the entire generator in Python, which is not my strongest language, and had I programmed it in Java I would've had an easier time converting it to C#.
Now I'm trying to translate my generator directly from python to C#, which I can do pretty easily for the most part, but I'm a little iffy on some stuff that my instructor coded for me. Namely, this normalize function:
# np is numpy
def normalize(img):
    img_copy=img*1.0
    img_copy-=np.min(img_copy)
    img_copy/=np.max(img_copy)
    img_copy*=255.9999
    return np.uint8(img_copy)
I don't know if C# can do this kind of near-instantaneous whole-array arithmetic without excessive for-looping, and I also don't know much about NumSharp, which is what I'd use instead of numpy.
How would I write this function in C#, and how do I use the NumSharp equivalents of the numpy functions zeros(), max(), min() and the cv2 function resize?
P.S. I have the program on repl.it if you need more context.
https://repl.it/#JoshuaFavorite/PerlinNoiseGenerator#main.py
Edit: apparently it isn't clear that my python program is fully functioning and I don't need any help with that, I need help with C#. Specifically instantaneous matrix multiplication, matrix statistics and such things that are so easily done with python.
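For porting purposes, it may help to see what normalize does element by element. The following is a minimal pure-Python sketch (no numpy) of the same arithmetic, written with explicit loops so that it maps more directly onto C# for loops. The name normalize_loops and the guard against a constant image are assumptions for illustration; the numpy version has no such guard.

```python
def normalize_loops(img):
    """Loop-based equivalent of the numpy normalize() above.

    img is a list of rows (lists of numbers); the result is a list of
    rows of ints in the range 0..255.
    """
    # Pass 1: find the smallest and largest values in the whole image.
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = hi - lo
    if span == 0:
        # Assumption: map a constant image to all zeros instead of
        # dividing by zero.
        return [[0 for _ in row] for row in img]
    # Pass 2: shift to start at 0, scale to [0, 255.9999], then truncate,
    # which is what np.uint8() does to non-negative floats in this range.
    return [[int((v - lo) / span * 255.9999) for v in row] for row in img]


print(normalize_loops([[0, 5], [10, 20]]))  # [[0, 63], [127, 255]]
```

The two passes (one for the statistics, one for the rescaling) are the same work numpy does internally, just without the compiled inner loops.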
Sorry, I do not know C#, but here is one way to generate Perlin-style noise in Python with OpenCV:
- Start with a black image
- Iterate generating a noise image at different dimensions according to power law
- Resize it
- Attenuate it
- Add it to the previous iteration
- Scale to range 0 to 255 as integer
- Save the results
import cv2
import skimage.exposure
import numpy as np
from numpy.random import default_rng
rng = default_rng()
# define argument
wd = 500 # width output
ht = 500 # height of output
base = 2 #integer>1; typically 2 to 4; frequency=base^(octave level)
startlevel = 1 #starting octave level; integer>0
endlevel = 6 #ending octave level; nominally 5 or 6
atten = 0 #persistence=1/atten; amp=persist^(j-1); atten=0 -> amp=1/j; j=1,2...; atten=2 is good, also
# compute larger of wd and ht
max_dim = max(wd,ht)
#compute start dim as base^(startlevel)
start_dim = base**startlevel
# compute end dim as base^(endlevel)
end_dim = base**endlevel
# create zero frequency black base to which to add octaves
result = np.zeros((max_dim,max_dim),dtype=np.float32)
# process octaves
j = 1
for i in range (startlevel,endlevel):
    if atten == 0:
        amp = 1/j
    else:
        amp = 1/(atten**(j-1))
    print("Processing Octave Level:",i," and Amplitude:",amp)
    # create noise image, attenuate it, combine with zero frequency initial level
    dim = base**i
    noise = rng.integers(0,255,(dim,dim),np.uint8,True).astype(np.float32)
    # resize to max_dim and attenuate
    noise = amp * cv2.resize(noise, (max_dim,max_dim), interpolation=cv2.INTER_CUBIC).astype(np.float32)
    # add to result
    result = cv2.add(result, noise)
    # stop if end_dim for given endlevel is larger than max_dim
    if end_dim > max_dim:
        break
    # increment
    i += 1
    j += 1
# scale to range 0 to 255, and crop to desired dimensions
result = skimage.exposure.rescale_intensity(result, in_range='image', out_range=(0,255)).clip(0,255).astype(np.uint8)
result = result[0:ht, 0:wd]
# save result
cv2.imwrite('perlin.jpg', result)
# show results
cv2.imshow('perlin', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Results (images not shown) for three parameter sets:
- Atten 0, Startlevel 1 and Endlevel 6
- Atten 0, Startlevel 1 and Endlevel 5
- Atten 2, Startlevel 1 and Endlevel 6
This problem is about using scipy.signal.find_peaks for extracting mean peak height from data files efficiently. I am a beginner with Python (3.7), so I am not sure if I have written my code in the most optimal way, with regard to speed and code quality.
I have a set of measurement files containing one million data points (30MB) each. The graph of this data is a signal with peaks at regular intervals, and with noise. Also, the baseline and the amplitude of the signal vary across parts of the signal. I attached an image of an example. The signal can be much less clean.
My goal is to calculate the mean height of the peaks for each file. In order to do this, first I use find_peaks to locate all the peaks. Then, I loop over each peak location and detect the peak in a small interval around the peak, to make sure I get the local height of the peak.
I then put all these heights in numpy arrays and calculate their mean and standard deviation afterwards.
Here is a bare-bones version of my code. It is a bit long, but that might also be because I am doing something wrong.
import numpy as np
from scipy.signal import find_peaks
# Allocate empty lists for values
mean_heights = []
std_heights = []
mean_baselines = []
std_baselines = []
temperatures = []
# Loop over several files, read them in and process data
for file in file_list:
    temperatures.append(file)
    # Load in data from a file of 30 MB
    t_dat, x_dat = np.loadtxt(file, delimiter='\t', unpack=True)
    # Find all peaks in this file
    peaks, peak_properties = find_peaks(x_dat, prominence=prom, width=0)
    # Calculate window size, make sure it is even
    if round(len(t_dat)/len(peaks)) % 2 == 0:
        n_points = len(t_dat) // len(peaks)
    else:
        n_points = len(t_dat) // len(peaks) + 1
    t_slice = t_dat[-1] / len(t_dat)
    # Allocate np arrays for storing heights
    baseline_list = np.zeros(len(peaks) - 2)
    height_list = np.zeros(len(peaks) - 2)
    # Loop over all found peaks, and re-detect the peak in a window around the peak to be able
    # to detect its local height without triggering to a baseline far away
    for i in range(len(peaks) - 2):
        # Making a window around a peak_properties
        sub_t = t_dat[peaks[i+1] - n_points // 2: peaks[i+1] + n_points // 2]
        sub_x = x_dat[peaks[i+1] - n_points // 2: peaks[i+1] + n_points // 2]
        # Detect the peaks (2 version, specific to the application I have)
        h_min = max(sub_x) - np.mean(sub_x)
        _, baseline_props = find_peaks(
            sub_x, prominence=h_min, distance=n_points - 1, width=0)
        _, height_props = find_peaks(np.append(
            min(sub_x) - 1, sub_x), prominence=h_min, distance=n_points - 1, width=0)
        # Add the heights to the np arrays storing the heights
        baseline_list[i] = baseline_props["prominences"]
        height_list[i] = height_props["prominences"]
    # Fill lists with values, taking the stdev and mean of the np arrays with the heights
    mean_heights.append(np.mean(height_list))
    std_heights.append(np.std(height_list))
    mean_baselines.append(np.mean(baseline_list))
    std_baselines.append(np.std(baseline_list))
It takes ~30 s to execute. Is this normal or too slow? If so, can it be optimised?
In the meantime I have improved the speed by getting rid of various inefficiencies, that I found by using the Python profiler. I will list the optimisations here ordered by significance for the speed:
Using pandas pd.read_csv() for I/O instead of np.loadtxt() cut off about 90% of the runtime. As also mentioned here, this saves a lot of time. This means changing this:
t_dat, x_dat = np.loadtxt(file, delimiter='\t', unpack=True)
to this:
data = pd.read_csv(file, delimiter = "\t", names=["t_dat", "x_dat"])
t_dat = data.values[:,0]
x_dat = data.values[:,1]
Removing redundant len() calls. I noticed that len() was called many times, and that this was unnecessary. Changing this:
if round(len(t_dat) / len(peaks)) % 2 == 0:
    n_points = int(len(t_dat) / len(peaks))
else:
    n_points = int(len(t_dat) / len(peaks) + 1)
to this:
n_points = round(len(t_dat) / len(peaks))
if n_points % 2 != 0:
    n_points += 1
also proved to be a significant improvement.
Lastly, a disproportionate share of the computational time (about 20%) was spent in the built-in Python functions min(), max() and sum(). Since I was already using numpy arrays, switching to the numpy equivalents of these functions resulted in an 84% improvement on this part. This means, for example, changing max(sub_x) to sub_x.max().
These optimisations are unrelated to each other, but I think they might be useful for Python beginners like myself, and they do help a lot.
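The last point can be checked on any machine. The following is a minimal sketch, using synthetic data rather than the measurement files, that times the built-in max() against the ndarray.max() method on the same array. The array size and repeat count are arbitrary assumptions, and the actual speed-up will depend on the system.

```python
import timeit

import numpy as np

# Synthetic stand-in for one window of the signal (size is an assumption).
rng = np.random.default_rng(0)
sub_x = rng.normal(size=100_000)

# The built-in max() iterates over the array one Python object at a time,
# while ndarray.max() runs in compiled code.
t_builtin = timeit.timeit(lambda: max(sub_x), number=5)
t_numpy = timeit.timeit(lambda: sub_x.max(), number=5)

# Both approaches return the same value; only the time differs.
same_result = max(sub_x) == sub_x.max()
print("same result:", same_result)
print("built-in max(): %.4f s, ndarray.max(): %.4f s" % (t_builtin, t_numpy))
```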
So I'm trying to remove all background from all frames of a TIFF stack. Basically, I want to fit a spline for every row for every frame.
I know there are also ways to correct for local background to reduce overhead with naive background "rings" around located samples to quickly do it for multiple frames, as well as some sort of background fitting (where the uses I've heard of are quite slow).
My version is this:
import numpy as np
import time
from scipy.interpolate import UnivariateSpline as Spline
def timeit(method):
    times = []
    def timed(*args, **kw):
        ts = time.time()
        result = method(*args, **kw)
        te = time.time()
        times.append((te - ts) * 1000)
        print('%r %2.2f ms' % (method.__name__, (te - ts) * 1000))
        return result
    return timed
# Generate something that resembles a video
img = np.random.randint(low = 0, high = 2**16, size = (10, 500, 400))
img = img/(2**16) # convert to (0,1)
@timeit
def spline_background_subtract(arr, deg, s):
    n_frames, rows, columns = arr.shape
    ix = np.arange(0, columns) # Points to evaluate spline over
    frames = []
    for f in range(n_frames):
        frame = arr[f, :, :]
        ls = [Spline(ix, frame[r, :], k = deg, s = s)(ix) for r in range(rows)] # Fit every row with a spline to determine background
        new = np.vstack(ls) # Stack all rows
        frames.append(new)
    return frames
frames = spline_background_subtract(arr = img, deg = 2, s = 1e4)
# new_video = np.reshape(np.dstack(frames), newshape = (img.shape))
This takes about 50 ms per frame on my computer, but if I have 1000 frames and 100 movies, this quickly adds up if corrections should be done in real-time.
I've tried to trim it as much as possible. Is there anywhere to gain anything, besides rewriting everything in a high performance language?
EDIT
Some testing:
scipy.interpolate.RectBivariateSpline is about twice as slow...
scipy.ndimage.gaussian_filter is about twice as fast, it seems! It does a very good job if the features to be isolated are relatively small (in my case they are), as they'll be smoothed out at smaller standard deviations (= faster computation).
Without benchmarking it myself, I would guess that this line is perhaps the culprit:
ls = [Spline(ix, frame[i, :], k = deg, s = s)(ix) for i in range(rows)]
If you could vectorize that operation you would get a speedup. You could also try something like a 2D B-spline in scipy, which could be faster.
Could also look into Cython / Numba
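To illustrate what vectorizing the row-wise fit could look like, here is a minimal sketch that swaps the smoothing spline for an ordinary least-squares polynomial of the same degree, because np.polyfit accepts many data sets at once. This is a different (simpler) background model than UnivariateSpline with a smoothing factor, so the results will not be identical. The function name poly_background is hypothetical.

```python
import numpy as np


def poly_background(arr, deg=2):
    """Fit a degree-`deg` polynomial to every row of every frame at once.

    arr has shape (frames, rows, columns); the returned background has
    the same shape. Hypothetical helper, not a drop-in spline replacement.
    """
    n_frames, n_rows, n_cols = arr.shape
    # Scale x to [0, 1] to keep the least-squares problem well conditioned.
    x = np.linspace(0.0, 1.0, n_cols)
    # np.polyfit fits each column of a 2-D y, so put one image row per column.
    y = arr.reshape(n_frames * n_rows, n_cols).T
    coeffs = np.polyfit(x, y, deg)  # shape (deg + 1, frames * rows)
    # Evaluate all polynomials at once with a Vandermonde matrix.
    background = np.vander(x, deg + 1) @ coeffs
    return background.T.reshape(arr.shape)


rng = np.random.default_rng(0)
video = rng.random((10, 500, 400))
background = poly_background(video, deg=2)
corrected = video - background
print(corrected.shape)  # (10, 500, 400)
```

All rows of all frames are fitted in one least-squares solve, so there is no Python-level loop left; whether a plain polynomial is an adequate background model depends on the data.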
The input is a spectrum with colorful (sorry) vertical lines on a black background. Given the approximate x coordinate of that band (as marked by X), I want to find the width of that band.
I am unfamiliar with image processing. Please direct me to the correct method of image processing and a Python image processing package that can do the same.
I am thinking of PIL; OpenCV gave me the impression of being overkill for this particular application.
What if I want to make this an expert system that can classify them in the future?
I'll give a complete minimal working example (as suggested by sega_sai). I don't have access to your original image, but you'll see it doesn't really matter! The peak distributions found by the code below are:
Mean values at: 26.2840960523 80.8255092125
from PIL import Image
import numpy as np
from scipy.optimize import leastsq
# Load the picture with PIL, process if needed
pic = np.asarray(Image.open("band2.png"))
# Average over the colour channels, then sum each column along the vertical axis
pic_avg = pic.mean(axis=2)
projection = pic_avg.sum(axis=0)
# Normalise by the mean, then set the min value to zero for a nice fit
projection /= projection.mean()
projection -= projection.min()
# Fit function, two gaussians, adjust as needed
def fitfunc(p,x):
    return p[0]*np.exp(-(x-p[1])**2/(2.0*p[2]**2)) + \
        p[3]*np.exp(-(x-p[4])**2/(2.0*p[5]**2))
errfunc = lambda p, x, y: fitfunc(p,x)-y
# Use scipy to fit, p0 is initial guess
p0 = np.array([0,20,1,0,75,10], dtype=float)
X = np.arange(len(projection))
p1, success = leastsq(errfunc, p0, args=(X,projection))
Y = fitfunc(p1,X)
# Output the result
print("Mean values at: ", p1[1], p1[4])
# Plot the result
from pylab import *
subplot(211)
imshow(pic)
subplot(223)
plot(projection)
subplot(224)
plot(X,Y,'r',lw=5)
show()
Below is a simple thresholding method to find the lines and their width; it should work quite reliably for any number of lines. The yellow and black image below was processed using this script; the red/black plot illustrates the found lines using the parameters threshold = 0.3, min_line_width = 5.
The script averages the columns of an image, and then determines the basic start and end positions of each line based on a threshold (which you can set between 0 and 1), and a minimum line width (in pixels). By using thresholding and minimum line width you can easily filter your input images to get the lines out of them. The first function, find_lines, returns all the lines in an image as a list of tuples containing the start, end, center, and width of each line. The second function, find_closest_band_width, is called with the specified x_position, and returns the width of the closest line to this position (assuming you want distance to centre for each line). As the lines are saturated (255 cut-off per channel), their cross-sections are not far from a uniform distribution, so I don't believe trying to fit any kind of distribution is really going to help much; it just complicates things unnecessarily.
from PIL import Image, ImageStat
def find_lines(image_file, threshold, min_line_width):
    im = Image.open(image_file)
    width, height = im.size
    hist = []
    lines = []
    start = end = 0
    for x in range(width):
        column = im.crop((x, 0, x + 1, height))
        stat = ImageStat.Stat(column)
        ## normalises by 2 * 255 as in your example the colour is yellow
        ## if your images start using white lines change this to 3 * 255
        hist.append(sum(stat.sum) / (height * 2 * 255))
    for index, value in enumerate(hist):
        if value > threshold and end >= start:
            start = index
        if value < threshold and end < start:
            if index - start < min_line_width:
                start = 0
            else:
                end = index
                center = start + (end - start) / 2.0
                width = end - start
                lines.append((start, end, center, width))
    return lines
def find_closest_band_width(x_position, lines):
    distances = [((value[2] - x_position) ** 2) for value in lines]
    index = distances.index(min(distances))
    return lines[index][3]
## set your threshold, and min_line_width for finding lines
lines = find_lines("8IxWA_sample.png", 0.7, 4)
## sets x_position to 59th pixel
print('width of nearest line:', find_closest_band_width(59, lines))
I don't think that you need anything fancy for your particular task.
I would just use PIL + scipy. That should be enough.
Essentially, you need to take your image, make a 1D projection of it,
and then fit a Gaussian or something like that to it. The information about the approximate location of the band should be used as a first guess for the fitter.
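As a rough illustration of the projection idea, here is a minimal sketch using only numpy on a synthetic image. Instead of fitting a Gaussian it measures the width of the band at half of the peak height, which is a simpler stand-in for the fit. The function name band_width_near and the half-maximum criterion are assumptions for illustration.

```python
import numpy as np


def band_width_near(image, x_guess, level=0.5):
    """Width (in pixels) of the band closest to x_guess.

    image is a 2-D grayscale array; the band is taken to be the run of
    columns whose projection exceeds `level` times the maximum.
    """
    # 1D projection: sum every column along the vertical axis.
    projection = image.sum(axis=0).astype(float)
    above = projection > level * projection.max()
    # Start from the above-threshold column nearest to the guess ...
    candidates = np.flatnonzero(above)
    start = end = candidates[np.argmin(np.abs(candidates - x_guess))]
    # ... and grow the run left and right while still above the threshold.
    while start > 0 and above[start - 1]:
        start -= 1
    while end < len(above) - 1 and above[end + 1]:
        end += 1
    return int(end - start + 1)


# Synthetic test image: black background with two bright vertical bands.
demo = np.zeros((40, 100))
demo[:, 20:30] = 255  # 10 pixels wide
demo[:, 70:75] = 255  # 5 pixels wide
print(band_width_near(demo, x_guess=27))  # 10
print(band_width_near(demo, x_guess=68))  # 5
```

For a real image loaded with PIL, the array would first need to be reduced to a single channel, for example by averaging over the colour axis.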
So I have an array (it's large - 2048x2048), and I would like to do some element wise operations dependent on where they are. I'm very confused how to do this (I was told not to use for loops, and when I tried that my IDE froze and it was going really slow).
Onto the question:
h = aperatureimage
h[:,:] = 0
indices = np.where(aperatureimage>1)
for True in h:
h[index] = np.exp(1j*k*z)*np.exp(1j*k*(x**2+y**2)/(2*z))/(1j*wave*z)
So I have an index, which is (I'm assuming here) essentially a 'cropped' version of my larger aperatureimage array. *Note: Aperature image is a grayscale image converted to an array, it has a shape or text on it, and I would like to find all the 'white' regions of the aperature and perform my operation.
How can I access the individual x/y values of index, which would allow me to perform my exponential operation? When I try index[:,None], the program spits out 'ValueError: broadcast dimensions too large'. I also get 'array is not broadcastable to correct shape'. Any help would be appreciated!
One more clarification: x and y are the only values I would like to change (essentially the points in my array where there is white, z, k, and whatever else are defined previously).
EDIT:
I'm not sure the code I posted above is correct, it returns two empty arrays. When I do this though
index = (aperatureimage==1)
print len(index)
Actually, nothing I've done so far works correctly. I have a 2048x2048 image with a 128x128 white square in the middle of it. I would like to convert this image to an array, look through all the values and determine the index values (x,y) where the array is not black (I only have white/black, bilevel image didn't work for me). I would then like to take all the values (x,y) where the array is not 0, and multiply them by the h[index] value listed above.
I can post more information if necessary. If you can't tell, I'm stuck.
EDIT2: Here's some code that might help - I think I have the problem above solved (I can now access members of the array and perform operations on them). But - for some reason the Fx values in my for loop never increase, it loops Fy forever....
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
    imsize = aperature*8 #Add 0 padding to make it nice
    middle = imsize/2 # The middle (physical 0) of our image will be the imagesize/2
    im = Image.new("L", (imsize,imsize)) #Make a grayscale image with imsize*imsize pixels
    draw = ImageDraw.Draw(im) #Create a new draw method
    box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2)) #Bounding box for aperature
    if type == 'Rectangle':
        draw.rectangle(box, fill = 'white') #Draw rectangle in the box and color it white
    del draw
    return im, middle
def Diffraction(aperaturediameter = 1, type = 'Rectangle', z = 2000000, wave = .001):
    # Constants
    deltaF = 1/8 # Image will be 8mm wide
    z = 1/3.
    wave = 0.001
    k = 2*pi/wave
    # Now let's get to work
    aperature = aperaturediameter * 128 # Aperaturediameter (in mm) to some pixels
    im, middle = createImage(aperature, type) #Create an image depending on type of aperature
    aperaturearray = np.array(im) # Turn image into numpy array
    # Fourier Transform of Aperature
    Ta = np.fft.fftshift(np.fft.fft2(aperaturearray))/(len(aperaturearray))
    # Transforming and calculating of Transfer Function Method
    H = aperaturearray.copy() # Copy image so H (transfer function) has the same dimensions as aperaturearray
    H[:,:] = 0 # Set H to 0
    U = aperaturearray.copy()
    U[:,:] = 0
    index = np.nonzero(aperaturearray) # Find nonzero elements of aperaturearray
    H[index[0],index[1]] = np.exp(1j*k*z)*np.exp(-1j*k*wave*z*((index[0]-middle)**2+(index[1]-middle)**2)) # Free space transfer for ap array
    Utfm = abs(np.fft.fftshift(np.fft.ifft2(Ta*H))) # Compute intensity at distance z
    # Fourier Integral Method
    apindex = np.nonzero(aperaturearray)
    U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
    Ufim = abs(np.fft.fftshift(np.fft.fft2(U))/len(U))
    # Save image
    fim = Image.fromarray(np.uint8(Ufim))
    fim.save("PATH\Fim.jpg")
    ftfm = Image.fromarray(np.uint8(Utfm))
    ftfm.save("PATH\FTFM.jpg")
    print "that may have worked..."
    return
if __name__ == '__main__':
    Diffraction()
You'll need numpy, scipy, and PIL to work with this code.
When I run this, it goes through the code and saves the images, but there is no data in them (everything is black). Now I have a real problem here, as I don't entirely understand the math I'm doing (this is for HW), and I don't have a firm grasp on Python.
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Should that line work for performing elementwise calculations on my array?
Could you perhaps post a minimal, yet complete, example? One that we can copy/paste and run ourselves?
In the meantime, in the first two lines of your current example:
h = aperatureimage
h[:,:] = 0
you set both 'aperatureimage' and 'h' to 0. That's probably not what you intended. You might want to consider:
h = aperatureimage.copy()
This generates a copy of aperatureimage while your code simply points h to the same array as aperatureimage. So changing one changes the other.
Be aware that copying very large arrays might cost you more memory than you would prefer.
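The difference between the two can be seen in a minimal sketch like the following, which uses a small stand-in array instead of the real image:

```python
import numpy as np

aperture = np.ones((4, 4))      # small stand-in for the real image array

alias = aperture                # just a second name for the same array
alias[:, :] = 0                 # this also zeroes `aperture`
print(aperture.sum())           # 0.0

aperture = np.ones((4, 4))
h = aperture.copy()             # independent copy with its own memory
h[:, :] = 0                     # `aperture` is left untouched
print(aperture.sum())           # 16.0
print(np.shares_memory(aperture, h))  # False
```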
What I think you are trying to do is this:
import numpy as np
N = 2048
M = 64
a = np.zeros((N, N))
a[N//2-M:N//2+M,N//2-M:N//2+M]=1
x,y = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N))
b = a.copy()
indices = np.where(a>0)
b[indices] = np.exp(x[indices]**2+y[indices]**2)
Or something similar. This, in any case, sets some values in 'b' based on the x/y coordinates where 'a' is bigger than 0. Try visualizing it with imshow. Good luck!
Concerning the edit
You should normalize your output so it fits in the 8 bit integer. Currently, one of your arrays has a maximum value much larger than 255 and one has a maximum much smaller. Try this instead:
fim = Image.fromarray(np.uint8(255*Ufim/np.amax(Ufim)))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(255*Utfm/np.amax(Utfm)))
ftfm.save("PATH\FTFM.jpg")
Also consider np.zeros_like() instead of copying and clearing H and U.
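A minimal sketch of that suggestion, assuming the aperture array is uint8, as it would be when converted from an "L" mode image. Passing dtype=complex is my own addition: H and U are assigned complex exponentials, and an integer array cannot hold the imaginary part.

```python
import numpy as np

# Stand-in for np.array(im): an 8-bit image with a white square in the middle.
aperaturearray = np.zeros((16, 16), dtype=np.uint8)
aperaturearray[6:10, 6:10] = 255

# Same shape as the image, filled with zeros, but able to store complex values.
H = np.zeros_like(aperaturearray, dtype=complex)
U = np.zeros_like(aperaturearray, dtype=complex)

index = np.nonzero(aperaturearray)
U[index] = aperaturearray[index] * np.exp(1j * np.pi / 2)

print(H.shape, H.dtype)       # (16, 16) complex128
print(np.count_nonzero(U))    # 16
```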
Finally, I personally very much like working with ipython when developing something like this. If you put the code of your Diffraction function at the top level of your script (in place of the 'if __name__ == "__main__":' block), then you can access the variables directly from ipython. A quick command like np.amax(Utfm) would show you that there are indeed values != 0. imshow() is always a nice way to look at matrices.