I want to make this code faster, as it takes ~4 miliseconds for a 1000x1000 image with a window size of 10x10.
import numpy
import scipy.misc
from matplotlib import pyplot as plt
import time
def corr(a, b):
'''finds the correlation of 2 intensities'''
return (sum2(a,b)/(sum2(a,a)*sum2(b,b))**0.5)
def sum2(a,b):
s = 0
for x in range(len(a)):
s += a[x]*b[x]
return s
##the commented code displays the images
##plt.ion()
def find_same(img1,img2,startx,width,starty,hight):
'''inputs 2 filenames, startx, width of search window, and hight.'''
crop_img = img1[starty:(starty+hight),startx:(startx+width)]
plt.imshow(crop_img,interpolation='nearest')
plt.draw()
a = []
for x in numpy.nditer(crop_img): #converting image to array of intesities
a.append(float(x))
mcfinder = []
for x in range(img2.shape[1]-width):
finder = img2[starty:(starty+hight),x:(x+width)]
b = []
for y in numpy.nditer(finder):
b.append(float(y))
correl = corr(a,b) #find correlation
mcfinder.append(correl)
maxim = max(mcfinder)
place = mcfinder.index(maxim)
finder = img2[starty:(starty+hight),place:(place+width)]
## plt.imshow(finder,interpolation='nearest')
## plt.draw()
## time.sleep(1)
## plt.close()
return maxim,place
img1 = scipy.misc.imread('me1.bmp')
img2 = scipy.misc.imread('me2.bmp')
starttime = time.clock()
print find_same(img1,img2,210,40,200,40)
endtime = time.clock()
print endtime-starttime
Are there any ways to make this faster? Or am I doing this fundamentally wrong?
Please let me know. To run this you need matplotlib, scipy, and numpy.
[I lack the reputation to post this as a comment]
As mentioned in #cel's comment, you should vectorize your code by using only numpy operations instead of loops over lists.
It seems you are trying to do some template matching, did you have a look at the example for skimage.feature.match_template() from the scikit-image documentation? scikit-image also provides windowed views (skimage.util.view-as-windows()) of a numpy array which is very handy, when you are analyzing a numpy array block-by-block.
If you don't want to add another dependency you should use Scipy's special functions to compute the correlation for you, e.g. scipy.ndimage.filters.correlate() (also have a look at the other functions in scipy.ndimage.filter).
Related
I have a 3D data cube of values of size on the order of 10,000x512x512. I want to parse a window of vectors (say 6) along dim[0] repeatedly and generate the fourier transforms efficiently. I think I'm doing an array copy into the pyfftw package and it's giving me massive overhead. I'm going over the documentation now since I think there is an option I need to set, but I could use some extra help on the syntax.
This code was originally written by another person with numpy.fft.rfft and accelerated with numba. But the implementation wasn't working on my workstation so I re-wrote everything and opted to go for pyfftw instead.
import numpy as np
import pyfftw as ftw
from tkinter import simpledialog
from math import ceil
import multiprocessing
ftw.config.NUM_THREADS = multiprocessing.cpu_count()
ftw.interfaces.cache.enable()
def runme():
# normally I would load a file, but for Stack Overflow, I'm just going to generate a 3D data cube so I'll delete references to the binary saving/loading functions:
# load the file
dataChunk = np.random.random((1000,512,512))
numFrames = dataChunk.shape[0]
# select the window size
windowSize = int(simpledialog.askstring('Window Size',
'How many frames to demodulate a single time point?'))
numChannels = windowSize//2+1
# create fftw arrays
ftwIn = ftw.empty_aligned(windowSize, dtype='complex128')
ftwOut = ftw.empty_aligned(windowSize, dtype='complex128')
fftObject = ftw.FFTW(ftwIn,ftwOut)
# perform DFT on the data chunk
demodFrames = dataChunk.shape[0]//windowSize
channelChunks = np.zeros([numChannels,demodFrames,
dataChunk.shape[1],dataChunk.shape[2]])
channelChunks = getDFT(dataChunk,channelChunks,
ftwIn,ftwOut,fftObject,windowSize,numChannels)
return channelChunks
def getDFT(data,channelOut,ftwIn,ftwOut,fftObject,
windowSize,numChannels):
frameLen = data.shape[0]
demodFrames = frameLen//windowSize
for yy in range(data.shape[1]):
for xx in range(data.shape[2]):
index = 0
for i in range(0,frameLen-windowSize+1,windowSize):
ftwIn[:] = data[i:i+windowSize,yy,xx]
fftObject()
channelOut[:,index,yy,xx] = 2*np.abs(ftwOut[:numChannels])/windowSize
index+=1
return channelOut
if __name__ == '__main__':
runme()
What happens is I get a 4D array; the variable channelChunks. I am saving out each channel to a binary (not included in the code above, but the saving part works fine).
This process is for a demodulation project we have, the 4D data cube channelChunks is then parsed into eval(numChannel) 3D data cubes (movies) and from that we are able to separate a movie by color given our experimental set up. I was hoping I could circumvent writing a C++ function that calls the fft on the matrix via pyfftw.
Effectively, I am taking windowSize=6 elements along the 0 axis of dataChunk at a given index of 1 and 2 axis and performing a 1D FFT. I need to do this throughout the entire 3D volume of dataChunk to generate the demodulated movies. Thanks.
The FFTW advanced plans can be automatically built by pyfftw.
The code could be modified in the following way:
Real to complex transforms can be used instead of complex to complex transform.
Using pyfftw, it typically writes:
ftwIn = ftw.empty_aligned(windowSize, dtype='float64')
ftwOut = ftw.empty_aligned(windowSize//2+1, dtype='complex128')
fftObject = ftw.FFTW(ftwIn,ftwOut)
Add a few flags to the FFTW planner. For instance, FFTW_MEASURE will time different algorithms and pick the best. FFTW_DESTROY_INPUT signals that the input array can be modified: some implementations tricks can be used.
fftObject = ftw.FFTW(ftwIn,ftwOut, flags=('FFTW_MEASURE','FFTW_DESTROY_INPUT',))
Limit the number of divisions. A division costs more than a multiplication.
scale=1.0/windowSize
for ...
for ...
2*np.abs(ftwOut[:,:,:])*scale #instead of /windowSize
Avoid multiple for loops by making use of FFTW advanced plan through pyfftw.
nbwindow=numFrames//windowSize
# create fftw arrays
ftwIn = ftw.empty_aligned((nbwindow,windowSize,dataChunk.shape[2]), dtype='float64')
ftwOut = ftw.empty_aligned((nbwindow,windowSize//2+1,dataChunk.shape[2]), dtype='complex128')
fftObject = ftw.FFTW(ftwIn,ftwOut, axes=(1,), flags=('FFTW_MEASURE','FFTW_DESTROY_INPUT',))
...
for yy in range(data.shape[1]):
ftwIn[:] = np.reshape(data[0:nbwindow*windowSize,yy,:],(nbwindow,windowSize,data.shape[2]),order='C')
fftObject()
channelOut[:,:,yy,:]=np.transpose(2*np.abs(ftwOut[:,:,:])*scale, (1,0,2))
Here is the modifed code. I also, decreased the number of frame to 100, set the seed of the random generator to check that the outcome is not modifed and commented tkinter. The size of the window can be set to a power of two, or a number made by multiplying 2,3,5 or 7, so that the Cooley-Tuckey algorithm can be efficiently applied. Avoid large prime numbers.
import numpy as np
import pyfftw as ftw
#from tkinter import simpledialog
from math import ceil
import multiprocessing
import time
ftw.config.NUM_THREADS = multiprocessing.cpu_count()
ftw.interfaces.cache.enable()
ftw.config.PLANNER_EFFORT = 'FFTW_MEASURE'
def runme():
# normally I would load a file, but for Stack Overflow, I'm just going to generate a 3D data cube so I'll delete references to the binary saving/loading functions:
# load the file
np.random.seed(seed=42)
dataChunk = np.random.random((100,512,512))
numFrames = dataChunk.shape[0]
# select the window size
#windowSize = int(simpledialog.askstring('Window Size',
# 'How many frames to demodulate a single time point?'))
windowSize=32
numChannels = windowSize//2+1
nbwindow=numFrames//windowSize
# create fftw arrays
ftwIn = ftw.empty_aligned((nbwindow,windowSize,dataChunk.shape[2]), dtype='float64')
ftwOut = ftw.empty_aligned((nbwindow,windowSize//2+1,dataChunk.shape[2]), dtype='complex128')
#ftwIn = ftw.empty_aligned(windowSize, dtype='complex128')
#ftwOut = ftw.empty_aligned(windowSize, dtype='complex128')
fftObject = ftw.FFTW(ftwIn,ftwOut, axes=(1,), flags=('FFTW_MEASURE','FFTW_DESTROY_INPUT',))
# perform DFT on the data chunk
demodFrames = dataChunk.shape[0]//windowSize
channelChunks = np.zeros([numChannels,demodFrames,
dataChunk.shape[1],dataChunk.shape[2]])
channelChunks = getDFT(dataChunk,channelChunks,
ftwIn,ftwOut,fftObject,windowSize,numChannels)
return channelChunks
def getDFT(data,channelOut,ftwIn,ftwOut,fftObject,
windowSize,numChannels):
frameLen = data.shape[0]
demodFrames = frameLen//windowSize
printed=0
nbwindow=data.shape[0]//windowSize
scale=1.0/windowSize
for yy in range(data.shape[1]):
#for xx in range(data.shape[2]):
index = 0
ftwIn[:] = np.reshape(data[0:nbwindow*windowSize,yy,:],(nbwindow,windowSize,data.shape[2]),order='C')
fftObject()
channelOut[:,:,yy,:]=np.transpose(2*np.abs(ftwOut[:,:,:])*scale, (1,0,2))
#for i in range(nbwindow):
#channelOut[:,i,yy,xx] = 2*np.abs(ftwOut[i,:])*scale
if printed==0:
for j in range(channelOut.shape[0]):
print j,channelOut[j,0,yy,0]
printed=1
return channelOut
if __name__ == '__main__':
seconds=time.time()
runme()
print "time: ", time.time()-seconds
Let us know how much it speeds up your computations! I went from 24s to less than 2s on my computer...
I've got a high-resolution healpix map (nside = 4096) that I want to smooth in disks of a given radius, let's say 10 arcmin.
Being very new to healpy and having read the documentation I found that one - not so good - way to do this was to perform a "cone search", that is to find around each pixels the ones inside the disk, average them and give this new value to the pixel at the center. However this is very time-consuming.
import numpy as np
import healpy as hp
kappa = hp.read_map("zs_1.0334.fits") #Reading my file
NSIDE = 4096
t = 0.00290888 #10 arcmin
new_array = []
n = len(kappa)
for i in range(n):
a = hp.query_disc(NSIDE,hp.pix2vec(NSIDE,i),t)
new_array.append(np.mean(kappa[a]))
I think the healpy.sphtfunc.smoothing function could be of some help as it states that you can enter any custom beam window function but I don't understand how this works at all...
Thanks a lot for your help !
As suggested, I can easily make use of the healpy.sphtfunc.smoothing function by specifying a custom (circular) beam window.
To compute the beam window, which was my problem, healpy.sphtfunc.beam2bl is very useful and simple in the case of a top-hat.
The appropriated l_max would roughly be 2*Nside but it can be smaller depending on specific maps. One could for example compute the angular power-spectra (the Cls) and check if it dampens for smaller l than l_max which could help gain some more time.
Thanks a lot to everyone who helped in the comments section!
since I spent a certain amount of time trying to figure out how the function smoothing was working. There is a bit of code that allows you to do a top_hat smoothing.
Cheers,
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
def top_hat(b, radius):
return np.where(abs(b)<=radius, 1, 0)
nside = 128
npix = hp.nside2npix(nside)
#create a empy map
tst_map = np.zeros(npix)
#put a source in the middle of the map with value = 100
pix = hp.ang2pix(nside, np.pi/2, 0)
tst_map[pix] = 100
#Compute the window function in the harmonic spherical space which will smooth the map.
b = np.linspace(0,np.pi,10000)
bw = top_hat(b, np.radians(45)) #top_hat function of radius 45°
beam = hp.sphtfunc.beam2bl(bw, b, nside*3)
#Smooth map
tst_map_smoothed = hp.smoothing(tst_map, beam_window=beam)
hp.mollview(tst_map_smoothed)
plt.show()
So far I found 4 ways to find peaks in Python, however none of them can specify the number of peaks like Matlab does. Can someone provide some insight?
import scipy.signal as sg
import numpy as np
# Method 1
sg.find_peaks_cwt(vector, np.arange(1,4),max_distances=np.arange(1, 4)*2)
# Method 2
sg.argrelextrema(np.array(vector),comparator=np.greater,order=2)
# Method 3
sg.find_peaks(vector, height=7, distance=2.1)
# Method 4
detect_peaks.detect_peaks(vector, mph=7, mpd=2)`
Below is the Matlab code that I want to emulate:
[pks,locs] = findpeaks(data,'Npeaks',n)
If you want the exact function Matlab has, why not just use that function? If you have the rest of your data in Python, then you can just use the module provided by Matlab.
import matlab.engine #import matlab engine
eng = matlab.engine.start_matlab() #Start matlab engine
a = a = [(0.1*i)*(0.1*i-1)*(0.1*i-2) for i in range(50)] #Create some data with peaks
b = eng.findpeaks(matlab.double(a),'Npeaks',1) #Find 1 peak
Try the findpeaks library. Multiple methods are available for the detections of peaks and valleys in 1D-vectors and 2D-arrays (images).
pip install findpeaks
Lets create some peaks:
i = 10000
xs = np.linspace(0,3.7*np.pi,i)
X = (0.3*np.sin(xs) + np.sin(1.3 * xs) + 0.9 * np.sin(4.2 * xs) + 0.06 *
np.random.randn(i))
# import library
from findpeaks import findpeaks
# Initialize
fp = findpeaks()
# Find the peaks (high/low)
results = fp.fit(X)
# Make plot
fp.plot()
# Some of the results:
results['df']
I'm trying to improve the speed of a function that calculates the normalized cross-correlation between a search image and a template image by using the anfft module, which provides Python bindings for the FFTW C library and seems to be ~2-3x quicker than scipy.fftpack for my purposes.
When I take the FFT of my template, I need the result to be padded to the same size as my search image so that I can convolve them. Using scipy.fftpack.fftn I would just use the shape parameter to do padding/truncation, but anfft.fftn is more minimalistic and doesn't do any zero-padding itself.
When I try and do the zero padding myself, I get a very different result to what I get using shape. This example uses just scipy.fftpack, but I have the same problem with anfft:
import numpy as np
from scipy.fftpack import fftn
from scipy.misc import lena
img = lena()
temp = img[240:281,240:281]
def procrustes(a,target,padval=0):
# Forces an array to a target size by either padding it with a constant or
# truncating it
b = np.ones(target,a.dtype)*padval
aind = [slice(None,None)]*a.ndim
bind = [slice(None,None)]*a.ndim
for dd in xrange(a.ndim):
if a.shape[dd] > target[dd]:
diff = (a.shape[dd]-b.shape[dd])/2.
aind[dd] = slice(np.floor(diff),a.shape[dd]-np.ceil(diff))
elif a.shape[dd] < target[dd]:
diff = (b.shape[dd]-a.shape[dd])/2.
bind[dd] = slice(np.floor(diff),b.shape[dd]-np.ceil(diff))
b[bind] = a[aind]
return b
# using scipy.fftpack.fftn's shape parameter
F1 = fftn(temp,shape=img.shape)
# doing my own zero-padding
temp_padded = procrustes(temp,img.shape)
F2 = fftn(temp_padded)
# these results are quite different
np.allclose(F1,F2)
I suspect I'm probably making a very basic mistake, since I'm not overly familiar with the discrete Fourier transform.
Just do the inverse transform and you'll see that scipy does slightly different padding (only to top and right edges):
plt.imshow(ifftn(fftn(procrustes(temp,img.shape))).real)
plt.imshow(ifftn(fftn(temp,shape=img.shape)).real)
So I have an array (it's large - 2048x2048), and I would like to do some element wise operations dependent on where they are. I'm very confused how to do this (I was told not to use for loops, and when I tried that my IDE froze and it was going really slow).
Onto the question:
h = aperatureimage
h[:,:] = 0
indices = np.where(aperatureimage>1)
for True in h:
h[index] = np.exp(1j*k*z)*np.exp(1j*k*(x**2+y**2)/(2*z))/(1j*wave*z)
So I have an index, which is (I'm assuming here) essentially a 'cropped' version of my larger aperatureimage array. *Note: Aperature image is a grayscale image converted to an array, it has a shape or text on it, and I would like to find all the 'white' regions of the aperature and perform my operation.
How can I access the individual x/y values of index which will allow me to perform my exponential operation? When I try index[:,None], leads to the program spitting out 'ValueError: broadcast dimensions too large'. I also get array is not broadcastable to correct shape. Any help would be appreciated!
One more clarification: x and y are the only values I would like to change (essentially the points in my array where there is white, z, k, and whatever else are defined previously).
EDIT:
I'm not sure the code I posted above is correct, it returns two empty arrays. When I do this though
index = (aperatureimage==1)
print len(index)
Actually, nothing I've done so far works correctly. I have a 2048x2048 image with a 128x128 white square in the middle of it. I would like to convert this image to an array, look through all the values and determine the index values (x,y) where the array is not black (I only have white/black, bilevel image didn't work for me). I would then like to take all the values (x,y) where the array is not 0, and multiply them by the h[index] value listed above.
I can post more information if necessary. If you can't tell, I'm stuck.
EDIT2: Here's some code that might help - I think I have the problem above solved (I can now access members of the array and perform operations on them). But - for some reason the Fx values in my for loop never increase, it loops Fy forever....
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
imsize = aperature*8
middle = imsize/2
im = Image.new("L", (imsize,imsize))
draw = ImageDraw.Draw(im)
box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2))
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
imsize = aperature*8 #Add 0 padding to make it nice
middle = imsize/2 # The middle (physical 0) of our image will be the imagesize/2
im = Image.new("L", (imsize,imsize)) #Make a grayscale image with imsize*imsize pixels
draw = ImageDraw.Draw(im) #Create a new draw method
box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2)) #Bounding box for aperature
if type == 'Rectangle':
draw.rectangle(box, fill = 'white') #Draw rectangle in the box and color it white
del draw
return im, middle
def Diffraction(aperaturediameter = 1, type = 'Rectangle', z = 2000000, wave = .001):
# Constants
deltaF = 1/8 # Image will be 8mm wide
z = 1/3.
wave = 0.001
k = 2*pi/wave
# Now let's get to work
aperature = aperaturediameter * 128 # Aperaturediameter (in mm) to some pixels
im, middle = createImage(aperature, type) #Create an image depending on type of aperature
aperaturearray = np.array(im) # Turn image into numpy array
# Fourier Transform of Aperature
Ta = np.fft.fftshift(np.fft.fft2(aperaturearray))/(len(aperaturearray))
# Transforming and calculating of Transfer Function Method
H = aperaturearray.copy() # Copy image so H (transfer function) has the same dimensions as aperaturearray
H[:,:] = 0 # Set H to 0
U = aperaturearray.copy()
U[:,:] = 0
index = np.nonzero(aperaturearray) # Find nonzero elements of aperaturearray
H[index[0],index[1]] = np.exp(1j*k*z)*np.exp(-1j*k*wave*z*((index[0]-middle)**2+(index[1]-middle)**2)) # Free space transfer for ap array
Utfm = abs(np.fft.fftshift(np.fft.ifft2(Ta*H))) # Compute intensity at distance z
# Fourier Integral Method
apindex = np.nonzero(aperaturearray)
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Ufim = abs(np.fft.fftshift(np.fft.fft2(U))/len(U))
# Save image
fim = Image.fromarray(np.uint8(Ufim))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(Utfm))
ftfm.save("PATH\FTFM.jpg")
print "that may have worked..."
return
if __name__ == '__main__':
Diffraction()
You'll need numpy, scipy, and PIL to work with this code.
When I run this, it goes through the code, but there is no data in them (everything is black). Now I have a real problem here as I don't entirely understand the math I'm doing (this is for HW), and I don't have a firm grasp on Python.
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Should that line work for performing elementwise calculations on my array?
Could you perhaps post a minimal, yet complete, example? One that we can copy/paste and run ourselves?
In the meantime, in the first two lines of your current example:
h = aperatureimage
h[:,:] = 0
you set both 'aperatureimage' and 'h' to 0. That's probably not what you intended. You might want to consider:
h = aperatureimage.copy()
This generates a copy of aperatureimage while your code simply points h to the same array as aperatureimage. So changing one changes the other.
Be aware, copying very large arrays might cost you more memory then you would prefer.
What I think you are trying to do is this:
import numpy as np
N = 2048
M = 64
a = np.zeros((N, N))
a[N/2-M:N/2+M,N/2-M:N/2+M]=1
x,y = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N))
b = a.copy()
indices = np.where(a>0)
b[indices] = np.exp(x[indices]**2+y[indices]**2)
Or something similar. This, in any case, sets some values in 'b' based on the x/y coordinates where 'a' is bigger than 0. Try visualizing it with imshow. Good luck!
Concerning the edit
You should normalize your output so it fits in the 8 bit integer. Currently, one of your arrays has a maximum value much larger than 255 and one has a maximum much smaller. Try this instead:
fim = Image.fromarray(np.uint8(255*Ufim/np.amax(Ufim)))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(255*Utfm/np.amax(Utfm)))
ftfm.save("PATH\FTFM.jpg")
Also consider np.zeros_like() instead of copying and clearing H and U.
Finally, I personally very much like working with ipython when developing something like this. If you put the code in your Diffraction function in the top level of your script (in place of 'if __ name __ &c.'), then you can access the variables directly from ipython. A quick command like np.amax(Utfm) would show you that there are indeed values!=0. imshow() is always nice to look at matrices.