Interpolating a function over a grid with different input sizes - python

I have a function f(u,v,w) which I would like to interpolate using a scipy function (with linear interpolation). This is easy enough.
When I run the interpolation step, I simply do the following (interpolating over a u,v,w grid):
import numpy as np

u = np.linspace(-1, 1, 100)
v = np.linspace(-2, 2, 50)
w = np.linspace(3, 8, 30)
values_grid = np.zeros((len(u), len(v), len(w)))
for i in range(len(u)):
    for j in range(len(v)):
        for k in range(len(w)):
            values_grid[i, j, k] = f(u[i], v[j], w[k])
from scipy.interpolate import RegularGridInterpolator

my_interpolating_function = RegularGridInterpolator(
    (u, v, w), values_grid, method='linear', bounds_error=False, fill_value=-999)
This is fine for many cases. However, when I want to evaluate this interpolating function, it seems I am required to use inputs of shape (number of input samples) x (dimension of the samples), e.g.:
func_input = np.vstack([u_samps, v_samps, w_samps]).T  # e.g. shape (500, 3)
output = my_interpolating_function(func_input)  # output has shape (500,)
This works fine. The issue is that I would like to evaluate this function over a grid where the samples have the following shapes:
u_samps.shape = (500,)
v_samps.shape = (100, 100)
w_samps.shape = (100, 100)
Meaning I would like to evaluate
my_interpolating_function([u_samps, v_samps, w_samps])
and get out an array of shape (500, 100, 100), i.e. the interpolation evaluated for all 500 u_samps over the v_samps and w_samps grids. I can flatten the v_samps and w_samps arrays, but then I have to make several hundred copies of u_samps to get the inputs into the correct format. So is there any way to have an interpolating function that takes the inputs above (u_samps, v_samps, w_samps with the specified shapes) and returns an array of shape (500, 100, 100) efficiently?
Any help greatly appreciated; I have been stuck on this problem and it's really holding up my progress! The end goal is to use this function in a statistical likelihood that needs to be sampled with MCMC, so speed is pretty important (and making hundreds of copies of massive arrays is very slow).
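For reference, a minimal sketch of one way this can be done (my addition, not part of the original post): RegularGridInterpolator accepts a point array whose last axis is the dimensionality, so the inputs can be combined with broadcasting, and the data is copied only once, at the np.stack step.
import numpy as np

# u_samps: (500,), v_samps and w_samps: (100, 100), as in the question
U, V, W = np.broadcast_arrays(
    u_samps[:, None, None],   # (500, 1, 1)
    v_samps[None, :, :],      # (1, 100, 100)
    w_samps[None, :, :],      # (1, 100, 100)
)
pts = np.stack([U, V, W], axis=-1)       # (500, 100, 100, 3)
output = my_interpolating_function(pts)  # shape (500, 100, 100)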

Related

interpolate / downsample 2D array in Python

I have 2 separate arrays with different sizes:
len(range_data) = 4320
len(az1) = 385
len(az2) = 347
data1.shape = (385,4320)
data2.shape = (347,4320)
I would like for the dimensions of data2 to equal that of data1, such that data2.shape should be (385,4320). I have tried scipy interpolate such as:
from scipy import interpolate

f = interpolate.interp2d(az1, range_data, data1, kind='cubic')
znew = f(az2, range_data)
print(znew.shape)
# (347, 4320)
znew.shape should be (385, 4320). Any ideas why this is happening and/or what might need to be done to fix it?
I don't think that interp2d actually generates more points for you; it defines an interpolation function over a grid. That means that what you've created is a way to interpolate points within the grid defined by your first set of data points. znew will return an interpolated grid with the same number of values as the x and y passed to it.
See the source code.
Returns
-------
z : 2-D array with shape (len(y), len(x))
The interpolated values.
If you want to add extra data points, I would suggest deriving a regression function (or whatever ML technique you want, NNs if you're so inclined) on the second data set and using that function to produce the extra 38 data points you need.
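That said, if the goal is just to resample data2 onto the az1 grid, here is a minimal sketch of an interpolation-based alternative (my suggestion, not part of the original answer). Note that interp2d expects z with shape (len(y), len(x)), and that it is deprecated in recent SciPy releases:
from scipy import interpolate

# data2 has shape (347, 4320) = (len(az2), len(range_data))
f2 = interpolate.interp2d(range_data, az2, data2, kind='cubic')
znew2 = f2(range_data, az1)  # evaluate on the az1 grid instead
print(znew2.shape)
# (385, 4320)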

How to calculate eigenfaces in python?

I'm trying to calculate eigenfaces for a set of images using python.
First I turn each image into a vector using:
list(map(lambda img: img.flatten(), x))  # flatten each image into a 1-D vector
Then I calculate the covariance matrix (after removing the mean from all data):
# x is a numpy array
x = x - mean_image
cov_matrix = np.cov(x.T)
Then I calculate eigenvalues and eigenvectors:
eigen_values, eigen_vectors = np.linalg.eig(cov_matrix)
The results are vectors with complex numbers, so I only keep the real part to be able to show them:
eigen_vectors = np.real(eigen_vectors)
After trying to show the eigenfaces (eigenvectors), the result is not even close to what an eigenface looks like:
I have managed to get a list of eigenfaces using np.linalg.svd(); however, I'm curious why my code does not work and how I can change it so it works as expected.
To stop np.linalg.eig returning complex results I reduced the size of the images, so it no longer returns complex numbers; however, my eigenvectors still don't look like eigenfaces:
proj_data = np.dot(x.transpose(), eigen_vectors).T
img = proj_data[i].reshape(height,width)
This will give you the expected result. After calculating the eigenvectors you should transpose them; otherwise you will get mixed images.
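As a side note (my addition, not part of the original answer): the covariance matrix is symmetric, so np.linalg.eigh can be used instead of np.linalg.eig. It returns real eigenvalues and eigenvectors (sorted in ascending order of eigenvalue), which avoids the complex-number issue entirely. A minimal sketch, assuming x is the mean-subtracted (n_samples, n_pixels) data matrix:
import numpy as np

cov_matrix = np.cov(x.T)  # (n_pixels, n_pixels), symmetric
eigen_values, eigen_vectors = np.linalg.eigh(cov_matrix)  # real output, ascending eigenvalues
top_eigenface = eigen_vectors[:, -1].reshape(height, width)  # largest eigenvalue is last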

Efficient 2D cross correlation in Python?

I have two arrays of size (n, m, m) (n images of size (m, m)). I want to perform a cross-correlation between each corresponding pair of images in the two arrays.
Example: for n = 1, corr2d([m,m]_1, [m,m]_2).
My current approach includes a bunch of for loops in Python:
for i in range(len(X)):
    X_co = X[i, 0, :, :] / np.max(X[i, 0, :, :])
    X_x = X[i, 1, :, :] / np.max(X[i, 1, :, :])
    autocorr[i, 0, :, :] = correlate2d(X_co, X_x, mode='same', boundary='fill', fillvalue=0)
Obviously this is very slow when the input contains many images, and it becomes a substantial part of the total run time if (m,m) << n.
The obvious optimization is to skip the loop and feed everything directly to the compiled correlation function. Currently I'm using scipy's correlate2d.
I've looked around but haven't found any function that allows correlation along some axis or multiple inputs.
Any tips on how to make scipy's correlate2d work or alternatives?
I decided to implement it via the FFT instead.
def fft_xcorr2D(x):
    # Over axes (-2,-1) (default in the fft2 function)
    ## Pad because of the cyclic (circular) behavior of the FFT
    x = np.fft.fft2(np.pad(x, ([0,0], [0,0], [0,34], [0,34]), mode='constant'))
    # Conjugate for correlation, not convolution (convolution theorem)
    x[:, 1, :, :] = np.conj(x[:, 1, :, :])
    # Over axes (-2,-1) (default in the ifft2 function)
    ## Multiply elementwise over the 2nd axis (2 image bands for me)
    ### fftshift over rows and columns of each image
    corr = np.fft.fftshift(np.fft.ifft2(np.prod(x, axis=1)), axes=(-2,-1))
    # Return after removing padding
    return np.abs(corr)[:, 3:-2, 3:-2]
Call via:
ts=fft_xcorr2D(X)
If anybody wants to use it:
My input is a 4D array: (N, 2, #Rows, #Cols)
E.g. (500, 2, 30, 30): 500 images, 2 bands (polarizations, for example), of 30x30 pixels
If your input is different, adjust the padding to your liking
Check that your input order is the same as mine; otherwise change the axes arguments in the fft2 and ifft2 calls, in np.prod, and in fftshift. I use fftshift to get the maximum value in the middle (otherwise it ends up in the corners), so be wary of that if it's not what you want.
Why is it the maximum value? Technically, it doesn't have to be, but for my purpose it is. fftshift is used to get a correlation that looks like you're used to. Otherwise, the quadrants are turned "inside out". If you wonder what I mean, remove fftshift (just the fftshift part, not its arguments), call the function as before, and plot it.
Afterwards, it should be ready to use.
Possibly x.prod(axis=1) is faster than np.prod(x, axis=1), but it showed no improvement when I tried it.
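As an alternative (my addition, not part of the original answer): recent SciPy versions let scipy.signal.fftconvolve operate over selected axes, so the whole batch can be cross-correlated in one compiled call by flipping and conjugating the second input (cross-correlation as convolution with a flipped kernel). A sketch, assuming two stacks a and b of shape (n, m, m):
import numpy as np
from scipy.signal import fftconvolve

def batched_xcorr2d(a, b):
    # corr(a, b) = conv(a, conj(b flipped along both image axes))
    return fftconvolve(a, b[..., ::-1, ::-1].conj(), mode='same', axes=(-2, -1))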

ValueError: object too deep for desired array in optimize.curve_fit

I am trying to fit a kinetic model for the population growth and decay of four species A, B, C, D in a chemical system, solving the following set of equations, which I have attached in matrix form:
Matrix form of equations
where t is a time step and k1,k2,k3 are constants in an exponential function. I want to fit curves based on these equations to solve for k1,k2, and k3 given my populations of A,B,C,D.
For this I am using optimize.curve_fit, where t is the time in a (1000,) array, X is a (4, 1000) matrix, and u and w are the two matrices:
import numpy as np
from scipy import optimize

def func(t, X, k1, k2, k3):
    u = np.array([[1, 0, 0],
                  [-k1/(k1+k2-k3), k1/(k1+k2-k3), 0],
                  [(k1*k3)/((k1+k2-k3)*(k1+k2)), -k1/(k1+k2*k3), k1/(k1+k2)],
                  [-k2/(k1+k2), 0, k2/(k2+k1)]], dtype=float)
    w = np.array([[np.exp(-t*(k1+k2))],
                  [np.exp(-t*k3)],
                  [1]])
    return X*np.dot(u, w)
X = np.array([A,B,C,D]) # A,B,C,D are (1000,) arrays
# X.shape = (4, 1000)
# t.shape = (1000,)
optimize.curve_fit(func,t,X,method='lm')
When I run this piece of code, I get the following output:
ValueError: object too deep for desired array
error: Result from function call is not a proper array of floats.
I have seen in a similar post that the shape of the arrays is important, but as far as I can tell these are correct.
Could anyone suggest where the problem may be in this code and how I can best go about solving for k1,k2,k3 using the curve fit function?
Thanks
As I mentioned in my comment, you don't need to pass X to func. @WarrenWeckesser briefly explains why. So here is how func should be:
def func(t, k1, k2, k3):
    u = np.array([[1, 0, 0],
                  [-k1/(k1+k2-k3), k1/(k1+k2-k3), 0],
                  [(k1*k3)/((k1+k2-k3)*(k1+k2)), -k1/(k1+k2*k3), k1/(k1+k2)],
                  [-k2/(k1+k2), 0, k2/(k2+k1)]], dtype=float)
    w = np.array([np.exp(-t*(k1+k2)),
                  np.exp(-t*k3),
                  np.ones_like(t)])  # must match shapes with above
    return np.dot(u, w).flatten()
The output at the end is flattened because otherwise it would give an error with curve_fit. Now we test it:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

t = np.arange(1000)*0.01
data = func(t, *[0.5, 2, 1])
data += np.random.normal(size=data.shape)*0.01  # add some noise
po, pcov = curve_fit(func, t, data.flatten(), method='lm')  # data must also be flattened
print(po)
# [ 0.50036411  2.00393807  0.99694513]
plt.plot(t, data.reshape(4, -1).T, t, func(t, *po).reshape(4, -1).T)
The optimised values are pretty close to the original ones, and the fit looks good.

Optimize 4D Numpy array construction

I have a 4D array data of shape (50,8,2048,256), which represents 50 groups each containing 8 images of 2048x256 pixels. times is an array of shape (50,8) giving the time that each image was taken.
I calculate a 1st order polynomial fit at each pixel for all images in each group, giving me an array of shape (50,2048,256,2). This is essentially a vector plot for each of the 50 groups. The code I use to store the polynomials is:
fits = np.ones((50, 2048, 256, 2))
times = times.reshape(50, 8, 1).repeat(2048, 2).reshape(50, 8, 2048, 1).repeat(256, 3)
for group in range(50):
    for xpos in range(2048):
        for ypos in range(256):
            fits[group, xpos, ypos, :] = np.polyfit(times[group, :, xpos, ypos],
                                                    data[group, :, xpos, ypos], 1)
Now the challenge is that I want to generate an array new_data of shape (50,12,2048,256), where I use the polynomial coefficients from fits and the times from new_times to generate 50 groups of 12 images.
I figure I can use something like np.polyval(fits, new_times) to generate the images, but I'm very confused about how to phrase it. It should be something like:
new_data = np.ones((50, 12, 2048, 256))
for i, (times, fit) in enumerate(zip(new_times, fits)):
    new_data[i] = np.polyval(fit, times)
But I'm getting broadcasting errors. Any assistance would be greatly appreciated!
Update
Ok, so I changed the code a bit so that it works and does exactly what I want, but it is terribly slow with all these loops (~1 minute per group, meaning this would take almost an hour to run!). Can anyone suggest a way to optimize this to speed it up?
# Generate the polynomials for each pixel in each group
fits = np.ones((50, 2048, 256, 2))
times = np.arange(0, 50*8*grptme, grptme).reshape(50, 8)
times = times.reshape(50, 8, 1).repeat(2048, 2).reshape(50, 8, 2048, 1).repeat(256, 3)
for group in range(50):
    for xpos in range(2048):
        for ypos in range(256):
            fits[group, xpos, ypos] = np.polyfit(times[group, :, xpos, ypos],
                                                 data[group, :, xpos, ypos], 1)
# Create new array of 12 images per group using the polynomials for each pixel
new_data = np.ones((50, 12, 2048, 256))
times = np.arange(0, 50*12*grptme, grptme).reshape(50, 12)
times = times.reshape(50, 12, 1).repeat(2048, 2).reshape(50, 12, 2048, 1).repeat(256, 3)
for group in range(50):
    for img in range(12):
        for xpos in range(2048):
            for ypos in range(256):
                # np.polyval expects coefficients highest-degree-first,
                # which is the order np.polyfit returns
                new_data[group, img, xpos, ypos] = np.polyval(fits[group, xpos, ypos],
                                                              times[group, img, xpos, ypos])
Regarding the speed: I see a lot of loops, which is exactly what can and often should be avoided thanks to the beauty of numpy. If I understand your problem correctly, you want to fit a first-order polynomial at each pixel over the 8 data points of each of the 50 groups, i.e. 2048 * 256 fits per group. The shape of the image plays no role in the fit itself, so my suggestion is to flatten the images, because np.polyfit can fit several sets of y-values sharing the same x-values at once.
From the doc string
x : array_like, shape (M,)
x-coordinates of the M sample points ``(x[i], y[i])``.
y : array_like, shape (M,) or (M, K)
y-coordinates of the sample points. Several data sets of sample
points sharing the same x-coordinates can be fitted at once by
passing in a 2D-array that contains one dataset per column.
So I would go for
# Generate the polynomials for each pixel in each group
fits = np.ones((50,2048*256,2))
times = np.arange(0,50*8*grptme,grptme).reshape(50,8)
data_fit = data.reshape((50,8,2048*256))
for group in range(50):
fits[group] = np.polyfit(times[group],data_fit[group],1).T
fits_original_shape = fits.reshape((50,2048,256,2))
The transposing is necessary since you want the parameters in the last index, but np.polyfit puts them first, followed by the different data sets.
And then to evaluate it, it is basically the same trick again:
# Create new array of 12 images per group using the polynomials for each pixel
new_data = np.zeros((50, 12, 2048*256))
times = np.arange(0, 50*12*grptme, grptme).reshape(50, 12)
for group in range(50):
    # np.polyfit returns coefficients highest-degree-first, while
    # np.polynomial.polynomial.polyval expects lowest-degree-first,
    # hence the [::-1]
    new_data[group] = np.polynomial.polynomial.polyval(times[group], fits[group].T[::-1]).T
new_data_original_shape = new_data.reshape((50, 12, 2048, 256))
The two transposes are again needed due to the ordering of the parameters vs. the different data sets, so that everything matches the shapes of your arrays.
One could probably also avoid the loop over the groups with some more advanced numpy magic, but with this the code already runs much faster.
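For what it's worth, a sketch of what that could look like for the evaluation step (my addition, not part of the original answer): with only two coefficients per pixel, the straight line can be evaluated directly with broadcasting, removing the group loop entirely.
# fits: (50, 2048*256, 2), each row [slope, intercept] as returned by np.polyfit
# times: (50, 12)
slope = fits[..., 0]        # (50, 2048*256)
intercept = fits[..., 1]    # (50, 2048*256)
new_data = intercept[:, None, :] + slope[:, None, :] * times[:, :, None]  # (50, 12, 2048*256)
new_data_original_shape = new_data.reshape((50, 12, 2048, 256))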
I hope it helps!
