Scipy interpolation how to resize/resample 3x3 matrix to 5x5? - python

EDIT: Paul has solved this one below. Thanks!
I'm trying to resample (upscale) a 3x3 matrix to 5x5, filling in the intermediate points with either interpolate.interp2d or interpolate.RectBivariateSpline (or whatever works).
If there's a simple, existing function to do this, I'd like to use it, but I haven't found it yet. For example, a function that would work like:
# upscale 2x2 to 4x4
matrixSmall = ([[-1,8],[3,5]])
matrixBig = matrixSmall.resample(4,4,cubic)
So, if I start with a 3x3 matrix / array:
0,-2,0
-2,11,-2
0,-2,0
I want to compute a new 5x5 matrix ("I" meaning interpolated value):
0, I[1,0], -2, I[3,0], 0
I[0,1], I[1,1], I[2,1], I[3,1], I[4,1]
-2, I[1,2], 11, I[3,2], -2
I[0,3], I[1,3], I[2,3], I[3,3], I[4,3]
0, I[1,4], -2, I[3,4], 0
I've been searching and reading up and trying various different test code, but I haven't quite figured out the correct syntax for what I'm trying to do. I'm also not sure if I need to be using meshgrid, mgrid or linspace in certain lines.
EDIT: Fixed and working Thanks to Paul
import numpy, scipy
from scipy import interpolate
kernelIn = numpy.array([[0,-2,0],
                        [-2,11,-2],
                        [0,-2,0]])
inKSize = len(kernelIn)
outKSize = 5
kernelOut = numpy.zeros((outKSize,outKSize),numpy.uint8)
x = numpy.array([0,1,2])
y = numpy.array([0,1,2])
z = kernelIn
xx = numpy.linspace(x.min(),x.max(),outKSize)
yy = numpy.linspace(y.min(),y.max(),outKSize)
newKernel = interpolate.RectBivariateSpline(x,y,z, kx=2,ky=2)
kernelOut = newKernel(xx,yy)
print(kernelOut)

Only two small problems:
1) Your xx,yy is outside the bounds of x,y (you can extrapolate, but I'm guessing you don't want to.)
2) Your sample size is too small for a kx and ky of 3 (default). Lower it to 2 and get a quadratic fit instead of cubic.
import numpy
from scipy import interpolate
kernelIn = numpy.array([
    [0,-2,0],
    [-2,11,-2],
    [0,-2,0]])
inKSize = len(kernelIn)
outKSize = 5
# sample positions of the 3x3 input
x = numpy.array([0,1,2])
y = numpy.array([0,1,2])
z = kernelIn
# new sample positions stay inside the bounds of x and y
xx = numpy.linspace(x.min(),x.max(),outKSize)
yy = numpy.linspace(y.min(),y.max(),outKSize)
# kx=ky=2: three samples per axis are too few for the default cubic fit
newKernel = interpolate.RectBivariateSpline(x,y,z, kx=2,ky=2)
kernelOut = newKernel(xx,yy)
print(kernelOut)
##[[ 0. -1.5 -2. -1.5 0. ]
## [ -1.5 5.4375 7.75 5.4375 -1.5 ]
## [ -2. 7.75 11. 7.75 -2. ]
## [ -1.5 5.4375 7.75 5.4375 -1.5 ]
## [ 0. -1.5 -2. -1.5 0. ]]
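For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the name resample_2d and its parameters are my own and not part of SciPy, and it assumes the samples sit on an evenly spaced grid with more than k samples per axis.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

def resample_2d(z, out_shape, k=2):
    """Resample a 2-D array to out_shape with a degree-k spline (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    rows, cols = z.shape
    # original sample positions along each axis
    x = np.arange(rows)
    y = np.arange(cols)
    # new positions span the same range, so nothing is extrapolated
    xx = np.linspace(0, rows - 1, out_shape[0])
    yy = np.linspace(0, cols - 1, out_shape[1])
    spline = RectBivariateSpline(x, y, z, kx=k, ky=k)
    return spline(xx, yy)

kernel_in = np.array([[0, -2, 0],
                      [-2, 11, -2],
                      [0, -2, 0]])
kernel_out = resample_2d(kernel_in, (5, 5), k=2)
print(kernel_out)
```

With k=2 and a 3x3 input this should reproduce the 5x5 table shown above; a larger input would allow k=3.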

If you are already using SciPy, I think scipy.ndimage.zoom (older releases also exposed it as scipy.ndimage.interpolation.zoom) can do what you need:
import numpy
import scipy.ndimage
a = numpy.array([[0.,-2.,0.], [-2.,11.,-2.], [0.,-2.,0.]])
out = numpy.round(scipy.ndimage.zoom(input=a, zoom=(5./3), order=2), 1)
print(out)
#[[ 0. -1. -2. -1. 0. ]
# [ -1. 1.8 4.5 1.8 -1. ]
# [ -2. 4.5 11. 4.5 -2. ]
# [ -1. 1.8 4.5 1.8 -1. ]
# [ 0. -1. -2. -1. 0. ]]
Here the "zoom factor" is 5./3 because we are going from a 3x3 array to a 5x5 array. According to the docs, you can also specify the zoom factor independently for each axis, which means you can upscale non-square matrices as well. By default, zoom uses third-order spline interpolation (the example above passes order=2 instead); I am not sure which is best.
I tried it on some images and it works nicely.
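To illustrate the per-axis zoom factor mentioned above, here is a minimal sketch on a made-up 2x3 array (the array and the factors are my own choice, not from the question):

```python
import numpy as np
from scipy import ndimage

a = np.arange(6, dtype=float).reshape(2, 3)   # 2 rows, 3 columns
# one zoom factor per axis: rows are doubled, columns are tripled
out = ndimage.zoom(a, zoom=(2, 3), order=1)
print(out.shape)   # (4, 9)
```

order=1 selects linear interpolation here; higher orders need enough samples along each axis to be meaningful.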

Related

cv2.perspectiveTransform() not performing the operation

I want to apply a transformation matrix to a set of points. So the set of points:
points = np.array([[0 ,20], [0, 575], [0, 460]])
And I want to use the matrix I calculated with cv2.getPerspectiveTransform() which is a 3x3 matrix.
matrix = np.array([
[ -4. , -3. , 1920. ],
[ -2.25 , -1.6875 , 1080. ],
[ -0.0020833, -0.0015625, 1. ]])
Then I pass the array and a matrix to the following function:
def poly_points_transform(poly_points, matrix):
    poly_points_transformed = np.empty_like(poly_points)
    for i in range(len(poly_points)):
        point = np.array([[poly_points[i]]])
        transformed_point = cv2.perspectiveTransform(point, matrix)
        np.append(poly_points_transformed, transformed_point)
    return poly_points_transformed
It doesn't throw an error, but it just copies the src array to poly_points_transformed. It might be something really rudimentary and stupid; if so, I am sorry, but could someone give me a hint on what is wrong? Thanks in advance.
We may solve it with one line of code:
transformed_point = cv2.perspectiveTransform(np.array([points], np.float64), matrix)[0]
As Micka commented cv2.perspectiveTransform takes a list of points (and returns a list of points as output).
np.array([points]) is used because cv2.perspectiveTransform expects 3D array.
For details see trouble getting cv.transform to work.
np.float64 is used in case the dtype of points is int32 (the method accepts float64 and float32 types).
[0] is used for removing the redundant dimension (convert from 3D to 2D).
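To fix the loop, replace np.append(poly_points_transformed, transformed_point) with:
poly_points_transformed[i] = transformed_point[0]
np.append() does not modify its argument; it returns a new array, and the original loop discards that return value. Since the output array is already allocated with poly_points_transformed = np.empty_like(poly_points), assigning by index is what we want here, not np.append().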
Code sample:
import cv2
import numpy as np
points = np.array([[0.0 ,20.0], [0.0, 575.0], [0.0, 460.0]])
matrix = np.array([
    [ -4.       , -3.       , 1920. ],
    [ -2.25     , -1.6875   , 1080. ],
    [ -0.0020833, -0.0015625,    1. ]])
# transformed_point = cv2.perspectiveTransform(np.array([points], np.float64), matrix)[0]
def poly_points_transform(poly_points, matrix):
    poly_points_transformed = np.empty_like(poly_points)
    for i in range(len(poly_points)):
        point = np.array([[poly_points[i]]])
        transformed_point = cv2.perspectiveTransform(point, matrix)
        poly_points_transformed[i] = transformed_point[0] #np.append(poly_points_transformed, transformed_point)
    return poly_points_transformed
poly_points_transformed = poly_points_transform(points, matrix)
The result is:
poly_points_transformed =
array([[1920., 1080.],
[1920., 1080.],
[1920., 1080.]])
Why are we getting [1920.0, 1080.0] for all the transformed points?
Let's transform the middle point by hand.
Multiply the matrix by the point in homogeneous coordinates (with 1 as the third element):
[ -4.       , -3.       , 1920. ]   [  0]
[ -2.25     , -1.6875   , 1080. ] * [575] =
[ -0.0020833, -0.0015625,    1. ]   [  1]
p = matrix @ np.array([[0.0], [575.0], [1.0]]) =
[1.950000e+02]
[1.096875e+02]
[1.015625e-01]
Now divide the coordinates by the last element (converting homogeneous coordinates to Euclidean coordinates):
[1.950000e+02/1.015625e-01]              [1920]
[1.096875e+02/1.015625e-01] = p / p[2] = [1080]
[1.015625e-01/1.015625e-01]              [   1]
The equivalent Euclidean point is [1920, 1080].
The transformation matrix may be wrong, because it transforms all the input points (with x coordinate equals 0) to the same output point...
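The hand calculation can be checked for all three points at once with plain NumPy. This is a sketch of the homogeneous-coordinate arithmetic only, not of what cv2.perspectiveTransform does internally; the helper name apply_homography is my own.

```python
import numpy as np

def apply_homography(points, matrix):
    """Apply a 3x3 perspective matrix to an (N, 2) array of points (hypothetical helper)."""
    points = np.asarray(points, dtype=np.float64)
    # append w = 1 to every point -> shape (N, 3)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    # transform all points at once: each row becomes matrix @ point
    transformed = homogeneous @ matrix.T
    # divide x and y by w to return to Euclidean coordinates
    return transformed[:, :2] / transformed[:, 2:3]

points = np.array([[0, 20], [0, 575], [0, 460]])
matrix = np.array([
    [-4.0, -3.0, 1920.0],
    [-2.25, -1.6875, 1080.0],
    [-0.0020833, -0.0015625, 1.0]])
result = apply_homography(points, matrix)
print(result)
```

Every row comes out as approximately [1920, 1080], which matches the cv2 result and supports the suspicion that the matrix itself is the problem.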

Why is my gaussian np.array not symmetric?

I am trying to write a function that returns an np.array of size nx x ny that contains a centered gaussian distribution with mean mu and sd sig. It works in principle like below but the problem is that the result is not completely symmetric. This is not a problem for larger nx x ny but for smaller ones it is obvious that something is not quite right in my implementation ...
For:
create2dGaussian (1, 1, 5, 5)
It outputs:
[[ 0. 0.2 0.3 0.1 0. ]
[ 0.2 0.9 1. 0.5 0. ]
[ 0.3 1. 1. 0.6 0. ]
[ 0.1 0.5 0.6 0.2 0. ]
[ 0. 0. 0. 0. 0. ]]
... which is not symmetric. For larger nx and ny a 3d plot looks perfectly fine/smooth but why are the detailed numerics not correct and how can I fix it?
import numpy as np
def create2dGaussian (mu, sigma, nx, ny):
    x, y = np.meshgrid(np.linspace(-nx/2, +nx/2+1,nx), np.linspace(-ny/2, +ny/2+1,ny))
    d = np.sqrt(x*x+y*y)
    g = np.exp(-((d-mu)**2 / ( 2.0 * sigma**2 )))
    np.set_printoptions(precision=1, suppress=True)
    print(g.shape)
    print(g)
    return g
----- EDIT -----
While the below described solution works for the problem mentioned in the headline (non-symmetric distribution) this code has also some other issues that are discussed here.
NumPy's linspace includes both endpoints by default, unlike range, so you don't need to add one to the right-hand side. I'd also recommend dividing only by floats, just to be safe (in Python 2, -nx/2 is integer division):
x, y = np.meshgrid(np.linspace(-nx/2.0, +nx/2.0,nx), np.linspace(-ny/2.0, +ny/2.0,ny))
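Putting that line back into the function gives something like the sketch below. The snake_case name create_2d_gaussian is my own rename, and the printing is replaced by an explicit symmetry check.

```python
import numpy as np

def create_2d_gaussian(mu, sigma, nx, ny):
    # symmetric sample positions: linspace already includes both endpoints
    x, y = np.meshgrid(np.linspace(-nx / 2.0, nx / 2.0, nx),
                       np.linspace(-ny / 2.0, ny / 2.0, ny))
    d = np.sqrt(x * x + y * y)          # distance from the center
    return np.exp(-((d - mu) ** 2) / (2.0 * sigma ** 2))

g = create_2d_gaussian(1, 1, 5, 5)
# the result should be unchanged by flipping it vertically or horizontally
print(np.allclose(g, g[::-1, :]), np.allclose(g, g[:, ::-1]))
```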

Upsample and Interpolate a NumPy Array

I have an array, something like:
array = np.arange(0,4,1).reshape(2,2)
> [[0 1]
   [2 3]]
I want to both upsample this array as well as interpolate the resulting values. I know that a good way to upsample an array is by using:
array = eratemp[0].repeat(2, axis = 0).repeat(2, axis = 1)
[[0 0 1 1]
[0 0 1 1]
[2 2 3 3]
[2 2 3 3]]
but I cannot figure out a way to interpolate the values to remove the 'blocky' nature between each 2x2 section of the array.
I want something like this:
[[0 0.4 1 1.1]
[1 0.8 1 2.1]
[2 2.3 3 3.1]
[2.1 2.3 3.1 3.2]]
Something like this (NOTE: these will not be the exact numbers). I understand that it may not be possible to interpolate this particular 2D grid, but using the first grid in my answer, an interpolation should be possible during the upsampling process as you are increasing the number of pixels, and can therefore 'fill in the gaps'.
I am not too fussed on the type of interpolation, providing the final output is a smoothed surface! I have tried to use the scipy.interp2d method but to no avail, would be grateful if someone could share their wisdom!
You can use SciPy's interp2d for the interpolation; see its documentation for the details.
I've modified the example from the documentation a bit:
import numpy as np
from scipy import interpolate
x = np.array(range(2))
y = np.array(range(2))
a = np.array([[0, 1], [2, 3]])
f = interpolate.interp2d(x, y, a, kind='linear')
xnew = np.linspace(0, 2, 4)
ynew = np.linspace(0, 2, 4)
znew = f(xnew, ynew)
If you print znew it should look like this:
array([[ 0. , 0.66666667, 1. , 1. ],
[ 1.33333333, 2. , 2.33333333, 2.33333333],
[ 2. , 2.66666667, 3. , 3. ],
[ 2. , 2.66666667, 3. , 3. ]])
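Note that interp2d is deprecated and, as far as I know, no longer available in recent SciPy releases. A rough equivalent with RegularGridInterpolator might look like the sketch below. Unlike the snippet above, I keep the new coordinates inside the original 0..1 range, so nothing is extrapolated; the variable names are my own.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

a = np.array([[0.0, 1.0], [2.0, 3.0]])
# grid coordinates of the original samples (row axis first, then column axis)
y = np.arange(a.shape[0])
x = np.arange(a.shape[1])
interp = RegularGridInterpolator((y, x), a, method="linear")

# 4x4 grid of new sample positions inside the original range
ynew = np.linspace(0, 1, 4)
xnew = np.linspace(0, 1, 4)
yy, xx = np.meshgrid(ynew, xnew, indexing="ij")
znew = interp(np.stack([yy, xx], axis=-1))
print(znew)
```

For this particular 2x2 input, linear interpolation reduces to 2*y + x at every new position, which makes the result easy to verify.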
I would use scipy.misc.imresize:
import numpy as np
import scipy.misc
array = np.arange(0,4,1).reshape(2,2)
out = scipy.misc.imresize(array, 2.0)
The 2.0 indicates that I want the output to be twice the dimensions of the input. You could alternatively supply an int or a tuple to specify a percentage of the original dimensions or just the new dimensions themselves.
This is very easy to use, but there is an extra step because imresize rescales everything so that your max value becomes 255 and your min becomes 0. (It also changes the datatype to np.uint8.) You may need to do something like:
out = out.astype(array.dtype) / 255 * (np.max(array) - np.min(array)) + np.min(array)
Let's look at the output:
>>> out.round(2)
array([[0. , 0.25, 0.75, 1. ],
[0.51, 0.75, 1.26, 1.51],
[1.51, 1.75, 2.26, 2.51],
[2. , 2.25, 2.75, 3. ]])
imresize comes with a deprecation warning and a substitute, though:
DeprecationWarning: imresize is deprecated! imresize is deprecated
in SciPy 1.0.0, and will be removed in 1.2.0. Use
skimage.transform.resize instead.
With the resample method from scipy.signal, you can up-sample your 2D array sequentially along one axis and then along the other.
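A minimal sketch of that two-pass approach, on a made-up 4x4 array. Keep in mind that scipy.signal.resample works in the Fourier domain and therefore treats the data as periodic, so values near the edges may show ringing.

```python
import numpy as np
from scipy import signal

a = np.arange(16, dtype=float).reshape(4, 4)
# first pass: double the number of rows
tmp = signal.resample(a, 8, axis=0)
# second pass: double the number of columns
out = signal.resample(tmp, 8, axis=1)
print(out.shape)   # (8, 8)
```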

Assign numpy array of points to a 2D square grid

I'm going beyond my previous question because of speed problems. I have an array of Lat/Lon coordinates of points, and I would like to assign them to an index code derived from a 2D square grid of equal size cells. Here is an example. Let's call my first array points; it contains the coordinates (call them [x y] pairs) of six points:
points = [[ 1.5 1.5]
[ 1.1 1.1]
[ 2.2 2.2]
[ 1.3 1.3]
[ 3.4 1.4]
[ 2. 1.5]]
Then I have another array containing the coordinates of the vertices of a grid of two cells in the form [minx,miny,maxx,maxy]; let's call it bounds:
bounds = [[ 0. 0. 2. 2.]
[ 2. 2. 3. 3.]]
I would like to find which points are in which boundary, and then assign a code derived from the bounds array index (in this case the first cell has code 0, the second 1 and so on...). Since the cells are squares, the easiest way to compute if each point is in each cell is to evaluate:
x > minx & x < maxx & y > miny & y < maxy
So that the resulting array would appear as:
results = [0 0 1 0 NaN NaN]
where NaN means that the point is outside all cells. In my real case I need to locate on the order of 10^6 points in 10^4 cells. Is there a fast way to do this kind of thing using numpy arrays?
EDIT: to clarify, the expected results array means that the first point is inside the first cell (index 0 of the bounds array), so is the second, the third is inside the second cell of the bounds array, and so on...
Here is a vectorized approach to your problem. It should speed things up significantly.
import numpy as np
def findCells(points, bounds):
    # make sure points is n by 2 (pool.map might send us 1D arrays)
    points = points.reshape((-1,2))
    # check for each point if all coordinates are in bounds
    # dimension 0 is bound
    # dimension 1 is point
    allInBounds = (points[:,0] > bounds[:,None,0])
    allInBounds &= (points[:,1] > bounds[:,None,1])
    allInBounds &= (points[:,0] < bounds[:,None,2])
    allInBounds &= (points[:,1] < bounds[:,None,3])
    # now find out the positions of all nonzero (i.e. true) values
    # nz[0] contains the indices along dim 0 (bound)
    # nz[1] contains the indices along dim 1 (point)
    nz = np.nonzero(allInBounds)
    # initialize the result with all nan
    r = np.full(points.shape[0], np.nan)
    # now use nz[1] to index point position and nz[0] to tell which cell the
    # point belongs to
    r[nz[1]] = nz[0]
    return r
def findCellsParallel(points, bounds, chunksize=100):
    import multiprocessing as mp
    from functools import partial
    func = partial(findCells, bounds=bounds)
    # using python3 you could also do 'with mp.Pool() as p:'
    p = mp.Pool()
    try:
        return np.hstack(p.map(func, points, chunksize))
    finally:
        p.close()
def main():
    # sizes must be ints to be usable as array shapes
    nPoints = 10**6
    nBounds = 10**4
    # points = np.array([[ 1.5, 1.5],
    #                    [ 1.1, 1.1],
    #                    [ 2.2, 2.2],
    #                    [ 1.3, 1.3],
    #                    [ 3.4, 1.4],
    #                    [ 2. , 1.5]])
    points = np.random.random([nPoints, 2])
    # bounds = np.array([[0,0,2,2],
    #                    [2,2,3,3]])
    # bounds = np.array([[0,0,1.4,1.4],
    #                    [1.4,1.4,2,2],
    #                    [2,2,3,3]])
    bounds = np.sort(np.random.random([nBounds, 2, 2]), 1).reshape(nBounds, 4)
    r = findCellsParallel(points, bounds)
    print(points[:10])
    for bIdx in np.unique(r[:10]):
        if np.isnan(bIdx):
            continue
        # r holds floats (because of nan), so cast before indexing
        print("{}: {}".format(bIdx, bounds[int(bIdx)]))
    print(r[:10])
if __name__ == "__main__":
    main()
Edit:
Trying it with your amount of data gave me a MemoryError. You can avoid that and even speed things up a little more if you use multiprocessing.Pool with its map function, see updated code.
Result:
>time python test.py
[[ 0.69083585 0.19840985]
[ 0.31732711 0.80462512]
[ 0.30542996 0.08569184]
[ 0.72582609 0.46687164]
[ 0.50534322 0.35530554]
[ 0.93581095 0.36375539]
[ 0.66226118 0.62573407]
[ 0.08941219 0.05944215]
[ 0.43015872 0.95306899]
[ 0.43171644 0.74393729]]
9935.0: [ 0.31584562 0.18404152 0.98215445 0.83625487]
9963.0: [ 0.00526106 0.017255 0.33177741 0.9894455 ]
9989.0: [ 0.17328876 0.08181912 0.33170444 0.23493507]
9992.0: [ 0.34548987 0.15906761 0.92277442 0.9972481 ]
9993.0: [ 0.12448765 0.5404578 0.33981119 0.906822 ]
9996.0: [ 0.41198261 0.50958195 0.62843379 0.82677092]
9999.0: [ 0.437169 0.17833114 0.91096133 0.70713434]
[ 9999. 9993. 9989. 9999. 9999. 9935. 9999. 9963. 9992. 9996.]
real 0m 24.352s
user 3m 4.919s
sys 0m 1.464s
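If spawning processes is not an option, the MemoryError can also be avoided by looping over the points in blocks, so the boolean cell-by-point table never gets too large. This is a single-process sketch under my own assumptions: the name find_cells_chunked and the chunk parameter are hypothetical, and when cells overlap a point simply ends up with one of the matching indices.

```python
import numpy as np

def find_cells_chunked(points, bounds, chunk=1000):
    """Assign each point the index of a cell containing it, or nan (hypothetical helper)."""
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    bounds = np.asarray(bounds, dtype=float)
    result = np.full(len(points), np.nan)
    for start in range(0, len(points), chunk):
        block = points[start:start + chunk]
        x = block[:, 0][None, :]    # shape (1, points in block)
        y = block[:, 1][None, :]
        # inside[c, p] is True when point p lies strictly inside cell c
        inside = ((x > bounds[:, 0, None]) & (x < bounds[:, 2, None]) &
                  (y > bounds[:, 1, None]) & (y < bounds[:, 3, None]))
        cell_idx, point_idx = np.nonzero(inside)
        result[start + point_idx] = cell_idx
    return result

points = np.array([[1.5, 1.5], [1.1, 1.1], [2.2, 2.2],
                   [1.3, 1.3], [3.4, 1.4], [2.0, 1.5]])
bounds = np.array([[0.0, 0.0, 2.0, 2.0],
                   [2.0, 2.0, 3.0, 3.0]])
# a tiny chunk size just to exercise the block loop
r = find_cells_chunked(points, bounds, chunk=4)
print(r)
```

With 10^4 cells, a chunk of 1000 points keeps the temporary table at about 10^7 booleans per block; tune it to the memory you have.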
You can use a nested loop to check the condition and yield the results from a generator:
import numpy as np
points = [[1.5, 1.5],
          [1.1, 1.1],
          [2.2, 2.2],
          [1.3, 1.3],
          [3.4, 1.4],
          [2.0, 1.5]]
bounds = [[0., 0., 2., 2.],
          [2., 2., 3., 3.]]
def pos(p,b):
    for x,y in p:
        flag=False
        for index,dis in enumerate(b):
            minx,miny,maxx,maxy=dis
            if x > minx and x < maxx and y > miny and y < maxy :
                flag=True
                yield index
        if not flag:
            yield 'NaN'
print(list(pos(points,bounds)))
result :
[0, 0, 1, 0, 'NaN', 'NaN']
I would do it like this:
import numpy as np
points = np.random.rand(10,2)
xmin = [0.25,0.5]
ymin = [0.25,0.5]
results = np.zeros(len(points))
for i in range(len(xmin)):
    bool_index_array = np.greater(points, [xmin[i],ymin[i]])
    print("boolean index of (x,y) greater (xmin, ymin): ", bool_index_array)
    indicies_of_true_true = np.where(bool_index_array[:,0]*bool_index_array[:,1]==1)[0]
    print("indices of [True,True]: ", indicies_of_true_true)
    results[indicies_of_true_true] += 1
print("results: ", results)
[out]: [ 1. 1. 1. 2. 0. 0. 1. 1. 1. 1.]
This uses the lower boundaries to categorize your points into the groups:
1 (if xmin[0] < x <= xmin[1] & ymin[0] < y <= ymin[1])
2 (if x > xmin[1] & y > ymin[1])
0 if none of the conditions above are fulfilled

How do I use scipy.interpolate.splrep to interpolate a curve?

Using some experimental data, I cannot for the life of me work out how to use splrep to create a B-spline. The data are here: http://ubuntuone.com/4ZFyFCEgyGsAjWNkxMBKWD
Here is an excerpt:
#Depth Temperature
1 14.7036
-0.02 14.6842
-1.01 14.7317
-2.01 14.3844
-3 14.847
-4.05 14.9585
-5.03 15.9707
-5.99 16.0166
-7.05 16.0147
and here's a plot of it with depth on y and temperature on x:
Here is my code:
import numpy as np
from scipy.interpolate import splrep, splev
tdata = np.genfromtxt('t-data.txt',
skip_header=1, delimiter='\t')
depth = tdata[:, 0]
temp = tdata[:, 1]
# Find the B-spline representation of 1-D curve:
tck = splrep(depth, temp)
### fails here with "Error on input data" returned. ###
I know I am doing something bleedingly stupid, but I just can't see it.
You just need to have your values sorted from smallest to largest :). It shouldn't be a problem for you @a different ben, but beware, readers from the future: depth[indices] will throw a TypeError if depth is a list instead of a numpy array!
>>> indices = np.argsort(depth)
>>> depth = depth[indices]
>>> temp = temp[indices]
>>> splrep(depth, temp)
(array([-7.05, -7.05, -7.05, -7.05, -5.03, -4.05, -3. , -2.01, -1.01,
1. , 1. , 1. , 1. ]), array([ 16.0147 , 15.54473241, 16.90606794, 14.55343229,
15.12525673, 14.0717599 , 15.19657895, 14.40437622,
14.7036 , 0. , 0. , 0. , 0. ]), 3)
Hat tip to @FerdinandBeyer for the suggestion of argsort instead of my ugly "zip the values, sort the zip, re-assign the values" method.
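Once splrep succeeds, the spline can be evaluated with splev. Here is a sketch using only the nine rows from the excerpt in the question; the variable names and the choice of 50 evaluation points are mine, and s=0 is passed explicitly to request an interpolating spline.

```python
import numpy as np
from scipy.interpolate import splrep, splev

# the excerpt from the question (depth is not sorted)
depth = np.array([1, -0.02, -1.01, -2.01, -3, -4.05, -5.03, -5.99, -7.05])
temp = np.array([14.7036, 14.6842, 14.7317, 14.3844, 14.847,
                 14.9585, 15.9707, 16.0166, 16.0147])

# splrep needs x in increasing order, so sort both arrays together
order = np.argsort(depth)
depth_sorted = depth[order]
temp_sorted = temp[order]

# s=0 makes the spline pass through every data point
tck = splrep(depth_sorted, temp_sorted, s=0)

# evaluate the spline on a finer depth grid
depth_fine = np.linspace(depth_sorted[0], depth_sorted[-1], 50)
temp_fine = splev(depth_fine, tck)
print(temp_fine[:5])
```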
