Interpolation for a 2D array - python

I was wondering if there is a way to interpolate a 2D array in Python using the same principle used to interpolate a 1D array (np.interp).
My aim is to increase the number of data points in my array (from [1000, 20] to [1000, 200], indexed as [Time_indexing, X]).
I am looking for a function that is capable of doing that.
A = np.array([[ 0.45717218, 0.44250104, 0.47812272, 0.49092173, 0.46002069],
[ 0.29829681, 0.26408021, 0.3709202 , 0.44823109, 0.49311853],
[ 0.05469835, 0.01048596, 0.17398291, 0.30088943, 0.39783137],
[-0.20463768, -0.24610673, -0.0713164 , 0.08406331, 0.22047102],
[-0.4074527 , -0.43573695, -0.31062521, -0.15750053, -0.00222392]])
This is a [5,5] array. I want to interpolate it using a spacing of 0.01, so the final product should be [500,500].
Thank you,

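You could use interp2d (note that it is deprecated in recent SciPy releases and was removed in SciPy 1.14, so this only runs on older versions):
import numpy as np
from scipy.interpolate import interp2d
f = interp2d(np.arange(0,500,100), np.arange(0,500,100), A)
f(np.arange(500), np.arange(500))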
Output:
array([[ 0.45717218, 0.45702547, 0.45687876, ..., 0.46002069,
0.46002069, 0.46002069],
[ 0.45558343, 0.45543476, 0.45528609, ..., 0.46035167,
0.46035167, 0.46035167],
[ 0.45399467, 0.45384405, 0.45369343, ..., 0.46068265,
0.46068265, 0.46068265],
...,
[-0.4074527 , -0.40773554, -0.40801839, ..., -0.00222392,
-0.00222392, -0.00222392],
[-0.4074527 , -0.40773554, -0.40801839, ..., -0.00222392,
-0.00222392, -0.00222392],
[-0.4074527 , -0.40773554, -0.40801839, ..., -0.00222392,
-0.00222392, -0.00222392]])

Related

How to split a 3D matrix into 3D matrices lined up in a list?

I have a NumPy array with the following shape:
(1532, 2036, 5)
I would like to generate a list of arrays where each one has the following shape:
(1532, 2036)
You can use Ellipsis to signify all dimensions up to the last. For example:
arr = np.random.rand(4, 3, 2)
arr
array([[[ 0.35235813, 0.57984153],
[ 0.53743048, 0.46753367],
[ 0.80048303, 0.07982378]],
[[ 0.1339381 , 0.84586721],
[ 0.81425027, 0.41086151],
[ 0.34039991, 0.19972737]],
[[ 0.2112466 , 0.73086434],
[ 0.03755819, 0.40113463],
[ 0.74622891, 0.74695994]],
[[ 0.99313615, 0.65634951],
[ 0.90787642, 0.37387861],
[ 0.8738962 , 0.41747727]]])
The list of the last dimension arrays can be constructed as @Usernamenotfound mentioned or with Ellipsis like so:
[arr[..., i] for i in range(arr.shape[-1])]
[array([[ 0.35235813, 0.53743048, 0.80048303],
[ 0.1339381 , 0.81425027, 0.34039991],
[ 0.2112466 , 0.03755819, 0.74622891],
[ 0.99313615, 0.90787642, 0.8738962 ]]),
array([[ 0.57984153, 0.46753367, 0.07982378],
[ 0.84586721, 0.41086151, 0.19972737],
[ 0.73086434, 0.40113463, 0.74695994],
[ 0.65634951, 0.37387861, 0.41747727]])]
Each element has the shape (4, 3).
Likewise, you could do the same for the first dimension, making four (3, 2) arrays.
[arr[i, ...] for i in range(arr.shape[0])]
[array([[ 0.35235813, 0.57984153],
[ 0.53743048, 0.46753367],
[ 0.80048303, 0.07982378]]), array([[ 0.1339381 , 0.84586721],
[ 0.81425027, 0.41086151],
[ 0.34039991, 0.19972737]]), array([[ 0.2112466 , 0.73086434],
[ 0.03755819, 0.40113463],
[ 0.74622891, 0.74695994]]), array([[ 0.99313615, 0.65634951],
[ 0.90787642, 0.37387861],
[ 0.8738962 , 0.41747727]])]
You can also permute the axes with numpy.transpose and then simply iterate through the array:
import numpy as np
arr = ... # Define the input array here
out = [a for a in np.transpose(arr, (2, 0, 1))]
You can slice the 3D array using
[x[:,:,i] for i in range(5)]
The above would give you a list of 2D arrays.
The same process extends to arrays with more dimensions.
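As a quick check that the approaches in these answers agree, here is a small sketch on a random (4, 3, 2) array. The variable names by_slicing, by_moveaxis, and by_transpose are made up for the example, and np.moveaxis is an extra option not mentioned in the answers.

```python
import numpy as np

arr = np.random.rand(4, 3, 2)

# Slice along the last axis with Ellipsis
by_slicing = [arr[..., i] for i in range(arr.shape[-1])]

# Move the last axis to the front, then iterate over the first axis
by_moveaxis = list(np.moveaxis(arr, -1, 0))
by_transpose = list(np.transpose(arr, (2, 0, 1)))

for a, b, c in zip(by_slicing, by_moveaxis, by_transpose):
    assert a.shape == (4, 3)
    assert np.array_equal(a, b) and np.array_equal(a, c)

print(len(by_slicing), by_slicing[0].shape)  # 2 (4, 3)
```

All three give views into arr rather than copies, so modifying an element of the list also modifies arr.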

Create 3D array from multiple 2D arrays

I have two monthly gridded data sets which I want to compare later.
The input looks like this for both data sets, and that is also how I want the output.
In[4]: data1.shape
Out[4]: (444, 72, 144)
In[5]: gfz.shape
Out[5]: (155, 72, 144)
In[6]: data1
Out[6]:
array([[[ 0.98412287, 0.96739882, 0.91172796, ..., 1.12651634,
1.0682013 , 1.07681048],
[ 1.47803092, 1.44721365, 1.49585509, ..., 1.58934438,
1.66956687, 1.57198083],
[ 0.68730044, 0.76112831, 0.78218687, ..., 0.92582172,
1.07873237, 0.87490368],
...,
[ 1.00752461, 1.00758123, 0.99440521, ..., 0.94128627,
0.88981551, 0.93984401],
[ 1.03467119, 1.02640462, 0.91580886, ..., 0.88302392,
0.99204206, 0.96396238],
[ 0.8280431 , 0.82936555, 0.82637453, ..., 0.92009377,
0.77890259, 0.81065702]],
...,
[[-0.12173297, -0.06624345, -0.02809682, ..., -0.04522502,
-0.11502996, -0.22779272],
[-0.61080372, -0.61958522, -0.52239478, ..., -0.6775983 ,
-0.79460669, -0.70022893],
[-0.12011283, -0.10849079, 0.096185 , ..., -0.45782232,
-0.39763898, -0.31247514],
...,
[ 0.90601307, 0.88580155, 0.90268403, ..., 0.86414611,
0.87041426, 0.86274058],
[ 1.46445823, 1.31938004, 1.37585044, ..., 1.51378822,
1.48515761, 1.49078977],
[ 0.29749078, 0.22273554, 0.27161494, ..., 0.43205476,
0.43777165, 0.36340511]],
[[ 0.41008961, 0.44208974, 0.40928891, ..., 0.45899671,
0.39472976, 0.36803097],
[-0.13514084, -0.17332518, -0.11183424, ..., -0.22284794,
-0.2532815 , -0.15402752],
[ 0.28614867, 0.33750001, 0.48767376, ..., 0.01886483,
0.07220326, 0.17406547],
...,
[ 1.0551219 , 1.09540403, 1.19031584, ..., 1.09203815,
1.07658005, 1.08363533],
[ 1.54310501, 1.49531853, 1.56107259, ..., 1.57243073,
1.5867976 , 1.57728028],
[ 1.1034857 , 0.98658448, 1.14141166, ..., 0.97744882,
1.13562942, 1.08589089]],
[[ 1.02020931, 0.99780071, 0.87209344, ..., 1.11072564,
1.01270151, 0.9222675 ],
[ 0.93467152, 0.81068456, 0.68190312, ..., 0.95696563,
0.84669352, 0.84596157],
[ 0.97022212, 0.94228816, 0.97413743, ..., 1.06613588,
1.08708596, 1.04224277],
...,
[ 1.21519053, 1.23492992, 1.2802881 , ..., 1.33915019,
1.32537413, 1.27963519],
[ 1.32051706, 1.28170252, 1.36266208, ..., 1.29100537,
1.38395023, 1.34622073],
[ 0.86108029, 0.86364979, 0.88489276, ..., 0.81707358,
0.82471925, 0.83550251]]], dtype=float32)
So both have the same spatial resolution of 144x72 but cover different lengths of time.
As one of them has some missing months, I made sure that only the months where both have data are selected. To do so, I create a two-dimensional array in which the data is stored according to its longitude and latitude values whenever both data sets contain that month. In the end I want a three-dimensional array for each of data1 and data2, both of the same length.
array3d_data1 = []
array3d_data2 = []
xy_data1 = [[0 for i in range(len(lons_data1))] for j in range(len(lats_data1))]
xy_data2 = [[0 for i in range(len(lons_data2))] for j in range(len(lats_data2))]
# comparing the time steps
for i in range(len(time_data1)):
    for j in range(len(time_data2)):
        if time_data1.year[i] == time_data2[j].year and time_data1[i].month == time_data2[j].month:
            # loop for data1 which writes the data into a 2D array
            for x in range(len(lats_data1)):
                for y in range(len(lons_data1)):
                    xy_data1[x][y] = data1[j,0,x,y]
            # append to get an array of arrays
            xy_data1 = np.squeeze(np.asarray(xy_data1))
            array3d_data1 = np.append(array3d_data1, [xy_data1])
            # loop for data2 which writes the data into a 2D array
            for x in range(len(lats_data2)):
                for y in range(len(lons_data2)):
                    xy_data2[x][y] = data2[i,x,y]
            # append to get an array of arrays
            xy_data2 = np.squeeze(np.asarray(xy_data2))
            array3d_data2 = np.append(array3d_data2, [xy_data2])
The script runs without an error, however, I only get a really long 1D array.
In[3]: array3d_data1
Out[3]: array([ nan, nan, nan, ..., 0.81707358,
0.82471925, 0.83550251])
How can I arrange it to a three dimensional array?
I got it working with the following.
I defined the three-dimensional array with the fixed dimensions of the longitude and latitude and a time axis of length zero, to be extended later.
temp_data1 = np.zeros((0,len(lats_data1),len(lons_data1)))
Then I appended the two-dimensional outputs along the time axis.
temp_data1 = np.append(temp_data1,xy_data1[np.newaxis,:,:],axis=0)
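To make the difference visible, here is a small sketch with made-up 3x4 fields (the names fields, grown, and stacked are hypothetical). It shows that np.append without an axis flattens everything, that appending along axis=0 keeps the 3D shape, and that collecting the 2D fields in a list and calling np.stack once gives the same result.

```python
import numpy as np

n_lat, n_lon = 3, 4
# Five made-up 2D fields standing in for the monthly grids
fields = [np.full((n_lat, n_lon), float(k)) for k in range(5)]

# Without an axis argument, np.append flattens both inputs
flat = np.append(np.array([]), [fields[0]])
print(flat.shape)  # (12,)

# Appending along axis 0 onto a (0, n_lat, n_lon) array keeps three dimensions
grown = np.zeros((0, n_lat, n_lon))
for field in fields:
    grown = np.append(grown, field[np.newaxis, :, :], axis=0)

# Alternative: stack a list of 2D arrays once at the end
stacked = np.stack(fields, axis=0)
print(grown.shape, stacked.shape)  # (5, 3, 4) (5, 3, 4)
```

Since np.append copies the whole array on every call, stacking a list once at the end is usually the cheaper option for many time steps.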

How to create pcolormesh with the mean of a set of data?

How do I use this set of data and plot it using pcolormesh? The data is as follows:
array([[ 0. , 0. , 0. , ...,
0. , 0. , 0. ],
[ 34.19227552, 34.19246389, 34.19265956, ...,
34.19284295, 34.19253446, 34.1923012 ],
[ 68.46819899, 68.46861825, 68.46892983, ...,
68.46895204, 68.46856004, 68.46812476],
...,
[ 3937.42832088, 3937.42522049, 3937.43673897, ...,
3937.43603929, 3937.44434961, 3937.43535423],
[ 3987.08591207, 3987.082997 , 3987.09487184, ...,
3987.09300137, 3987.10157045, 3987.09271431],
[ 4037.00035477, 4036.9977684 , 4037.01006508, ...,
4037.00674248, 4037.01561165, 4037.00689316]])
I need to plot this data as a 3D pcolormesh with matplotlib. How do I do this? If anyone can help me I would really appreciate it, as I really need help on this one.
Maybe you can follow these tutorials:
http://matplotlib.org/examples/pylab_examples/pcolor_demo.html
or
http://mlpy.sourceforge.net/docs/3.3/tutorial.html
Load the modules:
>>> import numpy as np
>>> import mlpy
>>> import matplotlib.pyplot as plt # required for plotting
Load the Iris dataset:
>>> iris = np.loadtxt('iris.csv', delimiter=',')
>>> x, y = iris[:, :4], iris[:, 4].astype(int) # x: (observations x attributes) matrix, y: classes (1: setosa, 2: versicolor, 3: virginica)
>>> x.shape
(150, 4)
>>> y.shape
(150,)
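For the pcolormesh part of the question itself, a minimal sketch could look like the following. It rests on assumptions that are not in the post: the "set of data" is taken to be a stack of 2D arrays whose mean over the first axis is what should be plotted, the data is synthetic, and the names stack and mean_field are made up. The Agg backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stack of 10 fields, each 80 rows by 60 columns, growing with the row index
stack = np.cumsum(rng.uniform(30.0, 50.0, size=(10, 80, 60)), axis=1)

# Mean over the first axis gives one 2D field to plot
mean_field = stack.mean(axis=0)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(mean_field, shading="auto")
fig.colorbar(mesh, ax=ax, label="mean value")
ax.set_xlabel("column index")
ax.set_ylabel("row index")
fig.canvas.draw()  # render once; use fig.savefig(...) or plt.show() as needed
```

pcolormesh draws a flat 2D colour map; for a true 3D surface, a different plotting call would be needed.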

python numpy array iterate on a single axis

I have seen a few questions similar to mine, but I couldn't find one that suits me.
I want to iterate over one single axis in my array without using two for loops, to make it faster.
First, I open a bunch of pictures and append them together (converting to a NumPy array).
Afterwards I get an array of arrays like these:
ffImageArr[0]
array([[ 45.49061198, 172.49061198, 174.49061198, ..., 30.49061198,
-71.50938802, -69.50938802],
[ 60.49061198, 169.49061198, 183.49061198, ..., 0.49061198,
-83.50938802, -66.50938802],
[ 55.49061198, 133.49061198, 135.49061198, ..., -43.50938802,
-130.50938802, -99.50938802],
...,
[ 118.49061198, 203.49061198, 195.49061198, ..., 182.49061198,
97.49061198, 132.49061198],
[ 108.49061198, 238.49061198, 197.49061198, ..., 121.49061198,
99.49061198, 133.49061198],
[ 118.49061198, 232.49061198, 196.49061198, ..., 130.49061198,
123.49061198, 145.49061198]])
ffImageArr[1]
array([[ 43.59677409, 172.59677409, 173.59677409, ..., 29.59677409,
-73.40322591, -71.40322591],
[ 60.59677409, 167.59677409, 182.59677409, ..., 0.59677409,
-86.40322591, -64.40322591],
[ 55.59677409, 133.59677409, 134.59677409, ..., -46.40322591,
-131.40322591, -102.40322591],
...,
[ 119.59677409, 201.59677409, 194.59677409, ..., 180.59677409,
98.59677409, 131.59677409],
[ 109.59677409, 238.59677409, 197.59677409, ..., 119.59677409,
98.59677409, 134.59677409],
[ 117.59677409, 231.59677409, 197.59677409, ..., 129.59677409,
122.59677409, 144.59677409]])
ffImageArr[2]
array([[ 42.16040365, 174.16040365, 177.16040365, ..., 28.16040365,
-75.83959635, -74.83959635],
[ 59.16040365, 168.16040365, 183.16040365, ..., -1.83959635,
-87.83959635, -66.83959635],
[ 54.16040365, 133.16040365, 135.16040365, ..., -47.83959635,
-133.83959635, -103.83959635],
...,
[ 119.16040365, 203.16040365, 196.16040365, ..., 182.16040365,
98.16040365, 132.16040365],
[ 108.16040365, 240.16040365, 199.16040365, ..., 121.16040365,
98.16040365, 132.16040365],
[ 116.16040365, 232.16040365, 196.16040365, ..., 129.16040365,
122.16040365, 143.16040365]])
ffImageArr[3]
array([[ 43.89271484, 174.89271484, 175.89271484, ..., 28.89271484,
-78.10728516, -75.10728516],
[ 59.89271484, 169.89271484, 183.89271484, ..., -2.10728516,
-89.10728516, -67.10728516],
[ 54.89271484, 132.89271484, 135.89271484, ..., -50.10728516,
-137.10728516, -105.10728516],
...,
[ 118.89271484, 204.89271484, 195.89271484, ..., 181.89271484,
98.89271484, 131.89271484],
[ 108.89271484, 240.89271484, 199.89271484, ..., 121.89271484,
98.89271484, 134.89271484],
[ 118.89271484, 234.89271484, 199.89271484, ..., 128.89271484,
123.89271484, 145.89271484]])
My goal is to retrieve, as fast as possible, an array with the nth element of each of these arrays (it is an array of arrays),
like array =[45.49061198,43.59677409,42.16040365...]
I tried
for i in range(ffImageArr.shape[0]):
    print(ffImageArr[i,:,:])
but weirdly, [i,:,:] gives the same thing as [:,i:]
Thanks for the help and explanation!
Edit:
Here is the code that I wrote in the meantime; I will try to use polyfit directly as suggested:
for k in range(ffImageArr.shape[1]):
    for i in range(ffImageArr.shape[2]):
        # collect the value of pixel (k, i) across all images
        fffunc = []
        for j in range(ffImageArr.shape[0]):
            fffunc.append(ffImageArr[j,k,i])
        fffunc = np.array(fffunc)
        a = np.polyfit(tempArr,fffunc,1)
        firstOrder0.append(a[1])
        firstOrder1.append(a[0])
        b = np.polyfit(tempArr,fffunc,2)
        secondOrder0.append(b[2])
        secondOrder1.append(b[1])
        secondOrder2.append(b[0])
        c = np.polyfit(tempArr,fffunc,3)
        thirdOrder0.append(c[3])
        thirdOrder1.append(c[2])
        thirdOrder2.append(c[1])
        thirdOrder3.append(c[0])
Assuming these are grayscale images with only one band/channel and not RGB, i.e. each of shape (N, D) and not (N, D, 3), you can use a list comprehension.
import numpy as np
# Generate 5 single-band images of size 8x8
ims = np.random.randn(5, 8, 8)
# Coordinates of the nth value
x = 1
y = 1
arr = [im[y, x] for im in ims]
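If the end goal is the per-pixel polynomial fit from the question's edit, the loops can probably be avoided altogether, because np.polyfit accepts a 2D y with one column per data set. The sketch below uses synthetic data; temp_arr stands in for the question's tempArr, and the shapes and names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, height, width = 6, 4, 5
temp_arr = np.linspace(10.0, 60.0, n_images)  # hypothetical stand-in for tempArr

# Synthetic stack: every pixel depends linearly on temp_arr
slope = rng.normal(size=(height, width))
intercept = rng.normal(size=(height, width))
ff_image_arr = intercept + slope * temp_arr[:, None, None]

# Values of one pixel across all images, without any Python loop
pixel_series = ff_image_arr[:, 1, 2]  # shape (n_images,)

# Fit all pixels at once: reshape to (n_images, height * width)
flat = ff_image_arr.reshape(n_images, -1)
coeffs = np.polyfit(temp_arr, flat, 1)  # shape (2, height * width)
first_order1 = coeffs[0].reshape(height, width)  # slope per pixel
first_order0 = coeffs[1].reshape(height, width)  # intercept per pixel
```

The same call with degree 2 or 3 returns 3 or 4 coefficient rows, highest power first, which matches the ordering used in the loop version.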

scipy.sparse dot extremely slow in Python

The following code will not even finish on my system:
import numpy as np
from scipy import sparse
p = 100
n = 50
X = np.random.randn(p,n)
L = sparse.eye(p,p, format='csc')
X.T.dot(L).dot(X)
Is there any explanation why this matrix multiplication is hanging?
X.T.dot(L) is not, as you may think, a 50x100 matrix, but a 50x100 array whose elements are each a 100x100 sparse matrix:
>>> X.T.dot(L).shape
(50, 100)
>>> X.T.dot(L)[0,0]
<100x100 sparse matrix of type '<type 'numpy.float64'>'
with 100 stored elements in Compressed Sparse Column format>
The problem seems to be that X's dot method, X being a plain array, doesn't know about sparse matrices. One fix is to convert the sparse matrix to dense using its todense or toarray method. The former returns a matrix object, the latter an array:
>>> X.T.dot(L.todense()).dot(X)
matrix([[ 81.85399873, 3.75640482, 1.62443625, ..., 6.47522251,
3.42719396, 2.78630873],
[ 3.75640482, 109.45428475, -2.62737229, ..., -0.31310651,
2.87871548, 8.27537382],
[ 1.62443625, -2.62737229, 101.58919604, ..., 3.95235372,
1.080478 , -0.16478654],
...,
[ 6.47522251, -0.31310651, 3.95235372, ..., 95.72988689,
-18.99209596, 17.31774553],
[ 3.42719396, 2.87871548, 1.080478 , ..., -18.99209596,
108.90045569, -16.20312682],
[ 2.78630873, 8.27537382, -0.16478654, ..., 17.31774553,
-16.20312682, 105.37102461]])
Alternatively, sparse matrices have a dot method that knows about arrays:
>>> X.T.dot(L.dot(X))
array([[ 81.85399873, 3.75640482, 1.62443625, ..., 6.47522251,
3.42719396, 2.78630873],
[ 3.75640482, 109.45428475, -2.62737229, ..., -0.31310651,
2.87871548, 8.27537382],
[ 1.62443625, -2.62737229, 101.58919604, ..., 3.95235372,
1.080478 , -0.16478654],
...,
[ 6.47522251, -0.31310651, 3.95235372, ..., 95.72988689,
-18.99209596, 17.31774553],
[ 3.42719396, 2.87871548, 1.080478 , ..., -18.99209596,
108.90045569, -16.20312682],
[ 2.78630873, 8.27537382, -0.16478654, ..., 17.31774553,
-16.20312682, 105.37102461]])
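The same fix can be written with the @ operator, as long as the sparse matrix is the left operand of the product it takes part in. This is a small sketch using the sizes from the question; the names result and dense_result are made up, and because L is the identity here the answer should equal X.T @ X.

```python
import numpy as np
from scipy import sparse

p, n = 100, 50
X = np.random.randn(p, n)
L = sparse.eye(p, p, format="csc")

# Let the sparse matrix do the multiplication first: sparse @ dense -> dense array
result = X.T @ (L @ X)

# Dense route for comparison (fine for small p, wasteful for large sparse matrices)
dense_result = X.T @ L.toarray() @ X

print(result.shape)  # (50, 50)
```

Keeping L sparse matters mostly when p is large; for a 100x100 identity both routes are cheap.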
