I have seen a few questions similar to mine, but I couldn't find one that fits my case.
I want to iterate over a single axis of my array without using two for loops, to make it faster.
First, I open a bunch of pictures and append them together (converting them to a NumPy array).
Afterwards I get an array of arrays like these:
ffImageArr[0]
array([[ 45.49061198, 172.49061198, 174.49061198, ..., 30.49061198,
-71.50938802, -69.50938802],
[ 60.49061198, 169.49061198, 183.49061198, ..., 0.49061198,
-83.50938802, -66.50938802],
[ 55.49061198, 133.49061198, 135.49061198, ..., -43.50938802,
-130.50938802, -99.50938802],
...,
[ 118.49061198, 203.49061198, 195.49061198, ..., 182.49061198,
97.49061198, 132.49061198],
[ 108.49061198, 238.49061198, 197.49061198, ..., 121.49061198,
99.49061198, 133.49061198],
[ 118.49061198, 232.49061198, 196.49061198, ..., 130.49061198,
123.49061198, 145.49061198]])
ffImageArr[1]
array([[ 43.59677409, 172.59677409, 173.59677409, ..., 29.59677409,
-73.40322591, -71.40322591],
[ 60.59677409, 167.59677409, 182.59677409, ..., 0.59677409,
-86.40322591, -64.40322591],
[ 55.59677409, 133.59677409, 134.59677409, ..., -46.40322591,
-131.40322591, -102.40322591],
...,
[ 119.59677409, 201.59677409, 194.59677409, ..., 180.59677409,
98.59677409, 131.59677409],
[ 109.59677409, 238.59677409, 197.59677409, ..., 119.59677409,
98.59677409, 134.59677409],
[ 117.59677409, 231.59677409, 197.59677409, ..., 129.59677409,
122.59677409, 144.59677409]])
ffImageArr[2]
array([[ 42.16040365, 174.16040365, 177.16040365, ..., 28.16040365,
-75.83959635, -74.83959635],
[ 59.16040365, 168.16040365, 183.16040365, ..., -1.83959635,
-87.83959635, -66.83959635],
[ 54.16040365, 133.16040365, 135.16040365, ..., -47.83959635,
-133.83959635, -103.83959635],
...,
[ 119.16040365, 203.16040365, 196.16040365, ..., 182.16040365,
98.16040365, 132.16040365],
[ 108.16040365, 240.16040365, 199.16040365, ..., 121.16040365,
98.16040365, 132.16040365],
[ 116.16040365, 232.16040365, 196.16040365, ..., 129.16040365,
122.16040365, 143.16040365]])
ffImageArr[3]
array([[ 43.89271484, 174.89271484, 175.89271484, ..., 28.89271484,
-78.10728516, -75.10728516],
[ 59.89271484, 169.89271484, 183.89271484, ..., -2.10728516,
-89.10728516, -67.10728516],
[ 54.89271484, 132.89271484, 135.89271484, ..., -50.10728516,
-137.10728516, -105.10728516],
...,
[ 118.89271484, 204.89271484, 195.89271484, ..., 181.89271484,
98.89271484, 131.89271484],
[ 108.89271484, 240.89271484, 199.89271484, ..., 121.89271484,
98.89271484, 134.89271484],
[ 118.89271484, 234.89271484, 199.89271484, ..., 128.89271484,
123.89271484, 145.89271484]])
My goal is to retrieve, as fast as possible, an array containing the nth element of each of these arrays (each of which is itself an array of arrays),
like array = [45.49061198, 43.59677409, 42.16040365, ...]
I tried
for i in range(ffImageArr.shape[0]):
    print(ffImageArr[i, :, :])
but weirdly, [i,:,:] gives the same thing as [:,i:]
Thanks for the help and explanation!
Edit:
Code that I wrote in the meantime; I will try to use polyfit directly as suggested:
for k in range(ffImageArr.shape[1]):
    for i in range(ffImageArr.shape[2]):
        # Collect the values of pixel (k, i) across all images
        fffunc = []
        for j in range(ffImageArr.shape[0]):
            fffunc.append(ffImageArr[j, k, i])
        fffunc = np.array(fffunc)
        # np.polyfit returns coefficients from the highest power to the lowest
        a = np.polyfit(tempArr, fffunc, 1)
        firstOrder0.append(a[1])
        firstOrder1.append(a[0])
        b = np.polyfit(tempArr, fffunc, 2)
        secondOrder0.append(b[2])
        secondOrder1.append(b[1])
        secondOrder2.append(b[0])  # quadratic coefficient
        c = np.polyfit(tempArr, fffunc, 3)
        thirdOrder0.append(c[3])
        thirdOrder1.append(c[2])
        thirdOrder2.append(c[1])
        thirdOrder3.append(c[0])
Assuming these are grayscale images with only one band/channel and not RGB, i.e. each image has shape (N, D) and not (N, D, 3), you can use a list comprehension.
import numpy as np
# Generate 5 single-band images of size 8x8
ims = np.random.randn(5, 8, 8)
# Coordinates of the nth value
x = 1
y = 1
# Pick the value at row y, column x from every image
arr = [im[y, x] for im in ims]
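The list comprehension can also be replaced by plain NumPy indexing, and the per-pixel polyfit loops from the question's edit can probably be collapsed into a single call, since np.polyfit accepts a 2D y with one column per data set. The following is only a sketch under assumptions: ims and temps are made-up stand-ins for ffImageArr and tempArr, and the sizes are arbitrary.

```python
import numpy as np

# Hypothetical stand-in data: 6 single-band images of size 4x5
rng = np.random.default_rng(0)
ims = rng.normal(size=(6, 4, 5))
temps = np.linspace(10.0, 35.0, 6)    # plays the role of tempArr

# Values of pixel (y, x) across all images, without a Python loop
y, x = 1, 2
pixel_series = ims[:, y, x]           # shape (6,)

# Fit every pixel at once: one column of `flat` per pixel
n_images, height, width = ims.shape
flat = ims.reshape(n_images, -1)      # shape (6, 20)
coeffs = np.polyfit(temps, flat, 1)   # shape (2, 20), highest power first
slope = coeffs[0].reshape(height, width)
intercept = coeffs[1].reshape(height, width)
```

Here slope[k, i] and intercept[k, i] correspond to what the loop version appends to firstOrder1 and firstOrder0 for pixel (k, i); higher degrees work the same way with more rows in coeffs.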
Related
I have a list tensor_list containing tensors in the form of matrices. Some of the matrices have different shapes (e.g. torch.Size([512, 784]) and torch.Size([10, 512])). I want to subtract each element of a tensor from the corresponding element of its successor and store the result in a new list tensor_deltas = []. Since the list contains many tensors, I want to use a Python loop. Here is a tensor_list example with two entries (normally there are many more):
[tensor([[-0.0262, 0.0310, 0.0067, ..., -0.0162, 0.0241, 0.0181],
[-0.0299, 0.0230, -0.0328, ..., 0.0084, -0.0042, -0.0162],
[ 0.0150, 0.0003, -0.0052, ..., 0.0046, 0.0110, 0.0019],
...,
[-0.0346, -0.0283, 0.0035, ..., 0.0010, 0.0279, -0.0162],
[-0.0166, -0.0165, -0.0339, ..., -0.0101, -0.0346, 0.0035],
[ 0.0146, 0.0320, 0.0009, ..., 0.0065, 0.0058, 0.0288]]), tensor([[-6.2551e-03, 1.6126e-02, 3.9450e-02, ..., 1.7971e-05,
2.4612e-02, -4.0139e-02],
[-3.0003e-02, -1.6719e-03, -2.3985e-02, ..., 4.3558e-02,
-1.9130e-02, 2.3564e-02],
[ 2.9886e-02, 3.2086e-02, -4.1213e-02, ..., -2.4083e-02,
2.7199e-02, -4.3203e-02],
...,
[ 2.7709e-02, -2.3003e-02, 4.4214e-03, ..., 2.7394e-02,
-1.6083e-02, -1.7070e-02],
[ 3.7920e-02, 5.7346e-03, -2.7768e-02, ..., 2.0152e-02,
2.6525e-02, -1.8638e-02],
[ 1.9585e-02, -5.5044e-03, 2.6463e-02, ..., -3.2142e-02,
-2.2696e-02, 1.6047e-02]])]
Specifically, I want to apply these operations tensor_deltas.append(abs(-0.0262--6.2551e-03)), tensor_deltas.append(abs(0.0310-1.6126e-02)), etc. to all elements of the tensors. How can I create an efficient loop for all tensors in the list? (Usually there are more than two tensors in the list.)
Thanks for your help!
I was wondering if there is a way to interpolate a 2D array in Python using the same principle used to interpolate a 1D array (np.interp).
My aim is to increase the number of data points in my array (from [1000, 20] to [1000, 200], with axes [Time_indexing, X]).
I am looking for a function that is capable of doing that.
A = np.array([[ 0.45717218, 0.44250104, 0.47812272, 0.49092173, 0.46002069],
[ 0.29829681, 0.26408021, 0.3709202 , 0.44823109, 0.49311853],
[ 0.05469835, 0.01048596, 0.17398291, 0.30088943, 0.39783137],
[-0.20463768, -0.24610673, -0.0713164 , 0.08406331, 0.22047102],
[-0.4074527 , -0.43573695, -0.31062521, -0.15750053, -0.00222392]])
This is a [5,5] array. I want to interpolate it using a spacing of 0.01, hence the final product should be [500,500].
Thank you,
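You could use interp2d (deprecated since SciPy 1.10 and removed in SciPy 1.14, so this needs an older SciPy):
import numpy as np
from scipy.interpolate import interp2d
# Place A on a coarse grid with spacing 100 along each axis (0, 100, ..., 400)
f = interp2d(np.arange(0,500,100), np.arange(0,500,100), A)
# Evaluate the interpolant on the fine grid 0..499
f(np.arange(500), np.arange(500))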
Output:
array([[ 0.45717218, 0.45702547, 0.45687876, ..., 0.46002069,
0.46002069, 0.46002069],
[ 0.45558343, 0.45543476, 0.45528609, ..., 0.46035167,
0.46035167, 0.46035167],
[ 0.45399467, 0.45384405, 0.45369343, ..., 0.46068265,
0.46068265, 0.46068265],
...,
[-0.4074527 , -0.40773554, -0.40801839, ..., -0.00222392,
-0.00222392, -0.00222392],
[-0.4074527 , -0.40773554, -0.40801839, ..., -0.00222392,
-0.00222392, -0.00222392],
[-0.4074527 , -0.40773554, -0.40801839, ..., -0.00222392,
-0.00222392, -0.00222392]])
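A NumPy-only alternative that follows the 1D principle from the question is to apply np.interp along one axis at a time. This is a minimal sketch, not the method of the answer above: resample_axis is a hypothetical helper, the grid is assumed to be regular, and both endpoints are kept, so the spacing is set by the requested number of points.

```python
import numpy as np

def resample_axis(values, n_new, axis):
    """Linearly resample `values` to `n_new` points along `axis` with np.interp."""
    n_old = values.shape[axis]
    old_pos = np.linspace(0.0, 1.0, n_old)   # normalized original positions
    new_pos = np.linspace(0.0, 1.0, n_new)   # normalized target positions
    return np.apply_along_axis(
        lambda line: np.interp(new_pos, old_pos, line), axis, values
    )

# Hypothetical small input on a regular grid
A = np.arange(25, dtype=float).reshape(5, 5)

wide = resample_axis(A, 50, axis=1)      # (5, 5)  -> (5, 50)
fine = resample_axis(wide, 50, axis=0)   # (5, 50) -> (50, 50)
```

For the [1000, 20] to [1000, 200] case, a single call along axis=1 would be enough; resampling both axes in turn amounts to bilinear interpolation on a regular grid.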
I have a NumPy array with the following shape:
(1532, 2036, 5)
I would like to generate a list of arrays where each one has the following shape:
(1532, 2036)
You can use Ellipsis to signify all dimensions up to the last. For example:
arr = np.random.rand(4, 3, 2)
arr
array([[[ 0.35235813, 0.57984153],
[ 0.53743048, 0.46753367],
[ 0.80048303, 0.07982378]],
[[ 0.1339381 , 0.84586721],
[ 0.81425027, 0.41086151],
[ 0.34039991, 0.19972737]],
[[ 0.2112466 , 0.73086434],
[ 0.03755819, 0.40113463],
[ 0.74622891, 0.74695994]],
[[ 0.99313615, 0.65634951],
[ 0.90787642, 0.37387861],
[ 0.8738962 , 0.41747727]]])
The list of arrays along the last dimension can be constructed as @Usernamenotfound mentioned, or with Ellipsis like so:
[arr[..., i] for i in range(arr.shape[-1])]
[array([[ 0.35235813, 0.53743048, 0.80048303],
[ 0.1339381 , 0.81425027, 0.34039991],
[ 0.2112466 , 0.03755819, 0.74622891],
[ 0.99313615, 0.90787642, 0.8738962 ]]),
array([[ 0.57984153, 0.46753367, 0.07982378],
[ 0.84586721, 0.41086151, 0.19972737],
[ 0.73086434, 0.40113463, 0.74695994],
[ 0.65634951, 0.37387861, 0.41747727]])]
Each element has the shape (4, 3).
Likewise you could do the same for the first dimension, making 4 arrays of shape (3, 2).
[arr[i, ...] for i in range(arr.shape[0])]
[array([[ 0.35235813, 0.57984153],
[ 0.53743048, 0.46753367],
[ 0.80048303, 0.07982378]]), array([[ 0.1339381 , 0.84586721],
[ 0.81425027, 0.41086151],
[ 0.34039991, 0.19972737]]), array([[ 0.2112466 , 0.73086434],
[ 0.03755819, 0.40113463],
[ 0.74622891, 0.74695994]]), array([[ 0.99313615, 0.65634951],
[ 0.90787642, 0.37387861],
[ 0.8738962 , 0.41747727]])]
You can also permute the axes with numpy.transpose and then simply iterate through the array:
import numpy as np
arr = ... # Define the input array here
# Move the last axis to the front, then iterate over it
out = [a for a in np.transpose(arr, (2, 0, 1))]
You can slice the 3D array using
[x[:,:,i] for i in range(5)]
The above would give you a list of 2D arrays.
The same process can be extended to arrays with more dimensions.
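For arrays with an arbitrary number of dimensions, np.moveaxis gives the same result without spelling out the full axis permutation. A small sketch with a made-up (4, 3, 5) array standing in for the (1532, 2036, 5) one:

```python
import numpy as np

# Hypothetical stand-in for the (1532, 2036, 5) array
arr = np.random.rand(4, 3, 5)

# Bring the last axis to the front, then split along it
slices = list(np.moveaxis(arr, -1, 0))
```

Each element of slices is a view into arr rather than a copy, so writing to a slice also changes arr.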
I have two monthly gridded data sets which I want to compare later.
The input looks like this for both data sets, and that is also how I want the output to look.
In[4]: data1.shape
Out[4]: (444, 72, 144)
In[5]: gfz.shape
Out[5]: (155, 72, 144)
In[6]: data1
Out[6]:
array([[[ 0.98412287, 0.96739882, 0.91172796, ..., 1.12651634,
1.0682013 , 1.07681048],
[ 1.47803092, 1.44721365, 1.49585509, ..., 1.58934438,
1.66956687, 1.57198083],
[ 0.68730044, 0.76112831, 0.78218687, ..., 0.92582172,
1.07873237, 0.87490368],
...,
[ 1.00752461, 1.00758123, 0.99440521, ..., 0.94128627,
0.88981551, 0.93984401],
[ 1.03467119, 1.02640462, 0.91580886, ..., 0.88302392,
0.99204206, 0.96396238],
[ 0.8280431 , 0.82936555, 0.82637453, ..., 0.92009377,
0.77890259, 0.81065702]],
...,
[[-0.12173297, -0.06624345, -0.02809682, ..., -0.04522502,
-0.11502996, -0.22779272],
[-0.61080372, -0.61958522, -0.52239478, ..., -0.6775983 ,
-0.79460669, -0.70022893],
[-0.12011283, -0.10849079, 0.096185 , ..., -0.45782232,
-0.39763898, -0.31247514],
...,
[ 0.90601307, 0.88580155, 0.90268403, ..., 0.86414611,
0.87041426, 0.86274058],
[ 1.46445823, 1.31938004, 1.37585044, ..., 1.51378822,
1.48515761, 1.49078977],
[ 0.29749078, 0.22273554, 0.27161494, ..., 0.43205476,
0.43777165, 0.36340511]],
[[ 0.41008961, 0.44208974, 0.40928891, ..., 0.45899671,
0.39472976, 0.36803097],
[-0.13514084, -0.17332518, -0.11183424, ..., -0.22284794,
-0.2532815 , -0.15402752],
[ 0.28614867, 0.33750001, 0.48767376, ..., 0.01886483,
0.07220326, 0.17406547],
...,
[ 1.0551219 , 1.09540403, 1.19031584, ..., 1.09203815,
1.07658005, 1.08363533],
[ 1.54310501, 1.49531853, 1.56107259, ..., 1.57243073,
1.5867976 , 1.57728028],
[ 1.1034857 , 0.98658448, 1.14141166, ..., 0.97744882,
1.13562942, 1.08589089]],
[[ 1.02020931, 0.99780071, 0.87209344, ..., 1.11072564,
1.01270151, 0.9222675 ],
[ 0.93467152, 0.81068456, 0.68190312, ..., 0.95696563,
0.84669352, 0.84596157],
[ 0.97022212, 0.94228816, 0.97413743, ..., 1.06613588,
1.08708596, 1.04224277],
...,
[ 1.21519053, 1.23492992, 1.2802881 , ..., 1.33915019,
1.32537413, 1.27963519],
[ 1.32051706, 1.28170252, 1.36266208, ..., 1.29100537,
1.38395023, 1.34622073],
[ 0.86108029, 0.86364979, 0.88489276, ..., 0.81707358,
0.82471925, 0.83550251]]], dtype=float32)
So both have the same spatial resolution of 144x72 but a different length in time.
As one of them has some missing months, I made sure that only the months where both have data are selected. So I created a two-dimensional array where the data is stored according to longitude and latitude if both data sets contain this month. In the end I want to have a three-dimensional array for data1 and data2 of the same length.
3Darray_data1 =[]
3Darray_data2=[]
xy_data1=[[0 for i in range(len(lons_data1))] for j in range(len(lats_data1))]
xy_data2=[[0 for i in range(len(lons_data2))] for j in range(len(lats_data2))]
# comparing the time steps
for i in range(len(time_data1)):
    for j in range(len(time_data2)):
        if time_data1.year[i] == time_data2[j].year and time_data1[i].month==time_data2[j].month:
            # loop for data1 which writes the data into a 2D array
            for x in range(len(lats_data1)):
                for y in range(len(lons_data1)):
                    xy_data1[x][y]=data1[j,0,x,y]
            # append to get an array of arrays
            xy_data1 = np.squeeze(np.asarray(xy_data1))
            3Darray_data1 = np.append(3Darray_data1,[xy_data1])
            # loop for data2 which writes the data into a 2D array
            for x in range(len(lats_data2)):
                for y in range(len(lons_data2)):
                    xy_data2[x][y]=data2[i,x,y]
            # append to get an array of arrays
            xy_data2 = np.squeeze(np.asarray(xy_data2))
            3Darray_data2 = np.append(3Darray_data2,[xy_data2])
The script runs without an error, however, I only get a really long 1D array.
In[3]: 3Darray_data1
Out[3]: array([ nan, nan, nan, ..., 0.81707358,
0.82471925, 0.83550251])
How can I arrange it into a three-dimensional array?
I got it working with the following.
I defined the three-dimensional array with fixed latitude and longitude dimensions and a time axis of length zero, so that it can grow.
array3d_data1 = np.zeros((0, len(lats_data1), len(lons_data1)))
Then I appended the two-dimensional outputs along the time axis.
array3d_data1 = np.append(array3d_data1, xy_data1[np.newaxis, :, :], axis=0)
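The 1D result in the question comes from np.append flattening its inputs when no axis is given. As a possible alternative to growing the array step by step, the matching months can be selected with index arrays in one go. This is only a sketch under assumptions: the YYYYMM integer keys (months1, months2) and the small grids are hypothetical stand-ins for the real time axes and data.

```python
import numpy as np

# Hypothetical stand-ins: month keys as YYYYMM integers and small grids
months1 = np.array([201001, 201002, 201003, 201004])
months2 = np.array([201002, 201004, 201005])
data1 = np.random.rand(4, 2, 3)   # (time, lat, lon)
data2 = np.random.rand(3, 2, 3)

# Without axis=..., np.append flattens everything into one 1D array
flattened = np.append(np.array([]), data1[0])

# Indices of the months that are present in both series
common, idx1, idx2 = np.intersect1d(months1, months2, return_indices=True)

# Indexing along the time axis keeps the (lat, lon) grid intact
matched1 = data1[idx1]
matched2 = data2[idx2]
```

Both matched1 and matched2 then share the same time length and stay three-dimensional, which is the shape the question asks for.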
Take a 2D numpy.array, let's say:
mat = numpy.random.rand(3,3)
In [153]: mat
Out[153]:
array([[ 0.16716156, 0.90822617, 0.83888038],
[ 0.89771815, 0.62627978, 0.34992542],
[ 0.11097042, 0.80858005, 0.0437299 ]])
Changing the diagonal entries to numpy.nan is quite straightforward.
Either of the following works great:
In [154]: diag = numpy.diag_indices(mat.shape[0], ndim = 2)
In [155]: mat[diag] = numpy.nan
or
In [156]: numpy.fill_diagonal(mat, numpy.nan)
But let's say I have a 3D array, where I want the exact same process applied to every 2D matrix along the first dimension.
mat = numpy.random.rand(3, 5, 5)
In [158]: mat
Out[158]:
array([[[ 0.65000325, 0.71059547, 0.31880388, 0.24818623, 0.57722849],
[ 0.26908326, 0.41962004, 0.78642476, 0.25711662, 0.8662998 ],
[ 0.15332566, 0.12633147, 0.54032977, 0.17322095, 0.17210078],
[ 0.81952873, 0.20751669, 0.73514815, 0.00884358, 0.89222687],
[ 0.62775839, 0.53657471, 0.99611842, 0.75051645, 0.59328044]],
[[ 0.28718216, 0.84982865, 0.27830082, 0.90604492, 0.43119512],
[ 0.43039373, 0.76557782, 0.58089787, 0.81135684, 0.39151152],
[ 0.70592711, 0.30625204, 0.9753166 , 0.32806864, 0.21947731],
[ 0.74600317, 0.33711673, 0.16203076, 0.6002213 , 0.74996638],
[ 0.63555715, 0.71719058, 0.81420001, 0.28968442, 0.01368163]],
[[ 0.06474027, 0.51966572, 0.006429 , 0.98590784, 0.35708074],
[ 0.44977222, 0.63719921, 0.88325451, 0.53820139, 0.51526687],
[ 0.98529117, 0.46219441, 0.09349748, 0.11406291, 0.47697128],
[ 0.77446136, 0.87423445, 0.71810465, 0.39019846, 0.94070077],
[ 0.09154989, 0.36295161, 0.19740833, 0.17803146, 0.6498038 ]]])
A logical way to do that (I would think), is:
mat[:, diag] = numpy.nan # doesn't do it
In fact, to accomplish this, I need to:
In [190]: rng = numpy.arange(5)
In [191]: for i in numpy.arange(mat.shape[0]):
   .....:     mat[i, rng, rng] = numpy.nan
   .....:
In [192]: mat
Out[192]:
array([[[ nan, 0.4040426 , 0.89449522, 0.63593736, 0.94922036],
[ 0.40682651, nan, 0.30812181, 0.01726625, 0.75655994],
[ 0.23925763, 0.41476223, nan, 0.91590111, 0.18391644],
[ 0.99784977, 0.71636554, 0.21252766, nan, 0.24195636],
[ 0.41137357, 0.84705055, 0.60086461, 0.16403918, nan]],
[[ nan, 0.26183712, 0.77621913, 0.5479058 , 0.17142263],
[ 0.17969373, nan, 0.89742863, 0.65698339, 0.95817106],
[ 0.79048886, 0.16365168, nan, 0.97394435, 0.80612441],
[ 0.94169129, 0.10895737, 0.92614597, nan, 0.08689534],
[ 0.20324943, 0.91402716, 0.23112819, 0.2556875 , nan]],
[[ nan, 0.43177039, 0.76901587, 0.82069345, 0.64351534],
[ 0.14148584, nan, 0.35820379, 0.17434688, 0.78884305],
[ 0.85232784, 0.93526843, nan, 0.80981366, 0.57326785],
[ 0.82104636, 0.63453196, 0.5872653 , nan, 0.96214559],
[ 0.69959383, 0.70257404, 0.92471502, 0.50077728, nan]]])
It's for an application where speed is of the utmost importance, so if there isn't an array-based implementation of the above, I'm going to do the for-loop / assignment in Cython.
This seems to work:
diag = numpy.diag_indices(mat.shape[1], ndim = 2)
mat[:, diag[0], diag[1]] = numpy.nan
The problem is that diag is a 2-element tuple, so using it as-is in a 3D index won't work, and using *diag inside the index is invalid syntax in Python versions before 3.11. However, you can also do this:
diag = (Ellipsis, *numpy.diag_indices(mat.shape[-1], ndim = 2))
mat[diag] = numpy.nan
In this case, diag is the three-element tuple you need to use it as an index. Ellipsis is the object that represents : repeated as many times as necessary in the index. This version will work for any number of dimensions >2 where the last two represent the square matrices you want.
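The Ellipsis approach can be wrapped in a small function and checked on a stack of matrices. This is a sketch, with fill_diagonals being a hypothetical helper name; it assumes the last two axes form square matrices.

```python
import numpy as np

def fill_diagonals(stack, value=np.nan):
    """Set the main diagonal of every trailing square matrix, in place."""
    rows, cols = np.diag_indices(stack.shape[-1], ndim=2)
    # Ellipsis stands for all leading axes, however many there are
    stack[..., rows, cols] = value
    return stack

mat = np.random.rand(3, 5, 5)
original = mat.copy()
fill_diagonals(mat)
```

Because the leading axes are covered by Ellipsis, the same call also works for a 4D array such as one of shape (2, 3, 4, 4).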
Using linear indexing -
m,n,r = mat.shape
mat.reshape(m,-1)[:,np.arange(r)*(r+1)] = np.nan
Using slicing and boolean indexing -
m,n,r = mat.shape
mat.reshape(m,-1)[:,np.eye(n,r,dtype=bool).ravel()] = np.nan
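The linear-indexing variant relies on two facts: in a flattened n-by-n matrix the diagonal entries sit at positions 0, n+1, 2(n+1), and so on, and reshape returns a view for a contiguous array, so the assignment reaches the original. A short sketch with made-up sizes that checks both assumptions:

```python
import numpy as np

m, n = 3, 5
mat = np.random.rand(m, n, n)

# For a C-contiguous array, reshape gives a view, so writes reach `mat`
flat = mat.reshape(m, -1)                  # shape (3, 25)

# Flattened positions of the diagonal: 0, 6, 12, 18, 24 for n = 5
diag_positions = np.arange(n) * (n + 1)
flat[:, diag_positions] = np.nan
```

If mat were not contiguous (for example a transposed or strided slice), reshape could return a copy and the original would stay unchanged, so the approach should be used with that caveat in mind.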