remove nan values from np array - python

my_list=[[ 0., 40. , nan],
[60. , 0. , nan],
[ nan , nan , nan]]
Is it possible to remove the nan values?
Expected output:
my_list=[[0.,40.],
[60., 0.]]

import numpy as np
x=np.array([[ 0., 40. , np.nan],
[60. , 0. , np.nan],
[ np.nan , np.nan , np.nan]])
x = x[~np.isnan(x).all(axis=1), :]  # remove rows that are all nan
x = x[:, ~np.isnan(x).all(axis=0)]  # remove cols that are all nan
Output
[[ 0. 40.]
[60. 0.]]
But as @mozway said, if a row or column contains at least one non-nan value, it will remain in the result.
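If you instead want to drop every row or column that contains even a single nan, swap .all() for .any(), a minimal variation on the code above:
x = x[~np.isnan(x).any(axis=1), :]  # remove rows containing at least one nan
x = x[:, ~np.isnan(x).any(axis=0)]  # remove cols containing at least one nan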


How to remove imaginary values from numpy array

I have a question that is similar to the one posted here, but their solution is not working for this case. Here is a code snippet:
import numpy as np
x = np.arange(-5, 6)
y = np.sqrt(x)[np.logical_not(np.isnan(x))]
print(y)
Output
[ nan nan nan nan nan 0.
1. 1.41421356 1.73205081 2. 2.23606798]
C:\Users\gmbra\Downloads\Python Programs\Mechanisms\scratch.py:4: RuntimeWarning: invalid value encountered in sqrt
y = np.sqrt(x)[np.logical_not(np.isnan(x))]
np.logical_not is not working as expected: what was expected was an array with no nan values. On a side note, how can I remove the warning from trying to take the square root of a negative number?
I would like to add that I will be performing other operations which will produce nan values. I just want to ignore those.
ufuncs like np.sqrt take a where parameter that lets us select which elements to evaluate. We need to provide an out array as well, since elements where the condition is False are left untouched (and uninitialized if no out is given).
In [783]: x = np.arange(-5,5)
In [784]: out=np.zeros(x.shape)
In [785]: np.sqrt(x, where=x>=0, out=out)
Out[785]:
array([0. , 0. , 0. , 0. , 0. ,
0. , 1. , 1.41421356, 1.73205081, 2. ])
In [786]: out
Out[786]:
array([0. , 0. , 0. , 0. , 0. ,
0. , 1. , 1.41421356, 1.73205081, 2. ])
This approach keeps the same size, as opposed to returning just the valid elements.
And for completeness, compare sqrt with the plain integer argument and with complex input:
In [787]: np.sqrt(x)
<ipython-input-787-0b43c7e80401>:1: RuntimeWarning: invalid value encountered in sqrt
np.sqrt(x)
Out[787]:
array([ nan, nan, nan, nan, nan,
0. , 1. , 1.41421356, 1.73205081, 2. ])
In [788]: np.sqrt(x.astype(complex))
Out[788]:
array([0. +2.23606798j, 0. +2.j ,
0. +1.73205081j, 0. +1.41421356j,
0. +1.j , 0. +0.j ,
1. +0.j , 1.41421356+0.j ,
1.73205081+0.j , 2. +0.j ])
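For completeness: the np.logical_not in the question "fails" because np.isnan(x) tests the input x, which contains no nans; the nans only appear in np.sqrt(x). Filtering the result instead works, and np.errstate suppresses the RuntimeWarning. A minimal sketch:
import numpy as np

x = np.arange(-5, 6)
with np.errstate(invalid='ignore'):  # silence "invalid value encountered in sqrt"
    y = np.sqrt(x)
y = y[~np.isnan(y)]                  # filter the result, which is where the nans are
print(y)  # [0. 1. 1.41421356 1.73205081 2. 2.23606798]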

Inserting complex functions in a python code

I have been trying to insert $e^{ix}$ as a matrix element.
The main aim is to find the eigenvalues of a matrix which has many complex functions as elements. Can anyone help me with how to insert it? My failed attempt is below:
for i in range(0, size):
    H[i, i] = -2 * cmath.exp((i + 1) * aj)
    H[i, i + 1] = 1.0
    H[i, i - 1] = 1.0
'a' is defined earlier in the program. The error flagged shows that aj is not defined. Using cmath, I thought a complex number could be exponentiated as (x+yj). Unfortunately, I couldn't figure out the right way to use it. Any help would be appreciated.
Define a small float array:
In [214]: H = np.eye(3)
In [215]: H
Out[215]:
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
Create a complex number:
In [216]: 1+3j
Out[216]: (1+3j)
In [217]: np.exp(1+3j)
Out[217]: (-2.6910786138197937+0.383603953541131j)
Trying to assign it to H:
In [218]: H[1,1]=np.exp(1+3j)
<ipython-input-218-6c0b228d2833>:1: ComplexWarning: Casting complex values to real discards the imaginary part
H[1,1]=np.exp(1+3j)
In [219]: H
Out[219]:
array([[ 1. , 0. , 0. ],
[ 0. , -2.69107861, 0. ],
[ 0. , 0. , 1. ]])
Now make a complex dtype array:
In [221]: H = np.eye(3).astype(complex)
In [222]: H[1,1]=np.exp(1+3j)
In [223]: H
Out[223]:
array([[ 1. +0.j , 0. +0.j ,
0. +0.j ],
[ 0. +0.j , -2.69107861+0.38360395j,
0. +0.j ],
[ 0. +0.j , 0. +0.j ,
1. +0.j ]])
edit
For an array of values:
In [225]: a = np.array([1,2,3])
In [226]: np.exp(a+1j*a)
Out[226]:
array([ 1.46869394+2.28735529j, -3.07493232+6.7188497j ,
-19.88453084+2.83447113j])
In [228]: H[:,0]=np.exp(a+1j*a)
In [229]: H
Out[229]:
array([[ 1.46869394+2.28735529j, 0. +0.j ,
0. +0.j ],
[ -3.07493232+6.7188497j , -2.69107861+0.38360395j,
0. +0.j ],
[-19.88453084+2.83447113j, 0. +0.j ,
1. +0.j ]])
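As for the original NameError: Python parses aj as an ordinary variable name, not as a times the imaginary unit. The imaginary-unit literal is 1j, so the product has to be written a*1j. A minimal corrected sketch of the asker's loop, assuming a placeholder value for a and guarding the off-diagonal indices:
import numpy as np

size = 4
a = 0.5  # placeholder; 'a' is defined elsewhere in the asker's program

H = np.zeros((size, size), dtype=complex)  # complex dtype, so nothing is discarded
for i in range(size):
    H[i, i] = -2 * np.exp((i + 1) * a * 1j)  # a*1j, not the undefined name aj
    if i + 1 < size:
        H[i, i + 1] = 1.0
    if i > 0:
        H[i, i - 1] = 1.0

eigenvalues = np.linalg.eigvals(H)  # eigenvalues of the complex matrix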

How can I reshape this NumPy array correctly?

I need to combine the rows in this array:
array([[0. , 1. , 0.44768612],
[0.34177215, 1. , 0. ]])
So that the output is:
array([[0., 0.34177215], [1., 1.], [0.44768612, 0.]])
But for some reason, I can't figure it out with the reshape function. Any help would be appreciated.
If x is your array, x.T will transpose it:
array([[0. , 1. , 0.44768612],
[0.34177215, 1. , 0. ]])
becomes
array([[0. , 0.34177215],
[1. , 1. ],
[0.44768612, 0. ]])
If the array is A, just do A.T...
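For the record, reshape alone cannot produce this result: reshape only re-chunks the elements in their existing row-major order, while the desired output pairs up elements taken from different rows, which is exactly what a transpose does. A quick comparison:
import numpy as np

x = np.array([[0., 1., 0.44768612],
              [0.34177215, 1., 0.]])

print(x.reshape(3, 2))  # wrong: [[0., 1.], [0.44768612, 0.34177215], [1., 0.]]
print(x.T)              # right: [[0., 0.34177215], [1., 1.], [0.44768612, 0.]]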

How to interpolate/extrapolate within partly empty regular grid?

I would like to create a python function to linearly interpolate within a partly empty grid and get a nearest extrapolation out of bounds.
Let's say I have the following data stored in pandas DataFrame:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: x = [0,1,2,3,4]
In [4]: y = [0.5,1.5,2.5,3.5,4.5,5.5]
In [5]: z = np.array([[np.nan, np.nan, 1.5, 2.0, 5.5, 3.5],
   ...:               [np.nan, 1.0, 4.0, 2.5, 4.5, 3.0],
   ...:               [2.0, 0.5, 6.0, 1.5, 3.5, np.nan],
   ...:               [np.nan, 1.5, 4.0, 2.0, np.nan, np.nan],
   ...:               [np.nan, np.nan, 2.0, np.nan, np.nan, np.nan]])
In [6]: df = pd.DataFrame(z,index=x,columns=y)
In [7]: df
Out[7]:
0.5 1.5 2.5 3.5 4.5 5.5
0 NaN NaN 1.5 2.0 5.5 3.5
1 NaN 1.0 4.0 2.5 4.5 3.0
2 2.0 0.5 6.0 1.5 3.5 NaN
3 NaN 1.5 4.0 2.0 NaN NaN
4 NaN NaN 2.0 NaN NaN NaN
I would like to get a function myInterp that returns a linear interpolation within the data boundaries (i.e. not NaN values) and a nearest extrapolation outside the bounds (i.e. NaN or no values), such as:
In [1]: myInterp([1.5,2.5]) #linear interpolation
Out[1]: 5.0
In [2]: myInterp([1.5,4.0]) #bi-linear interpolation
Out[2]: 3.0
In [3]: myInterp([0.0,2.0]) #nearest extrapolation (inside grid)
Out[3]: 1.5
In [4]: myInterp([5.0,2.5]) #nearest extrapolation (outside grid)
Out[4]: 2.0
I tried many combinations of the scipy.interpolate package with no success. Does anyone have a suggestion for how to do it?
Yes, unfortunately scipy doesn't deal with nans.
From the docs:
Note that calling interp2d with NaNs present in input values results in undefined behaviour.
Even masking the nans in an np.ma.masked_array was not successful.
So my advice would be to remove all the nan entries from z, taking the opportunity to give sp.interp2d the full list of x- and y-coordinates for only the valid data, with z flattened to 1-D as well:
X = []; Y = []; Z = []             # initialize new 1-D lists for interp2d
for i, xi in enumerate(x):         # iterate through x
    for k, yk in enumerate(y):     # iterate through y
        if not np.isnan(z[i, k]):  # check if the z-value is valid...
            X.append(xi)           # ...and if so, append coordinates and value
            Y.append(yk)
            Z.append(z[i, k])
This way at least sp.interp2d works and gives a result:
ip = sp.interpolate.interp2d(X,Y,Z)
However, the values in the result won't please you:
In: ip(x,y)
Out:
array([[ 18.03583061, -0.44933642, 0.83333333, -1. , -1.46105542],
[ 9.76791531, 1.3014037 , 2.83333333, 1.5 , 0.26947229],
[ 1.5 , 3.05214381, 4.83333333, 4. , 2. ],
[ 2. , 3.78378051, 1.5 , 2. , 0.8364618 ],
[ 5.5 , 3.57039277, 3.5 , -0.83019815, -0.7967441 ],
[ 3.5 , 3.29227922, 17.29607177, 0. , 0. ]])
compared to the input data:
In:z
Out:
array([[ nan, nan, 1.5, 2. , 5.5, 3.5],
[ nan, 1. , 4. , 2.5, 4.5, 3. ],
[ 2. , 0.5, 6. , 1.5, 3.5, nan],
[ nan, 1.5, 4. , 2. , nan, nan],
[ nan, nan, 2. , nan, nan, nan]])
But IMHO this is because the gradient changes in your data are far too high, even more so given the low number of data samples.
I hope this is just a test data set and your real application has smoother gradients and some more samples. Then I'd be glad to hear if it works...
However, a trivial test with an array of zero gradient, only disrupted a little by nans, gives a hint that interpolation should work, while extrapolation is only partly correct:
In:ip(x,y)
Out:
array([[ 3. , 3. , 3. , 3. , 0. ],
[ 3. , 3. , 3. , 3. , 1.94701008],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 1.54973345],
[ 3. , 3. , 3. , 3. , 0.37706713],
[ 3. , 3. , 2.32108317, 0.75435203, 0. ]])
resulting from the trivial test input
In:z
Out:
array([[ nan, nan, 3., 3., 3., 3.],
[ nan, 3., 3., nan, 3., 3.],
[ 3., 3., 3., 3., 3., nan],
[ nan, 3., 3., 3., nan, nan],
[ nan, nan, 3., nan, nan, nan]])
PS: Looking closer at the right-hand side, there are even valid entries that were completely changed, i.e. made wrong, which introduces errors into any following analysis.
But surprise: the cubic version performs much better here:
In:ip = sp.interpolate.interp2d(X,Y,Z, kind='cubic')
In:ip(x,y)
Out:
array([[ 3. , 3. , 3. , 3.02397028, 3.0958811 ],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 3. , 3. ],
[ 3. , 3. , 3. , 2.97602972, 2.9041189 ],
[ 3. , 3. , 3. , 2.9041189 , 2.61647559]])
In:z
Out:
array([[ nan, nan, 3., 3., 3., 3.],
[ nan, 3., 3., nan, 3., 3.],
[ 3., 3., 3., 3., 3., nan],
[ nan, 3., 3., 3., nan, nan],
[ nan, nan, 3., nan, nan, nan]])
Since scipy's interp2d doesn't deal with NaNs, the solution is to fill the NaNs in the DataFrame before using interp2d. This can be done with the pandas interpolate function.
In the previous example, the following provides the desired output:
In [1]: from scipy.interpolate import interp2d
In [2]: df.interpolate(limit_direction='both', axis=1, inplace=True)
In [3]: myInterp = interp2d(df.index,df.columns,df.values.T)
In [4]: myInterp(1.5,2.5)
Out[4]: array([5.])
In [5]: myInterp(1.5,4.0)
Out[5]: array([3.])
In [6]: myInterp(0.0,2.0)
Out[6]: array([1.5])
In [7]: myInterp(5.0,2.5)
Out[7]: array([2.])
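Note that interp2d was deprecated in SciPy 1.10 and has been removed from recent releases. The same fill-then-interpolate idea can be sketched with RegularGridInterpolator, clipping out-of-bounds queries to the grid edges to emulate the nearest extrapolation (a sketch, not the original answer's code):
import numpy as np
import pandas as pd
from scipy.interpolate import RegularGridInterpolator

x = np.array([0., 1., 2., 3., 4.])
y = np.array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5])
z = np.array([[np.nan, np.nan, 1.5, 2.0, 5.5, 3.5],
              [np.nan, 1.0, 4.0, 2.5, 4.5, 3.0],
              [2.0, 0.5, 6.0, 1.5, 3.5, np.nan],
              [np.nan, 1.5, 4.0, 2.0, np.nan, np.nan],
              [np.nan, np.nan, 2.0, np.nan, np.nan, np.nan]])

# fill the NaNs along the rows first, as in the pandas answer above
filled = pd.DataFrame(z, index=x, columns=y).interpolate(
    limit_direction='both', axis=1).values

rgi = RegularGridInterpolator((x, y), filled, method='linear')

def myInterp(pt):
    # clip the query to the grid bounds so out-of-bounds points
    # fall back to the nearest edge value
    pt = np.clip(pt, [x[0], y[0]], [x[-1], y[-1]])
    return rgi(pt).item()

print(myInterp([1.5, 2.5]))  # 5.0
print(myInterp([5.0, 2.5]))  # 2.0 (clamped to x=4)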

Interpolate NaN values in a big matrix (not just a list) in python

I'm searching for a simple method to interpolate a matrix in which about 10% of the values are NaN. For instance:
matrix = np.array([[np.nan, np.nan, 2., 3., 4.],
                   [np.nan, 6., 7., 8., 9.],
                   [10., 11., 12., 13., 14.],
                   [15., 16., 17., 18., 19.],
                   [np.nan, np.nan, 22., 23., np.nan]])
I found a solution that uses griddata from scipy.interpolate, but it takes too much time. (My matrix has about 50 columns and 200,000 rows, and the rate of NaN values is not higher than 10%.)
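For reference, a minimal sketch of the kind of griddata approach the question alludes to, assuming the matrix defined above: it treats the valid entries as scattered samples and interpolates only at the NaN positions ('nearest' is used here so that entries outside the convex hull of valid points are still filled):
import numpy as np
from scipy.interpolate import griddata

mask = np.isnan(matrix)
rows, cols = np.indices(matrix.shape)
filled = matrix.copy()
filled[mask] = griddata(
    (rows[~mask], cols[~mask]),  # coordinates of the valid values
    matrix[~mask],               # the valid values themselves
    (rows[mask], cols[mask]),    # coordinates that need filling
    method='nearest')            # 'linear' leaves NaNs outside the convex hull
print(filled)
On a 200,000 x 50 matrix this is still expensive; interpolating each column (or row) in 1-D, e.g. with pandas DataFrame.interpolate, is usually far cheaper when the NaN rate is only about 10%.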
