I have a numpy array of floats which, when plotted, looks like this: the red circles are the original values, the blue crosses are a linear interpolation using numpy.interp.
I would like to find the abscissa of the zero crossing of this numpy array (red circle) using scipy.optimize.bisect (for example). Since this is a numpy array (and not a function), I cannot pass it directly to scipy.optimize.bisect, so I was thinking of passing a function that interpolates the numpy array to bisect. Here is the code I am using for the moment:
def Inter_F(x, xp, fp):
    return np.interp(x, xp, fp)

# scp is assumed to be an alias for scipy (import scipy as scp; import scipy.optimize)
Numpyroot = scp.optimize.bisect(Inter_F, 0, 9, args=(XNumpy, YNumpy))
I find a value that seems correct, Numpyroot = 3.376425289196618.
I am wondering:
1. whether this is the correct way to use scipy.optimize.bisect on a numpy array, especially since I am going to do this 10^6 times on different sets of numpy values;
2. whether enforcing a linear interpolation influences the results that bisect finds, and if so, whether there is a better choice.
Here are the two numpy arrays:
XNumpy = array([ 0. , 1.125, 2.25 , 3.375, 4.5 , 5.625, 6.75 , 7.875, 9. ])
YNumpy = array([ -2.70584242e+04, -2.46925289e+04, -1.53211676e+04,
-2.30000000e+01, 1.81312104e+04, 3.41662461e+04,
4.80466863e+04, 5.75113178e+04, 6.41718009e+04])
I think what you are doing is correct. However, there is a more concise way.
import numpy as np
from scipy.interpolate import interp1d
XNumpy = np.array([0., 1.125, 2.25, 3.375, 4.5, 5.625, 6.75, 7.875, 9.])
YNumpy = np.array([
-2.70584242e+04, -2.46925289e+04, -1.53211676e+04,
-2.30000000e+01, 1.81312104e+04, 3.41662461e+04,
4.80466863e+04, 5.75113178e+04, 6.41718009e+04
])
invf = interp1d(YNumpy, XNumpy)
print(invf(0))
Result:
array(3.376425289199028)
Here I use scipy.interpolate.interp1d to return a function. I also interpolate the inverse function, so the abscissa is obtained directly. Of course you can do the same trick with np.interp; I just like scipy.interpolate.interp1d because it returns a function, so I can calculate the x value for any given y value.
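Regarding the 10^6 repetitions: since the interpolant is piecewise linear, its zero crossing can also be computed in closed form from the bracketing pair of points, without calling bisect at all, which should be noticeably cheaper in a tight loop. A minimal sketch, assuming (as in your data) that YNumpy is increasing and crosses zero exactly once:

import numpy as np

def zero_crossing(xp, fp):
    """Abscissa where the piecewise-linear interpolant of (xp, fp) crosses zero.
    Assumes fp is increasing and changes sign exactly once."""
    i = np.searchsorted(fp, 0.0)                 # first index with fp[i] >= 0
    x0, x1 = xp[i - 1], xp[i]
    y0, y1 = fp[i - 1], fp[i]
    return x0 - y0 * (x1 - x0) / (y1 - y0)       # linear interpolation between the bracketing points

print(zero_crossing(XNumpy, YNumpy))             # ~3.376425..., same as the bisect result

On the second point: bisect applied to the piecewise-linear interpolant converges to exactly this crossing, so the interpolation choice (linear vs. e.g. a cubic interp1d) affects the answer more than the root finder does.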
I want to use scipy.optimize.curve_fit to fit a 2D array (a 10x10 array) with a function defined as follows
def musq(dz, y):
    return 1.0/(1.0 + y**2*(dz/dz[:, None])**2)
This function musq takes in a 1D array (dz = np.arange(0.1, 1.1, 0.1)) and returns a 2D array. When I try to fit the data with this function, I get ValueError: object too deep for desired array. I understand it must have something to do with the input and output shape mismatch...
But what is the proper way to fit a function with 1D array input that returns a 2D array?
My code and values are as follows:

from scipy.optimize import curve_fit
import numpy as np

dz = np.arange(0.1, 1.1, 0.1)
dat = np.mgrid[0.1:1.1:0.1, 0.1:1.1:0.1][0]
ans = curve_fit(musq, dz, dat)
curve_fit is not really intended for this type of problem, but luckily it simply calls least_squares under the hood, which can be used to solve the problem directly:
from scipy.optimize import least_squares
import numpy as np

def musq(x, param):
    return 1.0/(1.0 + param**2*(x/x[:, None])**2)

x = np.arange(0.1, 1.1, 0.1)
param = np.arange(10)
y = musq(x, param)                      # synthetic data generated at the "true" parameters

# minimise the flattened residual between the model and the data
result = least_squares(lambda param: musq(x, param).ravel() - y.ravel(),
                       x0=np.zeros_like(param))
This seems to give the correct result:
>>> result.x
array([ 0. , 1. , 2. , 3. , 4. ,
5. , 6. , 7. , 7.99999996, 8.99999922])
Based on the answer by @Jonas Adler, adding ravel() to both the function's return value and the data seemed to do the trick for curve_fit directly. Here is my solution (although in this case the solution doesn't seem to give a good fit):
from scipy.optimize import curve_fit
import numpy as np

def musq(dz, y):
    res = 1.0/(1.0 + y**2*(dz/dz[:, None])**2)
    return res.ravel()                  # flatten the 2D model output to 1D

dz = np.arange(0.1, 1.1, 0.1)
dat = np.mgrid[0.1:1.1:0.1, 0.1:1.1:0.1][0]
dat = dat.ravel()                       # flatten the 2D data to match
ans = curve_fit(musq, dz, dat)
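For what it's worth, the poor fit here is probably not the ravel() trick but the data itself: dat is just a meshgrid, not something generated by musq. As a quick sanity check (a sketch assuming a hypothetical true parameter of 0.5, and reusing the raveled musq defined above), curve_fit does recover the parameter:

import numpy as np
from scipy.optimize import curve_fit

dz = np.arange(0.1, 1.1, 0.1)
true_y = 0.5                                        # hypothetical "true" parameter
dat = 1.0/(1.0 + true_y**2*(dz/dz[:, None])**2)     # synthetic data from the model itself
ans, cov = curve_fit(musq, dz, dat.ravel())
print(ans)   # ~[0.5]; the model depends only on y**2, so -0.5 would fit equally well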
I have a SAS script that uses the "proc corr" procedure, along with weighting in order to create a weighted correlation matrix. I am now trying to reproduce this function in python, but I haven't found a good way of including the weighting in the output matrix.
While looking for a solution, I've found a few scripts and functions that calculate weighted correlation coefficients for two columns/variables (examples here) using a weights array, but I am trying to create a weighted correlation matrix with many more variables. I've tried using these functions by looping through variable combinations, but it runs orders of magnitude slower than the SAS procedure.
I was wondering if there was an efficient way to create a weighted correlation matrix in python that works similarly to the SAS code, or at least returns equivalent results without looping through all variable combinations.
numpy's np.cov takes two different kinds of weights parameters (fweights and aweights) - I don't have SAS to check against, but it likely takes a similar approach.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html#numpy.cov
Once you have a covariance matrix, it can be converted to a correlation matrix using a formula like this
https://en.wikipedia.org/wiki/Covariance_matrix#Correlation_matrix
Complete example
import numpy as np
x = np.array([1., 1.1, 1.2, 0.9])
y = np.array([2., 2.05, 2.02, 2.8])
np.cov(x, y)
Out[49]:
array([[ 0.01666667, -0.03816667],
[-0.03816667, 0.151225 ]])
cov = np.cov(x, y, fweights=[10, 1, 1, 1])
cov
Out[51]:
array([[ 0.00474359, -0.00703205],
[-0.00703205, 0.04872308]])
def cov_to_corr(cov):
    """Based on https://en.wikipedia.org/wiki/Covariance_matrix#Correlation_matrix"""
    D = np.sqrt(np.diag(np.diag(cov)))   # diagonal matrix of standard deviations
    Dinv = np.linalg.inv(D)
    return Dinv @ cov @ Dinv             # requires Python 3.5+, use np.dot otherwise
cov_to_corr(cov)
Out[53]:
array([[ 1. , -0.46255259],
[-0.46255259, 1. ]])
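To get a full weighted correlation matrix for many variables at once (closer to what proc corr produces), the same two steps work on a whole data matrix: np.cov accepts a 2D array with one row per variable, plus fweights (integer repeat counts) or aweights (general observation weights). A sketch, assuming data has shape (n_observations, n_variables) and w is a 1D array of observation weights:

import numpy as np

def weighted_corr_matrix(data, w):
    """Weighted correlation matrix of the columns of data (n_obs x n_vars)."""
    cov = np.cov(data.T, aweights=w)   # np.cov expects variables in rows, hence the transpose
    return cov_to_corr(cov)            # cov_to_corr as defined above

# hypothetical example: 100 observations of 4 variables with random positive weights
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 4))
w = rng.uniform(0.5, 2.0, size=100)
print(weighted_corr_matrix(data, w))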
import numpy as np
np.random.random((5,5))
array([[ 0.26045197, 0.66184973, 0.79957904, 0.82613958, 0.39644677],
[ 0.09284838, 0.59098542, 0.13045167, 0.06170584, 0.01265676],
[ 0.16456109, 0.87820099, 0.79891448, 0.02966868, 0.27810629],
[ 0.03037986, 0.31481138, 0.06477025, 0.37205248, 0.59648463],
[ 0.08084797, 0.10305354, 0.72488268, 0.30258304, 0.230913 ]])
I would like to create a 2D density estimate from this 2D array such that similar values imply higher density. Is there a way to do this in numpy?
I agree, it is indeed not entirely clear what you mean.
The numpy.histogram function provides you with the density for an array.
import numpy as np

array = np.random.random((5, 5))
print(array)
# np.histogram returns a (counts, bin_edges) tuple
density = np.histogram(array, density=True)
print(density)
You can then plot the density, for example with Matplotlib.
There is a great discussion on this here: How does numpy.histogram() work?
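For completeness, a minimal plotting sketch for the density returned by np.histogram (it gives normalised counts plus the bin edges), using matplotlib:

import numpy as np
import matplotlib.pyplot as plt

array = np.random.random((5, 5))
counts, edges = np.histogram(array, density=True)   # normalised density per bin
centers = 0.5 * (edges[:-1] + edges[1:])
plt.bar(centers, counts, width=np.diff(edges), edgecolor='k')
plt.xlabel('value')
plt.ylabel('density')
plt.show()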
I want to plot a graph of the magnitude of 1/(1 + i*omega*tau) against frequency f, where i is the imaginary unit, omega = 2*pi*f, and tau is a constant. The following is the first part of the code:
import pylab as pl
import numpy as np

f = np.logspace(-2, 4, 10)
tau = 1.0
omega = 2*np.pi*f
y = np.complex(1, omega*tau)   # <- this line raises the TypeError
print(y)
But I get this TypeError: only length-1 arrays can be converted to Python scalars. What's the problem? Why can't I pass f (which is an array, right?) into y? By the way, I am using Enthought Canopy.
One more question: what's the difference between pylab and matplotlib? Are they different modules? If I'm just plotting graphs and dealing with complex numbers and matrices, which one should I use?
You can't construct numpy arrays with np.complex: it is just Python's built-in complex and only accepts scalars. In Python, putting j after a number makes it imaginary, so to make a complex array simply do:
y = 1 + omega * tau * 1j
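From there, the magnitude the question asks about is just np.abs of the full expression, which can be plotted against f directly. A short sketch under the same definitions of f, tau and omega (log-log axes chosen only for readability):

import numpy as np
import matplotlib.pyplot as plt

f = np.logspace(-2, 4, 100)
tau = 1.0
omega = 2 * np.pi * f
y = 1.0 / (1.0 + 1j * omega * tau)   # the full complex transfer function, elementwise
plt.loglog(f, np.abs(y))             # magnitude versus frequency
plt.xlabel('f')
plt.ylabel('|1/(1 + i*omega*tau)|')
plt.show()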
Alternatively, this is a case where np.vectorize can be used. That is,
import numpy as np

def main():
    f = np.logspace(-2, 4, 10)
    print(f)
    tau = 1.0
    omega = 2*np.pi*f
    y = np.vectorize(complex)(1, omega*tau)   # apply complex() elementwise
    print(y)

main()
This will first print:
[ 1.00000000e-02 4.64158883e-02 2.15443469e-01 1.00000000e+00
4.64158883e+00 2.15443469e+01 1.00000000e+02 4.64158883e+02
2.15443469e+03 1.00000000e+04]
And then:
[ 1. +6.28318531e-02j 1. +2.91639628e-01j 1. +1.35367124e+00j
1. +6.28318531e+00j 1. +2.91639628e+01j 1. +1.35367124e+02j
1. +6.28318531e+02j 1. +2.91639628e+03j 1. +1.35367124e+04j
1. +6.28318531e+04j]
I am using scipy's gmean() function to determine the geometric mean of a numpy array that contains voltage outputs. The range of the numbers is between -80.0 and 30.0. Currently, the numpy array is two dimensional, giving the voltage for two different measurements.
array([[-60.0924, -60.0882],
[-80. , -80. ],
[-80. , -80. ],
...,
[-60.9221, -66.0748],
[-61.0971, -65.9637],
[-61.2706, -65.8803]])
However, I get NaN when I take the geometric mean:
>>> from scipy import stats as scistats
>>> scistats.gmean(voltages)
array([ NaN, NaN])
Does anybody have an idea what might be causing this? Am I doing something wrong?
Thanks in advance!
The geometric mean cannot be applied to negative values: scipy.stats.gmean takes the mean of the logarithms of the data, and the logarithm of a negative number is NaN in numpy, so any negative entry makes the whole result NaN.
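A small demonstration on a few of the rows from the question, plus a hedged sketch of one possible workaround (taking the geometric mean of the magnitudes and restoring the sign afterwards, which is only meaningful if that transformation makes sense for these voltages):

import numpy as np
from scipy import stats as scistats

voltages = np.array([[-60.0924, -60.0882],
                     [-80.    , -80.    ],
                     [-61.2706, -65.8803]])

# gmean is exp(mean(log(x))), so any negative entry yields nan
print(scistats.gmean(voltages))            # -> [nan nan]

# possible workaround: geometric mean of the magnitudes, sign restored afterwards
print(-scistats.gmean(np.abs(voltages)))   # -> roughly [-66.5, -68.2]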