Find the minimum of a 2d interpolation - python

I'm trying to find the minimum of a 2D interpolation, but I'm stuck on how to pass the interpolant to the optimizer.
Here is the code I have so far:
import numpy as np
import scipy.optimize
from scipy.interpolate import interp2d
a_ca_energy_interp = interp2d(a, c_a, Energy)
def run_2d_params(params, func):
    a, b = params
    return func(a, b)
scipy.optimize.fmin(run_2d_params, np.array([1.60, 6.075]),
                    args=a_ca_energy_interp)
Which throws the error:
TypeError: can only concatenate tuple (not "interp2d") to tuple

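args must be a tuple, even if it holds only one argument. fmin combines args with the parameter vector internally, which is why passing a bare object produces the concatenation error above:
scipy.optimize.fmin(run_2d_params, np.array([1.60, 6.075]),
                    args=(a_ca_energy_interp,))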
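A minimal sketch of the corrected call, under stated assumptions: the grid and the bowl-shaped energy surface below are made up for illustration (they are not the asker's data), and RectBivariateSpline stands in for interp2d, which has been removed from recent SciPy releases. The only point is that the interpolant travels to the objective through a one-element tuple.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline
from scipy.optimize import fmin

# Hypothetical grid and energy surface (not from the question):
# a bowl with its minimum at a = 1.6, c_a = 6.0.
a = np.linspace(1.4, 1.8, 21)
c_a = np.linspace(5.8, 6.2, 21)
A_grid, C_grid = np.meshgrid(a, c_a, indexing="ij")
energy = (A_grid - 1.6) ** 2 + (C_grid - 6.0) ** 2

# RectBivariateSpline expects z with shape (len(x), len(y)).
energy_interp = RectBivariateSpline(a, c_a, energy)

def run_2d_params(params, func):
    a_val, c_val = params
    # ev() evaluates the spline pointwise; reduce the result to a plain float.
    return float(np.ravel(func.ev(a_val, c_val))[0])

# args is a one-element tuple: note the trailing comma.
best = fmin(run_2d_params, np.array([1.5, 5.9]),
            args=(energy_interp,), disp=False)
print(best)
```

For this synthetic surface the optimizer should land close to the known minimum; with real data the starting point should lie inside the interpolation grid.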


Python - Find coefficients minimizing error in csv data

I've recently run into a problem. I have data that looks like this:
Value 1  Value 2  Target
1345     4590     2.45
1278     3567     2.48
1378     4890     2.46
1589     4987     2.50
...      ...      ...
The data goes on for a few thousand lines.
I need to find two values, A and B, that minimize the error when the data is plugged into:
Value 1 * A + Value 2 * B = Target
I've looked into scipy.optimize.curve_fit, but I can't see how it would work, because the function changes at every row of the data (Value 1 and Value 2 are different on every row).
Any help is greatly appreciated, thanks in advance!
The function curve_fit takes three arguments:
- a function f that takes an input argument, let's call it X, followed by the parameters params (as many as you want)
- the input X_data you have from your dataset
- the output Y_data you have from your dataset
The point of this function is to give you the best params, so that f(X_data, *params) is as close as possible to Y_data.
Intuitively, X in your function f is a simple 1D NumPy array, but it can actually take whatever form you want. Here the input is a tuple of two 1D arrays (or a 2D array, if you prefer to implement it that way).
Here's a code example:
import numpy as np
from scipy.optimize import curve_fit
X_data = (np.array([1345, 1278, 1378, 1589]),
          np.array([4590, 3567, 4890, 4987]))
Y_data = np.array([2.45, 2.48, 2.46, 2.50])
def my_func(X, A, B):
    x1, x2 = X
    return A*x1 + B*x2
(A, B), _ = curve_fit(my_func, X_data, Y_data)
interpolated_results = my_func(X_data, A, B)
relative_error_in_percent = abs((Y_data - interpolated_results)/Y_data)*100
print(relative_error_in_percent)
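Because the model Value 1 * A + Value 2 * B is linear in A and B, the same fit can also be written as an ordinary least-squares problem. The sketch below is an alternative to curve_fit, not part of the answer above; the data and the "true" coefficients are hypothetical and only serve to show that the solver recovers them.

```python
import numpy as np

# Hypothetical data generated from known coefficients, for illustration only.
rng = np.random.default_rng(0)
value1 = rng.uniform(1000, 1500, size=200)
value2 = rng.uniform(3500, 5000, size=200)
true_A, true_B = 0.001, 0.0002
target = true_A * value1 + true_B * value2

# Stack the two columns into a design matrix and solve M @ [A, B] ~= target.
M = np.column_stack([value1, value2])
coeffs, residuals, rank, _ = np.linalg.lstsq(M, target, rcond=None)
A, B = coeffs
print(A, B)
```

With noisy real data the recovered A and B are the least-squares estimates, which is also what curve_fit minimizes for this model.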
Unfortunately you have not provided any test data, so I have come up with my own:
import pandas as pd
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
def f(V1, V2, A, B):  # Target function
    return V1*A + V2*B
# Generate test data
def generateData(A, B):
    np.random.seed(0)
    V1 = np.random.uniform(low=1000, high=1500, size=(100,))
    V2 = np.random.uniform(low=3500, high=5000, size=(100,))
    Target = f(V1, V2, A, B) + np.random.normal(0, 1, 100)
    return V1, V2, Target
data = generateData(2, 3)  # Important: the true parameters are A=2, B=3
data = {"Value 1": data[0], "Value 2": data[1], "Target": data[2]}
df = pd.DataFrame(data)  # Similar structure to the table in the question
df.head() looks like this:
Value 1 Value 2 Target
0 1292.0525763109854 3662.162080896163 13570.276523473405
1 1155.0421489258965 4907.133274663096 17033.392287295104
2 1430.7172112685223 4844.422515098364 17395.412651006143
3 1396.0480757043242 4076.5845114488666 15022.720636830541
4 1346.2120476329646 3570.9567326419674 13406.565815022896
Your question is answered in the following:
## Plot data to check whether a linear function is appropriate
df.head()
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax1.scatter(df["Value 1"], df["Target"])
ax2.scatter(df["Value 2"], df["Target"])
def fmin(x, df):  # Returns the error at the given parameters
    def RMSE(y, y_target):  # Definition of the error term
        return np.sqrt(np.mean((y - y_target)**2))
    A, B = x
    V1, V2, y_target = df["Value 1"], df["Value 2"], df["Target"]
    y = f(V1, V2, A, B)  # Calculate the target value with the given parameter set
    return RMSE(y, y_target)
res = minimize(fmin, x0=[1, 1], args=(df,), options={"disp": True})
print(res.x)
I prefer scipy.optimize.minimize() over curve_fit because you can define the error function yourself. See the scipy.optimize.minimize documentation for details.
You need:
- a function fun that returns the error for a given parameter set x (here fmin with RMSE)
- an initial guess x0 (here [1,1]); if your guess is far off you will probably not find a solution, or (with more complex problems) only a local one
- additional arguments args that are passed on to fun (here the data df); this is also helpful for fixed parameters
- options={"disp":True}, which prints additional information
- the returned variable res, which holds your parameters along with further information
For this case the result is:
[1.9987209 3.0004212]
This is close to the parameters (2 and 3) used when generating the data.
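The remark about the initial guess can be illustrated with a small sketch. The one-dimensional function below is hypothetical and chosen only because it has two minima; it is not related to the data above. Depending on x0, a local optimizer such as Nelder-Mead may settle in either one.

```python
import numpy as np
from scipy.optimize import minimize

def two_wells(x):
    # Hypothetical objective with a global minimum near x = -1.30
    # and a shallower local minimum near x = 1.13.
    return x[0]**4 - 3 * x[0]**2 + x[0]

# Same objective, two different initial guesses.
res_left = minimize(two_wells, x0=[-2.0], method="Nelder-Mead")
res_right = minimize(two_wells, x0=[2.0], method="Nelder-Mead")

# res.x holds the parameters, res.fun the objective value at that point.
print(res_left.x, res_left.fun)
print(res_right.x, res_right.fun)
```

Comparing res.fun across several starting points is a cheap way to notice that a run has stopped in a local minimum.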

TypeError: only size-1 arrays can be converted to Python scalars + Solution

According to the Python documentation, exception TypeError is:
Raised when an operation or function is applied to an object of inappropriate type. The associated value is a string giving details about the type mismatch.
The reason I got this error was that my code looked like this:
import math as m
import pylab as pyl
import numpy as np
# normal distribution function
def normal(x, mu, sigma):
    P = (1/(m.sqrt(2*m.pi*sigma**2)))*(m.exp((-(x-mu)**2)/(2*sigma**2)))
    return P
# solution
x = np.linspace(-5, 5, 1000)
P = normal(x, 0, 1)
# plotting the function
pyl.plot(x, P)
pyl.show()
The offending line is:
P = (1/(m.sqrt(2*m.pi*sigma**2)))*(m.exp((-(x-mu)**2)/(2*sigma**2)))
Notice the m. prefix. It is incorrect here, because the functions in the math module only accept scalars, while x is a NumPy array. That mismatch is what raises the TypeError.
NumPy functions (np.) handle scalars as well as arrays, so switching to them solves the problem.
The right code looks like this:
import math as m
import pylab as pyl
import numpy as np
# normal distribution function
def normal(x, mu, sigma):
    # the exponent is divided by (2*sigma**2), so the denominator is parenthesized
    P = (1/(np.sqrt(2*np.pi*sigma**2))) * (np.exp((-(x-mu)**2)/(2*sigma**2)))
    return P
# solution
x = np.linspace(-5,5,1000)
P = normal(x,0,1)
# plotting the function
pyl.plot(x,P)
pyl.show()
In the end the plot shows the expected bell-shaped normal distribution curve.
This error occurred in the Spyder IDE.
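The difference between the two modules can be checked in isolation. The following sketch is a stripped-down version of the example above without plotting; the Riemann-sum check at the end is an addition for illustration.

```python
import math
import numpy as np

x = np.linspace(-5, 5, 1000)

# math.exp only accepts scalars, so passing a 1000-element array raises TypeError.
try:
    math.exp(-x**2)
    raised = False
except TypeError:
    raised = True
print("math.exp raised TypeError:", raised)

def normal(x, mu, sigma):
    # NumPy functions operate element-wise on arrays.
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

P = normal(x, 0, 1)
# Riemann-sum check: a probability density should integrate to roughly 1.
area = np.sum(P) * (x[1] - x[0])
print(area)
```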

"only length-1 arrays can be converted to Python scalars" using scipy.optimize in Sage

I want to adjust the parameters of a model to a given set of data.
I'm trying to use scipy's function curve_fit in Sage, but I keep getting
TypeError: only length-1 arrays can be converted to Python scalars
Here's my code:
from numpy import cos,exp,pi
f = lambda x: exp( - 1 / cos(x) )
import numpy as np
def ang(time): return (time-12)*pi/12
def temp(x,maxtemp):
    cte=(273+maxtemp)/f(0)**(1/4)
    if 6<x and x<18:
        return float(cte*f(ang(x))**(1/4)-273)
    else:
        return -273
lT=list(np.linspace(15,40,1+24*2))
lT=[float(num) for num in lT] #list of y data
ltimes=np.linspace(0,24,6*24+1)[1:]
ltimes=list(ltimes) #list of x data
u0=lT[0]
def u(time,maxtemp,k): #the function I want to fit to the data
    def integ(t): return k*exp(k*t)*temp(t,maxtemp)
    return exp(-k*time)*( numerical_integral(integ, 0, time)[0] + u0 )
import scipy.optimize as optimization
print optimization.curve_fit(u, ltimes, lT,[1000,0.0003])
scipy.optimize.curve_fit expects the model function to be vectorized: that is, it must be able to receive an array (an ndarray, to be precise) and return an array of values. You can see the problem right away by adding a debug printout:
def u(time,maxtemp,k):
    print time  # for debugging
    def integ(t): return k*exp(k*t)*temp(t,maxtemp)
    return exp(-k*time)*( numerical_integral(integ, 0, time)[0] + u0 )
The output of print will be the entire array ltimes you are passing to curve_fit. This is something numerical_integral is not designed to handle. You need to give it values one by one.
Like this:
def u(time,maxtemp,k):
    def integ(t): return k*exp(k*t)*temp(t,maxtemp)
    return [exp(-k*time_i)*( numerical_integral(integ, 0, time_i)[0] + u0) for time_i in time]
This will take care of the "only length-1 arrays can be converted" error. You will then get another one, because your lists ltimes and lT have different lengths (144 and 49 elements), which doesn't make sense since lT is supposed to hold the target outputs for the inputs ltimes. You should revise the definitions of these arrays and decide what size you want.
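Outside Sage the same pattern can be sketched with scipy.integrate.quad, which is swapped in here for Sage's numerical_integral. The constant forcing term and the values of u0, level and k are hypothetical placeholders, chosen so the result can be compared with a closed-form solution.

```python
import numpy as np
from scipy.integrate import quad

u0 = 15.0  # hypothetical initial value

def forcing(t, level):
    # Placeholder for temp(t, maxtemp): here simply a constant level.
    return level

def u_scalar(time, level, k):
    # Works for ONE time value, because quad needs scalar integration limits.
    integral, _ = quad(lambda t: k * np.exp(k * t) * forcing(t, level), 0, time)
    return np.exp(-k * time) * (integral + u0)

def u(times, level, k):
    # Vectorized wrapper: evaluate point by point and return an ndarray,
    # which is the calling convention curve_fit relies on.
    return np.array([u_scalar(t, level, k) for t in np.atleast_1d(times)])

times = np.linspace(0.0, 24.0, 7)
values = u(times, 30.0, 0.1)
print(values)
```

A wrapper of this shape can be handed to curve_fit directly, provided the x and y data have the same length.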

scipy fmin operands could not be broadcast together with shapes

I'm trying to learn about optimization in Python, so I've written some code to test out the fmin function.
However, I keep receiving the following error:
ValueError: operands could not be broadcast together with shapes (1,2) (100,)
I can tell the issue has to do with the dimensions of my arguments, but I'm not sure how to rectify it. Rather than a lambda function I also tried to def a function, but I still get the same error.
I'm sure it's something pretty basic but I can't seem to understand it. Any help would be greatly appreciated!
import numpy as np
import pandas as pd
from scipy.stats.distributions import norm
from scipy.optimize import fmin
x = np.random.normal(size=100)
norm_1 = lambda theta,x: -(np.log(norm.pdf(x,theta[0],theta[1]))).sum()
def norm_2(theta,x):
    mu = theta[0]
    sigma = theta[1]
    ll = np.log(norm.pdf(x,mu,sigma)).sum()
    return -ll
fmin(norm_1,np.array([0,1]),x)
fmin(norm_2,np.array([0,1]),x)
The docs for fmin say:
Definition: fmin(func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None)
...
args : tuple, optional
Extra arguments passed to func, i.e. ``f(x,*args)``.
Therefore, the third argument, args, should be a tuple:
In [45]: fmin(norm_1,np.array([0,1]),(x,))
Warning: Maximum number of function evaluations has been exceeded.
Out[45]: array([-0.02405078, 1.0203125 ])
(x, ) is a tuple containing one element, x.
The docs say f(x, *args) gets called, which means that in your case
norm_1(np.array([0,1]), *(x,))
will get called, which is equivalent to
norm_1(np.array([0,1]), x)
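A self-contained sketch of the corrected call follows. The sample, the seed and the starting point are hypothetical, and two details differ from the question's code by choice: norm.logpdf replaces log(norm.pdf(...)), and the optimizer works on log(sigma) so that sigma stays positive.

```python
import numpy as np
from scipy.optimize import fmin
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=500)  # hypothetical sample

def neg_log_likelihood(theta, data):
    mu, log_sigma = theta
    # Optimizing log(sigma) keeps sigma positive (a choice made for this sketch).
    return -norm.logpdf(data, loc=mu, scale=np.exp(log_sigma)).sum()

# The data is passed through a one-element tuple, so fmin calls
# neg_log_likelihood(theta, x).
theta_hat = fmin(neg_log_likelihood, np.array([0.5, 0.1]), args=(x,),
                 maxiter=2000, maxfun=2000, disp=False)
mu_hat, sigma_hat = theta_hat[0], np.exp(theta_hat[1])
print(mu_hat, sigma_hat)
```

The estimates should agree with the sample mean and the (population) sample standard deviation, which are the closed-form maximum-likelihood values for a normal distribution.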

Error: [only length-1 arrays can be converted to Python scalars] when changing variable order

Dear Stackoverflow Community,
I am very new to Python and to programming in general, so please don't get mad when I don't get your answers and ask again.
I am trying to fit a curve to experimental data with scipy.optimize.curve_fit. This is my code:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as nm
from __future__ import division
import cantera as ct
from matplotlib.backends.backend_pdf import PdfPages
import math as ma
import scipy.optimize as so
R = 8.314
T = nm.array([700, 900, 1100, 1300, 1400, 1500, 1600, 1700])
k = nm.array([289, 25695, 763059, 6358040, 14623536, 30098925, 56605969, 98832907])
def func(A, E, T):
    return A*ma.exp(-E/(R*T))
popt, pcov = so.curve_fit(func, T, k)
Now this code works for me, but if I change the function to:
def func(T, A, E):
and keep the rest, I get:
TypeError: only length-1 arrays can be converted to Python scalars
Also, I am not really convinced by the parameters returned by the first version.
Can anyone tell me what happens when you change the variable order?
I had the same problem and found the cause and its solution:
The problem lies in how SciPy calls your function. curve_fit passes the whole input array xdata as the first argument, that is, it calls func(xdata, *params), and the function complains with a TypeError because xdata is not a scalar. For example:
from math import erf
import numpy as np
erf([1, 2]) # TypeError
erf(np.array([1, 2])) # TypeError
To avoid the error, you can add custom code to support arrays or, better, as suggested in the answer of Joris, use NumPy functions, because they support both scalars and arrays.
If the math function is not available in NumPy, like erf or any custom function you wrote, then instead of doing from math import erf I recommend the following:
from math import erf as math_erf  # only supports scalars
import numpy as np
from scipy.optimize import curve_fit
erf = np.vectorize(math_erf)  # adds array support
def fit_func(t, s):
    return 0.5*(1.0 - erf(t/(np.sqrt(2)*s)))
X = np.linspace(-5, 5, 1000)
Y = np.array([fit_func(x, 1) for x in X])
curve_fit(fit_func, X, Y)
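For erf specifically, SciPy already ships an array-aware version in scipy.special, so the np.vectorize wrapper is mainly useful for custom scalar functions. A small sketch comparing the two on a few sample points (the points themselves are arbitrary):

```python
import math
import numpy as np
from scipy.special import erf as scipy_erf

t = np.linspace(-3, 3, 7)

# np.vectorize wraps the scalar function in a Python-level loop.
vectorized_erf = np.vectorize(math.erf)
loop_result = vectorized_erf(t)

# scipy.special.erf is a native, array-aware implementation.
native_result = scipy_erf(t)

print(np.allclose(loop_result, native_result))
```

np.vectorize is a convenience, not a speed-up: it still calls the Python function once per element.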
The curve_fit function from SciPy does not cope well with functions from the math module, because they only accept scalars. When you replace the exponential function with the NumPy exponential function, you don't get the error:
def func(A, E, T):
    return A*np.exp(-E/(R*T))
I also wonder whether your data really follows an exponential dependence of the rate. The mathematical model may not be the most suitable one.
See the docstring of curve_fit:
f : callable
The model function, f(x, ...). It must take the independent variable as the first argument and the parameters to fit as separate remaining arguments.
Since your formula is essentially k = A*ma.exp(-E/(R*T)), the correct order of parameters in func is (T, A, E) or (T, E, A).
The order of A and E does not really matter. If you swap them, the results are swapped as well:
>>> def func(T, A, E):
...     return A*ma.exp(-E/(R*T))
...
>>> so.curve_fit(func, T, k)
(array([ 8.21449078e+00, -5.86499656e+04]), array([[ 6.07720215e+09, 4.31864058e+12],
       [ 4.31864058e+12, 3.07102992e+15]]))
>>> def func(T, E, A):
...     return A*ma.exp(-E/(R*T))
...
>>> so.curve_fit(func, T, k)
(array([ -5.86499656e+04, 8.21449078e+00]), array([[ 3.07102992e+15, 4.31864058e+12],
       [ 4.31864058e+12, 6.07720215e+09]]))
I didn't get your TypeError at all.
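Putting the two points together (independent variable first, array-aware functions), here is one possible sketch using the T and k values from the question. Fitting ln k instead of k is an assumption made for this sketch, not something the answers above propose; it turns the model into ln k = ln A - E/(R*T), which is linear in the parameters and avoids the huge dynamic range of k.

```python
import numpy as np
from scipy.optimize import curve_fit

R = 8.314
T = np.array([700, 900, 1100, 1300, 1400, 1500, 1600, 1700], dtype=float)
k = np.array([289, 25695, 763059, 6358040, 14623536,
              30098925, 56605969, 98832907], dtype=float)

def log_arrhenius(T, ln_A, E):
    # Independent variable first, then the parameters to fit.
    # Only arithmetic on arrays is used, so T may be an ndarray.
    return ln_A - E / (R * T)

(ln_A, E), pcov = curve_fit(log_arrhenius, T, np.log(k))
A = np.exp(ln_A)

# Back-transform and compare with the measured rate constants.
k_fit = A * np.exp(-E / (R * T))
print(A, E)
print(np.max(np.abs(k_fit / k - 1)))
```

The last line prints the largest relative deviation between the fitted and the measured rate constants, which gives a quick impression of how well the Arrhenius form describes the data.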
