Using SCIPY.OPTIMIZE.FMIN_CG to extract Weibull distribution parameters - python

I am attempting to extract Weibull distribution parameters (shape 'k' and scale 'lambda') that satisfy a certain mean and variance. In this example, the mean is 4 and the variance is 8. It is a 2-unknowns and 2-equations type of problem.
Since this algorithm works with Excel 2010's GRG Solver, I am certain it is about the way I am framing the problem, or potentially, the libraries I am using. I am not overly familiar with optimization libraries, so please let me know where the error is.
Below is the script:
from scipy.optimize import fmin_cg
import math
def weibull_mu(k, lmda): #Formula can be found on wikipedia
return lmda*math.gamma(1+1/k)
def weibull_var(k, lmda): #Formula can be found on wikipedia
return lmda**2*math.gamma(1+2/k)-weibull_mu(k, lmda)**2
def min_function(arggs):
actual_mean = 4 # specific to this example
actual_var = 8 # specific to this example
k = arggs[0]
lmda = arggs[1]
output = [weibull_mu(k, lmda)-(var_wei)]
output.append(weibull_var(k, lmda)-(actual_var)**2-(actual_mean)**2)
return output
print fmin(min_function, [1,1])
This script gives me the following error:
[...]
File "C:\Program Files\Python27\lib\site-packages\scipy\optimize\optimize.py", line 278, in fmin
fsim[0] = func(x0)
ValueError: setting an array element with a sequence.

As far as I can tell, min_function returns a multi-dimensional list, but fmin and fmin_cg does expect that the objective function returns a scalar, if I am not mistaken.
If you are searching the root of the two-equations problem, I suppose it is better that you apply the root function instead. As far as I have been able to find out, scipy does not provide any general optimizers for vector functions.

I managed to get it to work thanks to Anders Gustafsson's comment (thank you). This script now works if one returns only a scalar (in this case I used something along the lines of least-squares). Also, bounds were added by changing the optimization function to "fmin_l_bfgs_b" (again, thanks to Anders Gustafsson).
I only changed the min_function definition relative to the question.
from scipy.optimize import fmin_l_bfgs_b
import math
def weibull_mu(k, lmda):
return lmda*math.gamma(1+1/k)
def weibull_var(k, lmda):
return lmda**2*math.gamma(1+2/k)-weibull_mu(k, lmda)**2
def min_function(arggs):
actual_mean = 4. # specific to this example
actual_var = 8. # specific to this example
k = arggs[0]
lmda = arggs[1]
extracted_var = weibull_var(k, lmda)
extracted_mean = weibull_mu(k, lmda)
output = (extracted_var - actual_var)**2 + (extracted_mean - actual_mean)**2
return output
print fmin_l_bfgs_b(min_function, best_guess, approx_grad = True, bounds = [(.0000001,None),(.0000001,None)], disp = False)
Note: Please feel free to use this script for your own or professional use.

Related

Using NEOS as a Pyomo solver

I have recently started in doing some OR, and have been trying to use Pyomo and NEOS to do some optimation problems. I have been following along with one of the UT Austin Pyomo lectures, and when my GLPT was being difficult to be installed, I moved on to NEOS. I am having some difficulty in now receiving a solved answer from NEOS.
What I have so far is this:
from pyomo import environ as pe
import os
os.environ['NEOS_EMAIL'] = 'my registered email'
model = pe.ConcreteModel()
model.x1 = pe.Var(domain=pe.Binary)
model.x2 = pe.Var(domain=pe.Binary)
model.x3 = pe.Var(domain=pe.Binary)
model.x4 = pe.Var(domain=pe.Binary)
model.x5 = pe.Var(domain=pe.Binary)
obj_expr = 3 * model.x1 + 4 * model.x2 + 5 * model.x3 + 8 * model.x4 + 9 * model.x5
model.obj = pe.Objective(sense=pe.maximize, expr=obj_expr)
con_expr = 2 * model.x1 + 3 * model.x2 + 4 * model.x3 + 5 * model.x4 + 9 * model.x5 <= 20
model.con = pe.Constraint(expr=con_expr)
solver_manager = pe.SolverManagerFactory('neos')
results = solver_manager.solve(model, solver = "minos")
print(results)
What I receive in return is number of solutions = 0, while I know for a fact that one exits. I also see that I don't have any bounds set, so how would I go about doing that? Once again, I am very new to this, and have not been able to find any sort of documentation regarding this elsewhere, or perhaps I just don't know how to look.
Thanks for any help!
This is a "problem" with the design of the current results object. For historical reasons, that field reports the number of solutions contained in the results object and is not the number of solutions generated by the solver. By default, Pyomo solvers directly load the solution returned by the solver into the original model (both for convenience and efficiency) and do not return it in the results object. You can change that behavior by providing load_solutions=False to the solve() call.
As for the bounds, what bounds are you referring to? Variable bounds are set using either the bounds= argument to the Var() declaration, or the domain= argument. For your example, because the variables are declared to be Binary, they all have bounds of [0..1]. Bounds on the objective are gathered by parsing the solver output. This is dependent on bother the solver that you are using (many do not report bounds information), and the interface used to parse the solver results.
As a final note, you are sending a MIP problem to a LP/NLP solver (minos). You will get fractional valies for your binary variables back from the solver.
To retrieve the solution from the model, you can use something like:
print(model.x1.value, model.x2.value, model.x3.value, model.x4.value, model.x5.value)
And using solver="cbc" you can avoid fractional values in this example.

An analog to rnorm in python

Good morning!
I am looking to create an analog of some code in R
Basically, I have a function that, among other things, takes a seed provided by the user (default is NULL), along with a specific distribution (default is rnorm), and outputs 9 random numbers, saved as a vector "e". This is what it looked like in R...
function (...other variables..., seed=NULL, dist=rnorm)
...other code...
e <- dist(9,...)
Now I'm converting the function to Python, but I can't quite seem to find an analog that would work, where a user can replace the base seed and distribution.
Here's what i have so far...
def (...other variables..., seed=None, dist=?):
...other code...
e = dist(9)
See numpy.random.normal function (doc here)
For instance:
import numpy as np
np.random.normal(0,1,9)
array([ 0.33593283, -0.18149502, 0.43148566, 1.46831794, -0.72244867,
-1.40048855, 0.52366471, 0.34099135, 0.71654992])

python dblquad invalid callable

I am trying to approximate the Gauss Linking integral for two straight lines in R^3 using dblquad. I've created this pair of lines as an object.
I have a form for the integrand in parametrisation variables s and t generated by a function gaussint(self,s,t) and this is working. I'm then just trying to define a function which returns the double integral over the two intervals [0,1].
Edit - the code for the function looks like this:
def gaussint(self,s,t):
formnum = self.newlens()[0]*self.newlens()[1]*np.sin(test.angle())*np.cos(test.angle())
formdenone = (np.cos(test.angle())**2)*(t*(self.newlens()[0]) - s*(self.newlens()[1]) + self.adists()[0] - self.adists()[1])**2
formdentwo = (np.sin(test.angle())**2)*(t*(self.newlens()[0]) + s*(self.newlens()[1]) + self.adists()[0] + self.adists()[1])**2
fullden = (4 + formdenone + formdentwo)**(3/2)
fullform = formnum/fullden
return fullform
The various other function calls here are just bits of linear algebra - lengths of lines, angle between them and so forth. s and t have been defined as symbols upstream, if they need to be.
The code for the integration then just looks like this (I've separated it out just to try and work out what was going on:
def approxint(self, s, t):
from scipy.integrate import dblquad
return dblquad(self.gaussint(s,t),0,1, lambda t:0,lambda t:1)
Running it gets me a lengthy bit of somewhat impenetrable process messages, followed by
ValueError: invalid callable given
Any idea where I'm going wrong?
Cheers.

Algorithm - return aliasing frequency

In Python, I'm trying to write an algorithm alias_freq(f_signal,f_sample,n), which behaves as follows:
def alias_freq(f_signal,f_sample,n):
f_Nyquist=f_sample/2.0
if f_signal<=f_Nyquist:
return n'th frequency higher than f_signal that will alias to f_signal
else:
return frequency (lower than f_Nyquist) that f_signal will alias to
The following is code that I have been using to test the above function (f_signal, f_sample, and n below are chosen arbitrarily just to fill out the code)
import numpy as np
import matplotlib.pyplot as plt
t=np.linspace(0,2*np.pi,500)
f_signal=10.0
y1=np.sin(f_signal*t)
plt.plot(t,y1)
f_sample=13.0
t_sample=np.linspace(0,int(f_sample)*(2*np.pi/f_sample),f_sample)
y_sample=np.sin(f_signal*t_sample)
plt.scatter(t_sample,y_sample)
n=2
f_alias=alias_freq(f_signal,f_sample,n)
y_alias=np.sin(f_alias*t)
plt.plot(t,y_alias)
plt.xlim(xmin=-.1,xmax=2*np.pi+.1)
plt.show()
My thinking is that if the function works properly, the plots of both y1 and y_alias will hit every scattered point from y_sample. So far I have been completely unsuccessful in getting either the if statement or the else statement in the function to do what I think it should, which makes me believe that either I don't understand aliasing nearly as well as I want to, or my test code is no good.
My questions are: Prelimarily, is the test code I'm using sound for what I'm trying to do? And primarily, what is the alias_freq function that I am looking for?
Also please note: If some Python package has a function just like this already built in, I'd love to hear about it - however, part of the reason I'm doing this is to give myself a device to understand phenomena like aliasing better, so I'd still like to see what my function should look like.
As far as I understood the question correctly, the frequency of the aliased signal is abs(sampling_rate * n - f_signal), where n is the closest integer multiple to f_signal.
Thus:
n = round(f_signal / float(f_sample))
f_alias = abs(f_sample * n - f_signal)
This should work for frequencies under and over Nyquist.
I figured out the answer to my and just realized that I forgot to post it here, sorry. Turns out it was something silly - Antii's answer is basically right, but the way I wrote the code I need a f_sample-1 in the alias_freq function, where I just had an f_sample. There's still a phase shift thing that happens sometimes, but just plugging in either 0 or pi for the phase shift has worked for me every time, I think it's just due to even or odd folding. The working function and test code is below.
import numpy as np
import matplotlib.pyplot as plt
#Given a sample frequency and a signal frequency, return frequency that signal frequency will be aliased to.
def alias_freq(f_signal,f_sample,n):
f_alias = np.abs((f_sample-1)*n - f_signal)
return f_alias
t=np.linspace(0,2*np.pi,500)
f_signal=13
y1=np.sin(f_signal*t)
plt.plot(t,y1)
f_sample=7
t_sample=np.linspace(0,int(f_sample)*(2*np.pi/f_sample),f_sample)
y_sample=np.sin((f_signal)*t_sample)
plt.scatter(t_sample,y_sample)
f_alias=alias_freq(f_signal,f_sample,3)
y_alias=np.sin(f_alias*t+np.pi)#Sometimes with phase shift, usually np.pi for integer f_signal and f_sample, sometimes without.
plt.plot(t,y_alias)
plt.xlim(xmin=-.1,xmax=2*np.pi+.1)
plt.show()
Here is a Python aliased frequency calculator based on numpy
def get_aliased_freq(f, fs):
"""
return aliased frequency of f sampled at fs
"""
import numpy as np
fn = fs / 2
if np.int(f / fn) % 2 == 0:
return f % fn
else:
return fn - (f % fn)

SymPy/SciPy: solving a system of ordinary differential equations with different variables

I am new to SymPy and Python in general, and I am currently working with Python 2.7 and SymPy 0.7.5 with the objective to:
a) read a system of differential equations from a text file
b) solve the system
I already read this question and this other question, and they are almost what I am looking for, but I have an additional issue: I do not know in advance the form of the system of equations, so I cannot create the corresponding function using def inside the script, as in this example. The whole thing has to be managed at run-time.
So, here are some snippets of my code. Suppose I have a text file system.txt containing the following:
dx/dt = 0.0387*x - 0.0005*x*y
dy/dt = 0.0036*x*y - 0.1898*y
What I do is:
# imports
import sympy
import scipy
import re as regex
# define all symbols I am going to use
x = sympy.Symbol('x')
y = sympy.Symbol('y')
t = sympy.Symbol('t')
# read the file
systemOfEquations = []
with open("system.txt", "r") as fp :
for line in fp :
pattern = regex.compile(r'.+?\s+=\s+(.+?)$')
expressionString = regex.search(pattern, line) # first match ends in group(1)
systemOfEquations.append( sympy.sympify( expressionString.group(1) ) )
At this point, I am stuck with the two symbolic expressions inside the systemOfEquation list. Provided that I can read the initial conditions for the ODE system from another file, in order to use scipy.integrate.odeint, I would have to convert the system into a Python-readable function, something like:
def dX_dt(X, t=0):
return array([ 0.0387*X[0] - 0.0005*X[0]*X[1] ,
-0.1898*X[1] + 0.0036*X[0]*X[1] ])
Is there a nice way to create this at run-time? For example, write the function to another file and then import the newly created file as a function? (maybe I am being stupid here, but remember that I am relatively new to Python :-D)
I've seen that with sympy.utilities.lambdify.lambdify it's possible to convert a symbolic expression into a lambda function, but I wonder if this can help me...lambdify seems to work with one expression at the time, not with systems.
Thank you in advance for any advice :-)
EDIT:
With minimal modifications, Warren's answer worked flawlessly. I have a list of all symbols inside listOfSymbols; moreover, they appear in the same order as the columns of data X that will be used by odeint. So, the function I used is
def dX_dt(X, t):
vals = dict()
for index, s in enumerate(listOfSymbols) :
if s != time :
vals[s] = X[index]
vals[time] = t
return [eq.evalf(subs=vals) for eq in systemOfEquations]
I just make an exception for the variable 'time' in my specific problem. Thanks again! :-)
If you are going to solve the system in the same script that reads the file (so systemOfEquations is available as a global variable), and if the only variables used in systemOfEquations are x, y and possibly t, you could define dX_dt in the same file like this:
def dX_dt(X, t):
vals = dict(x=X[0], y=X[1], t=t)
return [eq.evalf(subs=vals) for eq in systemOfEquations]
dX_dt can be used in odeint. In the following ipython session, I have already run the script that creates systemOfEquations and defines dX_dt:
In [31]: odeint(dX_dt, [1,2], np.linspace(0, 1, 5))
Out[31]:
array([[ 1. , 2. ],
[ 1.00947534, 1.90904183],
[ 1.01905178, 1.82223595],
[ 1.02872997, 1.73939226],
[ 1.03851059, 1.66032942]]

Categories