I have a fairly large number (around 1000) of step functions, each with only two intervals. I'd like to sum them up and then find the maximum value. What is the best way to do this? I've tried out sympy, with code as follows:
from sympy import Piecewise, piecewise_fold, evalf
from sympy.abc import x
from sympy.plotting import *
import numpy as np
S = 20
t = np.random.random(20)
sum_piecewise = None
for s in range(S):
p = Piecewise((np.random.random(), x<t[s]), (np.random.random(), x>=t[s]))
if not sum_piecewise:
sum_piecewise = p
else:
sum_piecewise += p
print sum_piecewise.evalf(0.2)
However, this outputs a large symbolic expression and not an actual value, which is what I want.
As it appears that you consider numerical functions, it is better (in terms of performance) to work with Numpy. Here's one approach:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(10)
S = 20 # number of piecewise functions
# generate S function parameters.
# For example, the k-th function is defined as equal to
# p_values[k,0] when t<t_values[k] and equal to
# p_values[k,1] when t>= t_values[k]
t_values = np.random.random(S)
p_values = np.random.random((S,2))
# define a piecewise function given the function's parameters
def p_func(t, t0, p0):
return np.piecewise(t, [t < t0, t >= t0], p0)
# define a function that sums a set of piecewise functions corresponding to
# parameter arrays t_values and p_values
def p_sum(t, t_values, p_values):
return np.sum([p_func(t, t0, p0) for t0, p0 in zip(t_values,p_values)])
Here is the plot of the sum of functions:
t_range = np.linspace(0,1,1000)
plt.plot(t_range, [p_sum(tt,t_values,p_values) for tt in t_range])
Clearly, in order to find the maximum, it suffices to consider only the S time instants contained in t_values. For this example,
np.max([p_sum(tt,t_values,p_values) for tt in t_values])
11.945901591934897
What about using substitution? Try changing sum_piecewise.evalf(0.2) by sum_piecewise.subs(x, 0.2)
Related
I have written the following two functions to calibrate a model :
The main function is:
def function_Price(para,y,t,T,tau,N,C):
# y= price array
# C = Auto and cross correlation array
# a= paramters need to be calibrated
a=para[0:]
temp=0
for j in range(N):
price_j = a[j]*C[j]*P[t:T-tau,j]
temp=temp+price_j
Price=temp
return Price
The objective function is :
def GError_function_Price(para,y,k,t,T,tau,N,C):
# k is the price need to be fitted
return sum((function_Price(para,y,t,T,tau,N,C)-k[t+tau:T]) ** 2)
Now, I am calling these two functions to do the optimization of the model:
import numpy as np
from scipy.optimize import minimize
# Prices (example)
y = np.array([[1,2,3,4,5,4], [4,5,6,7,8,9], [6,7,8,7,8,6], [13,14,15,11,12,19]])
# Correaltion (example)
Corr= np.array([[1,2,3,4,5,4], [4,5,6,7,8,9], [6,7,8,7,8,6], [13,14,15,11,12,19],[1,2,3,4,5,4],[6,7,8,7,8,6]])
# Define
tau=1
Size = y.shape
N = Size[1]
T = Size[0]
t=0
# initial Values
para=np.zeros(N)
# Bounds
B = np.zeros(shape=(N,2))
for n in range(N):
B[n][0]= float('-inf')
B[n][1]= float('inf')
# Calibration
A = np.zeros(shape=(N,N))
for i in range (N):
k=y[:,i] #fitted one
C=Corr[i,:]
parag=minimize(GError_function_Price,para,args=(y,Y,t,T,tau,N,C),method='SLSQP',bounds=B)
A[i,:]=parag.x
Once, I run the model, It should produce an N by N array of optimized values of paramters. But, except for the first column, it keeps zeros for the rest. Something is wrong.
Can you help me fix the problem, please?
I know how to do it in Matlab.
The following is Matlab Code :
main function
function Price=function_Price(para,P,t,T,tau,N,C)
a=para(:,:);
temp=0;
for j=1:N
price_j = a(j).*C(j).*P(t:T-tau,j);
temp=temp+price_j;
end
Price=temp;
end
The objective function:
function gerr=GError_function_Price(para,P,Y,t,T,tau,N,C)
gerr=sum((function_Price(para,P,t,T,tau,N,C)-Y(t+tau:T)).^2);
end
Now, I call these two functions in the following way:
P = [1,2,3,4,5,4;4,5,6,7,8,9;6,7,8,7,8,6;13,14,15,11,12,19];
AutoAndCrossCorr= [1,2,3,4,5,4;4,5,6,7,8,9;6,7,8,7,8,6;13,14,15,11,12,19;1,2,3,4,5,4;6,7,8,7,8,6];
tau=1;
Size = size(P);
N =6;
T =4;
t=1;
for i=1:N
Y=P(:,i); % fitted one
C=AutoAndCrossCorr(i,:);
para=zeros(1,N);
lb= repmat(-inf,N,1);
ub= repmat(inf ,N,1);
parag=fminsearchbnd(#(para)abs(GError_function_Price(para,P,Y,t,T,tau,N,C)),para,lb,ub);
a(i,:)=parag;
end
The problem seems to be that you're passing the result of a function call to minimize, rather than the function itself. The arguments get passed by the args parameter. So instead of:
minimize(GError_function_Price(para,y,k,t,T,tau,N,C),para,method='SLSQP',bounds=B)
the following should work
minimize(GError_function_Price,para,args=(y,k,t,T,tau,N,C),method='SLSQP',bounds=B)
I am trying to solve a second order ODE with solve_bvp. I have split the second order ODE into a system of tow first oder ODEs. I have a changing set of constants depending on the x (mesh) value. So I am passing these as an array of shape (N,) into my function numdens. While trying to run solve_bvp I get the error that the returns have different shapes namely (N,) and (N-1,) and thus cannot be broadcast into one array. But when I check each return back manually outside of the function it has the shape (N,).
If I run the solver without my changing constants I get a solution akin to the right one.
import numpy as np
from scipy.integrate import solve_bvp,odeint
import matplotlib.pyplot as plt
E_0 = 1 * 0.0000016021773 #erg: gcm^2/s^2
m_H = 1.6*10**(-24) #g
c = 3e11 #cm
sigma_c = 2*10**(-23)
n_0 = 1*10**(20) #1/cm^3
v_0 = (2*E_0/m_H)**(0.5) #cm/s
T = 10**7
b = 20.3
n_eq = b*T**3
n_s = 2.03*10**(19)
Q = 1
def velocity(v,x):
dvdx = -sigma_c*n_0*v_0*((8*v_0*v-7*v**2-v_0**2)/(2*v*c))
return dvdx
n_num = 100
x_num = np.linspace(-1*10**(6),3*10**(6), n_num)
sol_velo = odeint(velocity,0.999999999999*v_0,x_num)
sol_new = np.reshape(sol_velo,n_num)
def constants(v):
D1 = (c*v/(3*n_0*v_0*sigma_c))
D2 = ((v**2-8*v_0*v+v_0**2)/(6*v))
D3 = sigma_c*n_0*v_0*((8*v_0*v-7*v**2-v_0**2)/(2*v*c))
return D1,D2,D3
def numdens(x,y):
v = sol_new
D1,D2,D3 = constants(v)
return np.vstack((y[1],(-D2*y[1]-D3*y[0]+Q*((1-y[0])/n_eq))/(D1)))
def bc_num(ya, yb):
return np.array([ya[0]-n_s,yb[0]-n_eq])
y_num = np.array([np.linspace(n_s, n_eq, n_num),np.linspace(n_s, n_eq, n_num)])
sol_num = solve_bvp(numdens, bc_num, x_num, y_num)
plt.plot(sol_num.x, sol_num.y[0], label='$n(x)$')
plt.plot(x_num, sol_velo-v_0/7, label='$v(x)$')
plt.yscale('log')
plt.grid(alpha=0.5)
plt.legend(framealpha=1)
plt.show()
You need to take into account that the BVP solver uses an adaptive mesh. That is, after refining the initial guess on the initial grid the solver identifies regions with overly large errors and creates new mesh nodes there. As far as I have seen, the opposite is not implemented, even if it may be in some applications sensible to reduce the number of mesh nodes on especially "nice" segments.
Thus what you are doing the the numdens function is incomprehensible, it has to function exactly like any other function that you would pass to an ODE solver. If I had to propose some fast fix, and without knowing what the underlying problem is that you want to solve, I would change the assignment of v to
v = np.interp(x,x_num,sol_velo)
as that should at least produce an array of the correct format.
I have a rather complicated function H(x), and I'm trying to solve for the value of x such that H(x) = constant. I would like to do this with an interpolation object generated from a discrete interval and the corresponding output of H(interval), where other inputs are held constant. I denote the interpolation object f.
My problem is that the call function of the interpolation object accepts an array_like, so passing a symbol to f(x) to use sage's solver method is out of the question. Any ideas of how to get around this?
I have interpolation function f. I would like to solve the equation f(x) == sageconstant forx.
from scipy.interpolate import InterpolatedUnivariateSpline as IUspline
import numpy as np
#Generating my interpolation object
xint = srange(30,200,step=.1)
val = [H(i,1,.1,0,.2,.005,40) for i in srange(30,299,step=.1)]
f = IUspline(xint,val,k=4)
#This will yield a sage constant
eq_G(x) = freeB - x
#relation that I would like to solve
eq_m(x) = eq_G(39.9) == f(x)
m = solve(eq_m(x),x)
The above code (f(x) to be more specific) generates
"TypeError: Cannot cast array data from dtype('0') to dtype('float64')
according to the rule 'safe'.
edit: Any function H(x) will result in the same error, hence it doesn't matter what H(x) is. For simplicity (I wasn't kidding when I said H is complicated), try H(x) = x. Then the block will read:
from scipy.interpolate import InterpolatedUnivariateSpline as IUspline
import numpy as np
#Generating my interpolation object
xint = srange(30,200,step=.1)
H(x) = x
val = [H(i) for i in srange(30,299,step=.1)]
f = IUspline(xint,val,k=4)
#This will yield a sage constant
eq_G(x) = freeB - x
#relation that I would like to solve
eq_m(x) = eq_G(39.9) == f(x)
m = solve(eq_m(x),x)
When working with numpy and scipy, prefer Python types to Sage types.
Instead of Sage Integers and Reals, use Python ints and floats.
Maybe you can fix your code like this.
from scipy.interpolate import InterpolatedUnivariateSpline as IUspline
import numpy as np
# Generate interpolation object
xint = srange(30,200,step=.1)
xint = [float(x) for x in xint]
val = [float(H(i,1,.1,0,.2,.005,40)) for i in srange(30,299,step=.1)]
f = IUspline(xint,val,k=4)
# This will yield a Sage constant
eq_G(x) = freeB - x
# relation that I would like to solve
eq_m(x) = eq_G(39.9) == f(x)
m = solve(eq_m(x),x)
I'm trying to find a good way to solve a nonlinear overdetermined system with python. I looked into optimization tools here http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html but I can't figure out how to use them. What I have so far is
#overdetermined nonlinear system that I'll be using
'''
a = cos(x)*cos(y)
b = cos(x)*sin(y)
c = -sin(y)
d = sin(z)*sin(y)*sin(x) + cos(z)*cos(y)
e = cos(x)*sin(z)
f = cos(z)*sin(x)*cos(z) + sin(z)*sin(x)
g = cos(z)*sin(x)*sin(y) - sin(z)*cos(y)
h = cos(x)*cos(z)
a-h will be random int values in the range 0-10 inclusive
'''
import math
from random import randint
import scipy.optimize
def system(p):
x, y, z = p
return(math.cos(x)*math.cos(y)-randint(0,10),
math.cos(x)*math.sin(y)-randint(0,10),
-math.sin(y)-randint(0,10),
math.sin(z)*math.sin(y)*math.sin(x)+math.cos(z)*math.cos(y)-randint(0,10),
math.cos(x)*math.sin(z)-randint(0,10),
math.cos(z)*math.sin(x)*math.cos(z)+math.sin(z)*math.sin(x)-randint(0,10),
math.cos(z)*math.sin(x)*math.sin(y)-math.sin(z)*math.cos(y)-randint(0,10),
math.cos(x)*math.cos(z)-randint(0,10))
x = scipy.optimize.broyden1(system, [1,1,1], f_tol=1e-14)
could you help me out a bit here?
If I understand you right, you want to find an approximate solution to the non-linear system of equations f(x) = b where b is the vector containing the random values b=[a,...,h].
In order to do this you will first need to remove the random values from the system function, because otherwise in each iteration the solver will try to solve a different equation system. Moreover, I think that the basic Broyden method only works for a system with as many unknowns as equations. Alternatively you could use scipy.optimize.leastsq. A possible solution looks like this:
# I am using numpy because it's more convenient for the generation of
# random numbers.
import numpy as np
from numpy.random import randint
import scipy.optimize
# Removed random right-hand side values and changed nomenclature a bit.
def f(x):
x1, x2, x3 = x
return np.asarray((math.cos(x1)*math.cos(x2),
math.cos(x1)*math.sin(x2),
-math.sin(x2),
math.sin(x3)*math.sin(x2)*math.sin(x1)+math.cos(x3)*math.cos(x2),
math.cos(x1)*math.sin(x3),
math.cos(x3)*math.sin(x1)*math.cos(x3)+math.sin(x3)*math.sin(x1),
math.cos(x3)*math.sin(x1)*math.sin(x2)-math.sin(x3)*math.cos(x2),
math.cos(x1)*math.cos(x3)))
# The second parameter is used to set the solution vector using the args
# argument of leastsq.
def system(x,b):
return (f(x)-b)
b = randint(0, 10, size=8)
x = scipy.optimize.leastsq(system, np.asarray((1,1,1)), args=b)[0]
I hope this is of help for you. However, note that it is extremely unlikely that you will find a solution, especially when you generate random integers in the interval [0,10] while the range of f is limited to [-2,2]
What I'm trying to do is make a gaussian function graph. then pick random numbers anywhere in a space say y=[0,1] (because its normalized) & x=[0,200]. Then, I want it to ignore all values above the curve and only keep the values underneath it.
import numpy
import random
import math
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
from math import sqrt
from numpy import zeros
from numpy import numarray
variance = input("Input variance of the star:")
mean = input("Input mean of the star:")
x=numpy.linspace(0,200,1000)
sigma = sqrt(variance)
z = max(mlab.normpdf(x,mean,sigma))
foo = (mlab.normpdf(x,mean,sigma))/z
plt.plot(x,foo)
zing = random.random()
random = random.uniform(0,200)
import random
def method2(size):
ret = set()
while len(ret) < size:
ret.add((random.random(), random.uniform(0,200)))
return ret
size = input("Input number of simulations:")
foos = set(foo)
xx = set(x)
method = method2(size)
def undercurve(xx,foos,method):
Upper = numpy.where(foos<(method))
Lower = numpy.where(foos[Upper]>(method[Upper]))
return (xx[Upper])[Lower],(foos[Upper])[Lower]
When I try to print undercurve, I get an error:
TypeError: 'set' object has no attribute '__getitem__'
and I have no idea how to fix it.
As you can all see, I'm quite new at python and programming in general, but any help is appreciated and if there are any questions I'll do my best to answer them.
The immediate cause of the error you're seeing is presumably this line (which should be identified by the full traceback -- it's generally quite helpful to post that):
Lower = numpy.where(foos[Upper]>(method[Upper]))
because the confusingly-named variable method is actually a set, as returned by your function method2. Actually, on second thought, foos is also a set, so it's probably failing on that first. Sets don't support indexing with something like the_set[index]; that's what the complaint about __getitem__ means.
I'm not entirely sure what all the parts of your code are intended to do; variable names like "foos" don't really help like that. So here's how I might do what you're trying to do:
# generate sample points
num_pts = 500
sample_xs = np.random.uniform(0, 200, size=num_pts)
sample_ys = np.random.uniform(0, 1, size=num_pts)
# define distribution
mean = 50
sigma = 10
# figure out "normalized" pdf vals at sample points
max_pdf = mlab.normpdf(mean, mean, sigma)
sample_pdf_vals = mlab.normpdf(sample_xs, mean, sigma) / max_pdf
# which ones are under the curve?
under_curve = sample_ys < sample_pdf_vals
# get pdf vals to plot
x = np.linspace(0, 200, 1000)
pdf_vals = mlab.normpdf(x, mean, sigma) / max_pdf
# plot the samples and the curve
colors = np.array(['cyan' if b else 'red' for b in under_curve])
scatter(sample_xs, sample_ys, c=colors)
plot(x, pdf_vals)
Of course, you should also realize that if you only want the points under the curve, this is equivalent to (but much less efficient than) just sampling from the normal distribution and then randomly selecting a y for each sample uniformly from 0 to the pdf value there:
sample_xs = np.random.normal(mean, sigma, size=num_pts)
max_pdf = mlab.normpdf(mean, mean, sigma)
sample_pdf_vals = mlab.normpdf(sample_xs, mean, sigma) / max_pdf
sample_ys = np.array([np.random.uniform(0, pdf_val) for pdf_val in sample_pdf_vals])
It's hard to read your code.. Anyway, you can't access a set using [], that is, foos[Upper], method[Upper], etc are all illegal. I don't see why you convert foo, x into set. In addition, for a point produced by method2, say (x0, y0), it is very likely that x0 is not present in x.
I'm not familiar with numpy, but this is what I'll do for the purpose you specified:
def undercurve(size):
result = []
for i in xrange(size):
x = random()
y = random()
if y < scipy.stats.norm(0, 200).pdf(x): # here's the 'undercurve'
result.append((x, y))
return results