'bounds' doesn't work in scipy.optimize.minimize function - python

I ran into a very strange problem with the minimize function. I wrote the following code and expected every entry of the result to lie in (0, 1).
cons = ({'type':'eq','fun':lambda x: 1-sum(x)})
bnds = [(0,1)]*len(w)
minS = minimize(min_S_function, w, method = 'SLSQP', bounds = bnds, constraints = cons)
However, the result contains many extremely small numbers instead of zeros, even though I set the bounds to be between 0 and 1. Why is that?
In [68]:minS.x
Out[68]:
array([ 2.18674802e-14, -2.31905438e-14, 4.05696128e-01,
1.61295198e-14, 4.98954818e-02, -2.75073615e-14,
3.97195447e-01, 1.09796187e-14, -4.33297358e-15,
2.38805100e-14, 7.73037793e-15, 3.21824430e-14,
-1.42202909e-14, -1.08110329e-14, -1.83513297e-14,
-1.37745269e-14, 3.37854385e-14, 4.69473932e-14,
-1.09088800e-15, -1.57169147e-14, 7.47784562e-02,
1.32782180e-02, 1.64441640e-14, 2.72140153e-15,
5.23069695e-14, 5.91562687e-02, 2.16467506e-15,
-6.50672519e-15, 2.53337977e-15, -6.68019297e-14])

This is an acceptable solution!
These iterative solvers only guarantee a local optimum up to some approximation tolerance, and that is what you got.
Check your numbers and you will see that the distance to the lower bound (for your negative out-of-bound values) is within 10^-14 = 0.00000000000001, an acceptable error (we use floating-point math, after all).
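If you need the entries to lie exactly in [0, 1], you can clip the result after the fact. A minimal sketch, assuming minS is the result object from the call above:

import numpy as np

w_opt = np.clip(minS.x, 0, 1)   # push the 1e-14-sized violations back onto the bounds
w_opt /= w_opt.sum()            # re-impose the sum(x) == 1 constraint after clipping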

Related

Solving simultaneous equations (>2) without conversion to matrices in R or Python

I have a set of 4 simultaneous equations:
0.059z = x
0.06w = y
z+w = 8093
x+y = 422
All the solutions I've found so far seem to be for equations that have all the variables present in each equation, then convert to matrices and use the solve function.
Is there an easier way to do this in R or Python using the equations in their original form?
Also, how can I ensure that only positive numbers are returned in the solution?
Hope this makes sense...many thanks for the help
You can use sympy for this:
from sympy import symbols, linsolve, Eq
x,y,z,w = symbols('x y z w')
linsolve([Eq(0.059*z, x), Eq(0.06*w, y), Eq(z+w, 8093), Eq(x+y, 422)], (x, y, z, w))
Output:
{(3751.22, -3329.22, 63580.0, -55487.0)}
Regarding your comments about negative values: there is only one solution to the system of equations, and it has negative values for y and w. If there were more than one solution, sympy would return them all, and you could then filter the solutions down to only the positive ones.
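For illustration, such a filtering step might look like this (a sketch; sols is the set returned by the linsolve call above):

sols = linsolve([Eq(0.059*z, x), Eq(0.06*w, y), Eq(z+w, 8093), Eq(x+y, 422)], (x, y, z, w))
positive = [s for s in sols if all(v > 0 for v in s)]
print(positive)  # [] here, since the only solution has negative y and w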
In R, you could try it as below:
library(rootSolve)
library(zeallot)
model <- function(v){
  c(x,y,z,w) %<-% v
  return(c(0.059*z-x, 0.06*w-y, z+w-8093, x+y-422))
}
res <- multiroot(f = model, start = c(0,0,0,0))
Then you can get the solution via res$root:
> res$root
[1] 3751.22 -3329.22 63580.00 -55487.00
There are a few things going on here. First, as CDJB notes: if there were any positive solutions then sympy would find them. I searched for those numbers and found this paper, which suggests you should be using 7088 instead of 8093. We can do a quick sanity check:
def pct(value):
    return f"{value:.1%}"

print(pct(422 / 8093))  # ~5.2%
print(pct(422 / 7088))  # ~6.0%
confirming that you're going to struggle to average ~5.9% and ~6.0% down to ~5.2%, which explains the negative solutions in the other answers. Further, these are presumably counts, so all your variables also need to be whole numbers.
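To make that explicit, solving the two constraints directly with the original denominator 8093 (a quick check, not part of the question) forces one group to be negative:

# 0.059*z + 0.06*w = 422 and z + w = 8093
z = (0.06 * 8093 - 422) / (0.06 - 0.059)
w = 8093 - z
print(z, w)  # roughly 63580 and -55487; a negative w cannot be a count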
Once this corrected denominator is used, I'd note that there are many solutions (11645 by my count), e.g.:
cases = [1, 421]
pop = [17, 7071]
rates = [pct(c / p) for c, p in zip(cases, pop)]
gives the appropriate output, as does:
cases = [2, 420]
pop = [34, 7054]
This is because the data was rounded to two decimal places. You probably also don't want to use either of the above; they're just the first two valid solutions I got.
We can define a Python function to enumerate all solutions:
from math import floor, ceil

def solutions(pop, cases, rate1, rate2, err):
    target = (pct(rate1), pct(rate2))
    for pop1 in range(1, pop):
        pop2 = pop - pop1
        c1_lo = ceil(pop1 * (rate1 - err))
        c1_hi = floor(pop1 * (rate1 + err))
        for c1 in range(c1_lo, c1_hi + 1):
            c2 = cases - c1
            if (pct(c1 / pop1), pct(c2 / pop2)) == target:
                yield c1, c2, pop1, pop2

all_sols = list(solutions(7088, 422, 0.059, 0.060, 0.0005))
which is where I got my count of 11645 above from.
I'm not sure what to suggest with this, but you could do a bootstrap to see how much your statistic varies across the different solutions. Another option would be a Bayesian analysis, which would let you put priors over the population sizes and hence cut this down a lot.
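For instance, a minimal bootstrap-style sketch over the enumerated solutions (assuming all_sols from above; the statistic is only an example):

import random

stats = []
for _ in range(1000):
    c1, c2, pop1, pop2 = random.choice(all_sols)
    stats.append(c1 / pop1 - c2 / pop2)   # example statistic: difference between the two rates
print(min(stats), max(stats))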

How to monitor the process of SciPy.odeint?

SciPy can solve ODEs with scipy.integrate.odeint or other routines, but it only returns the result after the whole integration has finished. However, if the ODE function is very complex, the program can take a long time (one or two days) to produce the full result. So how can I monitor the progress of the solver (i.e. print intermediate results before the integration has finished)?
When I was googling for an answer, I couldn't find a satisfactory one. So I made a simple gist with a proof-of-concept solution using the tqdm project. Hope that helps you.
Edit: Moderators asked me to give an explanation of what is going on in the link above.
First of all, I am using scipy's newer solve_ivp interface rather than odeint, but you could adapt this back to odeint. Say you want to integrate from time T0 to T1 and you want to show progress for every 0.1% of progress. You can modify your ODE function to take two extra parameters, a pbar (progress bar) and a state (current state of the integration), like so:
import time

def fun(t, y, omega, pbar, state):
    # state is a list containing the last updated time and the reporting step:
    # state = [last_t, dt]
    # A list is used because its values can be carried between function
    # calls throughout the ODE integration.
    last_t, dt = state

    # t_span is subdivided into 1000 parts; call update(n)
    # where n = (t - last_t) / dt
    time.sleep(0.1)  # artificial delay so the progress bar is visible in this toy example
    n = int((t - last_t) / dt)
    pbar.update(n)

    # we need this to take into account that n is a rounded number
    state[0] = last_t + dt * n

    # YOUR CODE HERE
    dydt = 1j * y * omega
    return dydt
This is necessary because the function itself must know where it is located, but scipy's odeint doesn't really give this context to the function. Then, you can integrate fun with the following code:
import numpy as np
from scipy.integrate import solve_ivp
from tqdm import tqdm

T0 = 0
T1 = 1
t_span = (T0, T1)
omega = 20
y0 = np.array([1], dtype=complex)   # np.complex is deprecated; use the builtin complex
t_eval = np.arange(*t_span, 0.25/omega)

with tqdm(total=1000, unit="‰") as pbar:
    sol = solve_ivp(
        fun,
        t_span,
        y0,
        t_eval=t_eval,
        args=[omega, pbar, [T0, (T1 - T0) / 1000]],
    )
Note that anything mutable (like a list) in the args is instantiated once and can be changed from within the function. I recommend doing this rather than using a global variable.
This will show a progress bar that looks like this:
100%|█████████▉| 999/1000 [00:13<00:00, 71.69‰/s]
You could split the integration domain and integrate the segments, taking the last value of the previous as initial condition of the next segment. In-between, print out whatever you want. Use numpy.concatenate to assemble the pieces if necessary.
In a standard example of a 3-body solar system simulation, replacing the code
u0 = solsys.getState0();
t = np.arange(0, 100*365.242*day, 0.5*day);
%timeit u_res = odeint(lambda u,t: solsys.getDerivs(u), u0, t, atol = 1e11*1e-8, rtol = 1e-9)
output: 1 loop, best of 3: 5.53 s per loop
with a progress-reporting code
def progressive(t, N):
    nk = [int(n + 0.5) for n in np.linspace(0, len(t), N + 1)]
    u0 = solsys.getState0()
    u_seg = [np.array([u0])]
    for k in range(N):
        u_seg.append(odeint(lambda u, t: solsys.getDerivs(u), u0,
                            t[nk[k]:nk[k+1]], atol=1e11*1e-8, rtol=1e-9)[1:])
        u0 = u_seg[-1][-1]   # last state of this segment is the initial condition of the next
        print(t[nk[k]] / day)
        for b in solsys.bodies:
            print("%10s %s" % (b.name, b.x))
    return np.concatenate(u_seg)
%timeit u_res = progressive(t,20)
output: 1 loop, best of 3: 5.96 s per loop
shows only a slight 8% overhead for the console printing. With a more substantial ODE function, the fraction of reporting overhead shrinks further.
That said, python, at least with its standard packages, is not the tool for industrial-scale number-crunching. Always use compiled versions with strong typing of variables to reduce interpretative overhead as much as possible.
Use a heavily developed and tested package like Sundials or the julia-lang framework differentialequations.jl, coding the ODE function directly in an appropriate compiled language. Use higher-order methods, which allow larger step sizes and thus fewer steps. Test whether implicit or exponential/Rosenbrock methods reduce the number of steps or ODE function evaluations per fixed interval further. The difference can be a factor of 10 to 100 in speedup.
Use a python wrapper of the above with some acceleration-friendly implementation of your ODE function.
Use the quasi-source-translating tool JiTCODE to translate your Python ODE function into a spaghetti list of C instructions, which then gives a compiled function that can be (almost) directly called from the compiled FORTRAN kernel of odeint.
Simple and clear.
If you want to integrate an ODE from T0 to T1:
In the last line of the ODE function, just before the return, you can add print((t/T1)*100, end='').
Then use sys.stdout.flush() to keep printing on the same line.
Here is an example; my integration interval is [0, 0.2]:
ddt[-2]=(beta/(Ap2*(L-x)))*(-Qgap+Ap*u)
ddt[-1]=(beta/(Ap2*(L+x)))*(Qgap-Ap*u)
print("\rCompletion percentage "+str(format(((t/0.2)*100),".4f")),end='')
sys.stdout.flush()
return ddt
It slows the solving process slightly (by fractions of a second), but it serves the purpose well without having to create any new functions.
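For reference, here is a self-contained minimal sketch of the same idea; the right-hand side and the interval are placeholders, not the code from the question:

import sys
import numpy as np
from scipy.integrate import odeint

T1 = 0.2  # end of the integration interval

def rhs(y, t):
    # report progress on a single, continually overwritten line
    print("\rCompletion percentage " + format((t / T1) * 100, ".4f"), end='')
    sys.stdout.flush()
    return -y  # placeholder right-hand side

sol = odeint(rhs, [1.0], np.linspace(0, T1, 101))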

Scipy Curve_fit. Separate bounds for multiple parameters

I am using Scipy to fit my data to a function. The function gives me values for 2 parameters, in this case a and b. I want to use the bounds argument to limit the values these parameters can take; each has its own range of acceptable values.
Acceptable values: 15 < a < 50 and 0.05 < b < 0.2
I want to know how to implement this. The official documentation only shows how to do it for 1 parameter. This question is similar to: Python curve fit library that allows me to assign bounds to parameters, which also only tackles bounds for 1 parameter.
Here is what I tried:
def Ebfit(x, a, b):
    Eb_mean = a*(0.0256/kt)  # Eb at bake temperature
    Eb_sigma = b*Eb_mean
    Foursigma = 4*Eb_sigma
    Eb_a = np.linspace(Eb_mean-Foursigma, Eb_mean+Foursigma, N_Device)
    dEb = Eb_a[1] - Eb_a[0]
    pdfEb_a = spys.norm.pdf(Eb_a, Eb_mean, Eb_sigma)
    ## Retention Time
    DMom = np.zeros(len(x), float)
    tau = (1/f0)*np.exp(Eb_a)
    for bb in range(len(x)):
        DMom[bb] = (1 - 2*(sum(pdfEb_a*(1 - np.exp(np.divide(-x[bb], tau))))*dEb))
    return DMom

time = datafile['time'][0:501]
Moment = datafile['25Oe'][0:501]

params, extras = curve_fit(Ebfit, time, Moment, p0=[20,0.1], bounds=[(15,50),(0.05,0.2)])
I have also tried the following variations to see if the parenthesis was the issue:
params,extras = curve_fit(Ebfit,time,Moment, p0=[20,0.1], bounds=[[15,50],[0.02,0.2]])
params,extras = curve_fit(Ebfit,time,Moment, p0=[20,0.1], bounds=((15,50),(0.02,0.2)))
But I get the same error for all of these variations:
ValueError: Each lower bound must be strictly less than each upper bound.
It only works with a single bound such as:
params,extras = curve_fit(Ebfit,time,Moment, p0=[20,0.1], bounds=[0,50])
Any help is appreciated.
Thank you!
bounds=[[0,50],[0,0.3]] means the second parameter must be greater than 50 but smaller than 0.3, and the first parameter is fixed at zero (its lower and upper bounds are both 0).
The format is bounds=(lower, upper), where lower and upper each contain one entry per parameter.
As per @ev-br's suggestion, I tried the following change to the bounds argument and it worked out great.
bounds=[[15,0.02],[50,0.2]]
So in the end, the argument given should be as follows:
bounds=[[a1,b1],[a2,b2]]
where a1 is the lower limit for a and a2 the upper limit for a. The same goes for b.
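Putting that together with the acceptable ranges from the question (15 < a < 50, 0.05 < b < 0.2), the call would look like this (a sketch reusing the names from above):

params, extras = curve_fit(Ebfit, time, Moment, p0=[20, 0.1],
                           bounds=[[15, 0.05], [50, 0.2]])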

How to deal with the output of np.fft?

I am very confused about how to use the output of the FFT. Here is an example output generated by
result = np.abs(fftpack.fft(targetArray))[0:sample_size/2]
print result gives an ndarray:
[ 4.21477326e+05 3.03444591e+04 1.80682694e+04 1.05924224e+04
1.98799134e+04 2.82620258e+04 1.39538936e+04 2.40331051e+04
4.57465472e+04 6.41241587e+04 1.88203479e+04 1.88670469e+04
5.42137198e+03 5.97675834e+03 1.23986101e+04 9.70841580e+02
2.07380817e+04 4.52632516e+04 4.49493295e+04 1.03481720e+04
2.09126074e+04 2.07691720e+04 1.22091133e+04 1.85574554e+04
1.70652450e+04 2.53329892e+04 3.33015423e+04 2.80037092e+04
2.22665347e+04 2.08644916e+04 1.76186194e+04 1.14038308e+04
3.10918574e+04 2.97875178e+04 5.85659852e+04 3.77721316e+04
1.78952068e+04 1.74037254e+04 1.80938450e+04 3.05330693e+04
4.24407000e+04 2.74864012e+04 4.41637099e+04 3.31359768e+04
1.74327061e+04 1.06382371e+04 2.49780963e+04 1.66725706e+04
8.29635786e+03 2.58186162e+04 1.16012033e+04 2.63763332e+04
1.57059736e+04 1.70297649e+04 1.03524047e+04 3.25882139e+03
5.26665772e+03 1.15851816e+04 1.28794168e+04 1.01768783e+04
8.62533363e+03 4.42905399e+03 4.23319771e+03 9.19616531e+03
4.25165576e+03 4.96332367e+03 6.89522891e+03 4.04350194e+03
1.44337315e+04 7.78488536e+03 5.85210341e+03 3.56468734e+03
8.35331002e+03 5.41229478e+03 4.55374300e+03 1.01986739e+04
7.04839030e+03 3.78646612e+03 7.07872186e+03 4.06017111e+03
4.63905900e+03 9.84305110e+03 6.24621254e+03 2.76105428e+03
3.23469868e+03 2.31035716e+03 2.15636401e+03 2.60730520e+03
3.77829898e+03 3.40698553e+03 8.72582345e+02 1.72800115e+03
1.83439871e+03 7.95636998e+03 4.69696233e+03 2.50245035e+03
6.50413681e+03 4.75657064e+03 3.18621272e+03 1.38365470e+03
6.25551674e+02 3.51399477e+03 2.96447293e+03 1.88665733e+03
1.63347429e+03 1.92721372e+03 5.34091697e+03 2.86529694e+03
1.80361382e+03 3.65567684e+03 2.08391205e+03 3.77302704e+03
2.61882954e+03 1.17689735e+03 1.10303601e+03 1.46603669e+03
1.67959657e+03 1.90800388e+03 2.35782546e+03 1.61309844e+03
1.36326484e+03 4.06967773e+03 1.40142207e+03 1.32657523e+03
3.17829657e+03 2.48240862e+03 1.84764188e+03 2.46198195e+03
2.44352793e+03 1.29546145e+03 9.34633855e+02 1.42411185e+02
1.11686208e+03 1.61629862e+03 1.82113405e+03 1.26350347e+03
3.63268437e+03 9.33373272e+02 8.45292645e+02 1.03929325e+03
1.65583031e+03 9.54310546e+02 1.95195173e+03 1.91535953e+03
5.61485427e+02 1.98666296e+03 9.88850958e+02 7.80781362e+02
1.16064386e+03 1.08425676e+03 3.95616137e+02 1.25423006e+03
2.12467757e+03 7.12337370e+02 1.44060716e+03 7.73146781e+02
1.05641593e+03 1.19763314e+03 1.59583780e+03 1.23434921e+03
3.33146158e+02 1.75650022e+03 8.81978933e+02 1.28186954e+03
1.47573928e+03 8.07757403e+02 8.84292001e+02 1.64624690e+03
1.29680496e+03 4.76763593e+02 1.14002526e+03 1.88558087e+02
6.21497355e+02 5.30485958e+02 1.14902281e+03 4.16705689e+02
1.46212548e+03 1.32165278e+03 7.72060051e+02 9.39714410e+02
1.09011170e+03 8.90859235e+02 7.67129975e+02 2.72632265e+02
2.71574309e+02 5.28939138e+02 5.04479312e+02 4.53129779e+02
7.42214724e+02 2.61798368e+02 4.98990728e+02 6.02745861e+02
9.87830434e+02 2.97161466e+02 1.08718023e+03 5.87366849e+02
3.00425885e+02 8.33291219e+02 1.31052224e+02 2.31099067e+02
6.64652156e+02 1.32180021e+02 2.92862313e+00 2.39475444e+02
7.71773465e+02 8.34334863e+02 7.92791780e+02 6.70180885e+02
5.73451905e+02 4.66006885e+02 9.48437277e+02 7.04566875e+02
2.54136392e+02 4.29167074e+02 2.69560662e+02 6.08905902e+02
1.04487371e+03 5.70108773e+02 5.03504459e+02 7.67808997e+02
4.38126513e+02 7.56769864e+02 7.36892665e+02 5.61631429e+02
8.44062274e+02 8.30259267e+02 3.41959075e+02 4.06049010e+02
1.68799150e+02 7.98590743e+02 5.24271279e+02 4.96069745e+02
5.49647172e+02 7.41309283e+02 9.07897622e+02 1.04985345e+03
1.00181109e+03 5.42974899e+02 7.35959741e+02 4.04694642e+02
5.81271022e+02 2.01778038e+02 6.00141101e+02 3.80334242e+02
6.44350585e+02 8.54890120e+02 7.12173695e+02 8.64161173e+02
7.57346370e+02 7.92985369e+02 7.39425694e+02 4.64160309e+02
7.04501040e+02 4.39166237e+02 1.01374899e+03 7.39703012e+02
8.22200001e+02 4.71396567e+02 8.06529692e+02 7.18184947e+02
7.04886010e+02 6.71256922e+02 5.19651471e+02 9.20043821e+02
7.69576193e+02 8.78863865e+02 1.09071085e+03 9.10790235e+02
6.99356743e+02 9.75210348e+02 7.42159855e+02 2.94034843e+02
6.98690944e+02 7.64206208e+02 6.88827262e+02 5.81514517e+02
1.00230881e+03 7.13219427e+02 8.59968358e+02 8.52206990e+02
4.52436732e+02 6.05729013e+02 8.60630471e+02 7.77693596e+02
6.06655413e+02 7.24578627e+02 6.57839491e+02 6.72231281e+02
7.01971817e+02 4.12298654e+02 6.04044947e+02 6.71707719e+02
6.30927816e+02 7.82746860e+02 7.94808478e+02 5.94066021e+02
6.51161261e+02 7.95649076e+02 2.92195286e+02 4.08585488e+02
6.10540227e+02 5.15197819e+02 5.67327416e+02 5.21334315e+02
4.52410192e+02 7.44553730e+02 6.98824805e+02 7.93759345e+02
5.97743322e+02 5.74907952e+02 3.85973511e+02 3.58967385e+02
5.79438559e+02 4.50199311e+02 4.60028768e+02 4.84243380e+02
7.86184753e+02 4.04682342e+02 5.55837013e+02 6.36922370e+02
3.40645318e+02 5.57139578e+02 3.69777174e+02 3.78496601e+02
5.39000001e+02 8.90982470e+02 3.24737044e+02 2.77411040e+02
4.87813362e+02 1.67412278e+02 7.61243559e+02 3.58371802e+02
5.23608891e+02 2.89915508e+02 5.71091257e+02 6.46835815e+02
4.49435858e+02]
I want to use this array to calculate the FFT energy, but these numbers don't look like complex numbers...
I want to write a function like this:
def get_energy(input):
    energy = 0
    for e in input:
        energy = energy + sqrt(pow(e.real, 2) + pow(e.imag, 2))
    return energy
(I am not very familiar with the Python language, sorry about the non-Pythonic code.)
Many thanks.
First of all, your result does not look like complex FFT output because you took the absolute values of the FFT, and an absolute value is never complex.
Therefore:
result = fftpack.fft(targetArray)[0:sample_size//2]
(sample_size//2 ensures that the upper bound of the slice is an integer.)
Your get_energy function should actually look like this:
def get_energy(input):
    return np.sum(input * input.conj())
The energy of a signal is the sum of its squared amplitudes... alternatively you can write it as
def get_energy(input):
    return np.sum(np.abs(input)**2)
Note that in numpy, mathematical operations are executed elementwise, so you do not have to square each element in a for-loop.
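As a quick sanity check, Parseval's theorem says the frequency-domain energy equals N times the time-domain energy. A minimal sketch with a made-up signal (not the data from the question):

import numpy as np
from scipy import fftpack

signal = np.random.randn(1024)           # hypothetical input, just for illustration
spectrum = fftpack.fft(signal)

print(get_energy(spectrum))              # energy computed from the (complex) spectrum
print(len(signal) * get_energy(signal))  # N * time-domain energy; agrees up to rounding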

getting more precise answer in python

I am trying to implement a simple formula in Python. In the case of RLSR there are two approaches to calculate the coefficient vector c. The first is c = K^-1 * y, where K is the kernel matrix and y is the target value vector.
The second way uses the eigenvalues (w) and eigenvectors (u) of the kernel matrix and the formula:
c = sum_i( u_i * <u_i, y> / w_i )
To check the result I test whether y = Kc holds. In the first case it does, but in the second case it does not, even though the c vectors from both methods look almost the same.
Output from the inverse algorithm:
[ 19.49840251 18.82695226 20.08390355 15.01043404 14.79353281
16.75316736 12.88504257 16.92127176 16.77292954 17.81827473
20.90503787 17.09359467 18.76366701 18.14816903 20.03491117
22.56668264 21.45176136 25.44051036 30.40312692 22.61466379
22.86480382 19.34631818 17.0169598 19.85244414 16.63702471
20.35280156 20.58093488 22.42058736 20.54935198 19.35541575
20.39006958 19.74766081 20.41781019 22.33858797 17.57962283
22.61915219 22.54823733 24.96292824 22.82888425 34.18952603
20.7487537 24.82019935 22.40621769 21.15767304 27.58919263
18.39293156 21.55455108 18.69532341]
Output from the second (eigendecomposition) algorithm:
[ 19.25280289 18.73927731 19.77184991 14.7650427 14.87364331
16.2273648 12.29183797 16.52024239 16.66669961 17.59282615
20.83059115 17.02815857 18.3635798 18.16931845 20.50528549
22.67690164 21.40479524 25.54544 30.94618128 22.72992565
23.06289609 17.6485592 15.2758427 19.50578691 16.45571607
20.19960765 20.35352859 22.60091638 20.36586912 18.90760728
20.57141151 19.43677153 20.43437031 22.39310576 17.72296978
22.82139991 22.50744791 25.10496617 22.30462867 34.80540213
20.77064617 25.18071618 22.5500315 20.71481252 27.91939784
18.29868659 22.00800019 18.71266093]
This is how I have implemented it (say we have 48 samples, so K is 48x48 and y is 48x1):
def cpu_compute(y, k):
    w, u = LA.eigh(k)
    uDoty = np.dot(u, y)
    temp = np.transpose(np.tile(w, (len(u), 1)))
    div = np.divide(u, temp)
    r = np.tile(uDoty, (len(div), 1))
    a = div*r.T
    c = sum(a)
    return c
and the result from
print np.allclose(Y, np.dot(K, c))
is False.
Also, the norm of the difference from the true result is 3.13014997999.
Now I have no idea how to fix this. I thought maybe I somehow need a more precise answer.
Appreciate any help!
To solve k c = y with numpy, use numpy.linalg.solve:
c = solve(k, y)
This uses an algorithm that is much more robust than the methods you are trying.
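A minimal sketch with a made-up symmetric positive-definite K (your real K and y come from the kernel and the data); it also shows the eigendecomposition formula written with plain matrix operations for comparison:

import numpy as np
from numpy.linalg import solve, eigh

rng = np.random.default_rng(0)
A = rng.standard_normal((48, 48))
K = A @ A.T + 48 * np.eye(48)      # hypothetical well-conditioned kernel matrix
y = rng.standard_normal(48)

# direct solve (recommended)
c = solve(K, y)
print(np.allclose(y, K @ c))       # True

# the eigendecomposition route, c = sum_i u_i * <u_i, y> / w_i
w, u = eigh(K)
c_eig = u @ ((u.T @ y) / w)
print(np.allclose(c, c_eig))       # True, up to floating-point error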
