Solve Polynomial equation of 6th order with Python efficiently - python

I want to Solve Polynomial equation of 6th order with Python.
I've tried the "basic" version:
avgIrms = 19.61
c_val = (0.000002324*avgIrms**6) - (0.0001527*avgIrms**5) + (0.003961843*avgIrms**4) - (0.052211292*avgIrms**3) + (0.379269091*avgIrms**2) -(0.404399274*avgIrms) + 0.000682896
print(c_val)
After that I've used the numpy with the following code:
import numpy as np
avgIrms = 19.61
ppar = [0.000002324, -0.0001527, 0.003961843, -0.052211292, 0.379269091, -0.404399274, 0.000682896]
p = np.poly1d(ppar)
print(p(avgIrms))
In the both ways the raspberry tooks more than five seconds to process... It's to much! Any help to solve polynomial equations efficiently? (less than one second...)
Thanks in advance,
Daniel

First, what you want is to evaluate a polynomial for a given x, not to solve it. Second, I still don't see how do you get your speed up..
Find here a couple of timmings:
>>> import numpy as np
>>> x = 19.61
>>> pr = [0.000002324, -0.0001527, 0.003961843, -0.052211292, 0.379269091, -0.404399274, 0.000682896]
>>> p = pr[::-1] # reverse the order
Hardcoded solution:
>>> %timeit p[0] + x * p[1] + p[2] * x**2 + p[3] * x**3 + p[4] * x**4 + p[5] * x**5 + p[6] * x**6
809 ns
Loopy solution:
>>> %%timeit
val = 0
for i in range(len(p)):
val += p[i] * x**i
1.24 µs
Functional programming solution:
>>> %timeit reduce(lambda acc, i: acc + p[i] * x**i, range(len(p)))
1.61 µs
Using numpy's polyval:
>>> %timeit np.polyval(pr, x)
6.12 µs
Using numpy's poly1d
>>> %%timeit
c = np.poly1d(pr)
c(x)
9.46 µs
So, clearly numpy is slower, as for a such a small array it adds some overhead in the Python <-> C communication, but still, it is of the order of 6-9 µs, I'm using a desktop computer, but I would be pretty impressed if a Raspberry Pi would really take 5 seconds to do that operation. Are you sure you did the timings properly?
Any way, either the hardcoded or the loopy solution seem faster than the functional programming one (the equivalent to the one that you defined as horner in your comment).

Related

Improve performance of `compute_optimal_weights` function in Python

Is there a faster way to write "compute_optimal_weights" function in Python. I run it hundreds of millions of times, so any speed increase would help. The arguments of the function are different each time I run it.
c1 = 0.25
c2 = 0.67
def compute_optimal_weights(input_prices):
input_weights_optimal = {}
for i in input_prices:
price = input_prices[i]
input_weights_optimal[i] = c2 / sum([(price/n) ** c1 for n in input_prices.values()])
return input_weights_optimal
input_sellers_ID = range(10)
input_prices = {}
for i in input_sellers_ID:
input_prices[i] = random.uniform(0,1)
t0 = time.time()
for i in xrange(1000000):
compute_optimal_weights(input_prices)
t1 = time.time()
print "old time", (t1 - t0)
The number of elements in list and dictionary vary, but on average there are about 10 elements. They keys in input_prices are the same across all calls but the values change, so the same key will have different values over different runs.
Using a little bit of math, you can calculate part of your sum_price_ratio_scaled as a constant earlier in the loop and speed up your program by ~80% (for the average input size of 10).
Optimized Implementation (Python 3):
def compute_optimal_weights(ids, prices):
scaled_sum = 0
for i in ids:
scaled_sum += prices[i] ** -0.25
result = {}
for i in ids:
result[i] = 0.67 * (prices[i] ** -0.25) / scaled_sum
return result
Edit, in response to this answer: While using numpy will prove more performant with massive data sets, given that "on average there are about 10 elements" in your input_sellers_ID list, I doubt that this approach is worth its own weight for your particular application.
Although it might be tempting to leverage the terseness of generator expressions and dictionary comprehensions, I noticed when running on my machine that the best performance was obtained by using regular for-in loops and avoiding function calls like sum(...). For the sake of completeness, though, here is what the above implementation would look like in a more 'pythonic' style:
def compute_optimal_weights(ids, prices):
scaled_sum = sum(prices[i] ** -0.25 for i in ids)
return {i: 0.67 * (prices[i] ** -0.25) / scaled_sum for i in ids}
Reasoning / Math:
Based on your posted algorithm, you are trying to create a dictionary with values represented by the function f(i) below, where i is one of the elements in your input_sellers_ID list.
When you initially write out the formula for f(i), it appears as though prices[i] must be recalculated for every step of the summation process, which is costly. Simplifying the expression using the rules of exponents, however, you can see that the simplest summation needed to determine f(i) is actually independent of i (only the index value of j is ever used), meaning that that term is a constant and can be calculated outside of the loop which sets the dictionary values.
Note that above I refer to input_prices as prices and input_sellers_ID as ids.
Performance Profile (~80% speed improvement on my machine, size 10):
import time
import random
def compute_optimal_weights(ids, prices):
scaled_sum = 0
for i in ids:
scaled_sum += prices[i] ** -0.25
result = {}
for i in ids:
result[i] = 0.67 * (prices[i] ** -0.25) / scaled_sum
return result
def compute_optimal_weights_old(input_sellers_ID, input_prices):
input_weights_optimal = {}
for i in input_sellers_ID:
sum_price_ratio_scaled = 0
for j in input_sellers_ID:
price_ratio = input_prices[i] / input_prices[j]
scaled_price_ratio = price_ratio ** c1
sum_price_ratio_scaled += scaled_price_ratio
input_weights_optimal[i] = c2 / sum_price_ratio_scaled
return input_weights_optimal
c1 = 0.25
c2 = 0.67
input_sellers_ID = range(10)
input_prices = {i: random.uniform(0,1) for i in input_sellers_ID}
start = time.clock()
for _ in range(1000000):
compute_optimal_weights_old(input_sellers_ID, input_prices) and None
old_time = time.clock() - start
start = time.clock()
for _ in range(1000000):
compute_optimal_weights(input_sellers_ID, input_prices) and None
new_time = time.clock() - start
print('Old:', compute_optimal_weights_old(input_sellers_ID, input_prices))
print('New:', compute_optimal_weights(input_sellers_ID, input_prices))
print('New algorithm is {:.2%} faster.'.format(1 - new_time / old_time))
I believe we could speed-up the function by factoring the loop. Let a = price, b = n and c = c1, if my maths are not wrong (e.g. (5/6)**3 == 5**3 / 6**3:
(5./6.)**2 + (5./4.)**2
==
5**2 / 6.**2 + 5**2 / 4.**2
==
5**2 * (1/6.**2 + 1/4.**2)
With variables:
sum( (a / b) ** c for each b)
==
sum( a**c * (1/b) ** c for each b)
==
a**c * sum((1./b)**c for each b)
The second term is constant and can be taken out. Which leaves:
Faster implementation - Raw Python
Using generators and dict-comprehension:
def compute_optimal_weights(input_prices):
sconst = sum(1/w**c1 for w in input_prices.values())
return {k: c2 / (v**c1 * sconst) for k, v in input_prices.items()}
NOTE: if you are using Python2 replace .values() and .items() with .itervalues() and .iteritems() for extra speedup (few ms with large lists).
Even Faster - Numpy
Additionally, if you don't care that much about the dictionary and just want the values, you could speed it up using numpy (for large inputs >100):
def compute_optimal_weights_np(input_prices):
data = np.asarray(input_prices.values()) ** c1
return c2 / (data * np.sum(1./data))
Few timings for different input size:
N = 10 inputs:
MINE: 100000 loops, best of 3: 6.02 µs per loop
NUMPY: 100000 loops, best of 3: 10.6 µs per loop
YOURS: 10000 loops, best of 3: 23.8 µs per loop
N = 100 inputs:
MINE: 10000 loops, best of 3: 49.1 µs per loop
NUMPY: 10000 loops, best of 3: 22.6 µs per loop
YOURS: 1000 loops, best of 3: 1.86 ms per loop
N = 1000 inputs:
MINE: 1000 loops, best of 3: 458 µs per loop
NUMPY: 10000 loops, best of 3: 121 µs per loop
YOURS: 10 loops, best of 3: 173 ms per loop
N = 100000 inputs:
MINE: 10 loops, best of 3: 54.2 ms per loop
NUMPY: 100 loops, best of 3: 11.1 ms per loop
YOURS: didn't finish in a couple of minutes
Both options here are considerably faster than the one presented in the question. The benefit of using numpy if you can give consistent input (in the form of array instead of a dictionary) becomes apparent when the size grows:

How to vectorize fourier series partial sum in numpy

Given the Fourier series coefficients a[n] and b[n] (for cosines and sines respectively) of a function with period T and t an equally spaced interval the following code will evaluate the partial sum for all points in interval t (a,b,t are all numpy arrays). It is clarified that len(t) <> len(a).
yn=ones(len(t))*a[0]
for n in range(1,len(a)):
yn=yn+(a[n]*cos(2*pi*n*t/T)-b[n]*sin(2*pi*n*t/T))
My question is: Can this for loop be vectorized?
Here's one vectorized approach making use broadcasting to create the 2D array version of cosine/sine input : 2*pi*n*t/T and then using matrix-multiplication with np.dot for the sum-reduction -
r = np.arange(1,len(a))
S = 2*np.pi*r[:,None]*t/T
cS = np.cos(S)
sS = np.sin(S)
out = a[1:].dot(cS) - b[1:].dot(sS) + a[0]
Further performance boost
For further boost, we can make use of numexpr module to compute those trignometric steps -
import numexpr as ne
cS = ne.evaluate('cos(S)')
sS = ne.evaluate('sin(S)')
Runtime test -
Approaches -
def original_app(t,a,b,T):
yn=np.ones(len(t))*a[0]
for n in range(1,len(a)):
yn=yn+(a[n]*np.cos(2*np.pi*n*t/T)-b[n]*np.sin(2*np.pi*n*t/T))
return yn
def vectorized_app(t,a,b,T):
r = np.arange(1,len(a))
S = (2*np.pi/T)*r[:,None]*t
cS = np.cos(S)
sS = np.sin(S)
return a[1:].dot(cS) - b[1:].dot(sS) + a[0]
def vectorized_app_v2(t,a,b,T):
r = np.arange(1,len(a))
S = (2*np.pi/T)*r[:,None]*t
cS = ne.evaluate('cos(S)')
sS = ne.evaluate('sin(S)')
return a[1:].dot(cS) - b[1:].dot(sS) + a[0]
Also, including function PP from #Paul Panzer's post.
Timings -
In [22]: # Setup inputs
...: n = 10000
...: t = np.random.randint(0,9,(n))
...: a = np.random.randint(0,9,(n))
...: b = np.random.randint(0,9,(n))
...: T = 3.45
...:
In [23]: print np.allclose(original_app(t,a,b,T), vectorized_app(t,a,b,T))
...: print np.allclose(original_app(t,a,b,T), vectorized_app_v2(t,a,b,T))
...: print np.allclose(original_app(t,a,b,T), PP(t,a,b,T))
...:
True
True
True
In [25]: %timeit original_app(t,a,b,T)
...: %timeit vectorized_app(t,a,b,T)
...: %timeit vectorized_app_v2(t,a,b,T)
...: %timeit PP(t,a,b,T)
...:
1 loops, best of 3: 6.49 s per loop
1 loops, best of 3: 6.24 s per loop
1 loops, best of 3: 1.54 s per loop
1 loops, best of 3: 1.96 s per loop
Can't beat numexpr, but if it's not available we can save on the transcendentals (testing and benchmarking code heavily based on #Divakar's code in case you didn't notice ;-) ):
import numpy as np
from timeit import timeit
def PP(t,a,b,T):
CS = np.empty((len(t), len(a)-1), np.complex)
CS[...] = np.exp(2j*np.pi*(t[:, None])/T)
np.cumprod(CS, axis=-1, out=CS)
return a[1:].dot(CS.T.real) - b[1:].dot(CS.T.imag) + a[0]
def original_app(t,a,b,T):
yn=np.ones(len(t))*a[0]
for n in range(1,len(a)):
yn=yn+(a[n]*np.cos(2*np.pi*n*t/T)-b[n]*np.sin(2*np.pi*n*t/T))
return yn
def vectorized_app(t,a,b,T):
r = np.arange(1,len(a))
S = 2*np.pi*r[:,None]*t/T
cS = np.cos(S)
sS = np.sin(S)
return a[1:].dot(cS) - b[1:].dot(sS) + a[0]
n = 1000
t = 2000
t = np.random.randint(0,9,(t))
a = np.random.randint(0,9,(n))
b = np.random.randint(0,9,(n))
T = 3.45
print(np.allclose(original_app(t,a,b,T), vectorized_app(t,a,b,T)))
print(np.allclose(original_app(t,a,b,T), PP(t,a,b,T)))
print('{:18s} {:9.6f}'.format('orig', timeit(lambda: original_app(t,a,b,T), number=10)/10))
print('{:18s} {:9.6f}'.format('Divakar no numexpr', timeit(lambda: vectorized_app(t,a,b,T), number=10)/10))
print('{:18s} {:9.6f}'.format('PP', timeit(lambda: PP(t,a,b,T), number=10)/10))
Prints:
True
True
orig 0.166903
Divakar no numexpr 0.179617
PP 0.060817
Btw. if delta t divides T one can potentially save more, or even run the full fft and discard what's too much.
This is not really another answer but a comment on #Paul Panzer's one, written as an answer because I needed to post some code. If there is a way to post propely formatted code in a comment please advice.
Inspired by #Paul Panzer cumprod idea, I came up with the following:
an = ones((len(a)-1,len(te)))*2j*pi*te/T
CS = exp(cumsum(an,axis=0))
out = (a[1:].dot(CS.real) - b[1:].dot(CS.imag)) + a[0]
Although it seems properly vectorized and produces correct results, its performance is miserable. It is not only much slower than the cumprod, which is expected as len(a)-1 exponentiations more are made, but 50% slower than the original unvectorized version. What is the cause of this poor performance?

Parsing Complex Mathematical Functions in Python

Is there a way in Python to parse a mathematical expression in Python that describes a 3D graph? Using other math modules or not. I couldn't seem to find a way for it to handle two inputs.
An example of a function I would want to parse is Holder Table Function.
It has multiple inputs, trigonometric functions, and I would need to parse the abs() parts too.
I have two 2D numpy meshgrids as input for x1 and x2, and would prefer to pass them directly to the expression for it to evaluate.
Any help would be greatly appreciated, thanks.
It's not hard to write straight numpy code that evaluates this formula.
def holder(x1, x2):
f1 = 1 - np.sqrt(x1**2 + x2**2)/np.pi
f2 = np.exp(np.abs(f1))
f3 = np.sin(x1)*np.cos(x2)*f2
return -np.abs(f3)
Evaluated at a point:
In [109]: holder(10,10)
Out[109]: -15.140223856952055
Evaluated on a grid:
In [60]: I,J = np.mgrid[-bd:bd:.01, -bd:bd:.01]
In [61]: H = holder(I,J)
Quick-n-dirty sympy
Diving into sympy without reading much of the docs:
In [65]: from sympy.abc import x,y
In [69]: import sympy as sp
In [70]: x
Out[70]: x
In [71]: x**2
Out[71]: x**2
In [72]: (x**2 + y**2)/sp.pi
Out[72]: (x**2 + y**2)/pi
In [73]: 1-(x**2 + y**2)/sp.pi
Out[73]: -(x**2 + y**2)/pi + 1
In [75]: from sympy import Abs
...
In [113]: h = -Abs(sp.sin(x)*sp.cos(y)*sp.exp(Abs(1-sp.sqrt(x**2 + y**2)/sp.pi)))
In [114]: float(h.subs(x,10).subs(y,10))
Out[114]: -15.140223856952053
Or with the sympify that DSM suggested
In [117]: h1 = sp.sympify("-Abs(sin(x)*cos(y)*exp(Abs(1-sqrt(x**2 + y**2)/pi)))")
In [118]: h1
Out[118]: -exp(Abs(sqrt(x**2 + y**2)/pi - 1))*Abs(sin(x)*cos(y))
In [119]: float(h1.subs(x,10).subs(y,10))
Out[119]: -15.140223856952053
...
In [8]: h1.subs({x:10, y:10}).n()
Out[8]: -15.1402238569521
Now how do I use this with numpy arrays?
Evaluating the result of sympy lambdify on a numpy mesgrid
In [22]: f = sp.lambdify((x,y), h1, [{'ImmutableMatrix': np.array}, "numpy"])
In [23]: z = f(I,J)
In [24]: z.shape
Out[24]: (2000, 2000)
In [25]: np.allclose(z,H)
Out[25]: True
In [26]: timeit holder(I,J)
1 loop, best of 3: 898 ms per loop
In [27]: timeit f(I,J)
1 loop, best of 3: 899 ms per loop
Interesting - basically the same speed.
Earlier answer along this line: How to read a system of differential equations from a text file to solve the system with scipy.odeint?

Why is my Numpy test code 2X slower than in Matlab

I've been developing a Fresnel coefficient based reflectivity solver in Python and I've hit a bit of a roadblock as the performance in Python + Numpy is 2X slower than in Matlab. I've distilled the problem code into a simple example to show the operation being performed in each case:
Python Code for test case:
import numpy as np
import time
def compare_fn(i):
a = np.random.rand(400)
vec = np.random.rand(400)
t = time.time()
for j in xrange(i):
a = (2.3 + a * np.exp(2j*vec))/(1 + (2.3 * a * np.exp(2j*vec)))
print (time.time()-t)
return a
a = compare_fn(200000)
Output: 10.7989997864
Equivalent Matlab code:
function a = compare_fn(i)
a = rand(1, 400);
vec = rand(1, 400);
tic
for m = 1:i
a = (2.3 + a .* exp(2j*vec))./(1 + (2.3 * a .* exp(2j*vec)));
end
toc
a = compare_fn(200000);
Elapsed time is 5.644673 seconds.
I'm stumped by this. I already have MKL installed (Anaconda Academic License). I would greatly appreciate any help in identifying the issue with my example if any and how I can achieve equivalent if not better performance using Numpy.
In general, I cannot parallelize the loop as solving the Fresnel coefficients for a multilayer involves a recursive calculation which can be expressed in the form of a loop as above.
The following is similar to unutbu's deleted answer, and for your sample input runs 3x faster on my system. It will probably also run faster if you implement it like this in Matlab, but that's a different story. To be able to use ipython's %timeit functionality I have rewritten your original function as:
def fn(a, vec, i):
for j in xrange(i):
a = (2.3 + a * np.exp(2j*vec))/(1 + (2.3 * a * np.exp(2j*vec)))
return a
And I have optimized it by removing the exponential calculation from the loop:
def fn_bis(a, vec, n):
exp_vec = np.exp(2j*vec)
for j in xrange(n):
a = (2.3 + a * exp_vec) / (1 + 2.3 * a * exp_vec)
return a
Taking both approaches for a test ride:
In [2]: a = np.random.rand(400)
In [3]: vec = np.random.rand(400)
In [9]: np.allclose(fn(a, vec, 100), fn_bis(a, vec, 100))
Out[9]: True
In [10]: %timeit fn(a, vec, 100)
100 loops, best of 3: 8.43 ms per loop
In [11]: %timeit fn_bis(a, vec, 100)
100 loops, best of 3: 2.57 ms per loop
In [12]: %timeit fn(a, vec, 200000)
1 loops, best of 3: 16.9 s per loop
In [13]: %timeit fn_bis(a, vec, 200000)
1 loops, best of 3: 5.25 s per loop
I've been doing a lot of experimenting to try and determine the source of the speed difference between Matlab and Python/Numpy for the example in my original question. Some of the key findings have been:
Matlab now has a JIT compiler that provides significant benefit in situations involving loops. Turning it off reduces performance by 2X making it similar in speed to the native Python + Numpy code.
feature accel off
a = compare_fn(200000);
Elapsed time is 9.098062 seconds.
I then began exploring options for optimizing my example function using Numba and Cython to see how much better I could do. The one significant finding for me was that Numba JIT optimization on an explicit looped calculation was faster than native vectorized math operations on Numpy arrays. I don't quite understand why this is the case, but I have included my sample code and timing for tests below. I also played with Cython (I'm no expert) and although it was also quicker, Numba was still 2X faster than Cython, so I ended up sticking with Numba for the tests.
Here is the code for 3 equivalent functions. First one is a Numba optimized function with an explicit loop to perform elementwise calculations. Second function is a Python+Numpy function relying on Numpy vectorization to perform calculations. The third function tries to use Numba to optimize the vectorized Numpy code (and fails to improve as you can see in the results). Lastly, I've included the Cython code, though I only tested it for one case.
import numpy as np
import numba as nb
#nb.jit(nb.complex128[:](nb.int16, nb.int16))
def compare_fn_jit(i, j):
a = np.asarray(np.random.rand(j), dtype=np.complex128)
vec = np.random.rand(j)
exp_term = np.exp(2j*vec)
for k in xrange(i):
for l in xrange(j):
a[l] = (2.3 + a[l] * exp_term[l])/(1 + (2.3 * a[l] * exp_term[l]))
return a
def compare_fn(i, j):
a = np.asarray(np.random.rand(j), dtype=np.complex128)
vec = np.random.rand(j)
exp_term = np.exp(2j*vec)
for k in xrange(i):
a = (2.3 + a * exp_term)/(1 + (2.3 * a * exp_term))
return a
compare_fn_jit2 = nb.jit(nb.complex128[:](nb.int16, nb.int16))(compare_fn)
import numpy as np
cimport numpy as np
cimport cython
#cython.boundscheck(False)
def compare_fn_cython(int i, int j):
cdef int k, l
cdef np.ndarray[np.complex128_t, ndim=1] a, vec, exp_term
a = np.asarray(np.random.rand(j), dtype=np.complex128)
vec = np.asarray(np.random.rand(j), dtype=np.complex128)
exp_term = np.exp(2j*vec)
for k in xrange(i):
for l in xrange(j):
a[l] = (2.3 + a[l] * exp_term[l])/(1 + (2.3 * a[l] * exp_term[l]))
return a
Timing Results:
i. Timing for a single outer loop - Demonstrates efficiency of vectorized calculations
%timeit -n 1 -r 10 compare_fn_jit(1,1000000) 1 loops, best of 10: 352
ms per loop
%timeit -n 1 -r 10 compare_fn(1,1000000) 1 loops, best of 10: 498 ms
per loop
%timeit -n 1 -r 10 compare_fn_jit2(1,1000000) 1 loops, best of 10: 497
ms per loop
%timeit -n 1 -r 10 compare_fn_cython(1,1000000) 1 loops, best of 10:
424 ms per loop
ii. Timing in extreme case of large loops with calculations on short arrays (expect Numpy+Python to perform poorly)
%timeit -n 1 -r 5 compare_fn_jit(1000000,40) 1 loops, best of 5: 1.44
s per loop
%timeit -n 1 -r 5 compare_fn(1000000,40) 1 loops, best of 5: 28.2 s
per loop
%timeit -n 1 -r 5 compare_fn_jit2(1000000,40) 1 loops, best of 5: 29 s
per loop
iii. Test for somewhere mid-way between the two cases above
%timeit -n 1 -r 5 compare_fn_jit(100000,400) 1 loops, best of 5: 1.4 s
per loop
%timeit -n 1 -r 5 compare_fn(100000,400) 1 loops, best of 5: 5.26 s
per loop
%timeit -n 1 -r 5 compare_fn_jit2(100000,400) 1 loops, best of 5: 5.34
s per loop
As you can see, using Numba can improve efficiency by a factor ranging from 1.5X - 30X for this particular case. I am truly impressed with how efficient it is and how easy it is to use and implement when compared against Cython.
I don't know if numpypy is far enough along yet for what you're doing, but you might try it.
http://buildbot.pypy.org/numpy-status/latest.html

Speeding up newton-raphson in pandas/python

I'm currently iterating through a very large set of data ~85GB (~600M lines) and simply using newton-raphson to compute a new parameter. As of right now my code is extremely slow, any tips on how to speed it up? The methods from BSCallClass & BSPutClass are closed-form, so there's nothing really to speed up there. Thanks.
class NewtonRaphson:
def __init__(self, theObject):
self.theObject = theObject
def solve(self, Target, Start, Tolerance, maxiter=500):
y = self.theObject.Price(Start)
x = Start
i = 0
while (abs(y - Target) > Tolerance):
i += 1
d = self.theObject.Vega(x)
x += (Target - y) / d
y = self.theObject.Price(x)
if i > maxiter:
x = nan
break
return x
def main():
for row in a.iterrows():
print row[1]["X.1"]
T = (row[1]["X.7"] - row[1]["X.8"]).days
Spot = row[1]["X.2"]
Strike = row[1]["X.9"]
MktPrice = abs(row[1]["X.10"]-row[1]["X.11"])/2
CPflag = row[1]["X.6"]
if CPflag == 'call':
option = BSCallClass(0, 0, T, Spot, Strike)
elif CPflag == 'put':
option = BSPutClass(0, 0, T, Spot, Strike)
a["X.15"][row[0]] = NewtonRaphson(option).solve(MktPrice, .05, .0001)
EDIT:
For those curious, I ended up speeding this entire process significantly by using the scipy suggestion, as well as using the multiprocessing module.
Don't code your own Newton-Raphson method in Python. You'll get better performance using one of the root finders in scipy.optimize such as brentq or newton.
(Presumably, if you have pandas, you'd also install scipy.)
Back of the envelope calculation:
Making 600M calls to brentq should be manageable on standard hardware:
import scipy.optimize as optimize
def f(x):
return x**2 - 2
In [28]: %timeit optimize.brentq(f, 0, 10)
100000 loops, best of 3: 4.86 us per loop
So if each call to optimize.brentq takes 4.86 microseconds, 600M calls will take about 4.86 * 600 ~ 3000 seconds ~ 1 hour.
newton may be slower, but still manageable:
def f(x):
return x**2 - 2
def fprime(x):
return 2*x
In [40]: %timeit optimize.newton(f, 10, fprime)
100000 loops, best of 3: 8.22 us per loop

Categories