Coefficient matrix and loops challenge - python

I'm trying to solve the equation for T(p,q) as shown more clearly in the attached image.
Where:
p = 0.60
q = 0.45
M - coefficient matrix with 3 rows and 6 columns
I created five matrices and put them within their own functions in order to later call them in the while loop. However, the loop doesn't cycle through the various values of i.
How can I get the loop to work or is there another/better way I can solve the following equation?
(FYI this is approx my third day ever working with Python and coding)
import numpy as np
def M1(i):
M = np.array([[1,3,0,4,0,0],[3,3,1,4,0,0],[4,4,0,3,1,0]])
return M[i-1,0]
def M2(i):
M = np.array([[1,3,0,4,0,0],[3,3,1,4,0,0],[4,4,0,3,1,0]])
return M[i-1,1]
def M3(i):
M = np.array([[1,3,0,4,0,0],[3,3,1,4,0,0],[4,4,0,3,1,0]])
return M[i-1,2]
def M4(i):
M = np.array([[1,3,0,4,0,0],[3,3,1,4,0,0],[4,4,0,3,1,0]])
return M[i-1,3]
def M5(i):
M = np.array([[1,3,0,4,0,0],[3,3,1,4,0,0],[4,4,0,3,1,0]])
return M[i-1,4]
def T(p,q):
sum_i = 0
i = 1
while i <=5:
sum_i = sum_i + ((M1(i)*p**M2(i))*((1-p)**M3(i))*(q**M4(i))*((1-q)**M5(i)))
i = i +1
return sum_i
print(T(0.6,0.45))
"""I printed the below equation (using a single value for i) to test if the above loop is working and since I get the same answer as the loop, I can see that the loop is not cycling through the various values of i as expected"""
i=1
p=0.6
q=0.45
print(((M1(i)*p**M2(i))*((1-p)**M3(i))*(q**M4(i))*((1-q)**M5(i))))

return is placed inside while loop, you need to change the code a bit
while i <=5:
sum_i = sum_i + ((M1(i)*p**M2(i))*((1-p)**M3(i))*(q**M4(i))*((1-q)**M5(i)))
i = i +1
return sum_i

The real power with numpy is to look at computations like these and try to understand what is repeatable and what structure they have. Once you find operations that are similar or parallel, try to set up your numpy function calls so that they can be done element-wise in parallel with one call.
For instance, inside the typical element of the sum, there are four things being explicitly raised to a power (a fifth if you consider M(i, 1)^1). In one function call, you can perform all four of these exponentiations in parallel with one function call if you arrange your arrays smartly:
M = np.array([[1,3,0,4,0,0],[3,3,1,4,0,0],[4,4,0,3,1,0]])
ps_and_qs = np.array([[p, (1-p), q, (1-q)]])
a = np.power(ps_and_qs, M[:,1:5])
Now a will be populated with a 3 x 4 matrix with all of your exponentiations.
Now the next step is how to reduce these. There are several reduction functions that are built into numpy that are efficiently implemented with vectorized loops where possible. They can speed up your code quite a bit. In particular, there is both a product reduction as well as a sum reduction. With your equation, we need to first multiply across the rows to get one number per row and then sum across the remaining column like this:
b = M[:,:1] * np.product(a,axis=1)
c = np.sum(b, axis=0)
c should now be a scalar equal to T evaluated at (p,q). That is a lot to take in for a third day, but something to consider if you continue to use numpy for numerical analysis on bigger projects that need better performance.

Related

With Python: My recursive function is way to slow

I want to program the following (I've just start to learn python):
f[i]:=f[i-1]-(1/n)*(1-(1-f[i-1])^n)-(1/n)*(f[i-1])^n+(2*f[0]/n);
with F[0]=x, x belongs to [0,1] and n a constant integer.
My try:
import pylab as pl
import numpy as np
N=20
n=100
h=0.01
T=np.arange(0, 1+h, h)
def f(i):
if i == 0:
return T
else:
return f(i-1)-(1./n)*(1-(1-f(i-1))**n)-(1./n)*(f(i-1))**n+2.*T/n
pl.figure(figsize=(10, 6), dpi=80)
pl.plot(T,f(N), color="red",linestyle='--', linewidth=2.5)
pl.show()
For N=10 (number of iterations) it returns the correct plot fast enough, but for N=20 it keeps running and running (more than 30 minutes already).
The reason why your run time is so slow is the fact that, like the simplistic calculation of the nth fibonacci number it runs in exponential time (in this case 3^n). To see this, before F[i] can return it's value, it has to call f[i-1] 3 times, but then each of those has to call F[i-2] 3 times (3*3 calls), and then each of those has to call F[i-3] 3 times (3*3*3 calls), and so on. In this example, as others have shown, this can be calculated simply in linear time. That you see it slow for N = 20 is because your function has to be called 3^20 = 3486784401 times before you get the answer!
You calculate f(i-1) three times in a single recursion layer - so after the first run you "know" the answer but still calculate it two more times. A naive approach:
fi_1 = f(i-1)
return fi_1-(1./n)*(1-(1-fi_1)**n)-(1./n)*(fi_1)**n+2.*T/n
But of course we can still do better and cache every evaluation of f:
cache = {}
def f_cached(i):
if not i in cache:
cache[i] = f(i)
return(cache[i])
Then replace every every occurence of f with f_cached.
There are also libraries out there that can do that for you automatically (with a decorator).
While recursion often yields nice and easy formulas, python is not that good at evaluating them (see tail recursion). You are probably better off with rewriting it in a iterativ way and calculate that.
First of all you are calculating f[i-1] three times when you can save it's result in some variable and calculate it only once :
t = f(i-1)
return t-(1./n)*(1-(1-t)**n)-(1./n)*(t)**n+2.*T/n
It will increase the speed of the program, but I would also like to recommend to calculate f without using recursion.
fs = T
for i in range(1,N+1):
tmp = fs
fs = (tmp-(1./n)*(1-(1-tmp)**n)-(1./n)*(tmp)**n+2.*T/n)

Allowing for deviations in exact values during matrix multiplication, python

I need to solve this:
Check if AT * n * A = n, where A is the test matrix, AT is the transposed test matrix and n = [[1,0,0,0],[0,-1,0,0],[0,0,-1,0],[0,0,0,-1]].
I don't know how to check for equality due to the numerical errors in the float multiplication. How do I go about doing this?
Current code:
def trans(A):
n = numpy.matrix([[1,0,0,0],[0,-1,0,0],[0,0,-1,0],[0,0,0,-1]])
c = numpy.matrix.transpose(A) * n * numpy.matrix(A)
Have then tried
>if c == n:
return True
I have also tried assigning variables to every element of matrix and then checking that each variable is within certain limits.
Typically, the way that numerical-precision limitations are overcome is by allowing for some epsilon (or error-value) between the actual value and expected value that is still considered 'equal'. For example, I might say that some value a is equal to some value b if they are within plus/minus 0.01. This would be implemented in python as:
def float_equals(a, b, epsilon):
return abs(a-b)<epsilon
Of course, for matrixes entered as lists, this isn't quite so simple. We have to check if all values are within the epsilon to their partner. One example solution would be as follows, assuming your matrices are standard python lists:
from itertools import product # need this to generate indexes
def matrix_float_equals(A, B, epsilon):
return all(abs(A[i][j]-B[i][j])<epsilon for i,j in product(xrange(len(A)), repeat = 2))
all returns True iff all values in a list are True (list-wise and). product effectively dot-products two lists, with the repeat keyword allowing easy duplicate lists. Therefore given a range repeated twice, it will produce a list of tuples for each index. Of course, this method of index generation assumes square, equally-sized matrices. For non-square matrices you have to get more creative, but the idea is the same.
However, as is typically the way in python, there are libraries that do this kind of thing for you. Numpy's allclose does exactly this; compares two numpy arrays for equality element-wise within some tolerance. If you're working with matrices in python for numeric analysis, numpy is really the way to go, I would get familiar with its basic API.
If a and b are numpy arrays or matrices of the same shape, then you can use allclose:
if numpy.allclose(a, b): # a is approximately equal to b
# do something ...
This checks that for all i and all j, |aij - bij| < εa for some absolute error εa (by default 10-5) and that |aij - bij| < |bij| εr for some relative error εr (by default 10-8). Thus it is safe to use, even if your calculations introduce numerical errors.

Writing a double sum in Python

I am new to StackOverflow, and I am extremely new to Python.
My problem is this... I am needing to write a double-sum, as follows:
The motivation is that this is the angular correction to the gravitational potential used for the geoid.
I am having difficulty writing the sums. And please, before you say "Go to such-and-such a resource," or get impatient with me, this is the first time I have ever done coding/programming/whatever this is.
Is this a good place to use a "for" loop?
I have data for the two indices (n,m) and for the coefficients c_{nm} and s_{nm} in a .txt file. Each of those items is a column. When I say usecols, do I number them 0 through 3, or 1 through 4?
(the equation above)
\begin{equation}
V(r, \phi, \lambda) = \sum_{n=2}^{360}\left(\frac{a}{r}\right)^{n}\sum_{m=0}^{n}\left[c_{nm}*\cos{(m\lambda)} + s_{nm}*\sin{(m\lambda)}\right]*\sqrt{\frac{(n-m)!}{(n+m)!}(2n + 1)(2 - \delta_{m0})}P_{nm}(\sin{\lambda})
\end{equation}
(2) Yes, a "for" loop is fine. As #jpmc26 notes, a generator expression is a good alternative to a "for" loop. IMO, you'll want to use numpy if efficiency is important to you.
(3) As #askewchan notes, "usecols" refers to an argument of genfromtxt; as specified in that documentation, column indexes start at 0, so you'll want to use 0 to 3.
A naive implementation might be okay since the larger factorial is the denominator, but I wouldn't be surprised if you run into numerical issues. Here's something to get you started. Note that you'll need to define P() and a. I don't understand how "0 through 3" relates to c and s since their indexes range much further. I'm going to assume that each (and delta) has its own file of values.
import math
import numpy
c = numpy.getfromtxt("the_c_file.txt")
s = numpy.getfromtxt("the_s_file.txt")
delta = numpy.getfromtxt("the_delta_file.txt")
def V(r, phi, lam):
ret = 0
for n in xrange(2, 361):
for m in xrange(0, n + 1):
inner = c[n,m]*math.cos(m*lam) + s[n,m]*math.sin(m*lam)
inner *= math.sqrt(math.factorial(n-m)/math.factorial(n+m)*(2*n+1)*(2-delta[m,0]))
inner *= P(n, m, math.sin(lam))
ret += math.pow(a/r, n) * inner
return ret
Make sure to write unittests to check the math. Note that "lambda" is a reserved word.

Interpolation of sin(x) using Python

I am working on a homework problem for which I am supposed to make a function that interpolates sin(x) for n+1 interpolation points and compares the interpolation to the actual values of sin at those points. The problem statement asks for a function Lagrangian(x,points) that accomplishes this, although my current attempt at executing it does not use 'x' and 'points' in the loops, so I think I will have to try again (especially since my code doesn't work as is!) However, why I can't I access the items in the x_n array with an index, like x_n[k]? Additionally, is there a way to only access the 'x' values in the points array and loop over those for L_x? Finally, I think my 'error' definition is wrong, since it should also be an array of values. Is it necessary to make another for loop to compare each value in the 'error' array to 'max_error'? This is my code right now (we are executing in a GUI our professor made, so I think some of the commands are unique to that such as messages.write()):
def problem_6_run(problem_6_n, problem_6_m, plot, messages, **kwargs):
n = problem_6_n.value
m = problem_6_m.value
messages.write('\n=== PROBLEM 6 ==========================\n')
x_n = np.linspace(0,2*math.pi,n+1)
y_n = np.sin(x_n)
points = np.column_stack((x_n,y_n))
i = 0
k = 1
L_x = 1.0
def Lagrange(x, points):
for i in n+1:
for k in n+1:
return L_x = (x- x_n[k] / x_n[i] - x_n[k])
return Lagrange = y_n[i] * L_x
error = np.sin(x) - Lagrange
max_error = 0
if error > max_error:
max_error = error
print.messages('Maximum error = &g' % max_error)
plot.draw_lines(n+1,np.sin(x))
plot.draw_points(m,Lagrange)
plots.draw_points(m,error)
Edited:
Yes, the different things ThiefMaster mentioned are part of my (non CS) professor's environment; and yes, voithos, I'm using numpy and at this point have definitely had more practice with Matlab than Python (I guess that's obvious!). n and m are values entered by the user in the GUI; n+1 is the number of interpolation points and m is the number of points you plot against later.
Pseudocode:
Given n and m
Generate x_n a list of n evenly spaced points from 0 to 2*pi
Generate y_n a corresponding list of points for sin(x_n)
Define points, a 2D array consisting of these ordered pairs
Define Lagrange, a function of x and points
for each value in the range n+1 (this is where I would like to use points but don't know how to access those values appropriately)
evaluate y_n * (x - x_n[later index] / x_n[earlier index] - x_n[later index])
Calculate max error
Calculate error interpolation Lagrange - sin(x)
plot sin(x); plot Lagrange; plot error
Does that make sense?
Some suggestions:
You can access items in x_n via x_n[k] (to answer your question).
Your loops for i in n+1: and for k in n+1: only execute once each, one with i=n+1 and one with k=n+1. You need to use for i in range(n+1) (or xrange) to get the whole list of values [0,1,2,...,n].
in error = np.sin(x) - Lagrange: You haven't defined x anywhere, so this will probably result in an error. Did you mean for this to be within the Lagrange function? Also, you're subtracting a function (Lagrange) from a number np.sin(x), which isn't going to end well.
When you use the return statement in your def Lagrange you are exiting your function. So your loop will never loop more than once because you're returning out of the function. I think you might actually want to store those values instead of returning them.
Can you write some pseudocode to show what you'd like to do? e.g.:
Given a set of points `xs` and "interpolated" points `ys`:
For each point (x,y) in (xs,ys):
Calculate `sin(x)`
Calculate `sin(x)-y` being the difference between the function and y
.... etc etc
This will make the actual code easier for you to write, and easier for us to help you with (especially if you intellectually understand what you're trying to do, and the only problem is with converting that into python).
So : try fix up some of these points in your code, and try write some pseudocode to say what you want to do, and we'll keep helping you :)

Methods for quickly calculating standard deviation of large number set in Numpy

What's the best(fastest) way to do this?
This generates what I believe is the correct answer, but obviously at N = 10e6 it is painfully slow. I think I need to keep the Xi values so I can correctly calculate the standard deviation, but are there any techniques to make this run faster?
def randomInterval(a,b):
r = ((b-a)*float(random.random(1)) + a)
return r
N = 10e6
Sum = 0
x = []
for sample in range(0,int(N)):
n = randomInterval(-5.,5.)
while n == 5.0:
n = randomInterval(-5.,5.) # since X is [-5,5)
Sum += n
x = np.append(x, n)
A = Sum/N
for sample in range(0,int(N)):
summation = (x[sample] - A)**2.0
standard_deviation = np.sqrt((1./N)*summation)
You made a decent attempt, but should make sure you understand this and don't copy explicitly since this is HW
import numpy as np
N = int(1e6)
a = np.random.uniform(-5,5,size=(N,))
standard_deviation = np.std(a)
This assumes you can use a package like numpy (you tagged it as such). If you can, there are a whole host of methods that allow you to create and do operations on arrays of data, thus avoiding explicit looping (it's done under the hood in an efficient manner). It would be good to take a look at the documentation to see what features are available and how to use them:
http://docs.scipy.org/doc/numpy/reference/index.html
Using the formulas found on this wiki page for Variance, you could compute it in one loop without storing a list of the random numbers (assuming you didn't need them elsewhere).

Categories