Is there any possibility to vectorize this? - python

Currently I'm working on a project that implements cubic spline interpolation. So far I have managed to calculate coefficients for my equations.
Now I'm trying to return an interpolating function that for any x returns y.
Let's assume that we have
x = [1, 3, 5]
y = [6, -2, 4]
The coefficients that we get are as follows:
[ 6, -5.75, 0, 0.4375, -2, -0.5, 2.625, -0.4375]
It is equal to
[ a<sub>0</sub>, b<sub>0</sub>, c<sub>0</sub>, d<sub>0</sub>, a<sub>1</sub>, b<sub>1</sub>, c<sub>1</sub>, d<sub>1</sub>]
The interpolating polynomials are
S<sub>0</sub>(x) = a<sub>0</sub> + b<sub>0</sub>*x + c<sub>0</sub>*x<sup>2</sup> + d<sub>0</sub>*x<sup>3</sup> x ∈ [1, 3]
S<sub>1</sub>(x) = a<sub>1</sub> + b<sub>1</sub>*x + c<sub>1</sub>*x<sup>2</sup> + d<sub>1</sub>*x<sup>3</sup> x ∈ (3, 5]
And so on - it can be calculated for more than only 3 points
Right now I have implemented a method that works only if one x is given as an input.
def interpolate_spline(x, x_array, coefficients):
    # find the segment whose right knot is the first one >= x
    i = 1
    while x_array[i] < x:
        i += 1
    i = i - 1
    a = coefficients[4 * i]
    b = coefficients[4 * i + 1]
    c = coefficients[4 * i + 2]
    d = coefficients[4 * i + 3]
    return a + b * x + c * (x ** 2) + d * (x ** 3)
And coming back to my question: is there any way it can be vectorized, or at least take a whole array as input?
I don't know if it matters, but assume that x_array is sorted.
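One possible way to vectorize the lookup is np.searchsorted, which finds the segment index for every input at once. The sketch below is only an illustration: the name interpolate_spline_vectorized is made up here, it assumes x_array is sorted and that the coefficients are laid out as four values per segment, and it evaluates the polynomial in x exactly as the function above does. If the coefficients are instead relative to the left knot of each segment, subtract x_array[i] from x before evaluating.

```python
import numpy as np

def interpolate_spline_vectorized(x, x_array, coefficients):
    """Evaluate the piecewise cubic for a scalar or an array of x values."""
    x = np.asarray(x, dtype=float)
    x_array = np.asarray(x_array, dtype=float)
    # one row [a, b, c, d] per segment
    coeffs = np.asarray(coefficients, dtype=float).reshape(-1, 4)

    # index of the first knot >= x, minus one, is the segment index
    i = np.searchsorted(x_array, x, side="left") - 1
    # x equal to the first knot gives -1, so clamp into the valid range
    # (values outside the knots are evaluated with the nearest segment)
    i = np.clip(i, 0, len(coeffs) - 1)

    a, b, c, d = coeffs[i].T
    return a + b * x + c * x ** 2 + d * x ** 3

x_array = [1, 3, 5]
coefficients = [6, -5.75, 0, 0.4375, -2, -0.5, 2.625, -0.4375]
values = interpolate_spline_vectorized([1, 2, 3, 4, 5], x_array, coefficients)
```

Unlike the loop version, this does not raise an IndexError for x beyond the last knot; whether clamping is acceptable depends on the use case.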

Related

Distance between two lines in 3d

I'm writing a program for university in Python. I have to find the shortest distance between two lines in 3D, each given by two points (A, B and C, D), and find the points on both lines that are closest to each other. I'm bad at math, so I can't work out how to find the points; I only managed to find the formula for the minimal distance between two lines.
I tried to write a program, but it only finds the point of intersection between the lines and doesn't work correctly:
# line 1
A = [1, 3, 1]
B = [0, -1, 2]
# line 2
C = [0, -2, 3]
D = [1, 0, 2]

def line_intersection(a, b, c, d):
    v1 = [a_i - b_i for a_i, b_i in zip(B, A)]
    v2 = [a_i - b_i for a_i, b_i in zip(D, C)]
    if (-v1[0] * v2[1] + v1[1] * v2[0]) == 0:
        t = ((c[1]-a[1]) * (-v2[2]) + (v2[1]) * (c[2]-a[2]))/(-v1[1] * v2[2] + v1[2] * v2[1])
    else:
        t = ((c[0]-a[0]) * (-v2[1]) + (v2[0]) * (c[1]-a[1]))/(-v1[0] * v2[1] + v1[1] * v2[0])
    x = a[0] + t * v1[0]
    y = a[1] + t * v1[1]
    z = a[2] + t * v1[2]
    return x, y, z
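For the closest points themselves, one common approach is to write a point on each line as A + t(B - A) and C + s(D - C) and require the connecting segment to be perpendicular to both direction vectors, which gives a 2x2 linear system for t and s. The following is a minimal sketch of that idea, not the asker's method; the function name closest_points_between_lines and the eps tolerance are assumptions made for illustration.

```python
import numpy as np

def closest_points_between_lines(a, b, c, d, eps=1e-12):
    """Return (p, q, distance) for the infinite lines through a-b and c-d."""
    a, b, c, d = (np.asarray(pt, dtype=float) for pt in (a, b, c, d))
    u = b - a          # direction of line 1
    v = d - c          # direction of line 2
    w = a - c
    uu, uv, vv = u @ u, u @ v, v @ v
    uw, vw = u @ w, v @ w
    if uu == 0 or vv == 0:
        raise ValueError("each line needs two distinct points")

    denom = uu * vv - uv ** 2      # zero when the lines are parallel
    if denom <= eps * uu * vv:
        # parallel lines: any point works, so fix t = 0 and project onto line 2
        t = 0.0
        s = vw / vv
    else:
        t = (uv * vw - vv * uw) / denom
        s = (uu * vw - uv * uw) / denom

    p = a + t * u      # closest point on line 1
    q = c + s * v      # closest point on line 2
    return p, q, np.linalg.norm(p - q)

p, q, dist = closest_points_between_lines([1, 3, 1], [0, -1, 2], [0, -2, 3], [1, 0, 2])
```

If the lines intersect, p and q coincide and the distance is zero, so the intersection case falls out of the same formula.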

Turning my code into list comprehensions to allow unlimited data points

I am creating code that outputs a kernel prediction ynew. At first I had this code, which works but only for 3 value datasets:
def foobar(data):
    ([a,b,c],[d,e,f]) = data
    x = data[0]
    y = data[1]
    #part 1
    kh0 = math.e**(- ((x_new-x[0])**2) / (2*h) )
    kh1 = math.e**(- ((x_new-x[1])**2) / (2*h) )
    kh2 = math.e**(- ((x_new-x[2])**2) / (2*h) )
    #part 2
    w0 = kh0 / (kh0 + kh1 + kh2)
    w1 = kh1 / (kh0 + kh1 + kh2)
    w2 = kh2 / (kh0 + kh1 + kh2)
    #part 3
    ynew = (w0 * y[0]) + (w1 * y[1]) + (w2 * y[2])
    return ynew
I need to change my code to allow for unlimited data sets instead of just three. For the labeled part one above I changed it to:
k = [math.e**(-(x_new-val)**2 /(2*h)) for val in x]
Which is correct. Now I am having trouble changing part 2 and part 3. Here are my attempts for both:
part 2
w = [(val / (sum(k))) for val in k]
Edit: part 2 now resolved
part 3
ynew = sum(w[i] * y[i])
My part 3 attempt isn't correct because I am not sure how to write it so that it iterates through both the w values and the y values. I am not sure what I am doing wrong in part 2; I followed a structure similar to what I did in part 1, so I don't know what to change.
After one comment now I have part 2 corrected. I am still confused about part 3. My next try is:
ynew = (sum(val * y[i]) for val in w)
But how can I loop through the values in w as well as the values in y?
This is asking for NumPy:
import numpy as np
x_new = 7
h = 7
data = np.array([
[0, 1, 2],
[9, 8, 7],
])
x, y = data
kh = np.exp((-(x_new - x) ** 2) / (2 * h))
w = kh / np.sum(kh)
ynew = np.sum(w * y)
which allows you to extend your data as you like:
data = np.array([
[0, 1, 2, 7],
[9, 8, 7, 9],
])
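If NumPy is not an option, part 3 of the question can also be written in plain Python: zip pairs each weight with its y value so one generator expression can walk both lists together. The sketch below is illustrative only; the function name kernel_predict and passing x_new and h as parameters are choices made here, not part of the original code.

```python
import math

def kernel_predict(x, y, x_new, h):
    """Kernel-weighted prediction for any number of (x, y) pairs."""
    # part 1: kernel value for every data point
    k = [math.exp(-(x_new - xi) ** 2 / (2 * h)) for xi in x]
    # part 2: normalise so the weights sum to 1 (sum computed once)
    total = sum(k)
    w = [ki / total for ki in k]
    # part 3: zip walks w and y in parallel
    return sum(wi * yi for wi, yi in zip(w, y))

ynew = kernel_predict([0, 1, 2], [9, 8, 7], x_new=7, h=7)
```

Computing sum(k) once outside the comprehension avoids re-summing the list for every element.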

How to find the nth term of a generating function using sympy?

I have a rational function: f(x) = P(x)/Q(x).
For example:
f(x) = (5x + 3)/(1-x^2)
Because f(x) is a generating function it can be written as:
f(x) = a0 + a1*x + a2*x² + ... + a_n*x^n + ... = P(x)/Q(x)
How can I use sympy to find the nth term of the generating function f(x) (that is a_n)?
If there is no such implementation in SymPy, I am also curious to know whether this is implemented in other packages, such as Maxima.
I appreciate any help.
To get the general formula for a_n of a generating function in rational form, SymPy's rational_algorithm can be used.
For example:
from sympy import simplify
from sympy.abc import x, n
from sympy.series.formal import rational_algorithm
f = (5*x + 3)/(1-x**2)
func_n, independent_term, order = rational_algorithm(f, x, n, full=True)
print(f"The general formula for a_n is {func_n}")
for k in range(10):
    print(f"a_{k} = {simplify(func_n.subs(n, k))}")
Output:
The general formula for a_n is (-1)**(-n - 1) + 4
a_0 = 3
a_1 = 5
a_2 = 3
a_3 = 5
a_4 = 3
a_5 = 5
a_6 = 3
a_7 = 5
a_8 = 3
a_9 = 5
Here is another example:
f = x / (1 - x - 2 * x ** 2)
func_n, independent_term, order = rational_algorithm(f, x, n, full=True)
print(f"The general formula for a_n is {func_n.simplify()}")
print("First terms:", [simplify(func_n.subs(n, k)) for k in range(20)])
The general formula for a_n is 2**n/3 - (-1)**(-n)/3
First terms: [0, 1, 1, 3, 5, 11, 21, 43, 85, 171, 341, 683, 1365, 2731, 5461, 10923, 21845, 43691, 87381, 174763]
You could take the kth derivative and substitute 0 for x and divide by factorial(k):
>>> f = (5*x + 3) / (1-x**2)
>>> f.diff(x, 20).subs(x, 0)/factorial(20)
3
The reference here talks about rational generating functions. Looking for a recurrence you can see the pattern pretty quickly using differentiation:
[f.diff(x,i).subs(x,0)/factorial(i) for i in range(6)]
[3, 5, 3, 5, 3, 5]
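The recurrence mentioned above can also be applied directly. Multiplying out f(x)·Q(x) = P(x) and comparing the coefficients of x^n gives a_n = (p_n - Σ_{k≥1} q_k·a_{n-k}) / q_0, which needs no symbolic differentiation. Here is a small standard-library sketch under that assumption; the helper name series_coefficients and the lowest-degree-first coefficient lists are conventions chosen for this example.

```python
from fractions import Fraction

def series_coefficients(p, q, count):
    """First `count` power-series coefficients of P(x)/Q(x).

    p and q list polynomial coefficients from the constant term upward;
    q[0] must be non-zero.
    """
    if not q or q[0] == 0:
        raise ValueError("Q(x) needs a non-zero constant term")
    coeffs = []
    for n in range(count):
        p_n = Fraction(p[n]) if n < len(p) else Fraction(0)
        # contribution of the already known coefficients a_{n-k}
        known = sum(Fraction(q[k]) * coeffs[n - k]
                    for k in range(1, min(n, len(q) - 1) + 1))
        coeffs.append((p_n - known) / q[0])
    return coeffs

# (5x + 3) / (1 - x^2)  ->  P = 3 + 5x, Q = 1 + 0x - x^2
first_terms = series_coefficients([3, 5], [1, 0, -1], 6)
```

Each new coefficient costs at most deg(Q) multiplications, so a single far-out term such as a_50 is cheap to reach this way.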
Adapting the approach of this post, you could try the following:
from sympy import *
from sympy.abc import x
f = (5*x + 3) / (1-x**2)
print(f.series(n=20))
k = 50
coeff50 = Poly(series(f, x, n=k + 1).removeO(), x).coeff_monomial(x ** k)
print(f"The coefficient of x^{k} of the generating function of {f} is {coeff50}")
# to get the first 100 coefficients (reversing the list to get a[0] the
# coefficient of x**0 etc.):
a = Poly(series(f, x, n=100).removeO(), x).all_coeffs()[::-1]
Output:
3 + 5*x + 3*x**2 + 5*x**3 + 3*x**4 + 5*x**5 + 3*x**6 + 5*x**7 + 3*x**8 + 5*x**9 + 3*x**10 + 5*x**11 + 3*x**12 + 5*x**13 + 3*x**14 + 5*x**15 + 3*x**16 + 5*x**17 + 3*x**18 + 5*x**19 + O(x**20)
The coefficient of x^50 of the generating function of (5*x + 3)/(1 - x**2) is 3
Following this example at Cut The Knot, the approach can be used to find out the number of ways an amount n can be paid with coins of 1, 5, 10, 25 and 50 cents.
f = 1/((1 - x)*(1 - x**5)*(1 - x**10)*(1 - x**25)*(1 - x**50))
a = Poly(series(f, x, n=101).removeO(), x).all_coeffs()[::-1]
print(a[50]) # there are 50 ways to pay 50 cents
print(a[100]) # there are 292 ways to pay 100 cents
In maxima:
powerseries((5*x+3)/(1-x^2),x,0);
returns
Use part to extract the generator:
part(''%,1);
(4-(-1)^i1)x^i1
and coeff to get the coefficient:
a(i1) := coeff(''%, x, i1);
[a(0), a(1), a(2)];
[3, 5, 3]
Another nice way to approach this is to use the ring series:
>>> from sympy.polys.ring_series import rs_mul, rs_pow
>>> from sympy.polys.rings import ring
>>> from sympy import ZZ
>>> R,x=ring('x', ZZ)
>>> nmax = 100
>>> s = rs_mul(5*x+3, rs_pow(1-x**2, -1, x, nmax+1), x, nmax+1)
>>> [s.coeff(x**i) for i in (2,3,5,17,100)]
[3, 5, 5, 5, 3]

How to calculate auto-covariance in Python

I want to calculate the auto-covariance of 3 arrays X1, X2 and Y, which are all stationary random processes. Is there a function in SciPy or another library that can solve this problem?
Statsmodels has auto- and cross covariance functions
http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.stattools.acovf.html
http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.stattools.ccovf.html
plus the correlation functions and partial autocorrelation
http://statsmodels.sourceforge.net/devel/tsa.html#descriptive-statistics-and-tests
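According to the standard estimate of the autocovariance coefficient for discrete signals, which can be expressed by the equation:
$$c(k) = \frac{1}{N-1} \sum_{i=0}^{N-k-1} \left(x(i+k) - \bar{x}\right)\left(x(i) - \bar{x}\right)$$
...where x(i) is a given signal (i.e. a specific 1D vector), k stands for the shift of the x(i) signal by k samples, N is the length of the x(i) signal, and:
$$\bar{x} = \frac{1}{N} \sum_{i=0}^{N-1} x(i)$$
...which is the simple average, we can write: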
'''
Calculate the autocovariance coefficient.
'''
import numpy as np

Xi = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
N = np.size(Xi)
k = 5
Xs = np.average(Xi)

def autocovariance(Xi, N, k, Xs):
    autoCov = 0
    for i in np.arange(0, N-k):
        autoCov += ((Xi[i+k])-Xs)*(Xi[i]-Xs)
    return (1/(N-1))*autoCov

print("Autocovariance:", autocovariance(Xi, N, k, Xs))
If you would like to normalize the autocovariance coefficient, it becomes the autocorrelation coefficient, expressed as:
$$r(k) = \frac{c(k)}{c(0)}$$
...where c(k) is the autocovariance coefficient at lag k. Then you just have to add two lines to the above code:
def autocorrelation():
    return autocovariance(Xi, N, k, Xs) / autocovariance(Xi, N, 0, Xs)
Here is the full script:
'''
Calculate the autocovariance and autocorrelation coefficients.
'''
import numpy as np

Xi = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5])
N = np.size(Xi)
k = 5
Xs = np.average(Xi)

def autocovariance(Xi, N, k, Xs):
    autoCov = 0
    for i in np.arange(0, N-k):
        autoCov += ((Xi[i+k])-Xs)*(Xi[i]-Xs)
    return (1/(N-1))*autoCov

def autocorrelation():
    return autocovariance(Xi, N, k, Xs) / autocovariance(Xi, N, 0, Xs)

print("Autocovariance:", autocovariance(Xi, N, k, Xs))
print("Autocorrelation:", autocorrelation())
A small tweak to the previous answers, which avoids python for loops and uses numpy array operations instead. This will be quicker if you have a lot of data.
import numpy as np

def lagged_auto_cov(Xi,t):
    r"""
    for series of values x_i, length N, compute empirical auto-cov with lag t
    defined: 1/(N-1) * \sum_{i=0}^{N-t-1} ( x_i - x_s ) * ( x_{i+t} - x_s )
    """
    N = len(Xi)
    # use sample mean estimate from whole series
    Xs = np.mean(Xi)
    # construct copies of series shifted relative to each other,
    # with mean subtracted from values
    end_padded_series = np.zeros(N+t)
    end_padded_series[:N] = Xi - Xs
    start_padded_series = np.zeros(N+t)
    start_padded_series[t:] = Xi - Xs
    auto_cov = 1./(N-1) * np.sum( start_padded_series*end_padded_series )
    return auto_cov
Comparing this against @bluevoxel's code, using a time series of 50,000 data points and computing the auto-correlation for a single fixed lag, the Python for-loop code averaged about 30 milliseconds, while the NumPy array version averaged under 0.3 milliseconds (running on my laptop).
Get sample auto covariance:
# cov_auto_samp(X,delta)/cov_auto_samp(X,0) = auto correlation
def cov_auto_samp(X,delta):
    N = len(X)
    Xs = np.average(X)
    autoCov = 0.0
    times = 0.0
    for i in np.arange(0, N-delta):
        autoCov += (X[i+delta]-Xs)*(X[i]-Xs)
        times +=1
    return autoCov/times
@user333700 has the right answer. Using a library (such as statsmodels) is generally preferred over writing your own. However, it is insightful to implement your own at least once.
def _check_autocovariance_input(x):
    if len(x) < 2:
        raise ValueError('Need at least two elements to calculate autocovariance')

def get_autocovariance_given_lag(x, lag):
    _check_autocovariance_input(x)
    x_centered = x - np.mean(x)
    a = np.pad(x_centered, pad_width=(0, lag), mode='constant')
    b = np.pad(x_centered, pad_width=(lag, 0), mode='constant')
    return np.dot(a, b) / len(x)

def get_autocovariance(x):
    _check_autocovariance_input(x)
    x_centered = x - np.mean(x)
    return np.correlate(x_centered, x_centered, mode='full')[len(x) - 1:] / len(x)
The function I have get_autocovariance_given_lag calculates the autocovariance for a given lag.
If you are interested in all lags, the get_autocovariance can be used. The np.correlate function is what statsmodels uses under the hood. It calculates the cross correlation. This is a sliding dot product. For example, suppose the array is [1, 2, 3]. Then we get:
[1, 2, 3]        = 3 * 1 = 3
      [1, 2, 3]
[1, 2, 3]        = 2 * 1 + 3 * 2 = 8
   [1, 2, 3]
[1, 2, 3]        = 1 * 1 + 2 * 2 + 3 * 3 = 14
[1, 2, 3]
   [1, 2, 3]     = 2 * 1 + 3 * 2 = 8
[1, 2, 3]
      [1, 2, 3]  = 3 * 1 = 3
[1, 2, 3]
But note we are interested in the covariance that starts at lag 0. Where is this? Well, this occurs after we have moved N - 1 positions to the right where N is the length of the array. This is why we return the array starting at N-1.
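The slicing step can be checked on the same [1, 2, 3] example. This short sketch only illustrates the raw sliding dot product (no mean subtraction, no division by N); the variable names are chosen here for clarity.

```python
import numpy as np

x = np.array([1, 2, 3])

# mode="full" returns all 2N - 1 overlaps, from lag -(N-1) to lag +(N-1)
full = np.correlate(x, x, mode="full")

# lag 0 sits at index N - 1, so slicing from there keeps lags 0, 1, ..., N-1
non_negative_lags = full[len(x) - 1:]

print(full)               # [ 3  8 14  8  3]
print(non_negative_lags)  # [14  8  3]
```

Because the sequence is correlated with itself, the full result is symmetric around lag 0, which is why the negative lags can be dropped without losing information.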

Find area of polygon from xyz coordinates

I'm trying to use the shapely.geometry.Polygon module to find the area of polygons but it performs all calculations on the xy plane. This is fine for some of my polygons but others have a z dimension too so it's not quite doing what I'd like.
Is there a package which will either give me the area of a planar polygon from xyz coordinates, or alternatively a package or algorithm to rotate the polygon to the xy plane so that I can use shapely.geometry.Polygon().area?
The polygons are represented as a list of tuples in the form [(x1,y1,z1),(x2,y2,z2),...(xn,yn,zn)].
Here is the derivation of a formula for calculating the area of a 3D planar polygon
Here is Python code that implements it:
#determinant of matrix a
def det(a):
    return a[0][0]*a[1][1]*a[2][2] + a[0][1]*a[1][2]*a[2][0] + a[0][2]*a[1][0]*a[2][1] - a[0][2]*a[1][1]*a[2][0] - a[0][1]*a[1][0]*a[2][2] - a[0][0]*a[1][2]*a[2][1]

#unit normal vector of plane defined by points a, b, and c
def unit_normal(a, b, c):
    x = det([[1,a[1],a[2]],
             [1,b[1],b[2]],
             [1,c[1],c[2]]])
    y = det([[a[0],1,a[2]],
             [b[0],1,b[2]],
             [c[0],1,c[2]]])
    z = det([[a[0],a[1],1],
             [b[0],b[1],1],
             [c[0],c[1],1]])
    magnitude = (x**2 + y**2 + z**2)**.5
    return (x/magnitude, y/magnitude, z/magnitude)

#dot product of vectors a and b
def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

#cross product of vectors a and b
def cross(a, b):
    x = a[1] * b[2] - a[2] * b[1]
    y = a[2] * b[0] - a[0] * b[2]
    z = a[0] * b[1] - a[1] * b[0]
    return (x, y, z)

#area of polygon poly
def area(poly):
    if len(poly) < 3: # not a plane - no area
        return 0
    total = [0, 0, 0]
    for i in range(len(poly)):
        vi1 = poly[i]
        # compare by value: "is" on integers only works by accident
        if i == len(poly)-1:
            vi2 = poly[0]
        else:
            vi2 = poly[i+1]
        prod = cross(vi1, vi2)
        total[0] += prod[0]
        total[1] += prod[1]
        total[2] += prod[2]
    result = dot(total, unit_normal(poly[0], poly[1], poly[2]))
    return abs(result/2)
And to test it, here's a 10x5 rectangle that leans over:
>>> poly = [[0, 0, 0], [10, 0, 0], [10, 3, 4], [0, 3, 4]]
>>> poly_translated = [[0+5, 0+5, 0+5], [10+5, 0+5, 0+5], [10+5, 3+5, 4+5], [0+5, 3+5, 4+5]]
>>> area(poly)
50.0
>>> area(poly_translated)
50.0
>>> area([[0,0,0],[1,1,1]])
0
The problem originally was that I had oversimplified. It needs to calculate the unit vector normal to the plane. The area is half of the dot product of that and the total of all the cross products, not half of the sum of all the magnitudes of the cross products.
This can be cleaned up a bit (matrix and vector classes would make it nicer, if you have them, or standard implementations of determinant/cross product/dot product), but it should be conceptually sound.
This is the final code I've used. It doesn't use shapely, but implements Stokes' theorem to calculate the area directly. It builds on @Tom Smilack's answer, which shows how to do it without numpy.
import numpy as np

#unit normal vector of plane defined by points a, b, and c
def unit_normal(a, b, c):
    x = np.linalg.det([[1,a[1],a[2]],
                       [1,b[1],b[2]],
                       [1,c[1],c[2]]])
    y = np.linalg.det([[a[0],1,a[2]],
                       [b[0],1,b[2]],
                       [c[0],1,c[2]]])
    z = np.linalg.det([[a[0],a[1],1],
                       [b[0],b[1],1],
                       [c[0],c[1],1]])
    magnitude = (x**2 + y**2 + z**2)**.5
    return (x/magnitude, y/magnitude, z/magnitude)

#area of polygon poly
def poly_area(poly):
    if len(poly) < 3: # not a plane - no area
        return 0
    total = [0, 0, 0]
    N = len(poly)
    for i in range(N):
        vi1 = poly[i]
        vi2 = poly[(i+1) % N]
        prod = np.cross(vi1, vi2)
        total[0] += prod[0]
        total[1] += prod[1]
        total[2] += prod[2]
    result = np.dot(total, unit_normal(poly[0], poly[1], poly[2]))
    return abs(result/2)
# Python code for polygon area in 3D (optimised version)
import numpy as np

def polygon_area(poly):
    # shape (N, 3)
    if isinstance(poly, list):
        poly = np.array(poly)
    # vectors from the first vertex to every other vertex
    edges = poly[1:] - poly[0:1]
    # row-wise cross product
    cross_product = np.cross(edges[:-1],edges[1:], axis=1)
    # area of each fan triangle (unsigned, so this assumes a convex polygon)
    area = np.linalg.norm(cross_product, axis=1)/2
    return sum(area)

if __name__ == "__main__":
    poly = [[0+5, 0+5, 0+5], [10+5, 0+5, 0+5], [10+5, 3+5, 4+5], [0+5, 3+5, 4+5]]
    print(polygon_area(poly))
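Summing the magnitudes of the fan-triangle cross products, as above, counts every triangle as positive, which can overestimate the area of a non-convex polygon. A possible variation is to sum the cross-product vectors first and take the norm once at the end, so triangles that fold back cancel out. This is a sketch under the assumption that the vertices are coplanar and listed in order; the name planar_polygon_area is made up for the example.

```python
import numpy as np

def planar_polygon_area(poly):
    """Area of a planar, possibly non-convex polygon given as (N, 3) points."""
    pts = np.asarray(poly, dtype=float)
    if len(pts) < 3:  # not a plane - no area
        return 0.0
    # vectors from the first vertex to every other vertex
    edges = pts[1:] - pts[0]
    # signed fan-triangle contributions, summed as vectors
    cross_sum = np.cross(edges[:-1], edges[1:]).sum(axis=0)
    return 0.5 * np.linalg.norm(cross_sum)

tilted = [[5, 5, 5], [15, 5, 5], [15, 8, 9], [5, 8, 9]]
chevron = [[0, 0, 0], [2, 1, 0], [4, 0, 0], [2, 3, 0]]  # non-convex
print(planar_polygon_area(tilted))   # 50.0
print(planar_polygon_area(chevron))  # 4.0
```

For convex polygons both variants agree; they differ only when some fan triangle has the opposite orientation.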
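The area of a 2D polygon, given as an (N, 2) NumPy array of vertices, can be calculated using NumPy as a one-liner. The result is signed (negative for counter-clockwise vertex order), so take abs() if only the magnitude is needed:
poly_area = lambda vertices: np.sum([0.5, -0.5] * vertices * np.roll(np.roll(vertices, 1, axis=0), 1, axis=1))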
Fyi, here is the same algorithm in Mathematica, with a baby unit test
ClearAll[vertexPairs, testPoly, area3D, planeUnitNormal, pairwise];
pairwise[list_, fn_] := MapThread[fn, {Drop[list, -1], Drop[list, 1]}];
vertexPairs[Polygon[{points___}]] := Append[{points}, First[{points}]];
testPoly = Polygon[{{20, -30, 0}, {40, -30, 0}, {40, -30, 20}, {20, -30, 20}}];
planeUnitNormal[Polygon[{points___}]] :=
  With[{ps = Take[{points}, 3]},
    With[{p0 = First[ps]},
      With[{qs = (# - p0) & /@ Rest[ps]},
        Normalize[Cross @@ qs]]]];
area3D[p : Polygon[{polys___}]] :=
  With[{n = planeUnitNormal[p], vs = vertexPairs[p]},
    With[{areas = (Dot[n, #]) & /@ pairwise[vs, Cross]},
      Plus @@ areas/2]];
area3D[testPoly]
Same as @Tom Smilack's answer, but in JavaScript:
//determinant of matrix a
function det(a) {
return a[0][0] * a[1][1] * a[2][2] + a[0][1] * a[1][2] * a[2][0] + a[0][2] * a[1][0] * a[2][1] - a[0][2] * a[1][1] * a[2][0] - a[0][1] * a[1][0] * a[2][2] - a[0][0] * a[1][2] * a[2][1];
}
//unit normal vector of plane defined by points a, b, and c
function unit_normal(a, b, c) {
let x = det([
[1, a[1], a[2]],
[1, b[1], b[2]],
[1, c[1], c[2]]
]);
let y = det([
[a[0], 1, a[2]],
[b[0], 1, b[2]],
[c[0], 1, c[2]]
]);
let z = det([
[a[0], a[1], 1],
[b[0], b[1], 1],
[c[0], c[1], 1]
]);
let magnitude = Math.pow(Math.pow(x, 2) + Math.pow(y, 2) + Math.pow(z, 2), 0.5);
return [x / magnitude, y / magnitude, z / magnitude];
}
// dot product of vectors a and b
function dot(a, b) {
return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
// cross product of vectors a and b
function cross(a, b) {
let x = (a[1] * b[2]) - (a[2] * b[1]);
let y = (a[2] * b[0]) - (a[0] * b[2]);
let z = (a[0] * b[1]) - (a[1] * b[0]);
return [x, y, z];
}
// area of polygon poly
function area(poly) {
if (poly.length < 3) {
console.log("not a plane - no area");
return 0;
} else {
let total = [0, 0, 0]
for (let i = 0; i < poly.length; i++) {
var vi1 = poly[i];
if (i === poly.length - 1) {
var vi2 = poly[0];
} else {
var vi2 = poly[i + 1];
}
let prod = cross(vi1, vi2);
total[0] = total[0] + prod[0];
total[1] = total[1] + prod[1];
total[2] = total[2] + prod[2];
}
let result = dot(total, unit_normal(poly[0], poly[1], poly[2]));
return Math.abs(result/2);
}
}
Thanks for the detailed answers, but I am a little surprised there is no simple answer for getting the area.
So I am posting a simplified approach for calculating the area from the 3D coordinates of a polygon or surface using pyny3d.
#Install pyny3d as:
pip install pyny3d
#Calculate area
import numpy as np
import pyny3d.geoms as pyny
coords_3d = np.array([[0, 0, 0],
[7, 0, 0],
[7, 10, 2],
[0, 10, 2]])
polygon = pyny.Polygon(coords_3d)
print(f'Area is : {polygon.get_area()}')
