I am looking for a simple approach to finding the interpolated intersection between two Numpy arrays. I know that this is simply achieved if we have two function handles, rather than two arrays, as shown in this link using Scipy or using Sympy. I want to do the same, but given two arrays, specifically between the linear spline that results from connecting array entries by lines.
For example, suppose we have two arrays, y_1 and y_2, both thyought of as being evaluated at xSupport.
import numpy as np
xSupport = np.array([0,1])
y_1 = np.array([0,2])
y_2 = np.array([1,0])
I am looking for the function that returns 1/3, which is the x value at the intersections between these two lines. In my application the support is larger than two, so I am looking for an approach that is independent of the arrays' length.
Along the same lines as ser's answer:
import numpy as np
x = np.array([0,1])
y1 = np.array([0,2])
y2 = np.array([1,0])
def solve(f,x):
s = np.sign(f)
z = np.where(s == 0)[0]
if z:
return z
else:
s = s[0:-1] + s[1:]
z = np.where(s == 0)[0]
return z
def interp(f,x,z):
m = (f[z+1] - f[z]) / (x[z+1] - x[z])
return x[z] - f[z]/m
f = y1-y2
z = solve(f,x)
ans = interp(f,x,z)
print(ans)
The problem can be simplified by assuming that you're finding a zero, and then performing the function on the difference of the two series. First, 'solve' finds where a sign transition occurs (implying a zero occurs somewhere in between) and then 'interp' performs a linear interpolation to find the solution.
Over in Digitizing an analog signal, I created a function called find_transition_times . You could use that function by passing y_1 - y_2 for y and 0 for threshold:
In [5]: xSupport = np.array([0,1])
...: y_1 = np.array([0,2])
...: y_2 = np.array([1,0])
...:
In [6]: find_transition_times(xSupport, y_1 - y_2, 0)
Out[6]: array([ 0.33333333])
Related
I am trying to write a simple covariance matrix function in Python.
import numpy as np
def manual_covariance(x):
mean = x.mean(axis=1)
print(x.shape[1])
cov = np.zeros((len(x), len(x)), dtype='complex64')
for i in range(len(mean)):
for k in range(len(mean)):
s = 0
for j in range(len(x[1])): # 5 Col
s += np.dot((x[i][j] - mean[i]), (x[k][j] - mean[i]))
cov[i, k] = s / ((x.shape[1]) - 1)
return cov
With this function if I compute the covariance of:
A = np.array([[1, 2], [1, 5]])
man_cov = manual_covariance(A)
num_cov = np.cov(A)
My answer matches with the np.cov(), and there is no problem. But, when I use complex number instead, my answer does not match with np.cov()
A = np.array([[1+1j, 1+2j], [1+4j, 5+5j]])
man_cov = manual_covariance(A)
num_cov = cov(A)
Manual result:
[[-0.5+0.j -0.5+2.j]
[-0.5+2.j 7.5+4.j]]
Numpy cov result:
[[0.5+0.j 0.5+2.j]
[0.5-2.j 8.5+0.j]]
I have tried printing every statement, to check where it can go wrong, but I am not able to find a fault.
It is because the dot product of two complex vectors z1 and z2 is defined as z1 ยท z2*, where * means conjugation. If you use s += np.dot((x[i,j] - mean[i]), np.conj(x[k,j] - mean[i])) you should get the correct result, where we have used Numpy's conjugate function.
I'm trying to sum a two dimensional function using the array method, somehow, using a for loop is not outputting the correct answer. I want to find (in latex) $$\sum_{i=1}^{M}\sum_{j=1}^{M_2}\cos(i)\cos(j)$$ where according to Mathematica the answer when M=5 is 1.52725. According to the for loop:
def f(N):
s1=0;
for p1 in range(N):
for p2 in range(N):
s1+=np.cos(p1+1)*np.cos(p2+1)
return s1
print(f(4))
is 0.291927.
I have thus been trying to use some code of the form:
def f1(N):
mat3=np.zeros((N,N),np.complex)
for i in range(0,len(mat3)):
for j in range(0,len(mat3)):
mat3[i][j]=np.cos(i+1)*np.cos(j+1)
return sum(mat3)
which again
print(f1(4))
outputs 0.291927. Looking at the array we should find for each value of i and j a matrix of the form
mat3=[[np.cos(1)*np.cos(1),np.cos(2)*np.cos(1),...],[np.cos(2)*np.cos(1),...]...[np.cos(N+1)*np.cos(N+1)]]
so for N=4 we should have
mat3=[[np.cos(1)*np.cos(1) np.cos(2)*np.cos(1) ...] [np.cos(2)*np.cos(1) ...]...[... np.cos(5)*np.cos(5)]]
but what I actually get is the following
mat3=[[0.29192658+0.j 0.+0.j 0.+0.j ... 0.+0.j] ... [... 0.+0.j]]
or a matrix of all zeros apart from the mat3[0][0] element.
Does anybody know a correct way to do this and get the correct answer? I chose this as an example because the problem I'm trying to solve involves plotting a function which has been summed over two indices and the function that python outputs is not the same as Mathematica (i.e., a function of the form $$f(E)=\sum_{i=1}^{M}\sum_{j=1}^{M_2}F(i,j,E)$$).
The return statement is not indented correctly in your sample code. It returns immediately in the first loop iteration. Indent it on the function body instead, so that both for loops finish:
def f(N):
s1=0;
for p1 in range(N):
for p2 in range(N):
s1+=np.cos(p1+1)*np.cos(p2+1)
return s1
>>> print(f(5))
1.527247272700347
I have moved your code to a more numpy-ish version:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
x = x.reshape((-1, 1))
y = y.reshape((1, -1))
mat = np.cos(x) * np.cos(y)
print(mat.sum()) # 1.5272472727003474
The trick here is to reshape x to a column and y to a row vector. If you multiply them, they are matched up like in your loop.
This should be more performant, since cos() is only called 2*N times. And it avoids loops (bad in python).
UPDATE (regarding your comment):
This pattern can be extended in any dimension. Basically, you get something like a crossproduct. Where every instance of x is matched up with every instance of y, z, u, k, ... Along the corresponding dimensions.
It's a bit confusing to describe, so here is some more code:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
z = np.arange(N) + 1
x = x.reshape((-1, 1, 1))
y = y.reshape((1, -1, 1))
z = z.reshape((1, 1, -1))
mat = z**2 * np.cos(x) * np.cos(y)
# x along first axis
# y along second, z along third
# mat[0, 0, 0] == 1**2 * np.cos(1) * np.cos(1)
# mat[0, 4, 2] == 3**2 * np.cos(1) * np.cos(5)
If you use this for many dimensions, and big values for N, you will run into memory problems, though.
I have two given arrays: x and y. I want to calculate correlation coefficient between two arrays as follows:
import numpy as np
from scipy.stats import pearsonr
x = np.array([[[1,2,3,4],
[5,6,7,8]],
[[11,22,23,24],
[25,26,27,28]]])
i,j,k = x.shape
y = np.array([[[31,32,33,34],
[35,36,37,38]],
[[41,42,43,44],
[45,46,47,48]]])
xx = np.row_stack(np.dstack(x))
yy = np.row_stack(np.dstack(y))
results = []
for a, b in zip(xx,yy):
r_sq, p_val = pearsonr(a, b)
results.append(r_sq)
results = np.array(results).reshape(j,k)
print results
[[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]]
The answer is correct. However, would like to know if there are better and faster ways of doing it using numpy and/or scipy.
An alternate way (not necessarily better) is:
xx = x.reshape(2,-1).T # faster, minor issue though
yy = y.reshape(2,-1).T
results = [pearsonr(a,b)[0] for a,b in zip(xx,yy)]
results = np.array(results).reshape(x.shape[1:])
Another current thread was discussing the use of list comprehensions to iterate over values of an array(s): Confusion about numpy's apply along axis and list comprehensions
As discussed there, an alternative is to initialize results, and fill in values during the iteration. That's probably faster for really large cases, but for modest ones, this
np.array([... for .. in ...])
is reasonable.
The deeper question is whether pearsonr, or some alternative, can calculate this correlation for many pairs, rather than just one pair. That may require studying the internals of pearsonr, or other functions in stats.
Here's a first cut at vectorizing stats.pearsonr:
def pearsonr2(a,b):
# stats.pearsonr adapted for
# x and y are (N,2) arrays
n = x.shape[1]
mx = x.mean(1)
my = y.mean(1)
xm, ym = x-mx[:,None], y-my[:,None]
r_num = np.add.reduce(xm * ym, 1)
r_den = np.sqrt(stats.ss(xm,1) * stats.ss(ym,1))
r = r_num / r_den
r = np.clip(r, -1.0, 1.0)
return r
print pearsonr2(xx,yy)
It matches your case, though these test values don't really exercise the function. I just took the pearsonr code, added the axis=1 parameter in most of the lines, and made sure everything ran. The prob step could be included with some boolean masking.
(I can add the stats.pearsonr code to my answer if needed).
This version will take any dimension a,b (as long as they are the same), and do your pearsonr calc along the designated axis. No reshaping needed.
def pearsonr_flex(a,b, axis=1):
# stats.pearsonr adapted for
# x and y are (N,2) arrays
n = x.shape[axis]
mx = x.mean(axis, keepdims=True)
my = y.mean(axis, keepdims=True)
xm, ym = x-mx, y-my
r_num = np.add.reduce(xm * ym, axis)
r_den = np.sqrt(stats.ss(xm, axis) * stats.ss(ym, axis))
r = r_num / r_den
r = np.clip(r, -1.0, 1.0)
return r
pearsonr_flex(xx, yy, 1)
preasonr_flex(x, y, 0)
I have two gaussian plots:
x = np.linspace(-5,9,10000)
plot1=plt.plot(x,mlab.normpdf(x,2.5,1))
plot2=plt.plot(x,mlab.normpdf(x,5,1))
and I want to find the point at where the two curves intersect. Is there a way of doing this? In particular I want to find the value of the x-coordinate where they meet.
You want to find the x's such that both gaussian functions have the same height.(i.e intersect)
You can do so by equating two gaussian functions and solve for x. In the end you will get a quadratic equation with coefficients relating to the gaussian means and variances. Here is the final result:
import numpy as np
def solve(m1,m2,std1,std2):
a = 1/(2*std1**2) - 1/(2*std2**2)
b = m2/(std2**2) - m1/(std1**2)
c = m1**2 /(2*std1**2) - m2**2 / (2*std2**2) - np.log(std2/std1)
return np.roots([a,b,c])
m1 = 2.5
std1 = 1.0
m2 = 5.0
std2 = 1.0
result = solve(m1,m2,std1,std2)
The output is :
array([ 3.75])
You can plot the found intersections:
x = np.linspace(-5,9,10000)
plot1=plt.plot(x,mlab.normpdf(x,m1,std1))
plot2=plt.plot(x,mlab.normpdf(x,m2,std2))
plot3=plt.plot(result,mlab.normpdf(result,m1,std1),'o')
The plot will be:
If your gaussians have multiple intersections, the code will also find all of them(say m1=2.5, std1=3.0, m2=5.0, std2=1.0):
Here's a solution based on purely numpy that is also applicable to curves other than Gaussian.
def get_intersection_locations(y1,y2,test=False,x=None):
"""
return indices of the intersection point/s.
"""
idxs=np.argwhere(np.diff(np.sign(y1 - y2))).flatten()
if test:
x=range(len(y1)) if x is None else x
plt.figure(figsize=[2.5,2.5])
ax=plt.subplot()
ax.plot(x,y1,color='r',label='line1',alpha=0.5)
ax.plot(x,y2,color='b',label='line2',alpha=0.5)
_=[ax.axvline(x[i],color='k') for i in idxs]
_=[ax.text(x[i],ax.get_ylim()[1],f"{x[i]:1.1f}",ha='center',va='bottom') for i in idxs]
ax.legend(bbox_to_anchor=[1,1])
ax.set(xlabel='x',ylabel='density')
return idxs
# single intersection
x = np.arange(-10, 10, 0.001)
y1=sc.stats.norm.pdf(x,-2,2)
y2=sc.stats.norm.pdf(x,2,3)
get_intersection_locations(y1=y1,y2=y2,x=x,test=True) # returns indice/s array([10173])
# double intersection
x = np.arange(-10, 10, 0.001)
y1=sc.stats.norm.pdf(x,-2,1)
y2=sc.stats.norm.pdf(x,2,3)
get_intersection_locations(y1=y1,y2=y2,x=x,test=True)
Based on an answer to a similar question.
In numpy, the numpy.dot() function can be used to calculate the matrix product of two 2D arrays. I have two 3D arrays X and Y (say), and I'd like to calculate the matrix Z where Z[i] == numpy.dot(X[i], Y[i]) for all i. Is this possible to do non-iteratively?
How about:
from numpy.core.umath_tests import inner1d
Z = inner1d(X,Y)
For example:
X = np.random.normal(size=(10,5))
Y = np.random.normal(size=(10,5))
Z1 = inner1d(X,Y)
Z2 = [np.dot(X[k],Y[k]) for k in range(10)]
print np.allclose(Z1,Z2)
returns True
Edit Correction since I didn't see the 3D part of the question
from numpy.core.umath_tests import matrix_multiply
X = np.random.normal(size=(10,5,3))
Y = np.random.normal(size=(10,3,5))
Z1 = matrix_multiply(X,Y)
Z2 = np.array([np.dot(X[k],Y[k]) for k in range(10)])
np.allclose(Z1,Z2) # <== returns True
This works because (as the docstring states), matrix_multiplyprovides
matrix_multiply(x1, x2[, out]) matrix
multiplication on last two dimensions