I have two NumPy arrays (two variables) that contain complex numbers. Since they were created with NumPy, the imaginary unit is denoted by "j".
for i in range(0, 8):
    cg1 = np.array(eigVal1[i])
    det_eval = eval(det)
    AA = det_eval[0:2, 0:2]
    bb = det_eval[0:2, 2] * (-1)
    roots = np.append(roots, solve(AA, bb))
    a = np.append(a, roots[i])
    b = np.append(b, roots[i+1])
Output:
a = array([-9.03839731e-04+0.00091541j, 3.02435614e-07-0.00043776j,
-9.03839731e-04-0.00091541j, 3.02435614e-07+0.00043776j,
9.03812649e-04+0.00092323j, 4.17553402e-07+0.00043764j,
9.03812649e-04-0.00092323j, 4.17553402e-07-0.00043764j])
b = array([ 3.02435614e-07-0.00043776j, -9.03839731e-04-0.00091541j,
3.02435614e-07+0.00043776j, 9.03812649e-04+0.00092323j,
4.17553402e-07+0.00043764j, 9.03812649e-04-0.00092323j,
4.17553402e-07-0.00043764j, -5.53769989e-05-0.00243369j])
I also have a long equation in which some variables are defined symbolically (y):
u_n = A0*y**(1322.5696672125 + 1317.38942049453*I) + A1*y**(1322.5696672125 - 1317.38942049453*I) + A2*y**(-1322.5696672125 + 1317.38942049453*I) + A3*y**(-1322.5696672125 - 1317.38942049453*I) + ..
My problem is that when I substitute the two variables (a and b) into the equation, all the complex numbers are rendered with "I", and that makes the equation harder to work with, because I am not able to simplify it further.
Is there any way to convert "I" to "j" in SymPy?
for i in range(0, 8):
    u_n = u_n.subs(A[i], (a[i] * C[i]))
The result is:
u_n = C0*y**(1322.5696672125 + 1317.38942049453*I)*(-0.000903839731101097 + 0.000915407724097998*I) + C1*y**(1322.5696672125 - 1317.38942049453*I)*(3.02435613673241e-7 - 0.000437760318205723*I) +..
As you can see, I cannot simplify it further, even with simplify(u_n). In NumPy, for example, (2+3j)*(5+6j) reduces to (-8+27j), but as soon as a symbolic quantity enters the expression it is no longer simplified: y**(2+3j)*(5+6j) stays as y**(2 + 3*I)*(5 + 6*I).
I would like to get y**(-8+27j), where y is symbolic.
I would appreciate it if someone could help me with that.
From your last paragraph:
The python expression:
In [38]: (2+3j)*(5+6j)
Out[38]: (-8+27j)
sympy (y is a sympy symbol):
In [39]: y**(2+3j)*(5+6j)
Out[39]: y**(2.0 + 3.0*I)*(5.0 + 6.0*I)
With corrective parentheses to group the multiplication before the power:
In [40]: y**((2+3j)*(5+6j))
Out[40]: y**(-8.0 + 27.0*I)
Even in plain Python, operator precedence matters:
In [44]: 1**(2+3j)*(5+6j)
Out[44]: (5+6j)
In [45]: 1**((2+3j)*(5+6j))
Out[45]: (1+0j)
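Putting the two observations together, the fix on the SymPy side is just grouping: let Python reduce the complex product before it reaches the symbolic expression. A minimal sketch (assuming y is the only symbol involved):

import sympy as sp

y = sp.symbols('y')

# Python evaluates (2+3j)*(5+6j) -> (-8+27j) before SymPy sees it,
# so the exponent arrives already reduced.
expr = y**((2+3j)*(5+6j))
print(expr)  # y**(-8.0 + 27.0*I)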
Given the Fourier series coefficients a[n] and b[n] (for the cosines and sines respectively) of a function with period T, and t an equally spaced interval, the following code evaluates the partial sum for all points in the interval t (a, b, t are all numpy arrays). Note that len(t) != len(a).
yn = ones(len(t))*a[0]
for n in range(1, len(a)):
    yn = yn + (a[n]*cos(2*pi*n*t/T) - b[n]*sin(2*pi*n*t/T))
My question is: Can this for loop be vectorized?
Here's one vectorized approach that uses broadcasting to create the 2D array version of the cosine/sine input 2*pi*n*t/T, and then matrix multiplication with np.dot for the sum reduction -
r = np.arange(1,len(a))
S = 2*np.pi*r[:,None]*t/T
cS = np.cos(S)
sS = np.sin(S)
out = a[1:].dot(cS) - b[1:].dot(sS) + a[0]
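To make the broadcasting explicit, here is how the shapes work out (a sketch, assuming len(a) = N+1 coefficients and len(t) = M sample points):

# r[:,None] -> (N, 1); broadcasting against t (M,) gives S -> (N, M)
# cS, sS    -> (N, M), one row per harmonic
# a[1:]     -> (N,), so a[1:].dot(cS) contracts over the harmonics
#              and returns (M,): one partial-sum value per time point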
Further performance boost
For a further boost, we can use the numexpr module to compute those trigonometric steps -
import numexpr as ne
cS = ne.evaluate('cos(S)')
sS = ne.evaluate('sin(S)')
Runtime test -
Approaches -
def original_app(t,a,b,T):
    yn=np.ones(len(t))*a[0]
    for n in range(1,len(a)):
        yn=yn+(a[n]*np.cos(2*np.pi*n*t/T)-b[n]*np.sin(2*np.pi*n*t/T))
    return yn

def vectorized_app(t,a,b,T):
    r = np.arange(1,len(a))
    S = (2*np.pi/T)*r[:,None]*t
    cS = np.cos(S)
    sS = np.sin(S)
    return a[1:].dot(cS) - b[1:].dot(sS) + a[0]

def vectorized_app_v2(t,a,b,T):
    r = np.arange(1,len(a))
    S = (2*np.pi/T)*r[:,None]*t
    cS = ne.evaluate('cos(S)')
    sS = ne.evaluate('sin(S)')
    return a[1:].dot(cS) - b[1:].dot(sS) + a[0]
Also including the function PP from @Paul Panzer's post.
Timings -
In [22]: # Setup inputs
...: n = 10000
...: t = np.random.randint(0,9,(n))
...: a = np.random.randint(0,9,(n))
...: b = np.random.randint(0,9,(n))
...: T = 3.45
...:
In [23]: print np.allclose(original_app(t,a,b,T), vectorized_app(t,a,b,T))
...: print np.allclose(original_app(t,a,b,T), vectorized_app_v2(t,a,b,T))
...: print np.allclose(original_app(t,a,b,T), PP(t,a,b,T))
...:
True
True
True
In [25]: %timeit original_app(t,a,b,T)
...: %timeit vectorized_app(t,a,b,T)
...: %timeit vectorized_app_v2(t,a,b,T)
...: %timeit PP(t,a,b,T)
...:
1 loops, best of 3: 6.49 s per loop
1 loops, best of 3: 6.24 s per loop
1 loops, best of 3: 1.54 s per loop
1 loops, best of 3: 1.96 s per loop
Can't beat numexpr, but if it's not available we can save on the transcendentals (testing and benchmarking code heavily based on @Divakar's code, in case you didn't notice ;-) ):
import numpy as np
from timeit import timeit

def PP(t,a,b,T):
    CS = np.empty((len(t), len(a)-1), complex)  # complex rather than the deprecated np.complex
    CS[...] = np.exp(2j*np.pi*(t[:, None])/T)
    np.cumprod(CS, axis=-1, out=CS)
    return a[1:].dot(CS.T.real) - b[1:].dot(CS.T.imag) + a[0]

def original_app(t,a,b,T):
    yn=np.ones(len(t))*a[0]
    for n in range(1,len(a)):
        yn=yn+(a[n]*np.cos(2*np.pi*n*t/T)-b[n]*np.sin(2*np.pi*n*t/T))
    return yn

def vectorized_app(t,a,b,T):
    r = np.arange(1,len(a))
    S = 2*np.pi*r[:,None]*t/T
    cS = np.cos(S)
    sS = np.sin(S)
    return a[1:].dot(cS) - b[1:].dot(sS) + a[0]
n = 1000
t = 2000
t = np.random.randint(0,9,(t))
a = np.random.randint(0,9,(n))
b = np.random.randint(0,9,(n))
T = 3.45
print(np.allclose(original_app(t,a,b,T), vectorized_app(t,a,b,T)))
print(np.allclose(original_app(t,a,b,T), PP(t,a,b,T)))
print('{:18s} {:9.6f}'.format('orig', timeit(lambda: original_app(t,a,b,T), number=10)/10))
print('{:18s} {:9.6f}'.format('Divakar no numexpr', timeit(lambda: vectorized_app(t,a,b,T), number=10)/10))
print('{:18s} {:9.6f}'.format('PP', timeit(lambda: PP(t,a,b,T), number=10)/10))
Prints:
True
True
orig 0.166903
Divakar no numexpr 0.179617
PP 0.060817
Btw., if delta t divides T, one can potentially save even more, or even run the full FFT and discard what's too much.
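The saving comes from the identity exp(i*n*theta) = (exp(i*theta))**n: the complex exponential is evaluated once per sample point, the higher harmonics come from a running product, and their real and imaginary parts are exactly the cosine and sine tables. A minimal sketch of the idea (variable names are mine, not from PP above):

import numpy as np

t = np.array([0.1, 0.2, 0.3])
T = 3.45
N = 4  # number of harmonics

base = np.exp(2j*np.pi*t/T)                         # one transcendental call per point
powers = np.cumprod(np.tile(base, (N, 1)), axis=0)  # row n-1 holds exp(2j*pi*n*t/T)

n = np.arange(1, N+1)[:, None]
print(np.allclose(powers.real, np.cos(2*np.pi*n*t/T)))  # True
print(np.allclose(powers.imag, np.sin(2*np.pi*n*t/T)))  # True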
This is not really another answer but a comment on @Paul Panzer's one, written as an answer because I needed to post some code. If there is a way to post properly formatted code in a comment, please advise.
Inspired by @Paul Panzer's cumprod idea, I came up with the following:
an = ones((len(a)-1,len(te)))*2j*pi*te/T
CS = exp(cumsum(an,axis=0))
out = (a[1:].dot(CS.real) - b[1:].dot(CS.imag)) + a[0]
Although it seems properly vectorized and produces correct results, its performance is miserable. It is not only much slower than the cumprod version, which is expected since len(a)-1 more exponentiations are performed, but also 50% slower than the original unvectorized version. What is the cause of this poor performance?
I have these 2 vectors A and B:
import numpy as np
A=np.array([1,2,3])
B=np.array([8,7])
and I want to combine them with this expression:
Result = sum((A-B)**2)
The expected result that I need is:
Result = np.array([X,Y])
Where:
X = (1-8)**2 + (2-8)**2 + (3-8)**2 = 110
Y = (1-7)**2 + (2-7)**2 + (3-7)**2 = 77
How can I do it? The two arrays are just an example; in my case I have very large arrays and cannot do it manually.
You can make A a 2d array and use numpy's broadcasting to vectorize the calculation:
((A[:, None] - B) ** 2).sum(0)
# array([110, 77])
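To see what the broadcasting does, here is a small sketch reusing the arrays from the question:

import numpy as np

A = np.array([1, 2, 3])
B = np.array([8, 7])

diff = A[:, None] - B        # (3, 1) minus (2,) broadcasts to (3, 2)
print(diff)
# [[-7 -6]
#  [-6 -5]
#  [-5 -4]]
print((diff ** 2).sum(0))    # sum each column: [110  77]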
Since you have mentioned that you are working with large arrays, and with a focus on performance, here's one with np.einsum that does the combined operation of squaring and sum reduction in one step efficiently, like so -
def einsum_based(A,B):
    subs = A[:,None] - B
    return np.einsum('ij,ij->j',subs, subs)
Sample run -
In [16]: A = np.array([1,2,3])
...: B = np.array([8,7])
...:
In [17]: einsum_based(A,B)
Out[17]: array([110, 77])
Runtime test with large arrays scaling up the given sample 1000x -
In [8]: A = np.random.rand(3000)
In [9]: B = np.random.rand(2000)
In [10]: %timeit ((A[:, None] - B) ** 2).sum(0) # @Psidom's soln
10 loops, best of 3: 21 ms per loop
In [11]: %timeit einsum_based(A,B)
100 loops, best of 3: 12.3 ms per loop
I want to solve a polynomial equation of 6th order with Python.
I've tried the "basic" version:
avgIrms = 19.61
c_val = (0.000002324*avgIrms**6) - (0.0001527*avgIrms**5) + (0.003961843*avgIrms**4) - (0.052211292*avgIrms**3) + (0.379269091*avgIrms**2) -(0.404399274*avgIrms) + 0.000682896
print(c_val)
After that I've used the numpy with the following code:
import numpy as np
avgIrms = 19.61
ppar = [0.000002324, -0.0001527, 0.003961843, -0.052211292, 0.379269091, -0.404399274, 0.000682896]
p = np.poly1d(ppar)
print(p(avgIrms))
In both ways, the Raspberry Pi takes more than five seconds to process... It's too much! Any help to solve polynomial equations efficiently? (less than one second...)
Thanks in advance,
Daniel
First, what you want is to evaluate a polynomial for a given x, not to solve it. Second, I still don't see where your slow timings come from..
Here are a couple of timings:
>>> import numpy as np
>>> x = 19.61
>>> pr = [0.000002324, -0.0001527, 0.003961843, -0.052211292, 0.379269091, -0.404399274, 0.000682896]
>>> p = pr[::-1] # reverse the order
Hardcoded solution:
>>> %timeit p[0] + x * p[1] + p[2] * x**2 + p[3] * x**3 + p[4] * x**4 + p[5] * x**5 + p[6] * x**6
809 ns
Loopy solution:
>>> %%timeit
val = 0
for i in range(len(p)):
    val += p[i] * x**i
1.24 µs
Functional programming solution:
>>> %timeit reduce(lambda acc, i: acc + p[i] * x**i, range(len(p)))
1.61 µs
Using numpy's polyval:
>>> %timeit np.polyval(pr, x)
6.12 µs
Using numpy's poly1d
>>> %%timeit
c = np.poly1d(pr)
c(x)
9.46 µs
So, clearly numpy is slower, as for such a small array it adds some overhead in the Python <-> C communication; but still, it is of the order of 6-9 µs. I'm using a desktop computer, but I would be pretty impressed if a Raspberry Pi really took 5 seconds for that operation. Are you sure you did the timings properly?
Anyway, either the hardcoded or the loopy solution seems faster than the functional programming one (the equivalent of the one that you defined as horner in your comment).
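For reference, a minimal sketch of Horner's scheme (the technique the comment referred to; the function name and argument convention are mine): the polynomial is rewritten as nested multiply-adds, so each degree costs one multiplication and one addition instead of a fresh x**i:

def horner(coeffs, x):
    # coeffs ordered from highest to lowest degree, like pr above
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

print(horner(pr, x))  # matches np.polyval(pr, x), which uses the same scheme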
What I'm looking for: a way to implement in Python a special multiplication operation for matrices that happen to be in scipy sparse (csr) format. This is a special kind of multiplication, not matrix multiplication nor Kronecker multiplication nor Hadamard aka pointwise multiplication, and does not seem to have any built-in support in scipy.sparse.
The desired operation: Each row of the output should contain the results of every product of the elements of the corresponding rows in the two input matrices. So starting with two identically sized matrices, each with dimensions m by n, the result should have dimensions m by n^2.
In Python code, it looks like this:
import numpy as np
import scipy.sparse

A = scipy.sparse.csr_matrix(np.array([[1,2],[3,4]]))
B = scipy.sparse.csr_matrix(np.array([[0,5],[6,7]]))
# C is the desired product of A and B. It should look like:
C = scipy.sparse.csr_matrix(np.array([[0,5,0,10],[18,21,24,28]]))
What would be a nice or efficient way to do this? I've looked here on Stack Overflow as well as elsewhere, with no luck so far. So far it sounds like my best bet is a row-by-row operation in a for loop, but that sounds horrendous, seeing as my input matrices have a few million rows and a few thousand columns, mostly sparse.
In your example, C consists of the first and last rows of the kron product:
In [4]: A=np.array([[1,2],[3,4]])
In [5]: B=np.array([[0,5],[6,7]])
In [6]: np.kron(A,B)
Out[6]:
array([[ 0, 5, 0, 10],
[ 6, 7, 12, 14],
[ 0, 15, 0, 20],
[18, 21, 24, 28]])
In [7]: np.kron(A,B)[[0,3],:]
Out[7]:
array([[ 0, 5, 0, 10],
[18, 21, 24, 28]])
kron contains the same values as np.outer, but they are in a different order.
For large dense arrays, einsum might provide good speed:
np.einsum('ij,ik->ijk',A,B).reshape(A.shape[0],-1)
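As a quick check that this einsum one-liner produces the row-wise products asked for (reusing the dense A and B from above):

print(np.einsum('ij,ik->ijk', A, B).reshape(A.shape[0], -1))
# [[ 0  5  0 10]
#  [18 21 24 28]]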
sparse.kron does the same thing as the np.kron:
As = sparse.csr_matrix(A); Bs ...
sparse.kron(As,Bs).tocsr()[[0,3],:].A
sparse.kron is written in Python, so you probably could modify it if it is doing unnecessary calculations.
An iterative solution appears to be:
sparse.vstack([sparse.kron(a,b) for a,b in zip(As,Bs)]).A
Being iterative I don't expect it to be faster than paring down the full kron. But short of digging into the logic of sparse.kron it is probably the best I can do.
vstack uses bmat, so the calculation is:
sparse.bmat([[sparse.kron(a,b)] for a,b in zip(As,Bs)])
But bmat is rather complex, so it won't be easy to simplify this further.
The np.einsum solution can't be easily extended to sparse - there isn't a sparse.einsum, and the intermediate product is 3d, which sparse does not handle.
sparse.kron uses coo format, which is no good for working with the rows. But working in the spirit of that function, I've worked out a function that iterates on the rows of csr format matrices. Like kron and bmat I'm constructing the data, row, col arrays, and constructing a coo_matrix from those. That in turn can be converted to other formats.
def test_iter(A, B):
    m,n1 = A.shape
    n2 = B.shape[1]
    Cshape = (m, n1*n2)
    data = np.empty((m,),dtype=object)
    col = np.empty((m,),dtype=object)
    row = np.empty((m,),dtype=object)
    for i,(a,b) in enumerate(zip(A, B)):
        data[i] = np.outer(a.data, b.data).flatten()
        #col1 = a.indices * np.arange(1,a.nnz+1) # wrong when a isn't dense
        col1 = a.indices * n2 # correction
        col[i] = (col1[:,None]+b.indices).flatten()
        row[i] = np.full((a.nnz*b.nnz,), i)
    data = np.concatenate(data)
    col = np.concatenate(col)
    row = np.concatenate(row)
    return sparse.coo_matrix((data,(row,col)),shape=Cshape)
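As a quick sanity check, a sketch reusing the 2x2 example from the question:

As = sparse.csr_matrix(np.array([[1, 2], [3, 4]]))
Bs = sparse.csr_matrix(np.array([[0, 5], [6, 7]]))
print(test_iter(As, Bs).toarray())
# [[ 0  5  0 10]
#  [18 21 24 28]]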
With these small 2x2 matrices, as well as with larger ones (e.g. A1=sparse.rand(1000,2000).tocsr()), this is about 3x faster than the version using bmat. For large enough matrices it is better than the dense einsum version (which can run into memory errors).
A non-optimal way to do it is to apply kron separately to each row:
def my_mult(A, B):
    nrows = A.shape[0]
    prodrows = []
    for i in range(0, nrows):  # xrange in the original Python 2 code
        Arow = A.getrow(i)
        Brow = B.getrow(i)
        prodrow = scipy.sparse.kron(Arow, Brow)
        prodrows.append(prodrow)
    return scipy.sparse.vstack(prodrows)
This is approx 3x worse in performance than @hpaulj's solution here, as can be seen by running the following code:
A=scipy.sparse.rand(20000,1000, density=0.05).tocsr()
B=scipy.sparse.rand(20000,1000, density=0.05).tocsr()
# Check memory
%memit C1 = test_iter(A,B)
%memit C2 = my_mult(A,B)
# Check time
%timeit C1 = test_iter(A,B)
%timeit C2 = my_mult(A,B)
# Last but not least, check correctness!
print((C1 - C2).nnz == 0)
Results:
hpaulj's method:
peak memory: 1993.93 MiB, increment: 1883.80 MiB
1 loops, best of 3: 6.42 s per loop
this method:
peak memory: 2456.75 MiB, increment: 1558.78 MiB
1 loops, best of 3: 18.9 s per loop
@hpaulj's answer to another post of mine:
How do i create interacting sparse matrix?
def test_iter2(A, B):
    m,n1 = A.shape
    n2 = B.shape[1]
    Cshape = (m, n1*n2)
    data = []
    col = []
    row = []
    for i in range(A.shape[0]):
        slc1 = slice(A.indptr[i],A.indptr[i+1])
        data1 = A.data[slc1]; ind1 = A.indices[slc1]
        slc2 = slice(B.indptr[i],B.indptr[i+1])
        data2 = B.data[slc2]; ind2 = B.indices[slc2]
        data.append(np.outer(data1, data2).ravel())
        col.append(((ind1*n2)[:,None]+ind2).ravel())
        row.append(np.full(len(data1)*len(data2), i))
    data = np.concatenate(data)
    col = np.concatenate(col)
    row = np.concatenate(row)
    return sparse.coo_matrix((data,(row,col)),shape=Cshape)
It is about 6 times faster, presumably because slicing A.data and A.indices directly via indptr avoids constructing a one-row sparse matrix on every iteration.
In [536]: S0=sparse.random(200,200, 0.01, format='csr')
In [537]: S1=sparse.random(200,200, 0.01, format='csr')
In [538]: timeit test_iter(S0,S1)
42.8 ms ± 1.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [539]: timeit test_iter2(S0,S1)
6.94 ms ± 27 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
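And a quick equivalence check between the two versions (a sketch reusing S0 and S1 from above):

print(np.allclose(test_iter(S0, S1).toarray(),
                  test_iter2(S0, S1).toarray()))  # True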