I have a function that I am trying to optimize.
def mul_spectrums_with_conj(x: ndarray, y: ndarray) -> ndarray:
    lst = np.empty((x.shape[0], x.shape[1]), dtype=np.complex64)
    for kx in range(x.shape[0]):
        for ky in range(x.shape[1]):
            acc0 = x.real[kx, ky] * y.real[kx, ky] + x.imag[kx, ky] * y.imag[kx, ky]
            acc1 = x.imag[kx, ky] * y.real[kx, ky] - x.real[kx, ky] * y.imag[kx, ky]
            lst[kx][ky] = complex(acc0, acc1)
    return lst
I have implemented the logic I needed, but I am sure there is a more optimized way to write it. Can someone help?
What you have there is a very manual, lengthy way of multiplying each element of x by the complex conjugate of the corresponding element of y. You don't need to write it out like that. NumPy can already take complex conjugates and multiply complex numbers on its own.
NumPy supports those operations on arrays, too, so you don't have to write out explicit loops. If you let NumPy handle it instead of looping explicitly, NumPy can do the work in fast C loops over the underlying data buffers, instead of going through all the dynamic dispatch and wrapper objects involved in looping over an array explicitly at Python level:
def mul_spectrums_with_conj(x: ndarray, y: ndarray) -> ndarray:
    return x * y.conj()
And with the code being so simple, you might as well just write x * y.conj() directly when you want to do this operation, instead of calling a function.
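As a quick sanity check, the sketch below compares the one-liner against the real/imaginary formula from the question on random data. The array shape, the seed, and the helper name mul_with_conj_manual are assumptions made for illustration only.

```python
import numpy as np

def mul_with_conj_manual(x, y):
    # Hypothetical helper: the question's formula, written with whole-array operations.
    real = x.real * y.real + x.imag * y.imag
    imag = x.imag * y.real - x.real * y.imag
    return real + 1j * imag

rng = np.random.default_rng(0)  # arbitrary seed
x = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))
y = rng.standard_normal((4, 5)) + 1j * rng.standard_normal((4, 5))

result = x * y.conj()
matches = np.allclose(result, mul_with_conj_manual(x, y))
print(matches)
```

One difference to keep in mind: the result takes its dtype from the inputs, whereas the original loop always wrote into a complex64 array.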
I am writing an application in Python having speed as the main driver. While optimizing my code, I found out that the main bottleneck is given by the code used to compute
In my code, this matrix multiplication is computed as
POW = np.arange(4)
y = C @ (x ** POW)
I tried different methods (e.g., for loops and others), but as of now this is the fastest way I have found. Do you have any suggestions to improve the computation time?
You are absolutely right to use NumPy. NumPy does the actual mathematical operations in highly optimized C code, so it is usually faster than writing your own non-optimized C code.
First, use float instead of int. Second, if you don't need double precision, use np.float32.
POW = np.arange(4, dtype='f')
# C = C.astype('f', copy=False) # ensure that C.dtype == np.float32
y = C @ (x ** POW)
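To illustrate the dtype suggestion, here is a small sketch with made-up values. It assumes x is a scalar and C is a small coefficient matrix (neither is shown in the question), so that the product evaluates a cubic polynomial at x.

```python
import numpy as np

# Assumed inputs for illustration only: one row of coefficients and a scalar x.
C = np.array([[1, 2, 3, 4]], dtype=np.float32)
x = 2.0

POW = np.arange(4, dtype=np.float32)  # [0., 1., 2., 3.]
powers = x ** POW                     # [1., 2., 4., 8.]
y = C @ powers                        # 1*1 + 2*2 + 3*4 + 4*8

print(y, y.dtype)
```

Whether float32 is actually faster depends on the array sizes and the hardware, so it is worth timing both variants on the real data.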
I would like to obtain a numpy array from element-wise calculation on different numpy arrays. As of now, I am using a lambda function to return a value, repeat that for all values, create a list therefrom, and convert to numpy array:
import math
import numpy as np
def weightAdjLoads(loadsX, loadsY, angles, g):
    adjust = lambda x, y, a: math.sqrt((abs(x) - math.sin(a)*g)**2 + (abs(y) - math.cos(a)*g)**2)
    return np.array([adjust(x, y, a) for x, y, a in zip(loadsX, loadsY, angles)])
This seems to me like too much overhead. Are there any numpy routines which could do just that?
I am aware of methods such as numpy.sqrt(A**2 + B**2), where A and B are numpy arrays. However, those only let me apply predefined formulas. How can I apply custom formulas to numpy arrays?
numpy.sqrt(A**2 + B**2) is parsed by the Python interpreter into calls roughly as follows:
tmp1 = A**2 # A.__pow__(2)
tmp2 = B**2 # B.__pow__(2)
tmp3 = tmp1 + tmp2 # tmp1.__add__(tmp2)
tmp4 = np.sqrt(tmp3)
That is, there are defined numpy functions and methods for power, addition, sqrt etc.
Your lambda works with scalars, not numpy arrays:
math.sqrt((abs(x) - math.sin(a)*g)**2 + (abs(y) - math.cos(a)*g)**2)
Specifically it's the math trig functions that require scalars. abs works with arrays:
abs(A) => A.__abs__()
numpy provides a full set of trig functions, so this function should work with array, or scalar, arguments:
def foo(x, y, a):
    return np.sqrt((abs(x) - np.sin(a)*g)**2 + (abs(y) - np.cos(a)*g)**2)
There are ways of wrapping your scalar adjust into a numpy function, but the speed savings relative to your list comprehension are minor.
f = np.vectorize(adjust)
f = np.frompyfunc(adjust, 3, 1)
Mainly they make it easier to broadcast arrays to a scalar function. But to gain compiled speed you have to make a conversion such as in my foo, or use a third-party package like cython, numba, or numexpr.
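To check that the array version gives the same numbers as the list comprehension, a minimal sketch follows. The sample data, the value of g, and the names adjust_scalar and adjust_array are made up for illustration; g is passed explicitly here rather than taken from an enclosing scope.

```python
import math
import numpy as np

def adjust_scalar(x, y, a, g):
    # Scalar formula, as in the question's lambda.
    return math.sqrt((abs(x) - math.sin(a) * g) ** 2 + (abs(y) - math.cos(a) * g) ** 2)

def adjust_array(x, y, a, g):
    # Array formula: np.sin, np.cos and np.sqrt operate element-wise.
    return np.sqrt((np.abs(x) - np.sin(a) * g) ** 2 + (np.abs(y) - np.cos(a) * g) ** 2)

# Made-up sample data.
loads_x = np.array([1.0, -2.0, 3.0])
loads_y = np.array([0.5, 1.5, -2.5])
angles = np.array([0.0, 0.3, 1.2])
g = 9.81

looped = np.array([adjust_scalar(x, y, a, g) for x, y, a in zip(loads_x, loads_y, angles)])
vectorized = adjust_array(loads_x, loads_y, angles, g)
same = np.allclose(looped, vectorized)
print(same)
```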
I'm new to programming and am a bit unsure about how to write my own for loop. This is what I would like to do:
Let us subdivide the interval [0,1] into n points x_0 = 0, ..., x_{n-1} = 1.
Write a function compute_discrete_u(epsilon, n) that returns two numpy arrays:
x_array contains the coordinates of the n points
u_array contains the discrete values of u at these points.
u(x) = sin(1/(x + ϵ))
Thank you!
First of all, you do not need a for loop at all. You want to use numpy, so you can use the vectorized operations that numpy is built upon.
Here's the function you are literally asking for (and most likely not how you should solve your problem):
# Do NOT use this.
import numpy as np
def compute_discrete_u(epsilon, n):
    x = np.linspace(0, 1, n)
    return x, np.sin(x + epsilon)
That's quite an awkward API. From a design point-of-view, you are mixing two responsibilities in the function:
Generating a certain x vector
Calculating a u vector based on a mathematical function.
You should not do this for complexity and reusability reasons. What if you want a non-uniform x later on?
So here's what you should do:
import numpy as np
def compute_u(x, epsilon):
    return np.sin(x + epsilon)
x = np.linspace(0, 1, num=101)
u = compute_u(x, epsilon=1e-3)
This is easier to understand because the function is just the mathematical function. Additionally, you can compute u for any x array (or single float) you like. If you do not need compute_u elsewhere, you can even drop it completely and write u = np.sin(x + epsilon).
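As a sketch of the reusability argument, the same function can be fed a non-uniform grid or a single float without any changes. The squared grid below is just one arbitrary choice of non-uniform points.

```python
import numpy as np

def compute_u(x, epsilon):
    return np.sin(x + epsilon)

# A non-uniform grid (assumed here: points clustered near 0).
x_nonuniform = np.linspace(0, 1, num=5) ** 2
u_nonuniform = compute_u(x_nonuniform, epsilon=1e-3)

# A single float works too and returns a NumPy scalar.
u_single = compute_u(0.5, epsilon=1e-3)

print(x_nonuniform)
print(u_nonuniform.shape, float(u_single))
```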
import numpy as np
beta= 0.9
A=[1+1j,2+2j]
real=np.zeros((1,2))
for i in range(1):
    for l in range(2):
        real[i,j] = real[i,j]-beta*A[i,j]
I am not familiar with the computation of different types of arrays in numpy. How can I make the code work?
The problem with your original code is that the result of
real[i, j] - beta * A[i, j]
will be complex, but you created real using np.zeros, which will give you a float64 array unless you explicitly specify a different dtype. Since there is no safe way to cast a complex value to a float, the assignment to real[i, j] will raise a TypeError.
One way to solve the problem would be to initialize real with a complex dtype:
real = np.zeros((1, 2), dtype=complex)
If you make A a numpy array, you can use broadcasting to do the multiplication in one go without pre-allocating real and without looping:
import numpy as np
beta = 0.9
A = np.array([1 + 1j, 2 + 2j])
real = -beta * A
print(repr(real))
# array([-0.9-0.9j, -1.8-1.8j])
It looks like you'd probably benefit from reading some of the examples here.
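The dtype behaviour described above can be inspected directly. This is a minimal sketch with A reshaped to (1, 2) to match the shape of real in the question; the names zeros_float, diff and acc are illustrative.

```python
import numpy as np

beta = 0.9
A = np.array([[1 + 1j, 2 + 2j]])        # shape (1, 2), dtype complex128

zeros_float = np.zeros((1, 2))          # default dtype: float64
diff = zeros_float - beta * A           # new array; float64 and complex128 promote to complex128

acc = np.zeros((1, 2), dtype=complex)   # complex accumulator
acc -= beta * A                         # in-place update works because acc can hold complex values

print(zeros_float.dtype, diff.dtype, acc.dtype)
```

The float64 array itself never changes dtype; only a freshly created result array or an accumulator declared as complex can hold the complex values.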
Assume a few functions called many times. These functions do something such as multiply, divide, add, on a 3d vector (a 1x3 array).
Given:
import numpy as np
import math
x = [0,1,2]
y = [3,2,1]
a = 1.2
Based on my testing, it is faster for python math library to do:
math.sin(a)
than for numpy to do:
np.sin(a)
Additionally, simple algorithms such as normalization are faster with python than np.linalg.norm using the method discussed in this conversation.
Now if we add a bit of complexity to the data, such as doing matrix multiplication for 3d, where we have a rotation matrix of 3x3 that is then multiplied by another matrix and transposed, numpy starts to gain the advantage.
Currently, doing operations such as:
L = math.sqrt(V[0] * V[0] + V[1] * V[1] + V[2] * V[2])
V = (V[0] / L, V[1] / L, V[2] / L)
are much faster when called repeatedly (I assume from no overhead in creating the numpy array).
However, in order to use the numpy matrix functions, the data needs to be a numpy array. Using np.asarray() has significant overhead, so it is unclear where the break-even point lies between not using numpy at all, accepting the overhead of creating the array, and using only numpy while accepting the inefficiency of its math functions on scalars.
Of course I can try out all of these methods, but in a large algorithm there are too many possible combinations. Is there any strategy to efficiently switch between python and numpy in this situation?
EDIT:
From some comments, it seems the question is not clear enough. I understand numpy is more efficient with big sets, which is why this question exists. The algorithm is NOT ONLY calculating sine. The following code might make it easier to understand:
x = [2,1,2]
math.sin(x[0])
L = math.sqrt(x[0] * x[0] + x[1] * x[1] + x[2] * x[2])
V = (x[0] / L, x[1] / L, x[2] / L)
math.sin(V[0])
#Do something else here
When working with single values, and small arrays, the np.array overhead certainly slows things down compared to using the math. equivalents. But with many values, the array approach quickly becomes better.
For example in Ipython I can time sin for 50 values:
In [444]: %%timeit x=np.arange(50)
np.sin(x)
100000 loops, best of 3: 8.5 us per loop
In [445]: %%timeit x=range(50)
[math.sin(i) for i in x]
100000 loops, best of 3: 18.1 us per loop
Your V calculation is 20x faster than
Va=Va/math.sqrt((Va*Va).sum())
But if I do that on 20 sets of values, the times are about equal. And I don't have to change the expression to handle Va=np.ones((20,3), float). To time your V I had to wrap it in a function and time [foo(i) for i in V].
You might even gain more speed by doing the indexing only once, e.g.
v1, v2, v3 = V
L = math.sqrt(v1*v1 + v2*v2 + v3*v3)
V = (v1/L, v2/L, v3/L)
I'd expect more gain when using arrays than lists.
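For the batch case, a sketch of normalizing many 3-vectors at once and checking the result against the pure-Python version is shown below. The random data, the seed, and the per-row axis=1 with keepdims=True are assumptions added here; they are not the exact expression timed above.

```python
import math
import numpy as np

def normalize_tuple(v):
    # Pure-Python version for a single 3-vector, unpacking once.
    v1, v2, v3 = v
    length = math.sqrt(v1 * v1 + v2 * v2 + v3 * v3)
    return (v1 / length, v2 / length, v3 / length)

# Made-up batch of 20 vectors with shape (20, 3).
rng = np.random.default_rng(0)
Va = rng.uniform(1.0, 2.0, size=(20, 3))

# One NumPy expression normalizes every row; keepdims lets the division broadcast.
Vn = Va / np.sqrt((Va * Va).sum(axis=1, keepdims=True))

# Same result as calling the scalar version row by row.
Vn_loop = np.array([normalize_tuple(row) for row in Va])
agree = np.allclose(Vn, Vn_loop)
print(agree)
```

Which variant is faster for a given batch size is best settled by timing both on the actual workload.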