Assume three numpy arrays x, y and z
z = (x**2)/ y for each x > 2 y
z = (x**2)/y**(3/2) for each x > 3 y
z = (1/x)*sin(x) for each x > 4 y
The arrays x, y and z are of course made up, but they illustrate the point of applying multiple if conditions to multiple arrays. Each array has about 500,000 elements.
One possible way (much like FORTRAN) is to create an index variable i and test whether x[i] > 2*y[i] or x[i] > 3*y[i] inside a loop. I assume that would be slow.
I need a fast, elegant and more Pythonic way to compute the array z.
UPDATE: I have tried the two methods and here are the results:
# Fortran way of loops:
import numpy as np
x = np.random.rand(40000, 1)
y = np.random.rand(40000, 1)
z = np.zeros(x.shape)
for i, v in enumerate(x):
    #print i
    # later conditions overwrite the result of earlier ones
    if x[i] > 2*y[i]:
        z[i] = x[i]**2/y[i]
    if x[i] > 3*y[i]:
        z[i] = x[i]**2/y[i]**(1.5)
    if x[i] > 4*y[i]:
        z[i] = (1/x[i])*np.sin(x[i])
print z
#end----
The timing results are as follows:
real 0m0.920s
user 0m0.900s
sys 0m0.016s
The other piece of code used is:
# Pythonic way
import numpy as np
x=np.random.rand(40000,1)
y=np.random.rand(40000,1)
indices1 = np.where(x > 2*y)
indices2 = np.where(x > 3*y)
indices3 = np.where(x > 4*y)
z = np.zeros(x.shape)
z[indices1] = x[indices1]**2/y[indices1]
z[indices2] = x[indices2]**2/y[indices2]**(1.5)
z[indices3] = (1/x[indices3])*np.sin(x[indices3])
print z
# end of code -----
The timing results are as follows:
real 0m0.110s
user 0m0.076s
sys 0m0.028s
So there is a large difference in execution times. Both scripts were run on an Ubuntu virtual machine with Python 2.7.5.
UPDATE: I did another test using
indices1 = x > 2*y
indices2 = x > 3*y
indices3 = x > 4*y
The timing results were:
real 0m0.105s
user 0m0.084s
sys 0m0.016s
SUMMARY: Method 3 (plain boolean masks) is the most elegant and slightly faster than using np.where. Explicit loops are much slower.
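For completeness, the same piecewise rule can also be written with np.select. This is only a sketch under my own assumptions (1-D arrays, a fixed seed, and a small offset so that neither x nor y is zero); it was not part of the timings above. Note that np.select picks the first matching condition, so the strictest test has to come first, and that it evaluates every choice over the whole array.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, only for reproducibility
x = rng.random(40000) + 0.01            # offset keeps x and y away from zero
y = rng.random(40000) + 0.01

# np.select takes the FIRST matching condition, so list the strictest test first.
conditions = [x > 4 * y, x > 3 * y, x > 2 * y]
choices = [np.sin(x) / x, x**2 / y**1.5, x**2 / y]
z_select = np.select(conditions, choices, default=0.0)

# Reference: boolean-mask assignment, where later assignments overwrite earlier ones.
z_mask = np.zeros_like(x)
for mask, values in zip(reversed(conditions), reversed(choices)):
    z_mask[mask] = values[mask]

print(np.allclose(z_select, z_mask))
```

Whether this is faster than the mask version depends on the data, since all three formulas are computed for every element.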
I'm not quite sure if you are looking to have your z array be the same size as x or y, but I will assume so.
Numpy has a function that can find the indices of elements based on a condition.
In the example below I am doing a calculation similar to what your first line does.
import numpy as np
x = np.arange(4)
x[2:] += 10
print x
y = np.arange(4)
print y
indices = np.where(x > 2*y)
print indices
z = np.zeros(x.shape)
z[indices] = x[indices]**2/y[indices]
print z
The print statements yield the following:
x: [0 1 12 13]
y: [0 1 2 3]
indices: [2, 3]
z: [0 0 72 56]
Edit:
Upon further testing it turns out that you don't even need to use the numpy where function. You can simply set indices = x > 2*y.
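A small sketch of the mask idea, using the same x and y as the example above. One assumption of mine: I use np.divide with its out= and where= arguments so that the division is only carried out where the mask is True, which also sidesteps the 0/0 at index 0. This is true division, so the last entry comes out as roughly 56.33 rather than the 56 that Python 2 integer division printed above.

```python
import numpy as np

x = np.array([0, 1, 12, 13])
y = np.array([0, 1, 2, 3])
mask = x > 2 * y                      # boolean mask, same role as np.where(...)

# Divide only where the mask is True; other entries keep the value already in `out`.
z = np.divide(x**2, y, out=np.zeros(x.shape, dtype=float), where=mask)
print(z)
```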
I'm looking to find a vectorised way to create ndarrays from formulae, using the indices (or coordinates) of the value being calculated.
For example, if I want a 4x5x3 array filled by the formula 3x+y^z, I have no current way to reference x, y, or z directly. The closest I can come to that is wasting memory by creating incrementing arrays through arange().
Here are my current methods:
import numpy as np
# for loops
arr = np.empty((4, 5, 3))
for x in range(4):
    for y in range(5):
        for z in range(3):
            arr[x, y, z] = 3 * x + y ** z
print(arr)
# separate arange()-created arrays for x, y, and z
x = np.arange(4)[:, None, None]
y = np.arange(5)[None, :, None]
z = np.arange(3)[None, None, :]
arr = 3 * x + y ** z
print(arr)
# same thing but written in a different way (I think)
all = np.arange(5)
arr = 3 * all[:4, None, None] + all[None, :, None] ** all[None, None, :3]
print(arr)
Is there any more efficient way to do this? I'm assuming there is, because numpy's underlying C code must use an iterator to find the memory addresses of each item, so surely that iterator could be repurposed as a faster alternative to using indices stored in memory?
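Two NumPy helpers come close to "referencing the indices directly"; the sketch below shows both. I am not claiming either is faster than the arange() version: np.ogrid still allocates small index arrays (one per axis), and np.fromfunction builds full-size index arrays before calling the function once.

```python
import numpy as np

# Open (sparse) index grids with shapes (4, 1, 1), (1, 5, 1) and (1, 1, 3).
x, y, z = np.ogrid[:4, :5, :3]
arr = 3 * x + y ** z          # broadcasting expands the result to shape (4, 5, 3)

# The same formula written as a function of the indices.
arr2 = np.fromfunction(lambda i, j, k: 3 * i + j ** k, (4, 5, 3), dtype=int)

print(arr.shape, np.array_equal(arr, arr2))
```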
I'm trying to sum a two-dimensional function using arrays, but somehow my for loop is not producing the correct answer. I want to find (in LaTeX) $$\sum_{i=1}^{M}\sum_{j=1}^{M_2}\cos(i)\cos(j)$$ where, according to Mathematica, the answer when M=5 is 1.52725. According to the for loop:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
            return s1
print(f(4))
is 0.291927.
I have thus been trying to use some code of the form:
def f1(N):
    mat3=np.zeros((N,N),np.complex)
    for i in range(0,len(mat3)):
        for j in range(0,len(mat3)):
            mat3[i][j]=np.cos(i+1)*np.cos(j+1)
            return sum(mat3)
which again
print(f1(4))
outputs 0.291927. Looking at the array we should find for each value of i and j a matrix of the form
mat3=[[np.cos(1)*np.cos(1),np.cos(2)*np.cos(1),...],[np.cos(2)*np.cos(1),...]...[np.cos(N+1)*np.cos(N+1)]]
so for N=4 we should have
mat3=[[np.cos(1)*np.cos(1) np.cos(2)*np.cos(1) ...] [np.cos(2)*np.cos(1) ...]...[... np.cos(5)*np.cos(5)]]
but what I actually get is the following
mat3=[[0.29192658+0.j 0.+0.j 0.+0.j ... 0.+0.j] ... [... 0.+0.j]]
or a matrix of all zeros apart from the mat3[0][0] element.
Does anybody know a correct way to do this and get the correct answer? I chose this as an example because the problem I'm trying to solve involves plotting a function which has been summed over two indices and the function that python outputs is not the same as Mathematica (i.e., a function of the form $$f(E)=\sum_{i=1}^{M}\sum_{j=1}^{M_2}F(i,j,E)$$).
The return statement is not indented correctly in your sample code: it sits inside the inner loop, so the function returns during the very first iteration. Dedent it to the level of the function body so that both for loops finish:
def f(N):
    s1=0;
    for p1 in range(N):
        for p2 in range(N):
            s1+=np.cos(p1+1)*np.cos(p2+1)
    return s1
>>> print(f(5))
1.527247272700347
I have moved your code to a more numpy-ish version:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
x = x.reshape((-1, 1))
y = y.reshape((1, -1))
mat = np.cos(x) * np.cos(y)
print(mat.sum()) # 1.5272472727003474
The trick here is to reshape x into a column vector and y into a row vector. When you multiply them, broadcasting pairs every element of x with every element of y, just like your nested loop does.
This should be faster, since cos() is evaluated on only 2*N values, and it avoids explicit loops (which are slow in Python).
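For this particular summand there is also a shortcut worth knowing: because cos(i)*cos(j) is a product of a function of i and a function of j, the double sum factors into a product of two single sums. The sketch below (M and M2 are my names for the two upper limits) compares that with the full table. It only applies to separable terms; a general F(i, j, E) still needs the broadcasting approach.

```python
import numpy as np

M, M2 = 5, 5                              # upper limits of the two sums
ci = np.cos(np.arange(1, M + 1))          # cos(1), ..., cos(M)
cj = np.cos(np.arange(1, M2 + 1))         # cos(1), ..., cos(M2)

total_outer = np.outer(ci, cj).sum()      # builds the full M x M2 table, then sums it
total_factored = ci.sum() * cj.sum()      # no table needed for a separable term
print(total_outer, total_factored)
```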
UPDATE (regarding your comment):
This pattern can be extended to any number of dimensions. Basically, you get something like an outer product, where every value of x is paired with every value of y, z, u, k, ... along the corresponding dimensions.
It's a bit confusing to describe, so here is some more code:
import numpy as np
N = 5
x = np.arange(N) + 1
y = np.arange(N) + 1
z = np.arange(N) + 1
x = x.reshape((-1, 1, 1))
y = y.reshape((1, -1, 1))
z = z.reshape((1, 1, -1))
mat = z**2 * np.cos(x) * np.cos(y)
# x along first axis
# y along second, z along third
# mat[0, 0, 0] == 1**2 * np.cos(1) * np.cos(1)
# mat[0, 4, 2] == 3**2 * np.cos(1) * np.cos(5)
If you use this for many dimensions, and big values for N, you will run into memory problems, though.
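To get a feeling for when memory becomes a problem, you can compute the size of the broadcast result before creating it. A minimal sketch, assuming a float64 result (8 bytes per element); broadcast_nbytes is a hypothetical helper name of mine, and np.broadcast_shapes needs NumPy 1.20 or later.

```python
import numpy as np

def broadcast_nbytes(*arrays, itemsize=8):
    """Bytes that a float64 result of broadcasting `arrays` together would need."""
    shape = np.broadcast_shapes(*(a.shape for a in arrays))
    return int(np.prod(shape, dtype=np.int64)) * itemsize

N = 1000
x = (np.arange(N) + 1).reshape((-1, 1, 1))
y = (np.arange(N) + 1).reshape((1, -1, 1))
z = (np.arange(N) + 1).reshape((1, 1, -1))

# Only the three small inputs exist here; the N x N x N result is never allocated.
print(broadcast_nbytes(x, y, z) / 1e9, "GB")
```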
Can you please explain how to create a matrix in Python with the object datatype? My code:
w, h = 8, 5;
Matrix = ([[0 for x in range(w)] for y in range(h)],dtype=object)
gives a syntax error. I tried various other ways, but none of them work.
Thanks a lot
In your code, the Matrix line tries to create a tuple, but the second element you give it is dtype=object, which is keyword-argument syntax and is only valid inside a function call.
Matrix = ([[0 for x in range(w)] for y in range(h)],dtype=object)
The line reads: set Matrix to the tuple (2D list, dtype=object). However, the second element is not a valid expression on its own. You can create the matrix as follows:
Matrix = [[0 for x in range(w)] for y in range(h)]
Or if you would like to have a numpy array with dtype object:
import numpy as np
Matrix = np.array([[0 for x in range(w)] for y in range(h)], dtype=object)
Or, even cleaner:
import numpy as np
Matrix = np.zeros((h, w), dtype=object)
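A short sketch of what the object dtype allows, in case that is the reason you want it: each cell holds an arbitrary Python object. The values I store here are just illustrations. I use np.empty instead of np.zeros so that unset cells are None rather than the integer 0.

```python
import numpy as np

w, h = 8, 5
Matrix = np.empty((h, w), dtype=object)   # object arrays from np.empty start out as None
Matrix[0, 0] = "text"
Matrix[1, 2] = [1, 2, 3]                  # a whole list stored in ONE cell
Matrix[2, 3] = 3.5

print(Matrix.shape, Matrix.dtype)
```

Keep in mind that arithmetic on object arrays falls back to Python-level operations on each element, so it is much slower than on numeric dtypes.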
Let me present two options: one using the numpy module and one using loops.
import numpy as np
print("Using numpy module:")
x = np.array([1,5,2])
y = np.array([7,4,1])
sum = x + y
subtract = x - y
mult = x * y
div = x / y
print("Sum: {}".format(sum))
print("Subtraction: {}".format(subtract))
print("Multiplication: {}".format(mult))
print("Division: {}".format(div))
print("----------------------------------------")
print("Using for loops:")
x = [1,5,2]
y = [7,4,1]
sum = []
subtract = []
mult =[]
div = []
for i,j in zip(x,y):
    sum.append(i+j)
    subtract.append(i-j)
    mult.append(i*j)
    div.append(i/j)
print(sum)
print(subtract)
print(mult)
print(div)
I have two 2D arrays, x(ni, nj) and y(ni, nj), that I need to interpolate along one axis. I want to interpolate along the last axis for every ni.
I wrote
import numpy as np
from scipy.interpolate import interp1d
z = np.asarray([200,300,400,500,600])
out = []
for i in range(ni):
    f = interp1d(x[i,:], y[i,:], kind='linear')
    out.append(f(z))
out = np.asarray(out)
However, I think this method is inefficient and will be slow for large arrays because of the loop. What is the fastest way to interpolate a multi-dimensional array like this? Is there any way to perform linear and cubic interpolation without a loop? Thanks.
The method you propose does have a python loop, so for large values of ni it is going to get slow. That said, unless you are going to have large ni you shouldn't worry much.
I have created sample input data with the following code:
def sample_data(n_i, n_j, z_shape):
    x = np.random.rand(n_i, n_j) * 1000
    x.sort()
    x[:, 0] = 0
    x[:, -1] = 1000
    y = np.random.rand(n_i, n_j)
    z = np.random.rand(*z_shape) * 1000
    return x, y, z
And I have tested it with these two versions of linear interpolation:
def interp_1(x, y, z):
    rows, cols = x.shape
    out = np.empty((rows,) + z.shape, dtype=y.dtype)
    for j in range(rows):
        out[j] = interp1d(x[j], y[j], kind='linear', copy=False)(z)
    return out
def interp_2(x, y, z):
    rows, cols = x.shape
    row_idx = np.arange(rows).reshape((rows,) + (1,) * z.ndim)
    # for every row, index of the last x value that is not greater than z
    col_idx = np.argmax(x.reshape(x.shape + (1,) * z.ndim) > z, axis=1) - 1
    ret = y[row_idx, col_idx + 1] - y[row_idx, col_idx]
    ret /= x[row_idx, col_idx + 1] - x[row_idx, col_idx]
    ret *= z - x[row_idx, col_idx]
    ret += y[row_idx, col_idx]
    return ret
interp_1 is an optimized version of your code, following Dave's answer. interp_2 is a vectorized implementation of linear interpolation that avoids any python loop whatsoever. Coding something like this requires a sound understanding of broadcasting and indexing in numpy, and some things are going to be less optimized than what interp1d does. A prime example being finding the bin in which to interpolate a value: interp1d will surely break out of loops early once it finds the bin, the above function is comparing the value to all bins.
So the result is going to be very dependent on what n_i and n_j are, and even how long your array z of values to interpolate is. If n_j is small and n_i is large, you should expect an advantage from interp_2, and from interp_1 if it is the other way around. Smaller z should be an advantage to interp_2, longer ones to interp_1.
I have actually timed both approaches with a variety of n_i and n_j, for z of shape (5,) and (50,). [Figures: contour plots of the timing difference between interp_1 and interp_2 over n_i and n_j, one per z shape.]
So it seems that for z of shape (5,) you should go with interp_2 whenever n_j < 1000, and with interp_1 elsewhere. Not surprisingly, the threshold is different for z of shape (50,), now being around n_j < 100. It seems tempting to conclude that you should stick with your code if n_j * len(z) > 5000, but change it to something like interp_2 above if not, but there is a great deal of extrapolating in that statement! If you want to further experiment yourself, here's the code I used to produce the graphs.
import timeit
import matplotlib.pyplot as plt
n_s = np.logspace(1, 3.3, 25)
int_1 = np.empty((len(n_s),) * 2)
int_2 = np.empty((len(n_s),) * 2)
z_shape = (5,)
for i, n_i in enumerate(n_s):
    print(int(n_i))
    for j, n_j in enumerate(n_s):
        x, y, z = sample_data(int(n_i), int(n_j), z_shape)
        int_1[i, j] = min(timeit.repeat('interp_1(x, y, z)',
                                        'from __main__ import interp_1, x, y, z',
                                        repeat=10, number=1))
        int_2[i, j] = min(timeit.repeat('interp_2(x, y, z)',
                                        'from __main__ import interp_2, x, y, z',
                                        repeat=10, number=1))
cs = plt.contour(n_s, n_s, np.transpose(int_1-int_2))
plt.clabel(cs, inline=1, fontsize=10)
plt.xlabel('n_i')
plt.ylabel('n_j')
plt.title('timeit(interp_2) - timeit(interp_1), z.shape=' + str(z_shape))
plt.show()
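If you adapt the vectorized version, it is worth checking it against a simple reference first. The sketch below is my own condensed rewrite for 1-D z (the function names are mine, not from the code above) and compares it with one np.interp call per row. It assumes every z value satisfies x[i, 0] <= z < x[i, -1]; outside that range the bin lookup is wrong.

```python
import numpy as np

def interp_rows_loop(x, y, z):
    """Reference: one np.interp call per row."""
    return np.stack([np.interp(z, x[i], y[i]) for i in range(x.shape[0])])

def interp_rows_vectorized(x, y, z):
    """Loop-free linear interpolation; assumes x[i, 0] <= z < x[i, -1] for every row."""
    rows = np.arange(x.shape[0])[:, None]
    # Index of the first knot strictly greater than z, minus one = left edge of the bin.
    left = np.argmax(x[:, :, None] > z, axis=1) - 1
    x0, x1 = x[rows, left], x[rows, left + 1]
    y0, y1 = y[rows, left], y[rows, left + 1]
    return y0 + (y1 - y0) * (z - x0) / (x1 - x0)

rng = np.random.default_rng(0)                    # fixed seed for reproducibility
x = np.sort(rng.random((6, 20)) * 1000, axis=1)   # increasing knots in every row
x[:, 0], x[:, -1] = 0, 1000
y = rng.random((6, 20))
z = rng.random(5) * 1000

print(np.allclose(interp_rows_loop(x, y, z), interp_rows_vectorized(x, y, z)))
```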
One optimization is to allocate the result array once like so:
import numpy as np
from scipy.interpolate import interp1d
z = np.asarray([200,300,400,500,600])
out = np.zeros( [ni, len(z)], dtype=np.float32 )
for i in range(ni):
    f = interp1d(x[i,:], y[i,:], kind='linear')
    out[i,:]=f(z)
This avoids the extra copying in your implementation, which comes from collecting the rows with out.append(...) and then converting the list with np.asarray.
I have three separate 1d arrays of a list of numbers, their squares and cubes (created by a 'for' loop).
I would like these arrays to appear in three corresponding columns. However, I have tried the column_stack function and Python says it's not defined. I have read about the vstack and hstack functions but am confused about which to use and what exactly they do.
My code so far reads:
import numpy
makearange = lambda a: numpy.arange(int(a[0]),int(a[1]),int(a[2]))
x = makearange(raw_input('Enter start,stop,increment: ').split(','))
y = numpy.zeros(len(x), dtype=int)
z = numpy.zeros(len(x), dtype=int)
for i in range(len(x)):
    y[i] = x[i]**2
for i in range(len(x)):
    z[i] = x[i]**3
print 'original array: ',x
print 'squared array: ',y
print 'cubed array: ', z
I would appreciate any advice
Why don't you define y and z directly?
y = x**2
z = x**3
and then simply:
stacked = numpy.column_stack((x,y,z))
which gives you a 2D array of shape (len(x), 3).
import numpy as np
makearange = lambda a: np.arange(int(a[0]),int(a[1]),int(a[2]))
x = makearange(raw_input('Enter start,stop,increment: ').split(','))
a = np.zeros((len(x),3))
a[:,0] = x
a[:,1] = x**2
a[:,2] = x**3
When using arrays you should avoid for loops as much as possible; that's kind of the point of arrays.
a = np.zeros((len(x),3)) creates an array with as many rows as x has elements and with 3 columns.
a[:,i] refers to the i-th column of this array (the : selects all rows along that column).
I would strongly recommend you look at the NumPy tutorial.
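If you want to go one step further, broadcasting can fill all three columns in a single expression. A small sketch, with a fixed x standing in for the user-supplied range:

```python
import numpy as np

x = np.arange(1, 6)                    # stand-in for the user-supplied range
powers = np.arange(1, 4)               # exponents 1, 2, 3
table = x[:, None] ** powers           # shape (5, 3): columns are x, x**2, x**3
print(table)
```

x[:, None] turns x into a column of shape (5, 1), which is then raised to each of the three exponents in turn.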
You do want column_stack. Have you tried:
w = numpy.column_stack((x,y,z))
print(w)
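Since you asked what the three stacking functions actually do, here is a short comparison for 1-D inputs (x is a made-up example range):

```python
import numpy as np

x = np.arange(1, 5)        # [1 2 3 4]
y, z = x**2, x**3

cols = np.column_stack((x, y, z))   # shape (4, 3): each 1-D array becomes a column
rows = np.vstack((x, y, z))         # shape (3, 4): each 1-D array becomes a row
flat = np.hstack((x, y, z))         # shape (12,): arrays joined end to end

print(cols.shape, rows.shape, flat.shape)
```

For 1-D arrays, column_stack is therefore the transpose of vstack, and hstack is not what you want here because it just concatenates everything into one long array.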