Numpy modify ndarray diagonal - python

Is there any way in NumPy to get a reference to the array diagonal? I want the diagonal of my array to be divided by a certain factor.
Thanks

If X is your array and c is the factor,
X[np.diag_indices_from(X)] /= c
See diag_indices_from in the Numpy manual.
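As a quick illustration (a small sketch of my own, not from the original answer), the fancy-indexed assignment divides only the diagonal in place:
import numpy as np

X = np.arange(9, dtype=float).reshape(3, 3)
c = 2.0
X[np.diag_indices_from(X)] /= c  # only the diagonal entries change
print(X)
# [[0. 1. 2.]
#  [3. 2. 5.]
#  [6. 7. 4.]]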

A quick way to access the diagonal of a square (n,n) numpy array is with arr.flat[::n+1]:
n = 1000
c = 20
a = np.random.rand(n,n)
a[np.diag_indices_from(a)] /= c # 119 microseconds
a.flat[::n+1] /= c # 25.3 microseconds
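This works because, for a C-contiguous (n, n) array, consecutive diagonal elements are exactly n+1 positions apart in the flattened view; a minimal check (my own sketch, not part of the original answer):
import numpy as np

n = 4
a = np.arange(n * n, dtype=float).reshape(n, n)
# the strided flat view selects the same elements as the diagonal
print(np.array_equal(a.flat[::n + 1], a.diagonal()))  # True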

The np.fill_diagonal function is quite fast:
np.fill_diagonal(a, a.diagonal() / c)
where a is your array and c is your factor. On my machine, this method was as fast as @kwgoodman's a.flat[::n+1] /= c method, and in my opinion a bit clearer (but not as slick).
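For what it's worth, a small hedged check (my own addition) that the three approaches give the same result:
import numpy as np

n, c = 5, 20
a = np.random.rand(n, n)
a1, a2, a3 = a.copy(), a.copy(), a.copy()

a1[np.diag_indices_from(a1)] /= c        # fancy indexing
a2.flat[::n + 1] /= c                    # strided flat view
np.fill_diagonal(a3, a3.diagonal() / c)  # fill_diagonal
print(np.allclose(a1, a2) and np.allclose(a2, a3))  # True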

Comparing the above three methods:
import numpy as np
import timeit
n = 1000
c = 20
a = np.random.rand(n,n)
a1 = a.copy()
a2 = a.copy()
a3 = a.copy()
t1 = np.zeros(1000)
t2 = np.zeros(1000)
t3 = np.zeros(1000)
for i in range(1000):
    start = timeit.default_timer()
    a1[np.diag_indices_from(a1)] /= c
    stop = timeit.default_timer()
    t1[i] = stop - start
    start = timeit.default_timer()
    a2.flat[::n+1] /= c
    stop = timeit.default_timer()
    t2[i] = stop - start
    start = timeit.default_timer()
    np.fill_diagonal(a3, a3.diagonal() / c)
    stop = timeit.default_timer()
    t3[i] = stop - start
print([t1.mean(), t1.std()])
print([t2.mean(), t2.std()])
print([t3.mean(), t3.std()])
[4.5693619907979154e-05, 9.3142851395411316e-06]
[2.338075107036275e-05, 6.7119609571872443e-06]
[2.3731951987429056e-05, 8.0455946813059586e-06]
So you can see that the a.flat method is the fastest, but only marginally. When I ran this a few more times, there were runs where the fill_diagonal method was slightly faster. Readability-wise, though, it is probably worth using the fill_diagonal method.
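If you want less run-to-run noise, a hedged sketch (my own addition) is to let timeit handle the repetition; note that the diagonal keeps shrinking across repeats, so treat this as a rough comparison only:
import numpy as np
import timeit

n, c = 1000, 20
a = np.random.rand(n, n)

for stmt in ("a[np.diag_indices_from(a)] /= c",
             "a.flat[::n+1] /= c",
             "np.fill_diagonal(a, a.diagonal() / c)"):
    # best of 5 repeats, 1000 loops each
    t = min(timeit.repeat(stmt, number=1000, repeat=5, globals=globals()))
    print(f"{stmt}: {t * 1e6 / 1000:.1f} us per loop")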

Related

Same FLOPs different runtime. Explanation?

For an m-by-n matrix A, where m is much greater than n, the FLOP count for a QR factorization of A is about 2mn^2. With another n-by-m matrix B, the FLOP count for the matrix multiplication B*A is also 2mn^2. However, both MATLAB and Python show that the matrix multiplication is faster than the QR factorization, on my computer by a factor of about 6. How should we interpret this phenomenon?
Below is my Python code:
import numpy as np
import time
t1 = t2 = 0
m = 5000
n = 100
for k in range(100):
    A = np.random.rand(m,n) - 0.5
    B = np.random.rand(m,n) - 0.5
    tic = time.time()
    Q, R = np.linalg.qr(A)
    t1 += time.time() - tic
    tic = time.time()
    C = B.T @ A
    t2 += time.time() - tic
print('t1=%.3f, t2=%.3f' % (t1, t2))
Example outcome:
t1=2.246, t2=0.353
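As a quick sanity check on the numbers in the question (a small sketch of my own, using the m and n from the code above), the nominal FLOP count is the same for both operations:
m, n = 5000, 100
flops_qr = 2 * m * n**2    # nominal FLOP count for QR factorization of A (m >> n)
flops_mm = 2 * m * n**2    # nominal FLOP count for the product B.T @ A
print(flops_qr, flops_mm)  # 100000000 100000000, i.e. about 1e8 each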

What is the best way to show the percentage value of each element against the column total using Python 3 and Numpy?

Suppose I have a matrix of 30,000,000 random integers arranged in 3 columns (i.e. 10,000,000 x 3). I want to show the percentage value of each element against its column total using Python 3 and NumPy. Below are four approaches I used.
Is there a better way anyone can suggest, please?
I find method 2 is slightly slower than method 1, despite one less for loop. Any idea why?
Methods 3 and 4 are best, as they use NumPy vectorization.
Times for each method on my laptop: Method 1 takes 12000 milliseconds, Method 2 takes 13500 ms, Method 3 takes 240 ms, Method 4 takes 254 ms.
#
import numpy as np
import time
#
## make a matrix of 10,000,000 x 3 random integers
a = (np.random.rand(30000000).reshape(-1,3) * 10).astype(int)
nRows = a.shape[0]
nCols = a.shape[1]
print(f"nRows = {nRows}\tnCols = {nCols}\n\n")
#
## METHOD 1: with two for loops
tic = time.time()
ans = np.zeros(a.shape)
for coli in range(nCols):
    colsum = np.sum(a[:,coli])
    for rowi in range(nRows):
        ans[rowi, coli] = a[rowi, coli] / colsum
ans *= 100
print(f"{ans}")
toc = time.time()
print(f"method 1 = {1000*(toc-tic)} ms\n")
#
## METHOD 2: -- one less for loop
tic = time.time()
ans = np.zeros(a.shape)
colTotals = np.sum(a, axis=0)
for rowi in range(nRows):
    ans[rowi] = a[rowi] / colTotals
ans *= 100
print(f"\n{ans}")
toc = time.time()
print(f"method 2 = {1000*(toc-tic)} ms\n")
#
## METHOD 3: fastest way -- no for loops
tic = time.time()
ans = np.zeros(a.shape)
colTotals = np.sum(a, axis=0)
ans = a / colTotals.reshape(1, nCols)
ans *= 100
print(f"\n{ans}")
toc = time.time()
print(f"method 3 = {1000*(toc-tic)} ms\n")
#
## METHOD 4: fastest way -- no for loops
tic = time.time()
ans = np.zeros(a.shape)
colTotals = np.sum(a, axis=0).reshape(1,-1)
colTotals = np.repeat(colTotals, nRows, axis=0)
ans = a / colTotals
ans *= 100
print(f"{ans}")
toc = time.time()
print(f"method 4 = {1000*(toc-tic)} ms\n")
#
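For completeness, a hedged sketch (my own addition, not from the original post) of an even more compact variant of Method 3: the column totals broadcast directly against the rows, so neither reshape nor repeat is needed.
import numpy as np

a = (np.random.rand(30000000).reshape(-1, 3) * 10).astype(int)
# a.sum(axis=0) has shape (3,) and broadcasts across the 10,000,000 rows
ans = 100 * a / a.sum(axis=0)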

Python Non-Loop Way to change array values based on row and column position

I am trying to change numpy-array values based on their column and row location and currently am achieving it this way:
for r in range(ResultArr2.shape[0]):
    for c in range(ResultArr2.shape[1]):
        ResultArr2[r,c] = ResultArr2[r,c] - r*1000 - c*500
Is there a non-loop way of achieving the same result? I know that NumPy code often runs much faster when explicit loops are avoided, but I could not work out how to do this here.
Here are a few variants using either mgrid or ogrid or manually creating the same ranges that ogrid generates.
Observations:
- for an array of size 1000, the fastest method is more than three times faster than mgrid
- with ogrid or the manual ranges, it is a bit better to add the two ranges separately, thereby avoiding a full-size temporary
- conveniences such as mgrid or ogrid tend to come at a cost in numpy; indeed, the manual method is twice as fast as ogrid
Code:
import numpy as np
from timeit import timeit
A = np.arange(1000).reshape(20, 50)
def f():
    B = A.copy()
    m, n = B.shape
    I, J = np.mgrid[:m*1000:1000, :n*500:500]
    B += I + J
    return B

def g():
    B = A.copy()
    m, n = B.shape
    I, J = np.ogrid[:m*1000:1000, :n*500:500]
    B += I + J
    return B

def h():
    B = A.copy()
    m, n = B.shape
    I, J = np.ogrid[:m*1000:1000, :n*500:500]
    B += I
    B += J
    return B

def i():
    B = A.copy()
    m, n = B.shape
    BT = B.T
    BT += np.arange(0, 1000*m, 1000)
    B += np.arange(0, 500*n, 500)
    return B

def j():
    B = A.copy()
    m, n = B.shape
    B += np.arange(0, 1000*m, 1000)[:, None]
    B += np.arange(0, 500*n, 500)
    return B
assert np.all(f()==h())
assert np.all(g()==h())
assert np.all(i()==h())
assert np.all(j()==h())
print(timeit(f, number=10000))
print(timeit(g, number=10000))
print(timeit(h, number=10000))
print(timeit(i, number=10000))
print(timeit(j, number=10000))
Sample run:
0.289166528998976 # mgrid
0.25259370900130307 # ogrid 1 step
0.24528862700026366 # ogrid 2 steps
0.09056068700010655 # manual transpose
0.08238107499892067 # manual add dim
You can use np.ogrid:
arr = np.random.uniform(size=(5,5))
n_rows, n_cols = arr.shape
r, c = np.ogrid[0:n_rows, 0:n_cols]
arr -= 1000 * r + 500 * c
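As a small sanity check (my own addition, not part of the original answers), the ogrid version can be compared against the explicit double loop from the question on a small array:
import numpy as np

arr = np.random.uniform(size=(5, 5))
expected = arr.copy()
# reference result computed with the original double loop
for r in range(expected.shape[0]):
    for c in range(expected.shape[1]):
        expected[r, c] -= r * 1000 + c * 500

# vectorized version using ogrid
r, c = np.ogrid[0:arr.shape[0], 0:arr.shape[1]]
arr -= 1000 * r + 500 * c
print(np.allclose(arr, expected))  # True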

Fastest way of using function on 3D array/matrix to create a new 3D array/matrix

So what I have, or want to create, is a 3D array consisting of different parameters, which I can then use a function on to create a new 3D array (of the same size) with the results from the function. Basically I have something like this (R code):
x <- seq(0,1,0.01)
y <- seq(0,1,0.01)
z <- seq(0,100,0.1)
And let's say I have a function that just is just:
result = x*data_point + y^2 + z^3
In principle I could probably just write three loops and save the results into an array (or something like that), but I would think that would take a lot of computation time, especially if this step has to be done for several data points. In this case that would mean approximately 10,000,000 calculations per data point, and I have about a thousand data points, so around 10 billion calculations in total.
I understand that producing this resulting matrix will take some time no matter what, but are there steps I can take to make it as fast as possible, or is looping the best way? I also need to be able to go back and say: "I want x = 0.2, y = 0.2, and z = 10 for data point 5".
A solution in R would be the best, but if it can be done a lot faster in Python, that will work just as well.
The fastest way is to use NumPy's broadcasting. I modified the code from @EternusVia and it is about 14 times faster than his faster version. Avoid for loops wherever possible :)
import numpy as np
import time
# number of parameter values and patients
nx=100;
ny=100;
nz=100;
n_data=100;
# dummy data
x = np.linspace(0,1,nx);
y = np.linspace(1,2,ny);
z = np.linspace(2,3,nz);
data = np.linspace(0,100,n_data);
result2 = np.empty((n_data,nx,ny,nz));
# method 2 from @EternusVia
start = time.time()
y2=np.power(y,2);
z3=np.power(z,3);
for l in range(0,n_data):
    for i in range(0,nx):
        for j in range(0,ny):
            result2[l,i,j,:] = x[i]*data[l] + y2[j] + z3[:]
end = time.time()
print(end-start)
# method 3 using Numpy broadcasting
# expand the dimensions of the array depending on where
# they are in the final array
x_bc = x[np.newaxis, :, np.newaxis, np.newaxis]
y_bc = y[np.newaxis, np.newaxis, :, np.newaxis]
z_bc = z[np.newaxis, np.newaxis, np.newaxis, :]
data_bc = data[:, np.newaxis, np.newaxis, np.newaxis]
start = time.time()
# just write the equation, broadcasting will do the rest
# of the magic and calculate the results element-wise
result3 = x_bc * data_bc + np.power(y_bc, 2) + np.power(z_bc, 3)
end = time.time()
print(end-start)
print(np.array_equal(result2,result3))
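Regarding the lookup requirement in the question ("I want x = 0.2, y = 0.2, and z = 10 on data-point 5"): here is a hedged sketch of how one might index into a broadcast result of shape (n_data, nx, ny, nz). The grids below mirror the R seq calls from the question and are my own assumption, not part of the answer above:
import numpy as np

# parameter grids matching seq(0,1,0.01) and seq(0,100,0.1) from the question
x = np.arange(0, 1.005, 0.01)
y = np.arange(0, 1.005, 0.01)
z = np.arange(0, 100.05, 0.1)

# find the grid indices closest to the requested parameter values
ix = np.abs(x - 0.2).argmin()
iy = np.abs(y - 0.2).argmin()
iz = np.abs(z - 10.0).argmin()
print(ix, iy, iz)  # 20 20 100
# value = result[5, ix, iy, iz]  # data point 5 at x = 0.2, y = 0.2, z = 10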
Here are two ways to implement your problem in Python; I timed both. Running the first method on my machine for 100^4 elements took about 2 minutes, while the second method took only 4 seconds.
import numpy as np
import time
# number of parameter values and patients
nx=100;
ny=100;
nz=100;
n_data=100;
# dummy data
x = np.linspace(0,1,nx);
y = np.linspace(1,2,ny);
z = np.linspace(2,3,nz);
data = np.linspace(0,100,n_data);
result1 = np.empty((n_data,nx,ny,nz));
result2 = np.empty((n_data,nx,ny,nz));
# method 1
start = time.time()
y2=np.power(y,2);
z3=np.power(z,3);
for l in range(0,n_data):
    for i in range(0,nx):
        for j in range(0,ny):
            for k in range(0,nz):
                result1[l,i,j,k] = x[i]*data[l] + y2[j] + z3[k]
end = time.time()
print(end-start)
# method 2
start = time.time()
y2=np.power(y,2);
z3=np.power(z,3);
for l in range(0,n_data):
    for i in range(0,nx):
        for j in range(0,ny):
            result2[l,i,j,:] = x[i]*data[l] + y2[j] + z3[:]
end = time.time()
print(end-start)
print(np.array_equal(result1,result2))
Output:
133.110018015
4.36485505104
True
Are you looking for numpy.mgrid?
import numpy as np
x, y, z = np.mgrid[0:1:0.01, 0:1:0.01, 0:100:0.1]
data = np.mgrid[0:100:0.1] # could use np.arange here, but why?
# this will take some time
result = x * data[..., np.newaxis, np.newaxis, np.newaxis] + y**2 + z**3
print(result.shape) # (1000, 100, 100, 1000)

Intersection of two arrays, retaining order in larger array

I have a numpy array a of length n, which has the numbers 0 through n-1 shuffled in some way. I also have a numpy array mask of length <= n, containing some subset of the elements of a, in a different order.
The query I want to compute is "give me the elements of a that are also in mask in the order that they appear in a".
I had a similar question here, but the difference was that mask was a boolean mask instead of a mask on the individual elements.
I've outlined and tested 4 methods below:
import timeit
import numpy as np
import matplotlib.pyplot as plt
n_test = 100
n_coverages = 10
np.random.seed(0)
def method1():
    return np.array([x for x in a if x in mask])

def method2():
    s = set(mask)
    return np.array([x for x in a if x in s])

def method3():
    return a[np.in1d(a, mask, assume_unique=True)]

def method4():
    bmask = np.full((n_samples,), False)
    bmask[mask] = True
    return a[bmask[a]]

methods = [
    ('naive membership', method1),
    ('python set', method2),
    ('in1d', method3),
    ('binary mask', method4)
]
p_space = np.linspace(0, 1, n_coverages)
for n_samples in [1000]:
    a = np.arange(n_samples)
    np.random.shuffle(a)
    for label, method in methods:
        if method == method1 and n_samples == 10000:
            continue
        times = []
        for coverage in p_space:
            mask = np.random.choice(a, size=int(n_samples * coverage), replace=False)
            time = timeit.timeit(method, number=n_test)
            times.append(time * 1e3)
        plt.plot(p_space, times, label=label)
plt.xlabel(r'Coverage ($\frac{|\mathrm{mask}|}{|\mathrm{a}|}$)')
plt.ylabel('Time (ms)')
plt.title('Comparison of 1-D Intersection Methods for $n = {}$ samples'.format(n_samples))
plt.legend()
plt.show()
Which produced the following results (timing plot omitted here):
So the binary mask is, without a doubt, the fastest of these 4 methods for any size of the mask.
My question is, is there a faster way?
I totally agree that the binary mask method is the fastest one. I also don't think there is any better way, in terms of computational complexity, to do what you need.
Let me analyse your timing results:
Method 1 runs in T = O(|a| * |mask|) time. Every element of a is checked for membership in mask by iterating over all of mask's elements, which gives O(|mask|) time per element in the worst case, when the element is missing from mask. |a| does not change, so consider it a constant.
|mask| = coverage * |a|
T = O(|a|^2 * coverage)
Hence the linear dependency on coverage in the plot. Note that the running time is quadratic in |a|: if |mask| ≤ |a| and |a| = n, then T = O(n^2).
The second method uses a set. A set is a data structure that performs insertion/lookup operations in O(log(n)), where n is the number of elements in the set. s = set(mask) takes O(|mask|*log(|mask|)) to complete because there are |mask| insertion operations.
x in s is a lookup operation, so the second line runs in O(|a|*log(|mask|)).
The overall time complexity is O(|mask|*log(|mask|) + |a|*log(|mask|)). If |mask| ≤ |a| and |a| = n, then T = O(n*log(n)). You probably observe an f(x) = log(x) dependency in the plot.
in1d runs in O(|mask|*log(|mask|) + |a|*log(|mask|)) as well: the same T = O(n*log(n)) complexity and the same f(x) = log(x) dependency in the plot.
The binary mask method's time complexity is O(|a| + |mask|), which is T = O(n), and that is the best possible: you observe a constant dependency on coverage in the plot, since the algorithm simply iterates over the a and mask arrays a couple of times.
The point is that if you have to output n items, you already need T = O(n) time, so this method 4 algorithm is optimal.
P.S. In order to observe the mentioned f(n) dependencies, you should instead vary |a| and keep |mask| = 0.9*|a|.
EDIT: It looks like a Python set indeed performs lookup/insert in O(1) on average, using a hash table.
Assuming a is the bigger one.
def with_searchsorted(a, b):
    sb = b.argsort()
    bs = b[sb]
    sa = a.argsort()
    ia = np.arange(len(a))
    ra = np.empty_like(sa)
    ra[sa] = ia
    ac = bs.searchsorted(ia) % b.size
    return a[(bs[ac] == ia)[ra]]
demo
a = np.arange(10)
np.random.shuffle(a)
b = np.random.choice(a, 5, False)
print(a)
print(b)
[7 2 9 3 0 4 8 5 6 1]
[0 8 5 4 6]
print(with_searchsorted(a, b))
[0 4 8 5 6]
how it works
def with_searchsorted(a, b):
    # sort b for faster searchsorting
    sb = b.argsort()
    bs = b[sb]
    # sort a for faster searchsorting
    sa = a.argsort()
    # this is the sorted a... we just cheat because we know what it will be
    ia = np.arange(len(a))
    # construct the reverse sort look up
    ra = np.empty_like(sa)
    ra[sa] = ia
    # perform searchsort
    ac = bs.searchsorted(ia) % b.size
    return a[(bs[ac] == ia)[ra]]
