I have to implement, in Python 3.6, a matrix-vector multiplication on a system that can only handle objects with binary states (in the following, 0 or 1). I simply have to compute y = Rx, where R is a square NxN matrix (in general non-symmetric) and x is a vector with N elements.
My approach is to convert the vector x to unsigned integers and unpack them into bits. Using NumPy's uint8, x becomes a binary vector of 8N elements, which can be written in terms of binary basis vectors such as:
e_1 = (1,0,0,...,0)
e_2 = (0,1,0,...,0)
...
e_N = (0,0,0,...,1)
My problem is how to convert the matrix R to a binary representation such that I can still perform a matrix multiplication to obtain the binary representation of y, which I can later convert to decimal.
For example:
x = np.array([10, 25, 20], dtype=np.uint8)
R = np.array([[2, 1, 0],
              [1, 4, 0],
              [0, 0, 2]], dtype=np.uint8)
x_b = np.unpackbits(x)  # 3 elements * 8 bits = 24 bits
R_b = some_function(R)
# calculation in decimal representation
y = np.dot(R, x)
# calculation in binary representation
y_b = np.dot(R_b, x_b)
z = np.packbits(y_b)
If the procedure makes sense, y and z should be the same.
Now, I recall from linear algebra that the binary vectors mentioned above have the same structure as the standard basis of a vector space. So I thought that by applying R to each of those vectors in turn, I should be able to build a binary representation of R. My implementation of some_function is:
def some_function(R, n_bits=8):
    n_cols = R.shape[0]
    n_vectors = n_cols*n_bits
    R_b = np.zeros([n_vectors, n_vectors], dtype='uint8')
    for i in range(n_vectors):
        v_bin = np.zeros(n_vectors, dtype='uint8')
        v_bin[i] = 1
        v_dec = np.packbits(v_bin)
        u_dec = np.dot(R, v_dec)
        u_bin = np.unpackbits(u_dec)
        R_b[:, i] = u_bin
    return R_b
However, this seems to work only if the matrix is diagonal, and if the elements on the diagonal are even. I'm quite lost at this point, but I have the feeling that this problem has been solved already long ago.
Cheers,
Riccardo
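One possible reason the construction above breaks down is that packing and unpacking bits is not a linear operation: adding two integers produces carries, whereas adding their bit vectors does not. A minimal sketch of a workaround, assuming it is acceptable for the intermediate result to hold small integers rather than single bits, is to apply R to each bit plane separately and recombine the planes with powers of two afterwards. The helper name matvec_via_bits is hypothetical, and the sketch assumes non-negative entries that fit in uint8 for x.

```python
import numpy as np

def matvec_via_bits(R, x):
    """Compute R @ x from the bits of x (hypothetical helper, 8 bits per entry)."""
    n_bits = 8
    x = np.asarray(x, dtype=np.uint8)
    R = np.asarray(R, dtype=np.int64)
    n = x.size
    # Binary representation of x: length 8N, most significant bit first.
    x_b = np.unpackbits(x).astype(np.int64)
    # Block matrix that applies R independently to every bit plane.
    R_b = np.kron(R, np.eye(n_bits, dtype=np.int64))
    # Row i of planes holds sum_j R[i, j] * bit_k(x[j]) for k = 0..7.
    planes = (R_b @ x_b).reshape(n, n_bits)
    # Recombine the planes with their place values 128, 64, ..., 1.
    weights = 2 ** np.arange(n_bits - 1, -1, -1)
    return planes @ weights

R = [[2, 1, 0],
     [1, 4, 0],
     [0, 0, 2]]
x = [10, 25, 20]
y = matvec_via_bits(R, x)
```

The input to the multiplication is purely binary here; only the recombination step uses integer weights, which is where the carries that the bit-level matrix cannot express are handled.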
I'm working on some code for dehazing images, based on this paper, and I started with an abandoned Py2.7 implementation. Since then, particularly with Numba, I've made some real performance improvements (important since I'll have to run this on 8K images).
I'm pretty convinced my last significant performance bottleneck is in performing the box filter step (I've already shaved off almost a minute per image, but this last slow step is ~30s/image), and I'm close to getting it to run as nopython in Numba:
import numpy as np
from numba import jit, njit, prange

# Note: cp (CuPy) and hasGPU are defined elsewhere in the module.

@njit  # Row dependencies mean this can't be parallel
def yCumSum(a):
    """
    Numba-based computation of y-direction
    cumulative sum. Can't be parallel!
    """
    out = np.empty_like(a)
    out[0, :] = a[0, :]
    # Plain range: each row depends on the previous one
    for i in range(1, a.shape[0]):
        out[i, :] = a[i, :] + out[i - 1, :]
    return out

@njit(parallel=True)
def xCumSum(a):
    """
    Numba-based parallel computation
    of x-direction cumulative sum
    """
    out = np.empty_like(a)
    for i in prange(a.shape[0]):
        out[i, :] = np.cumsum(a[i, :])
    return out

@jit
def _boxFilter(m, r, gpu=hasGPU):
    if gpu:
        m = cp.asnumpy(m)
    out = __boxfilter__(m, r)
    if gpu:
        return cp.asarray(out)
    return out

@jit(fastmath=True)
def __boxfilter__(m, r):
    """
    Fast box filtering implementation, O(1) time.
    Parameters
    ----------
    m: a 2-D matrix data normalized to [0.0, 1.0]
    r: radius of the window considered
    Return
    -----------
    The filtered matrix m'.
    """
    # H: height, W: width
    H, W = m.shape
    # the output matrix m'
    mp = np.empty(m.shape)
    # cumulative sum over y axis
    ySum = yCumSum(m)  # np.cumsum(m, axis=0)
    # copy the accumulated values of the windows in y
    mp[0:r+1, :] = ySum[r:(2*r)+1, :]
    # differences in y axis
    mp[r+1:H-r, :] = ySum[(2*r)+1:, :] - ySum[:H-(2*r)-1, :]
    mp[(-r):, :] = np.tile(ySum[-1, :], (r, 1)) - ySum[H-(2*r)-1:H-r-1, :]
    # cumulative sum over x axis
    xSum = xCumSum(mp)  # np.cumsum(mp, axis=1)
    # copy the accumulated values of the windows in x
    mp[:, 0:r+1] = xSum[:, r:(2*r)+1]
    # difference over x axis
    mp[:, r+1:W-r] = xSum[:, (2*r)+1:] - xSum[:, :W-(2*r)-1]
    mp[:, -r:] = np.tile(xSum[:, -1][:, None], (1, r)) - xSum[:, W-(2*r)-1:W-r-1]
    return mp
There's plenty to do around the edges, but if I can get the tile operation as a nopython call, I can nopython the whole boxfilter step and get a big performance boost. I'm not super inclined to do something really really specific as I'd love to reuse this code elsewhere, but I wouldn't particularly object to it being limited to a 2D scope. For whatever reason I'm just staring at this and not really sure where to start.
np.tile is a bit too complicated to reimplement in full, but unless I'm misreading, it looks like you only need to take a vector and repeat it r times along a different axis.
A Numba-compatible way to do this is to write
y = x.repeat(r).reshape((-1, r))
Then x will be repeated r times along the second dimension, so that y[i, j] == x[i].
Example:
In [2]: x = np.arange(5)
In [3]: x.repeat(3).reshape((-1, 3))
Out[3]:
array([[0, 0, 0],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4]])
If you want x to be repeated along the first dimension instead, just take the transpose y.T.
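As a quick sanity check, the two np.tile calls in the box filter can be compared against the repeat/reshape form. This is a small sketch in plain NumPy (it is not run under Numba here), with x and r chosen arbitrarily for illustration.

```python
import numpy as np

x = np.arange(5)
r = 3

# Repeat along the second dimension: cols[i, j] == x[i]
cols = x.repeat(r).reshape((-1, r))
# Repeat along the first dimension: rows[i, j] == x[j]
rows = cols.T

# The equivalent np.tile calls used in the box filter
tile_cols = np.tile(x[:, None], (1, r))
tile_rows = np.tile(x, (r, 1))
```

Both pairs should agree element for element, so the repeat/reshape expressions are candidates for replacing the tile calls inside the jitted function.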
I'm trying to get all eigenvalues of a 3x3 matrix using the power method in Python. However, my method returns eigenvalues that differ from the correct ones for some reason.
My matrix: A = [[1, 2, 3], [2, 4, 5], [3, 5,-1]]
Correct eigenvalues: [ 8.54851285, -4.57408723, 0.02557437 ]
Eigenvalues returned by my method: [ 8.5485128481521926, 4.5740872291939381, 9.148174458392436 ]
So the first one is correct, the second one has the wrong sign, and the third one is completely wrong. I don't know what I'm doing wrong and I can't see where I have made a mistake.
Here's my code:
import numpy as np
import numpy.linalg as la

eps = 1e-8  # Precision of eigenvalue

def trans(v):  # transposes vector (v^T)
    v_1 = np.copy(v)
    return v_1.reshape((-1, 1))

def power(A):
    eig = []
    Ac = np.copy(A)
    lamb = 0
    for i in range(3):
        x = np.array([1, 1, 1])
        while True:
            x_1 = Ac.dot(x)  # y_n = A*x_(n-1)
            x_norm = la.norm(x_1)
            x_1 = x_1/x_norm  # x_n = y_n/||y_n||
            if(abs(lamb - x_norm) <= eps):  # If precision is reached, it returns eigenvalue
                break
            else:
                lamb = x_norm
                x = x_1
        eig.append(lamb)
        # Matrix deflation: A - Lambda * norm[V]*norm[V]^T
        v = x_1/la.norm(x_1)
        R = v * trans(v)
        R = eig[i]*R
        Ac = Ac - R
    return eig

def main():
    A = np.array([1, 2, 3, 2, 4, 5, 3, 5, -1]).reshape((3, 3))
    print(power(A))

if __name__ == '__main__':
    main()
PS. Is there a simpler way to get the second and third eigenvalues from the power method than matrix deflation?
With
lamb = x_norm
you only ever compute the absolute value of the eigenvalues. Better to compute them as
lamb = dot(x, x_1)
where x is assumed to be normalized and x_1 is the unnormalized product Ac.dot(x).
Because the sign is lost, the deflation step does not remove the negative eigenvalue -4.57408723 but effectively adds it instead, so the largest eigenvalue in the third stage is 2*(-4.574...) = -9.148..., of which you again computed the absolute value.
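To illustrate the point about signs, here is a minimal sketch of power iteration with deflation that uses the Rayleigh quotient x·(A x) instead of the norm. It assumes a real symmetric matrix with distinct eigenvalue magnitudes (as in the question); the function names, tolerance, and iteration cap are illustrative choices, not part of the original code.

```python
import numpy as np

def dominant_eigenpair(A, tol=1e-12, max_iter=10_000):
    """Power iteration; returns (signed eigenvalue estimate, unit vector)."""
    x = np.ones(A.shape[0]) / np.sqrt(A.shape[0])  # normalized start vector
    lam = 0.0
    for _ in range(max_iter):
        y = A @ x
        lam_new = x @ y              # Rayleigh quotient keeps the sign
        norm = np.linalg.norm(y)
        if norm == 0.0:              # x lies in the null space of A
            return 0.0, x
        x = y / norm
        if abs(lam_new - lam) <= tol:
            return lam_new, x
        lam = lam_new
    return lam, x

def eigenvalues_by_deflation(A):
    """Find all eigenvalues of a symmetric matrix by repeated deflation."""
    A_work = np.array(A, dtype=float)
    eigs = []
    for _ in range(A_work.shape[0]):
        lam, v = dominant_eigenpair(A_work)
        eigs.append(lam)
        # Subtract the found component: A - lambda * v v^T
        A_work = A_work - lam * np.outer(v, v)
    return eigs

A = np.array([[1, 2, 3], [2, 4, 5], [3, 5, -1]])
eigs = eigenvalues_by_deflation(A)
```

With the signed estimate, the deflation step subtracts the negative eigenvalue's contribution rather than adding it, so the later stages are no longer contaminated.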
I didn't know this method, so I googled it and found here:
http://ergodic.ugr.es/cphys/LECCIONES/FORTRAN/power_method.pdf
that it is only valid for finding the leading (largest) eigenvalue. So it seems to be working fine for you, and the subsequent eigenvalues are not guaranteed to be correct.
Btw. numpy.linalg.eig() works faster than your code for this matrix, but I am guessing you implemented it as an exercise.
Given the product of a matrix and a vector
A.v
with A of shape (m,n) and v of dim n, where m and n are symbols, I need to calculate the Derivative with respect to the matrix elements.
I haven't found the way to use a proper vector, so I started with 2 MatrixSymbol:
from sympy import symbols, MatrixSymbol, diff, tensor
n, m = symbols('n m')
j = tensor.Idx('j')
i = tensor.Idx('i')
l = tensor.Idx('l')
h = tensor.Idx('h')
A = MatrixSymbol('A', n, m)
B = MatrixSymbol('B', m, 1)
C = A*B
Now, if I try to derive with respect to one of A's elements with the indices I get back the unevaluated expression:
diff(C, A[i,j])
>>>> Derivative(A*B, A[i, j])
If I introduce the indices in C also (it won't let me use only one index in the resulting vector) I get back the product expressed as a Sum:
C[l,h]
>>>> Sum(A[l, _k]*B[_k, h], (_k, 0, m - 1))
If I derive this with respect to the matrix element I end up getting 0 instead of an expression with the KroneckerDelta, which is the result that I would like to get:
diff(C[l,h], A[i,j])
>>>> 0
I wonder if maybe I shouldn't be using MatrixSymbols to start with. How should I go about implementing the behaviour that I want to get?
SymPy does not yet know matrix calculus; in particular, one cannot differentiate MatrixSymbol objects. You can do this sort of computation with Matrix objects filled with arrays of symbols; the drawback is that the matrix sizes must be explicit for this to work.
Example:
from sympy import *
A = Matrix(symarray('A', (4, 5)))
B = Matrix(symarray('B', (5, 3)))
C = A*B
print(C.diff(A[1, 2]))
outputs:
Matrix([[0, 0, 0], [B_2_0, B_2_1, B_2_2], [0, 0, 0], [0, 0, 0]])
The git version of SymPy (and the next version) handles this better:
In [55]: print(diff(C[l,h], A[i,j]))
Sum(KroneckerDelta(_k, j)*KroneckerDelta(i, l)*B[_k, h], (_k, 0, m - 1))
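For explicit sizes, the Kronecker-delta structure can be checked directly with the Matrix-of-symbols approach. The following is a small sketch, with the sizes n, m, p chosen arbitrarily for illustration, that compares each derivative of (A*B)[l, h] with respect to A[i, j] against KroneckerDelta(i, l)*B[j, h].

```python
from sympy import Matrix, symarray, KroneckerDelta, diff

n, m, p = 3, 4, 2  # explicit sizes, chosen only for this example
A = Matrix(symarray('A', (n, m)))
B = Matrix(symarray('B', (m, p)))
C = A * B

def matches_kronecker_form():
    """Check d C[l, h] / d A[i, j] == delta(i, l) * B[j, h] for all indices."""
    for l in range(n):
        for h in range(p):
            for i in range(n):
                for j in range(m):
                    expected = KroneckerDelta(i, l) * B[j, h]
                    if diff(C[l, h], A[i, j]) != expected:
                        return False
    return True

ok = matches_kronecker_form()
```

This only confirms the pattern for fixed sizes; it does not replace the symbolic result with m and n left as symbols.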
I have angular data on a domain that is wrapped at pi radians (i.e. 0 = pi). The data are 2D, where one dimension represents the angle. I need to interpolate this data onto another grid in a wrapped way.
In one dimension, the np.interp function takes a period kwarg (for NumPy 1.10 and later):
http://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html
This is exactly what I need, but I need it in two dimensions. I'm currently just stepping through columns in my array and using np.interp, but this is of course slow.
Anything out there that could achieve this same outcome but faster?
An explanation of how np.interp works
Use the source, Luke!
The numpy doc for np.interp makes the source particularly easy to find, since it has the link right there, along with the documentation. Let's go through this, line by line.
First, recall the parameters:
"""
x : array_like
The x-coordinates of the interpolated values.
xp : 1-D sequence of floats
The x-coordinates of the data points, must be increasing if argument
`period` is not specified. Otherwise, `xp` is internally sorted after
normalizing the periodic boundaries with ``xp = xp % period``.
fp : 1-D sequence of floats
The y-coordinates of the data points, same length as `xp`.
period : None or float, optional
A period for the x-coordinates. This parameter allows the proper
interpolation of angular x-coordinates. Parameters `left` and `right`
are ignored if `period` is specified.
"""
Let's take a simple example of a triangular wave while going through this:
xp = np.array([-np.pi/2, -np.pi/4, 0, np.pi/4])
fp = np.array([0, -1, 0, 1])
x = np.array([-np.pi/8, -5*np.pi/8]) # Peskiest points possible }:)
period = np.pi
Now, I start off with the period != None branch in the source code, after all the type-checking happens:
# normalizing periodic boundaries
x = x % period
xp = xp % period
This just ensures that all values of x and xp supplied are between 0 and period. So, since the period is pi, but we specified x and xp to be between -pi/2 and pi/2, this will adjust for that by adding pi to all values in the range [-pi/2, 0), so that they effectively appear after pi/2. So our xp now reads [pi/2, 3*pi/4, 0, pi/4].
asort_xp = np.argsort(xp)
xp = xp[asort_xp]
fp = fp[asort_xp]
This is just ordering xp in increasing order. This is especially required after performing that modulo operation in the previous step. So, now xp is [0, pi/4, pi/2, 3*pi/4]. fp has also been shuffled accordingly, [0, 1, 0, -1].
xp = np.concatenate((xp[-1:]-period, xp, xp[0:1]+period))
fp = np.concatenate((fp[-1:], fp, fp[0:1]))
return compiled_interp(x, xp, fp, left, right) # Paraphrasing a little
np.interp does linear interpolation. When trying to interpolate between two points a and b present in xp, it only uses the values of f(a) and f(b) (i.e., the values of fp at the corresponding indices). So what np.interp is doing in this last step is to take the point xp[-1] and put it in front of the array, and take the point xp[0] and put it after the array, but after subtracting and adding one period respectively. So you now have a new xp that looks like [-pi/4, 0, pi/4, pi/2, 3*pi/4, pi]. Likewise, fp[0] and fp[-1] have been concatenated around, so fp is now [-1, 0, 1, 0, -1, 0].
Note that after the modulo operations, x has been brought into the [0, pi) range too, so x is now [7*pi/8, 3*pi/8]. From the padded xp and fp above, it is easy to see that you'll get back [-0.5, 0.5].
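The walkthrough above can be checked by redoing the steps by hand and comparing against np.interp with the period argument. This is just a sketch of that comparison; the names ending in m (xm, xpm, fpm) are made up here to avoid overwriting the inputs.

```python
import numpy as np

xp = np.array([-np.pi/2, -np.pi/4, 0, np.pi/4])
fp = np.array([0, -1, 0, 1])
x = np.array([-np.pi/8, -5*np.pi/8])
period = np.pi

# What np.interp does for us when period is given
direct = np.interp(x, xp, fp, period=period)

# The same steps done manually
xm = x % period
xpm = xp % period
order = np.argsort(xpm)
xpm, fpm = xpm[order], fp[order]
# Pad one point on each side, shifted by one period
xpm = np.concatenate((xpm[-1:] - period, xpm, xpm[:1] + period))
fpm = np.concatenate((fpm[-1:], fpm, fpm[:1]))
manual = np.interp(xm, xpm, fpm)
```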
Now, coming to your 2D case:
Say you have a grid and some values. Let's just take all x values to lie in [0, pi) off the bat so we don't need to worry about modulos and shufflings.
xp = np.array([0, np.pi/4, np.pi/2, 3*np.pi/4])
yp = np.array([0, 1, 2, 3])
period = np.pi
# Put x on the 1st dim and y on the 2nd dim; f is linear in y
fp = np.array([0, 1, 0, -1])[:, np.newaxis] + yp[np.newaxis, :]
# >>> fp
# array([[ 0, 1, 2, 3],
# [ 1, 2, 3, 4],
# [ 0, 1, 2, 3],
# [-1, 0, 1, 2]])
We now know that all you need to do is to add xp[[-1]] in front of the array and xp[[0]] at the end, adjusting for the period. Note how I've indexed using the singleton lists [-1] and [0]. This is a trick to ensure that dimensions are preserved.
xp = np.concatenate((xp[[-1]]-period, xp, xp[[0]]+period))
fp = np.concatenate((fp[[-1], :], fp, fp[[0], :]))
Finally, you are free to use scipy.interpolate.interpn to achieve your result. Let's get the value at x = pi/8 for all y:
from scipy.interpolate import interpn
interp_points = np.hstack(( (np.pi/8 * np.ones(4))[:, np.newaxis], yp[:, np.newaxis] ))
result = interpn((xp, yp), fp, interp_points)
# >>> result
# array([ 0.5, 1.5, 2.5, 3.5])
interp_points has to be specified as an Nx2 matrix of points, where the first dimension indexes the points you want to interpolate at and the second dimension gives the x- and y-coordinates of each point. See this answer for a detailed explanation.
If you want to get the value outside of the range [0, period], you'll need to modulo it yourself:
x = 21 * np.pi / 8
x_equiv = x % period # Now within [0, period]
interp_points = np.hstack(( (x_equiv * np.ones(4))[:, np.newaxis], yp[:, np.newaxis] ))
result = interpn((xp, yp), fp, interp_points)
# >>> result
# array([-0.5, 0.5, 1.5, 2.5])
Again, if you want to generate interp_points for a bunch of x- and y- values, look at this answer.
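If the goal is only to replace the column-by-column np.interp loop (periodic along one axis, no interpolation along the other), the same padding idea can be vectorized with plain NumPy. The sketch below is one way to do it under those assumptions; the function name interp_periodic_columns is hypothetical, and it assumes the xp values are distinct modulo the period.

```python
import numpy as np

def interp_periodic_columns(x_new, xp, fp, period):
    """Periodic linear interpolation along axis 0 for all columns of fp at once."""
    x_new = np.asarray(x_new, dtype=float) % period
    xp = np.asarray(xp, dtype=float) % period
    order = np.argsort(xp)
    xp = xp[order]
    fp = np.asarray(fp, dtype=float)[order]
    # Pad one wrapped sample on each side, as np.interp does internally
    xp = np.concatenate((xp[-1:] - period, xp, xp[:1] + period))
    fp = np.concatenate((fp[-1:], fp, fp[:1]), axis=0)
    # Index of the left neighbour of every requested point
    idx = np.searchsorted(xp, x_new, side='right') - 1
    idx = np.clip(idx, 0, len(xp) - 2)
    # Fractional position between the left and right neighbours
    t = (x_new - xp[idx]) / (xp[idx + 1] - xp[idx])
    return fp[idx] * (1 - t)[:, None] + fp[idx + 1] * t[:, None]

xp = np.array([0, np.pi/4, np.pi/2, 3*np.pi/4])
yp = np.array([0, 1, 2, 3])
fp = np.array([0, 1, 0, -1])[:, np.newaxis] + yp[np.newaxis, :]
x_new = np.array([np.pi/8, 21*np.pi/8, -np.pi/8])
result = interp_periodic_columns(x_new, xp, fp, np.pi)
```

Each row of result corresponds to one entry of x_new and each column to one column of fp, so it should match what the per-column np.interp loop produces.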
I'm pretty new to Python, so I'm doing a project in it. Part of it includes a diffusion across a map. I'm implementing it by going through and making the current tile equal to 0.2 * the sum of its neighbors (n, w, s, e). If I was doing this in C, I'd just use a double for loop over an array, setting arr[i*width + j] from its neighbors (at j+1, j-1, i+1, i-1), and I'd have several different arrays to do the same thing for (different qualities of the map that I'd be changing). However, I'm not sure if this is really the fastest way in Python. Some people I have asked suggest things like NumPy, but the width probably won't be more than ~200 (so 40-50k elements max) and I wasn't sure if the overhead is worth it. I don't really know any builtin functions to do what I want. Any advice?
edit: This will be very dense i.e. every spot is going to have a non-trivial calculation
This is quite simple to arrange with NumPy. The function np.roll returns a copy of the array, "rolled" in a specified direction.
For example, given the array x,
x=np.arange(9).reshape(3,3)
# array([[0, 1, 2],
# [3, 4, 5],
# [6, 7, 8]])
you can roll the columns to the right with
np.roll(x,shift=1,axis=1)
# array([[2, 0, 1],
# [5, 3, 4],
# [8, 6, 7]])
Using np.roll, boundaries are wrapped like on a torus. If you do not want wrapped boundaries, you could pad the array with an edge of zeros, and reset the edge to zero before every iteration.
import numpy as np

def diffusion(arr):
    while True:
        arr += 0.2*np.roll(arr, shift=1, axis=1)   # right
        arr += 0.2*np.roll(arr, shift=-1, axis=1)  # left
        arr += 0.2*np.roll(arr, shift=1, axis=0)   # down
        arr += 0.2*np.roll(arr, shift=-1, axis=0)  # up
        yield arr

N = 5
initial = np.random.random((N, N))
for state in diffusion(initial):
    print(state)
    input()  # press Enter for the next step (raw_input() on Python 2)
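Note that the generator above updates arr in place, so each np.roll sees values that were already modified by the previous line. If you want every tile to become exactly 0.2 times the sum of its four original neighbours, as described in the question, a variant along the following lines may be closer. This is a sketch; the function name diffuse_step and the rate parameter are made up for illustration, and the boundaries wrap around.

```python
import numpy as np

def diffuse_step(arr, rate=0.2):
    """Return rate * (sum of the four wrapped neighbours) for every tile."""
    neighbours = (np.roll(arr, 1, axis=0) + np.roll(arr, -1, axis=0)
                  + np.roll(arr, 1, axis=1) + np.roll(arr, -1, axis=1))
    return rate * neighbours

grid = np.arange(12, dtype=float).reshape(3, 4)
out = diffuse_step(grid)
```

Because all four rolls are taken from the same unmodified array, the update is simultaneous for all tiles.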
Use convolution.
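import numpy as np
from scipy.signal import convolve2d
mapArr = np.array(map)  # `map` is your 2-D grid of values
kernel = np.array([[0,   0.2, 0],
                   [0.2, 0,   0.2],
                   [0,   0.2, 0]])
# mode='same' keeps the output the same shape as mapArr (the default is 'full')
diffused = convolve2d(mapArr, kernel, mode='same', boundary='wrap')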
Is this for the ants challenge? If so, in the ants context, convolve2d worked ~20 times faster than the loop, in my implementation.
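One detail worth checking when using convolve2d this way: it defaults to mode='full', which returns an array larger than the input, whereas mode='same' keeps the input shape. The following sketch, on a small made-up grid, compares the convolution with a wrapped four-neighbour sum built from np.roll.

```python
import numpy as np
from scipy.signal import convolve2d

grid = np.arange(20, dtype=float).reshape(4, 5)
kernel = np.array([[0,   0.2, 0],
                   [0.2, 0,   0.2],
                   [0,   0.2, 0]])

full = convolve2d(grid, kernel, boundary='wrap')               # default mode='full'
same = convolve2d(grid, kernel, mode='same', boundary='wrap')  # same shape as grid

# Reference: 0.2 * sum of the four wrapped neighbours
rolled = 0.2 * (np.roll(grid, 1, axis=0) + np.roll(grid, -1, axis=0)
                + np.roll(grid, 1, axis=1) + np.roll(grid, -1, axis=1))
```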
This modification to unutbu's code keeps the global sum of the array constant while diffusing its values:
import numpy as np

def diffuse(arr, d):
    contrib = (arr * d)
    w = contrib / 8.0
    r = arr - contrib
    N = np.roll(w, shift=-1, axis=0)
    S = np.roll(w, shift=1, axis=0)
    E = np.roll(w, shift=1, axis=1)
    W = np.roll(w, shift=-1, axis=1)
    NW = np.roll(N, shift=-1, axis=1)
    NE = np.roll(N, shift=1, axis=1)
    SW = np.roll(S, shift=-1, axis=1)
    SE = np.roll(S, shift=1, axis=1)
    diffused = r + N + S + E + W + NW + NE + SW + SE
    return diffused
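The conservation claim is easy to check numerically. The sketch below repeats the diffuse function so that it runs on its own, then compares the total before and after one step; the grid size, seed, and d = 0.5 are arbitrary choices for illustration.

```python
import numpy as np

def diffuse(arr, d):
    contrib = arr * d        # amount each tile gives away
    w = contrib / 8.0        # share sent to each of the 8 neighbours
    r = arr - contrib        # amount each tile keeps
    N = np.roll(w, shift=-1, axis=0)
    S = np.roll(w, shift=1, axis=0)
    E = np.roll(w, shift=1, axis=1)
    W = np.roll(w, shift=-1, axis=1)
    NW = np.roll(N, shift=-1, axis=1)
    NE = np.roll(N, shift=1, axis=1)
    SW = np.roll(S, shift=-1, axis=1)
    SE = np.roll(S, shift=1, axis=1)
    return r + N + S + E + W + NW + NE + SW + SE

rng = np.random.RandomState(0)
grid = rng.random_sample((6, 6))
after = diffuse(grid, 0.5)
total_before, total_after = grid.sum(), after.sum()
```

Since np.roll only permutes entries, each of the eight shifted copies of w has the same sum as w, which is why the total is preserved.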