Matlab Function equivalence in Python - python

I was writing code in Matlab and had to return a matrix of 0s and 1s that represents the elements of the original matrix.
I wanted to know if there is a Python equivalent of the code below that achieves the same result without nested loops.
c = [2; 1; 3]
temp = eye(3,3)
d = temp(c,:)
The d matrix needs to tell me which number was present in my original matrix.
For example, a 1 at i = 1, j = 2 tells me that the first element of the original matrix was 2.

The "direct" equivalent of that code is this (note the 0-indexing, compared to matlab's 1-indexing)
import numpy
c = numpy.array( [1, 0, 2] )
temp = numpy.eye( 3 )
d = temp[c, :]
The official numpy documentation describes this kind of indexing under 'index arrays'.
However, in general what you are doing above is called "one hot" encoding (or "one-of-K", as per Bishop2006). There are specialised methods for one hot encoding in the various machine learning toolkits, which confer some advantages, so you may prefer to look those up instead.
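As a minimal sketch of the one-hot idea, assuming the labels arrive as 1-based MATLAB-style values (the names `labels_1based`, `num_classes` and `one_hot` are illustrative and not from the question), the shift to 0-based indices and the row lookup can be done in one step:

```python
import numpy as np

# Hypothetical input: 1-based labels, as in the MATLAB snippet c = [2; 1; 3].
labels_1based = np.array([2, 1, 3])
num_classes = 3

# Subtract 1 to get 0-based indices, then pick the matching rows of the identity.
one_hot = np.eye(num_classes, dtype=int)[labels_1based - 1]
print(one_hot)

# Recover the original (1-based) labels from the one-hot rows.
recovered = one_hot.argmax(axis=1) + 1
print(recovered)
```

The same lookup works for any number of classes, as long as every label lies between 1 and `num_classes`.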

Related

Python: fast matrix multiplication with extra indices

I have two arrays, A and B, with dimensions (l,m,n) and (l,m,n,n), respectively. I would like to obtain an array C of dimensions (l,m,n) by treating A as a vector in its third index and B as a matrix in its third and fourth indices, and multiplying them for every pair of the first two indices. An easy way to do this is:
import numpy as np
#Define dimensions
l = 1024
m = l
n = 6
#Create some random arrays
A = np.random.rand(l,m,n)
B = np.random.rand(l,m,n,n)
C = np.zeros((l,m,n))
#Desired multiplication
for i in range(0,l):
    for j in range(0,m):
        C[i,j,:] = np.matmul(A[i,j,:],B[i,j,:,:])
It is, however, slow (about 3 seconds on my MacBook). What would be the fastest, fully vectorized way to do this?
Try to use einsum.
It has many use cases, check the docs: https://numpy.org/doc/stable/reference/generated/numpy.einsum.html
Or, for more info, a really good explanation can be also found at: https://ajcr.net/Basic-guide-to-einsum/
In your case, it seems like
np.einsum('dhi,dhij->dhj',A,B)
should work. Also, you can try the optimize=True flag to get more speed, if needed.
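A quick way to convince yourself that the subscripts are right is to compare against the loop on small arrays. The sketch below does that, and also shows an equivalent formulation with broadcasting matmul; the small sizes and the variable names `C_loop`, `C_einsum` and `C_matmul` are chosen here for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
l, m, n = 4, 5, 6  # small sizes so the reference loop is quick
A = rng.random((l, m, n))
B = rng.random((l, m, n, n))

# Reference: explicit loops, as in the question.
C_loop = np.zeros((l, m, n))
for i in range(l):
    for j in range(m):
        C_loop[i, j, :] = A[i, j, :] @ B[i, j, :, :]

# einsum: sum over the shared index i for every (d, h) pair.
C_einsum = np.einsum('dhi,dhij->dhj', A, B)

# Broadcasting matmul: treat A as a stack of 1 x n row vectors.
C_matmul = (A[:, :, None, :] @ B)[:, :, 0, :]

print(np.allclose(C_loop, C_einsum), np.allclose(C_loop, C_matmul))
```

Which of the two vectorized forms is faster depends on the shapes and the numpy build, so it is worth timing both on the real sizes.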

Low Performance unpacking byte array to data structure

I have performance trouble when loading data from a byte array into my internal data structure. The data contains several nested arrays and can be extracted as in the code below. In C it takes about one second when reading from a stream, but in Python it takes almost one minute. I guess indexing and calling int.from_bytes was not the best idea.
Does anybody have a suggestion to improve the performance?
...
ycnt = int.from_bytes(bytedat[idx:idx + 4], 'little')
idx += 4
while ycnt > 0:
    ky = int.from_bytes(bytedat[idx:idx + 4], 'little')
    idx += 4
    dv = DataObject()
    xvec.update({ky: dv})
    dv.x = int.from_bytes(bytedat[idx:idx + 4], 'little')
    idx += 4
    dv.y = int.from_bytes(bytedat[idx:idx + 4], 'little')
    idx += 4
    cntv = int.from_bytes(bytedat[idx:idx + 4], 'little')
    idx += 4
    while cntv > 0:
        dv.data_values.append(int.from_bytes(bytedat[idx:idx + 4], 'little', signed=True))
        idx += 4
        cntv -= 1
    dv.score = struct.unpack('d', bytedat[idx:idx + 8])[0]
    idx += 8
    ycnt -= 1
...
First, a factor of 60 between Python and C is normal for low-level code like this. This is not where Python shines, because it is not compiled down to machine code.
Micro-Optimizations
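The most obvious one is to reduce your integer math and slicing by using struct.unpack() properly. See the format string documentation. Something like this (four unsigned little-endian 32-bit integers, matching the four int.from_bytes calls in your loop; dv has to exist before this line):
ky, dv.x, dv.y, cntv = struct.unpack('<IIII', bytedat[idx:idx+4*4])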
The second one is to load your int arrays (if they are large) "in batch" instead of the (interpreted!) while cntv > 0 loop. I would use a numpy array:
numpy.frombuffer(bytedat[idx:idx + 4*cntv], dtype='int32')
Why not a list? A Python list contains (generic) Python objects. It requires extra memory and pointer indirection for each item. Libraries cannot use optimized C code (for example to calculate the sum) because each item first has to be dereferenced and then checked for its type.
A numpy array, on the other hand, is basically a wrapper to manage the memory of a C array. Loading it will probably boil down to a memcpy(), or it may even just reference the bytes memory you passed.
And thirdly, instead of xvec.update({ky: dv}) you can write xvec[ky] = dv. This avoids the creation of a temporary dict object.
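Putting the three micro-optimizations together might look like the sketch below. It assumes the record layout implied by the loop in the question (a count, then per record four unsigned 32-bit integers, `cntv` signed 32-bit values and one double), and it assumes everything is little-endian, including the double, which the original code reads in native byte order. `DataObject` here is a hypothetical stand-in for the class in the question, and `parse` and `sample` are names made up for this example:

```python
import struct
import numpy as np


class DataObject:
    """Hypothetical stand-in for the DataObject class in the question."""

    def __init__(self):
        self.x = 0
        self.y = 0
        self.data_values = None
        self.score = 0.0


HEADER = struct.Struct('<4I')  # ky, x, y, cntv: four unsigned little-endian ints
SCORE = struct.Struct('<d')    # one little-endian double (assumed byte order)


def parse(bytedat, idx=0):
    """Parse all records starting at idx; return the dict and the end offset."""
    xvec = {}
    (ycnt,) = struct.unpack_from('<I', bytedat, idx)
    idx += 4
    for _ in range(ycnt):
        dv = DataObject()
        # unpack_from reads in place, so no temporary slice is created.
        ky, dv.x, dv.y, cntv = HEADER.unpack_from(bytedat, idx)
        idx += HEADER.size
        # Read all cntv signed 32-bit values at once, without a Python loop.
        dv.data_values = np.frombuffer(bytedat, dtype='<i4', count=cntv, offset=idx)
        idx += 4 * cntv
        (dv.score,) = SCORE.unpack_from(bytedat, idx)
        idx += SCORE.size
        xvec[ky] = dv
    return xvec, idx


# Build a tiny sample buffer with the same layout to exercise the parser.
sample = (struct.pack('<I', 1) + struct.pack('<4I', 7, 10, 20, 3)
          + struct.pack('<3i', -1, 2, -3) + struct.pack('<d', 0.5))
xvec, end = parse(sample)
print(xvec[7].x, xvec[7].y, xvec[7].data_values, xvec[7].score, end)
```

Note that an array created by `numpy.frombuffer` from a `bytes` object is read-only and shares memory with the buffer; call `.copy()` on it if the values need to be modified later.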
Compiling your Python-Code
There are ways to compile Python (partially) down to machine code (PyPy, Numba, Cython). It's a bit involved, but your original byte-indexing code would then run at C speed.
However, you are filling a Python list and a dict in the inner loop. This is never going to get "C"-like fast because it will have to deal with Python objects and reference counting, even when it gets compiled down to C.
Different file format
The easiest way is to use a data format handled by a fast specialized library (like numpy, HDF5, pillow, maybe even pandas).
The pickle module may also help, but only if you can control the writing and everything is trusted, and you mainly care about loading speed.
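As one possible illustration of the "different file format" route, the sketch below stores the nested data as a few flat numpy arrays in a single .npz archive. The column layout (`keys`, `scores`, `values`, `counts`) is an assumption made up for this example, and an in-memory buffer stands in for a real file:

```python
import io
import numpy as np

# Hypothetical flat layout: one entry per record, plus all values concatenated.
keys = np.array([7, 9], dtype=np.uint32)
scores = np.array([0.5, 1.5])
values = np.array([-1, 2, -3, 4], dtype=np.int32)  # all data_values back to back
counts = np.array([3, 1])                          # number of values per record

# Write everything into one .npz archive (a BytesIO stands in for a file).
buf = io.BytesIO()
np.savez(buf, keys=keys, scores=scores, values=values, counts=counts)

# Load it back and split the concatenated values into one array per record.
buf.seek(0)
with np.load(buf) as loaded:
    loaded_keys = loaded['keys']
    per_record = np.split(loaded['values'], np.cumsum(loaded['counts'])[:-1])

print(loaded_keys, per_record)
```

Whether this is practical depends on whether you control the program that writes the data.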
I do something similar, but big-endian.
I find
(byte1 << 8) | byte2
to be faster than int.from_bytes() and struct.unpack().
I also find pypy3 to be at least 4x faster than python3 for this sort of stuff.

Making the nested loop run faster, e.g. by vectorization in Python

I have a 2 * N integer array ids representing intervals, where N is about a million. It looks like this
0 2 1 ...
3 4 3 ...
The ints in the arrays can be 0, 1, ... , M-1, where M <= 2N. (Detail: if M = 2N, then the ints span all the 2N integers; if M < 2N, then there are some integers that have the same values.)
I need to calculate a kind of inverse map from ids. By "inverse map" I mean treating the columns of ids as intervals and mapping each point inside an interval back to the indices of the intervals that contain it.
Intuition
Intuitively,
0 2 1
3 4 3
can be seen as
0 -> 0, 1, 2
1 -> 2, 3
2 -> 1, 2
where the right-hand-side endpoints are excluded for my problem. The "inverse" map would be
0 -> 0
1 -> 0, 2
2 -> 0, 1, 2
3 -> 1
Code
I have a piece of Python code that attempts to calculate the inverse map in a dictionary inv below:
for i in range(ids.shape[1]):
    for j in range(ids[0][i], ids[1][i]):
        inv[j].append(i)
where each inv[j] is an array-like container initialized as empty before the nested loop. Currently I use Python's built-in arrays to initialize it.
for i in range(M): inv[i]=array.array('I')
Question
The nested loop above performs poorly. In my problem setting (in image processing), the first loop has a million iterations and the second one about 3000 iterations. Not only does it take a lot of memory (because inv is huge), it is also slow. I would like to focus on speed in this question. How can I accelerate the nested loop above, e.g. with vectorization?
You could try the option below, in which your outer loop is hidden away inside numpy's apply_along_axis(). Note that apply_along_axis() is itself a loop implemented in Python rather than in C, so I am not sure about the performance benefit; only a test at a decent scale can tell (especially as there's some initial overhead involved in converting lists to numpy arrays):
import numpy as np
import array
ids = [[0,2,1],[3,4,3]]
ids_arr = np.array(ids) # Convert to numpy array. Expensive operation?
range_index = 0 # Initialize. To be bumped up by each invocation of my_func()
inv = {}
for i in range(np.max(ids_arr)):
    inv[i] = array.array('I')
def my_func(my_slice):
    global range_index
    for i in range(my_slice[0], my_slice[1]):
        inv[i].append(range_index)
    range_index += 1
np.apply_along_axis(my_func, 0, ids_arr)
print(inv)
Output:
{0: array('I', [0]), 1: array('I', [0, 2]), 2: array('I', [0, 1, 2]),
3: array('I', [1])}
Edit:
I feel that using a dictionary might not be a good idea here. I suspect that in this particular context, dictionary-indexing might actually be slower than numpy array indexing. Use the below lines to create and initialize inv as a numpy array of Python arrays. The rest of the code can remain as-is:
inv_len = np.max(ids_arr)
inv = np.empty(shape=(inv_len,), dtype=array.array)
for i in range(inv_len):
    inv[i] = array.array('I')
(Note: This assumes that your application isn't doing dict-specific stuff on inv, such as inv.items() or inv.keys(). If it is, you might need an extra step to convert the numpy array into a dict.)
You can avoid the for loop with pandas. This is just a sample:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    "A": np.random.randint(0, 100, 100000),
    "B": np.random.randint(0, 100, 100000)
})
df.groupby("B")["A"].agg(list)
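To connect the loop-free idea to the interval data in the question, here is a hedged numpy-only sketch (not part of the answer above) that expands every interval into (point, interval index) pairs and then groups them by point. All variable names are chosen for this example. Be aware that it materializes one entry per pair, so with a million intervals of about 3000 points each it would need far more memory than is likely available; it is only meant to show the vectorized grouping on the toy input:

```python
import numpy as np

ids = np.array([[0, 2, 1],
                [3, 4, 3]])
starts, ends = ids
lengths = ends - starts  # number of points in each half-open interval

# Interval index repeated once per point it contains.
interval_idx = np.repeat(np.arange(ids.shape[1]), lengths)

# Position of each point inside its own interval: 0, 1, ..., length - 1.
first_pos = np.cumsum(lengths) - lengths
offsets = np.arange(lengths.sum()) - np.repeat(first_pos, lengths)
points = np.repeat(starts, lengths) + offsets

# Group the interval indices by point with a stable sort and a split.
order = np.argsort(points, kind="stable")
counts = np.bincount(points)
groups = np.split(interval_idx[order], np.cumsum(counts)[:-1])

inv = {point: group.tolist() for point, group in enumerate(groups)}
print(inv)
```

The stable sort keeps the interval indices in increasing order within each point, which matches the order produced by the nested loop.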
Since N is large, I've come up with what seems like a practical approach; let me know if there are any flaws.
1. Store the ith interval [x,y] as [x,y,i]. Sort the intervals by their start and end points. This should take O(NlogN) time.
2. Create a frequency array freq[2*N+1]. For each interval, update the frequency using a range update (difference array) in O(1) per update. Generating the frequencies from these updates then takes O(N).
3. Determine a threshold based on your data. According to that value, each element is classified as either sparse or frequent. For sparse elements, do nothing. For frequent elements only, store the intervals in which they occur.
4. During lookup, if the element is frequent, you can directly access the pre-computed list. If the element is sparse, you can search the intervals in O(logN) time, since the intervals are sorted and their indices were attached in step 1.
This seems like a practical approach to me; the rest depends on your usage, such as the amortized time complexity you need per query.
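The range update used to build the frequency array can be sketched as follows. This is only an illustration of that one step on the toy input from the question, with half-open intervals and variable names (`diff`, `freq`) chosen here:

```python
import numpy as np

ids = np.array([[0, 2, 1],
                [3, 4, 3]])
starts, ends = ids

# Difference array: +1 where an interval starts, -1 just past its last point.
# np.add.at is used because several intervals may share a start or an end.
diff = np.zeros(int(ends.max()) + 1, dtype=np.int64)
np.add.at(diff, starts, 1)
np.add.at(diff, ends, -1)

# A running sum turns the O(1) updates into per-point frequencies.
freq = np.cumsum(diff)[:-1]
print(freq)
```

Each entry of `freq` is the number of intervals that contain that point, which is also the length of the corresponding list in the inverse map.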

Conjugate transpose of self using numpy syntax

I am trying to translate this MATLAB code into Python.
The following is the code:
Y=C*Up(:,1:p-1)'*Y;
And this is my translation thus far:
Y = C * Up[:, 1:p-1] * Y
I am having trouble with the syntax for the conjugate transpose of self that is used in the MATLAB code. I am not certain that my first idea:
Y = C * Up[:, 1:p-1].getH() * Y
would be correct.
Does anyone have any ideas?
I am not very experienced with numpy, but based on the comments of @hpaulj I can suggest the following:
If you don't want to be subject to the limitations of numpy.matrix objects (see warning here), you can define your own function for doing a conjugate transpose. All you need to do is transpose your array, then subtract 2j times the imaginary part of the result from the result. I am not sure how computationally efficient this is, but it should give the correct result.
I'd expect something like this to work:
Y = C * ctranspose(Up[:, 0:p-1]) * Y
...
def ctranspose(arr: np.ndarray) -> np.ndarray:
    # Explanation of the math involved:
    # x == Real(X) + j*Imag(X)
    # conj_x == Real(X) - j*Imag(X)
    # conj_x == Real(X) + j*Imag(X) - 2j*Imag(X) == x - 2j*Imag(X)
    tmp = arr.transpose()
    return tmp - 2j*tmp.imag
(Solution is for Python 3)
A more elegant solution based on the comment by @AndrasDeak:
Y = C * Up[:, 0:p-1].conj().T * Y
Note also two differences related to indexing between Python and MATLAB:
Python is 0-based (i.e. the first index of an array is 0, unlike in MATLAB where it's 1).
Slicing in Python is inclusive:exclusive, unlike in MATLAB where it's inclusive:inclusive.
Therefore, when we want to access the first 3 elements of a vector in MATLAB we'd write:
res = vec(1:3);
In Python we'd write:
res = vec[0:3] # or [:3]
(Again, credits to @Andras for this explanation)
Use arr.conj().T to get the conjugate transpose of a matrix.
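A small check of the translation on plain ndarrays is sketched below. The shapes and random data are assumptions chosen only so that the products are well defined. One point worth stating: MATLAB's `*` is matrix multiplication, which for ndarrays is the `@` operator (`*` would multiply element-wise), whereas for numpy.matrix objects `*` already means matrix multiplication.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4
# Hypothetical shapes chosen only so that the products are well defined.
Up = rng.random((5, 6)) + 1j * rng.random((5, 6))
C = rng.random((2, p - 1))
Y = rng.random((5, 3))

# MATLAB: Up(:,1:p-1)'  ->  first p-1 columns, conjugate-transposed.
UpH = Up[:, :p - 1].conj().T  # shape (p-1, 5)

# MATLAB's * is matrix multiplication; for ndarrays that is the @ operator.
Y_new = C @ UpH @ Y  # shape (2, 3)

# The "subtract 2j * imag" formulation gives the same matrix.
tmp = Up[:, :p - 1].T
same = np.allclose(UpH, tmp - 2j * tmp.imag)
print(Y_new.shape, same)
```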

Matlab to Python sparse matrix conversion , overcoming the zero index problem

I have an N x N sparse matrix in Matlab whose cell values are indexed by (r,c) pairs, where r and c are unique IDs.
The problem is that after converting this matrix into Python, all of the index values are decremented by 1.
For example:
Before After
(210058,10326) = 1 (210057,10325) = 1
Currently, I am doing the following to counter this:
from scipy import io, sparse
mat_contents = io.loadmat(filename)  # loadmat lives in scipy.io, not scipy.sparse
G = mat_contents['G']
I,J = G.nonzero()
I += 1
J += 1
V = G.data
G = sparse.csr_matrix((V,(I,J)))
I have also tried using different options in scipy.io.loadmat {matlab_compatible, mat_dtype}, but neither worked.
I am looking for a solution that will give me the same indices as the Matlab matrix. Solutions that do not require reconstructing the matrix would be ideal, but I am also curious how others have gotten around this problem.
Thank you all for the good advice.
I decided to stick with Python. I do most of my data transfers between Matlab and Python
using text files now.
