Convert a MATLAB matrix object to a python NumPy array

I want to call some Python code from MATLAB. To do this, I need to convert a matrix object to a NumPy ndarray through the MATLAB function py.numpy.array. However, simply passing the matrix object to the function does not work. For now I have solved the problem by converting the matrix to a cell array containing the rows of the matrix. For example:
function ndarray = convert(mat)
% This conversion fails
ndarray = py.numpy.array(mat)
% This conversion works
cstr = cell(1, size(mat, 1));
for row = 1:size(mat, 1)
cstr(row) = {mat(row, :)};
end
ndarray = py.numpy.array(cstr);
I was wondering whether there is a more efficient solution.

Assuming your array contains double values, the error tells us exactly what we should do:
A = magic(3);
%% Attempt 1:
try
npA = py.numpy.array(A);
% Result:
% Error using py.numpy.array
% Conversion of MATLAB 'double' to Python is only supported for 1-N vectors.
catch
end
%% Attempt 2:
npA = py.numpy.array(A(:).');
% Result: OK!
Then:
>> whos npA
Name Size Bytes Class Attributes
npA 1x1 8 py.numpy.ndarray
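Afterwards you can use numpy.reshape to get the original shape back, either directly in MATLAB or in Python. Because A(:) flattens the matrix in column-major order, pass order='F' to reshape so that the elements land in their original positions.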
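A minimal sketch of the reshape step on the Python side. The flat vector below is written out by hand as a stand-in for what A(:).' produces for magic(3); the variable names are only illustrative.

```python
import numpy as np

# magic(3) in MATLAB, written out for comparison
A = np.array([[8, 1, 6],
              [3, 5, 7],
              [4, 9, 2]], dtype=float)

# Stand-in for the 1-by-9 vector A(:).' : MATLAB flattens column by column
flat = np.array([8, 3, 4, 1, 5, 9, 6, 7, 2], dtype=float)

# order='F' (Fortran / column-major) undoes the column-wise flattening
restored = np.reshape(flat, (3, 3), order='F')

# The default order='C' would fill row by row and give the transpose instead
wrong = np.reshape(flat, (3, 3))
```

With the default row-major order the result is the transpose of the original matrix, which is easy to miss when the matrix is square.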

Actually, using Python 2.7 and MATLAB R2018b, it worked by simply doing:
pyvar = py.numpy.array(var);
MATLAB tells me that if I want to convert the NumPy array back to a MATLAB variable, I can just use double(pyvar).
By the way, it did not work with Python 3.7, nor with an older version of MATLAB. I don't know why, but I thought this might be helpful.

Related

object arrays are not currently supported

I'm not sure why I'm getting this error. (I'm using python 3 on canopy)
Here is a sample of my code:
import numpy as np
import pandas as pd

dat = pd.read_csv('myfile', sep=',')
dat = dat.values
P = np.ones((240, 1))
Y = dat[:, 15]
for i in range(2, 15):
    X = np.column_stack((P, dat[:, i]))
    Xt = np.transpose(X)
    p = np.matmul(Xt, X)
I originally loaded my data as a DataFrame but changed it to a NumPy array. When I check the types of the elements in X they are all floats, and X is created without problems. However, when I try to use matmul on it, I get the error
TypeError: object arrays are not currently supported
I'm confused as all the elements seem to be floats but it won't multiply.
Thanks
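One possible cause, offered as a guess rather than a diagnosis: if the CSV contains any non-numeric column, dat.values has dtype object, and every array built from it inherits that dtype even when the individual elements are Python floats. The sketch below uses a small made-up array in place of the real file and converts X to a float array before multiplying.

```python
import numpy as np

# Hypothetical stand-in for dat.values from a CSV with one text column
dat = np.array([[1.0, "a", 2.5],
                [3.0, "b", 4.5]], dtype=object)

P = np.ones((2, 1))
X = np.column_stack((P, dat[:, 2]))

# Each element is a float, but the array itself has dtype object
element_types = {type(v) for v in X.ravel()}
dtype_before = X.dtype

# Converting to a real float array gives matmul a numeric dtype to work with
X = X.astype(float)
p = np.matmul(X.T, X)
```

Checking X.dtype rather than the type of individual elements is what reveals the problem in this scenario.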

How to convert cartesian coordinates to complex numbers in numpy

I have an array of Cartesian coordinates
xy = np.array([[0,0], [2,3], [3,4], [2,5], [5,2]])
which I want to convert into an array of complex numbers representing the same:
c = np.array([0, 2+3j, 3+4j, 2+5j, 5+2j])
My current solution is this:
c = np.sum(xy * [1,1j], axis=1)
This works but seems crude to me, and probably there is a nicer version with some built-in magic using np.complex() or similar, but the only way I found to use this was
c = np.array(list(map(lambda c: np.complex(*c), xy)))
This doesn't look like an improvement.
Can anybody point me to a better solution, maybe using one of the many numpy functions I don't know by heart (is there a numpy.cartesian_to_complex() working on arrays I haven't found yet?), or maybe using some implicit conversion when applying a clever combination of operators?

Recognize that complex128 is just a pair of floats. You can then do this using a "view" which is free, after converting the dtype from int to float (which I'm guessing your real code might already do):
xy.astype(float).view(np.complex128)
The astype() converts the integers to floats, which requires construction of a new array, but once that's done the view() is "free" in terms of runtime.
The above gives you shape=(n,1); you can np.squeeze() it to remove the extra dimension. This is also just a view operation, so takes basically no time.
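A short sketch of the view approach described above, using the xy array from the question. The intermediate names as_float and c_col are only illustrative.

```python
import numpy as np

xy = np.array([[0, 0], [2, 3], [3, 4], [2, 5], [5, 2]])

# astype(float) makes a new float64 array (the only copy in this pipeline)
as_float = xy.astype(float)

# Each row of two float64 values is reinterpreted as one complex128 value
c_col = as_float.view(np.complex128)   # shape (5, 1)

# Drop the trailing length-1 axis; this is still a view on as_float
c = np.squeeze(c_col, axis=1)          # shape (5,)
```

The view only works because each row is stored contiguously as (real, imaginary); a transposed or strided array would need a copy first.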

How about
c=xy[:,0]+1j*xy[:,1]
xy[:,0] will give an array of all elements in the 0th column of xy and xy[:,1] will give that of the 1st column.
Multiply xy[:,1] with 1j to make it imaginary and then add the result with xy[:,0].
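For completeness, the column arithmetic applied to the array from the question, as a small runnable sketch:

```python
import numpy as np

xy = np.array([[0, 0], [2, 3], [3, 4], [2, 5], [5, 2]])

# Real part from column 0, imaginary part from column 1
c = xy[:, 0] + 1j * xy[:, 1]
```

Multiplying the integer column by 1j promotes the result to complex128, so no explicit dtype conversion is needed here.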

Why I can't use theano.tensor.argmax and theano.tensor.mean correctly

I am learning Theano now, but I keep running into problems. My code is as follows:
import theano
from numpy import *
import theano.tensor as T
a = [1,2,3,4]
b = [7,8,9,10]
print T.argmax(a)
I thought it would print the index of '4', but the result is:
argmax
What's more, when I use T.neq() as follows:
import theano
from numpy import *
import theano.tensor as T
a = [1,2,3,4]
b = [7,8,9,10]
print T.neq(a,b)
the result shows:
Elemwise{neq,no_inplace}.0
I'm really new to this and have no idea what is wrong. Did I miss anything? Thank you in advance.

T.argmax() is expecting a Theano TensorVariable type. Some of the types of Variables used in Theano are listed here. Don't let the name "fully typed constructors" scare you. Think about them more in terms of what type of data you want to use as your input. Are you using float matrices? Then the relevant TensorVariable type is probably "fmatrix." Are you dealing with batches of RGB image data? Then the relevant TensorVariable type is probably "tensor4."
In your code, we are trying to input a list type into T.argmax(). So from the above point of view, that isn't going to work. Also, note that type(T.argmax(a)) is a theano.tensor.var.TensorVariable type. So it is expecting a TensorVariable as input, and it outputs a TensorVariable type as well. So this isn't going to return the actual argmax.
Okay, so what does work? How can we do this computation in Theano?
Let's first identify the type of data you want to deal with. This is going to be the starting point of the computational graph that we will be building. In this case, it looks like we want to deal with arrays or vectors. Theano has an ivector type, which is a vector of integers, and an fvector type, which is a vector of float32 values. Let's stick with your data and use ivector since we have integer values:
x = T.ivector('input')
This line just created a TensorVariable x that represents our intended input type, an array of integers.
Now let's define a TensorVariable for the argmax of the elements of x:
y = T.argmax(x)
So far we have built a computational graph, which is expecting an array of integers as input and will output the argmax of that array. However, in order to actually do this, we have to compile this into a function:
get_argmax = theano.function([x], y)
The theano.function syntax can be found here.
Think of this function as now actually performing the computation that we have defined using x and y.
When I execute:
get_argmax([1,2,3,4,19,1])
It returns:
array(4)
So what did we really do? By defining Theano variables and using theano.tensor functions, we build a computational graph. We then used theano.function to compile a function that actually performs that computation on actual inputs that we specify.
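To make the "build a graph, then compile it" idea concrete without needing Theano installed, here is a toy illustration in plain Python and NumPy. It is not how Theano is implemented; Var, Op, and function are hypothetical names that only mimic the shape of the workflow.

```python
import numpy as np

class Var:
    """Toy symbolic placeholder (hypothetical, not Theano's TensorVariable)."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        # Look up the concrete value supplied at call time
        return np.asarray(env[self.name])

class Op:
    """Toy graph node: remembers a NumPy function and its symbolic inputs."""
    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def evaluate(self, env):
        return self.fn(*(node.evaluate(env) for node in self.inputs))

def argmax(x):
    return Op(np.argmax, x)      # builds a node, computes nothing yet

def function(inputs, output):
    """Return a callable that binds real values to the inputs and evaluates."""
    def compiled(*values):
        env = {var.name: val for var, val in zip(inputs, values)}
        return output.evaluate(env)
    return compiled

x = Var('input')
y = argmax(x)                    # y is a graph node, not a number
get_argmax = function([x], y)
result = get_argmax([1, 2, 3, 4, 19, 1])
```

Printing y here would show an object, not an index, which mirrors why print T.argmax(a) shows the name of an operation rather than a value.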
To end: how to do the not equals operation?
a = T.ivector('a')
b = T.ivector('b')
out = T.neq(a,b)
get_out = theano.function([a,b], out)
print get_out([1,2,3,4], [7,8,9,10])
will return:
[1,1,1,1]
One of the key conceptual differences is that I treat a and b as Theano TensorVariables, rather than assigning them explicit values.
You'll get the hang of it. Just remember that you need to define your computation in terms of Theano TensorVariables, and then to actually "use it" you have to compile it using theano.function.

Issue converting Matlab sparse() code to numpy/scipy with csc_matrix()

I'm a bit of a newbie to both Matlab and Python so, many apologies if this question is a bit dumb...
I'm trying to convert some Matlab code over to Python using numpy and scipy and things were going fine until I reached the sparse matrix that someone wrote. The Matlab code goes like:
unwarpMatrix = sparse(phaseOrigin, ceil([1:nRead*nSlice*nPhaseDmap]/expan), 1, numPoints, numPoints)/expan;
Here's my python code (with my thought process) leading up to my attempt at conversion. For a given dataset I was testing with (in both Matlab and Python):
nread = 64
nslice = 28
nphasedmap = 3200
expan = 100
numpoints = 57344
Thus, the phaseorigin, s, and j arrays each have length 5734400 (and I've confirmed that the functions that create my phaseorigin array output exactly the same result as Matlab).
#Matlab sparse takes: S = sparse(i,j,s,m,n)
#Generates an m by n sparse matrix such that: S(i(k),j(k)) = s(k)
#scipy csc matrix takes: csc_matrix((data, ij), shape=(M, N))
#Matlab code is: unwarpMatrix = sparse(phaseOrigin, ceil([1:nRead*nSlice*nPhaseDmap]/expan), 1, numPoints, numPoints)/expan;
size = nread*nslice*nphasedmap
#i would be phaseOrigin variable
j = np.ceil(np.arange(1,size+1, dtype=np.double)/expan)
#Matlab apparently treats '1' as a scalar so I should be tiling 1 to the same size as j and phaseorigin
s = np.tile(1,size)
unwarpmatrix = csc_matrix((s,(phaseorigin, j)), shape=(numpoints,numpoints))/expan
so when I try to run my python code I get:
ValueError: column index exceedes matrix dimensions
This doesn't occur when I run the Matlab code even though the array sizes are larger than the defined matrix size...
What am I doing wrong? I've obviously screwed something up... Thanks very much in advance for any help!

The problem is that Python indexes start from 0, whereas Matlab indexes start from 1. So for an array of size 57344, in Python the first element would be arr[0] and the last element would be arr[57343].
Your variable j has values from 1 to 57344, so its largest value is out of range as a 0-based column index. Creating your j like this would solve the problem:
j = np.floor(np.arange(0,size, dtype=np.double)/expan)
Still, better to check this before using...
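A quick way to do that check is to compare the two index formulas on small numbers. The sizes below are made up for illustration and stand in for the real ones in the question.

```python
import numpy as np

# Small hypothetical sizes in place of 5734400, 100 and 57344
size, expan, numpoints = 12, 4, 3

# Matlab-style 1-based column indices: ceil([1:size]/expan)
j_matlab = np.ceil(np.arange(1, size + 1, dtype=np.double) / expan)

# 0-based version from the answer, cast to integers for use as indices
j_python = np.floor(np.arange(0, size, dtype=np.double) / expan).astype(int)

# The 0-based indices are exactly the Matlab ones shifted down by one
shift_matches = np.array_equal(j_python, j_matlab.astype(int) - 1)
```

If phaseorigin was ported so that it still holds 1-based row indices, it would presumably need the same shift (phaseorigin - 1) before being passed to csc_matrix; that is an assumption about how the array was generated, so it is worth verifying against the Matlab output.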

What is the equivalent of 'fread' from Matlab in Python?

I have practically no knowledge of Matlab, and need to translate some parsing routines into Python. They are for large files, that are themselves divided into 'blocks', and I'm having difficulty right from the off with the checksum at the top of the file.
What exactly is going on here in Matlab?
status = fseek(fid, 0, 'cof');
fposition = ftell(fid);
disp(' ');
disp(['** Block ',num2str(iBlock),' File Position = ',int2str(fposition)]);
% ----------------- Block Start ------------------ %
[A, count] = fread(fid, 3, 'uint32');
if(count == 3)
magic_l = A(1);
magic_h = A(2);
block_length = A(3);
else
if(fposition == file_length)
disp(['** End of file OK']);
else
disp(['** Cannot read block start magic ! Note File Length = ',num2str(file_length)]);
end
ok = 0;
break;
end
fid is the file currently being looked at
iBlock is a counter for which 'block' you're in within the file
magic_l and magic_h are to do with checksums later, here is the code for that (follows straight from the code above):
disp(sprintf(' Magic_L = %08X, Magic_H = %08X, Length = %i', magic_l, magic_h, block_length));
correct_magic_l = hex2dec('4D445254');
correct_magic_h = hex2dec('43494741');
if(magic_l ~= correct_magic_l | magic_h ~= correct_magic_h)
disp(['** Bad block start magic !']);
ok = 0;
return;
end
remaining_length = block_length - 3*4 - 3*4; % We read Block Header, and we expect a footer
disp(sprintf(' Remaining Block bytes = %i', remaining_length));
What is going on with the %08X and the hex2dec stuff?
Also, why specify 3*4 instead of 12?
Really though, I want to know how to replicate [A, count] = fread(fid, 3, 'uint32'); in Python, as io.readline() is just pulling the first 3 characters of the file. Apologies if I'm missing the point somewhere here. It's just that using io.readline(3) on the file seems to return something it shouldn't, and I don't understand how the block_length can fit in a single byte when it could potentially be very long.
Thanks for reading this ramble. I hope you can understand kind of what I want to know! (Any insight at all is appreciated.)

Python Code for Reading a 1-Dimensional Array
When replacing Matlab with Python, I wanted to read binary data into a numpy.array, so I used numpy.fromfile to read the data into a 1-dimensional array:
import numpy as np
with open(inputfilename, 'rb') as fid:
    data_array = np.fromfile(fid, np.int16)
Some advantages of using numpy.fromfile versus other Python solutions include:
Not having to manually determine the number of items to be read. You can specify them using the count= argument, but it defaults to -1 which indicates reading the entire file.
Being able to specify either an open file object (as I did above with fid) or you can specify a filename. I prefer using an open file object, but if you wanted to use a filename, you could replace the two lines above with:
data_array = np.fromfile(inputfilename, np.int16)
Matlab Code for a 2-Dimensional Array
Matlab's fread has the ability to read the data into a matrix of form [m, n] instead of just reading it into a column vector. For instance, to read data into a matrix with 2 rows use:
fid = fopen(inputfilename, 'r');
data_array = fread(fid, [2, inf], 'int16');
fclose(fid);
Equivalent Python Code for a 2-Dimensional Array
You can handle this scenario in Python using NumPy's reshape and transpose.
import numpy as np
with open(inputfilename, 'rb') as fid:
    data_array = np.fromfile(fid, np.int16).reshape((-1, 2)).T
The -1 tells numpy.reshape to infer the length of that dimension from the other dimension, which is the equivalent of Matlab's inf.
The .T transposes the array so that it is a 2-dimensional array whose first dimension (axis 0) has a length of 2.
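The following sketch checks the reshape-and-transpose recipe end to end. It writes six int16 values to a temporary file (made-up test data) and reads them back the way the answer describes.

```python
import os
import tempfile

import numpy as np

# Six int16 samples written as raw bytes to a temporary file
samples = np.arange(1, 7, dtype=np.int16)
with tempfile.NamedTemporaryFile(suffix='.bin', delete=False) as tmp:
    tmp.write(samples.tobytes())
    path = tmp.name

try:
    with open(path, 'rb') as fid:
        # Pairs of consecutive values become the columns of a 2-row array
        data_array = np.fromfile(fid, np.int16).reshape((-1, 2)).T
finally:
    os.remove(path)
```

Each column of data_array holds two consecutive values from the file, which is the same layout Matlab's fread(fid, [2, inf], 'int16') produces by filling column by column.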

From the documentation of fread, it is a function to read binary data. The second argument specifies the size of the output vector, the third one the size/type of the items read.
In order to recreate this in Python, you can use the array module:
import array
f = open(..., 'rb')  # binary mode
a = array.array("I") # I is unsigned int, usually 4 bytes (check a.itemsize); L is 8 bytes on many 64-bit systems
a.fromfile(f, 3)
This will read three uint32 values from the file f, which are available in a afterwards. From the documentation of fromfile:
Read n items (as machine values) from the file object f and append them to the end of the array. If less than n items are available, EOFError is raised, but the items that were available are still inserted into the array. f must be a real built-in file object; something else with a read() method won’t do.
Arrays implement the sequence protocol and therefore support the same operations as lists, but you can also use the .tolist() method to create a normal list from the array.
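Another option, sketched here under the assumption that the file stores the values in little-endian byte order, is the standard struct module, which fixes the element size and byte order explicitly. The in-memory BytesIO object and the block length of 100 are made up to keep the example self-contained.

```python
import io
import struct

# Stand-in for the real file: three little-endian uint32 values plus payload.
# Byte order and the block length 100 are assumptions for this sketch.
header = struct.pack('<3I', 0x4D445254, 0x43494741, 100)
fid = io.BytesIO(header + b'payload')

raw = fid.read(3 * 4)       # 3 elements of 4 bytes each
count = len(raw) // 4       # number of complete uint32 values actually read

if count == 3:
    # '<3I' means: little-endian, three unsigned 32-bit integers
    magic_l, magic_h, block_length = struct.unpack('<3I', raw)
```

Like Matlab's count output, the count variable lets the caller detect a short read at the end of the file before unpacking.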

Really though, I want to know how to replicate [A, count] = fread(fid, 3, 'uint32');
In Matlab, one of fread()'s signatures is fread(fileID, sizeA, precision). This reads in the first sizeA elements (not bytes) of a file, each of a size sufficient for precision. In this case, since you're reading in uint32, each element is of size 32 bits, or 4 bytes.
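So, instead, try io.read(12), with the file opened in binary mode, to get the first three 4-byte elements from the file. Note that readline() is meant for text and stops at the first newline byte, so it is not reliable for binary data.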

The first part is covered by Torsten's answer... you're going to need array or numarray to do anything with this data anyway.
As for the %08X and the hex2dec stuff, %08X is just the print format for those uint32 numbers (8-digit hex, exactly the same as in Python), and hex2dec('4D445254') is Matlab for 0x4D445254.
Finally, ~= in Matlab means "not equal"; use != in Python.
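A small sketch of the Python counterparts of the formatting and hex conversion discussed above. The values assigned to magic_l, magic_h, and block_length are made up here; in real code they would come from the file.

```python
# Hypothetical values standing in for what was read from the block header
magic_l, magic_h, block_length = 0x4D445254, 0x43494741, 100

# hex2dec('4D445254') corresponds to parsing a base-16 string
correct_magic_l = int('4D445254', 16)
correct_magic_h = int('43494741', 16)

# %08X: uppercase hex, zero-padded to 8 digits, same as in Matlab's sprintf
line = ' Magic_L = %08X, Magic_H = %08X, Length = %i' % (
    magic_l, magic_h, block_length)

# 3*4 spells out "3 values of 4 bytes": once for the header, once for the footer
remaining_length = block_length - 3 * 4 - 3 * 4
```

Writing 3*4 rather than 12 is purely for readability; both give the same number.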
