I need to diagonalise a very large number of matrices.
These matrices are themselves quite small (say a x a, where a <= 10), but because of
their sheer number it takes a long time to diagonalise them all using a for loop
and the numpy.linalg.eig function. So I wanted to make an array of matrices, i.e.,
an array of 2D arrays, but unfortunately Python seems to consider this a 3-dimensional array, gets confused, and refuses to do the job. So, is there any way to keep Python from treating this array of 2D arrays as a 3D array?
Thanks,
A Python novice
EDIT: To be clearer: I'm not interested in this 3D array per se. Since feeding a whole array to a function is generally much faster than feeding its elements one by one in a for loop, I simply tried to put all the matrices I need to diagonalise into one array.
If you have a 3D array like:
a = np.random.normal(size=(20,10,10))
you can then just loop through all 20 of the 10x10 arrays using:
for k in range(a.shape[0]):        # iterate over the 20 stacked matrices
    b = np.linalg.eig(a[k, :, :])  # b is an (eigenvalues, eigenvectors) tuple
where you would save b in a more sophisticated way. This may be what you are already doing, but in older NumPy releases you can't apply np.linalg.eig to a 3D array and have it calculate along a single axis, so you are stuck with the loop unless there is a formalism for combining all of your arrays into a single 2D array. I doubt, however, that that would be faster than just looping over the individual 2D arrays.
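Note: if your NumPy is recent enough (1.8 or later), np.linalg.eig broadcasts over leading axes, so the loop can be skipped entirely. A minimal sketch, using the shapes from the example above:

import numpy as np

a = np.random.normal(size=(20, 10, 10))  # stack of 20 10x10 matrices

# eig operates on the last two axes and broadcasts over the first one.
eigvals, eigvecs = np.linalg.eig(a)
print(eigvals.shape)   # (20, 10)
print(eigvecs.shape)   # (20, 10, 10)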
Let's say I create two NumPy arrays: one empty, and one of size 1000x1000 filled with zeros:
import numpy as np
A1 = np.array([])
A2 = np.zeros([1000,1000])
When I want to change a value in A2, this seems to work fine:
A2[n,m] = 17
The above code would change the value of position [n][m] in A2 to 17.
When I try the above with A1 I get this error:
A1[n,m] = 17
IndexError: index n is out of bounds for axis 0 with size 0
I know why this happens: there is no position [n, m] defined in A1, and that makes sense. But my question is as follows:
Is there a way to define a dynamic array that grows, adding new rows and columns, whenever A[n, m] = somevalue is assigned with n or m (or both) beyond the current bounds of array A?
It doesn't have to be in NumPy; any library or method that can grow an array would be awesome. If it is a method, I imagine it would include a check for whether [n][m] is out of bounds and would do something about it.
I am coming from a MATLAB background, where it's easy to do this. I tried to find something about it in the numpy.array documentation, but I've been unsuccessful.
EDIT:
I want to know whether a dynamic array like this is possible at all in Python, not just with the NumPy library. It appears from this question that it doesn't work with NumPy: Creating a dynamic array using numpy in python.
This can't be done in NumPy, and technically it can't be done in MATLAB either. What MATLAB does behind the scenes is create an entirely new matrix, copy all the data into it, and then delete the old matrix. It is not dynamically resizing; that isn't actually possible given how arrays/matrices are laid out in memory. This is extremely slow, especially for large arrays, which is why MATLAB nowadays warns you not to do it.
NumPy, like MATLAB, cannot resize arrays in place (actually, unlike MATLAB, it technically can, but only if you are lucky, so I would advise against trying). But to avoid the sort of confusion and slow code this causes in MATLAB, NumPy requires that you explicitly make the new array (using np.zeros) and then copy the data over.
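A sketch of that explicit copy-to-grow pattern (the sizes are made up for illustration):

import numpy as np

A = np.zeros((2, 2))

# "Grow" A to 4x4: allocate a new array, copy the old data into it,
# and rebind the name. The old array is then garbage-collected.
bigger = np.zeros((4, 4))
bigger[:A.shape[0], :A.shape[1]] = A
A = bigger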
Python, unlike MATLAB, actually does have a truly resizable data structure: the list. Lists still require an index to exist before you assign to it, which avoids the silent indexing errors that are hard to catch in MATLAB, but appending to them is very fast (amortized constant time). You can make an effectively n-dimensional structure by using nested lists of lists. Then, once the structure is complete, you can convert it to a NumPy array.
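A minimal sketch of the nested-list approach (the set_element helper is my own illustration, not a standard API):

import numpy as np

def set_element(rows, n, m, value, fill=0):
    # Hypothetical helper: grow a list-of-lists until rows[n][m] exists.
    while len(rows) <= n:                              # add missing rows
        rows.append([])
    width = max(m + 1, max(len(r) for r in rows))      # keep it rectangular
    for row in rows:
        row.extend([fill] * (width - len(row)))        # pad short rows
    rows[n][m] = value

data = []
set_element(data, 2, 3, 17)   # data is now a 3x4 nested list
A = np.array(data)            # convert once the structure is final
print(A[2, 3])                # 17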
I have 400 2x2 NumPy matrices, and I need to sum them all together. I was wondering if there's any better way to do it than using a for loop, as iterating consumes a lot of time and memory, particularly if I end up with more matrices (which might be the case in the future).
Just figured it out. All my matrices were in a list, so I used
np.sum(<list>, axis=0)
and it gives me the resulting 2x2 matrix: the sum of all 400 matrices!
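For example, with random data standing in for the real matrices:

import numpy as np

matrices = [np.random.rand(2, 2) for _ in range(400)]  # stand-in data

# np.sum first stacks the list into a (400, 2, 2) array,
# then sums it along axis 0.
total = np.sum(matrices, axis=0)
print(total.shape)   # (2, 2)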
By profiling my code line by line, I observed that the bottleneck of my program is the following command:
A[index_array]
where A and index_array are 2D arrays. The result is a 3D array. Right now it accounts for 40% of the runtime of a program that also uses other costly operations like np.einsum and a softmax function.
I'm new to Python and I can't find a more efficient or faster way to write this. Could someone give me some advice?
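For reference, a small reproduction of the indexing pattern (the shapes here are made up for illustration):

import numpy as np

A = np.random.rand(100, 50)                        # data matrix
index_array = np.random.randint(0, 100, (30, 40))  # row indices into A

# Fancy indexing: each entry of index_array picks out a whole row of A,
# so the result has shape index_array.shape + A.shape[1:].
result = A[index_array]
print(result.shape)   # (30, 40, 50)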
This is my first question; I've searched a lot!
I have an array in Numpy, like
myarray=np.zeros((raws,cols))
Then I have raws*cols one-dimensional NumPy arrays, all of the same length, say deep.
I would like to insert each of these one-dimensional arrays into myarray.
expected result:
newarray.shape
(raws,cols,deep)
I use this in a bigger function, and the reason I operate this way is a parallelization paradigm.
Thank you in advance.
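A minimal sketch of one way to get that shape, assuming the 1D arrays arrive one (row, col) pair at a time: preallocate the 3D array and assign each vector along the last axis.

import numpy as np

raws, cols, deep = 4, 5, 3               # example dimensions
newarray = np.zeros((raws, cols, deep))  # preallocate the 3D result

for r in range(raws):
    for c in range(cols):
        vec = np.arange(deep)            # stand-in for the real 1D array
        newarray[r, c, :] = vec          # insert along the last axis

print(newarray.shape)   # (4, 5, 3)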
Certain functions in NumPy return a 2D matrix as output, but I want the result in 2D array form.
What is the most efficient (memory and CPU) way to convert a 2D matrix to a 2D array?
Note that a numpy.matrix is already an ndarray subclass, nothing more than a specialized 2D array. Hence you're most likely quite all right without converting your matrix to an explicit numpy.array, unless you have a particular reason to do so, such as wanting the additional generality of a plain NumPy array.
Should this be the case, you can convert your matrix to an array with numpy.asarray(). It's important that you use this function and not numpy.asanyarray(), since numpy.asanyarray() allows subclasses of ndarray to pass through unchanged, as your matrix would.
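A quick illustration of the difference:

import numpy as np

m = np.matrix([[1, 2], [3, 4]])

a = np.asarray(m)      # plain ndarray view of the same data
b = np.asanyarray(m)   # the matrix subclass passes through untouched

print(type(a))   # <class 'numpy.ndarray'>
print(type(b))   # <class 'numpy.matrix'>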