Efficiently unwrap in multiple dimensions with numpy

Let's assume I have an array of phases (from complex numbers)
A = np.angle(np.random.uniform(-1,1,[10,10,10]) + 1j*np.random.uniform(-1,1,[10,10,10]))
I would now like to unwrap this array in ALL dimensions. In the above 3D case I would do
A_unwrapped = np.unwrap(np.unwrap(np.unwrap(A,axis=0), axis=1),axis=2)
While this is still feasible in the 3D case, in case of higher dimensionality, this approach seems a little cumbersome to me. Is there a more efficient way to do this with numpy?

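You could use np.apply_over_axes, which applies a function along each of the given axes in turn. Note that it calls the function as func(a, axis) with positional arguments, and the second positional parameter of np.unwrap is discont, not axis, so np.unwrap has to be wrapped so that the axis is passed by keyword:
np.apply_over_axes(lambda a, axis: np.unwrap(a, axis=axis), A, np.arange(A.ndim))
I believe this should do it.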
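A quick way to check a call like this is to compare it against the nested version from the question. The sketch below assumes that np.apply_over_axes invokes its function as func(a, axis) with positional arguments, so np.unwrap is wrapped to receive the axis by keyword. The seed, the helper name unwrap_along, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
shape = (10, 10, 10)
A = np.angle(rng.uniform(-1, 1, shape) + 1j * rng.uniform(-1, 1, shape))

# Reference result: the nested calls from the question.
nested = np.unwrap(np.unwrap(np.unwrap(A, axis=0), axis=1), axis=2)


def unwrap_along(a, axis):
    """Forward the axis to np.unwrap by keyword (hypothetical helper)."""
    return np.unwrap(a, axis=axis)


wrapped = np.apply_over_axes(unwrap_along, A, range(A.ndim))

# Passing np.unwrap directly hands the axis number to `discont` instead,
# so every pass runs along the default last axis.
naive = np.apply_over_axes(np.unwrap, A, range(A.ndim))

print("wrapped matches nested:", np.allclose(wrapped, nested))
print("naive matches nested:  ", np.allclose(naive, nested))
```

Only the wrapped call should reproduce the nested result; the direct call silently unwraps along the last axis each time.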

I'm not sure if there is a way to bypass performing the unwrap operation along each axis. Obviously if it acted on individual elements you could use vectorization, but that doesn't seem to be an option here. What you can do that will at least make the code cleaner is create a loop over the dimensions:
for dim in range(len(A.shape)):
    A = np.unwrap(A, axis=dim)
You could also repeatedly apply a function that takes the dimension on which to operate as a parameter:
reduce(lambda A, axis: np.unwrap(A, axis=axis), range(len(A.shape)), A)
Remember that in Python 3 reduce needs to be imported from functools.
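The reduce approach can be packaged as a small helper that works for any number of dimensions. This is only a sketch: the name unwrap_all_axes, the 4-D test shape, and the seed are assumptions made for illustration.

```python
from functools import reduce

import numpy as np


def unwrap_all_axes(phases):
    """Unwrap along axis 0, then axis 1, and so on up to the last axis."""
    return reduce(lambda arr, axis: np.unwrap(arr, axis=axis),
                  range(phases.ndim), phases)


rng = np.random.default_rng(0)  # illustrative seed
shape = (4, 5, 6, 7)
A = np.angle(rng.uniform(-1, 1, shape) + 1j * rng.uniform(-1, 1, shape))
A_unwrapped = unwrap_all_axes(A)

# Unwrapping only ever adds whole multiples of 2*pi to each element.
turns = (A_unwrapped - A) / (2 * np.pi)
print(np.allclose(turns, np.round(turns)))
```

Because the passes run one after another, the order of the axes matters: a later pass can reintroduce jumps along an axis that was unwrapped earlier, so only the last axis is guaranteed to be free of jumps larger than pi.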

Related

Are several subsequent `reshape` operations equivalent to only one?

In PyTorch, I recently stumbled onto code that looks like this.
# initially, my_tensor is one_dimensional, of length b*x*y
my_tensor = my_tensor.reshape(b, x, y)
my_tensor = my_tensor.reshape(b, x*y)
Is it equivalent to only writing the second line?
my_tensor = my_tensor.reshape(b, x*y)
And in general, is doing several reshape operations always equivalent to only doing the last one?
Intuitively, I think so, but the documentation for reshape doesn't really mention any invariant, and I couldn't find information on the internal representation of tensors or how reshape changes it.

The reshape operation does not (need to) touch the underlying data: for a contiguous tensor it only adjusts the metadata describing the dimensions, and the elements keep their row-major order. So a series of reshape operations, with no other operations in between, is equivalent to a single reshape to the final shape.
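The claim can be checked on a small example. The sketch below uses NumPy as a stand-in, on the assumption that its row-major reshape semantics match those of reshape on a contiguous PyTorch tensor; the sizes b, x, y are arbitrary.

```python
import numpy as np

b, x, y = 2, 3, 4
flat = np.arange(b * x * y)  # stands in for the one-dimensional tensor

two_step = flat.reshape(b, x, y).reshape(b, x * y)
one_step = flat.reshape(b, x * y)
print(np.array_equal(two_step, one_step))

# Both results are views on the same buffer: only shape metadata changed.
print(np.shares_memory(two_step, flat))

# An operation in between (here a transpose) breaks the equivalence.
with_transpose = flat.reshape(b, x, y).transpose(0, 2, 1).reshape(b, x * y)
print(np.array_equal(with_transpose, one_step))
```

The last comparison is the reason for the "no operations in between" caveat: once the element order is permuted, a later reshape reads the data in a different order.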

Is there a faster Numpy function to do an add.at, but with two sets of indices?

I'm trying to do an operation that is pretty similar to a numpy.add.at, but with two pairs of indices, and I'm wondering if there's a faster way to do this with numpy or something else rather than a for loop, which is running pretty slowly.
The following works, but I'm trying to do it faster:
for x,y in indices:
    A[B[x,y]] += C[x,y]
where the values obtained for B[x,y] will have a lot of duplicates, so B[1,1] may be equal to B[1,2].
numpy.add.at(A, indices, C) is pretty close, but doesn't get me there, as B basically maps the indices into another space. I'm hoping there's a faster way to do this with numpy or something else, probably without an explicit loop.
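One possible way to vectorize the loop is to look up B and C at the index pairs first and then accumulate. The sketch below is not from the original question's answers; it assumes A is one-dimensional, B holds non-negative integer positions into A, and indices is a sequence of (x, y) pairs. All sizes and values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # illustrative data only
A = np.zeros(5)
B = rng.integers(0, A.size, size=(4, 6))  # maps (x, y) to a slot in A
C = rng.normal(size=(4, 6))
indices = [(0, 1), (1, 2), (3, 5), (1, 2), (2, 0)]

# Reference: the explicit loop from the question.
expected = A.copy()
for x, y in indices:
    expected[B[x, y]] += C[x, y]

# Vectorized: gather targets and values, then accumulate with add.at,
# which handles repeated target positions correctly.
xs, ys = np.asarray(indices).T
result = A.copy()
np.add.at(result, B[xs, ys], C[xs, ys])

# Alternative for a 1-D target: a weighted bincount.
result_bincount = A + np.bincount(B[xs, ys], weights=C[xs, ys],
                                  minlength=A.size)

print(np.allclose(result, expected), np.allclose(result_bincount, expected))
```

Which variant is faster depends on the array sizes and the NumPy version, so it is worth timing both on real data.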

Fastest way to refer to np.array elements that meet certain conditions

I have some trouble with numpy arrays of many dimensions. Let's take a generic one and call it A. Without being too specific, I would like to do the following things in the most efficient way:
1) A[A>0] = something
2) Suppose I have another multidimensional numpy array with only 1s or 0s as values, say I, and suppose it has the proper shape. I would like to say something like A[I] = something
I am aware that these expressions already work, if the dimensions are right, but I think they are not efficient enough.
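For reference, a small sketch of both cases. One detail worth noting: an integer array of 0s and 1s used directly as an index is treated as integer (fancy) indexing rather than as a mask, so the sketch converts it to bool first. The array contents and the replacement values are arbitrary.

```python
import numpy as np

A = np.array([[-1.0, 2.0], [3.0, -4.0]])
I = np.array([[1, 0], [0, 1]])      # 0/1 flags stored as integers

masked = A.copy()
masked[A > 0] = 0.0                 # case 1: boolean mask from a comparison

flagged = A.copy()
flagged[I.astype(bool)] = 9.0       # case 2: convert the flags to a mask first

# Non-mutating alternative: choose per element between a constant and A.
flagged_where = np.where(I.astype(bool), 9.0, A)

print(masked)
print(flagged)
```

If I is created as a boolean array in the first place, the astype call is unnecessary.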

How to make Numpy treat each row/tensor as a value

Many functions, such as in1d and setdiff1d, are designed for 1-D arrays. One workaround for applying them to N-dimensional arrays is to make numpy treat each row (or a higher-dimensional subarray) as a single value.
One approach I found is in the answer to "Get intersecting rows across two 2D numpy arrays" by Joe Kington.
The following code is taken from this answer. The task Joe Kington faced was to detect common rows in two arrays A and B while trying to use in1d.
import numpy as np
A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])
nrows, ncols = A.shape
dtype={'names':['f{}'.format(i) for i in range(ncols)],
       'formats':ncols * [A.dtype]}
C = np.intersect1d(A.view(dtype), B.view(dtype))
# This last bit is optional if you're okay with "C" being a structured array...
C = C.view(A.dtype).reshape(-1, ncols)
I am hoping you can help me with any of the following three questions. First, I do not understand the mechanism behind this method. Can you explain it to me?
Second, are there other ways to make numpy treat a subarray as one object?
One more open question: does Joe's approach have any drawbacks? That is, could treating rows as values cause problems? Sorry, this question is pretty broad.

I'll try to post what I have learned. The method Joe used relies on structured arrays, which let users define what a single cell/element contains.
Let's look at the description of the first example in the documentation.
x = np.array([(1,2.,'Hello'), (2,3.,"World")],
             dtype=[('foo', 'i4'),('bar', 'f4'), ('baz', 'S10')])
Here we have created a one-dimensional array of length 2. Each element of this array is a structure that contains three items, a 32-bit integer, a 32-bit float, and a string of length 10 or less.
Without passing in dtype, however, we would get a 2-by-3 array instead.
With this method and a properly set dtype, we can make numpy treat a higher-dimensional array as a single element.
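The difference is easy to see by building the documentation example both ways. This is a minimal sketch; the variable names records and plain are chosen here for illustration.

```python
import numpy as np

records = [(1, 2.0, 'Hello'), (2, 3.0, 'World')]

# With a structured dtype, each tuple becomes one element with named fields.
x = np.array(records, dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])

# Without a dtype, numpy builds an ordinary 2-D array from the tuples.
plain = np.array(records)

print(x.shape, plain.shape)
print(x['foo'])     # field access returns a regular array of that field
print(x[0])         # a single element holds all three fields
```

Because every element of x is one record, functions that compare or sort elements operate on whole records at a time.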
Another trick Joe showed is that we don't really need to build a new numpy array to achieve this. We can use the view method (see ndarray.view) to change the way numpy interprets the data. The Notes section of ndarray.view is worth reading before using this method; I cannot guarantee that there are no side effects. The paragraph below is from that section and seems to call for caution.
For a.view(some_dtype), if some_dtype has a different number of bytes per entry than the previous dtype (for example, converting a regular array to a structured array), then the behavior of the view cannot be predicted just from the superficial appearance of a (shown by print(a)). It also depends on exactly how a is stored in memory. Therefore if a is C-ordered versus fortran-ordered, versus defined as a slice or transpose, etc., the view may give different results.
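One way to act on that caution is to force a C-contiguous layout before taking the view, so the bytes of each row sit next to each other in memory. The sketch below is an assumption-laden illustration, not Joe Kington's code: the helper name rows_as_structured is hypothetical, and the first array is deliberately created as a transpose to exercise the non-contiguous case.

```python
import numpy as np


def rows_as_structured(arr):
    """Return a 1-D structured view with one record per row (hypothetical helper)."""
    arr = np.ascontiguousarray(arr)  # copies only if the layout requires it
    row_dtype = np.dtype([('f{}'.format(i), arr.dtype)
                          for i in range(arr.shape[1])])
    return arr.view(row_dtype).ravel()


# A is a transpose, so its rows are not contiguous in memory.
A = np.array([[1, 2, 3], [4, 5, 6]]).T   # rows: [1, 4], [2, 5], [3, 6]
B = np.array([[1, 4], [3, 6], [7, 8]])

common = np.intersect1d(rows_as_structured(A), rows_as_structured(B))
common_rows = common.view(A.dtype).reshape(-1, A.shape[1])
print(common_rows)
```

The extra copy made for non-contiguous input is the price for getting a view whose records line up with the logical rows.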
Other references
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.dtypes.html
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.dtype.html

Python/Numpy: Divide array

I have some data represented in a 1300x1341 matrix. I would like to split this matrix into several pieces (e.g. 9) so that I can loop over and process them. The data needs to stay ordered, in the sense that x[0,1] stays below (or above, if you like) x[0,0] and beside x[1,1].
Just like if you had imaged the data, you could draw 2 vertical and 2 horizontal lines over the image to illustrate the 9 parts.
If I use numpy's reshape (e.g. matrix.reshape(9,260,745) or any other combination of 9, 260, 745), it doesn't yield the required structure, since the above-mentioned ordering is lost...
Did I misunderstand the reshape method or can it be done this way?
What other pythonic/numpy way is there to do this?

Sounds like you need numpy.split() or perhaps its sibling numpy.array_split(). Both split an array into subsections without rearranging the numbers the way reshape does; numpy.split() requires the sections to be equal in size, while numpy.array_split() also allows uneven ones.
I haven't tested this but something like:
numpy.array_split(numpy.zeros((1300,1341)), 9)
should do the trick. Note that this splits along the first axis only, giving 9 strips; for a 3x3 grid you would split along each axis in turn.
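A sketch of the two-axis variant, assuming a 3x3 grid is what is wanted: split along the first axis into three strips, then split each strip along the second axis. The sample data and the name blocks are illustrative.

```python
import numpy as np

data = np.arange(1300 * 1341).reshape(1300, 1341)

# blocks[i][j] is the part in strip i along axis 0 and strip j along axis 1.
blocks = [np.array_split(strip, 3, axis=1)
          for strip in np.array_split(data, 3, axis=0)]

for row_of_blocks in blocks:
    print([part.shape for part in row_of_blocks])

# Putting the parts back together reproduces the original ordering.
print(np.array_equal(np.block(blocks), data))
```

Since 1300 is not divisible by 3, the first strip along axis 0 is one row taller than the other two, which array_split permits.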

reshape, to quote its docs,
Gives a new shape to an array without changing its data.
In other words, it does not move the array's data around at all -- it just affects the array's dimension. You, on the other hand, seem to require slicing; again quoting:
It is possible to slice and stride arrays to extract arrays of the same number of dimensions, but of different sizes than the original. The slicing and striding works exactly the same way it does for lists and tuples except that they can be applied to multiple dimensions as well.
So, for example, thearray[0:260, 0:745] is the "upper leftmost" part, thearray[260:520, 0:745] the upper left-of-center part, and so forth. You could have references to the various parts in a list (or dict with appropriate keys) to process them separately.
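The dict-of-references idea might look like the sketch below. It assumes nine roughly equal parts in a 3x3 grid, with boundaries computed by integer division; the names row_edges, col_edges, and parts are made up here. Basic slices are views, so writing to a part also changes thearray.

```python
import numpy as np

thearray = np.zeros((1300, 1341))
n_rows, n_cols = thearray.shape

# Boundaries of three roughly equal bands along each axis.
row_edges = [n_rows * k // 3 for k in range(4)]
col_edges = [n_cols * k // 3 for k in range(4)]

# Map grid position (i, j) to a view on the corresponding part.
parts = {
    (i, j): thearray[row_edges[i]:row_edges[i + 1],
                     col_edges[j]:col_edges[j + 1]]
    for i in range(3) for j in range(3)
}

# Process one part in place; the change is visible in the full array.
parts[(1, 1)][:] = 5.0
print(thearray[row_edges[1], col_edges[1]], thearray[0, 0])
```

If the parts should be independent of the original, take a copy of each slice with .copy() instead.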
