I see that numpy has many indexing facilities, but I still couldn't get them to do what I need.
First, assume there are two one-dimensional arrays A, I of the same shape, a one-dimensional array B which can be indexed with the elements of I, and a three-argument function f. Then the result I need can be obtained with starmap(f, zip(A, I, B[I])) (starmap and zip are from pure Python, not numpy). So far, so good...
But actually all the arrays are two-dimensional and I'd like to get a two-dimensional result as well, which amounts to applying the same function as above to each row of the arrays - this is what I currently do in a loop.
Are there better ways to do this, than just looping?
Update:
For example, with one dimensional arrays:
import numpy as np
from itertools import starmap

A = np.random.randint(0, 10, size=(3,))
B = np.random.randint(0, 10, size=(5,))
I = np.random.randint(0, 5, size=(3,))

def f(a, i, b):
    return (a, i, b)

print(A, I, B)
print(list(starmap(f, zip(A, I, B[I]))))
And for two-dimensional:
A = np.random.randint(0, 10, size=(2, 3))
B = np.random.randint(0, 10, size=(2, 5))
I = np.random.randint(0, 5, size=(2, 3))

def f(a, i, b):
    return (a, i, b)

print(A)
print(I)
print(B)
print([list(starmap(f, zip(A_row, I_row, B_row[I_row])))
       for A_row, I_row, B_row in zip(A, I, B)])
If it's just a normal function, your overhead is going to be primarily function overhead. You could do the zip with an hstack or something, but it's not going to help speed-wise.
If you're just looking for a neater way of doing it, try
import numpy

# column vector of row indexes, shape (len(I), 1), which broadcasts against I
x_indexes, _ = numpy.ogrid[:len(I), :0]
numpy.vectorize(f)(A, I, B[x_indexes, I])
Normally a higher-level view can let you vectorize the whole thing, which will be much faster. That's worth keeping in mind if this ends up slow.
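For instance, if f can be written in terms of array arithmetic (here a made-up f(a, i, b) = a + i * b, purely to illustrate), the whole computation reduces to one fancy-indexing step plus elementwise operations, with no vectorize at all. A sketch under that assumption:
import numpy as np

A = np.random.randint(0, 10, size=(2, 3))
B = np.random.randint(0, 10, size=(2, 5))
I = np.random.randint(0, 5, size=(2, 3))

rows = np.arange(B.shape[0])[:, None]  # column of row indexes, broadcasts against I
B_sel = B[rows, I]                     # B_sel[r, c] == B[r, I[r, c]]

result = A + I * B_sel                 # stand-in for the real f, applied to whole arrays at once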
np.mgrid accepts a tuple of slices, like np.mgrid[1:3, 4:8] or np.mgrid[np.s_[1:3, 4:8]].
But is there a way to mix both slices and arrays of indexes in a tuple argument to mgrid? E.g.:
extended_mgrid(np.s_[1:3, 4:8] + (np.array([1,2,3]), np.array([7,8])))
should give same results as
np.mgrid[1:3, 4:8, 1:4, 7:9]
But in general an array of indexes inside a tuple may not be representable as a slice.
Solving this is needed to be able to build an N-D tuple of indexes from a mix of slicing and indexing with np.mgrid, as in my answer to another question.
Task solved with the help of @hpaulj, using np.meshgrid.
import numpy as np

def extended_mgrid(i):
    # Accept a single slice, a single index array, or a tuple mixing both.
    parts = {slice: (i,), np.ndarray: (i,), tuple: i}[type(i)]
    # Turn each slice into an explicit index array; leave arrays as they are.
    axes = [np.arange(e.start or 0, e.stop, e.step or 1) if type(e) is slice else e
            for e in parts]
    res = np.meshgrid(*axes, indexing='ij')
    # For a tuple input, stack along a new first axis like np.mgrid; otherwise return the single grid.
    return np.stack(res, 0) if type(i) is tuple else res[0]
# Tests
a = np.mgrid[1:3]
b = extended_mgrid(np.s_[1:3])
assert np.array_equal(a, b), (a, b)
a = np.mgrid[(np.s_[1:3],)]
b = extended_mgrid((np.s_[1:3],))
assert np.array_equal(a, b), (a, b)
a = np.array([[[1,1],[2,2]],[[3,4],[3,4]]])
b = extended_mgrid((np.array([1,2]), np.array([3,4])))
assert np.array_equal(a, b), (a, b)
a = np.mgrid[1:3, 4:8, 1:4, 7:9]
b = extended_mgrid(np.s_[1:3, 4:8] + (np.array([1,2,3]), np.array([7,8])))
assert np.array_equal(a, b), (a, b)
If I know the number of dimensions, say 3, I can hardcode it with 3 nested loops:
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        for k in range(A.shape[2]):
            A[i,j,k] = some_formula(i, j, k)
But what if I don't know the number of dimensions? Can I still enumerate the array while knowing all the indices on each iteration?
If your function broadcasts, you can use numpy.fromfunction:
B = numpy.fromfunction(some_formula, A.shape, dtype=int)
If your function doesn't broadcast, you can use numpy.vectorize and numpy.fromfunction, but it'll be a lot less efficient than if your function broadcasted naturally:
B = numpy.fromfunction(numpy.vectorize(some_formula), A.shape, dtype=int)
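As a minimal sketch (with a made-up some_formula standing in for whatever the real computation is), fromfunction calls the function once with an index array per axis, so a broadcasting formula fills the whole array in one go:
import numpy as np

def some_formula(i, j, k):
    # hypothetical formula; i, j, k arrive as broadcastable index arrays
    return i + 10 * j + 100 * k

B = np.fromfunction(some_formula, (2, 3, 4), dtype=int)
print(B.shape)      # (2, 3, 4)
print(B[1, 2, 3])   # 1 + 20 + 300 == 321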
Is there any way to sum two points without using a point class?
Input
a = (2, 5)
b = (3, 4)
c = a + b
Output
(5, 9)
You can use a comprehension plus zip:
c = tuple(a_n + b_n for a_n, b_n in zip(a, b))
This is obviously cumbersome if you need to do it often (not to mention slightly inefficient). If you are going to be doing this sort of computation a lot, you're better off using a library like numpy, which lets arrays be added as first-class objects.
import numpy as np
a = np.array([2, 5])
b = np.array([3, 4])
c = a + b
If you go the numpy route, converting to and from numpy arrays is a bit expensive, so I'd recommend that you store your points as arrays rather than tuples.
If you'd like a functional approach:
t = tuple(map(sum, zip(a, b)))
import numpy
a = (2,5)
b = (3,4)
c = tuple(numpy.asarray(a) + numpy.asarray(b))  # converting back to a tuple only because that's how your output is defined; you can skip it
Complex numbers are (2-)tuples in disguise:
>>> a = 2+5j
>>> b = 3+4j
>>> c = a + b
>>> c
(5+9j)
My solution:
from functools import reduce  # needed on Python 3

reduce(lambda x, y: (x[0] + y[0], x[1] + y[1]), (a, b))  # folds over the points themselves, giving (5, 9)
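The same fold also works for more than two points; a quick sketch, where c is just a hypothetical third point:
from functools import reduce

a, b, c = (2, 5), (3, 4), (10, 10)  # c is a made-up extra point
total = reduce(lambda x, y: (x[0] + y[0], x[1] + y[1]), (a, b, c))
print(total)  # (15, 19)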
I'd like to do something like this:
if dim == 2:
    a, b = grid_shape
    for i in range(a):
        for j in range(b):
            A[i,j] = ...things...
where dim is simply the number of elements in my tuple grid_shape. A is a numpy array of dimension dim.
Is there a way to do it without being dimension specific?
Without having to write ugly code like
if dim == 2:
    a, b = grid_shape
    for i in range(a):
        for j in range(b):
            A[i,j] = ...things...

if dim == 3:
    a, b, c = grid_shape
    for i in range(a):
        for j in range(b):
            for k in range(c):
                A[i,j,k] = ...things...
Using itertools, you can do it like this:
for index in itertools.product(*(range(x) for x in grid_shape)):
    A[index] = ...things...
This relies on a couple of tricks. First, itertools.product() is a function which generates tuples from iterables.
for i in range(a):
    for j in range(b):
        index = i, j
        do_something_with(index)
can be reduced to
for index in itertools.product(range(a), range(b)):
    do_something_with(index)
This works for any number of arguments to itertools.product(), so you can effectively create nested loops of arbitrary depth.
The other trick is to convert your grid shape into the arguments for itertools.product:
(range(x) for x in grid_shape)
is equivalent to
(range(grid_shape[0]),range(grid_shape[1]),...)
That is, it behaves like a tuple of ranges, one for each grid_shape dimension. Using * then unpacks it into separate arguments:
itertools.product(*(range(x1),range(x2),...))
is equivalent to
itertools.product(range(x1),range(x2),...)
Also, since A[i,j,k] is equivalent to A[(i,j,k)], we can just use A[index] directly.
As DSM points out, since you are using numpy, you can reduce
itertools.product(*(range(x) for x in grid_shape))
to
numpy.ndindex(grid_shape)
So the final loop becomes
for index in numpy.ndindex(grid_shape):
    A[index] = ...things...
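Put together, a small runnable sketch (sum(index) is just a stand-in for ...things...):
import numpy as np

grid_shape = (2, 3, 4)               # works for any number of dimensions
A = np.empty(grid_shape)

for index in np.ndindex(grid_shape):
    A[index] = sum(index)            # index is a tuple like (i, j, k)

print(A[1, 2, 3])   # 1 + 2 + 3 == 6.0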
You can catch the rest of the tuple by putting a star in front of the last variable, and keep the first element grouped by putting parentheses around it.
>>> tupl = ((1, 2), 3, 4, 5, 6)
>>> a, *b = tupl
>>> a
(1, 2)
>>> b
[3, 4, 5, 6]
>>>
And then you can loop through b. So it would look something like
a, *b = grid_shape
for i in a:
    for j in range(i):
        for k in b:
            for l in range(k):
                A[j, l] = ...things...
Given two lists of equal length, is there a simpler or preferred way to iterate over them and append the maximum of each pair of elements to a new list? These are the two methods I know of.
import itertools

a = [1,2,3,4,5]
b = [1,1,6,3,8]

m1 = list()
m2 = list()

for x, y in zip(a, b):
    m1.append(max(x, y))

# itertools.imap exists only on Python 2; on Python 3 the built-in map is already lazy
for x in itertools.imap(max, a, b):
    m2.append(x)
Both of these result in [1, 2, 6, 4, 8], which is correct. Is there a better way?
map(max, a, b)
[max(x, y) for x, y in zip(a, b)]
You could do it like:
a = [1,2,3,4,5]
b = [1,1,6,3,8]
m3 = [max(x,y) for (x,y) in zip(a,b)]
or even
m4 = map(max, zip(a,b))
In Python 3, map() no longer returns a list, so you should use the list comprehension or
list(map(max, a, b))
if you really need a list and not just an iterator.