Reduce array over ranges - python

Say I have an array of numbers
np.array(([1, 4, 2, 1, 2, 5]))
And I want to compute the sum over a list of slices
((0, 3), (2, 4), (2, 6))
Giving
[(1 + 4 + 2), (2 + 1), (2 + 1 + 2 + 5)]
Is there a nice way to do this in numpy?
Looking for something equivalent to
def reduce(a, ranges):
    return np.array([np.sum(a[low:high]) for (low, high) in ranges])
Seems like there is probably some fancy numpy way to do this though. Anyone know?

One way is to use np.add.reduceat. If a is the array of values [1, 4, 2, 1, 2, 5]:
>>> np.add.reduceat(a, [0,3, 2,4, 2])[::2]
array([ 7, 3, 10], dtype=int32)
Here the slice indexes are passed as one flat list, and the intermediate result is [ 7, 1, 3, 2, 10] (i.e. the sums of a[0:3], the single element a[3] (because 3 >= 2), a[2:4], the single element a[4] (because 4 >= 2), and a[2:]). We only want every other element from this array.
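If the slice boundaries aren't hard-coded, one way to build that index list from the (start, stop) pairs might look like the sketch below; it assumes every stop except possibly the last is strictly less than len(a), since reduceat requires in-bounds indices:
import numpy as np

a = np.array([1, 4, 2, 1, 2, 5])
slices = [(0, 3), (2, 4), (2, 6)]

# Interleave starts and stops into one flat index list.
idx = np.array(slices).ravel()
# reduceat needs every index < len(a); a trailing stop equal to len(a)
# can simply be dropped (reduceat then sums through to the end anyway).
if idx[-1] == len(a):
    idx = idx[:-1]

print(np.add.reduceat(a, idx)[::2])  # [ 7  3 10]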
Longer alternative approach...
The fact that the slices are of different lengths makes this slightly trickier to vectorise in NumPy, but here is one way you can approach the problem.
Given an array of values and an array of slices to make...
a = np.array(([1, 4, 2, 1, 2, 5]))
slices = np.array([(0, 3), (2, 4), (2, 6)])
...create a mask-like array z that, for each slice, will be used to "zero-out" the values from a we don't want to sum:
z = np.zeros((len(slices), len(a)))
s1 = np.arange(len(a)) >= slices[:, 0][:, None]   # True at or after each slice's start
s2 = np.arange(len(a)) < slices[:, 1][:, None]    # True before each slice's stop
z[s1 & s2] = 1
Then you can do:
>>> (z * a).sum(axis=1)
array([ 7., 3., 10.])
A quick %timeit shows this is slightly faster than the list comprehension, even though we had to construct z and z * a. With 3000 slices, this method is around 40 times quicker.
However, note that the array z will be of shape (len(slices), len(a)), which may not be practical if a and slices are both very long; an iterative approach might be preferred to avoid large temporary arrays in memory.
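For reference, a rough sketch of how such a timing comparison could be set up; the slice endpoints here are generated randomly just for benchmarking, and exact speedups will vary with machine and input sizes:
import timeit
import numpy as np

a = np.array([1, 4, 2, 1, 2, 5])
rng = np.random.default_rng(0)
starts = rng.integers(0, len(a) - 1, size=3000)
stops = rng.integers(starts + 1, len(a) + 1)   # each stop strictly after its start
slices = np.column_stack([starts, stops])

def loop_version():
    # the list-comprehension approach from the question
    return np.array([a[lo:hi].sum() for lo, hi in slices])

def mask_version():
    # the mask-based approach from above
    z = np.zeros((len(slices), len(a)))
    s1 = np.arange(len(a)) >= slices[:, 0][:, None]
    s2 = np.arange(len(a)) < slices[:, 1][:, None]
    z[s1 & s2] = 1
    return (z * a).sum(axis=1)

print(timeit.timeit(loop_version, number=100))
print(timeit.timeit(mask_version, number=100))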

How to index a 2D array with 3D array?

Today, I encountered the following problem:
Tensor A is a segmentation mask with shape (1, 4, 4), and its values are either 0 or 1.
Tensor B is the 2x2 identity matrix created by torch.eye(2).
My questions are: why can we index B (2D) with A (3D) in the form B[A], and why is the result a tensor of shape (1, 4, 4, 2)?
Above is my test case; the source code comes from a Dice loss class:
y_true_dummy = torch.eye(num_classes)[y_true.squeeze(1)]
The shape of y_true is (b, h, w), and num_classes equals c.
By the way, why do we need the .squeeze() call?
I'd appreciate an explanation of the indexing behavior; links to videos would also be appreciated.
You can understand the problem if you work on a smaller example:
A = torch.randint(2, (4,))
B = torch.eye(2)
>>> A
# tensor([1, 0, 1, 1])
>>> B[A].shape
# (4, 2)
>>> B[A]
# tensor([[0., 1.],
# [1., 0.],
# [0., 1.],
# [0., 1.]])
[1, 0] and [0, 1] are the first and second rows of the 2x2 identity matrix B. So, using the 1D array A of shape (4,) as an index selects 4 "rows" of B, i.e. 4 elements of B along axis 0. Here B[A] is basically [B[1], B[0], B[1], B[1]].
So when A is a 3D array of shape (1, 4, 4), B[A] means selecting (1, 4, 4) rows of B. And because each row in B had 2 elements (2 columns), your output is (1, 4, 4, 2).
B is a 2x2 identity matrix, having 2 rows. Think of it like: you are picking 16 rows out of these 2 rows, getting a (16, 2) matrix -> then you reshape it to get (1, 4, 4, 2) tensor. In fact, you can check this easily:
A = torch.randint(2, (1, 4, 4))
A_flat = A.reshape(-1)
B = torch.eye(2)
>>> torch.allclose(B[A], B[A_flat].reshape(1, 4, 4, -1))
# True
This isn't a PyTorch-specific phenomenon either. You can observe the same indexing rules in NumPy, with which torch maintains close compatibility.
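For instance, a quick NumPy sketch of the same rule:
import numpy as np

B = np.eye(2)                                 # shape (2, 2)
A = np.random.randint(0, 2, size=(1, 4, 4))   # integer index array, shape (1, 4, 4)

# Advanced integer indexing: each entry of A picks a row of B, so the
# result has A's shape followed by B's remaining (column) axis.
print(B[A].shape)  # (1, 4, 4, 2)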

Return various sum totals of integer list

Is there a way to return various sums of a list of integers? Pythonic or otherwise.
For example, the various sum totals from [1, 2, 3, 4] would produce 1+2=3, 1+3=4, 1+4=5, 2+3=5, 2+4=6, 3+4=7. By default the sums could be restricted to two integers only, or possibly more, I guess.
I can't seem to wrap my head around how to tackle this, and I can't find an example or explanation online; searches all lead to "sum even/odd numbers in a list" and other unrelated problems.
You can use itertools.combinations and sum:
from itertools import combinations
li = [1, 2, 3, 4]
# assuming we don't need to sum the entire list or single numbers,
# and that x + y is the same as y + x
for sum_size in range(2, len(li)):
    for comb in combinations(li, sum_size):
        print(comb, sum(comb))
outputs
(1, 2) 3
(1, 3) 4
(1, 4) 5
(2, 3) 5
(2, 4) 6
(3, 4) 7
(1, 2, 3) 6
(1, 2, 4) 7
(1, 3, 4) 8
(2, 3, 4) 9
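If only pair sums are wanted (as in the question's example), the same idea works with a fixed combination size of 2:
from itertools import combinations

li = [1, 2, 3, 4]

# each unordered pair exactly once: 1+2, 1+3, 1+4, 2+3, 2+4, 3+4
pair_sums = [x + y for x, y in combinations(li, 2)]
print(pair_sums)  # [3, 4, 5, 5, 6, 7]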
Is this what you are looking for?
A = [1, 2, 3, 4]
for i in A:
    for j in A:
        if i != j:
            # note: this prints each pair twice (both i+j and j+i)
            print(i + j)

Sums of variable size chunks of a list where sizes are given by other list

I would like to make the following sum given two lists:
a = [0,1,2,3,4,5,6,7,8,9]
b = [2,3,5]
The result should be the sums of consecutive chunks of a, where the chunk sizes are given by b, like:
b[0] = 2 so the first sum result should be: sum(a[0:2])
b[1] = 3 so the second sum result should be: sum(a[2:5])
b[2] = 5 so the third sum result should be: sum(a[5:10])
The printed result: 1,9,35
You can make use of np.bincount with weights:
groups = np.repeat(np.arange(len(b)), b)   # [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]: chunk label for each position of a
np.bincount(groups, weights=a)             # sum the values of a within each label
Output:
array([ 1., 9., 35.])
NumPy has a tool to do slice based sum-reduction with np.add.reduceat -
In [46]: np.add.reduceat(a,np.cumsum(np.r_[0,b[:-1]]))
Out[46]: array([ 1, 9, 35])
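The index array passed to reduceat is just the running start offset of each chunk; spelled out with the same a and b, that might look like:
import numpy as np

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [2, 3, 5]

starts = np.cumsum(np.r_[0, b[:-1]])   # array([0, 2, 5]): where each chunk begins
print(np.add.reduceat(a, starts))      # [ 1  9 35]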
Hard to compete with the np.bincount solution, but here's another nice way to approach it with np.cumsum:
strides = [0] + np.cumsum(b).tolist() # [0, 2, 5, 10]
stride_slices = zip(strides[:-1], strides[1:]) # [(0, 2), (2, 5), (5, 10)]
[sum(a[s[0]: s[1]]) for s in stride_slices]
# [1, 9, 35]
You mean something like this?
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [2, 3, 5]
def some_function(a, b):  # couldn't come up with a name :D
    last_index = 0
    for i in b:
        print(sum(a[last_index:last_index + i]))
        last_index += i

some_function(a, b)
You can use a list comprehension with sum:
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [2, 3, 5]
# (k := sum(b[:i])) is the start offset of chunk i (requires Python 3.8+)
r = [sum(a[(k := sum(b[:i])):k + j]) for i, j in enumerate(b)]
Output:
[1, 9, 35]
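Not taken from the answers above, but a compact pure-Python alternative sketched with itertools.islice: consuming a single shared iterator over a means each chunk is read exactly once.
from itertools import islice

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [2, 3, 5]

it = iter(a)                              # one shared iterator over a
print([sum(islice(it, n)) for n in b])    # [1, 9, 35]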

Subtracting one dimensional array (list of scalars) from 3 dimensional arrays using broadcasting

I have a one-dimensional array of scalar values
Y = np.array([1, 2])
I also have a 3-dimensional array:
X = np.random.randint(0, 255, size=(2, 2, 3))
I am attempting to subtract each value of Y from X, so I should get back Z, which should be of shape (2, 2, 2, 3) or maybe (2, 2, 3, 2).
I can't seem to figure out how to do this via broadcasting.
I tried changing the shape of Y:
Y = np.array([[[1, 2]]])
but I'm not sure what the correct shape should be.
Broadcasting lines up dimensions on the right. So you're looking to operate on a (2, 1, 1, 1) array and a (2, 2, 3) array.
The simplest way I can think of is using reshape:
Y = Y.reshape(-1, 1, 1, 1)
More generally:
Y = Y.reshape(-1, *([1] * X.ndim))
At most one of the arguments to reshape can be -1, indicating all the remaining size not accounted for by other dimensions.
To get Z of shape (2, 2, 2, 3):
Z = X - Y.reshape(-1, *([1] * X.ndim))
If you were OK with having Z of shape (2, 2, 3, 2), the operation would be much simpler:
Z = X[..., None] - Y
None or np.newaxis will insert a unit axis at the end of X's shape, making it broadcast properly with the 1D Y.
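Putting both options together with the arrays from the question (the values are random, so only the shapes are shown):
import numpy as np

Y = np.array([1, 2])
X = np.random.randint(0, 255, size=(2, 2, 3))

Z1 = X - Y.reshape(-1, 1, 1, 1)   # Y broadcast as shape (2, 1, 1, 1)
Z2 = X[..., None] - Y             # trailing unit axis added to X instead

print(Z1.shape)  # (2, 2, 2, 3)
print(Z2.shape)  # (2, 2, 3, 2)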
I am not entirely sure on which dimension you want your subtraction to take place, but X - Y will not raise an error if you define Y as Y = numpy.array([1, 2]).reshape(2, 1, 1) or Y = numpy.array([1, 2]).reshape(1, 2, 1).

Python Multiply tuples of equal length

I was hoping for an elegant or effective way to multiply sequences of integers (or floats).
My first thought was that (1, 2, 3) * (1, 2, 2) would result in (1, 4, 6), the products of the individual multiplications.
Python isn't set up to do that for sequences, though, which is fine; I wouldn't really expect it to. So what's the Pythonic way to multiply (or apply other arithmetic operations to) the items of two sequences at their respective indices?
A second example: (0.6, 3.5) * (4, 4) = (2.4, 14)
The simplest way is to use the zip function with a generator expression, like this:
tuple(l * r for l, r in zip(left, right))
For example,
>>> tuple(l * r for l, r in zip((1, 2, 3), (1, 2, 3)))
(1, 4, 9)
>>> tuple(l * r for l, r in zip((0.6, 3.5), (4, 4)))
(2.4, 14.0)
In Python 2.x, zip returns a list of tuples. If you want to avoid creating the temporary list, you can use itertools.izip, like this
>>> from itertools import izip
>>> tuple(l * r for l, r in izip((1, 2, 3), (1, 2, 3)))
(1, 4, 9)
>>> tuple(l * r for l, r in izip((0.6, 3.5), (4, 4)))
(2.4, 14.0)
You can read more about the differences between zip and itertools.izip in this question.
A simpler way would be:
from operator import mul
In [19]: tuple(map(mul, [0, 1, 2, 3], [10, 20, 30, 40]))
Out[19]: (0, 20, 60, 120)
If you are interested in element-wise multiplication, you'll probably find that many other element-wise mathematical operations are also useful. If that is the case, consider using the numpy library.
For example:
>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> y = np.array([1, 2, 2])
>>> x * y
array([1, 4, 6])
>>> x + y
array([2, 4, 5])
With list comprehensions the operation could be completed like
def seqMul(left, right):
    return tuple([value * right[idx] for idx, value in enumerate(left)])

seqMul((0.6, 3.5), (4, 4))
A = (1, 2, 3)
B = (4, 5, 6)
AB = [a * b for a, b in zip(A, B)]
Use itertools.izip instead of zip for larger inputs (Python 2 only; in Python 3, zip is already lazy).
