Summing each array inside list - python

I have a list of arrays generated from another function:
testGroup = [array([18]), array([], dtype=int64), array([56, 75, 55, 55]), array([32])]
I'd like to return the sum of each individual array in the list, with the empty ones returning as zero
I've tried using numpy, as per the documentation:
np.sum([[0, 1], [0, 5]], axis=1)
array([1, 5])
But when I try np.sum(testGroup, axis=1) I get an axis error, I suppose because the empty arrays have a dimension less than one?
I've also tried summing it directly with arraySum = sum(testGroup), but I get a ValueError.
Any ideas on how to sum each of the arrays inside the testGroup list?

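testGroup is a plain Python list that happens to contain numpy.array elements. Because those arrays have different lengths, NumPy cannot treat the list as a 2-D array, so axis=1 does not exist. Instead, you can use a list comprehension: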
>>> [np.sum(a) for a in testGroup]
[18, 0, 241, 32]

Try
list(map(np.sum, testGroup))
it gives
[18, 0, 241, 32]

You can use a list comprehension to apply a function to each element of the list, as follows:
import numpy as np
testGroup = [np.array([18]), np.array([], dtype=np.int64), np.array([56, 75, 55, 55]), np.array([32])]
totals = [np.sum(i) for i in testGroup]
print(totals)
output
[18, 0, 241, 32]
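For very long lists, the per-array Python loop can be avoided. The following is only a sketch of one possible approach, not something taken from the answers above: it flattens all arrays into one, labels each element with the position of the array it came from, and lets np.bincount add up the elements per label. The names lengths, group_ids and flat are made up for this illustration.

```python
import numpy as np

testGroup = [np.array([18]), np.array([], dtype=np.int64),
             np.array([56, 75, 55, 55]), np.array([32])]

# Number of elements in each array (0 for the empty one).
lengths = np.array([len(a) for a in testGroup])

# Label every element with the index of the array it belongs to.
group_ids = np.repeat(np.arange(len(testGroup)), lengths)

# One flat array holding all the values.
flat = np.concatenate(testGroup)

# Sum the values per label; groups with no elements stay at 0.
# bincount with weights returns floats, so cast back to integers.
totals = np.bincount(group_ids, weights=flat,
                     minlength=len(testGroup)).astype(np.int64)
print(totals.tolist())  # [18, 0, 241, 32]
```

For a handful of small arrays the list comprehension is simpler and likely just as fast. Note that the float round trip is only exact for integers that fit in a float64.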

Related

Replace multiple values of 3d Numpy array by list of indexes

I need to replace multiple values of a 3d NumPy array with other values using the list of indexes. For example:
old_array = np.full((224,224,3),10)
# in the list below, the first element holds the indexes and the second holds their corresponding values
list_idx_val = [array([[ 2, 14, 0],
                       [ 2, 14, 1],
                       [ 2, 14, 2],
                       [99, 59, 1],
                       [99, 61, 1],
                       [99, 61, 2]], dtype=uint8), array([175, 168, 166, 119, 117, 119], dtype=uint8)]
# need to do, for example
old_array[2, 14, 1] = 168
One way is to access each index and replace the old value with the new one. But the NumPy array and the list of indexes are quite big. I would be very thankful for a fast and efficient solution (without a loop, preferably using slicing or similar) to replace the array values, as I need to create thousands of arrays by replacing values from such index lists with minimal latency.
Let's do array indexing:
idx, vals = list_idx_val
old_array[idx[:,0], idx[:,1], idx[:,2]] = vals
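As a self-contained check of the indexing above, here is a small sketch that uses the data from the question. The tuple(idx.T) form is an equivalent spelling I am adding for illustration: it turns the (N, 3) index array into three index arrays, one per axis.

```python
import numpy as np

old_array = np.full((224, 224, 3), 10)

idx = np.array([[ 2, 14, 0],
                [ 2, 14, 1],
                [ 2, 14, 2],
                [99, 59, 1],
                [99, 61, 1],
                [99, 61, 2]], dtype=np.uint8)
vals = np.array([175, 168, 166, 119, 117, 119], dtype=np.uint8)

# idx.T has shape (3, N); the tuple supplies one index array per axis,
# which is the same as old_array[idx[:, 0], idx[:, 1], idx[:, 2]] = vals.
old_array[tuple(idx.T)] = vals

print(old_array[2, 14, 1])   # 168
print(old_array[0, 0, 0])    # 10 (untouched)
```

If the same position appears more than once in idx, which of the values ends up stored there is not something to rely on.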

Recommended way to replace several values in a tensor at once?

Is there a batch way to replace several particular values in a pytorch tensor at once without a for loop?
Example:
old_values = torch.Tensor([1, 2, 3, 4, 5, 5, 2, 3, 3, 2])
old_new_value = [[2,22], [3,33], [6, 66]]
Here each pair is [old, new]: 2 should be replaced by 22, 3 by 33, and 6 by 66.
Can I have an efficient way to achieve the following end_result?
end_result = torch.Tensor([1, 22, 33, 4, 5, 5, 22, 33, 33, 22])
Note that old_values is not unique. Also, old_new_value may contain a pair, here (6, 66), whose old value does not exist in old_values.
Also, the rows of old_new_value are unique.
If you don't have any duplicate elements in your input tensor, here's one straightforward way using masking and value assignment using basic indexing. (I'll assume that the data type of the input tensor is int. But, you can simply adapt this code in a straightforward manner to other dtypes). Below is a reproducible illustration, with explanations interspersed in inline comments.
# input tensors to work with
In [75]: old_values
Out[75]: tensor([1, 2, 3, 4, 5], dtype=torch.int32)
In [77]: old_new_value
Out[77]:
tensor([[ 2, 22],
[ 3, 33]], dtype=torch.int32)
# generate a boolean mask using the values that need to be replaced (i.e. 2 & 3)
In [78]: boolean_mask = (old_values == old_new_value[:, :1]).sum(dim=0).bool()
In [79]: boolean_mask
Out[79]: tensor([False, True, True, False, False])
# assign the new values by basic indexing
In [80]: old_values[boolean_mask] = old_new_value[:, 1:].squeeze()
# sanity check!
In [81]: old_values
Out[81]: tensor([ 1, 22, 33, 4, 5], dtype=torch.int32)
A small note on efficiency: the slices old_new_value[:, :1] and old_new_value[:, 1:] are views, so no copy of the input data is made for them; only the comparison and the sum allocate small temporary tensors. The runtime should therefore be fast.
Not sure anyone still cares about this, but just in case, here is a solution that also works when old_values is not unique:
mask = old_values == old_new_value[:, :1]
new_values = (1 - mask.sum(dim=0)) * old_values + (mask * old_new_value[:,1:]).sum(dim=0)
Masking works as in @kmario23's solution, but the mask is multiplied with the new values and sum-reduced to end up with the new values at all the right replacement positions. The negative mask is applied to the old values to use those at all other positions. Then both masked tensors are summed to obtain the desired result.
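To make the mask-and-sum idea easy to check by hand, here is a sketch of the same steps on the data from the question. It is written with NumPy instead of PyTorch so that it runs without torch installed; that substitution is mine, and np.where is used in place of the "(1 - mask) * old" term. PyTorch offers the same broadcasting comparison and a torch.where function, so the translation should be direct.

```python
import numpy as np

old_values = np.array([1, 2, 3, 4, 5, 5, 2, 3, 3, 2])
old_new_value = np.array([[2, 22], [3, 33], [6, 66]])

# mask[i, j] is True where old_values[j] equals the i-th "old" value.
mask = old_values == old_new_value[:, :1]      # shape (3, 10)

# The "old" values are unique, so each column of mask has at most one True;
# the sum therefore picks the matching "new" value, or 0 if nothing matched.
replacements = (mask * old_new_value[:, 1:]).sum(axis=0)

# Keep the original value wherever no pair matched.
end_result = np.where(mask.any(axis=0), replacements, old_values)
print(end_result.tolist())  # [1, 22, 33, 4, 5, 5, 22, 33, 33, 22]
```

The pair (6, 66) has no match in old_values, so its mask row is all False and it has no effect. The mask has one row per pair, so memory use grows with the number of pairs times the number of values.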

Numpy array: How to extract whole rows based on values in a column

I am looking for the equivalent of an SQL 'where' query over a table. I have done a lot of searching and I'm either using the wrong search terms or not understanding the answers. Probably both.
So a table is a 2 dimensional numpy array.
my_array = np.array([[32, 55, 2],
[15, 2, 60],
[76, 90, 2],
[ 6, 65, 2]])
I wish to 'end up' with a numpy array of the same shape where eg the second column values are >= 55 AND <= 65.
So my desired numpy array would be...
desired_array([[32, 55, 2],
[ 6, 65, 2]])
Also, does 'desired_array' order match 'my_array' order?
Just make a mask and use it.
mask = np.logical_and(my_array[:, 1] >= 55, my_array[:, 1] <= 65)
desired_array = my_array[mask]
desired_array
The general Numpy approach to filtering an array is to create a "mask" that matches the desired part of the array, and then use it to index in.
>>> my_array[((55 <= my_array) & (my_array <= 65))[:, 1]]
array([[32, 55, 2],
[ 6, 65, 2]])
Breaking it down:
# Comparing an array to a scalar gives you an array of all the results of
# individual element comparisons (this is called "broadcasting").
# So we take two such boolean arrays, resulting from comparing values to the
# two thresholds, and combine them together.
mask = (55 <= my_array) & (my_array <= 65)
# We only want to care about the [1] element in the second array dimension,
# so we take a 1-dimensional slice of that mask.
desired_rows = mask[:, 1]
# Finally we use those values to select the desired rows.
desired_array = my_array[desired_rows]
(The first two operations could instead be swapped - that way I imagine is more efficient, but it wouldn't matter for something this small. This way is the way that occurred to me first.)
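The swapped order mentioned above can be sketched as follows: slice out the second column first, then compare only that column. The name second_col is just for illustration.

```python
import numpy as np

my_array = np.array([[32, 55, 2],
                     [15,  2, 60],
                     [76, 90, 2],
                     [ 6, 65, 2]])

# Take the column first, so the comparisons touch 4 values instead of 12.
second_col = my_array[:, 1]
mask = (second_col >= 55) & (second_col <= 65)

# Boolean indexing keeps the selected rows in their original order.
desired_array = my_array[mask]
print(mask.tolist())           # [True, False, False, True]
print(desired_array.tolist())  # [[32, 55, 2], [6, 65, 2]]
```

Because rows are kept in their original order, the order of desired_array matches the order of my_array.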
You don't mean the same shape; you probably mean the same number of columns. The shape of my_array is (4, 3) and the shape of your desired array is (2, 3). I would recommend masking, too.
You can use a filter statement with a lambda that checks each row for the desired condition to get the desired result:
my_array = np.array([[32, 55, 2],
[15, 2, 60],
[76, 90, 2],
[ 6, 65, 2]])
desired_array = np.array([l for l in filter(lambda x: x[1] >= 55 and x[1] <= 65, my_array)])
Upon running this, we get:
>>> desired_array
array([[32, 55, 2],
[ 6, 65, 2]])

Swapping values of two lists based on given index

I have a list that consists of two numpy arrays, the first one giving the index of a value and the second containing the corresponding value itself. It looks a little like this:
x_glob = [[0, 2], [85, 30]]
A function is now receiving the following input:
x = [-10, 0, 77, 54]
My goal is to swap the values of x with the values from x_glob based on the given index array from x_glob. This example should result in something like this:
x_new = [85, 0, 30, 54]
I do have a solution using a loop, but I am pretty sure there is a way in Python to solve this more efficiently and elegantly.
Thank you!
NumPy arrays may be indexed with other arrays, which makes this replacement trivial.
All you need to do is index x with x_glob[0], and then assign x_glob[1] (x must be a NumPy array for this to work, not a plain list):
x[x_glob[0]] = x_glob[1]
To see how this works, just look at the result of the indexing:
>>> x[x_glob[0]]
array([-10, 77])
The result is an array containing the two values that we need to replace, which we then replace with another numpy array, x_glob[1], to achieve the desired result.
>>> x_glob = np.array([[0, 2], [85, 30]])
>>> x = np.array([-10, 0, 77, 54])
>>> x[x_glob[0]] = x_glob[1]
>>> x
array([85, 0, 30, 54])
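If the inputs arrive as plain lists, or the caller's x should not be modified, the same indexing can be wrapped in a small helper. This is only a sketch; swap_by_index is a hypothetical name, not part of NumPy.

```python
import numpy as np

def swap_by_index(x, x_glob):
    """Return a copy of x with x_glob's values written at x_glob's indices."""
    indices, values = x_glob
    x_new = np.array(x)                  # np.array copies, so x is left untouched
    x_new[np.asarray(indices)] = values  # index-array assignment
    return x_new

x = [-10, 0, 77, 54]
x_glob = [[0, 2], [85, 30]]
x_new = swap_by_index(x, x_glob)
print(x_new.tolist())  # [85, 0, 30, 54]
print(x)               # [-10, 0, 77, 54]
```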
For a non-numpy solution, you could create a dict mapping the indices from x_glob to the respective values and then use a list comprehension with that dict's get method:
>>> x_glob = [[0, 2], [85, 30]]
>>> x = [-10, 0, 77, 54]
>>> d = dict(zip(*x_glob))
>>> [d.get(i, n) for i, n in enumerate(x)]
[85, 0, 30, 54]
Or using map with multiple parameter lists (or without zip using itertools.starmap):
>>> list(map(d.get, *zip(*enumerate(x))))
[85, 0, 30, 54]
My solution also uses a for loop, but it's pretty short and elegant (I think), works in place, and is efficient because it does not have to iterate through the full x array, just through the pairs in x_glob:
for k, v in zip(*x_glob):
    x[k] = v

Slicing Numpy Array by Array of Indices

I am attempting to slice a Numpy array by an array of indices. For example,
array = [10,15,20,25,32,66]
indices = [1,4,5]
The optimal output would be
[[10], [15, 20, 25, 32], [66]]
I have tried using
array[indices]
but this just produces the single values of each individual index rather than all those in between.
Consider using np.split, like so
array = np.asarray([10, 15, 20, 25, 32, 66])
indices = [1, 5]
print(np.split(array, indices))
Produces
[array([10]), array([15, 20, 25, 32]), array([66])]
np.split works with breakpoints only: each index marks a position at which the array is cut. Hence there is no need to specify 1:4; it is implicitly defined by the breakpoints 1 and 5.
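To see how the breakpoints behave, the sketch below compares the breakpoints [1, 5] with the indices [1, 4, 5] from the question. The variable names two_breaks and three_breaks are only for illustration.

```python
import numpy as np

array = np.asarray([10, 15, 20, 25, 32, 66])

# n breakpoints always produce n + 1 pieces.
two_breaks = np.split(array, [1, 5])
three_breaks = np.split(array, [1, 4, 5])

print([piece.tolist() for piece in two_breaks])
# [[10], [15, 20, 25, 32], [66]]
print([piece.tolist() for piece in three_breaks])
# [[10], [15, 20, 25], [32], [66]]
```

So the indices from the question, read as breakpoints, would cut the middle block in two; dropping the 4 gives the output the question asks for.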
According to your comment this generator produces the desired result:
def slice_multi(array, indices):
    # Yield consecutive slices, each ending just before the next index.
    current = 0
    for index in indices:
        yield array[current:index]
        current = index

array = [10, 15, 20, 25, 32, 66]
indices = [1, 4, 5]
list(slice_multi(array, indices))  # [[10], [15, 20, 25], [32]]
