Python 2D array sum enumeration - python

I'm trying to iterate through a 2D array getting the sum for each list inside the array. For example I have:
test = [[5, 3, 6], [2, 1, 3], [1, 1, 3], [2, 6, 6], [4, 5, 3], [3, 6, 2], [5, 5, 2], [4, 4, 4], [3, 5, 3], [1, 3, 4]]
I want to take the values of each smaller array, so for example 5+3+6 and 2+1+3 and put them into a new array. So I'm aiming for something like:
testSum = [14, 6, 5, 14...].
I'm having trouble properly enumerating through a 2D array. It seems to jump around. I know my code's not correct, but this is what I have so far:
k = 10
m = 3
testSum = []
#create array with 10 arrays of length 3
test = [[numpy.random.randint(1,7) for i in range(m)] for j in range(k)]
sum = 0
#go through each sub-array in test array
for array in test:
    #add sums of sub-arrays
    for i in array:
        sum += test[array][i]
    testSum.append(sum)

You can do this in a more Pythonic way:
In [17]: print([sum(i) for i in test])
[14, 6, 5, 14, 12, 11, 12, 12, 11, 8]
or
In [19]: print(list(map(sum, test)))
[14, 6, 5, 14, 12, 11, 12, 12, 11, 8]

Since you're using Numpy, you should let Numpy handle the looping: it's much more efficient than using explicit Python loops.
import numpy as np
k = 10
m = 3
test = np.random.randint(1, 7, size=(k, m))
print(test)
print('- ' * 20)
testSum = np.sum(test, axis=1)
print(testSum)
Typical output:
[[2 5 1]
 [1 5 5]
 [6 5 3]
 [1 1 1]
 [2 5 6]
 [4 2 5]
 [3 3 1]
 [6 4 6]
 [2 5 1]
 [6 5 2]]
- - - - - - - - - - - - - - - - - - - -
[ 8 11 14  3 13 11  7 16  8 13]
As for the code you posted, it has a few problems. The main one being that you need to set the sum variable to zero for each sub-list. BTW, you shouldn't use sum as a variable name because that shadows Python's built-in sum function.
Also, your array access is wrong. (And you shouldn't use array as a variable name either, since it's the name of a standard module).
for array in test:
    for i in array:
iterates over each list in test and then over each item in each of those lists, so i is already an item of an inner list. That means that in
sum += test[array][i]
you are attempting to index the test list with a list instead of an integer, and then you're trying to index the result of that with the current item in i.
(In other words, in Python, when you iterate over a container object in a for loop the loop variable takes on the values of the items in the container, not their indices. This may be confusing if you are coming from a language where the loop variable gets the indices of those items. If you want the indices you can use the built-in enumerate function to get the indices and items at the same time).
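As a small illustration of the point about items versus indices, here is a sketch using a shortened version of the test list from the question. The names row, row_index and row_sums are just illustrative choices, not anything from the original code.

```python
test = [[5, 3, 6], [2, 1, 3], [1, 1, 3]]

# Plain iteration: the loop variable is the inner list itself, not its index.
for row in test:
    print(row, sum(row))

# enumerate yields (index, item) pairs when the index is needed as well.
row_sums = []
for row_index, row in enumerate(test):
    row_sums.append(sum(row))
    print(row_index, row)

print(row_sums)
```

In most cases like this one the index is not needed at all, so the plain loop (or a list comprehension) is enough.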
Here's a repaired version of your code.
import numpy as np
k = 10
m = 3
#create array with 10 arrays of length 3
test = [[np.random.randint(1,7) for i in range(m)] for j in range(k)]
print(test)
print()
testSum = []
#go through each sub-array in test array
for array in test:
    #add sums of sub-arrays
    asum = 0
    for i in array:
        asum += i
    testSum.append(asum)
print(testSum)
Typical output:
[[4, 5, 1], [3, 6, 6], [3, 4, 1], [2, 1, 1], [1, 6, 4], [3, 4, 4], [3, 2, 6], [6, 3, 2], [1, 3, 5], [5, 3, 3]]
[10, 15, 8, 4, 11, 11, 11, 11, 9, 11]
As I said earlier, it's much better to use Numpy arrays and let Numpy do the looping for you. However, if your program is only processing small lists there's no need to use Numpy: just use the functions in the standard random module to generate your random numbers and use the technique shown in Rahul K P's answer to calculate the sums: it's more compact and faster than using a Python loop.
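For the small-list case, a minimal sketch of that suggestion might look like the following. It assumes the same k and m as in the question; test_sum is just an illustrative name. Note that random.randint includes both endpoints, whereas np.random.randint excludes the upper one.

```python
import random

k = 10  # number of sub-lists
m = 3   # length of each sub-list

# random.randint(1, 6) covers the same values as np.random.randint(1, 7)
test = [[random.randint(1, 6) for _ in range(m)] for _ in range(k)]

# one sum per sub-list, using the built-in sum function
test_sum = [sum(row) for row in test]

print(test)
print(test_sum)
```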

Related

How to modify every third element in matrix?

I had to make a matrix using numpy.array method. How can I now update every third element of my matrix? I have made a for loop for the problem but that is not the optimal solution. Is there a way to avoid loops? For example if I have this matrix:
matrix = np.array([[1,2,3,4],
[5,6,7,8],
[4,7,6,9]])
is there a way to add 1 to every third element and get this matrix:
[[2,2,3,5],[5,6,8,8],[4,8,6,9]]
Solution:
matrix = np.ascontiguousarray(matrix)
matrix.ravel()[::3] += 1
Why is ascontiguousarray needed? Because matrix may not be C-contiguous (for example, it may be stored in Fortran order, i.e. column-major). In that case ravel returns a copy instead of a view, so the in-place operation matrix.ravel()[::3] += 1 only modifies that temporary copy and leaves matrix itself unchanged.
Example 1
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
arr.ravel()[::3] += 1
print(arr)
Works as expected:
[[2 2 3 5]
[5 6 8 8]
[4 8 6 9]]
Example 2
But with Fortran order:
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
arr = np.asfortranarray(arr)
arr.ravel()[::3] += 1
print(arr)
produces:
[[1 2 3 4]
[5 6 7 8]
[4 7 6 9]]
Example 3
Will work as expected in both cases
import numpy as np
arr = np.array([
[1, 2, 3, 4],
[5, 6, 7, 8],
[4, 7, 6, 9]])
# arr = np.asfortranarray(arr)
arr = np.ascontiguousarray(arr)
arr.ravel()[::3] += 1
print(arr)
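If you want to check the copy-versus-view behaviour directly rather than infer it from the output, one way (a sketch, not part of the original answer) is np.shares_memory. The names c_arr, f_arr, c_is_view and f_is_view are illustrative.

```python
import numpy as np

c_arr = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [4, 7, 6, 9]])
f_arr = np.asfortranarray(c_arr)  # same values, column-major memory layout

# ravel() defaults to C order: a view if possible, otherwise a copy
c_is_view = np.shares_memory(c_arr, c_arr.ravel())
f_is_view = np.shares_memory(f_arr, f_arr.ravel())

print(c_arr.flags['C_CONTIGUOUS'], c_is_view)
print(f_arr.flags['C_CONTIGUOUS'], f_is_view)
```

When the second value is False, any in-place change made through ravel() is applied to a temporary copy and lost.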

In a specific row of a numpy array, how to find column indexes of the top 3 largest values

I have an array X:
X = np.array([[4, 3, 5, 2],
[9, 6, 7, 3],
[8, 6, 7, 5],
[3, 4, 5, 3],
[5, 3, 2, 6]])
I want the indices of the top 3 greatest values in a row with index 1. The result of that would be :
[0,2,1]
I am relatively new to Python. I tried doing it with argsort, but am not able to do it for one specific row.
You can use argsort on axis=1 (by row) and then extract the last 3 indices for each row:
X.argsort(axis=1)[:,:-4:-1]
#[[2 0 1]
# [0 2 1]
# [0 2 1]
# [2 1 3]
# [3 0 1]]
X = np.array([[4, 3, 5, 2],
[9, 6, 7, 3],
[8, 6, 7, 5],
[3, 4, 5, 3],
[5, 3, 2, 6]])
# get top 3 values in the row with index 1
row_sorted = sorted(X[1], reverse=True)[0:3]
# Find the corresponding index of each of these top 3 values
# (list.index returns the first match, so tied values would give repeated indices)
indexes = [list(X[1]).index(i) for i in row_sorted]
output:
[0, 2, 1]
For sufficiently large arrays, np.argpartition will be the most efficient solution. It will place the last three elements of the sort indices in the right positions:
i = np.argpartition(X[1], [-3, -2, -1])[:-4:-1]
This behaves similarly to np.argsort except that only the selected indices are guaranteed to be in their final sorted positions. All the other elements are only guaranteed to be on the correct side of each partition point, not in their exact positions.
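To compare the two approaches on the row from the question, here is a short sketch. The variable names row, top3_argsort and top3_argpartition are illustrative.

```python
import numpy as np

X = np.array([[4, 3, 5, 2],
              [9, 6, 7, 3],
              [8, 6, 7, 5],
              [3, 4, 5, 3],
              [5, 3, 2, 6]])

row = X[1]

# Full sort of the indices, then take the last three in reverse order
top3_argsort = np.argsort(row)[:-4:-1]

# Partial sort: only the last three positions are guaranteed to be in order
top3_argpartition = np.argpartition(row, [-3, -2, -1])[:-4:-1]

print(top3_argsort.tolist())
print(top3_argpartition.tolist())
```

Both give the column indices of the three largest values in row 1, largest first. With tied values the two methods may order the ties differently.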

Numpy: Indices of multiple values

I would like to set some values of a 2D array to a specific number by indexing them efficiently.
Say I have a 2D numpy array,
A = array([[1, 6, 6],
[9, 7, 7],
[10, 2, 2]])
and I would like to get the indices in the array that belong to a set of numbers, say indList=[10, 1] so that I can set them to zero. However, indList can be a huge list.
Is there a faster way for doing this without a for loop?
As a for loop it would be,
indList = [10, 1]
for i in indList:
A[A==i] = 0
But this can get inefficient when indList is large.
With numpy, you can vectorize this by first finding the indices of elements that are in indList and then setting them to be zero.
A = np.array([[1, 6, 6],
[9, 7, 7],
[10 ,2 ,2]])
A[np.where(np.isin(A, [10,1]))] = 0
This gives
A = [[0 6 6]
[9 7 7]
[0 2 2]]
Following on from @Miket25's answer, there is actually no need to add the np.where layer. np.isin(A, [10, 1]) returns a boolean array, which is perfectly acceptable as an index. So simply do
A[np.isin(A, [10, 1])] = 0
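To see what the boolean index looks like, here is a small sketch that keeps the mask in a variable before using it (mask and ind_list are illustrative names):

```python
import numpy as np

A = np.array([[1, 6, 6],
              [9, 7, 7],
              [10, 2, 2]])
ind_list = [10, 1]

# Boolean array with the same shape as A: True where the value is in ind_list
mask = np.isin(A, ind_list)
print(mask)

# Assigning through the mask modifies A in place
A[mask] = 0
print(A)
```

The membership test is done in one vectorized call, so there is no Python-level loop over ind_list.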

algorithm to randomize a matrix with uniqueness constraint

I'm trying to develop an algorithm for randomizing an NxN matrix N times with the following constraint: any two values A and B can exist at most one time across all the columns in the resulting matrices. For example, a 3x3 matrix is randomized 3 times with the following result:
matrix #0
[0, 3, 6]
[1, 4, 7]
[2, 5, 8]
matrix #1
[0, 3, 6]
[7, 1, 4]
[5, 8, 2]
matrix #2
[0, 3, 6]
[4, 7, 1]
[8, 2, 5]
The pairing of any two numbers A and B in any given column (say, 0 and 1 in column 0 of matrix #0) is unique across all the columns of all the resulting matrices. This condition must hold for every pair of values in the matrices.
I developed what I believed was a solution with the following code:
#!/usr/bin/python
w,h = 5,5
matrix_list = []
def rotate(l,n):
    return l[-n:] + l[:-n]
def transpose(l):
    return list(map(list, zip(*l)))
matrix = [[x*w + y for x in range(w)] for y in range(h)]
#matrix = transpose(matrix)
for i in range(w):
    matrix_list.append(matrix[:])
    matrix = [rotate(matrix[n],n) for n in range(w)]
for m in matrix_list:
    for arr in m:
        print(arr)
    print('\n')
It simply shifts the values of each row N places, where N is the row index within the matrix.
However, I found that the algorithm does not work whenever N is even and N > 2, as illustrated by the following partial output of a 4x4 matrix (the pairing of values in rows 0 and 2 are repeated):
(from matrix #0)
[0, 4, 8, 12]
[1, 5, 9, 13]
[2, 6, 10, 14]
[3, 7, 11, 15]
(from matrix #2)
[0, 4, 8, 12]
[9, 13, 1, 5]
[2, 6, 10, 14]
[11, 15, 3, 7]
I have tried all sorts of shifting and transposing methods and continue to come up empty. Any assistance in creating a solution for even-dimensioned matrices or a general solution covering both odd and even matrices would be much appreciated.
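This is not a solution to the even case, but a checker for the constraint may help when experimenting. The sketch below assumes my reading of the constraint is right (no unordered pair of values may share a column more than once across all matrices) and rebuilds the matrices with the same row-rotation scheme as the code above. The function names build_matrices and repeated_column_pairs are hypothetical.

```python
from collections import Counter
from itertools import combinations

def rotate(row, n):
    # Same rotation as in the question: shift the row n places to the right
    return row[-n:] + row[:-n]

def build_matrices(size):
    # Reproduces the question's scheme: rotate row n by n places at each step
    matrix = [[x * size + y for x in range(size)] for y in range(size)]
    matrices = []
    for _ in range(size):
        matrices.append(matrix)
        matrix = [rotate(matrix[n], n) for n in range(size)]
    return matrices

def repeated_column_pairs(matrices):
    # Count how often each unordered pair of values appears in the same column
    counts = Counter()
    for matrix in matrices:
        for column in zip(*matrix):
            for pair in combinations(sorted(column), 2):
                counts[pair] += 1
    return {pair for pair, count in counts.items() if count > 1}

print(repeated_column_pairs(build_matrices(3)))
print(sorted(repeated_column_pairs(build_matrices(4))))
```

An empty set means the constraint holds. For the 4x4 case the result includes the pair (0, 2), matching the repeated pairing of rows 0 and 2 described above.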

sort 2-D list python

I'm relatively new to programming, and I want to sort a 2-D array (lists as they're called in Python) by the value of all the items in each sub-array. For example:
pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
The sum of the first element of pop would be 9, because 1 + 5 + 3 = 9. The sum of the second would be 3, because 1 + 1 + 1 = 3, and so on.
I want to rearrange this so the new order would be:
newPop = [pop[1], pop[0], pop[3], pop[2]]
How would I do this?
Note: I don't want to sort the elements of each sub-array, but sort according to the sum of all the numbers in each sub-array.
You can use sorted():
>>> pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
>>> newPop = sorted(pop, key=sum)
>>> newPop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
You can also sort in-place with pop.sort(key=sum). Unless you definitely want to preserve the original list, you should prefer in-place sorting.
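The difference between the two calls can be seen in a short sketch (new_pop and result are illustrative names):

```python
pop = [[1, 5, 3], [1, 1, 1], [7, 5, 8], [2, 5, 4]]

# sorted() builds a new list; pop keeps its original order
new_pop = sorted(pop, key=sum)
print(pop[0], new_pop[0])

# list.sort() reorders pop itself and returns None
result = pop.sort(key=sum)
print(result, pop == new_pop)
```

Because list.sort() returns None, writing pop = pop.sort(key=sum) would lose the list.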
Try this:
sorted(pop, key=sum)
Explanation:
The sorted() procedure sorts an iterable (a list in this case) in ascending order
Optionally, a key parameter can be passed to determine what property of the elements in the list is going to be used for sorting
In this case, the property is the sum of each of the elements (which are sublists)
So essentially this is what's happening:
[[1,5,3], [1,1,1], [7,5,8], [2,5,4]] # original list
[sum([1,5,3]), sum([1,1,1]), sum([7,5,8]), sum([2,5,4])] # key=sum
[9, 3, 20, 11] # apply key
sorted([9, 3, 20, 11]) # sort
[3, 9, 11, 20] # sorted
[[1,1,1], [1,5,3], [2,5,4], [7,5,8]] # elements corresponding to keys
@arshajii beat me to the punch, and his answer is good. However, if you would prefer an in-place sort:
>>> pop = [[1,5,3],[1,1,1],[7,5,8],[2,5,4]]
>>> pop.sort(key=sum)
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
I'd have to look up the details of Python's sorting algorithm (I think it's called Timsort), but I'm pretty sure an in-place sort would be less memory-intensive and about the same speed.
Edit: As per this answer, I would definitely recommend x.sort()
If you wanted to sort the lists in a less traditional way, you could write your own function (that takes one parameter.) At risk of starting a flame war, I would heavily advise against lambda.
For example, if you wanted the first number to be weighted more heavily than the second, the second more heavily than the third, and so on:
>>> def weightedSum(listToSum):
...     ws = 0
...     weight = len(listToSum)
...     for i in listToSum:
...         ws += i * weight
...         weight -= 1
...     return ws
...
>>> weightedSum([1, 2, 3])
10
>>> 1 * 3 + 2 * 2 + 3 * 1
10
>>> pop
[[1, 5, 3], [1, 1, 1], [7, 5, 8], [2, 5, 4]]
>>> pop.sort(key=weightedSum)
>>> pop
[[1, 1, 1], [1, 5, 3], [2, 5, 4], [7, 5, 8]]
>>> pop += [[1, 3, 8]]
>>> pop.sort(key=weightedSum)
>>> pop
[[1, 1, 1], [1, 5, 3], [1, 3, 8], [2, 5, 4], [7, 5, 8]]
