Iterate Across Columns in a List of Lists in Python

When I try to iterate across the columns in a row, the column does not change within the nested loop:
i_rows = 4
i_cols = 3
matrix = [[0 for c in xrange(i_cols)] for r in xrange(i_rows)]
for row, r in enumerate(matrix):
    for col, c in enumerate(r):
        r[c] = 1
print matrix
Observed output
[[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]
Expected output
[[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]
I have tried different expressions such as xrange() and len(), and I am considering switching to numpy. I am a bit surprised that two-dimensional arrays in Python are not as intuitive as my first impression of the language suggested.
The goal is a two-dimensional array with varying integer values, which I later need to parse to represent 2D graphics on the screen.
How can I iterate across columns in a list of lists?

You just have to assign the value at index col, not c:
for row, r in enumerate(matrix):
    for col, c in enumerate(r):
        r[col] = 1  # Note `col`, not `c`
This is because the first value returned by enumerate() is the index and the second is the actual value itself, so c is the element (here 0), not the column index.
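For reference, here is a self-contained version of the corrected loop (written for Python 3, so range() and print() instead of the xrange and print statement used in the question):
i_rows, i_cols = 4, 3
matrix = [[0 for c in range(i_cols)] for r in range(i_rows)]

for row, r in enumerate(matrix):
    for col, c in enumerate(r):
        r[col] = 1  # col is the index into the row, c is the current value

print(matrix)
# [[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]]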

Related

Updating 2D Array Values in Python - Updating whole column wrong?

I am trying to create a 2D array and update single values one at a time, as shown here:
M = [[0]*3]*3
M[0][0] = 3
print(M)
which is returning the following:
[[3, 0, 0], [3, 0, 0], [3, 0, 0]]
Anyone have an idea of what I've done wrong?
Your first line creates one inner list of length 3 and adds three references to it to your outer list M. You must create each internal list independently if you want them to be independent lists.
The following is different in that it creates 3 separate inner lists of length 3:
M = [[0]*3 for _ in range(3)]
M[0][0] = 3
print(M)
OUTPUT
[[3, 0, 0], [0, 0, 0], [0, 0, 0]]
With [[0]*3]*3, every row refers to the same inner list (the whole 2D array shares the address of the first row), so an update through one row shows up in all of them. Writing the rows out explicitly creates separate lists:
M = [[0,0,0],[0,0,0],[0,0,0]]
M[0][0] = 3
print(M)
which returns the following:
[[3, 0, 0], [0, 0, 0], [0, 0, 0]]
FYI: this is the same problem as in: Why in a 2D array a and *a point to same address?
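To see the aliasing directly, a quick check (variable names here are just for illustration) confirms that the rows of the broken construction are one and the same object:
M_shared = [[0] * 3] * 3                  # one inner list, referenced three times
M_separate = [[0] * 3 for _ in range(3)]  # three independent inner lists

print(M_shared[0] is M_shared[1])    # True  -- same object, so one update shows up in every row
print(M_separate[0] is M_separate[1])  # False -- each row can be updated on its own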

How do I rewrite the following code in Python without for loops?

I have an array "b" of size L^2 x L*(L+1), and an array "a" of size L x L.
Currently my code is:
for i in range(L):
    for j in range(L):
        b[i+j*L, i+j*(L+1)] = a[j, i]
What this means is that, for example for L=2, if the 2x2 array "a" has the form
a b
c d
then I want the 4x6 array "b" to be
a 0 0 0 0 0
0 b 0 0 0 0
0 0 0 c 0 0
0 0 0 0 d 0
How do I rewrite the same thing without using for loops?
What you want is to fill the diagonal of matrix B with the values of (flattened) A. Numpy has functions for this:
https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flatten.html
https://numpy.org/doc/stable/reference/generated/numpy.fill_diagonal.html
import numpy as np
# Set sample data
a = np.array([[1, 2], [3, 4]])
b = np.zeros([4,6])
# This is it:
np.fill_diagonal(b, a.flatten())
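Printing b afterwards should show the flattened values of a placed along the main diagonal of the 4x6 array (as floats, since b was created with np.zeros):
print(b)
# [[1. 0. 0. 0. 0. 0.]
#  [0. 2. 0. 0. 0. 0.]
#  [0. 0. 3. 0. 0. 0.]
#  [0. 0. 0. 4. 0. 0.]]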
If you don't want to use a library, for example because this is a programming assignment, you can represent matrices as nested lists and use a list comprehension, like this:
# Prepare data
a = [[1, 2], [3, 4]]
L = len(a)
# "Build" the result
b = [[a[i//L][i%L] if i == j else 0 for j in range(L*(L+1))] for i in range(L*L)]
# Same, with better formatting:
b = [[a[i//L][i%L]
      if i == j else 0
      for j in range(L*(L+1))]
     for i in range(L*L)]
# b will be = [[1, 0, 0, 0, 0, 0],
#              [0, 2, 0, 0, 0, 0],
#              [0, 0, 3, 0, 0, 0],
#              [0, 0, 0, 4, 0, 0]]
Either way you need to iterate through the items in a, so you are really just replacing the for loops with a list comprehension. This might be more efficient for large matrices, but it is arguably less clear.
Generalizing the answer from Milo:
L = a.shape[0]
b = np.zeros([L*L, L*(L+1)])
np.fill_diagonal(b, a.flatten())

iterating through multidimensional array to store index/coordinates of occupied locations in dict using value as key

I am iterating through a multi-dimensional Python list using a for loop. The list represents a board, and I am looking for occupied spaces. There will always be more than one space for each game piece, which are each represented by an integer.
I want to use that integer as the key and then store occupied coordinates in individual lists as the values of a dictionary. Finally, a third value will be stored in each coordinate that represents its state: (x, y, state).
For example,
board = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 2, 0],
    [0, 2, 0],
]
ships = {}
Here is my code:
for row in board:
    for column in row:
        if column != 0:
            if column not in ships:
                ships[column] = [[row.index(column), board.index(row), 0]]
            else:
                ships[column].append([row.index(column), board.index(row), 0])
I'm not sure why it's adding the same coordinate twice. It should never encounter that coordinate again if it is looping linearly through each row and column.
Your problem is the use of the index() method. When you write row.index(column), it returns the index of the first element in the row that is equal to the value you gave it (and board.index(row) behaves the same way for rows).
The problem arises because you have rows that are equal, for example the first and second rows, [0, 0, 1] and [0, 0, 1], so the lookup always resolves to the first match.
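A quick illustration of the issue, using the board from the question:
board = [[0, 0, 1], [0, 0, 1], [0, 2, 0], [0, 2, 0]]
print(board.index([0, 0, 1]))  # 0 -- also returned while iterating over the second row
print(board.index([0, 2, 0]))  # 2 -- also returned for the fourth row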
I suggest you use the enumerate() built-in function to retrieve the indices.
for row_index, row in enumerate(board):
    for column_index, column in enumerate(row):
        if column != 0:
            if column not in ships:
                ships[column] = [[column_index, row_index, 0]]
            else:
                ships[column].append([column_index, row_index, 0])
Output for
board = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 2, 0],
    [0, 2, 0],
]
{1: [[2, 0, 0], [2, 1, 0]], 2: [[1, 2, 0], [1, 3, 0]]}.
Just a reminder that the indices start from 0.
Using numpy.nonzero() and collections.defaultdict:
from collections import defaultdict
import numpy as np
board = np.asarray(board)
ships = defaultdict(list)
for i, j in zip(*np.nonzero(board)):
    ships[board[i, j]].append([i, j, 0])
ships
# defaultdict(list, {1: [[0, 2, 0], [1, 2, 0]], 2: [[2, 1, 0], [3, 1, 0]]})
np.nonzero(board) gives a tuple of (row indices, column indices) where the board is not zero. For each of these (i, j) combinations, we access the corresponding element of the board and append its coordinates to the dictionary under that value as the key.
An alternative with np.ndenumerate():
for (row, col), val in np.ndenumerate(board):
    # Introspect this with `list(np.ndenumerate(board))`
    if val != 0:
        ships[val].append([row, col, 0])
If you want a Python dictionary, just use dict(ships).

compare large sets of arrays

I have a numpy array A of n 1x3 arrays where n is the total number of possible combinations of elements in the 1x3 arrays, where each element ranges from 0 to 50. That is,
A = [[0,0,0],[0,0,1]...[0,1,0]...[50,50,50]]
and
len(A) = 50*50*50 = 125000
I have a numpy array B of m 1x3 arrays where m = 10 million, and the arrays can have values belonging to the set described by A.
I want to count up how many of each combination is present in B, that is, how many times [0,0,0] appears in B, how many times [0,0,1] appears...how many times [50,50,50] appears. So far I have the following:
for i in range(len(A)):
    for j in range(len(B)):
        if np.array_equal(A[i], B[j]):
            y[i] += 1
where y keeps track of how many times the ith array occurs. So, y[0] is how many times [0,0,0] appeared in B, y[1] is how many times [0,0,1] appeared...y[125000] is how many times [50,50,50] appeared, etc.
The problem is this takes forever. It has to check 10 million entries, 125000 times. Is there a quicker and more efficient way to do this?
Here is a fast approach. It processes 10 million tuples out of range(50)^3 in a fraction of a second and is roughly 100 times faster than the next best solution (@Primusa's):
It uses the fact that there is a straightforward translation between such tuples and the numbers 0 through 50^3 - 1. (The mapping happens to be the same as the one between the rows of your A and the row numbers.) The functions np.ravel_multi_index and np.unravel_index implement this translation and its inverse.
Once B is translated into numbers, their frequencies can be determined very efficiently using np.bincount. Below I reshape the result to get a 50x50x50 histogram but that is just a matter of taste and can be left out. (I have taken the liberty to only use numbers 0 through 49, so len(A) becomes 125000):
>>> B = np.random.randint(0, 50, (10000000, 3))
>>> Br = np.ravel_multi_index(B.T, (50, 50, 50))
>>> result = np.bincount(Br, minlength=125000).reshape(50, 50, 50)
Let's look at a smaller example for demonstration:
>>> B = np.random.randint(0, 3, (10, 3))
>>> Br = np.ravel_multi_index(B.T, (3, 3, 3))
>>> result = np.bincount(Br, minlength=27).reshape(3, 3, 3)
>>>
>>> B
array([[1, 1, 2],
       [2, 1, 2],
       [2, 0, 0],
       [2, 1, 0],
       [2, 0, 2],
       [0, 0, 2],
       [0, 0, 2],
       [0, 2, 2],
       [2, 0, 0],
       [0, 2, 0]])
>>> result
array([[[0, 0, 2],
        [0, 0, 0],
        [1, 0, 1]],

       [[0, 0, 0],
        [0, 0, 1],
        [0, 0, 0]],

       [[2, 0, 1],
        [1, 0, 1],
        [0, 0, 0]]])
To query, for example, how many times [2,1,0] occurs in B, one would do:
>>> result[2,1,0]
1
As mentioned above: to convert between indices into your A and the actual rows of A (which are the indices into my result), np.ravel_multi_index and np.unravel_index can be used. Or you can leave out the last reshape (i.e. use result = np.bincount(Br, minlength=125000)); then the counts are indexed exactly the same as A.
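A small illustration of that mapping, using the 3x3x3 shape from the demonstration above (variable names here are just for illustration):
idx = np.ravel_multi_index((2, 1, 0), (3, 3, 3))  # flat index 21, i.e. 2*9 + 1*3 + 0
coords = np.unravel_index(idx, (3, 3, 3))         # back to (2, 1, 0)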
You can use a dict to speed this up to a single pass through the 10 million entries.
The first thing you want to do is change all the sublists in A to hashable objects so you can use them as keys in a dict.
Converting all the sublists to tuples:
A = [tuple(i) for i in A]
Then create a dict with every value in A as a key and 0 as its value:
d = {i:0 for i in A}
Now, for each subarray in your numpy array B, convert it to a tuple and increment its count in d by 1:
for subarray in B:
    d[tuple(subarray)] += 1
d is now a dictionary where, for each key, the value is how many times that key occurred in B.
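Putting those pieces together, a minimal runnable sketch with small sample data (the real A and B would be the full 125000 x 3 and 10-million x 3 arrays):
import numpy as np

A = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]            # sample subset of the combinations
B = np.array([[0, 0, 1], [0, 0, 0], [0, 0, 1]])  # sample data to count

A = [tuple(i) for i in A]  # make the rows hashable
d = {i: 0 for i in A}      # start every count at 0
for subarray in B:
    d[tuple(subarray)] += 1  # one dict lookup per row of B

print(d)
# {(0, 0, 0): 1, (0, 0, 1): 2, (0, 1, 0): 0}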
You can find the unique rows of array B and their counts by calling np.unique over its first axis with return_counts=True. Then you can use broadcasting to find the indices of B's unique rows in A by calling the ndarray.all and ndarray.any methods on the proper axes. After that, all you need is simple indexing:
In [82]: unique, counts = np.unique(B, axis=0, return_counts=True)
In [83]: indices = np.where((unique == A[:,None,:]).all(axis=2).any(axis=0))[0]
# Get items from A that exist in B
In [84]: unique[indices]
# Get the counts
In [85]: counts[indices]
Example:
In [86]: arr = np.array([[2 ,3, 4], [5, 6, 0], [2, 3, 4], [1, 0, 4], [3, 3, 3], [5, 6, 0], [2, 3, 4]])
In [87]: a = np.array([[2, 3, 4], [1, 9, 5], [3, 3, 3]])
In [88]: unique, counts = np.unique(arr, axis=0, return_counts=True)
In [89]: indices = np.where((unique == a[:,None,:]).all(axis=2).any(axis=0))[0]
In [90]: unique[indices]
Out[90]:
array([[2, 3, 4],
[3, 3, 3]])
In [91]: counts[indices]
Out[91]: array([3, 1])
You can do this:
y = [np.where(np.all(B == arr, axis=1))[0].shape[0] for arr in A]
arr just iterates over A; np.all checks where it matches a row of B, np.where returns the positions of those matches as an array, and shape then gives the length of that array, in other words the desired frequency.
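For example, reusing the small sample arrays a and arr from the np.unique answer above (standing in for A and B), this gives one count per row of a:
y = [np.where(np.all(arr == row, axis=1))[0].shape[0] for row in a]
print(y)
# [3, 0, 1]  -> [2, 3, 4] occurs three times, [1, 9, 5] never, [3, 3, 3] once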

Removing a specific list from an array

I have this list
list = [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 1], [2, 2], [2, 0]]
I want to take 2 integers
row = 2 and column = 1
Combine them
thing = (str(row) + str(", ") + str(column))
then I want to remove the list
[2, 1]
from the array. How would I do this?
EDIT: The language is Python
First of all, don't name your list list. It will shadow the built-in list() and can mess with your code later.
Secondly, removing an element from a list is done like this:
data.remove(value)
or in your case
data.remove([2, 1])
Specifically, where you are looking for an entry [row, column], you would do
data.remove([row, column])
where row and column are your two variables.
It may be a bit confusing to name them row and column, though, because your data could be interpreted as a matrix/2D array, where "row" and "column" have a different meaning.
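As a small end-to-end sketch (the name data is arbitrary; the membership check simply avoids the ValueError that list.remove() raises when the pair is not present):
data = [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 1], [2, 2], [2, 0]]
row, column = 2, 1
if [row, column] in data:
    data.remove([row, column])
print(data)
# [[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 2], [2, 0]]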
