Constructing a matrix with lists of lists in Python [duplicate] - python

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 6 years ago.
I am trying to construct a matrix as a list of lists in Python 3.
When I implement the following function,
def myfunction(h, l):
    resultmat = (h+1) * [(l+1) * [0]]
    for i in range(1,h):
        for j in range(1, l+1):
            resultmat[i][j] = i + j
    return resultmat
I get the following result:
myfunction(2,3)
Out[42]: [[0, 2, 3, 4], [0, 2, 3, 4], [0, 2, 3, 4]]
I had expected that the first row and the first column would be filled with zeroes, and that every other element would be the sum of its row and column indices. That is not what I get: every row is identical. Can someone explain (a) what is going on and (b) how I can solve this problem?

The problem is the list multiplication. You aren't getting h+1 separate lists, you're getting the same list h+1 times. Use
resultmat = [[0 for j in range(l+1)] for i in range(h+1)]
to build your zero matrix instead.
Edit:
Reading the rest of your question, you could probably do the whole thing in a list comprehension:
[[i+j if i and j else 0 for j in range(l+1)] for i in range(h+1)]

When you created the initial zero-matrix:
resultmat = (h+1) * [(l+1) * [0]]
This creates a list of lists of zeros, but the inner lists (the rows) are all references to the same list. When you change one, the change shows up in all of them:
>>> l = 3*[3*[0]]
>>> l
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
>>> l[0][0] = 1
>>> l
[[1, 0, 0], [1, 0, 0], [1, 0, 0]]
>>>
When I changed the first list to contain a 1, all the lists now contain the 1 because they're all actually the same list.
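The aliasing described above can be checked directly with the `is` operator. The following is a minimal sketch; the names rows_shared and rows_fresh are made up for illustration and do not appear in the question.

```python
# Hypothetical names, chosen only for this illustration.
rows_shared = 3 * [3 * [0]]               # one inner list, referenced three times
rows_fresh = [[0] * 3 for _ in range(3)]  # a new inner list on every iteration

# `is` compares object identity, not contents.
print(rows_shared[0] is rows_shared[1])  # True
print(rows_fresh[0] is rows_fresh[1])    # False

rows_shared[0][0] = 1
rows_fresh[0][0] = 1
print(rows_shared)  # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(rows_fresh)   # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
```

Multiplying the inner `[0]` is harmless because integers are immutable; only the outer multiplication copies references to a mutable list.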
While Patrick's answer is correct, here is a slightly more readable version of your code, which accomplishes what you want. It creates a matrix for which every cell is the sum of the two indices, and then zeros the first row and the first column.
from pprint import pprint
def create_matrix(height, length):
    matrix = [ [ i + j for j in range(length) ] for i in range(height) ]
    matrix[0] = length*[0] # zero the first row
    for row in matrix:
        row[0] = 0 # zero the first column
    return matrix
pprint(create_matrix(10, 11))
Output:
[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
[0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[0, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
[0, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
[0, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
[0, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],
[0, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
[0, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]]

Related

How do I put data in a two-dimensional array sequentially without duplication?

I want to put data consisting of a one-dimensional array into a two-dimensional array. I will assume that the number of rows and columns is 5.
The code I tried is as follows.
data = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
a = []
for i in range(5):
    a.append([])
    for j in range(5):
        a[i].append(j)
print(a)
# result : [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
# I want this : [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
You don't have to worry about the last [20].
The important thing is that the row must change without duplicating the data.
I want to solve it, but I can't think of any way. I ask for your help.
There are two issues with the current code.
It doesn't actually use any of the values from the variable data.
The data does not contain enough items to populate a 5x5 array.
After adding 0 to the beginning of the variable data and using the values from the variable, the code becomes
data = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
a = []
for i in range(5):
    a.append([])
    for j in range(5):
        if i*5+j >= len(data):
            break
        a[i].append(data[i*5+j])
print(a)
The output of the new code will be
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
This should deliver the desired output. Note that it computes each value from the indices and does not read from data:
data = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
a = []
x_i = 5
x_j = 5
for i in range(x_i):
    a.append([])
    for j in range(x_j):
        a[i].append(i*x_j+j)
print(a)
Output:
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20, 21, 22, 23, 24]]
Using a list comprehension with slicing:
data = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
columns = 5
rows = 5
result = [data[i * columns: (i + 1) * columns] for i in range(rows)]
print(result)
# [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
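You could use itertools.groupby with integer division to create the groups. This groups by value rather than by position, so it works here only because each value in data equals its index: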
from itertools import groupby
data = [0, 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
grouped_data = [list(v) for k, v in groupby(data, key=lambda x: x//5)]
print(grouped_data)
Output
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14], [15, 16, 17, 18, 19], [20]]
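Because the groupby approach above keys on the values themselves, it may be safer to split by position when the data is arbitrary. A minimal sketch, assuming a hypothetical helper named chunk that is not part of any answer here:

```python
def chunk(seq, size):
    """Split seq into consecutive pieces of at most `size` items (hypothetical helper)."""
    return [seq[start:start + size] for start in range(0, len(seq), size)]

data = list(range(1, 21))  # the original data from the question, 1 through 20
print(chunk(data, 5))
# [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15], [16, 17, 18, 19, 20]]
```

Slicing past the end of a list is allowed in Python, so a final short row needs no special handling.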

Efficiently iterate over nested lists to find sum

I have an array of arrays and want to find the combinations, taking one element from each inner array, whose sum equals 40. The problem is that the resulting array has around 270,000,000 rows, so checking them sequentially is out of the question. I cannot find the sums in a reasonable amount of time: I ran this program overnight and it was still running in the morning. How can I make this program more efficient so that it runs reasonably fast?
Here is my code so far:
import numpy as np
def cartesianProduct(arrays):
    la = arrays.shape[0]
    arr = np.empty([la] + [a.shape[0] for a in arrays], dtype="int32")
    for i, a in enumerate(np.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T
rows = np.array(
[
[2, 15, 23, 19, 3, 2, 3, 27, 20, 11, 27, 10, 19, 10, 13, 10],
[22, 9, 5, 10, 5, 1, 24, 2, 10, 9, 7, 3, 12, 24, 10, 9],
[16, 0, 17, 0, 2, 0, 2, 0, 10, 0, 15, 0, 6, 0, 9, 0],
[11, 27, 14, 5, 5, 7, 8, 24, 8, 3, 6, 15, 22, 6, 1, 1],
[10, 0, 2, 0, 22, 0, 2, 0, 17, 0, 15, 0, 14, 0, 5, 0],
[1, 6, 10, 6, 10, 2, 6, 10, 4, 1, 5, 5, 4, 8, 6, 3],
[6, 0, 13, 0, 3, 0, 3, 0, 6, 0, 10, 0, 10, 0, 10, 0],
],
dtype="int32",
)
product = cartesianProduct(rows)
combos = []
for row in product:
    if sum(row) == 40:
        combos.append(row)
print(combos)
I believe what you are trying to do is a variant of the subset sum problem, which is NP-hard in general. Look into "dynamic programming" and "subset sum".
Examples:
https://www.geeksforgeeks.org/subset-sum-problem-dp-25/
https://www.techiedelight.com/subset-sum-problem/
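The dynamic-programming idea mentioned above could look roughly like the sketch below. It only counts the combinations instead of listing them, and it assumes every value is a non-negative integer, as in the question's data. The function name count_combinations and the small test rows are made up for illustration.

```python
from collections import Counter

def count_combinations(rows, target):
    """Count ways to pick one value per row so that the picks sum to target.

    Assumes all values are non-negative integers.
    """
    ways = Counter({0: 1})  # ways[s] = number of partial picks summing to s
    for row in rows:
        next_ways = Counter()
        for partial_sum, n in ways.items():
            for value in row:
                total = partial_sum + value
                if total <= target:  # a larger sum can never come back down
                    next_ways[total] += n
        ways = next_ways
    return ways[target]

small_rows = [[1, 2, 3], [0, 2], [1, 4]]
print(count_combinations(small_rows, 5))  # 2
```

Each row is processed once against at most target + 1 partial sums, so the work grows with rows x row length x target rather than with the size of the Cartesian product.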
As suggested in the comments, one way to optimize this is to stop extending a partial combination as soon as its sum already exceeds your threshold (40 in this case). As a further optimization, you can sort the arrays incrementally from largest to smallest.
Check heapq.nlargest() for incremental partial sorting.
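The early-exit check described above can be sketched as a recursive search that abandons a partial combination once its sum passes the target. This assumes non-negative values, and the name find_combinations is hypothetical; the sorting step is left out to keep the sketch short.

```python
def find_combinations(rows, target):
    """Yield one-value-per-row tuples summing to target, pruning early.

    Assumes non-negative integers, so a partial sum above target is a dead end.
    """
    def extend(depth, partial, partial_sum):
        if depth == len(rows):
            if partial_sum == target:
                yield tuple(partial)
            return
        for value in rows[depth]:
            if partial_sum + value > target:
                continue  # prune: later values cannot reduce the sum
            partial.append(value)
            yield from extend(depth + 1, partial, partial_sum + value)
            partial.pop()

    yield from extend(0, [], 0)

small_rows = [[1, 2, 3], [0, 2], [1, 4]]
print(list(find_combinations(small_rows, 5)))
# [(1, 0, 4), (2, 2, 1)]
```

Because it is a generator, matches are produced one at a time and the full Cartesian product is never held in memory.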

Python creating a list of lists overrides but does not append

Folks,
Basically, what I am expecting is a list of lists based on comma-separated input numbers. As you can see, I have 5,6, which means I need to create 5 lists with 6 elements each, where each element is its position in the list multiplied by the index of the list. So what I need from the input below is
[[0,0,0,0,0,0], [0,1,2,3,4,5], [0,2,4,6,8,10], [0,3,6,9,12,15],[0,4,8,12,16,20]]
instead what I get is
[[0, 4, 8, 12, 16, 20], [0, 4, 8, 12, 16, 20], [0, 4, 8, 12, 16, 20], [0, 4, 8, 12, 16, 20], [0, 4, 8, 12, 16, 20]]
I am not sure what I am doing wrong. Can anyone please help?
input_str = '5,6'
lst = list(input_str.split(","))
main_lst = []
total_list = int(lst[0])
total_elements = int(lst[1])
lst1 = []
for i in range(total_list):
    lst1.clear()
    for j in range(total_elements):
        lst1.append(j*i)
    print(lst1)
    main_lst[i] = lst1
print(main_lst)
This can easily be done using a list comprehension:
lstCount,elementCount = map(int,'5,6'.split(','))
bigLst = [[i*j for j in range(elementCount)] for i in range(lstCount)]
output
[[0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5], [0, 2, 4, 6, 8, 10], [0, 3, 6, 9, 12, 15], [0, 4, 8, 12, 16, 20]]
You should append a copy of lst1 instead of assigning lst1 itself. Otherwise every entry in main_lst refers to the same list, and its contents change whenever you change lst1.
code:
input_str = '5,6'
lst = list(input_str.split(","))
main_lst = []
total_list = int(lst[0])
total_elements = int(lst[1])
lst1 = []
for i in range(total_list):
    lst1.clear()
    for j in range(total_elements):
        lst1.append(j*i)
    print(lst1)
    main_lst.append(lst1.copy())  # you should append its copy
print(main_lst)
result:
[0, 0, 0, 0, 0, 0]
[0, 1, 2, 3, 4, 5]
[0, 2, 4, 6, 8, 10]
[0, 3, 6, 9, 12, 15]
[0, 4, 8, 12, 16, 20]
[[0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5], [0, 2, 4, 6, 8, 10], [0, 3, 6, 9, 12, 15], [0, 4, 8, 12, 16, 20]]
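An alternative to clearing and copying is to create a brand-new inner list on every pass of the outer loop, so no two rows can share storage. A minimal sketch, with build_table as a hypothetical helper name:

```python
def build_table(input_str):
    """Parse 'rows,cols' and return a rows x cols multiplication table (hypothetical helper)."""
    total_list, total_elements = map(int, input_str.split(","))
    main_lst = []
    for i in range(total_list):
        row = []  # a new list each time, so earlier rows are never touched again
        for j in range(total_elements):
            row.append(i * j)
        main_lst.append(row)
    return main_lst

print(build_table("5,6"))
# [[0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5], [0, 2, 4, 6, 8, 10], [0, 3, 6, 9, 12, 15], [0, 4, 8, 12, 16, 20]]
```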

Zero pad array based on other array's shape

I've got K feature vectors that all share dimension n but have a variable dimension m (n x m). They all live in a list together.
to_be_padded = []
to_be_padded.append(np.reshape(np.arange(9),(3,3)))
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
to_be_padded.append(np.reshape(np.arange(18),(3,6)))
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
to_be_padded.append(np.reshape(np.arange(15),(3,5)))
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
What I am looking for is a smart way to zero pad the rows of these np.arrays such that they all share the same dimension m. I've tried solving it with np.pad but I have not been able to come up with a pretty solution. Any help or nudges in the right direction would be greatly appreciated!
The result should leave the arrays looking like this:
array([[0, 1, 2, 0, 0, 0],
[3, 4, 5, 0, 0, 0],
[6, 7, 8, 0, 0, 0]])
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17]])
array([[ 0, 1, 2, 3, 4, 0],
[ 5, 6, 7, 8, 9, 0],
[10, 11, 12, 13, 14, 0]])
You could use np.pad for that. It can pad 2-D arrays when given a tuple of pad widths per axis, ((top, bottom), (left, right)). For that you could define:
def pad_to_length(x, m):
    return np.pad(x,((0, 0), (0, m - x.shape[1])), mode = 'constant')
Usage
You could start by finding the ndarray with the largest number of columns. Say you have two of them, a and b:
a = np.array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
b = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
m = max(i.shape[1] for i in [a,b])
# 5
And then use this parameter to pad the ndarrays:
pad_to_length(a, m)
array([[0, 1, 2, 0, 0],
[3, 4, 5, 0, 0],
[6, 7, 8, 0, 0]])
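Applied to the whole list from the question, the same np.pad call might be wrapped as in the sketch below. The helper name pad_all is an assumption for illustration; the three arrays are the ones from the question.

```python
import numpy as np

def pad_all(arrays):
    """Right-pad each 2-D array with zeros up to the widest column count."""
    m = max(a.shape[1] for a in arrays)
    return [np.pad(a, ((0, 0), (0, m - a.shape[1])), mode="constant") for a in arrays]

to_be_padded = [
    np.arange(9).reshape(3, 3),
    np.arange(18).reshape(3, 6),
    np.arange(15).reshape(3, 5),
]
padded = pad_all(to_be_padded)
print([p.shape for p in padded])  # [(3, 6), (3, 6), (3, 6)]
```

With mode="constant" the fill value defaults to 0, and the integer dtype of each input is preserved.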
I believe there is no very efficient solution for this. I think you will need to loop over the list with a for loop and treat every array individually:
n = to_be_padded[0].shape[0]
maxM = max(a.shape[1] for a in to_be_padded)
for i in range(len(to_be_padded)):
    padded = np.zeros((n, maxM), dtype=to_be_padded[i].dtype)
    padded[:,:to_be_padded[i].shape[1]] = to_be_padded[i]
    to_be_padded[i] = padded
where n is the shared number of rows and maxM is the largest m among the matrices in your list.

Python: How to create a zero-indexed 2-dimensional matrix with a size of n

Given a value n, what is the most efficient way to create a zero-indexed matrix with the column and row size equal to the value n?
I know the following command will create a 2d matrix with 0 as all of the cell values:
[[0 for x in range(n)] for y in range(n)]
So when the n = 4, the matrix would look as follows:
[[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]
But is there any way to create the matrix as follows?
[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]]
Since you are talking about 2-d matrices, and you're specifically asking about efficiency, it might be best to turn to numpy:
import numpy as np
n = 4
my_matrix = np.arange(n*n).reshape(n,n)
>>> my_matrix
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
Or you could use a list comprehension such as the following (or one of the ones suggested by @hiroprotagonist):
n = 4
my_ListOfLists = [list(range(i,i+n)) for i in range(0,n*n,n)]
>>> my_ListOfLists
[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
You could simply use the current coordinates x and y to calculate the value x + y*n:
n = 4
print([[x + y*n for x in range(n)] for y in range(n)])
Or you could use itertools.count:
from itertools import count
n = 4
counter = count()
print([[next(counter) for x in range(n)] for y in range(n)])
Something like this would work for n = 4:
[[i for i in range(j, j + 4)] for j in range(0, 16, 4)]
[[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11],
[12, 13, 14, 15]]
