Unsure of how 2-D lists work in Python

I need to make a 6x8 matrix. In this matrix I need to assign values to various cells and then simulate a heating/cooling system. Before I get there, though, I need to make sure this is right. Is this how you make rows and columns? Does it matter that it does not display as a grid when printed? As I said, I need to assign values to each of these cells; does it matter that some already have a value because of the way I made the lists? Is there a way to make the list without any initial values?
matrix = []  # Create an empty list
numberOfRows = 6
numberOfColumns = 8
for row in range(0, numberOfRows):
    matrix.append([])  # Add an empty new row
    for column in range(0, numberOfColumns):
        matrix[row].append(column)
print(matrix)
[[0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7]]

Yes, it is a good method. However, it is much easier and more Pythonic to use a list comprehension: matrix = [list(range(numberOfColumns)) for _ in range(numberOfRows)]
And yes, you can make a list with no values: [] or list()
You can even make a 2-dimensional one: [[]]
However, that has little use here.

You can use this approach to make 2D arrays, but a few comments:
You may consider initializing the elements to 0 (or -1) by changing the last line to matrix[row].append(0). This is mostly a design choice: if, later in your program, you do not change the values at every position, you will be left with the old values from initialization.
You can rewrite range(0, numberOfRows) as range(numberOfRows); it starts from 0 by default.
Use a list comprehension, as Labo has already mentioned.
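Putting both suggestions together, here is a minimal sketch of a 6x8 grid initialized to all zeros with a list comprehension, printed row by row so it displays as a grid (variable names follow the question):

```python
numberOfRows = 6
numberOfColumns = 8

# Build each row as a fresh list of zeros; "_" marks an unused loop variable.
matrix = [[0] * numberOfColumns for _ in range(numberOfRows)]

# Printing row by row shows the grid shape that print(matrix) hides.
for row in matrix:
    print(row)
```

One caution: avoid [[0] * numberOfColumns] * numberOfRows, because that repeats a single row object numberOfRows times, so changing one cell would appear to change that cell in every row.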


Remove duplicate from list without using another list or set

I want to remove duplicates from a Python list and keep only the unique values:
l = [3, 2, 3, 4, 5, 6, 1, 2]
for i in range(len(l)):
    if i in l[:i+1]:
        l.pop(i-2)
If I use pop(i), it raises a "pop index out of range" error.
While moving forward through the loop, I am trying to check whether the current item is present in the previous part l[0:i+1]; if it is, pop the current item.
I don't want to use a set!
You can do this by converting the list into a set and then converting it back into a list.
The set data structure contains only unique elements, so converting to a set eliminates all the duplicates, but the result's type is set, not list. To get the original data type back, convert the set to a list again; note that this does not preserve the original order.
l = [3, 2, 3, 4, 5, 6, 1, 2]
list(set(l))
Output:
[1, 2, 3, 4, 5, 6]
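If the original order matters, a common order-preserving alternative (a sketch, not from this answer; it does use a dict, which may still violate the asker's "no other data structure" constraint) is dict.fromkeys, since dicts preserve insertion order in Python 3.7+:

```python
l = [3, 2, 3, 4, 5, 6, 1, 2]

# dict keys are unique and keep insertion order, so this deduplicates
# while preserving the first occurrence of each value.
print(list(dict.fromkeys(l)))  # → [3, 2, 4, 5, 6, 1]
```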
Since you've mentioned not to use any other data structure, I am providing a solution that runs in quadratic time with respect to the length of the list.
l = [3, 2, 3, 4, 5, 6, 1, 2]
for i in range(len(l)):
    for j in range(len(l) - 1, i, -1):
        if l[i] == l[j]:
            l.pop(j)
print(l)
How it works:
The outer loop with variable i iterates over the list. The nested loop with variable j checks whether the item at index i appears again after index i; every such later occurrence is removed.
Note that we are iterating backwards in the nested loop to avoid index out-of-range situations.
This implementation will not alter the order of the elements and doesn't use any extra space.
l = [3, 2, 3, 4, 5, 6, 1, 2]
loc = 0
itr = 0
while loc < len(l):
    # test if the item in the current position already exists in l[:loc]
    if l[loc] in l[:loc]:
        # remove the current item
        l.pop(loc)
    else:
        # move the loc variable to test the next item
        loc += 1
    # if we removed an item we test the new item that gets pushed into this position
    itr += 1
    print('itr:{}, loc:{}, l:{}'.format(itr, loc, l))
print('Result:{}'.format(l))
itr:1, loc:1, l:[3, 2, 3, 4, 5, 6, 1, 2]
itr:2, loc:2, l:[3, 2, 3, 4, 5, 6, 1, 2]
itr:3, loc:2, l:[3, 2, 4, 5, 6, 1, 2]
itr:4, loc:3, l:[3, 2, 4, 5, 6, 1, 2]
itr:5, loc:4, l:[3, 2, 4, 5, 6, 1, 2]
itr:6, loc:5, l:[3, 2, 4, 5, 6, 1, 2]
itr:7, loc:6, l:[3, 2, 4, 5, 6, 1, 2]
itr:8, loc:6, l:[3, 2, 4, 5, 6, 1]
Result:[3, 2, 4, 5, 6, 1]
If we typecast a list into a set, that also removes the duplicates.
Example:
x = [1, 1, 2, 1, 2, 1, 3, 4]
Here x is a list which contains duplicates.
y = set(x)
print(y)
Output: {1, 2, 3, 4}
So in this example, when I typecast x to a set and stored the output in y, y as a set contains only the unique elements.
If we want to convert this set back to a list, we can do:
z = list(y)

2D Vectorization of unique values per row with condition

Consider the array and function definition shown:
import numpy as np

a = np.array([[2, 2, 5, 6, 2, 5],
              [1, 5, 8, 9, 9, 1],
              [0, 4, 2, 3, 7, 9],
              [1, 4, 1, 1, 5, 1],
              [6, 5, 4, 3, 2, 1],
              [3, 6, 3, 6, 3, 6],
              [0, 2, 7, 6, 3, 4],
              [3, 3, 7, 7, 3, 3]])

def grpCountSize(arr, grpCount, grpSize):
    count = [np.unique(row, return_counts=True) for row in arr]
    valid = [np.any(np.count_nonzero(row[1] == grpSize) == grpCount) for row in count]
    return valid
The point of the function is to return the rows of array a that have exactly grpCount groups of elements that each hold exactly grpSize identical elements.
For example:
# which rows have exactly 1 group that holds exactly 2 identical elements?
out = a[grpCountSize(a, 1, 2)]
As expected, the code outputs out = [[2, 2, 5, 6, 2, 5], [3, 3, 7, 7, 3, 3]].
The 1st output row has exactly 1 group of 2 (ie: 5,5), while the 2nd output row also has exactly 1 group of 2 (ie: 7,7).
Similarly:
# which rows have exactly 2 groups that each hold exactly 3 identical elements?
out = a[grpCountSize(a, 2, 3)]
This produces out = [[3, 6, 3, 6, 3, 6]], because only this row has exactly 2 groups each holding exactly 3 elements (ie: 3,3,3 and 6,6,6)
PROBLEM: My actual arrays have just 6 columns, but they can have many millions of rows. The code works perfectly as intended, but it is VERY SLOW for long arrays. Is there a way to speed this up?
np.unique sorts the array, which makes it less efficient for your purpose. Use np.bincount instead; that way you will most likely save some time, depending on your array's shape and values (note that np.bincount requires nonnegative integers). You also will not need np.any anymore:
def grpCountSize(arr, grpCount, grpSize):
    count = [np.bincount(row) for row in arr]
    valid = [np.count_nonzero(row == grpSize) == grpCount for row in count]
    return valid
Another way that might save even more time is to use the same number of bins for all rows and create a single array:
def grpCountSize(arr, grpCount, grpSize):
    m = arr.max()
    count = np.stack([np.bincount(row, minlength=m+1) for row in arr])
    return (count == grpSize).sum(1) == grpCount
Yet another upgrade is to use the vectorized 2D bin count from this post. For example (note that the Numba solutions tested in the linked post are faster; the NumPy solution is provided here as an example, and you can replace the function with any of the ones suggested in that post):
def grpCountSize(arr, grpCount, grpSize):
    count = bincount2D_vectorized(arr)
    return (count == grpSize).sum(1) == grpCount

# from the post above
def bincount2D_vectorized(a):
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)
Output of all the solutions above:
a[grpCountSize(a, 1, 2)]
#array([[2, 2, 5, 6, 2, 5],
#       [3, 3, 7, 7, 3, 3]])
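As a quick sanity check (my addition, not part of the original answer), the vectorized 2D bin count can be compared against the per-row np.bincount version on a small array:

```python
import numpy as np

# from the post referenced above
def bincount2D_vectorized(a):
    # offset each row into its own block of N bins, then bincount once
    N = a.max() + 1
    a_offs = a + np.arange(a.shape[0])[:, None] * N
    return np.bincount(a_offs.ravel(), minlength=a.shape[0] * N).reshape(-1, N)

a = np.array([[2, 2, 5, 6, 2, 5],
              [3, 6, 3, 6, 3, 6]])

vec = bincount2D_vectorized(a)
per_row = np.stack([np.bincount(row, minlength=a.max() + 1) for row in a])
print(np.array_equal(vec, per_row))  # → True
```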

Numpy increment array indexed array? [duplicate]

I am trying to efficiently update some elements of a NumPy array A, using another array b to indicate the indexes of the elements of A to be updated. However, b can contain duplicates, which are ignored, whereas I would like them to be taken into account. I would like to avoid for-looping over b. To illustrate:
>>> A = np.arange(10).reshape(2,5)
>>> A[0, np.array([1,1,1,2])] += 1
>>> A
array([[0, 2, 3, 3, 4],
       [5, 6, 7, 8, 9]])
whereas I would like the output to be:
array([[0, 4, 3, 3, 4],
       [5, 6, 7, 8, 9]])
Any ideas?
To correctly handle the duplicate indices, you'll need to use np.add.at instead of +=. Therefore to update the first row of A, the simplest way would probably be to do the following:
>>> np.add.at(A[0], [1,1,1,2], 1)
>>> A
array([[0, 4, 3, 3, 4],
       [5, 6, 7, 8, 9]])
The documentation for the ufunc.at method can be found here.
One approach is to use numpy.histogram to find out how many values there are at each index, then add the result to A:
A[0, :] += np.histogram(np.array([1,1,1,2]), bins=np.arange(A.shape[1]+1))[0]
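Along the same lines (my sketch, not from the original answer), np.bincount can replace np.histogram when the indices are nonnegative integers: count how often each index occurs, then add the counts in one vectorized step:

```python
import numpy as np

A = np.arange(10).reshape(2, 5)

# bincount tallies how many times each index appears; minlength pads
# the result out to the row width so the shapes line up.
A[0] += np.bincount(np.array([1, 1, 1, 2]), minlength=A.shape[1])
print(A)
# → [[0 4 3 3 4]
#    [5 6 7 8 9]]
```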

How to return the order statistic of a whole array?

I have searched the web but could not find a solution. If I have an array, let's say:
x=[17, 1, 2, 7, 8, 5, 27, 29]
I am looking for an easy way to get the vector of order statistics (ranks), i.e.
y=[6, 1, 2, 4, 5, 3, 7, 8]
is returned. Of course it can also be indexed starting with zero (typical for Python). Additionally, it would be perfect if, when there are two or more entries of the same value, like:
x=[17, 1, 2, 1, 8, 5, 27, 29]
we get a result like this:
y=[6, 2, 3, 2, 5, 4, 7, 8]
Basically, since I don't have LaTeX here, what I want for each entry is the number of entries smaller than or equal to it. For either entry equal to 1, there are two numbers smaller than or equal to 1, so the desired value is 2.
Use sorted:
s = sorted(x)
[s.index(i) + 1 for i in x]
Output:
[6, 1, 2, 4, 5, 3, 7, 8]
Note that index normally starts at 0, so the +1 is slightly unconventional and may cause errors later if you use the result to index back into the original list. Also note that with duplicates, s.index returns the first (lowest) position, so ties get the lower rank rather than the "smaller or equal" count requested.
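For the tie behavior the asker describes ("number of entries smaller than or equal"), a direct sketch is to count comparisons per element. It is quadratic, but it handles duplicates the requested way (scipy.stats.rankdata(x, method='max') does the same, if SciPy is available):

```python
x = [17, 1, 2, 1, 8, 5, 27, 29]

# For each element, count how many elements are <= it; equal values
# therefore share the higher ("maximum") rank, matching the question.
y = [sum(v <= xi for v in x) for xi in x]
print(y)  # → [6, 2, 3, 2, 5, 4, 7, 8]
```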

Python list.append output values differ from list.extend

Saw a question on another site about a piece of Python code that was driving someone nuts. It was a fairly small, straightforward-looking piece of code, so I looked at it, figured out what it was trying to do, then ran it on my local system, and discovered why it was driving the original questioner nuts. Hoping that someone here can help me understand what's going on.
The code seems to be a straightforward "ask the user for three values (x,y,z) and a sum (n); iterate all values to find tuples that sum to n, and add those tuples to a list." solution. But what it outputs is, instead of all tuples that sum to n, a list of tuples the count of which is equal to the count of tuples that sum to n, but the contents of which are all "[x,y,z]". Trying to wrap my head around this, I changed the append call to an extend call (knowing that this would un-list the added tuples), to see if the behavior changed at all. I expected to get the same output, just as "x,y,z,x,y,z..." repeatedly, instead of "[x,y,z],[x,y,z]" repeatedly, because as I read and understand the Python documentation, that's the difference between append and extend on lists. What I got instead when I used extend was the correct values of the tuples that summed to n, just broken out of their tuple form by extend.
Here's the problem code:
my = []
x = 3
y = 5
z = 7
n = 11
part = [0, 0, 0]
for i in range(x+1):
    part[0] = i
    for j in range(y+1):
        part[1] = j
        for k in range(z+1):
            part[2] = k
            if sum(part) == n:
                my.append(part)
print(my)
and the output:
[[3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7]]
And here's the extend output:
[0, 4, 7, 0, 5, 6, 1, 3, 7, 1, 4, 6, 1, 5, 5, 2, 2, 7, 2, 3, 6, 2, 4, 5, 2, 5, 4, 3, 1, 7, 3, 2, 6, 3, 3, 5, 3, 4, 4, 3, 5, 3]
And the extend code:
my = []
x = 3
y = 5
z = 7
n = 11
part = [0, 0, 0]
for i in range(x+1):
    part[0] = i
    for j in range(y+1):
        part[1] = j
        for k in range(z+1):
            part[2] = k
            if sum(part) == n:
                my.extend(part)
print(my)
Any light that could be shed on this would be greatly appreciated. I've dug around for a while on Google and several Q&A sites, and the only things that I found regarding Python append/extend deltas are things that don't seem to have any relevance to this issue.
Edit (environment detail): I also ran this in both Python 2.7.10 and Python 3.4.3 (Cygwin, under Windows 10 Home), with the same results.
extend adds the items from the argument list to the calling list; it dumps the objects from one list into another without emptying the former.
append, on the other hand, just appends the object itself; nothing more. Therefore, appending a list object to another list while an outside reference to the appended list still exists can do some damage, as in this case. After the list has been appended, part still holds a reference to it (since you are modifying it in place), so you are essentially modifying and re-appending the same list object every time.
You can prevent this either by building a new list at the start of each parent iteration, or by simply appending a copy of the part list:
my.append(part[:])
my.append(list(part))
my.append(part.copy())  # Python 3 only
This appends a list that has no other existing reference outside its new parent list.
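For illustration, applying the copy fix to the original program (the only change is my.append(part[:])) produces the distinct triples the code was meant to collect:

```python
my = []
x, y, z, n = 3, 5, 7, 11
part = [0, 0, 0]
for i in range(x + 1):
    part[0] = i
    for j in range(y + 1):
        part[1] = j
        for k in range(z + 1):
            part[2] = k
            if sum(part) == n:
                my.append(part[:])  # snapshot of part, not the live list
print(len(my))  # → 14
print(my[:3])   # → [[0, 4, 7], [0, 5, 6], [1, 3, 7]]
```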
There are a couple of things going on - the difference between append and extend, and the mutability of a list.
Consider a simpler case:
In [320]: part=[0,0,0]
In [321]: alist=[]
In [322]: alist.append(part)
In [323]: alist
Out[323]: [[0, 0, 0]]
The append actually puts a reference to part in the list.
In [324]: alist.extend(part)
In [325]: alist
Out[325]: [[0, 0, 0], 0, 0, 0]
extend puts the elements of part in the list, not part itself.
If we change an element in part, we can see the consequences of this difference:
In [326]: part[1]=1
In [327]: alist
Out[327]: [[0, 1, 0], 0, 0, 0]
The appended part also changed, but the extended elements did not.
That's why your append case consists of sublists, and the sublists all have the final value of part - because they all are part.
The extend puts the current values of part in the list. Not only aren't they sublists, but they don't change as part changes.
Here's a variation on that list pointer issue:
In [333]: alist = [part]*3
In [334]: alist
Out[334]: [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
In [335]: alist[0][0]=2
In [336]: part
Out[336]: [2, 1, 0]
In [337]: alist
Out[337]: [[2, 1, 0], [2, 1, 0], [2, 1, 0]]
alist contains 3 pointers to part (not 3 copies). Change one of those sublists, and we change them all, including part.
