Python Reference Appending List - python

I'm a bit baffled by how this works.
x = []
y = []
for i in range(5):
    y.append(i) # Why does this create full copies of sub lists?
    x.append(y)
    #x.extend(y) # This works normally
print(x)
Why is x.append(y) causing the final result to be the following? Could you please explain whether some kind of background reference is involved?
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

There is only one object pointed to by y. It starts as an empty list. Each time through the loop, you are making that single object longer. The list x is essentially the same as [y, y, y, y, y], which gives you the result you describe.
When you use x.extend(y), then the current elements of y are copied onto the end of the list x. This is a totally different operation.
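The difference can be checked directly. Here is a minimal sketch; the names shared, by_append, by_copy and by_extend are made up for illustration and do not come from the question. It runs the same loop three ways and confirms with `is` that every element stored by append is the one shared list.

```python
# Illustrative sketch; all variable names here are hypothetical.
shared = []
by_append = []   # receives the same list object on every iteration
by_copy = []     # receives a snapshot of the list on every iteration
by_extend = []   # receives the current elements, flattened

for i in range(3):
    shared.append(i)
    by_append.append(shared)        # stores a reference to `shared`
    by_copy.append(list(shared))    # stores a new list with the current contents
    by_extend.extend(shared)        # adds the current elements one by one

print(by_append)  # [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
print(by_copy)    # [[0], [0, 1], [0, 1, 2]]
print(by_extend)  # [0, 0, 1, 0, 1, 2]
print(all(item is shared for item in by_append))  # True
```

If the goal is a list of growing prefixes, appending a copy (as in by_copy) is probably the smallest change to the original loop.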

Related

How to iterate over a list that has duplicate values?

This is probably a very basic question, but I don't know what to search for to find the answer:
I have this code:
list = [[0,1],[0,2],[1,3],[1,4],[1,5]]
list.append(list[0])
for i in list:
    i.append(0)
print(list)
This list will later be used as coordinates for a curve. I need to duplicate the first coordinate at the end to get a closed curve.
If I then want to add a third value to each coordinate in the list, the first and last items get appended to twice:
[[0, 1, 0, 0], [0, 2, 0], [1, 3, 0], [1, 4, 0], [1, 5, 0], [0, 1, 0, 0]]
I am guessing they share the same memory address, so append is applied to the same object twice: once for the first index and once for the last.
What is this phenomenon called? And what is the easiest way to get a list like this:
[[0, 1, 0], [0, 2, 0], [1, 3, 0], [1, 4, 0], [1, 5, 0], [0, 1, 0]]
Thank you for your help
You can do a list comprehension:
list = [[0,1],[0,2],[1,3],[1,4],[1,5]]
list.append(list[0])
list = [x + [0] for x in list]
print(list)
# [[0, 1, 0], [0, 2, 0], [1, 3, 0], [1, 4, 0], [1, 5, 0], [0, 1, 0]]
EDIT: The trick here is using x + [0] within the list comprehension. This creates new lists, so you do not append 0 to the same list twice (hat tip to #dx_over_dt).
The problem with your approach is that the first and last elements of your list refer to the very same object. You can see this when you print i and list on every iteration:
for i in list:
    i.append(0)
    print(i)
    print(list)
So for the first and last i in your loop, you append a 0 to the very same list.
You could stick with your approach by appending a copy of the first element:
list.append(list[0].copy())
The simplest answer is to add the 0's before appending the closing point.
list = [[0,1],[0,2],[1,3],[1,4],[1,5]]
for i in list:
    i.append(0)
list.append(list[0])
print(list)
It's the tiniest bit more efficient than a list comprehension because it's not making copies of the elements.
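Having two positions in a list refer to one object is usually called aliasing, and it can be detected with the `is` operator. The sketch below uses a shortened list and the hypothetical name coords (chosen here to avoid shadowing the built-in list). It shows the alias and then replaces it with a shallow copy.

```python
# Illustrative sketch; `coords` is a hypothetical stand-in for the question's list.
coords = [[0, 1], [0, 2], [1, 3]]

coords.append(coords[0])          # closes the curve with the SAME object
print(coords[0] is coords[-1])    # True: one list, reachable from two positions

coords[-1] = list(coords[0])      # replace the alias with a shallow copy
print(coords[0] is coords[-1])    # False: two separate lists with equal contents

for point in coords:
    point.append(0)               # each list is now appended to exactly once

print(coords)  # [[0, 1, 0], [0, 2, 0], [1, 3, 0], [0, 1, 0]]
```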

How do I take a python Dictionary or List of size X and assign each element with Y random index values of the dictionary itself

I'm not sure if it would be best to use a list or a dictionary for this algorithm. Assuming I use a dictionary, I want to create a dictionary of size X and randomly assign each element with Y index values of the dictionary itself.
Meaning I could take a dictionary of size 5, and assign each of those 5 elements with 2 index values ranging between 1-5.
The constraints would be that an index value cannot be assigned to its own index, so the 2nd index can only be assigned values 1, 3, 4, 5; and Y must always be less than X, in order to prevent assigning duplicate index values to the same index.
What I have so far is being done with a list rather than a dictionary, but I'm not sure if this is the best method. I'd like to keep the algorithm running in O(n) time as well, even if the size of the list/dictionary is huge. Either way, this is where I'm at.
So, I make X a list of size 5. I set Y equal to 3, meaning I want each of those 5 elements to contain 3 index values. In the for-loop I create another list excluding the index value I'm currently assigning values to.
import random
X = list(range(5)) # [0, 1, 2, 3, 4]
print(X)
Y = 3
assigned = []
for k in range(0, len(X)):
    XExcluded = [x for i,x in enumerate(X) if i!=k] # if k==3 then [0, 1, 2, 4]
    print("EXcluded: {}" .format(XExcluded))
    assigned.append(list(random.sample(XExcluded, Y)))
    print("assigned: {}" .format(assigned))
Sample Output:
[0, 1, 2, 3, 4]
EXcluded: [1, 2, 3, 4]
assigned: [[1, 2, 3]]
EXcluded: [0, 2, 3, 4]
assigned: [[1, 2, 3], [3, 2, 4]]
EXcluded: [0, 1, 3, 4]
assigned: [[1, 2, 3], [3, 2, 4], [3, 4, 1]]
EXcluded: [0, 1, 2, 4]
assigned: [[1, 2, 3], [3, 2, 4], [3, 4, 1], [0, 1, 2]]
EXcluded: [0, 1, 2, 3]
assigned: [[1, 2, 3], [3, 2, 4], [3, 4, 1], [0, 1, 2], [2, 3, 1]]
One thing I would really like to implement is some way to even out which index values are assigned over time, because right now the algorithm may assign certain index values more often than others. This may be more apparent with smaller lists, but I'd imagine it won't be as much of a problem with a very large list, since the randomly sampled index values would have more room to average out.
To balance selections in a case like your example, where X and Y are both small, a naive solution would be to check which value is missing in each iteration and add that value to the next round, giving it additional weight during random sampling.
Below is just a simple working example, without much effort spent on efficiency. You may consider using a data structure other than set for optimization. (A set difference takes O(n), so overall this may reach O(n^2). Since this is aimed at the short-list case, set was chosen for the simplicity of the code.)
import random
X = set(range(5)[::1]) # assume you have distinct values as shown in your example
Y = 3
assigned = []
tobeweighted=None
def assigned_and_missing(fullset, index):
    def sample(inlist, weighteditem, Y):
        if weighteditem is not None:
            inlist.append(weighteditem)
        print("Weighted item: {}" .format(weighteditem))
        print("Interim selection (weighted item appended): {}" .format(inlist))
        randselection = set(random.sample(inlist, Y))
        remain = fullset - set((index+1,)) - randselection
        missingitem = remain.pop()
        if len(randselection) < Y:
            randselection.add(remain.pop())
        return randselection, missingitem
    return sample
for k in range(0, len(X)):
    weighted_random = assigned_and_missing(X, k)
    XExcluded = [x for i,x in enumerate(X) if i!=k] # if k==3 then [0, 1, 2, 4]
    print()
    print("EXcluded: {}" .format(XExcluded))
    selection, tobeweighted = weighted_random(XExcluded, tobeweighted, Y)
    print("Final selection: {}" .format(selection))
    assigned.append(selection)
    print("assigned: {}" .format(assigned))
    print("Item needs to be weighted in next round: {}" .format(tobeweighted))
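Before adding any weighting, it may help to measure how uneven the plain random.sample approach actually is. The following is only a sketch: the helper assign_random and the name usage_counts are hypothetical, and the fixed seed is there purely to make runs repeatable. It counts how often each index value is handed out.

```python
import random
from collections import Counter

def assign_random(size, picks, rng):
    """For each index, sample `picks` distinct other indices (hypothetical helper)."""
    assigned = []
    for k in range(size):
        candidates = [i for i in range(size) if i != k]  # exclude the index itself
        assigned.append(rng.sample(candidates, picks))
    return assigned

rng = random.Random(0)                # fixed seed only so runs are repeatable
assigned = assign_random(5, 3, rng)

# Count how many times each index value was handed out overall.
usage_counts = Counter(value for row in assigned for value in row)
print(assigned)
print(usage_counts)
```

A perfectly balanced assignment would give every value the same count (here 15 / 5 = 3), so the spread of usage_counts is one rough indicator of how much balancing is needed.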

Why does this function return different results?

Can someone explain to me why this function returns different results:
def g(x, z):
    x.append(z)
    return x
y = [1, 2, 3]
g(y, 4).extend(g(y[:], 4))
y = [1, 2, 3]
g(y[:], 4).extend(g(y, 4))
The first returns
[1, 2, 3, 4, 1, 2, 3, 4, 4]
and the second
[1, 2, 3, 4]
In both cases, None is returned, because list.extend() extends the list in-place. So you must be looking at what y ends up as. And that's where the rub is; you didn't extend y itself in the second example.
In the first example, you essentially do this:
y.append(4) # y = [1, 2, 3, 4]
temp_copy = y[:] # temp_copy = [1, 2, 3, 4]
temp_copy.append(4) # temp_copy = [1, 2, 3, 4, 4]
y.extend(temp_copy) # y = [1, 2, 3, 4, 1, 2, 3, 4, 4]
del temp_copy
print(y)
The temp_copy name is never really created; the list is only available on the stack and briefly as x inside g(), which is why I delete temp_copy again at the end to make this clear.
So y is first appended to, then extended with another list (which happens to be a copy of y with another element added).
In your second example, you do this instead:
temp_copy = y[:] # temp_copy = [1, 2, 3]
temp_copy.append(4) # temp_copy = [1, 2, 3, 4]
y.append(4) # y = [1, 2, 3, 4]
temp_copy.extend(y) # temp_copy = [1, 2, 3, 4, 1, 2, 3, 4]
del temp_copy
print(y)
You appended one element to y, and all other manipulations apply to a copy. The copy is discarded again, because in your code there is no reference to it.
You made kind of a mess with assignments and copies there. Note that:
append() modifies the list in-place, without creating a new one
so does extend()
y[:] does create a new list
Your expressions return None. You only effect modifications to the lists, you don't save references to new ones.
Let me "unroll" your code, to show the difference:
# First snippet:
y = [1, 2, 3]
y.append(4)
y_copy = list(y)
y_copy.append(4)
y.extend(y_copy)
# Second snippet:
y = [1, 2, 3]
y_copy = list(y)
y_copy.append(4)
y.append(4)
y_copy.extend(y)
As you can see, in the second example, you apply most modifications to the copy, not the original. In the first, all changes go to the original.
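The behaviour described above can be verified by keeping references to the intermediate results. This is a sketch built around the question's g(); the names first, second, result and kept are introduced here for illustration only.

```python
def g(x, z):
    x.append(z)
    return x          # returns the very same list object that was passed in

# First snippet: the original list is the one being extended.
first = [1, 2, 3]
result = g(first, 4).extend(g(first[:], 4))
print(result)   # None: extend() works in place and returns nothing
print(first)    # [1, 2, 3, 4, 1, 2, 3, 4, 4]

# Second snippet: the copy is the one being extended, so keep a reference to it.
second = [1, 2, 3]
kept = g(second[:], 4)
kept.extend(g(second, 4))
print(second)   # [1, 2, 3, 4]
print(kept)     # [1, 2, 3, 4, 1, 2, 3, 4]
```

Without the kept reference, the extended copy in the second snippet has no name pointing to it and is simply discarded.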
On a subjective note, that code piece was very hard to understand. You wrote it yourself and couldn't follow it, and I have years of experience in Python and still had to pull the "unrolling" trick. Try to keep your code simpler, so that objects can be followed and reasoned about.
In the first call you pass the list itself (a reference to it); in the second call you pass a copy of the list (made by slicing it).
To illustrate:
>>> one = [1,2,3]
>>> ref = one
>>> copy = one[:]
>>> one[0] = 3
>>> one
[3, 2, 3]
>>> ref
[3, 2, 3]
>>> copy
[1, 2, 3]

Python list.append output values differ from list.extend

Saw a question on another site about a piece of Python code that was driving someone nuts. It was a fairly small, straightforward-looking piece of code, so I looked at it, figured out what it was trying to do, then ran it on my local system, and discovered why it was driving the original questioner nuts. Hoping that someone here can help me understand what's going on.
The code seems to be a straightforward "ask the user for three values (x,y,z) and a sum (n); iterate all values to find tuples that sum to n, and add those tuples to a list." solution. But what it outputs is, instead of all tuples that sum to n, a list of tuples the count of which is equal to the count of tuples that sum to n, but the contents of which are all "[x,y,z]". Trying to wrap my head around this, I changed the append call to an extend call (knowing that this would un-list the added tuples), to see if the behavior changed at all. I expected to get the same output, just as "x,y,z,x,y,z..." repeatedly, instead of "[x,y,z],[x,y,z]" repeatedly, because as I read and understand the Python documentation, that's the difference between append and extend on lists. What I got instead when I used extend was the correct values of the tuples that summed to n, just broken out of their tuple form by extend.
Here's the problem code:
my = []
x = 3
y = 5
z = 7
n = 11
part = [0,0,0]
for i in range(x+1):
    part[0] = i
    for j in range(y+1):
        part[1] = j
        for k in range(z+1):
            part[2] = k
            if sum(part) == n:
                my.append(part)
print(my)
and the output:
[[3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7], [3, 5, 7]]
And here's the extend output:
[0, 4, 7, 0, 5, 6, 1, 3, 7, 1, 4, 6, 1, 5, 5, 2, 2, 7, 2, 3, 6, 2, 4, 5, 2, 5, 4, 3, 1, 7, 3, 2, 6, 3, 3, 5, 3, 4, 4, 3, 5, 3]
And the extend code:
my = []
x = 3
y = 5
z = 7
n = 11
part = [0,0,0]
for i in range(x+1):
    part[0] = i
    for j in range(y+1):
        part[1] = j
        for k in range(z+1):
            part[2] = k
            if sum(part) == n:
                my.extend(part)
print(my)
Any light that could be shed on this would be greatly appreciated. I've dug around for a while on Google and several Q&A sites, and the only things that I found regarding Python append/extend deltas are things that don't seem to have any relevance to this issue.
{edit: environment detail}
Also, ran this in both Python 2.7.10 and Python 3.4.3 (cygwin, under Windows 10 home) with the same results.
extend adds the items from the list passed as its argument to the list it is called on. In other words, it dumps objects from one list into another without emptying the former.
append, on the other hand, just appends; nothing more. Therefore, appending a list object to another list while you still hold a reference to the appended list can do some damage, as in this case. After the list has been appended, part still refers to that same list (since you're modifying it in place), so you're essentially modifying and (re-)appending the same list object every time.
You can prevent this either by building a new list for each iteration that appends,
or by simply appending a copy of the part list (any one of the following):
my.append(part[:])
my.append(list(part))
my.append(part.copy()) # Python 3 only
This will append a list that has no other existing reference outside its new parent list.
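Applied to the question's loop, the fix might look like the sketch below. It is the original code with a single change, appending list(part) instead of part; the printed values follow from the extend output shown in the question.

```python
# Sketch of the question's loop with one change: a copy of `part` is appended.
x, y, z, n = 3, 5, 7, 11
part = [0, 0, 0]
my = []

for i in range(x + 1):
    part[0] = i
    for j in range(y + 1):
        part[1] = j
        for k in range(z + 1):
            part[2] = k
            if sum(part) == n:
                my.append(list(part))   # snapshot of the current values

print(len(my))   # 14
print(my[:3])    # [[0, 4, 7], [0, 5, 6], [1, 3, 7]]
```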
There are a couple of things going on - the difference between append and extend, and the mutability of a list.
Consider a simpler case:
In [320]: part=[0,0,0]
In [321]: alist=[]
In [322]: alist.append(part)
In [323]: alist
Out[323]: [[0, 0, 0]]
The append actually put a pointer to part in the list.
In [324]: alist.extend(part)
In [325]: alist
Out[325]: [[0, 0, 0], 0, 0, 0]
extend put the elements of part in the list, not part itself.
If we change an element in part, we can see the consequences of this difference:
In [326]: part[1]=1
In [327]: alist
Out[327]: [[0, 1, 0], 0, 0, 0]
The appended part also changed, but the extended part did not.
That's why your append case consists of sublists, and the sublists all have the final value of part - because they all are part.
The extend puts the current values of part in the list. Not only aren't they sublists, but they don't change as part changes.
Here's a variation on that list pointer issue:
In [333]: alist = [part]*3
In [334]: alist
Out[334]: [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
In [335]: alist[0][0]=2
In [336]: part
Out[336]: [2, 1, 0]
In [337]: alist
Out[337]: [[2, 1, 0], [2, 1, 0], [2, 1, 0]]
alist contains 3 pointers to part (not 3 copies). Change one of those sublists, and we change them all, including part.
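If independent sublists are wanted, one common approach is to build them with a list comprehension that copies part each time. A small sketch follows; the names aliased and independent are chosen here for illustration.

```python
part = [0, 1, 0]

aliased = [part] * 3                          # three references to one list
independent = [list(part) for _ in range(3)]  # three separate shallow copies

aliased[0][0] = 2        # visible through every alias, including `part`
independent[0][0] = 9    # affects only the first copy

print(part)         # [2, 1, 0]
print(aliased)      # [[2, 1, 0], [2, 1, 0], [2, 1, 0]]
print(independent)  # [[9, 1, 0], [0, 1, 0], [0, 1, 0]]
```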

Add arrays to the numpy array

I'm trying to add newly created arrays to another numpy array, but I'm doing something wrong. What I want is to add multiple arrays like numpy.array([0, 1, 2, 3]) to an already created array, so that I get something like this:
import numpy
x = numpy.array([])
for i in numpy.arange(5):
    y = numpy.array([0, 1, 2, 3])
    x = numpy.append(x, y)
Desired result:
x = [0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3],
[0, 1, 2, 3]
However, with the loop shown above I get this:
x = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
Try this:
x = []
for i in range(5):
    y = numpy.array([0, 1, 2, 3])
    x.append(y)
x = numpy.array(x)
or:
N = 5
x = numpy.zeros((N, 4))
for i in range(N):
    x[i] = numpy.array([0, 1, 2, 3])
Here I avoid numpy.append and numpy.vstack inside the loop because they can be quite slow. Every call to numpy.append or numpy.vstack allocates a new array and copies both x and y into it. If you use a list to hold the rows of the array until the loop is over, the data only gets copied once at the end.
If neither of the above works for you, you could do something like this (but it'll be slower):
x = numpy.zeros((0, 4))
for i in range(5):
    y = numpy.array([0, 1, 2, 3])
    x = numpy.vstack((x, y)) # vstack takes a single sequence of arrays
append adds to the end of the array. Since x only has one dimension (it has shape (0,) to begin with) it can grow only in the way you observe.
It's not generally the right tool for building multi-dimensional arrays incrementally as you're doing. You can append along a specific axis (and so stack arrays), but you need to ensure that both arrays have the same number of dimensions and the same size along every other axis. On top of this, the array you're appending to must be copied each time.
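For completeness, appending along an axis could look like the sketch below. It assumes the array starts out with shape (0, 4); the names rows and new_row are illustrative. As noted, the whole array is still copied on every call, so this is shown for understanding rather than as a recommendation.

```python
import numpy as np

rows = np.empty((0, 4), dtype=int)     # zero rows, four columns
for _ in range(5):
    new_row = np.array([0, 1, 2, 3])
    # values must have the same number of dimensions as `rows`, hence [new_row]
    rows = np.append(rows, [new_row], axis=0)

print(rows.shape)   # (5, 4)
```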
A more succinct way to build your required array could be to use np.tile instead:
>>> np.tile([1, 2, 3, 4, 5], (5, 1)) # (5,1) means 5/1 copies along axis 0/1
array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])
