Very weird Python variable scope behaviour - python

I'm having a problem with Python 2.7 that is driving me insane.
I'm passing an array to some functions and although that variable is supposed to be local, in the end the value of the variable inside main is changed.
I'm a bit new to Python, but this goes against any common sense I have.
Any ideas of what I'm doing wrong?
from random import choice, randint
def mutate(chromo):
    # chooses random genes and mutates them randomly to 0 or 1
    for gene in chromo:
        for codon in gene:
            for base in range(2):
                codon[randint(0, len(codon)-1)] = randint(0, 1)
    return chromo
def mate(chromo1, chromo2):
    return mutate([choice(pair) for pair in zip(chromo1, chromo2)])
if __name__ == '__main__':
    # top3 is a multidimensional array with 3 levels (in here I put just 2 for simplicity)
    top3 = [[1, 0], [0, 0], [1, 1]]
    offspring = []
    for item in top3:
        offspring.append(mate(top3[0], item))
    # after this, top3 is different from before the for cycle
UPDATE
Because Python hands the function a reference to the same list objects rather than a copy, I must make a real copy of the arrays before using them, so the mate function must be changed to:
import copy
def mate(chromo1, chromo2):
    return mutate([choice(pair) for pair in zip(copy.deepcopy(chromo1), copy.deepcopy(chromo2))])

The problem you are having stems from the fact that lists and dictionaries in Python are mutable objects, and a function receives a reference to the caller's object. This means that instead of a fresh copy being created by the def and used locally, you are getting a pointer to your array in memory, so any in-place change is visible to the caller...
x = [1,2,3,4]
def mystery(someArray):
    someArray.append(4)
    print someArray
mystery(x)
[1, 2, 3, 4, 4]
print x
[1, 2, 3, 4, 4]

You manipulate chromo, which you pass by reference. Therefore the changes are destructive... the return is therefore kind of moot as well (codon is in gene and gene is in chromo). You'll need to make a (deep) copy of your chromos, I think.

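Try changing
offspring.append(mate(top3[0], item)) to
offspring.append(mate(top3[0][:], item[:]))
or use the list() function. Note that both make only a shallow copy: with nested lists the inner lists are still shared, so copy.deepcopy is needed there.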
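A minimal sketch of why a slice may not be enough, assuming the real top3 holds nested lists (chromosome, gene, codon) as the question describes. The sample data and the name `chromo` are made up for illustration.

```python
import copy

# Hypothetical chromosome: a list of genes, each gene a list of codons.
chromo = [[[1, 0], [0, 0]], [[1, 1], [0, 1]]]

shallow = chromo[:]            # new outer list, but the same inner lists
deep = copy.deepcopy(chromo)   # new lists at every level

shallow[0][0][0] = 9           # writes through to chromo
deep[1][1][1] = 7              # leaves chromo untouched

print(chromo)                   # [[[9, 0], [0, 0]], [[1, 1], [0, 1]]]
print(shallow[0] is chromo[0])  # True
print(deep[0] is chromo[0])     # False
```

The slice only protects the outermost list, which is why mutating a codon inside the copy still changes the original.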

Related

How to generate more than one list from a list, using python functions

I am trying to make an 8 puzzle problem solver using different algorithms, such as BFS, DFS, A* etc., in Python. For those who are not familiar with the problem, the 8 puzzle is a game played on a board of 3 rows and 3 columns. You can move the empty tile only horizontally or vertically; 0 represents the empty tile. It looks like this (I couldn't add the images due to my account's reputation):
https://miro.medium.com/max/679/1*yekmcvT48y6mB8dIcK967Q.png
initial_state = [0,1,3,4,2,5,7,8,6]
goal_state = [1,2,3,4,5,6,7,8,0]
def find_zero(state):
    global loc_of_zero
    loc_of_zero = (state.index(0))
def swap_positions(list, pos1, pos2):
    first = list.pop(pos1)
    second = list.pop(pos2-1)
    list.insert(pos1,second)
    list.insert(pos2,first)
    return list
def find_new_nodes(state):
    if loc_of_zero == 0:
        right = swap_positions(initial_state,0,1)
        left = swap_positions(initial_state,0,3)
        return(right,left)
find_zero(initial_state)
print(find_new_nodes(initial_state))
The problem I have is this: I want the function "find_new_nodes(state)" to return 2 different lists, so I can choose the most promising node (depending on the algorithm) and so on. But the output of my code consists of two identical lists.
This is my output:
([4, 0, 3, 1, 2, 5, 7, 8, 6], [4, 0, 3, 1, 2, 5, 7, 8, 6])
What can I do to make it return 2 different lists? My goal is to return all possible moves depending on where the 0 is, using the find_new_nodes function. Apologies if this is an easy question; this is my first time making a project this complicated.
The problem is that swap_positions obtains a reference to the global initial_state and not a clone of it. So both calls to swap_positions mutate the same array.
A solution would be to pass a clone of the array in each call, for example:
right = swap_positions(initial_state[:],0,1)
Probably a better implementation of swap_positions would be:
# please do not name variables the same as builtin names
def swap_positions(lis, pos1, pos2):
    # build a tuple of both elements and unpack it straight back into the two slots
    lis[pos1], lis[pos2] = lis[pos2], lis[pos1]
    return lis
see also here
You don't really have "two identical lists"; you only have one list object that you're returning twice. To avoid modifying the original list, and also to work with different lists, you should pass copies around.
initial_state = [0,1,3,4,2,5,7,8,6]
goal_state = [1,2,3,4,5,6,7,8,0]
def find_zero(state):
    global loc_of_zero
    loc_of_zero = (state.index(0))
def swap_positions(states, pos1, pos2):
    first = states.pop(pos1)
    second = states.pop(pos2-1)
    states.insert(pos1,second)
    states.insert(pos2,first)
    return states
def find_new_nodes(states):
    if loc_of_zero == 0:
        right = swap_positions(states.copy(),0,1) # pass around a copy
        left = swap_positions(states.copy(),0,3) # pass around a copy
        return(right,left)
find_zero(initial_state)
print(find_new_nodes(initial_state))
Side note 1: I have renamed your variable list to states, otherwise it would shadow the built-in list type.
Side note 2: find_new_nodes did not work with the parameter, instead it used the global list. I changed that, too.
Side note 3: There are different ways to create a (shallow) copy of your list. I think list.copy() is the most explicit one. You could also use the copy module, use [:] or something else.
Output:
([1, 0, 3, 4, 2, 5, 7, 8, 6], [4, 1, 3, 0, 2, 5, 7, 8, 6])
Ok, first of all, some thoughts...
Try not to use "list" as a variable name; it's the name of Python's built-in list type, so you are redefining it.
Usually, it's a bad idea to use global variables such as loc_of_zero.
About your problem:
I believe the problem is that you end up with several references to the same list object. Try to avoid that. One idea:
from copy import deepcopy
def swap_positions(list0, pos1, pos2):
    list1 = deepcopy(list0)
    first = list1.pop(pos1)
    second = list1.pop(pos2-1)
    list1.insert(pos1,second)
    list1.insert(pos2,first)
    return list1
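Building on the answers above, here is a rough sketch of how find_new_nodes could return every possible move for any position of the 0, which is the stated goal of the question. The helper name `swapped`, the `width` parameter and the move order are assumptions made for illustration, not part of the original code.

```python
def swapped(state, i, j):
    """Return a new list with positions i and j exchanged; state is not modified."""
    new_state = list(state)
    new_state[i], new_state[j] = new_state[j], new_state[i]
    return new_state

def find_new_nodes(state, width=3):
    """Return every state reachable by sliding one tile into the empty square (0)."""
    zero = state.index(0)
    row, col = divmod(zero, width)
    height = len(state) // width
    moves = []
    if col > 0:
        moves.append(swapped(state, zero, zero - 1))      # blank moves left
    if col < width - 1:
        moves.append(swapped(state, zero, zero + 1))      # blank moves right
    if row > 0:
        moves.append(swapped(state, zero, zero - width))  # blank moves up
    if row < height - 1:
        moves.append(swapped(state, zero, zero + width))  # blank moves down
    return moves

initial_state = [0, 1, 3, 4, 2, 5, 7, 8, 6]
for node in find_new_nodes(initial_state):
    print(node)
print(initial_state)  # still [0, 1, 3, 4, 2, 5, 7, 8, 6]
```

Because each move is built on its own copy, no global position variable is needed and the input state is never modified.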

Pass by reference in python [duplicate]

This question already has answers here:
How do I pass a variable by reference?
(39 answers)
Closed 2 years ago.
As far as I know, python passes parameter as reference. I have the following code:
def func(arr):
    print(arr)
    if arr == [] :
        return
    for i in range(len(arr)):
        arr[i] *= 2
    func(arr[1:])
r = [1,1,1,1]
func(r)
print(r)
I would expect the output to be [2,4,8,16].
Why does it output [2,2,2,2], as if the reference only works for one level of recursion?
maybe 'arr[1:]' always creates a new object? If that is the case, is there any way to make arr[1:] work?
You've asked a couple different questions, let's go through them one by one.
As far as I know, python passes parameter as reference.
The correct term for how Python passes its arguments is "pass by assignment". It means that parameters inside the function behave similarly to how they would if they had been directly assigned with an = sign. (So, mutations will reflect across all references to the object. So far so good.)
Sidenote (skip if this is confusing): For all intents and purposes, the distinction of "pass by assignment" is important because it abstracts away pass by value vs pass by reference, concepts that are not exposed in python directly. If you wish to know how the underlying mechanism works, it's actually a pass by value, but every value itself is a reference to an object (equivalent to a first level pointer in C speak). We can see why it's easier and important initially not to worry about this particular abstraction, and use "pass by assignment" as the more intuitive explanation.
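A small sketch of what "pass by assignment" means in practice. The function and variable names here are invented for the illustration.

```python
def rebind(items):
    # Assigning to the parameter only changes what the local name refers to.
    items = [0, 0, 0]

def append_item(items):
    # Mutating the object is visible through every name that refers to it.
    items.append(4)

data = [1, 2, 3]
rebind(data)
after_rebind = list(data)
append_item(data)

print(after_rebind)  # [1, 2, 3]
print(data)          # [1, 2, 3, 4]
```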
Next,
maybe 'arr[1:]' always creates a new object?
Correct, slicing always creates a shallow copy of the list. docs
If that is the case, is there any way to make arr[1:] work?
Not directly, but we can use indexes instead to build a solution that works and gives us the output you desire. Just keep track of a starting index while doing the recursion, and increment it as you continue recursing.
def func(arr, start=0):
    print(arr)
    if arr[start:] == [] :
        return
    for i in range(start, len(arr)):
        arr[i] *= 2
    func(arr, start + 1)
r = [1,1,1,1]
func(r)
print(r)
Output:
[1, 1, 1, 1]
[2, 2, 2, 2]
[2, 4, 4, 4]
[2, 4, 8, 8]
[2, 4, 8, 16]
[2, 4, 8, 16]
Slicing with arr[1:] creates a new list, which is why you got that result. I would not recommend mutating arguments like this in the future: it is implicit and hard to debug. When you work with functions, try to return a new value instead of changing the argument in place.
For example like this:
def multiplier(arr):
    return [
        value * (2 ** idx)
        for idx, value in enumerate(arr, start=1)
    ]
result = multiplier([1, 1, 1, 1])
print(result) # [2, 4, 8, 16]
Slicing a list will create a new object (as you speculated), which would explain why the original list isn't updated after the first call.
Yes, arr[1:] creates a new object. You can pass an index to indicate where to start instead.
def func(arr, start_idx):
    print(arr)
    # stop once the start index has moved past the end of the list
    if start_idx >= len(arr):
        return
    for i in range(start_idx, len(arr)):
        arr[i] *= 2
    func(arr, start_idx + 1)
r = [1,1,1,1]
func(r, 0)
print(r)
You can use this. The variable s represents start and e represents end, of the array.
def func(arr,s,e):
    print(arr) # comment this line out if you don't want the intermediate steps printed
    if s>=e:
        return
    for i in range(s,e):
        arr[i] *= 2
    func(arr,s+1,e)
r = [1,1,1,1]
func(r,0,len(r))
print(r)

Stop Python Functions Overwriting Inputs

I have a simple function that is supposed to run down the diagonal of an array and turn all the values to 0.
def diagonal_zeros(dataset):
    zero = dataset[:]
    length = len(zero)
    for i in range(length):
        zero[i, i] = 0
    return zero
When I run this function on an array, it outputs the new, correct 'zero' array, but it also goes back and overwrites the original 'dataset.' I had thought that the line zero = dataset[:] would have prevented this.
I do not, however, get the same behavior with this function:
def seperate_conditions(dataset, first, last):
    dataset = dataset[first:last, :]
    return dataset
Which leaves the first dataset unchanged. I've been reading StackOverflow answers to related questions, but I cannot for the life of me figure this out. I'm working on a scientific analysis pipeline so I really want to be able to refer back to the matrices at every step.
Thanks
Arguments in Python are passed by assignment (thanks to @juanpa.arrivillaga for the correction) and not by value. This means that generally the function does not receive a copy of the argument, but a "pointer" to the argument itself. If you alter the object referenced by the argument in the function, you are modifying the same object outside. Here's a page with some more information.
A possibility is to use the copy module inside your function, to create a copy of the dataset.
As an example, for your code:
import copy
myDataset = [[1,2,3],[2,3,4],[3,4,5]]
def diagonal_zeros(dataset):
    zero = copy.deepcopy(dataset)
    length = len(zero)
    for i in range(length):
        zero[i][i] = 0
    return zero
result = diagonal_zeros(myDataset)
print(result) #[[0, 2, 3], [2, 0, 4], [3, 4, 0]]
print(myDataset) #[[1, 2, 3], [2, 3, 4], [3, 4, 5]]
This article helped me a lot with this concept.
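A sketch of why `dataset[:]` on its own does not protect the original, assuming the data is a plain list of lists as in the answer's example. The question's `zero[i, i]` indexing suggests an array type from a numerical library, which this sketch does not cover. The function names and sample values are illustrative only.

```python
def diagonal_zeros_shallow(dataset):
    zero = dataset[:]                    # copies the outer list only; rows are shared
    for i in range(len(zero)):
        zero[i][i] = 0
    return zero

def diagonal_zeros_copy(dataset):
    zero = [row[:] for row in dataset]   # copies every row as well
    for i in range(len(zero)):
        zero[i][i] = 0
    return zero

kept = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
clobbered = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

diagonal_zeros_copy(kept)
diagonal_zeros_shallow(clobbered)

print(kept)       # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(clobbered)  # [[0, 2, 3], [2, 0, 4], [3, 4, 0]]
```

For a two-level structure, copying each row is enough; copy.deepcopy does the same job for arbitrary nesting.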

Python: .append(0)

I would like to ask what the following does in Python.
It was taken from http://danieljlewis.org/files/2010/06/Jenks.pdf
I have entered comments telling what I think is happening there.
# Seems to be a function that returns a float vector
# dataList seems to be a vector of float.
# numClass seems to be an int
def getJenksBreaks( dataList, numClass ):
    # dataList seems to be a vector of float. "Sort" seems to sort it ascendingly
    dataList.sort()
    # create a 1-dimensional vector
    mat1 = []
    # "in range" seems to be something like "for i = 0 to len(dataList)+1"
    for i in range(0,len(dataList)+1):
        # create a 1-dimensional-vector?
        temp = []
        for j in range(0,numClass+1):
            # append a zero to the vector?
            temp.append(0)
        # append the vector to a vector??
        mat1.append(temp)
    (...)
I am a little confused because in the pdf there are no explicit variable declarations. However I think and hope I could guess the variables.
Yes, the method append() adds elements to the end of the list. I think your interpretation of the code is correct.
But note the following:
x =[1,2,3,4]
x.append(5)
print(x)
[1, 2, 3, 4, 5]
while
x.append([6,7])
print(x)
[1, 2, 3, 4, 5, [6, 7]]
If you want something like
[1, 2, 3, 4, 5, 6, 7]
you may use extend()
x.extend([6,7])
print(x)
[1, 2, 3, 4, 5, 6, 7]
Python doesn't have explicit variable declarations. It's dynamically typed: a variable takes on the type of whatever value is assigned to it.
Your assessment of the code is pretty much correct.
One detail: the range function goes up to, but does not include, its second argument. So the +1 in the second argument to range causes the last iterated value to be len(dataList) and numClass, respectively. This looks suspicious: because the range starts at zero, the outer loop performs a total of len(dataList) + 1 iterations.
dataList.sort() sorts the list in place, so it modifies the caller's original dataList; that is the standard behavior of the .sort() method.
It is indeed appending the new vector to the initial one, if you look at the full source code there are several blocks that continue to concatenate more vectors to mat1.
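To make the point about .sort() concrete, here is a small sketch contrasting it with the built-in sorted(), which returns a new list. The function names and sample values are hypothetical and not taken from the Jenks code.

```python
def extremes_in_place(data_list):
    data_list.sort()               # reorders the caller's list
    return data_list[0], data_list[-1]

def extremes_on_copy(data_list):
    ordered = sorted(data_list)    # new list; the caller's order is preserved
    return ordered[0], ordered[-1]

values = [3.5, 1.0, 2.25]
print(extremes_on_copy(values), values)   # (1.0, 3.5) [3.5, 1.0, 2.25]
print(extremes_in_place(values), values)  # (1.0, 3.5) [1.0, 2.25, 3.5]
```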
append is a list method used to append a value at the end of the list
mat1 and temp together are creating a 2D array (e.g. [[], [], []]) or matrix of (m x n)
where m = len(dataList)+1 and n = numClass+1
the resultant matrix is a zero matrix, as all its values are 0.
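The nested loops described above can be sketched more compactly with a list comprehension. The sample values for `data_list` and `num_class` are made up. The second construction is shown only as a common pitfall: multiplying the outer list repeats references to one and the same row.

```python
data_list = [4.0, 1.0, 3.0]   # made-up sample data
num_class = 2

# One independent row per iteration, same shape as the loop version.
mat1 = [[0] * (num_class + 1) for _ in range(len(data_list) + 1)]

# Pitfall: every entry of the outer list is the same row object.
aliased = [[0] * (num_class + 1)] * (len(data_list) + 1)

mat1[0][0] = 1
aliased[0][0] = 1

print(mat1)     # [[1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(aliased)  # [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]
```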
In Python, variables are implicitly declared. When you type this:
i = 1
i is set to a value of 1, which happens to be an integer. So we will talk of i as being an integer, although i is only a reference to an integer value. The consequence of that is that you don't need type declarations as in C++ or Java.
Your understanding, as expressed in the comments, is mostly correct. [] refers to a list. You can think of it as a linked list (although its actual implementation is closer to a C++ std::vector, for instance).
As Python variables are only references to objects in general, lists are effectively lists of references, and can potentially hold any kind of values. This is valid Python:
# A vector of numbers
vect = [1.0, 2.0, 3.0, 4.0]
But this is perfectly valid code as well:
# The list of my objects (named my_objects to avoid shadowing the built-in list):
my_objects = [1, [2,"a"], True, 'foo', object()]
This list contains an integer, another list, a boolean... In Python, you usually rely on duck typing for your variable types, so this is not a problem.
Finally, one of the methods of list is sort, which sorts it in-place, as you correctly guessed, and the range function generates a range of numbers.
The syntax for x in L: ... iterates over the content of L (assuming it is iterable) and sets the variable x to each of the successive values in that context. For example:
>>> for x in ['a', 'b', 'c']:
...     print x
a
b
c
Since range generates a range of numbers, this is effectively the idiomatic way to generate a for i = 0; i < N; i += 1 type of loop:
>>> for i in range(4): # range(4) == [0,1,2,3]
...     print i
0
1
2
3

Python: Change the parameter of the loop while the loop is running

I want to change a in the for-loop to [4,5,6].
This code just prints: 1, 2, 3
a = [1,2,3]
for i in a:
    global a
    a = [4,5,6]
    print i
I want the output 1, 4, 5, 6.
You'll need to clarify the question because there is no explanation of how you should derive the desired output 1, 4, 5, 6 when your input is [1, 2, 3]. The following produces the desired output, but it's completely ad-hoc and makes no sense:
i = 0
a = [1, 2, 3]
while i < len(a):
    print(a[i])
    if a[i] == 1:
        a = [4, 5, 6]
        i = 0 # edit - good catch larsmans
    else:
        i += 1
The main point is that you can't modify the parameters of a for loop while the loop is executing. From the python documentation:
It is not safe to modify the sequence being iterated over in the loop
(this can only happen for mutable sequence types, such as lists). If
you need to modify the list you are iterating over (for example, to
duplicate selected items) you must iterate over a copy.
Edit: if based on the comments you are trying to walk URLs, you need more complicated logic to do a depth-first or breadth-first walk than just replacing one list (the top-level links) with another list (links in the first page). In your example you completely lose track of pages 2 and 3 after diving into page 1.
The issue is that the assignment
a = [4,5,6]
just changes the variable a, not the underlying object. There are various ways you could deal with this; one would be to use a while loop like
a = [1,2,3]
i = 0
while i<len(a):
    print a[i]
    a = [4,5,6]
    i += 1
which prints
1
5
6
If you print id(a) at useful points in your code you'll realise why this doesn't work.
Even something like this does not work:
a = [1,2,3]
def change_a(new_val):
    a = new_val
for i in a:
    change_a([4,5,6])
    print i
I don't think it is possible to do what you want. Break out of the current loop and start a new one with your new value of a.
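For comparison, a sketch of the difference between rebinding the name a and changing the list object in place. The function names are invented, and the second variant relies on a list loop tracking its position by index; the documentation quoted above advises against modifying a list while iterating over it, so treat it as an illustration rather than a recommendation.

```python
def loop_with_rebinding():
    a = [1, 2, 3]
    seen = []
    for i in a:
        a = [4, 5, 6]          # new list object; the loop keeps walking the old one
        seen.append(i)
    return seen

def loop_with_in_place_update():
    a = [1, 2, 3]
    seen = []
    for i in a:
        if i == 1:
            a[1:] = [4, 5, 6]  # changes the very list the loop is walking
        seen.append(i)
    return seen

print(loop_with_rebinding())        # [1, 2, 3]
print(loop_with_in_place_update())  # [1, 4, 5, 6]
```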
