Stop Python Functions Overwriting Inputs - python

I have a simple function that is supposed to run down the diagonal of an array and turn all the values to 0.
def diagonal_zeros(dataset):
    zero = dataset[:]
    length = len(zero)
    for i in range(length):
        zero[i, i] = 0
    return zero
When I run this function on an array, it outputs the new, correct 'zero' array, but it also goes back and overwrites the original 'dataset.' I had thought that the line zero = dataset[:] would have prevented this.
I do not, however, get the same behavior with this function:
def seperate_conditions(dataset, first, last):
    dataset = dataset[first:last, :]
    return dataset
Which leaves the first dataset unchanged. I've been reading StackOverflow answers to related questions, but I cannot for the life of me figure this out. I'm working on a scientific analysis pipeline so I really want to be able to refer back to the matrices at every step.
Thanks

Arguments in Python are passed by assignment (thanks to #juanpa.arrivillaga for the correction), not by value. This means that the function generally does not receive a copy of the argument, but a "pointer" to the argument itself. If you alter the object referenced by the argument inside the function, you are modifying the same object outside it. Here's a page with some more information.
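To make "passed by assignment" concrete, here is a minimal sketch using plain lists. The function names rebind and mutate_in_place are made up for illustration. Rebinding the parameter name inside the function (as seperate_conditions does) leaves the caller's object alone, while changing the object through that name does not.

```python
def rebind(items):
    # Builds a new list and points the local name at it; the caller's list is untouched.
    items = items + [99]
    return items


def mutate_in_place(items):
    # Changes the very object the caller passed in.
    items.append(99)
    return items


original = [1, 2, 3]

rebound = rebind(original)
unchanged_after_rebind = (original == [1, 2, 3])  # True: only the local name moved

mutated = mutate_in_place(original)
same_object = mutated is original                 # True: the caller's list was modified
```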
A possibility is to use the copy module inside your function, to create a copy of the dataset.
As an example, for your code:
import copy
myDataset = [[1,2,3],[2,3,4],[3,4,5]]
def diagonal_zeros(dataset):
    zero = copy.deepcopy(dataset)
    length = len(zero)
    for i in range(length):
        zero[i][i] = 0
    return zero
result = diagonal_zeros(myDataset)
print(result) #[[0, 2, 3], [2, 0, 4], [3, 4, 0]]
print(myDataset) #[[1, 2, 3], [2, 3, 4], [3, 4, 5]]
This article helped me a lot with this concept.
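As for why the [:] slice in the question did not protect the input: for a nested list, a slice copies only the outer list, so the rows are still shared. For NumPy arrays, basic slicing returns a view of the same data rather than a copy, and the array's own copy() method gives an independent array. The sketch below sticks to plain nested lists so that it runs without NumPy. The variable names are illustrative.

```python
import copy

grid = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

shallow = grid[:]             # new outer list, but the rows are the same objects
shallow[0][0] = 0             # this also changes grid[0][0]
rows_shared = shallow[0] is grid[0]

deep = copy.deepcopy(grid)    # the rows are copied as well
deep[1][1] = 0                # grid is not affected by this write
```

After this runs, grid[0][0] is 0 because of the shared row, while grid[1][1] is still 3.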

Related

How to generate more than one list from a list, using python functions

I am trying to make a 8 puzzle problem solver using different algorithms, such as BFS,DFS, A* etc. using python. For those who are not familiar with the problem, 8 puzzle problem is a game consisting of 3 rows and 3 columns. You can move the empty tile only horizontally or vertically, 0 represents the empty tile. It looks like this (I couldn't add the images due to my accounts reputation.):
https://miro.medium.com/max/679/1*yekmcvT48y6mB8dIcK967Q.png
initial_state = [0,1,3,4,2,5,7,8,6]
goal_state = [1,2,3,4,5,6,7,8,0]
def find_zero(state):
    global loc_of_zero
    loc_of_zero = (state.index(0))
def swap_positions(list, pos1, pos2):
    first = list.pop(pos1)
    second = list.pop(pos2-1)
    list.insert(pos1,second)
    list.insert(pos2,first)
    return list
def find_new_nodes(state):
    if loc_of_zero == 0:
        right = swap_positions(initial_state,0,1)
        left = swap_positions(initial_state,0,3)
        return(right,left)
find_zero(initial_state)
print(find_new_nodes(initial_state))
The problem I have is this: I want the function "find_new_nodes(state)" to return 2 different lists, so I can choose the most promising node (depending on the algorithm) and so on. But the output of my code consists of two identical lists.
This is my output:
([4, 0, 3, 1, 2, 5, 7, 8, 6], [4, 0, 3, 1, 2, 5, 7, 8, 6])
What can I do to make it return 2 different lists? My goal is to return all possible moves depending on where the 0 is, using the find_new_nodes function. Apologies if this is an easy question; this is my first time making a project this complicated.
The problem is that swap_positions obtains a reference to the global initial_state and not a clone of it. So both calls to swap_positions mutate the same array.
A solution would be to clone the array on the first call:
right = swap_positions(initial_state[:],0,1)
Probably a better solution for swap_positions would also be:
# please do not name variables the same as builtin names
def swap_positions(lis, pos1, pos2):
    # build a tuple of both elements and unpack it straight back into the two slots
    lis[pos1], lis[pos2] = lis[pos2], lis[pos1]
    return lis
see also here
You don't really have "two identical lists"; you have only one list object that you're returning twice. To avoid modifying the original list and to work with different lists, you should pass copies around.
initial_state = [0,1,3,4,2,5,7,8,6]
goal_state = [1,2,3,4,5,6,7,8,0]
def find_zero(state):
    global loc_of_zero
    loc_of_zero = (state.index(0))
def swap_positions(states, pos1, pos2):
    first = states.pop(pos1)
    second = states.pop(pos2-1)
    states.insert(pos1,second)
    states.insert(pos2,first)
    return states
def find_new_nodes(states):
    if loc_of_zero == 0:
        right = swap_positions(states.copy(),0,1) # pass around a copy
        left = swap_positions(states.copy(),0,3) # pass around a copy
        return(right,left)
find_zero(initial_state)
print(find_new_nodes(initial_state))
Side note 1: I have renamed your variable list to states, otherwise it would shadow the built-in list type.
Side note 2: find_new_nodes did not work with the parameter, instead it used the global list. I changed that, too.
Side note 3: There are different ways to create a (shallow) copy of your list. I think list.copy() is the most verbose one. You could also use the copy module, use [:] or something else.
Output:
([1, 0, 3, 4, 2, 5, 7, 8, 6], [4, 1, 3, 0, 2, 5, 7, 8, 6])
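For reference, the common ways to make a shallow copy of a flat list can be checked side by side. This is a small sketch; the names copies and duplicate are only for illustration. For a flat list of integers like the puzzle state, all of them are equivalent.

```python
import copy

states = [0, 1, 3, 4, 2, 5, 7, 8, 6]

# Four ways to make a shallow copy of a list.
copies = [states.copy(), states[:], list(states), copy.copy(states)]

for duplicate in copies:
    assert duplicate == states       # same contents
    assert duplicate is not states   # but a distinct list object
```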
Ok, first of all, some thoughts...
Try not to use "list" as a variable name; it is the name of Python's built-in list type, so you are redefining it.
Usually, it's a bad idea to use global variables such as loc_of_zero.
About your problem:
I believe the problem is that you end up with several references to the same list object. Try to avoid that. One idea:
from copy import deepcopy
def swap_positions(list0, pos1, pos2):
    list1 = deepcopy(list0)
    first = list1.pop(pos1)
    second = list1.pop(pos2-1)
    list1.insert(pos1,second)
    list1.insert(pos2,first)
    return list1
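Putting the advice from these answers together, one possible way to return all moves for any position of the 0 is sketched below. It avoids the global and never mutates its input. The function name neighbors and the width parameter are assumptions for illustration, not part of the original code.

```python
def neighbors(state, width=3):
    """Return a new board list for every legal one-step slide of the blank (0)."""
    zero = state.index(0)
    row, col = divmod(zero, width)
    result = []
    # Try up, down, left, right in that order.
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < width and 0 <= c < width:
            target = r * width + c
            child = state[:]  # a shallow copy is enough for a flat list of ints
            child[zero], child[target] = child[target], child[zero]
            result.append(child)
    return result


initial_state = [0, 1, 3, 4, 2, 5, 7, 8, 6]
moves = neighbors(initial_state)
```

With the blank in the top-left corner only "down" and "right" are legal, so moves holds two distinct lists and initial_state is left as it was.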

recursive function python, create function that generates all numbers that have same sum N

I am trying to code a recursive function that generates all the lists of numbers < N whose sum equals N in Python.
This is the code I wrote:
def fn(v,n):
    N=5
    global vvi
    v.append(n) ;
    if(len(v)>N):
        return
    if(sum(v)>=5):
        if(sum(v)==5): vvi.append(v)
    else:
        for i in range(n,N+1):
            fn(v,i)
This is the output I get:
vvi
Out[170]: [[1, 1, 1, 1, 1, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5, 2, 3, 4, 5]]
I tried the same thing in C++ and it worked fine.
What you need to do is formulate it as a recursive description and implement that. You want to prepend each singleton [j] to each of the lists with sum N-j, unless N-j=0, in which case you yield the singleton itself. Translated into Python this would be:
def glist(listsum, minelm=1):
    for j in range(minelm, listsum+1):
        if listsum-j > 0:
            for l in glist(listsum-j, minelm=j):
                yield [j]+l
        else:
            yield [j]
for l in glist(5):
    print(l)
The solution contains a mechanism that excludes permuted solutions by requiring the lists to be non-decreasing; this is done via the minelm argument, which limits the values in the rest of the list. If you want to include permuted lists, you can disable the minelm mechanism by replacing the recursive call with glist(listsum-j).
As for your code I don't really follow what you're trying to do. I'm sorry, but your code is not very clear (and that's not a bad thing only in python, it's actually more so in C).
First of all, it's a bad idea to return the result from a function via a global variable; returning a result is what return is for, and in Python you also have yield, which is nice if you want to return multiple elements as you go. For a recursive function it's even worse to return via a global variable (or even use one), since you are running many nested invocations of the function but have only one global variable.
Also, consider calling a function fn with arguments v and n. What does that actually tell you about the function and its arguments? At most that it's a function and probably that one of the arguments should be a number. Not very useful if somebody (else) is to read and understand the code.
If you want a more elaborate answer about what's formally wrong with your code, you should probably include a minimal, complete, verifiable example including the expected output (and perhaps the observed output).
You may want to reconsider the recursive solution and consider a dynamic programming approach:
def fn(N):
    ways = {0:[[]]}
    for n in range(1, N+1):
        for i, x in enumerate(range(n, N+1)):
            for v in ways[i]:
                ways.setdefault(x, []).append(v+[n])
    return ways[N]
>>> fn(5)
[[1, 1, 1, 1, 1], [1, 1, 1, 2], [1, 2, 2], [1, 1, 3], [2, 3], [1, 4], [5]]
>>> fn(3)
[[1, 1, 1], [1, 2], [3]]
Using global variables and side effects on input parameters is generally considered bad practice, and you should look to avoid both.
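The question's code shows both problems at once: every recursive call appends to the same list v, and vvi.append(v) stores a reference to that one list rather than a snapshot of it. A recursive sketch that avoids both is shown below, assuming non-decreasing lists are wanted as in the answers above. The name partitions and its parameters are illustrative. It builds a fresh list with prefix + [value] at each step and returns the results.

```python
def partitions(total, smallest=1, prefix=None):
    """Return every non-decreasing list of positive integers that sums to total."""
    prefix = [] if prefix is None else prefix
    if total == 0:
        return [prefix]
    found = []
    for value in range(smallest, total + 1):
        # prefix + [value] creates a new list, so sibling calls never share state.
        found.extend(partitions(total - value, value, prefix + [value]))
    return found


result = partitions(5)
```

For a total of 5 this produces the same seven lists as the dynamic programming version, though not necessarily in the same order.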

What's the difference between these two codes?

I recently started coding in Python 2.7. I'm a molecular biologist.
I'm writing a script that involves creating lists like this one:
mylist = [[0, 4, 6, 1], 102]
These lists are incremented by adding an item to mylist[0] and summing a value to mylist[1].
To do this, I use the code:
def addres(oldpep, res):
    return [oldpep[0] + res[0], oldpep[1] + res[1]]
Which works well. Since mylist[0] can become a bit long, and I have millions of these lists to take care of, I thought that using append or extend might make my code faster, so I tried:
def addres(pep, res):
    pep[0].extend(res[0])
    pep[1] += res[1]
    return pep
Which in my mind should give the same result. It does give the same result when I try it on an arbitrary list. But when I feed it the millions of lists, it gives me a very different result. So... what's the difference between the two? All the rest of the script is exactly the same.
Thank you!
Roberto
The difference is that the second version of addres modifies the list that you passed in as pep, where the first version returns a new one.
>>> mylist = [[0, 4, 6, 1], 102]
>>> list2 = [[3, 1, 2], 205]
>>> addres(mylist, list2)
[[0, 4, 6, 1, 3, 1, 2], 307]
>>> mylist
[[0, 4, 6, 1, 3, 1, 2], 307]
If you need to leave the original lists unmodified, I don't think you're really going to get a faster Python implementation of addres than the first one you wrote. You might be able to live with the modification, though, or come up with a somewhat different approach to speed up your code if that's the problem you're facing.
Lists are objects in Python, and objects are passed by reference.
a=list()
This doesn't mean that a is the list; a is pointing to a list that was just created.
In the first example, you use the list's elements to create a new list, which is another object, while in the second one you modify the contents of the list itself.
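One way the two versions can diverge on real data is sketched here. It assumes the same starting list is extended with several different residues, which may or may not match the original script. The names addres_new, addres_inplace, seed and residues are made up for the example.

```python
def addres_new(oldpep, res):
    # Builds a fresh outer list and a fresh inner list on every call.
    return [oldpep[0] + res[0], oldpep[1] + res[1]]


def addres_inplace(pep, res):
    # Grows the caller's own list and returns that same object.
    pep[0].extend(res[0])
    pep[1] += res[1]
    return pep


residues = [[[6], 1], [[7], 2]]

seed = [[0, 4], 10]
branches_new = [addres_new(seed, r) for r in residues]

seed_inplace = [[0, 4], 10]
branches_inplace = [addres_inplace(seed_inplace, r) for r in residues]
```

Here branches_new holds two independent peptides, while both entries of branches_inplace are the very same object as seed_inplace, which has accumulated both residues.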

Python: .append(0)

I would like to ask what the following does in Python.
It was taken from http://danieljlewis.org/files/2010/06/Jenks.pdf
I have entered comments telling what I think is happening there.
# Seems to be a function that returns a float vector
# dataList seems to be a vector of float.
# numClass seems to be an int
def getJenksBreaks( dataList, numClass ):
    # dataList seems to be a vector of float. "Sort" seems to sort it ascendingly
    dataList.sort()
    # create a 1-dimensional vector
    mat1 = []
    # "in range" seems to be something like "for i = 0 to len(dataList)+1)
    for i in range(0,len(dataList)+1):
        # create a 1-dimensional-vector?
        temp = []
        for j in range(0,numClass+1):
            # append a zero to the vector?
            temp.append(0)
        # append the vector to a vector??
        mat1.append(temp)
    (...)
I am a little confused because in the pdf there are no explicit variable declarations. However I think and hope I could guess the variables.
Yes, the method append() adds elements to the end of the list. I think your interpretation of the code is correct.
But note the following:
x =[1,2,3,4]
x.append(5)
print(x)
[1, 2, 3, 4, 5]
while
x.append([6,7])
print(x)
[1, 2, 3, 4, 5, [6, 7]]
If you want something like
[1, 2, 3, 4, 5, 6, 7]
you may use extend()
x.extend([6,7])
print(x)
[1, 2, 3, 4, 5, 6, 7]
Python doesn't have explicit variable declarations. It's dynamically typed, variables are whatever type they get assigned to.
Your assessment of the code is pretty much correct.
One detail: the range function goes up to, but does not include, its second argument. So the +1 in the second argument to range causes the last iterated value to be len(dataList) and numClass, respectively. This looks suspicious: because the range starts at zero, the loop performs a total of len(dataList) + 1 iterations.
dataList.sort() sorts the list in place, so it modifies the original dataList, which is the standard behavior of the list .sort() method.
It is indeed appending the new vector to the initial one, if you look at the full source code there are several blocks that continue to concatenate more vectors to mat1.
append is a list method used to append a value at the end of the list.
mat1 and temp together create a 2D array (e.g. [[], [], []]), i.e. an m x n matrix,
where m = len(dataList)+1 and n = numClass+1.
The resulting matrix is a zero matrix, as all its values are 0.
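The nested loops that build mat1 can be written more compactly with a list comprehension. The sketch below uses a hypothetical helper named zero_matrix. It also shows why the tempting shortcut [[0] * cols] * rows is different: it repeats one row object, so a write to one row shows up in all of them.

```python
def zero_matrix(rows, cols):
    # A new inner list is created on every iteration, like temp = [] in the loop version.
    return [[0] * cols for _ in range(rows)]


mat1 = zero_matrix(3, 2)
mat1[0][0] = 9            # only the first row changes

aliased = [[0] * 2] * 3   # three references to the same inner list
aliased[0][0] = 9         # the change appears in every row
```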
In Python, variables are implicitly declared. When you type this:
i = 1
i is set to a value of 1, which happens to be an integer. So we will talk of i as being an integer, although i is only a reference to an integer value. The consequence of that is that you don't need type declarations as in C++ or Java.
Your understanding is mostly correct, as for the comments. [] refers to a list. You can think of it as a linked-list (although its actual implementation is closer to std::vectors for instance).
As Python variables are only references to objects in general, lists are effectively lists of references, and can potentially hold any kind of values. This is valid Python:
# A vector of numbers
vect = [1.0, 2.0, 3.0, 4.0]
But this is perfectly valid code as well:
# The list of my objects (named my_objects so it does not shadow the built-in list):
my_objects = [1, [2,"a"], True, 'foo', object()]
This list contains an integer, another list, a boolean... In Python, you usually rely on duck typing for your variable types, so this is not a problem.
Finally, one of the methods of list is sort, which sorts it in-place, as you correctly guessed, and the range function generates a range of numbers.
The syntax for x in L: ... iterates over the content of L (assuming it is iterable) and sets the variable x to each of the successive values in that context. For example:
>>> for x in ['a', 'b', 'c']:
...     print x
a
b
c
Since range generates a range of numbers, this is effectively the idiomatic way to generate a for i = 0; i < N; i += 1 type of loop:
>>> for i in range(4): # range(4) == [0,1,2,3]
...     print i
0
1
2
3

Very weird Python variable scope behaviour

I'm having a problem with Python 2.7 that is driving me insane.
I'm passing an array to some functions and, although that variable is supposed to be local, in the end the value of the variable inside main is changed.
I'm a bit new to Python, but this goes against any common sense I got.
Any ideas of what I'm doing wrong?
from random import choice, randint

def mutate(chromo):
    # chooses random genes and mutates them randomly to 0 or 1
    for gene in chromo:
        for codon in gene:
            for base in range(2):
                codon[randint(0, len(codon)-1)] = randint(0, 1)
    return chromo
def mate(chromo1, chromo2):
    return mutate([choice(pair) for pair in zip(chromo1, chromo2)])
if __name__ == '__main__':
    # top 3 is a multidimensional array with 3 levels (in here I put just 2 for simplicity)
    top3 = [[1, 0], [0, 0], [1, 1]]
    offspring = []
    for item in top3:
        offspring.append(mate(top3[0], item))
    # after this, top3 is different from before the for cycle
UPDATE
Because Python passes by reference, I must make a real copy of the arrays before using them, so the mate function must be changed to:
import copy
def mate(chromo1, chromo2):
    return mutate([choice(pair) for pair in zip(copy.deepcopy(chromo1), copy.deepcopy(chromo2))])
The problem you are having is stemming from the fact that arrays and dictionaries in python are passed by reference. This means that instead of a fresh copy being created by the def and used locally you are getting a pointer to your array in memory...
x = [1,2,3,4]
def mystery(someArray):
    someArray.append(4)
    print someArray
mystery(x)
[1, 2, 3, 4, 4]
print x
[1, 2, 3, 4, 4]
You manipulate chromo, which you pass by reference. Therefore the changes are destructive... the return is therefore kind of moot as well (codon is in gene and gene is in chromo). You'll need to make a (deep) copy of your chromos, I think.
try changing
offspring.append(mate(top3[0], item)) to
offspring.append(mate(top3[0][:], item[:]))
or use the list() function
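Note that [:] and list() make shallow copies, so with chromosomes nested three levels deep the inner codon lists would still be shared. A sketch of an alternative is to let mutate work on a deep copy, so callers never need to remember to copy. The rng parameter and the sample top3 data below are assumptions for illustration, not taken from the question.

```python
import copy
import random


def mutate(chromo, rng):
    # Work on a deep copy so the genes and codons of the parents stay untouched.
    mutated = copy.deepcopy(chromo)
    for gene in mutated:
        for codon in gene:
            for _ in range(2):
                codon[rng.randrange(len(codon))] = rng.randint(0, 1)
    return mutated


def mate(chromo1, chromo2, rng):
    # Pick each gene from one of the two parents, then mutate the copy.
    return mutate([rng.choice(pair) for pair in zip(chromo1, chromo2)], rng)


rng = random.Random(0)
# Each chromosome has two genes; each gene has two codons of two bits.
top3 = [
    [[[1, 0], [0, 0]], [[1, 1], [0, 1]]],
    [[[0, 0], [1, 0]], [[0, 1], [1, 1]]],
    [[[1, 1], [1, 1]], [[0, 0], [0, 0]]],
]
before = copy.deepcopy(top3)
offspring = [mate(top3[0], item, rng) for item in top3]
unchanged = (top3 == before)  # True: the parents were not modified
```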
