Python regular expression sub - python

So I am running into something I can't explain and was hoping someone could shed light on it. Here is my code:
fd = open(inFile, 'r')
contents = fd.readlines()
fd.close()
contentsOrig = contents
contents[3] = re.sub(replaceRegex, thingToReplaceWith, contentsOrig[3])
Now when I print out contents and contentsOrig they are exactly the same. I was trying to preserve what I originally read in but from this little code it doesn't seem to be working. Can anyone enlighten me?
I am running Python 2.7.7

Yes, when you assign a list to another variable, it's not a copy of that list, it is a reference. Meaning there is still one copy of that list and now both variables point to it.
contentsOrig = contents
contentsOrig is contents
# Result: True
When you change one of the values or modify the list in any way, you are changing the same list. So what you need to do is make a copy of the list. This can be done in either of these ways:
contentsOrig = contents[:]
or
contentsOrig = list(contents)
The first way is using the list slicing to produce a new list from the beginning to the end. The second takes a list and returns a new copy of the list.
Note that neither way makes new copies of the items inside the list. So it's the same items, but different containers. But since these items are strings, which are immutable, assigning a new string to a position in one list leaves the original string untouched in the other list.
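As a minimal sketch of this fix applied to the question's code: the in-memory list below stands in for the file's lines, and the pattern and replacement are made up for illustration (none of them come from the question).

```python
import re

# Stand-in for fd.readlines(); the real file contents are not shown in the question.
contents = ["alpha\n", "beta\n", "gamma\n", "delta 123\n"]

# Copy the list so the two names refer to different list objects.
contents_orig = contents[:]

# Hypothetical pattern and replacement, chosen only for illustration.
contents[3] = re.sub(r"\d+", "N", contents_orig[3])

print(repr(contents[3]))          # 'delta N\n'
print(repr(contents_orig[3]))     # 'delta 123\n'
print(contents_orig is contents)  # False
```

With the copy in place, only contents sees the substituted line.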

In Python, assignment never copies an object. In effect, doing x = y only makes x and y refer to the same object in memory (in this case a list). If you call id(x) and id(y), they will actually be the same!
Edit: note that another answer showed x is y; the is operator tests object identity, which is the same thing you check by comparing id(x) and id(y).
A simpler example is here:
list_one = [1, 2, 3]
list_two = list_one
list_one.append(4)
print list_one #will show [1, 2, 3, 4], as expected
print list_two #will ALSO show [1, 2, 3, 4]!
In this case, to get around this problem you can use list slicing to make the new list:
new_list = original_list[:]
If you have nested lists (or other mutable objects inside the list), then you can use copy.deepcopy from the copy module to make a copy in which all the nested lists are copied as well.
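To make the difference concrete, here is a small sketch comparing a slice copy with copy.deepcopy on a nested list. The variable names are made up for illustration.

```python
import copy

grid = [[1, 2], [3, 4]]

shallow = grid[:]            # new outer list, same inner lists
deep = copy.deepcopy(grid)   # new outer list and new inner lists

grid[0].append(99)           # mutate one of the inner lists

print(shallow[0])  # [1, 2, 99]
print(deep[0])     # [1, 2]
```

The shallow copy still shares the inner lists with the original, so the change shows up there; the deep copy does not.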

Related

Reverse function reverses wrong list

I am really really new to coding and this is my first post but I haven't found anybody else with the same problem yet.
This is a snippet of my code:
nr_list = [3, 4, 9]
nr_list_r = []
nr_list_r = nr_list
nr_list_r.reverse()
print(nr_list)
It returns [9, 4, 3]
I honestly don't know why nr_list is reversed when I only used the reverse function on nr_list_r.
Why is nr_list reversed as well?
In Python, variables refer to values rather than contain them. So when you do
nr_list_r = nr_list
You're not making a new list. You're making a new variable refer to the same list. If you want to make a copy, you can use the slice syntax [:]
nr_list_r = nr_list[:]
But we also already have a way to reverse a list without modifying it, so you may as well just do it using that built-in function.
nr_list_r = list(reversed(nr_list))
We use reversed to reverse the iterable and then list to convert the result (which is an arbitrary iterable) into a concrete list.
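A short sketch of the non-mutating options, using the question's list. The slice with a step of -1 is an extra alternative not mentioned above.

```python
nr_list = [3, 4, 9]

rev_iter = reversed(nr_list)      # a lazy iterator, not a list
nr_list_r = list(rev_iter)        # materialize it into a new list
nr_list_sliced = nr_list[::-1]    # slicing with step -1 also builds a new reversed list

print(nr_list)         # [3, 4, 9]
print(nr_list_r)       # [9, 4, 3]
print(nr_list_sliced)  # [9, 4, 3]
```

Either way the original list keeps its order.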
Basically with:
nr_list_r = nr_list
You are just giving another name to the list "nr_list".
So when you reverse "nr_list_r", "nr_list" is going to get reversed as well.
In order to reverse just one of them you should copy the list
nr_list_r = nr_list.copy()
the full code is like this:
nr_list = [3, 4, 9]
nr_list_r = nr_list.copy()
nr_list_r.reverse()
print(nr_list)
print(nr_list_r)
The concept of references is one of the key optimizations in many programming languages.
An object starts its life when it is created. At this point, at least one reference to it is created.
nr_list = [3, 4, 9] # one reference.
As execution proceeds, more references may be created, depending on usage.
nr_list_r = nr_list # one more reference added.
The number of references may also be reduced.
nr_list_r = [4,6,8] # one reference removed.
An object is considered alive as long as at least one reference to it exists.
When there are no more references, the object is no longer accessible and its memory is reclaimed.
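On CPython you can watch these counts change with sys.getrefcount. This is only a rough sketch: the function is CPython-specific, and the absolute number it returns includes temporary references, so only the differences are meaningful here.

```python
import sys

nr_list = [3, 4, 9]
base = sys.getrefcount(nr_list)          # absolute value includes a temporary reference

nr_list_r = nr_list                      # one more reference to the same list
after_alias = sys.getrefcount(nr_list)

nr_list_r = [4, 6, 8]                    # the alias now refers to a different list
after_rebind = sys.getrefcount(nr_list)

print(after_alias - base)    # 1 on CPython
print(after_rebind - base)   # 0 on CPython
```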

Python remove method removes element from both lists

When I run the code below:
a=[0,1,2,3]
t=a
t.remove(3)
print(a)
it gives me a result of [0,1,2] even though I didn't use the remove() method for list a. Why does it happen?
Because assignment in Python does not copy a list (or a dict); t = a just makes t another name for the same list object. You have to make a shallow copy of the list if you don't want this to happen:
t=a[:]
or
t=a.copy()
Lists in Python are mutable objects, and when you assigned t = a,
you did not create a new list: 't' is just another reference to the list 'a'. Both names refer to the same memory location.
Original list
a=[0,1,2,3]
id(a) # gives unique number which points to memory location
>> 60528136
assigning to t
t = a
id(t)
>> 60528136
The result shows same number which signifies that both lists refer to the same memory location. So any modification on list 'a' would be reflected in list 't' and vice-versa.
To avoid this create a copy.
t = a[::]
or
t = a.copy()
id(t)
>>60571720
Check out the Programming FAQ in the Python docs.
Why did changing list ‘y’ also change list ‘x’?
There are two factors that produce this result:
Variables are simply names that refer to objects. Doing y = x doesn’t create a copy of the list – it creates a new variable y that refers to the same object x refers to. This means that there is only one object (the list), and both x and y refer to it.
Lists are mutable, which means that you can change their content.
How do I copy an object in Python?
In general, try copy.copy() or copy.deepcopy() for the general case. Not all objects can be copied, but most can.
Some objects can be copied more easily. Dictionaries have a copy() method:
newdict = olddict.copy()
Sequences can be copied by slicing:
new_l = l[:]
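A small sketch of what these shallow copies do and do not protect you from. The dictionary contents are made up for illustration, and the printed key order assumes Python 3.7 or later.

```python
import copy

olddict = {"a": [1, 2], "b": 3}
newdict = olddict.copy()          # shallow: new dict, same value objects

newdict["b"] = 30                 # rebinding a key affects only newdict
newdict["a"].append(99)           # mutating a shared value is visible in both

print(olddict)  # {'a': [1, 2, 99], 'b': 3}
print(newdict)  # {'a': [1, 2, 99], 'b': 30}

l = [1, 2, 3]
new_l = copy.copy(l)              # for a list, this is a shallow copy like l[:]
print(new_l == l, new_l is l)     # True False
```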

Remove the object being iterated over from list

Consider:
fooList = [1, 2, 3, 4] # Ints for example only, in real application using objects
for foo in fooList:
    if fooChecker(foo):
        remove_this_foo_from_list
How is the specific foo to be removed from the list? Note that I'm using ints for example only, in the real application there is a list of arbitrary objects.
Thanks.
Generally, you just don't want to do this. Instead, construct a new list. Most of the time, this is done with a list comprehension:
fooListFiltered = [foo for foo in fooList if not fooChecker(foo)]
Alternatively, a generator expression or filter() might be more appropriate; for example, a file too big to be read into memory could not be handled with a list comprehension but could with a generator expression. (Note that in 2.x, filter() is not lazy; use a generator expression or itertools.ifilter() instead.)
If you need to actually modify the list (rare, but can be the case on occasion), then you can assign back:
fooList[:] = fooListFiltered
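Here is a runnable sketch of the two steps together. The fooChecker below is a hypothetical predicate (the question never shows the real one), and alias is an extra name added to show why slice assignment matters.

```python
def fooChecker(foo):
    # Hypothetical predicate: flag even numbers for removal.
    return foo % 2 == 0

fooList = [1, 2, 3, 4]
alias = fooList                      # a second name for the same list

fooListFiltered = [foo for foo in fooList if not fooChecker(foo)]
fooList[:] = fooListFiltered         # replace the contents in place

print(fooList)           # [1, 3]
print(alias)             # [1, 3]
print(alias is fooList)  # True
```

Plain fooList = fooListFiltered would only rebind the name, and alias would still see the old contents.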
Iterate over a shallow copy of the list.
You can't safely modify a list while iterating over it, so you need to iterate over a shallow copy of the list.
fooList = [1, 2, 3, 4]
for foo in fooList[:]: #equivalent to list(fooList), but much faster
    if fooChecker(foo):
        fooList.remove(foo)
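To see why the copy matters, here is a sketch of what goes wrong without it. The fooChecker is a hypothetical predicate that flags every item, so a correct loop should empty the list.

```python
def fooChecker(foo):
    # Hypothetical predicate: flag every item for removal.
    return True

broken = [1, 2, 3, 4]
for foo in broken:            # iterating over the list being modified
    if fooChecker(foo):
        broken.remove(foo)

fixed = [1, 2, 3, 4]
for foo in fixed[:]:          # iterating over a shallow copy
    if fooChecker(foo):
        fixed.remove(foo)

print(broken)  # [2, 4]
print(fixed)   # []
```

In the broken loop each removal shifts the remaining items left, so the iterator skips the item that moved into the slot it just visited.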
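Use filter. Note that filter keeps the items for which the function returns true, so negate fooChecker to drop the matching items:
newList = list(filter(lambda foo: not fooChecker(foo), fooList))
or
newItems = filter(lambda foo: not fooChecker(foo), fooList)
for item in newItems:
    print item # or print(item) for python 3.x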
http://docs.python.org/2/library/functions.html#filter

Python: Local variables mysteriously update Global variables

I have a function where I work with a local variable, and then pass back the final variable after the function is complete. I want to keep a record of what this variable was before the function however the global variable is updated along with the local variable. Here is an abbreviated version of my code (it's quite long).
def Turn(P,Llocal,T,oflag):
    #The function here changes P, Llocal and T then passes those values back
    return(P, Llocal, T, oflag)
#Later I call the function
#P and L are defined here, then I copy them to other variables to save
#the initial values
P=Pinitial
L=Linitial
P,L,T,oflag = Turn(P,L,T,oflag)
My problem is that L and Linitial are both updated exactly when Llocal is updated, but I want Linitial to not change. P doesn't change so I'm confused about what is happening here. Help? Thanks!
The whole code for brave people is here: https://docs.google.com/document/d/1e6VJnZgVqlYGgYb6X0cCIF-7-npShM7RXL9nXd_pT-o/edit
The problem is that P and L are names that are bound to objects, not values themselves. When you pass them as parameters to a function, you're actually passing a copy of the binding to P and L. That means that, if P and L are mutable objects, any changes made to them will be visible outside of the function call.
You can use the copy module to save a copy of the value of a name.
Lists are mutable. If you pass a list to a function and that function modifies the list, then you will be able to see the modifications from any other names bound to the same list.
To fix the problem try changing this line:
L = Linitial
to this:
L = Linitial[:]
This slice makes a shallow copy of the list. If you add or remove items from the list stored in L it will not change the list Linitial.
If you want to make a deep copy, use copy.deepcopy.
The same thing does not happen with P because it is an integer. Integers are immutable.
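A minimal sketch of both effects in one place. The turn function below is a hypothetical stand-in for the question's Turn(), whose real body is not shown, and the starting values are made up.

```python
def turn(p, l_local):
    # Hypothetical stand-in for the question's Turn(); the real body is not shown.
    p = p + 1            # rebinds the local name; the caller's integer is untouched
    l_local.append(p)    # mutates the list object the caller passed in
    return p, l_local

P_initial = 0
L_initial = [10, 20]

P = P_initial
L = L_initial[:]         # shallow copy, so L_initial keeps its original contents

P, L = turn(P, L)

print(P_initial, L_initial)  # 0 [10, 20]
print(P, L)                  # 1 [10, 20, 1]
```

Without the [:] on L_initial, the append inside turn would show up in L_initial as well.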
In Python, a variable is just a reference to an object or value in the memory. For example, when you have a list x:
x = [1, 2, 3]
So, when you assign x to another variable, let's call it y, you are just creating a new reference (y) to the object referenced by x (the [1, 2, 3] list).
y = x
When you update x, you are actually updating the object pointed by x, i.e. the list [1, 2, 3]. As y references the same value, it appears to be updated too.
Keep in mind, variables are just references to objects.
If you really want to copy a list, you should do:
new_list = old_list[:]
Here's a nice explanation: http://henry.precheur.org/python/copy_list

Why does not the + operator change a list while .append() does?

I'm working through Udacity and Dave Evans introduced an exercise about list properties
list1 = [1,2,3,4]
list2 = [1,2,3,4]
list1=list1+[6]
print(list1)
list2.append(6)
print(list2)
list1 = [1,2,3,4]
list2 = [1,2,3,4]
def proc(mylist):
    mylist = mylist + [6]
def proc2(mylist):
    mylist.append(6)
# Can you explain the results given by the four print statements below? Remove
# the hashes # and run the code to check.
print (list1)
proc(list1)
print (list1)
print (list2)
proc2(list2)
print (list2)
The output is
[1, 2, 3, 4, 6]
[1, 2, 3, 4, 6]
[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4]
[1, 2, 3, 4, 6]
So in a function the adding a 6 to the set doesn't show but it does when not in a function?
"So in a function the adding a 6 to the set doesn't show but it does when not in a function?"
No, that is not what happens.
What happens is that, when you execute mylist = mylist + [6], you are effectively creating an entirely new list and putting it in the local mylist variable. This mylist variable will vanish after the execution of the function and the newly created list will vanish as well.
OTOH when you execute mylist.append(6) you do not create a new list. You get the list already in the mylist variable and add a new element to this same list. The result is that the list (which is also pointed to by list2) will itself be altered. The mylist variable will vanish again, but in this case you altered the original list.
Let us see if a more visual explanation can help you :)
What happens when you call proc()
When you write list1 = [1, 2, 3, 4] you are creating a new list object (at the right side of the equals sign) and creating a new variable, list1, which will point to this object.
Then, when you call proc(), you create another new variable, mylist, and since you pass list1 as parameter, mylist will point to the same object.
However, the operation mylist + [6] creates a whole new list object whose contents are the contents of the object pointed to by mylist plus the contents of the following list object, that is, [6]. Since you assign this new object to mylist, our scenario changes a bit and mylist no longer points to the same object as list1.
What I have not said is that mylist is a local variable: it will disappear after the end of the proc() function. So, when the proc() execution ends, mylist is gone.
Since no other variable points to the object generated by mylist + [6], it will disappear, too (since the garbage collector* will collect it).
Note that, in the end, the object pointed by list1 is not changed.
What happens when you call proc2()
Everything changes when you call proc2(). At first, it is the same thing: you create a list...
...and pass it as a parameter to a function, which will generate a local variable.
However, instead of using the + concatenation operator, which generates a new list, you apply the append() method to the existing list. The append() method does not create a new object; instead, it changes the existing one.
After the end of the function, the local variable will disappear, but the original object pointed to by it and by list2 will already be altered.
Since it is still pointed to by list2, the original list is not destroyed.
EDIT: if you want to take a look at all this stuff happening before your eyes just go to this radically amazing simulator:
* If you do not know what is garbage collector... well, you will discover soon after understanding your own question.
Variables in python can always be thought of as references. When you call a function with an argument, you are passing in a reference to the actual data.
When you use the assignment operator (=), you're assigning that name to refer to an entirely new object. So, mylist = mylist + [6] creates a new list containing the old contents of mylist, as well as 6, and assigns the variable mylist to refer to the new list. list1 is still pointing to the old list, so nothing changes.
On the other hand, when you use .append, that is actually appending an element to the list that the variable refers to - it is not assigning anything new to the variable. So your second function modifies the list that list2 refers to.
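You can check this with the is operator. The sketch below reuses the question's proc and proc2, but the return values and the original name are additions made here so the identity check is visible.

```python
def proc(mylist):
    original = mylist
    mylist = mylist + [6]        # builds a new list and rebinds the local name
    return mylist is original    # is the local name still the caller's list?

def proc2(mylist):
    original = mylist
    mylist.append(6)             # mutates the list the caller passed in
    return mylist is original

list1 = [1, 2, 3, 4]
list2 = [1, 2, 3, 4]

same1 = proc(list1)
same2 = proc2(list2)

print(same1, list1)   # False [1, 2, 3, 4]
print(same2, list2)   # True [1, 2, 3, 4, 6]
```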
Ordinarily, in the first case in function proc you would only be able to change the global list by assignment if you declared
global mylist
first and did not pass mylist as a parameter. However, in this case you'd get an error message that mylist is global and local:
name 'mylist' is local and global
What happens in proc is that a local list is created when the assignment takes place. Since local variables go away when the function ends, the effect of any changes to the local list doesn't propagate to the rest of the program when it is printed out subsequently.
But in the second function proc2 you are modifying the list by appending rather than assigning to it, so the global keyword is not required and changes to the list show elsewhere.
As well as the comprehensive answers already given, it's also worth being aware that if you want the same looking syntax as:
mylist = mylist + [6]
...but still want the list to be updated "in place", you can do:
mylist += [6]
Which, while it looks like it would do the same thing as the first version, is actually the same as:
mylist.extend([6])
(Note that extend takes the contents of an iterable and adds them one by one, whereas append takes whatever it is given and adds that as a single item. See append vs. extend for a full explanation.)
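A short sketch of both points, with made-up function and variable names: += on a list inside a function changes the caller's list, and append and extend treat their argument differently.

```python
def concat(mylist):
    mylist = mylist + [6]    # new list; the caller's list is unchanged

def augmented(mylist):
    mylist += [6]            # extends the existing list in place

a = [1, 2, 3, 4]
b = [1, 2, 3, 4]
concat(a)
augmented(b)
print(a)  # [1, 2, 3, 4]
print(b)  # [1, 2, 3, 4, 6]

c = [1, 2]
c.append([3, 4])   # adds the list itself as one item
d = [1, 2]
d.extend([3, 4])   # adds each element of the iterable
print(c)  # [1, 2, [3, 4]]
print(d)  # [1, 2, 3, 4]
```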
This, in one form or another, is a very common question. I took a whack at explaining Python parameter passing myself a couple days ago. Basically, one of these creates a new list and the other modifies the existing one. In the latter case, all variables that refer to the list "see" the change because it's still the same object.
In the third line, you did this:
list1=list1+[6]
So when you then ran
print (list1)
it printed the new list that line had assigned to list1. Inside proc, list1 + [6] likewise creates a new list, but only within the function. Whereas when you append 6, you are not creating a new list; you are adding an item to the already existing list.
But keep in mind that in line 7 you created the list again:
list1 = [1,2,3,4]
So when you then printed list1, it printed the list you had just reinitialized, not the previous one.
