I am just learning Python and I am going through the tutorials on https://developers.google.com/edu/python/strings
Under the String Slices section
s[:] is 'Hello' -- omitting both always gives us a copy of the whole
thing (this is the pythonic way to copy a sequence like a string or
list)
Out of curiosity why wouldn't you just use an = operator?
s = 'hello'
bar = s[:]
foo = s
As far as I can tell both bar and foo have the same value.
= creates another reference to the same object; with [:] you create a copy. For strings, which are immutable, this doesn't really matter, but for lists etc. it is crucial.
>>> s = 'hello'
>>> t1 = s
>>> t2 = s[:]
>>> print s, t1, t2
hello hello hello
>>> s = 'good bye'
>>> print s, t1, t2
good bye hello hello
but:
>>> li1 = [1,2]
>>> li = [1,2]
>>> li1 = li
>>> li2 = li[:]
>>> print li, li1, li2
[1, 2] [1, 2] [1, 2]
>>> li[0] = 0
>>> print li, li1, li2
[0, 2] [0, 2] [1, 2]
So why use it when dealing with strings? The built-in strings are immutable, but whenever you write a library function expecting a string, a user might give you something that "looks like a string" and "behaves like a string", but is a custom type. This type might be mutable, so it's better to take care of that.
Such a type might look like:
class MutableString(object):
    def __init__(self, s):
        self._characters = [c for c in s]

    def __str__(self):
        return "".join(self._characters)

    def __repr__(self):
        return "MutableString(\"%s\")" % str(self)

    def __getattr__(self, name):
        return str(self).__getattribute__(name)

    def __len__(self):
        return len(self._characters)

    def __getitem__(self, index):
        return self._characters[index]

    def __setitem__(self, index, value):
        self._characters[index] = value

    def __getslice__(self, start, end=-1, stride=1):
        return str(self)[start:end:stride]

if __name__ == "__main__":
    m = MutableString("Hello")
    print m
    print len(m)
    print m.find("o")
    print m.find("x")
    print m.replace("e", "a") #translate to german ;-)
    print m
    print m[3]
    m[1] = "a"
    print m
    print m[:]

    copy1 = m
    copy2 = m[:]
    print m, copy1, copy2
    m[1] = "X"
    print m, copy1, copy2
Disclaimer: This is just a sample to show how it could work and to motivate the use of [:]. It is untested, incomplete, and probably performs horribly.
They have the same value, but there is a fundamental difference when dealing with mutable objects.
Say foo = [1, 2, 3]. You assign bar = foo, and baz = foo[:]. Now let's say you want to change bar - bar.append(4). You check the value of foo, and...
print foo
# [1, 2, 3, 4]
Now where did that extra 4 come from? It's because bar = foo made bar refer to the very same list object as foo, so changing one changes the other. When you instead change baz - baz.append(5) - nothing happens to the other two, because baz holds a copy of foo.
Note however that because strings are immutable, it doesn't matter.
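A minimal sketch of the full sequence described above, using the names foo, bar and baz from the text:

foo = [1, 2, 3]
bar = foo        # bar refers to the same list object as foo
baz = foo[:]     # baz is a copy
bar.append(4)
print(foo)       # [1, 2, 3, 4]
baz.append(5)
print(foo, bar, baz)   # [1, 2, 3, 4] [1, 2, 3, 4] [1, 2, 3, 5]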
If you have a list the result is different:
l = [1,2,3]
l1 = l
l2 = l[:]
l2 is a copy of l (a different object), while l1 is an alias for l. That means l1[0] = 7 will also modify l, while l2[1] = 7 will not.
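For example, a small sketch continuing the snippet above:

l = [1, 2, 3]
l1 = l        # alias: the same object
l2 = l[:]     # copy: a different object
l1[0] = 7
print(l)      # [7, 2, 3]  -- changed through the alias
l2[1] = 7
print(l)      # [7, 2, 3]  -- not affected by the copy
print(l2)     # [1, 7, 3]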
Referencing an object and referencing a copy of it makes no observable difference for an immutable object like a string, but it matters for mutable objects such as lists.
Same thing on mutable objects:
a = [1,2,3,4]
b = a
c = a[:]
a[0] = -1
print a # will print [-1,2,3,4]
print b # will print [-1,2,3,4]
print c # will print [1,2,3,4]
A visualization on pythontutor of the above example - http://goo.gl/Aswnl.
This is just a question asking for the difference in the code.
I have several lists, i.e. a=[], b=[], c=[], d=[]
Say I have code that appends to each list and I want to reset all these lists to their original empty state, so I created a function:
def reset_list():
    del a[:]
    del b[:]
    del c[:]
    del d[:]
So whenever I call reset_list() in the code, it removes all the appended items and sets all the lists back to []. However, the one below doesn't work:
def reset_list():
    a = []
    b = []
    c = []
    d = []
This might be a stupid question but I was wondering why the second one wouldn't work.
When you write del a[:], Python looks up the name a (including in outer scopes) and then performs del found_a[:] on that object.
But when you write a = [], it creates the name a in the current (function) scope and assigns an empty list to it. When the function exits, that local a is no longer accessible (it is destroyed).
So in short: the first version works because you mutate the a from the outer scope; the second does not, because you never touch the outer a - you just create a new local name a and temporarily (for the duration of the function) bind an empty list to it.
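A small sketch of that difference, with names that mirror the question:

a = [1, 2, 3]

def clear_in_place():
    del a[:]      # looks up the outer 'a' and empties that list object

def rebind_locally():
    a = []        # creates a new local name 'a'; the outer list is untouched

rebind_locally()
print(a)          # [1, 2, 3]
clear_in_place()
print(a)          # []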
There's a difference between del a[:] and a = []
Note that these actually do something different which becomes apparent if you have additional references (aliases) to the original list. (as noted by #juanpa.arrivillaga in the comments)
del list[:] deletes all elements in the list but doesn't create a new list, so the aliases are updated as well:
>>> list_1 = [1,2,3]
>>> alias_1 = list_1
>>> del list_1[:]
>>> list_1
[]
>>> alias_1
[]
However a = [] creates a new list and assigns that to a:
>>> list_2 = [1,2,3]
>>> alias_2 = list_2
>>> list_2 = []
>>> list_2
[]
>>> alias_2
[1, 2, 3]
If you want a more extensive discussion about names and references in Python, I can highly recommend Ned Batchelder's blog post "Facts and myths about Python names and values".
A better solution?
In most cases where you have multiple variables that belong together I would use a class for them. Then instead of reset you could simply create a new instance and work on that:
class FourLists:
    def __init__(self):
        self.a = []
        self.b = []
        self.c = []
        self.d = []
Then you can create a new instance and work with the attributes of that instance:
>>> state = FourLists()
>>> state.a
[]
>>> state.b.append(10)
>>> state.b.extend([1,2,3])
>>> state.b
[10, 1, 2, 3]
Then if you want to reset the state you could simply create a new instance:
>>> new_state = FourLists()
>>> new_state.b
[]
You need to declare a, b, c, d as global if you want Python to use the globally defined 'versions' of your variables. Otherwise, as pointed out in other answers, it will simply create new local-scope 'versions'.
a = [1,2,3]
b = [1,2,3]
c = [1,2,3]
d = [1,2,3]
def reset_list():
    global a,b,c,d
    a = []
    b = []
    c = []
    d = []
print(a,b,c,d)
reset_list()
print(a,b,c,d)
Outputs:
[1, 2, 3] [1, 2, 3] [1, 2, 3] [1, 2, 3]
[] [] [] []
As pointed out by #juanpa.arrivillaga, there is a difference between del a[:] and a = []. See this answer.
The 1st method works because:
reset_list() simply deletes the contents of the four lists. It works on the lists that you define outside the function, provided they are named the same. If you had a different name, you'd get an error:
e = [1,2,3,4]       # the list outside the function is named 'e'

def reset_list():
    del a[:]        # but the function refers to 'a'

reset_list()
NameError: name 'a' is not defined
The function only has an effect if you initialize the lists before calling it; it operates on those outer lists rather than returning new ones:
a = [1,2,3,4]   #initialize before function definition

def reset_list():
    del a[:]

reset_list()    #function call to modify a
print(a)
#[]
By itself the function does not return anything:
print(reset_list())
#None
The 2nd method doesn't work because:
the reset_list() function creates 4 empty lists that have no connection to the lists defined outside the function. Whatever happens inside the function stays inside its scope and ends there, unless you return the lists at the end of the call. If you want the rebinding approach to work, pass the list in and return the new one - make sure you specify the parameter in the function definition, reset_list(a):
#function definition
def reset_list(a):
    a = []
    return a

#initialize list after the function definition
a = [1,2,3,4]
print("Before function call:{}".format(a))

new_a = reset_list(a)
print("After function call:{}".format(new_a))

#Output:
Before function call:[1, 2, 3, 4]
After function call:[]
As you've seen, if you go this route you should return from the function so that the work it does on the lists is actually handed back to the caller.
The second function (with a = [] and so on) initialises 4 new lists with local scope (inside the function). That is not the same as deleting the contents of the existing lists.
I imagine this is one in a very long list of questions from people who have inadvertently created references in Python, but I've got the following situation. I'm using scipy minimize to set the sum of the top row of an array to 5 (as an example).
import scipy.optimize as spo   # assumed import for spo.minimize

class problem_test:
    def __init__(self):
        test_array = [[1,2,3,4,5,6,7],
                      [4,5,6,7,8,9,10]]

        def set_top_row_to_five(x, array):
            array[0] = array[0] + x
            return abs(sum(array[0]) - 5)

        adjustment = spo.minimize(set_top_row_to_five, 0, args=(test_array))
        print(test_array)
        print(adjustment.x)
ptest = problem_test()
However, the optimization is altering the original array (test_array):
[array([-2.03, -1.03, -0.03, 0.97, 1.97, 2.97, 3.97]), [4, 5, 6, 7, 8, 9, 10]]
[-0.00000001]
I realize I can solve this using, for example, deepcopy, but I'm keen to learn why this is happening so I don't do the same in future by accident.
Thanks in advance!
Names are references to objects. What matters is whether the object (including one passed as an argument) is modified in place or a new object is created. An example:
>>> l1 = list()
>>> l2 = l1
>>> l2.append(0) # this modifies the object currently referenced by both l1 and l2
>>> print(l1)
[0]
Whereas:
>>> l1 = list()
>>> l2 = list(l1) # New list object has been created with initial values from l1
>>> l2.append(0)
>>> print(l1)
[]
Or:
>>> l1 = list()
>>> l2 = l1
>>> l2 = [0] # New list object has been created and assigned to l2
>>> l2.append(0)
>>> print(l1)
[]
Similarly assuming l = [1, 2, 3]:
>>> def f1(list_arg):
...     return list_arg.reverse()
>>> print(f1(l), l)
None [3, 2, 1]
We have just passed through the None returned by the list.reverse method, and reversed l in place. However:
>>> def f2(list_arg):
...     ret_list = list(list_arg)
...     ret_list.reverse()
...     return ret_list
>>> print(f2(l), l)
[3, 2, 1] [1, 2, 3]
The function returns a new reversed list (initialized from l), while l itself remained unchanged. (NOTE: in this example the built-in reversed or slicing would of course make more sense.)
With nested containers, one must not forget that, for instance:
>>> l = [1, 2, 3]
>>> d1 = {'k': l}
>>> d2 = dict(d1)
>>> d1 is d2
False
>>> d1['k'] is d2['k']
True
Dictionaries d1 and d2 are two different objects, but their 'k' item is a single, shared instance. This is where copy.deepcopy might come in handy.
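For instance, a minimal sketch continuing the session above (d3 is just an illustrative name):

>>> import copy
>>> d3 = copy.deepcopy(d1)   # nested objects are copied as well
>>> d1['k'] is d3['k']
False
>>> d1['k'].append(4)
>>> d3['k']
[1, 2, 3]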
Care needs to be taken when passing objects around to make sure they are modified, or copied, as intended. It can be helpful to return None (or a similar generic value) when making in-place changes, and to return the resulting object when working on a copy, so that the function/method interface itself hints at what is actually going on.
When an immutable object is being "modified", a new object is actually created and assigned either to a new name or back to the original name/reference:
>>> s = 'abc'
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9dbbfa78 abc
>>> s = s.upper()
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9c989490 ABC
Note, though, that even an immutable type can contain a reference to a mutable object. For instance, with l = [1, 2, 3]; t1 = (l,); t2 = t1, one can call t1[0].append(4). This change is also seen in t2[0] (for the same reason as d1['k'] and d2['k'] above), while both tuples themselves remain unmodified.
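A short sketch of that tuple example:

>>> l = [1, 2, 3]
>>> t1 = (l,)
>>> t2 = t1
>>> t1[0].append(4)   # the tuples are untouched, the shared list is mutated
>>> t2[0]
[1, 2, 3, 4]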
One extra caveat (a possible gotcha): a mutable default argument value behaves like a "static" variable whenever the function is called without passing that argument:
>>> def f3(arg_list=[]):
...     arg_list.append('x')
...     print(arg_list)
>>> f3()
['x']
>>> f3()
['x', 'x']
Since this is often not the behavior people assume at first glance, using mutable objects as default argument values is usually best avoided.
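The usual workaround is to default to None and create the list inside the function; a minimal sketch:

>>> def f3(arg_list=None):
...     if arg_list is None:
...         arg_list = []   # a fresh list on every call
...     arg_list.append('x')
...     print(arg_list)
>>> f3()
['x']
>>> f3()
['x']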
The same is true for class attributes, where one object is shared between all instances:
>>> class C(object):
...     a = []
...     def m(self):
...         self.a.append('x') # We actually modify value of an attribute of C
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
['x']
>>> c2.m()
['x', 'x']
>>> c1.m()
['x', 'x', 'x']
Note what the behavior would be with an immutable class attribute in a similar example:
>>> class C(object):
...     a = 0
...     def m(self):
...         self.a += 1 # We assign new object to an attribute of self
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
1
>>> c2.m()
1
>>> c1.m()
2
All the fun details can be found in the documentation: https://docs.python.org/3.6/reference/datamodel.html
In Python, objects such as lists are passed by reference, and assignment with the = operator assigns by reference. So this function:
def modify_list(A):
    A = [1,2,3,4]
takes a reference to a list and labels it A, but then rebinds the local variable A to a new object; the list passed in by the calling scope is not modified.
test = []
modify_list(test)
print(test)
prints []
However I could do this:
def modify_list(A):
    A += [1,2,3,4]
test = []
modify_list(test)
print(test)
Prints [1,2,3,4]
How can I assign a list passed by reference to contain the values of another list? What I am looking for is something functionally equivalent to the following, but simpler:
def modify_list(A):
    list_values = [1,2,3,4]
    for i in range(min(len(A), len(list_values))):
        A[i] = list_values[i]
    # delete any extra items from the end so indices don't shift mid-loop
    for i in range(len(A) - 1, len(list_values) - 1, -1):
        del A[i]
    for i in range(len(A), len(list_values)):
        A += [list_values[i]]
And yes, I know that this is not a good way to do <whatever I want to do>, I am just asking out of curiosity not necessity.
You can do a slice assignment:
>>> def mod_list(A, new_A):
...     A[:] = new_A
...
>>> liA=[1,2,3]
>>> new=[3,4,5,6,7]
>>> mod_list(liA, new)
>>> liA
[3, 4, 5, 6, 7]
The simplest solution is to use:
def modify_list(A):
    A[::] = [1, 2, 3, 4]
To overwrite the contents of a list with another list (or an arbitrary iterable), you can use the slice-assignment syntax:
A = B = [1,2,3]
A[:] = [4,5,6,7]
print(A) # [4,5,6,7]
print(A is B) # True
Slice assignment is implemented on most of the mutable built-in types. The above assignment is essentially the same as the following:
A.__setitem__(slice(None, None, None), [4,5,6,7])
So the same magic method (__setitem__) is called as for a regular item assignment, except that the index is now a slice object, which represents the range of items to overwrite. Based on this example you can even support slice assignment in your own types.
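For instance, here is a hedged sketch of a tiny container class (the name Buffer is purely illustrative) that accepts slice assignment through __setitem__:

class Buffer:
    def __init__(self, items):
        self._items = list(items)

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            # replace the selected range with the new values
            self._items[index] = list(value)
        else:
            self._items[index] = value

    def __repr__(self):
        return "Buffer(%r)" % self._items

b = Buffer([1, 2, 3])
b[:] = [4, 5, 6, 7]    # calls __setitem__ with slice(None, None, None)
print(b)               # Buffer([4, 5, 6, 7])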
I have a list of lists. Each sublist contains objects of a custom class. What I want to do is set a certain attribute of each class object to 0. The simple way to do this would be a double for loop or similar:
for subl in L:
    for myObj in subl:
        myObj.attr = 0
Alternatively, I could use itertools.chain:
for myObj in itertools.chain.from_iterable(L):
    myObj.attr = 0
However, I wonder if I could set everything in one line. Could I perhaps use a generator-like structure to do this? Something along the lines of:
(myObj.attr=0 for subl in L for myObj in subl)
Now that won't really work, and will raise a SyntaxError, but is something even remotely similar possible?
This is an abuse of generator expressions, but:
any(setattr(obj, "attr", 0) for sub in L for obj in sub)
Or, perhaps slightly faster since there's no testing of each object:
from collections import deque
do = deque(maxlen=0).extend
do(setattr(obj, "attr", 0) for sub in L for obj in sub)
See this example:
class C:
    def __init__(self):
        self.a = None

    def f(self, para):
        self.a = para

list1 = [C() for e in range(3)]
list2 = [C() for e in range(3)]
list3 = [list1, list2]

[c.f(5) for l in list3 for c in l]

for e in list3:
    for c in e:
        print c.a
Conclusion
You could give the class a method that sets the attribute (like f in the example above). Using it would look something like:
[myObj.setattr(0) for subl in L for myObj in subl]
Note the brackets.
Here is a simple solution that popped into my head, using the built-in setattr, your suggestion (itertools.chain.from_iterable), and an abuse of list comprehensions:
import itertools

class Foo():
    def __init__(self):
        self.my_attr = 10   # 'self.' was missing, so the attribute was never set

A = Foo()
B = Foo()
C = Foo()
D = Foo()

obj_list = [[A, B], [C, D]]

a = [setattr(obj, "my_attr", 0) for obj in itertools.chain.from_iterable(obj_list)]
Result:
>>> a
[None, None, None, None]
>>> A.my_attr
0
>>> B.my_attr
0
>>> C.my_attr
0
>>> D.my_attr
0
I found setattr to be very useful for cases like this, it's simple, short, and effective.
Hope this helps!
When I have a function like
def foo(A):
    tmp = A
    tmp = tmp + 1
    *rest of the function*
    return
or
def foo(A):
    global tmp
    tmp = A
    tmp = tmp + 1
    *rest of the function*
    return
then 1 is added to both "A" and "tmp" instead of only to "tmp". What am I doing wrong and how can I fix it?
As nneonneo's comment under Blake's answer (which describes a significant part of the problem) states, the code in the original post doesn't actually do what the poster describes. For example:
def foo(A):
    tmp = A
    tmp = tmp + 1
    return (A, tmp)
foo(3)
returns (3,4), meaning A has been left unchanged.
Where this would not be true is where A is a mutable type. Mutable types include lists, dictionaries, and derived types of those, while integers, floats, and tuples are not mutable.
For example:
def foo(A):
    tmp = A
    tmp[0] = tmp[0] + 1
    return (A, tmp)
foo([1, 2])
which returns ([2, 2], [2, 2]), and in this case A has changed.
That foo changes the value of A in the list case but not the integer case is because lists are mutable and integers are not. Assigning A to tmp when A is a mutable type assigns a reference to the mutable object, and changing one of its elements (as in tmp[0] = tmp[0] + 1) doesn't make a new object.
If you do not want your function to have this side-effect-like behavior for a list, for example, a common Python idiom is to use slice notation to duplicate the list. This will make a new list object when you assign it to tmp that is a copy of the list object in A:
def foo(A):
    tmp = A[:]
    # this slice makes a new list, a copy of A
    tmp[0] = tmp[0] + 1
    return (A, tmp)
foo([1, 2])
This returns ([1, 2], [2, 2]), so A is unchanged and tmp is changed.
There are other ways to copy lists or other mutable objects which are subtly different from each other. How to clone or copy a list? has a great description of your choices.
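A quick sketch of the common options (all shallow copies except deepcopy):

import copy

a = [1, 2, [3, 4]]
b = a[:]              # slice copy (shallow)
c = list(a)           # constructor copy (shallow)
d = copy.copy(a)      # shallow copy
e = copy.deepcopy(a)  # also copies the nested list
a[2].append(5)
print(b[2], e[2])     # [3, 4, 5] [3, 4]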
That's because Python passes references to objects into functions (sometimes described as pass by object reference) rather than copying the objects. Two different names can therefore point to the very same object in memory.
>>> def foo(a):
...     tmp = a
...     print(tmp, a, id(a), id(tmp))
>>> foo(5)
5 5 505910928 505910928
>>> b = 5
>>> foo(b)
5 5 505910928 505910928
>>> id(b)
505910928
And with the global example:
>>> def foo(a):
...     global tmp
...     a = tmp
...     print(a, tmp, id(a), id(tmp))
>>> foo(5)
7 7 505910960 505910960
>>> foo('s')
7 7 505910960 505910960
>>> tmp
7
>>> tmp = 6
>>> foo('a')
6 6 505910944 505910944