I am having a hard time understanding what's going on in my code. Say I have the following lines:
d = {}
d.setdefault("key",[]).append("item")
Afterwards, d is:
{'key': ['item']}
So I get what setdefault does. It checks for "key" in d, a dictionary; if the key doesn't exist it creates it, and if it does exist it returns the value. This returns a copy which can be manipulated and will be updated in the original dictionary. This is a new idea to me. Does this mean that setdefault returns a deep copy, as opposed to a shallow copy? I'm trying to wrap my head around shallow copy vs. deep copy.
No Python operation does implicit copying. Ever. Implicit copying is evil, as far as Python is concerned.
It's literals that create objects. Every time setdefault is called, it evaluates both its arguments. When it evaluates its second argument ([]), a new list is created. It's completely the same as a = [].
If you write el = [] and then try .setdefaulting el into some dict more than one time, you'll see that no copies are being made.
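The experiment described above can be sketched as follows. The names el, first and second are only for illustration; the point is that every reference is the same list object, so no copy is involved.

```python
el = []                          # one list object, created once
d = {}

first = d.setdefault("a", el)    # "a" is missing, so el itself is stored and returned
second = d.setdefault("b", el)   # "b" is missing too, so the same el is stored again

second.append("item")            # mutate the list through one reference

print(first is el, second is el) # True True
print(d)                         # {'a': ['item'], 'b': ['item']}
```

Both keys show the appended item because both values are the one list bound to el.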
It is equivalent to
item = d.get(key, default)
d[key] = item
d[key].action  # in this case, append
From the holy docs:
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
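As a rough sketch of the documented behaviour, here is a hand-written stand-in. setdefault_sketch is a hypothetical helper, not how dict is actually implemented; it only mirrors the wording of the docs quoted above.

```python
def setdefault_sketch(d, key, default=None):
    """Hypothetical stand-in for dict.setdefault, for illustration only."""
    if key not in d:
        d[key] = default         # store the very object that was passed in
    return d[key]                # return the stored object, not a copy


d = {}
setdefault_sketch(d, "key", []).append("item")
# On the second call the key exists, so the freshly created [] is simply discarded.
setdefault_sketch(d, "key", []).append("other")

print(d)                         # {'key': ['item', 'other']}
```

The append calls act on the list that lives in the dictionary, which is why the dictionary changes.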
The behaviour is easily explicable once you drop the idea that it is a copy. It is not; it is the actual object.
Was solving an algorithms problem and had to reverse a list.
When done, this is what my code looked like:
def construct_path_using_dict(previous_nodes, end_node):
    constructed_path = []
    current_node = end_node
    while current_node:
        constructed_path.append(current_node)
        current_node = previous_nodes[current_node]
    constructed_path = reverse(constructed_path)
    return constructed_path
But, along the way, I tried return constructed_path.reverse() and I realized it wasn't returning a list...
Why was it made this way?
Shouldn't it make sense that I should be able to return a reversed list directly, without first doing list.reverse() or list = reverse(list) ?
What I'm about to write was already said here, but I'll write it anyway because I think it will perhaps add some clarity.
You're asking why the reverse method doesn't return a (reference to the) result, and instead modifies the list in-place. In the official python tutorial, it says this on the matter:
You might have noticed that methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
In other words (or at least, this is the way I think about it): Python tries to mutate in place wherever possible (that is, when dealing with a mutable data structure), and when it mutates in place, it doesn't also return a reference to the list, because then it would appear that it is returning a new list when it is really returning the old one.
To be clear, this is only true for object methods, not functions that take a list, for example, because the function has no way of knowing whether or not it can mutate the iterable that was passed in. Are you passing a list or a tuple? The function has no way of knowing, unlike an object method.
list.reverse reverses in place, modifying the list it was called on. Generally, Python methods that operate in place don’t return what they operated on to avoid confusion over whether the returned value is a copy.
You can reverse and return the original list:
constructed_path.reverse()
return constructed_path
Or return a reverse iterator over the original list, which isn’t a list but doesn’t involve creating a second list just as big as the first:
return reversed(constructed_path)
Or return a new list containing the reversed elements of the original list:
return constructed_path[::-1]
# equivalent: return list(reversed(constructed_path))
If you’re not concerned about performance, just pick the option you find most readable.
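The three options can be compared side by side. This is a small sketch with made-up node names, not the asker's graph code.

```python
nodes = ["end", "middle", "start"]

# Option 1: reverse in place. The method returns None, the list itself changes.
result = nodes.reverse()
print(result)                    # None
print(nodes)                     # ['start', 'middle', 'end']

# Option 2: a reverse iterator over the same list; no second list is built.
lazy = reversed(nodes)
print(list(lazy))                # ['end', 'middle', 'start']

# Option 3: a new list via slicing; the original is left untouched.
new_list = nodes[::-1]
print(new_list)                  # ['end', 'middle', 'start']
print(new_list is nodes)         # False
```

This is also why return constructed_path.reverse() returns None: the value of the method call is returned, not the list.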
Methods like insert, remove or sort that only modify the list have no return value printed – they return the default None. This is a design principle for all mutable data structures in Python.
(Python docs, tutorial section 5.1)
As I understand it, you can see this quickly by mutating a list (mutable), e.g. with list.reverse(), and by mutating a list that is an element of a tuple (immutable), while calling
id(list)
id(tuple_with_list)
before and after the mutations. The ids do not change. Mutating in place and returning None is part of what lets mutable objects be changed, grown, and referenced from several places without a new object being created.
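A minimal sketch of that id() comparison, using illustrative names (items, holder) that are not from the original post:

```python
items = [1, 2, 3]
holder = (items, "label")        # an immutable tuple that contains the mutable list

id_list_before = id(items)
id_tuple_before = id(holder)

items.reverse()                  # in-place mutation of the list
holder[0].append(0)              # mutating the list through the tuple is allowed

print(id(items) == id_list_before)    # True: still the same list object
print(id(holder) == id_tuple_before)  # True: still the same tuple object
print(holder)                         # ([3, 2, 1, 0], 'label')
```

The tuple cannot be made to hold a different object, but the list it holds can still change.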
I changed the following
sales_gross_last_7_days = self.events_sales_gross_last_7_days_incl_today.get(event.pk, {})
sales_gross_last_7_days.pop(timezone.now().date(), 0)
to that one (I added .copy()):
sales_gross_last_7_days = self.events_sales_gross_last_7_days_incl_today.copy().get(event.pk, {})
sales_gross_last_7_days.pop(timezone.now().date(), 0)
Before my change .pop() also affected the original dict. Is that normal behaviour for Python?
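Yes. The pop() method removes the key that you pass from the dictionary it is called on.
Dictionaries are mutable objects, so when you call pop() on one you are changing its contents. Since .get() returns the stored dictionary itself and not a copy, the change is visible through the original.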
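One thing worth checking is where the copy is taken, because dict.copy() is shallow: it copies the outer dict but not the dicts stored inside it. The sketch below uses made-up data (sales_a, sales_b and the day names are assumptions, not the asker's model) to show the difference.

```python
# Copying the OUTER dict: the inner dict is still shared.
sales_a = {1: {"mon": 10, "tue": 20}}
inner_shared = sales_a.copy().get(1, {})
inner_shared.pop("mon", 0)
print(sales_a)                   # {1: {'tue': 20}}  (original was affected)

# Copying the INNER dict: the original stays intact.
sales_b = {1: {"mon": 10, "tue": 20}}
inner_copy = dict(sales_b.get(1, {}))
inner_copy.pop("mon", 0)
print(sales_b)                   # {1: {'mon': 10, 'tue': 20}}
print(inner_copy)                # {'tue': 20}
```

If the inner values were themselves nested, copy.deepcopy from the standard library would be the tool to reach for.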
I'm looking at an output from 2to3 that includes this change:
- for file_prefix in output.keys():
+ for file_prefix in list(output.keys()):
where output is a dictionary.
What is the significance of this change? Why does 2to3 do this?
How does this change make the code Python 3 compatible?
In Python 3, the .keys() method returns a view object rather than a list, for efficiency's sake.
In the iteration case, this doesn't actually matter, but where it would matter is if you were doing something like foo.keys()[0] - you can't index a view. Thus, 2to3 always adds an explicit list conversion to make sure that any potential indexing doesn't break.
You can manually remove the list() call anywhere that a view would work fine; 2to3 just isn't smart enough to tell which case is which.
(Note that the 2.x version could call iterkeys() instead, since it's not indexing.)
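A short Python 3 sketch of the difference, using a made-up output dict. It also shows one iteration case where the list() snapshot does matter, namely when the loop body deletes keys.

```python
output = {"a": 1, "b": 2}

keys = output.keys()
print(type(keys).__name__)       # dict_keys

# Plain iteration works on the view directly; no list() needed.
seen = [k for k in keys]
print(seen)                      # ['a', 'b']

# Indexing is where the view differs from a Python 2 list.
try:
    keys[0]
    indexable = True
except TypeError:
    indexable = False
print(indexable)                 # False
print(list(keys)[0])             # a

# The view is live: it reflects later changes to the dict.
output["c"] = 3
print(list(keys))                # ['a', 'b', 'c']

# A list snapshot makes it safe to delete entries while looping.
for k in list(output.keys()):
    if k == "a":
        del output[k]
print(output)                    # {'b': 2, 'c': 3}
```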
In Python 2.x, dict.keys() returns a list.
In Python 3.x, dict.keys() returns a view and must be passed to list() in order to make it a list.
Since the Python 2.x code doesn't need a list it should call dict.iterkeys() instead.
In Python 2, .keys() returns a list of the keys, but in Python 3 it returns a view object, which is not a list. Since 2to3 can't know whether you really needed the keys to be a list, it has to err on the side of caution and wrap the call in list so you really get a list.
In Python 2, keys returned a list, while in Python 3 the return value of keys is a dict_keys object. So if you were dependent on the list-result behavior, the explicit conversion is necessary.
I saw the following code here.
d[key] = data # store data at key (overwrites old data if
# using an existing key)
data = d[key] # retrieve a COPY of data at key (raise KeyError if no
# such key)
I don't understand the meaning of this. It says "retrieve a COPY of data at key". So it seems a dict lookup (getitem, or indexing; which is the proper term?) makes a copy of the object. Is that right?
You're looking at the shelve module documentation.
shelve.open returns a dictionary-like object, not a dictionary. It does not load all key-value pairs at once, so the comments in the example make sense.
Ordinarily, dict lookup returns the value stored at the key, not a copy of the value. This is important for mutable objects. For instance:
A = dict()
A["a"] = ["Hello", "world"] # Stores a 2-element list in the dict, at key "a"
B = A["a"] # Gets the list that was just stored
B[0] = "Goodbye" # Changes the first element of the list
print(A["a"][0]) # Prints "Goodbye"
In contrast, shelve will return a copy of the value stored with the key, so changing the returned value will not change the shelved value.
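The same experiment against a shelf might look like the sketch below. It assumes the default writeback=False and uses a temporary directory with a made-up file name (demo_shelf) so that nothing is left on disk.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_shelf")   # hypothetical file name
    with shelve.open(path) as db:            # writeback defaults to False
        db["a"] = ["Hello", "world"]         # the list is pickled into the shelf
        b = db["a"]                          # a fresh object is unpickled on each read
        b[0] = "Goodbye"                     # changes only the local object
        stored_first = db["a"][0]
        same_object = db["a"] is db["a"]
        db["a"] = b                          # explicit store to persist the change
        updated_first = db["a"][0]

print(stored_first)              # Hello
print(same_object)               # False
print(updated_first)             # Goodbye
```

Every read goes through pickle, which is why the returned value behaves like a copy.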
You are confusing an implementation (i.e. what __getitem__ does for one specific type of object) with a specification (i.e. a prescription for what __getitem__ should do all the time).
__getitem__ just implements syntactic sugar around x[i] - it places no demands on how that is actually done. x[i] could just return the value associated with i in a dictionary. It could return a copy. It could cause way more side effects - i.e. it could cause files to be created/deleted, databases to be connected/disconnected, objects to be created/deleted, etc.
For dict, __getitem__ is defined to return the original object. But you shouldn't assume those semantics will apply for all other objects that implement it - you will be disappointed. When in doubt, you are doing the right thing - check the docs.
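To make the point concrete, here is a sketch of a container whose __getitem__ deliberately returns a copy. CopyingStore is a hypothetical class invented for this example; it is not part of any library.

```python
import copy


class CopyingStore:
    """Hypothetical mapping-like class whose lookups return deep copies."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = value

    def __getitem__(self, key):
        # A design choice of this class, not a rule of the x[i] syntax.
        return copy.deepcopy(self._data[key])


plain = {"a": ["Hello"]}
store = CopyingStore()
store["a"] = ["Hello"]

plain["a"].append("x")           # dict lookup returns the stored list itself
store["a"].append("x")           # this lookup returns a throwaway copy

print(plain["a"])                # ['Hello', 'x']
print(store["a"])                # ['Hello']
```

Both objects support x[i], yet only the plain dict hands back the stored object.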
This appeared as some test question.
If you consider this function which uses a cache argument as the 1st argument
def f(cache, key, val):
    cache[key] = val
    # insert some insanely complicated operation on the cache
    print(cache)
and now create a dictionary and use the function like so:
c = {}
f(c,"one",1)
f(c,"two",2)
This seems to work as expected (i.e. adding to the c dictionary), but is it actually passing that reference, or is it doing some inefficient copy?
The dictionary passed to cache is not copied. As long as the cache variable is not rebound inside the function, it stays the same object, and modifications to the dictionary it refers to will affect the dictionary outside.
There is not even any need to return cache in this case (and indeed the sample code does not).
It might be better if f was a method on a dictionary-like object, to make this more conceptually clear.
If you use the id() function (built-in, does not need to be imported) you can get a unique identifier for any object. You can use that to confirm that you are really and truly dealing with the same object and not any sort of copy.
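A sketch of that id() check. The function g is a hypothetical variant added here to show what rebinding cache inside the function looks like by contrast.

```python
def f(cache, key, val):
    cache[key] = val             # mutates the caller's dict
    return id(cache)


def g(cache, key, val):
    cache = dict(cache)          # rebinds the local name to a NEW dict
    cache[key] = val             # so this change never reaches the caller
    return id(cache)


c = {}

print(f(c, "one", 1) == id(c))   # True: same object inside and outside
print(c)                         # {'one': 1}

print(g(c, "two", 2) == id(c))   # False: g worked on a different object
print(c)                         # {'one': 1}
```

Only the explicit dict(cache) call in g creates a copy; passing the argument never does.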