Do OrderedDict.values(), .keys(), .iterkeys() etc. return the values in the order that the items were first inserted?
I assume that the values/keys methods do not change the order of the dict, and that if it's an OrderedDict, I get the values in the order they were added to the dictionary.
Is that true?
Do OrderedDict.values(), .keys(), .iterkeys() etc. return the values in the order that the items were first inserted?
Yes, they are preserved. From the documentation (emphasis mine):
Equality tests between OrderedDict objects are order-sensitive and are
implemented as list(od1.items())==list(od2.items()).
That's how the equality comparison is done, and it implies that the result of items() is ordered. For the other two methods, you can look at a substitute implementation of OrderedDict here.
Your assumption is correct. OrderedDict maintains an internal (and therefore ordered) list of keys, which it uses when iterating over keys and values, so they always come back in insertion order, the same order you would see in a for loop.
The comments in the source code also state this, saying:
The inherited dict provides __getitem__, __len__, __contains__, and get. The remaining methods are order-aware. Big-O running times for all methods are the same as regular dictionaries.
Emphasis mine.
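A quick check of this behaviour (a minimal example; in Python 2, iterkeys() iterates in the same order as keys()):

from collections import OrderedDict

od = OrderedDict()
od['first'] = 1
od['second'] = 2
od['third'] = 3

# keys(), values() and items() all follow insertion order
print(list(od.keys()))    # ['first', 'second', 'third']
print(list(od.values()))  # [1, 2, 3]
print(list(od.items()))   # [('first', 1), ('second', 2), ('third', 3)]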
Related
A set is unordered and unindexed, so there is no concept of a last-entered element, and thus no popitem(). Is this the reasoning for there being no popitem() in set?
If this reasoning is valid, then why does dictionary have popitem()? A dictionary is also unordered, like a set.
The corresponding method for sets is pop():
pop()
Remove and return an arbitrary element from the set. Raises KeyError if the set is empty.
Prior to Python 3.7 dicts were unordered and popitem() returned an arbitrary key-value pair. It's only since 3.7 that dicts have been ordered and popitem() defined to return items in LIFO order.
It's called popitem() for dicts because there's already a pop(key) method that removes the item with the specified key.
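A small illustration of the difference (assuming Python 3.7+, where insertion order is guaranteed):

d = {'a': 1, 'b': 2, 'c': 3}
print(d.popitem())   # ('c', 3) - the last-inserted pair, LIFO
print(d.pop('a'))    # 1 - removes the item with the given key

s = {1, 2, 3}
print(s.pop())       # an arbitrary element - you cannot rely on which one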
Python's set has pop(), which returns an arbitrary item; you don't know which one it will return, though.
There is no method to remove the last-entered element from a set because sets are unordered in Python.
If you want an easy way to emulate an ordered set using the standard library you can use collections.OrderedDict instead:
from collections import OrderedDict
s = OrderedDict.fromkeys((2,5,2,6,7))
s.popitem()
From the example above, s.keys() would become:
odict_keys([2, 5, 6])
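If you need to pop from the other end instead, OrderedDict.popitem() takes a last argument:

s = OrderedDict.fromkeys((2, 5, 2, 6, 7))
s.popitem(last=False)      # last=False pops the first-inserted key instead of the last
print(list(s.keys()))      # [5, 6, 7]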
I have a variable (an object from a database). In some cases this variable is a list, and in other cases a dictionary.
Standard for loop if the variable is a list:
for value in object_values:
    self.do_something(value)
Standard for loop if the variable is a dictionary:
for key, value in object_values.items():
    self.do_something(value)
I can use isinstance() to check the type, but then I still need two functions, or an if with two for loops. Right now I have an if condition that calls one of two functions: one for iterating as a list (e.g. iterate_list()) and one for iterating as a dictionary (e.g. iterate_dict()).
Is there a more elegant, more Pythonic way to handle the fact that I don't know whether the variable will be a list or a dictionary?
In your case, since the data is either the elements of the list or the values of the dictionary, you could use a ternary expression to pick values() or the iterable itself, depending on the type:
def iterate(self, object_values):
    for value in object_values.values() if isinstance(object_values, dict) else object_values:
        self.do_something(value)
If you pass a tuple, generator or other iterable, it falls back on "standard" iteration. If you pass a dictionary (or OrderedDict or other), it iterates on the values.
Performance-wise, the ternary expression is evaluated only once at the start of the iteration, so it's fine.
The isinstance check could even be replaced by hasattr(object_values, "values"), so that non-dict objects with a values attribute would also match (see the sketch below).
(Note the principle of least astonishment, though: some people may expect an iteration over the keys of the dictionary when calling the method.)
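For example, the duck-typed variant mentioned above could look like this (a sketch only; iterate and do_something are the placeholder names from the question):

def iterate(self, object_values):
    # Anything exposing a values() method (dict, OrderedDict, ...) is treated
    # as a mapping; everything else (list, tuple, generator, ...) is iterated directly.
    values = object_values.values() if hasattr(object_values, "values") else object_values
    for value in values:
        self.do_something(value)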
I have read that lists cannot be dictionary keys because mutable objects cannot be hashed.
However, custom objects appear to be mutable as well:
# custom object
class Vertex(object):
    def __init__(self, key):
        self.key = key

v = Vertex(1)
v.color = 'grey'  # this line suggests the custom object is mutable
But, unlike lists, they can be used as dictionary keys; why is this? Couldn't we simply hash some sort of id (such as the address of the object in memory) in both cases?
As noted in Why Lists Can't Be Dictionary Keys:
Lists as Dictionary Keys
That said, the simple answer to why lists cannot be used as dictionary keys is that lists do not provide a valid hash method. Of course, the obvious question is, "Why not?"
Consider what kinds of hash functions could be provided for lists.
If lists hashed by id, this would certainly be valid given Python's definition of a hash function -- lists with different hash values would have different ids. But lists are containers, and most other operations on them deal with them as such. So hashing lists by their id instead would produce unexpected behavior such as:
Looking up different lists with the same contents would produce different results, even though comparing lists with the same contents would indicate them as equivalent.
Using a list literal in a dictionary lookup would be pointless -- it would always produce a KeyError.
User Defined Types as Dictionary Keys
What about instances of user defined types?
By default, all user defined types are usable as dictionary keys with hash(object) defaulting to id(object), and cmp(object1, object2) defaulting to cmp(id(object1), id(object2)). This same suggestion was discussed above for lists and found unsatisfactory. Why are user defined types different?
In the cases where an object must be placed in a mapping, object identity is often much more important than object contents.
In the cases where object content really is important, the default settings can be redefined by overriding __hash__ and __cmp__ or __eq__.
Note that it is often better practice, when an object is to be associated with a value, to simply assign that value as one of the object's attributes.
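As a sketch of that last point (hypothetical code, not from the question): if you did want Vertex instances to be hashed and compared by their key rather than by identity, you could override __hash__ and __eq__:

class Vertex(object):
    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, Vertex) and self.key == other.key

    def __hash__(self):
        return hash(self.key)  # hash the same value used for equality

d = {Vertex(1): 'grey'}
print(d[Vertex(1)])  # 'grey' - a different instance with an equal key finds the entry

Of course, mutating key after the object has been used as a dictionary key would break lookups, which is exactly the problem described above for lists.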
I recently wrote some code that looked something like this:
# dct is a dictionary
if "key" in dct.keys():
However, I later found that I could achieve the same results with:
if "key" in dct:
This discovery got me thinking, and I began to run some tests to see if there could be a scenario where I must use the keys method of a dictionary. My conclusion, however, is no, there is not.
If I want the keys in a list, I can do:
keys_list = list(dct)
If I want to iterate over the keys, I can do:
for key in dct:
...
Lastly, if I want to test if a key is in dct, I can use in as I did above.
Summed up, my question is: am I missing something? Could there ever be a scenario where I must use the keys method? Or is it simply a leftover from an earlier version of Python that should be ignored?
On Python 3, use dct.keys() to get a dictionary view object, which lets you do set operations on just the keys:
>>> for sharedkey in dct1.keys() & dct2.keys(): # intersection of two dictionaries
... print(dct1[sharedkey], dct2[sharedkey])
In Python 2.7, you'd use dct.viewkeys() for that.
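Beyond intersection, key views support the other set operators as well; for example (Python 3):

dct1 = {'a': 1, 'b': 2}
dct2 = {'b': 3, 'c': 4}

print(dct1.keys() | dct2.keys())  # union of keys: {'a', 'b', 'c'}
print(dct1.keys() - dct2.keys())  # keys only in dct1: {'a'}
print(dct1.keys() ^ dct2.keys())  # keys in exactly one of the two: {'a', 'c'}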
In Python 2, dct.keys() returns a list, a copy of the keys in the dictionary. This can be passed around as a separate object that can be manipulated in its own right, including removing elements without affecting the dictionary itself; however, you can create the same list with list(dct), which works in both Python 2 and 3.
You indeed don't want any of these for iteration or membership testing; always use for key in dct and key in dct for those, respectively.
Source: PEP 234, PEP 3106
Python 2's relatively useless dict.keys method exists for historical reasons. Originally, dicts weren't iterable. In fact, there was no such thing as an iterator; iterating over sequences worked by calling __getitem__, the element access method, with increasing integer indices until an IndexError was raised. To iterate over the keys of a dict, you had to call the keys method to get an explicit list of keys and iterate over that.
When iterators went in, dicts became iterable, because it was more convenient, faster, and all around better to say
for key in d:
than
for key in d.keys()
This had the side-effect of making d.keys() utterly superfluous; list(d) and iter(d) now did everything d.keys() did in a cleaner, more general way. They couldn't get rid of keys, though, since so much code already called it.
(At this time, dicts also got a __contains__ method, so you could say key in d instead of d.has_key(key). This was shorter and nicely symmetrical with for key in d; the symmetry is also why iterating over a dict gives the keys instead of (key, value) pairs.)
In Python 3, taking inspiration from the Java Collections Framework, the keys, values, and items methods of dicts were changed. Instead of returning lists, they would return views of the original dict. The key and item views would support set-like operations, and all views would be wrappers around the underlying dict, reflecting any changes to the dict. This made keys useful again.
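A short demonstration of that live-view behaviour in Python 3:

d = {'a': 1}
keys = d.keys()       # a view, not a copy
print(list(keys))     # ['a']

d['b'] = 2            # mutate the dict after taking the view
print(list(keys))     # ['a', 'b'] - the view reflects the change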
Assuming you're not using Python 3, list(dct) is equivalent to dct.keys(). Which one you use is a matter of personal preference. I personally think dct.keys() is slightly clearer, but to each their own.
In any case, there isn't a scenario where you "need" to use dct.keys() per se.
In Python 3, dct.keys() returns a "dictionary view object", so if you need to get a hold of an unmaterialized view to the keys (which could be useful for huge dictionaries) outside of a for loop context, you'd need to use dct.keys().
key in dict
is much faster than checking
key in dict.keys()
(dramatically so in Python 2, where keys() builds a whole new list first; in Python 3 both checks are constant time, but the plain in test still avoids constructing a view object).
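If you want to measure this yourself, a rough micro-benchmark might look like the following (exact numbers depend on your Python version):

import timeit

setup = "dct = {i: i for i in range(10000)}"
print(timeit.timeit("9999 in dct", setup=setup))
print(timeit.timeit("9999 in dct.keys()", setup=setup))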
What does this mean?
The only types of values not acceptable as dictionary keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant.
I think even for tuples, comparison will happen by value.
The problem with a mutable object as a key is that when we use a dictionary, we rarely want to check identity. For example, when we use a dictionary like this:
a = "bob"
test = {a: 30}
print(test["bob"])
We expect it to work - the second string "bob" may not be the same object as a, but it has the same value, which is what we care about. This works because any two strings that compare equal have the same hash, meaning that the dict (implemented as a hash map) can find those strings very efficiently.
The issue comes into play when we have a list as a key. Imagine this case:
a = ["bob"]
test = {a: 30}        # in real Python this already raises TypeError: unhashable type: 'list'
print(test[["bob"]])
We can't do this any more - the lookup won't work, as the hash of a list is not based on its value but on the particular list instance (i.e. id(a) != id(["bob"])).
Python has the choice of making the list's hash change (undermining the efficiency of a hashmap) or simply comparing on identity (which is useless in most cases). Python disallows these specific mutable keys to avoid subtle but common bugs where people expect the values to be equated on value, rather than identity.
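The usual workaround, when the contents really are what you care about, is to use an immutable equivalent such as a tuple:

a = ["bob"]
test = {tuple(a): 30}   # tuples hash by value, so this is allowed
print(test[("bob",)])   # 30 - a different tuple with equal contents finds the entry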
The documentation mixes together two different things: mutability, and value-comparable. Let's separate them out.
Immutable objects that compare by identity are fine. The identity can never change, for any object.
Immutable objects that compare by value are fine. The value can never change for an immutable object. This includes tuples.
Mutable objects that compare by identity are fine. The identity can never change, for any object.
Mutable objects that compare by value are not acceptable. The value can change for a mutable object, which would make the dictionary invalid.
Meanwhile, your wording isn't quite the same as the Mapping Types documentation (section 4.10 in Python 3.3, 5.8 in Python 2.7), both of which say:
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries or other mutable types (that are compared by value rather than by object identity) may not be used as keys.
Anyway, the key point here is that the rule is "not hashable"; "mutable types (that are compared by value rather than by object identity)" is just to explain things a little further. It isn't strictly true that comparing by object identity and hashing by object identity always have to go together; the only hard requirement is that objects which compare equal also have equal hashes.
The part about "efficient implementation of dictionaries" from the version you posted just adds to the confusion (which is probably why it's not in the reference documentation). Even if someone came up with an efficient way to deal with storing lists as dict keys tomorrow, the language doesn't allow it.
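A quick way to see the "not hashable" rule play out across those cases (a minimal sketch):

print(hash((1, 2)))   # immutable, compares by value: hashable

class Thing(object):  # mutable, compares (and hashes) by identity: hashable by default
    pass
print(hash(Thing()))

try:
    hash([1, 2])      # mutable, compares by value: not hashable
except TypeError as e:
    print(e)          # unhashable type: 'list'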
A hash is a way of calculating a unique code for an object, and this code is always the same for the same object. hash('test'), for example, is 2314058222102390712; likewise, after a = 'test', hash(a) is 2314058222102390712. (The exact number varies between interpreter runs, but it is stable within one run.)
Internally, a dictionary value is looked up by the hash of the key, not by the variable you specify. A list is mutable, and a hash for a list, if it were defined, would change whenever the list changes. Therefore Python's design does not hash lists, and lists cannot be used as dictionary keys.
Tuples are immutable, so tuples do have hashes, e.g. hash((1,2)) = 3713081631934410656. When checking whether a tuple a is equal to the tuple (1,2), the hashes can be compared first: if they differ, the tuples cannot be equal, and only when they match does a full element-by-element comparison need to happen. In the common case this is more efficient than always comparing every element.