I'm wondering if there is a built-in way to do this... Take this simple code for example:
D = {'one': objectA(), 'two': objectB(), 'three': objectC()}
object_a = D['one']
I believe object_a is just pointing at the objectA() created on the first line, and knows nothing about the dictionary D, but my question is, does Python store the Key of the dictionary value? Is there a way to get the Key 'one' if all you have is the variable object_a (without looping over the dictionary, of course)?
If not, I can store the value 'one' inside objectA(), but I'm just curious if Python already stores that info.
I think not.
Consider the case of adding a single object to a (large) number of different dictionaries. It would become quite expensive for Python to track that for you: a lot of bookkeeping for a feature most code never uses.
The dict mapping is not trivially "reversible" as you describe.
The key must be hashable (in practice, immutable), so that it can be hashed for lookup and never changes out from under the table.
The value does not have to be immutable; it is never hashed for lookup.
You cannot simply go from value back to key without (1) creating an immutable value and (2) populating some other kind of mapping with the "reversed" value -> key mapping.
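For the record, a minimal sketch of such a hand-rolled reverse mapping, keyed by id() since values need not be hashable:

```python
# A do-it-yourself reverse mapping alongside the dict.
# Keyed by id() so unhashable values (lists, custom objects) still work;
# note that id()-based lookup finds only the exact same object, not equal copies.
D = {'one': [1], 'two': [2], 'three': [3]}
reverse = {id(v): k for k, v in D.items()}

object_a = D['one']
print(reverse[id(object_a)])  # -> one
```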
Is there a way to get the Key 'one' if all you have is the variable object_a (without looping over the dictionary, of course)?
No, Python imposes no such near-useless redundancy on you. If objA is a factory callable:
d = {'zap': objA()}
a = d['zap']
and
b = objA()
just as well as
L = [objA()]
c = L[0]
all result in exactly the same kind of references in a, b and c, to exactly equivalent objects (if that's what objA gives you in the first place), without one bit wasted (neither in said objects nor in any redundant and totally hypothetical auxiliary structure) to record "this is/was a value in list L and/or dict d at this index/key" (or indices/keys, since of course there could be many).
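If you ever do need the key back, the only option without extra bookkeeping is the linear scan the question hoped to avoid; a minimal sketch:

```python
D = {'one': 1, 'two': 2, 'three': 3}
object_a = D['one']

# Scan for the key whose value is the very object we hold (identity check):
key = next(k for k, v in D.items() if v is object_a)
print(key)  # -> one
```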
Like others have said, there is no built-in way to do this, since it takes up memory and is not usually needed.
If not, I can store the value 'one' inside objectA(), but I'm just curious if Python already stores that info.
Just wanted to add that it should be pretty easy to add a more general solution which does this automatically. For example:
def MakeDictReversible(d):
    # d.iteritems() on Python 2; avoid naming the parameter "dict",
    # which shadows the built-in type
    for k, v in d.items():
        v.dict_key = k
This function just embeds every object in the dictionary with a member "dict_key", which is the dictionary key used to store the object.
Of course, this code can only work once (i.e., run this on two different dictionaries which share an object, and the object's "dict_key" member will be overwritten by the second dictionary).
Related
I'd like to find out how dictionary equivalence is checked in Python, depending on whether the same dictionary instance is used, or a different instance with the same values. I've found some Stack Overflow posts hinting at the answer, but none stating it explicitly.
Basically, assume I have a list of dictionaries with various key/value pairs. If I now created a new instance of a dictionary and performed a mydict in mylist check, when would the result of this check be True, and when would it be False?
Optimally, I'd hope that the behavior is as follows:
                 Same instance    Different instance
Same value            T                  T
Different value       F                  F
However, it looks like the real result is something more like this:
                 Same instance    Different instance
Same value            T                  T
Different value       T                  F
Is my second result guaranteed to always be the case? How does equivalence of dicts work exactly, in Python?
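A minimal self-contained reproduction of the behavior in the second table:

```python
a = {"x": 1}
b = {"x": 1}  # different instance, same value
c = {"x": 2}  # different instance, different value

lst = [a]
print(a in lst)  # -> True   (same instance)
print(b in lst)  # -> True   (different instance, equal value: `in` uses ==)
print(c in lst)  # -> False  (not equal to anything in the list)
```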
Also, what is the best way to check if a dict is in a list anyway? With the following code:
if b not in last[a]:
    last[a][b] = []  # since this happens in a for loop, make sure the list is initialized
newIni = {"val1": val1, "val2": val2}
if newIni not in last[a][b]:
    # do stuff
This gives me the error unhashable type: 'dict', which supposedly should only come up when trying to use dicts as keys; but here I'm using dicts as values, so I'm not sure what's going on...
I recently wrote some code that looked something like this:
# dct is a dictionary
if "key" in dct.keys():
However, I later found that I could achieve the same results with:
if "key" in dct:
This discovery got me thinking and I began to run some tests to see if there could be a scenario when I must use the keys method of a dictionary. My conclusion however is no, there is not.
If I want the keys in a list, I can do:
keys_list = list(dct)
If I want to iterate over the keys, I can do:
for key in dct:
...
Lastly, if I want to test if a key is in dct, I can use in as I did above.
Summed up, my question is: am I missing something? Could there ever be a scenario where I must use the keys method? ...or is it simply a leftover from an earlier version of Python that should be ignored?
On Python 3, use dct.keys() to get a dictionary view object, which lets you do set operations on just the keys:
>>> for sharedkey in dct1.keys() & dct2.keys(): # intersection of two dictionaries
... print(dct1[sharedkey], dct2[sharedkey])
In Python 2.7, you'd use dct.viewkeys() for that.
In Python 2, dct.keys() returns a list, a copy of the keys in the dictionary. This can be passed around as a separate object that can be manipulated in its own right, including removing elements without affecting the dictionary itself; however, you can create the same list with list(dct), which works in both Python 2 and 3.
You indeed don't want any of these for iteration or membership testing; always use for key in dct and key in dct for those, respectively.
Source: PEP 234, PEP 3106
Python 2's relatively useless dict.keys method exists for historical reasons. Originally, dicts weren't iterable. In fact, there was no such thing as an iterator; iterating over sequences worked by calling __getitem__, the element access method, with increasing integer indices until an IndexError was raised. To iterate over the keys of a dict, you had to call the keys method to get an explicit list of keys and iterate over that.
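That legacy protocol is still honored today: a class that defines only __getitem__ (no __iter__) can be iterated by calling it with 0, 1, 2, ... until IndexError. A quick sketch:

```python
class Squares:
    """Old-style iterable: no __iter__, only integer-indexed __getitem__."""
    def __getitem__(self, i):
        if i >= 3:
            raise IndexError(i)
        return i * i

print(list(Squares()))  # -> [0, 1, 4]
```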
When iterators went in, dicts became iterable, because it was more convenient, faster, and all around better to say
for key in d:
than
for key in d.keys()
This had the side-effect of making d.keys() utterly superfluous; list(d) and iter(d) now did everything d.keys() did in a cleaner, more general way. They couldn't get rid of keys, though, since so much code already called it.
(At this time, dicts also got a __contains__ method, so you could say key in d instead of d.has_key(key). This was shorter and nicely symmetrical with for key in d; the symmetry is also why iterating over a dict gives the keys instead of (key, value) pairs.)
In Python 3, taking inspiration from the Java Collections Framework, the keys, values, and items methods of dicts were changed. Instead of returning lists, they would return views of the original dict. The key and item views would support set-like operations, and all views would be wrappers around the underlying dict, reflecting any changes to the dict. This made keys useful again.
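A short sketch of what makes the Python 3 views useful: they are live wrappers around the dict, and the key and item views behave like sets:

```python
d = {'a': 1}
ks = d.keys()            # a view, not a snapshot

d['b'] = 2
print('b' in ks)         # -> True: the view reflects the later insertion

print(ks & {'b', 'c'})   # -> {'b'}  (set-like intersection on the key view)
```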
Assuming you're not using Python 3, list(dct) is equivalent to dct.keys(). Which one you use is a matter of personal preference. I personally think dct.keys() is slightly clearer, but to each their own.
In any case, there isn't a scenario where you "need" to use dct.keys() per se.
In Python 3, dct.keys() returns a "dictionary view object", so if you need to get a hold of an unmaterialized view to the keys (which could be useful for huge dictionaries) outside of a for loop context, you'd need to use dct.keys().
In Python 2, checking key in dct is much faster than checking key in dct.keys(), because the latter first builds a whole list of the keys. (In Python 3, dct.keys() is a cheap view, but the bare in test is still the idiomatic spelling.)
What does this mean?
The only types of values not acceptable as dictionary keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant.
I think even for tuples, comparison will happen by value.
The problem with a mutable object as a key is that when we use a dictionary, we rarely want to check identity. For example, when we use a dictionary like this:
a = "bob"
test = {a: 30}
print(test["bob"])
We expect it to work - the second string "bob" may not be the same object as a, but it has the same value, which is what we care about. This works because any two strings that compare equal have the same hash, meaning that the dict (implemented as a hash map) can find those strings very efficiently.
The issue comes into play when we have a list as a key; imagine this case:
a = ["bob"]
test = {a: 30}
print(test[["bob"]])
We can't do this any more: the comparison won't work, because the hash of a list is not based on its value but on the particular list instance (i.e. id(a) != id(["bob"])).
Python has the choice of making the list's hash change (undermining the efficiency of a hashmap) or simply comparing on identity (which is useless in most cases). Python disallows these specific mutable keys to avoid subtle but common bugs where people expect the values to be equated on value, rather than identity.
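Concretely, the ban shows up as a TypeError at insertion time; the usual workaround is to convert the list to a tuple:

```python
d = {}
d[("bob",)] = 30          # a tuple of hashable items is a valid key

err = None
try:
    d[["bob"]] = 30       # a list is rejected immediately
except TypeError as e:
    err = e
print(err)                # -> unhashable type: 'list'
```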
The documentation mixes together two different things: mutability, and value-comparable. Let's separate them out.
- Immutable objects that compare by identity are fine. The identity can never change, for any object.
- Immutable objects that compare by value are fine. The value can never change for an immutable object. This includes tuples.
- Mutable objects that compare by identity are fine. The identity can never change, for any object.
- Mutable objects that compare by value are not acceptable. The value can change for a mutable object, which would make the dictionary invalid.
Meanwhile, your wording isn't quite the same as Mapping Types (4.10 in Python 3.3 or 5.8 in Python 2.7), both of which say:
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries or other mutable types (that are compared by value rather than by object identity) may not be used as keys.
Anyway, the key point here is that the rule is "not hashable"; "mutable types (that are compared by value rather than by object identity)" is just to explain things a little further. It isn't strictly true that comparing by value and hashing always coincide: the only requirement is that objects which compare equal have equal hashes.
The part about "efficient implementation of dictionaries" from the version you posted just adds to the confusion (which is probably why it's not in the reference documentation). Even if someone came up with an efficient way to deal with storing lists as dict keys tomorrow, the language doesn't allow it.
A hash is a way of calculating a unique code for an object, and this code is always the same for the same object. hash('test'), for example, might be 2314058222102390712, and a = 'test'; hash(a) yields the same number. (Since Python 3.3, string hashes are randomized per interpreter run, so the exact number differs between runs, but it stays stable within one run.)
Internally, a dictionary value is looked up by the hash of the key, not by the variable you specify. A list is mutable, and a hash for a list, if it were defined, would change whenever the list changes. Therefore Python's design does not hash lists, and lists cannot be used as dictionary keys.
Tuples are immutable, therefore tuples do have hashes, e.g. hash((1, 2)). The hash lets the dict narrow a lookup down to a handful of candidate keys before doing a full equality comparison, which is far more efficient than comparing every stored key by value. (Equal hashes alone do not prove equality, since different values can share a hash.)
I'm wondering if it is OK to modify the values of a Python dictionary when it doesn't depend on the keys:
# d is some dictionary containing classes I wrote
for v in d.itervalues():  # d.values() in Python 3
    # modify v, and v's type may or may not change
I'm not sure what the Python standard says about this, could somebody please provide some information?
Thanks!
If you mean constructs like:
d = {1: [1]}
for v in d.itervalues():
    v[0] += 1
then yes, this is completely safe. The dict just stores a reference to the object in question and does not touch it in any way other than storage and retrieval. This is not explicitly documented, but it is implicit in the definition of mapping (of which dict is a subtype):
A mapping object maps hashable values to arbitrary objects.
"Arbitrary" means the object may be mutable.
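A small sketch of the distinction, in Python 3 spelling (d.values() instead of d.itervalues()): mutating the value objects in place is safe, while changing the set of keys mid-iteration is not:

```python
d = {1: [1], 2: [2]}
for v in d.values():
    v[0] += 1              # mutating stored objects in place: safe
print(d)                   # -> {1: [2], 2: [3]}

err = None
try:
    for k in d:
        d[k + 10] = []     # adding keys during iteration is not allowed
except RuntimeError as e:  # "dictionary changed size during iteration"
    err = e
```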
Suppose I've got two dicts in Python:
mydict = { 'a': 0 }
defaults = {
'a': 5,
'b': 10,
'c': 15
}
I want to be able to expand mydict using the default values from defaults, such that 'a' remains the same but 'b' and 'c' are filled in. I know about dict.setdefault() and dict.update(), but each does only half of what I want: with dict.setdefault(), I have to loop over each key in defaults, while with dict.update(), defaults will blow away any pre-existing values in mydict.
Is there some functionality I'm not finding built into Python that can do this? And if not, is there a more Pythonic way of writing a loop to repeatedly call dict.setdefault() than this:
for key in defaults.keys():
    mydict.setdefault(key, defaults[key])
Context: I'm writing up some data in Python that controls how to parse an XML tree. There's a dict for each node (i.e., how to process each node), and I'd rather the data I write up be sparse, but filled in with defaults. The example code is just an example... real code has many more key/value pairs in the default dict.
(I realize this whole question is but a minor quibble, but it's been bothering me, so I was wondering if there was a better way to do this that I am not aware of.)
Couldn't you make mydict be a copy of defaults? That way, mydict would have all the correct values to start with:
mydict = defaults.copy()
If you don't mind creating a new dictionary in the process, this will do the trick:
newdict = dict(defaults)
newdict.update(mydict)
Now newdict contains what you need.
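On Python 3.5+, the same two-step merge can be collapsed into a single dict-unpacking expression; entries that appear later win:

```python
defaults = {'a': 5, 'b': 10, 'c': 15}
mydict = {'a': 0}

newdict = {**defaults, **mydict}   # mydict's entries override the defaults
print(newdict)                     # -> {'a': 0, 'b': 10, 'c': 15}
```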
Since Python 3.9, you can do:
mydict = defaults | mydict
I found this solution to be the most elegant in my use case.
You can do this the same way Python's collections.defaultdict works:
class MultiDefaultDict(dict):
    def __init__(self, defaults, **kwargs):
        self.defaults = defaults
        self.update(kwargs)
    def __missing__(self, key):
        return self.defaults[key]
>>> mydict2 = MultiDefaultDict(defaults, a=0)
>>> mydict2['a']
0
>>> mydict2['b']
10
>>> mydict2
{'a': 0}
The other solutions posted so far duplicate all the default values; this one shares them, as requested. You may or may not want to override other dict methods like __contains__(), __iter__(), items(), keys(), values() -- this class as defined here iterates over the non-default items only.
defaults.update(mydict)
Personally I like to extend the dictionary object. It works mostly like a dictionary except that you have to create the object first.
class d_dict(dict):
    'Dictionary object with easy defaults.'
    def __init__(self, defaults={}):
        self.setdefault(defaults)
    def setdefault(self, defaults):
        # use defaults.iteritems() on Python 2
        for key, value in defaults.items():
            if key not in self:
                dict.__setitem__(self, key, value)
This provides the exact same functionality as the dict type except that it overrides the setdefault() method and will take a dictionary containing one or more items. You can set the defaults at creation.
This is just a personal preference. As I understand it, all that dict.setdefault() does is set the items which haven't been set yet. So probably the simplest option is:
new_dict = default_dict.copy()
new_dict.update({'a':0})
However, if you do this more than once you might make a function out of this. At this point it may just be easier to use a custom dict object, rather than constantly adding defaults to your dictionaries.