Why can't I make a frozenset of a list? - python

When I try the code below:
frozenset([[]])
I get
Traceback (most recent call last):
File "-----.-----.py", line 1, in <module>
frozenset([[]])
TypeError: unhashable type: 'list'
Why can't I do this?

In Python, the elements of a set must be hashable. As the Python docs explain:
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
This is because the set needs to perform set operations efficiently and to determine quickly whether an item is in the set (by comparing hash values). Since lists are mutable and not hashable, they can't be put in a set.
In your code, if you were to say frozenset([]) then that would be fine. In this case, you are creating a frozenset of the items in [] which should all be hashable (since there aren't any items in the list, then the hashable-ness is not a problem). But when you say frozenset([[]]), then Python tries to create a frozenset of all the items in the outer list; but the first item in the outer list is another list ([]) which is not hashable; so you will get an error.
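A minimal sketch of the usual workaround, assuming the inner lists can be converted to tuples. The variable names (nested, as_tuples, empty) are illustrative only and do not come from the question:

```python
nested = [[1, 2], [3, 4], [1, 2]]

# frozenset(nested) raises TypeError because each element is a list.
try:
    frozenset(nested)
except TypeError as exc:
    error_message = str(exc)

# Converting each inner list to a tuple makes the elements hashable.
as_tuples = frozenset(tuple(inner) for inner in nested)

# An empty outer list is fine: there are no elements to hash.
empty = frozenset([])

print(error_message)
print(as_tuples == frozenset({(1, 2), (3, 4)}))  # True
print(len(empty))                                # 0
```

The duplicate [1, 2] collapses into a single element, as it would for any other set member.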

Because lists are mutable and their values can change.
You need immutable objects like strings or tuples etc.
If the value of an object changes, its hash value would also change.
See "Objects, values and types" in the Python documentation.
Brandon Rhodes gives a very good explanation of how hashing works in Python in relation to dictionaries in his talk "The Mighty Dictionary".


Explaining apparent idiosyncrasy in Python when using tuples as elements in a set [duplicate]

If we execute the two following lines in Python:
a = (1,)
{a}
it won't cause any problems. Instead, if we execute the two following lines:
a = ([1],)
{a}
now what we get is "TypeError: unhashable type: 'list'".
In both cases, we're building a set consisting of elements whose type is a tuple, which is immutable (in the first case (1,) and in the second case ([1],)). In the second case, however, our tuple contains an object of mutable type, i.e., the list [1].
It seems that the condition that the elements of a set must be immutable is not enough to guarantee that sets are successfully created. What is the exact condition that won't lead to any error when building a set? What is happening at low level?
As the error message indicates, the actual criterion for whether something can be in a set is whether it is hashable, as stated in the set docs:
Set elements, like dictionary keys, must be hashable.
However, only immutable objects should be hashable, as Python's docs on __hash__() explain:
If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
In order for the hash to be representative of the entire contained data, immutable containers like tuples implement hashing by hashing each contained element in turn, with all the elements' hashes then going into the computation of the container's hash. Of course this is continued recursively if any elements are themselves immutable containers.
In your case, hashing the tuple will thus lead to an attempted hashing of the contained list, which fails as lists are not hashable:
>>> hash([1, 2])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
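To see the recursion in action, here is a small sketch. The helper is_hashable is a hypothetical name introduced for illustration; it simply tries hash() and reports whether it succeeded:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable((1, (2, 3))))    # True: nested tuples of ints
print(is_hashable(([1], 2)))       # False: tuple containing a list
print(is_hashable((1, (2, [3]))))  # False: list buried two levels deep
```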
As for why set elements must be hashable: It's because that will allow membership checking to proceed in O(1) time (think hash map lookup). You could make your own set type without this requirement that would just do membership checking by going through the list of elements one by one, of course - that would effectively just be a list with an addition operation that refuses to insert duplicates.
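The list-based alternative described above could look roughly like this. ListBackedSet is a hypothetical class written for this illustration, not something from the standard library:

```python
class ListBackedSet:
    """Hypothetical set-like container that only needs == on its elements."""

    def __init__(self, items=()):
        self._items = []
        for item in items:
            self.add(item)

    def add(self, item):
        # Linear scan: O(n) instead of the O(1) average of a hash lookup.
        if item not in self._items:
            self._items.append(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


s = ListBackedSet([[1], [2], [1]])
print(len(s))    # 2, the duplicate [1] was rejected
print([2] in s)  # True
```

It accepts unhashable elements such as lists, at the price of every membership test walking the whole container.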

can i use dictionary as a value of set?

As you can see, I am trying to use a dictionary as an element of a set, but it raises an error. Why is it not possible to put a dictionary in a set? Is there a reason? A dictionary works fine as an element of a list or a tuple, so why does it fail only with a set?
s={1,2,4,{1:'fc',2:'tw'},'co-operator'}
print(s)
A set requires that all elements in it are hashable. A dictionary is not hashable.
An object is hashable if it has a hash value which never
changes during its lifetime (it needs a __hash__() method), and can be
compared to other objects (it needs an __eq__() method). Hashable
objects which compare equal must have the same hash value.
Hashability makes an object usable as a dictionary key and a set
member, because these data structures use the hash value internally.
Most of Python’s immutable built-in objects are hashable; mutable
containers (such as lists or dictionaries) are not; immutable
containers (such as tuples and frozensets) are only hashable if their
elements are hashable. Objects which are instances of user-defined
classes are hashable by default. They all compare unequal (except with
themselves), and their hash value is derived from their id().
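If the dictionary's contents do need to live in a set, one possible workaround is to freeze them first. This is a sketch that assumes every key and value in the dictionary is itself hashable; the names settings and frozen_settings are illustrative:

```python
settings = {1: 'fc', 2: 'tw'}

# A frozenset of the (key, value) pairs is hashable as long as
# every key and every value is hashable.
frozen_settings = frozenset(settings.items())
s = {1, 2, 4, frozen_settings, 'co-operator'}

print(frozen_settings in s)               # True
print(dict(frozen_settings) == settings)  # True: the dict can be rebuilt
```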

In Python, why is a tuple hashable but not a list?

Here below when I try to hash a list, it gives me an error but works with a tuple. Guess it has something to do with immutability. Can someone explain this in detail ?
List
x = [1,2,3]
y = {x: 9}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
Tuple
z = (5,6)
y = {z: 89}
print(y)
{(5, 6): 89}
Dicts and other objects use hashes to store and retrieve items really quickly. The mechanics of this all happen "under the covers" - you as the programmer don't need to do anything and Python handles it all internally. The basic idea is that when you create a dictionary with {key: value}, Python needs to be able to hash whatever you used for key so it can store and look up the value quickly.
Immutable objects, or objects that can't be altered, are generally hashable (an immutable container such as a tuple is hashable only if all of its elements are). Their value never changes, so Python can "hash" that value and use it to look up dictionary values efficiently. Objects that fall into this category include strings, tuples, integers and so on. You may think, "But I can change a string! I just go mystr = mystr + 'foo'," but in fact what this does is create a new string instance and assign it to mystr. It doesn't modify the existing instance. Immutable objects never change, so you can always be sure that when you generate a hash for an immutable object, looking up the object by its hash will always return the same object you started with, and not a modified version.
You can try this for yourself: hash("mystring"), hash(('foo', 'bar')), hash(1)
Mutable built-in containers, which can be modified in place, aren't hashable. A list can be modified in-place: mylist.append('bar') or mylist.pop(0). You can't safely hash a mutable object because you can't guarantee that the object hasn't changed since you last saw it. You'll find that list, set, and other mutable built-in types have their __hash__ attribute set to None, so calling hash() on them fails. Because of this, you can't use these objects as dictionary keys:
>>> hash([1,2,3])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
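A quick way to inspect this yourself is sketched below. It checks the __hash__ attribute of a few built-in types and uses collections.abc.Hashable from the standard library:

```python
from collections.abc import Hashable

print(list.__hash__ is None)   # True
print(set.__hash__ is None)    # True
print(dict.__hash__ is None)   # True
print(tuple.__hash__ is None)  # False

# Hashable only checks that the type provides __hash__.
print(isinstance([], Hashable))      # False
print(isinstance((), Hashable))      # True
print(isinstance(([1],), Hashable))  # True, although hash(([1],)) still fails
```

The last line shows why calling hash() inside a try/except is the only fully reliable check for containers.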
Eric Duminil's answer provides a great example of the unexpected behaviour that arises from using mutable objects as dictionary keys.
Here are examples of why it might not be a good idea to allow mutable types as keys. This behaviour might be useful in some cases (e.g. using the state of the object as a key rather than the object itself) but it also might lead to surprising results or bugs.
Python
It's possible to use a numeric list as a key by defining __hash__ on a subclass of list:
class MyList(list):
    def __hash__(self):
        # The hash depends on the current contents, so it changes on mutation.
        return sum(self)
my_list = MyList([1, 2, 3])
my_dict = {my_list: 'a'}
print(my_dict.get(my_list))
# a
my_list[2] = 4 # __hash__() becomes 7
print(next(iter(my_dict)))
# [1, 2, 4]
print(my_dict.get(my_list))
# None
print(my_dict.get(MyList([1,2,3])))
# None
my_list[0] = 0 # __hash__() is 6 again, but for different elements
print(next(iter(my_dict)))
# [0, 2, 4]
print(my_dict.get(my_list))
# a
Ruby
In Ruby, it's allowed to use a list as a key. A Ruby list is called an Array and a dict is a Hash, but the syntax is very similar to Python's:
my_list = [1]
my_hash = { my_list => 'a'}
p my_hash[my_list]
#=> "a"
But if this list is modified, the dict doesn't find the corresponding value any more, even if the key is still in the dict:
my_list << 2
p my_list
#=> [1, 2]
p my_hash.keys.first
#=> [1, 2]
p my_hash[my_list]
#=> nil
It's possible to force the dict to calculate the key hashes again:
my_hash.rehash
p my_hash[my_list]
#=> "a"
A hashset calculates the hash of an object and, based on that hash, stores the object in the structure for fast lookup. As a result, by contract, once an object is added to the dictionary, its hash is not allowed to change. Most good hash functions will depend on the number of elements and the elements themselves.
A tuple is immutable, so after construction, the values cannot change and therefore the hash cannot change either (or at least a good implementation should not let the hash change).
A list on the other hand is mutable: one can later add/remove/alter elements. As a result the hash can change, violating the contract.
So all objects that cannot guarantee a hash that remains stable after the object is added violate the contract and thus are not good candidates. For a lookup, the dictionary will first calculate the hash of the key and determine the correct bucket. If the key has changed in the meantime, this could result in false negatives: the object is in the dictionary, but it can no longer be retrieved because the hash is different, so a different bucket will be searched than the one where the object was originally added.
I would like to add the following aspect as it's not covered by other answers already.
There's nothing wrong with making mutable objects hashable; it's just ambiguous what the hash should represent, and this is why it needs to be defined and implemented consistently by the programmer (not by the programming language).
Note that you can implement the __hash__ method for any custom class, which allows its instances to be stored in contexts where hashable types are required (such as dict keys or sets).
Hash values are usually used as a first step in deciding whether two objects represent the same thing. So consider the following example. You have a list with two items: l = [1, 2]. Now you add an item to the list: l.append(3). And now you must answer the following question: Is it still the same thing? Both - yes and no - are valid answers. "Yes", it is still the same list and "no", it no longer has the same content.
So the answer to this question depends on you as the programmer, and so it is up to you to manually implement hash methods for your mutable types.
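One way to make that choice explicit is sketched below. The Account class and its fields are hypothetical; the assumption is that account_id is never reassigned after construction, so both __eq__ and __hash__ can depend on it while balance stays free to change:

```python
class Account:
    """Hypothetical mutable class hashed by a field that never changes."""

    def __init__(self, account_id, balance):
        self._account_id = account_id  # treated as fixed after creation
        self.balance = balance         # free to change

    def __eq__(self, other):
        if not isinstance(other, Account):
            return NotImplemented
        return self._account_id == other._account_id

    def __hash__(self):
        # Uses only the field that __eq__ uses, and that field never changes.
        return hash(self._account_id)


acct = Account("A-1", 100)
owners = {acct: "alice"}
acct.balance = 250                # mutation does not affect the hash
print(owners[acct])               # alice
print(owners[Account("A-1", 0)])  # alice, equal by account_id
```

Here the programmer has answered "yes, it is still the same thing" for changes to balance, and the dictionary lookup behaves accordingly.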
Based on Python Glossary
An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
All of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not.
Because a list is mutable, while a tuple is not. When a dict stores an object under its hash and the object later changes, the stored hash is not updated, so it stays the same. The next time you look up the object, the dictionary computes the new hash and searches in the wrong place, because the old hash is no longer relevant.
To prevent that, Python does not allow you to hash mutable built-in containers.

Why are sets, dicts, and lists unhashable in Python?

What exactly is meant by unhashable?
>>> a={1,2,3}
>>> b={4,5,6}
>>> set([a,b])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
>>>
Can anyone tell me what exactly the error is? Also, can I add a set to another set in Python?
Objects that don't have a usable __hash__() method are called unhashable. The Python documentation describes the reason very well:
If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
As Kasramvd explained, objects in python that are mutable and implement the __eq__ function are unhashable.
Since sets, lists and dicts are mutable (i.e. they can be changed; for instance, you can add items to or remove items from all of them), they cannot be hashed.
Since a set of sets is not possible, perhaps a set of tuples might work, though you will need to do additional bookkeeping (e.g. ensure unique values) in order to achieve exactly what you described.
a = (1,2,3)
b = (4,5,6)
c = set([a,b])
Or even better, a set of frozensets. Similar to sets, but immutable (you cannot add or remove elements from them).
a = frozenset(a)
b = frozenset(b)
c = set([a,b])
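A short sketch of why frozensets need less bookkeeping than tuples here: equality ignores element order and duplicates. The variable names mirror the snippet above, and the extra checks are illustrative:

```python
a = frozenset({1, 2, 3})
b = frozenset({4, 5, 6})
c = {a, b}

# A frozenset built in a different order is still the same element.
print(frozenset([3, 2, 1]) in c)  # True

# Adding an equal frozenset does not grow the outer set.
c.add(frozenset([1, 1, 2, 3]))
print(len(c))                     # 2

# Tuples, by contrast, are order-sensitive.
print((1, 2, 3) == (3, 2, 1))     # False
```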
A hash function is any function that can be used to map data of
arbitrary size to data of fixed size. The values returned by a hash
function are called hash values, hash codes, hash sums, or simply
hashes.
The dictionary in Python is just a hash map.
And sets can only contain hashable objects such as strings, numbers, tuples of hashable items, or frozensets, but not dicts or other sets.
You might wanna look at: https://docs.python.org/2/tutorial/datastructures.html#sets

When is comparison by id used in Python? Dictionary key comparison?

What does this mean?
The only types of values not acceptable as dictionary keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant.
I think even for tuples, comparison will happen by value.
The problem with a mutable object as a key is that when we use a dictionary, we rarely want to check identity. For example, when we use a dictionary like this:
a = "bob"
test = {a: 30}
print(test["bob"])
We expect it to work - the second string "bob" may not be the same as a, but it is the same value, which is what we care about. This works as any two strings that equate will have the same hash, meaning that the dict (implemented as a hashmap) can find those strings very efficiently.
The issue comes into play when we have a list as a key, imagine this case:
a = ["bob"]
test = {a: 30}
print(test[["bob"]])
We can't do this. If the hash of a list were based on the instance of the list rather than on its value, the lookup would fail, because id(a) != id(["bob"]).
Python has the choice of making the list's hash change (undermining the efficiency of a hashmap) or simply comparing on identity (which is useless in most cases). Python disallows these specific mutable keys to avoid subtle but common bugs where people expect the values to be equated on value, rather than identity.
The documentation mixes together two different things: mutability, and value-comparable. Let's separate them out.
Immutable objects that compare by identity are fine. The identity can never change, for any object.
Immutable objects that compare by value are fine. The value can never change for an immutable object. This includes tuples.
Mutable objects that compare by identity are fine. The identity can never change, for any object.
Mutable objects that compare by value are not acceptable. The value can change for a mutable object, which would make the dictionary invalid.
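The third case, a mutable object that compares by identity, can be sketched with an ordinary user-defined class. Node is a hypothetical class that defines neither __eq__ nor __hash__, so it keeps Python's default identity-based behaviour:

```python
class Node:
    """Hypothetical mutable class with default identity-based == and hash."""

    def __init__(self, label):
        self.label = label


n = Node("start")
visited = {n: True}
n.label = "renamed"                # mutation does not change identity
print(visited[n])                  # True
print(Node("renamed") in visited)  # False, a different object
```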
Meanwhile, your wording isn't quite the same as Mapping Types (4.10 in Python 3.3 or 5.8 in Python 2.7), both of which say:
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries or other mutable types (that are compared by value rather than by object identity) may not be used as keys.
Anyway, the key point here is that the rule is "not hashable"; "mutable types (that are compared by value rather than by object identity)" is just to explain things a little further. It isn't strictly true that comparing by object identity and hashing by object identity are always the same (the only thing that's required is that if id is equal, the hash is equal).
The part about "efficient implementation of dictionaries" from the version you posted just adds to the confusion (which is probably why it's not in the reference documentation). Even if someone came up with an efficient way to deal with storing lists as dict keys tomorrow, the language doesn't allow it.
A hash is a way of calculating a fixed-size code for an object; within one Python process this code is always the same for equal objects, although different objects can occasionally share a code. hash('test') might be 2314058222102390712, for example, and after a = 'test', hash(a) gives the same value. The exact number varies between Python versions and, for strings, between runs.
Internally a dictionary value is searched by the hash of the key, not by the variable you specify. A list is mutable, so a hash for a list, if it were defined, would change whenever the list changes. Therefore Python's design does not hash lists, and lists cannot be used as dictionary keys.
Tuples are immutable, therefore tuples have hashes, e.g. hash((1,2)) might be 3713081631934410656. One can use the hash as a quick first check of whether a tuple a is equal to the tuple (1,2): if the hashes differ, the tuples cannot be equal, and only if the hashes match do the values need to be compared. This is more efficient because in most cases only one value has to be compared instead of two.
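Equal hashes alone do not prove equality, which is why dictionaries still compare values after matching hashes. A minimal sketch, using a hypothetical SameHash class whose instances deliberately all collide:

```python
class SameHash:
    """Hypothetical class whose instances all share one hash value."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, SameHash) and self.value == other.value

    def __hash__(self):
        return 42  # deliberately terrible: every instance collides


x, y = SameHash(1), SameHash(2)
d = {x: "one", y: "two"}
print(hash(x) == hash(y))  # True
print(x == y)              # False
print(len(d))              # 2, the dict keeps both keys apart using ==
print(d[SameHash(2)])      # two
```

The dictionary stays correct with colliding hashes; it only gets slower, because more equality comparisons are needed.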
