How to know when creating set or frozenset - python

I am new to Python. I am reading Building Skills in Python (Lott) and trying out some examples. I see that the set(iterable) function creates both a mutable set and an immutable frozenset. How do I know if I am creating a set or a frozenset?

That is simply incorrect. The set() built-in returns a set, not a frozenset. frozenset() returns a frozenset. A set and a frozenset are both set types; however, they are distinct types.
The Python docs are useful for clarifying things like this; they include a complete list of built-in functions.
Excerpt from the book Building Skills in Python (Lott) noted by OP in a comment, emphasis mine.
A set value is created by using the set() or frozenset() factory
functions. These can be applied to any iterable container, which includes any sequence, the keys of a dict,
or even a file.
The author here is using "set value" to describe a value of set type, and is thus not indicating that set() and frozenset() do the same thing - they produce values of distinct set types, namely sets and frozensets.
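A quick check in the interpreter shows the two factories returning distinct types. This is only a minimal sketch; the variable names are illustrative and do not come from the book.

```python
# set() and frozenset() accept any iterable but return different types.
items = [1, 2, 2, 3]

mutable = set(items)
frozen = frozenset(items)

print(type(mutable).__name__)   # set
print(type(frozen).__name__)    # frozenset

# Only the mutable set supports in-place changes.
mutable.add(4)
has_add = hasattr(frozen, "add")   # False: frozenset has no add()

# The two types still compare equal when they hold the same elements.
same_elements = frozenset(items) == set(items)
```

So the name of the function you call tells you which type you get.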

Related

can i use dictionary as a value of set?

As you can see, I am trying to put a dictionary in a set, but it raises an error. Why is it not possible to use a dictionary as an element of a set? It works with a list and a tuple, but not with a set. Why?
s={1,2,4,{1:'fc',2:'tw'},'co-operator'}
print(s)
A set requires that all elements in it are hashable. A dictionary is not hashable.
An object is hashable if it has a hash value which never
changes during its lifetime (it needs a __hash__() method), and can be
compared to other objects (it needs an __eq__() method). Hashable
objects which compare equal must have the same hash value.
Hashability makes an object usable as a dictionary key and a set
member, because these data structures use the hash value internally.
Most of Python’s immutable built-in objects are hashable; mutable
containers (such as lists or dictionaries) are not; immutable
containers (such as tuples and frozensets) are only hashable if their
elements are hashable. Objects which are instances of user-defined
classes are hashable by default. They all compare unequal (except with
themselves), and their hash value is derived from their id().
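To make the failure concrete, the sketch below reproduces the error and then shows one possible workaround. The workaround is an assumption about what the asker needs: it stores a hashable snapshot of the dictionary's items, which only works if the dictionary's values are themselves hashable.

```python
# A dict is unhashable, so it cannot be a set element.
try:
    s = {1, 2, 4, {1: 'fc', 2: 'tw'}, 'co-operator'}
except TypeError as exc:
    error = str(exc)
    print(error)

# Possible workaround (an assumption, not the only option): store an
# immutable, hashable snapshot of the dict's items instead.
d = {1: 'fc', 2: 'tw'}
s = {1, 2, 4, frozenset(d.items()), 'co-operator'}

# The snapshot can be turned back into a dict when needed.
restored = dict(next(e for e in s if isinstance(e, frozenset)))
```

A list or tuple accepts the dictionary because neither of them hashes its elements on insertion.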

dictionary keys: custom objects vs lists

I have read that lists cannot be dictionary keys because mutable objects cannot be hashed.
However, custom objects appear to be mutable as well:
# custom object
class Vertex(object):
    def __init__(self, key):
        self.key = key
v = Vertex(1)
v.color = 'grey' # this line suggests the custom object is mutable
But, unlike lists, they can be used as dictionary keys; why is this? Couldn't we simply hash some sort of id (such as the address of the object in memory) in both cases?
As noted in Why Lists Can't Be Dictionary Keys:
Lists as Dictionary Keys
That said, the simple answer to why lists cannot be used as dictionary keys is that lists do not provide a valid hash method. Of course, the obvious question is, "Why not?"
Consider what kinds of hash functions could be provided for lists.
If lists hashed by id, this would certainly be valid given Python's definition of a hash function -- lists with different hash values would have different ids. But lists are containers, and most other operations on them deal with them as such. So hashing lists by their id instead would produce unexpected behavior such as:
Looking up different lists with the same contents would produce different results, even though comparing lists with the same contents would indicate them as equivalent.
Using a list literal in a dictionary lookup would be pointless -- it would always produce a KeyError.
User Defined Types as Dictionary Keys
What about instances of user defined types?
By default, all user defined types are usable as dictionary keys with hash(object) defaulting to id(object), and cmp(object1, object2) defaulting to cmp(id(object1), id(object2)). This same suggestion was discussed above for lists and found unsatisfactory. Why are user defined types different?
In the cases where an object must be placed in a mapping, object identity is often much more important than object contents.
In the cases where object content really is important, the default settings can be redefined by overriding __hash__ and __cmp__ or __eq__.
Note that it is often better practice, when an object is to be associated with a value, to simply assign that value as one of the object's attributes.
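The difference between the default identity-based behaviour and a value-based override can be sketched as follows. This assumes Python 3, where __cmp__ no longer exists and __eq__ together with __hash__ are the relevant hooks; ValueVertex is a hypothetical class added for illustration.

```python
class Vertex:
    """Default behaviour: hashed and compared by identity."""
    def __init__(self, key):
        self.key = key


class ValueVertex:
    """Hypothetical variant that is hashed and compared by its key."""
    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        return isinstance(other, ValueVertex) and self.key == other.key

    def __hash__(self):
        return hash(self.key)


v = Vertex(1)
d = {v: 'first'}
v.color = 'grey'                # mutating v does not change its hash
found_same = d[v]               # 'first'
found_copy = d.get(Vertex(1))   # None: a different object is a different key

e = {ValueVertex(1): 'first'}
found_equal = e[ValueVertex(1)]   # 'first': equal keys hash alike
```

With the value-based variant, the attribute used in __hash__ should not be changed while the object is in use as a key.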

Equality between frozensets

Example:
>>> tuple((1, 2)) == tuple((2, 1))
False
>>> frozenset((1, 2)) == frozenset((2, 1))
True
Frozen sets are immutable. I would expect that equality between immutable objects should be determined by order, but here that is obviously not the case.
How can I discard frozensets with same elements and different order, without casting to different type?
The short answer is you can't, since as pointed out in the comments, sets and frozensets are unordered data structures. Here are some excerpts from the docs* to support this statement:
A set object is an unordered collection of distinct hashable objects.
There are currently two built-in set types, set and frozenset. The set type is mutable — the contents can be changed using methods like add() and remove(). Since it is mutable, it has no hash value and cannot be used as either a dictionary key or as an element of another set. The frozenset type is immutable and hashable — its contents cannot be altered after it is created; it can therefore be used as a dictionary key or as an element of another set.
* Python 2.7.12
For a better grasp of the equality issue, I would encourage you to run the following snippet using the Online Python Tutor:
tup_1 = tuple((1, 2))
tup_2 = tuple((2, 1))
fs_1 = frozenset((1, 2))
fs_2 = frozenset((2, 1))
This is an extremely handy tool that renders a graphical representation of the objects in memory while the code is executed step by step.
The answer by Tonechas is wrong.
frozenset is unordered, but it can be used for equality comparison. Quote from the official Python docs:
Both set and frozenset support set to set comparisons. Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other). A set is less than another set if and only if the first set is a proper subset of the second set (is a subset, but is not equal). A set is greater than another set if and only if the first set is a proper superset of the second set (is a superset, but is not equal).
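The comparison rules quoted above can be checked directly. The sketch below also shows that, because equal frozensets hash alike, collecting them in a set discards the reordered duplicates without converting them to another type; the variable names are illustrative.

```python
a = frozenset((1, 2))
b = frozenset((2, 1))

equal = a == b                   # True: same elements, order is irrelevant
same_hash = hash(a) == hash(b)   # True: equal objects must hash alike

# Equal frozensets collapse to a single element of the outer set.
unique = set([a, b, frozenset((3,))])

proper_subset = frozenset((1,)) < a    # True
superset = a >= frozenset((1, 2))      # True
```

Here unique ends up with two elements, one for {1, 2} and one for {3}.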

What is the use case of the immutable objects

What is the use case of immutable types/objects like tuple in Python?
>>> tuple('hi')
('h', 'i')
Where can we use unchangeable sequences?
One common use case is the list of (unnamed) arguments to a function.
In [1]: def foo(*args):
   ...:     print(type(args))
   ...:
In [2]: foo(1,2,3)
<class 'tuple'>
Technically, tuples are semantically different to lists.
When you have a list, you have something that is... a list. Of items of some sort. It can therefore have items added to or removed from it.
A tuple, on the other hand, is a set of values in a given order. It just happens to be one value that is made up of more than one value. A composite value.
For example. Say you have a point. X, Y. You could have a class called Point, but that class would have a dictionary to store its attributes. A point is only two values which are, most of the time, used together. You don't need the flexibility or the cost of a dictionary for storing named attributes, you can use a tuple instead.
myPoint = 70, 2
Points are always X and Y. Always 2 values. They are not lists of numbers. They are two values in which the order of a value matters.
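A small sketch of the point example, assuming nothing beyond built-in tuples (the names my_point, labels and moved are illustrative):

```python
my_point = 70, 2          # the parentheses are optional

x, y = my_point           # unpack by position
print(x, y)               # 70 2

# A tuple is hashable when its items are, so it can serve as a dict key.
labels = {my_point: "start", (0, 0): "origin"}

# Item assignment is rejected; a "changed" point is a new tuple.
try:
    my_point[0] = 71
except TypeError:
    moved = (my_point[0] + 1, my_point[1])
```

The original tuple is untouched; moved is a separate value.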
Another example of tuple usage. A function that creates links from a list of tuples. The tuples must be the href and then the label of the link. Fixed order. Order that has meaning.
def make_links(*tuples):
    return "".join('<a href="%s">%s</a>' % t for t in tuples)
make_links(
("//google.com", "Google"),
("//stackoveflow.com", "Stack Overflow")
)
So the reason tuples don't change is because they are supposed to be one single value. You can only assign the whole thing at once.
Here is a good resource that describes the difference between tuples and lists, and the reasons for using each: https://mail.python.org/pipermail/tutor/2001-September/008888.html
The main reason outlined in that link is that tuples are immutable and more limited than, say, lists. This makes them useful only in certain situations, but where those situations can be identified, tuples take up fewer resources.
Immutable objects will make life simpler in many cases. They are especially applicable for value types, where objects don't have an identity so they can be easily replaced. And they can make concurrent programming far safer and cleaner (most of the notoriously hard-to-find concurrency bugs are ultimately caused by mutable state shared between threads). However, for large and/or complex objects, creating a new copy of the object for every single change can be very costly and/or tedious. And for objects with a distinct identity, changing an existing object is much simpler and more intuitive than creating a new, modified copy of it.

when compare by id is used in Python? Dictionary key comparison?

What does this mean?
The only types of values not acceptable as dictionary keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant.
I think even for tuples, comparison will happen by value.
The problem with a mutable object as a key is that when we use a dictionary, we rarely want to check identity. For example, when we use a dictionary like this:
a = "bob"
test = {a: 30}
print(test["bob"])
We expect it to work - the second string "bob" may not be the same as a, but it is the same value, which is what we care about. This works as any two strings that equate will have the same hash, meaning that the dict (implemented as a hashmap) can find those strings very efficiently.
The issue comes into play when we have a list as a key, imagine this case:
a = ["bob"]
test = {a: 30}
print(test[["bob"]])
We can't do this any more. A list has no hash at all, so building this dictionary raises a TypeError. Even if lists hashed by identity, the lookup would still fail, because the key is a different instance: id(a) != id(["bob"]).
Python has the choice of making the list's hash change (undermining the efficiency of a hashmap) or simply comparing on identity (which is useless in most cases). Python disallows these specific mutable keys to avoid subtle but common bugs where people expect the values to be equated on value, rather than identity.
The documentation mixes together two different things: mutability, and value-comparable. Let's separate them out.
Immutable objects that compare by identity are fine. The identity can
never change, for any object.
Immutable objects that compare by value are fine. The value can never
change for an immutable object. This includes tuples.
Mutable objects that compare by identity are fine. The identity can
never change, for any object.
Mutable objects that compare by value are not acceptable. The value
can change for a mutable object, which would make the dictionary
invalid.
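The last case can be demonstrated with a small sketch. Box is a hypothetical class written for this illustration: it is mutable but hashes and compares by value, which is exactly the combination the built-in containers avoid. The lookup results described in the comments are what CPython's dict does.

```python
class Box:
    """Hypothetical mutable class that hashes and compares by value."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


key = Box(1)
table = {key: "payload"}

key.value = 2                  # the key's hash changes after insertion

# The entry was filed under hash(1), so looking for the new value misses it.
lost_new = Box(2) in table
# Looking for the old value finds the slot, but the stored key is no longer equal.
lost_old = Box(1) in table
# The entry still exists; it just cannot be reached by lookup.
still_there = len(table)
```

Both membership tests come back False while the dictionary still holds one entry.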
Meanwhile, your wording isn't quite the same as Mapping Types (4.10 in Python 3.3 or 5.8 in Python 2.7), both of which say:
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries or other mutable types (that are compared by value rather than by object identity) may not be used as keys.
Anyway, the key point here is that the rule is "not hashable"; "mutable types (that are compared by value rather than by object identity)" is just to explain things a little further. It isn't strictly true that comparing by object identity and hashing by object identity are always the same (the only thing that's required is that if id is equal, the hash is equal).
The part about "efficient implementation of dictionaries" from the version you posted just adds to the confusion (which is probably why it's not in the reference documentation). Even if someone came up with an efficient way to deal with storing lists as dict keys tomorrow, the language doesn't allow it.
A hash is a way of calculating an integer code for an object; the code is always the same for the same object, although different objects may share a code. For example, hash('test') was 2314058222102390712 in one session, and with a = 'test', hash(a) gave the same 2314058222102390712.
Internally, a dictionary value is looked up by the hash of its key, not by the variable you specify. A list is mutable; a hash for a list, if it were defined, would change whenever the list changes. Python's design therefore does not hash lists, and lists cannot be used as dictionary keys.
Tuples are immutable, therefore tuples have hashes (provided their elements are hashable), e.g. hash((1,2)) = 3713081631934410656; the exact value depends on the Python version. One can quickly rule out that a tuple a is equal to the tuple (1,2) by comparing hashes: if the hashes differ, the tuples differ, and only when the hashes match do the values themselves need to be compared. This is more efficient, as most mismatches are settled by comparing one value instead of two.
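These points can be checked without relying on particular hash values, which vary between Python versions and, for strings, between runs. A minimal sketch; the final collision example is specific to CPython.

```python
a = 'test'
same_string_hash = hash(a) == hash('test')   # True within one run

t = (1, 2)
same_tuple_hash = hash(t) == hash((1, 2))    # True

# Lists define no hash at all.
try:
    hash([1, 2])
except TypeError:
    list_is_hashable = False

# Equal hashes do not prove equality: in CPython, -1 and -2 share a hash.
collision = hash(-1) == hash(-2) and -1 != -2
```

This is why a dictionary compares the keys themselves whenever two hashes match.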
