How are Python dictionaries (builtin hashtables) implemented? [duplicate] - python

This question already has answers here:
How are Python's Built In Dictionaries Implemented?
(3 answers)
Closed 8 years ago.
I was wondering how the Python dict (dictionary/hashtable) is implemented. In particular, if I write something like
my_dict = {"key": {"key": {"key": "value"}}}
what does the Python interpreter actually do? I want to know how it works internally.
Does it treat each dictionary as an object (mostly yes)? If so, is the hashing the same for the same keys across different dictionaries? For example:
dict1 = {"key": "value", "k": "v"}
dict2 = {"key": [1, 2.], "k": "value"}
How different would the look-up for the keys in these 2 distinct dicts be? Also, how does it decide the size of the buckets? Or is it similar to the handling of list size?
Hope you get my question. Thanks!
EDIT - No, I am not asking how hash-tables work. I know that part.

Python dictionaries are essentially an implementation of hash tables. So what is a hash table? Wikipedia gives a short and sweet answer:
a hash table (also hash map) is a data structure used to implement an
associative array, a structure that can map keys to values
A hash table uses a hash function to compute an index into an array of buckets or slots, from which the correct value can be found.
These two SO questions cover some of the things you are interested in:
How are Python's Built In Dictionaries Implemented
How can Python dict have multiple keys with same hash?
I would only be repeating the same things if I went any further.
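The two concrete points in the question (same key in different dicts, and how the table grows) can be checked directly. This is a minimal sketch, not a description of the internals: the variable names come from the question, and the size check relies on sys.getsizeof in CPython, whose exact byte counts are implementation details that differ between versions.

```python
import sys

dict1 = {"key": "value", "k": "v"}
dict2 = {"key": [1, 2.0], "k": "value"}

# The hash belongs to the key object, not to the dictionary, so the same
# key hashes identically no matter which dict it is looked up in
# (within one interpreter run; str hashes are salted per process).
key_hash = hash("key")
same_hash_everywhere = key_hash == hash("key")

# Each dict is its own object with its own table; only the stored values differ.
lookup1 = dict1["key"]
lookup2 = dict2["key"]

# A nested literal just builds separate dict objects that reference each other.
my_dict = {"key": {"key": {"key": "value"}}}
inner_is_dict = isinstance(my_dict["key"]["key"], dict)

# The table is resized in steps as items are added, much like a list
# over-allocates; record the distinct sizes seen while growing.
growing = {}
sizes = []
for i in range(100):
    growing[i] = i
    size = sys.getsizeof(growing)
    if not sizes or size != sizes[-1]:
        sizes.append(size)
```

The sizes list ends up with only a handful of entries for 100 insertions, which suggests the table grows in occasional jumps rather than on every insert.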

The Python tutorial reads
Another useful data type built into Python is the dictionary (see
Mapping Types — dict). Dictionaries are sometimes found in other
languages as “associative memories” or “associative arrays”.
It is best to think of a dictionary as an unordered set of key: value
pairs, with the requirement that the keys are unique (within one
dictionary). A pair of braces creates an empty dictionary: {}. Placing
a comma-separated list of key:value pairs within the braces adds
initial key:value pairs to the dictionary; this is also the way
dictionaries are written on output.
Internally, a dictionary places its items in memory deterministically, at positions derived from a hash function applied to each key.
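A short sketch of the rules quoted above (empty braces, unique keys, hashable keys). The names and numbers are sample data chosen for illustration only.

```python
# An empty pair of braces creates an empty dictionary.
empty = {}

# Keys are unique within one dictionary: a repeated key keeps the last value.
tel = {"jack": 4098, "sape": 4139, "jack": 4127}

# Assigning to an existing key overwrites instead of adding a second entry.
tel["sape"] = 1
tel["guido"] = 4127

# Keys must be hashable; a list is not, so it cannot be used as a key.
try:
    tel[["a", "list"]] = 0
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
```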

Related

Understanding Python dictionary "lookups" between dictionaries to replace keys [duplicate]

This question already has answers here:
How do Python dictionary hash lookups work?
(5 answers)
Closed last month.
I have two dictionaries and my objective was to replace the keys in first_dict, with the values in second_dict.
I got the code working, but largely through trial and error, so I would like some help understanding exactly what is going on here in Python.
first_dict={"FirstName": "Jeff", "Town": "Birmingham"}
second_dict={"FirstName": "c1", "Town": "c2"}
new_dict = {second_dict[k]: v for k, v in first_dict.items()}
This gives me what I want, a new dict as follows:
{'c1': 'Jeff', 'c2': 'Birmingham'}
How is this working?
"new_dict" creates a new dictionary
so "for k, v in first_dict.items()" means: for each key-value pair in first_dict,
the value in new_dict is the value v from first_dict
the key in new_dict is the value from second_dict
How does "second_dict[k]" do this? it seems like it is doing some sort of a lookup to match between the keys of first_dict and second_dict? Is this right, and if so, how does it work?
Python dictionaries are implemented using hash tables. A hash table is basically an array. To access the array we need indices, and the indices are obtained by applying a hash function to the keys. A good hash function tries to distribute the keys evenly across the array so that collisions are rare.
When you create the last dictionary, the comprehension reads each k and v from first_dict. second_dict[k] hashes k, finds the matching slot in second_dict, and returns the value stored there; that value becomes the key in the new dictionary, and v becomes its value. So yes, it is a lookup: the same key produces the same hash in both dictionaries, and each dictionary uses that hash to locate the entry in its own table.
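Unrolling the comprehension into an explicit loop may make the lookup step easier to see. This is a sketch of equivalent behaviour, not of what the interpreter literally executes; new_key and safe_dict are names introduced here for illustration.

```python
first_dict = {"FirstName": "Jeff", "Town": "Birmingham"}
second_dict = {"FirstName": "c1", "Town": "c2"}

# Equivalent of: {second_dict[k]: v for k, v in first_dict.items()}
new_dict = {}
for k, v in first_dict.items():
    new_key = second_dict[k]   # look k up in second_dict and take its value
    new_dict[new_key] = v      # store v in new_dict under that value

# A key missing from second_dict would raise KeyError above;
# .get() with a default keeps the original key instead.
safe_dict = {second_dict.get(k, k): v for k, v in first_dict.items()}
```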
Note: How a hash table handles collisions is a separate topic in itself. There are several ways of handling them. One of them is the open addressing scheme. You can look that up for further details.
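To give a feel for open addressing, here is a toy table that uses linear probing. It is a deliberately simplified sketch: ToyHashTable is a hypothetical class, it never resizes or deletes, and CPython's real dict uses a different, more elaborate probing sequence.

```python
class ToyHashTable:
    """Fixed-size table with linear probing (no resizing, no deletion).

    Assumes the table is never completely full.
    """

    def __init__(self, size=8):
        self.slots = [None] * size  # each slot is None or a (key, value) pair

    def _probe(self, key):
        # Start at hash(key) % size and walk forward until we find the key
        # or an empty slot; wrap around at the end of the array.
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            index = (start + step) % size
            entry = self.slots[index]
            if entry is None or entry[0] == key:
                return index
        raise RuntimeError("table is full")

    def __setitem__(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def __getitem__(self, key):
        entry = self.slots[self._probe(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


table = ToyHashTable(size=8)
# In CPython, hash(n) == n for small non-negative ints, so 1, 9 and 17 all
# map to slot 1 in a table of size 8 and must be placed by probing.
table[1] = "one"
table[9] = "nine"
table[17] = "seventeen"
```

All three keys collide on the same starting slot, yet each one is still found again because a lookup repeats the same probe sequence that the insertion used.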

Python "valueless" dictionary [duplicate]

This question already has answers here:
Is there a Python dict without values?
(3 answers)
Closed 2 years ago.
I have a problem where I want to keep track of a large number of values. If I have never encountered a value, I'll do action A; otherwise, action B. Naturally, I considered using a dictionary to keep track of the values, since the lookup is fast, ~O(1).
However, dictionary is a key-value system, while all I want to take advantage of, is the key.
I can assign a bogus value
"myvalue": None
but I can't help but wonder if there's a more elegant way to go about it.
Thoughts? Ideas?
Thanks!
That's what a set is for:
members = set()
members.add("mykey")
members.add("otherkey")
if "mykey" in members:
. . .
If I were to stick to your dict implementation, I would:
if value in seen:  # seen is your dict (named so it does not shadow the builtin dict)
    pass  # Action B
else:
    # Action A
    seen[value] = 1
so that you wouldn't need to save unseen values in your dict in the first place.
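The first-time versus seen-before pattern from the question can be sketched with a set as follows. The function name process and the "A"/"B" markers are placeholders for whatever the two actions really are.

```python
def process(values):
    """Return ("A" | "B", value) pairs: A on first sight, B on repeats."""
    seen = set()
    actions = []
    for value in values:
        if value in seen:
            actions.append(("B", value))   # seen before
        else:
            actions.append(("A", value))   # first encounter
            seen.add(value)
    return actions


result = process(["x", "y", "x", "z", "y"])
```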
Another type suited to this task is frozenset().
The frozenset type is immutable and hashable — its contents cannot be
altered after it is created; it can therefore be used as a dictionary
key or as an element of another set.
members = frozenset(keylist)
if "mykey" in members:
If the full set of keys is known up front, this is a well-suited collection for the task in Python; if values must be added as they are encountered, a mutable set is needed instead.
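A brief sketch of how a frozenset behaves, assuming keylist is a plain list of hashable keys (the key names here are made up). It shows membership tests, the lack of add(), and use as a dictionary key.

```python
keylist = ["mykey", "otherkey"]      # hypothetical keys known in advance
members = frozenset(keylist)         # pass the iterable itself, not [keylist]

found = "mykey" in members
missing = "unknown" in members

# A frozenset has no add(); building a new one is the only way to "grow" it.
can_add = hasattr(members, "add")
bigger = members | {"newkey"}

# Being hashable, a frozenset can itself be a dictionary key.
lookup = {members: "group-1"}
```

Passing [keylist] instead would try to store the list itself as a single element and fail with TypeError, because lists are not hashable.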

Dictionary value is different in input and different in output [duplicate]

This question already has answers here:
How to keep keys/values in same order as declared?
(13 answers)
Closed 4 years ago.
I have some error in Python 3 while using dictionaries. The input and output do not match.
What you are getting is not an error. Read about dictionaries first: https://www.w3schools.com/python/python_dictionaries.asp
Dictionaries don't work like lists. They are a hashed data structure that binds each key to its value: 5 will always be bound to "five" and 4 will always be bound to "four", so dict1[5] always gives 'five'. Before Python 3.7 the language made no promise about the order in which a dictionary's items are stored or printed, because the hashing scheme decides where each entry goes, so the output order could differ from the input order. Since Python 3.7, dictionaries preserve insertion order, but you still look values up by key, not by position.
Never use dictionaries as lists. Dictionaries are collections of key-value pairs and you access a value by its key. Lists are like arrays: you access a value by its index.
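A small sketch of the difference between key access and ordering, using the 5/"five" and 4/"four" pairs mentioned above. The insertion-order line assumes Python 3.7 or later; on older versions the iteration order is not guaranteed.

```python
dict1 = {5: "five", 4: "four"}

# Access by key is independent of any ordering.
value = dict1[5]

# On Python 3.7+ iteration follows insertion order.
keys_in_order = list(dict1)

# If a particular order is needed, ask for it explicitly.
keys_sorted = sorted(dict1)

# Equality ignores order: the same pairs mean equal dicts.
same = dict1 == {4: "four", 5: "five"}
```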

Is there a better way to store a twoway dictionary than storing its inverse separate? [duplicate]

This question already has answers here:
How to implement an efficient bidirectional hash table?
(8 answers)
Closed 9 years ago.
Given a one-to-one dictionary (=bijection) generated à la
for key, value in someGenerator:
    myDict[key] = value
an inverse lookup dictionary can be trivially created by adding
invDict[value] = key
to the for loop. But is this a Pythonic way? Should I instead write a class Bijection(dict) which manages this inverted dictionary in addition and provides a second lookup function? Or does such a structure (or a similar one) already exist?
What I've done in the past is created a reversedict function, which would take a dict and return the opposite mapping, either values to keys if I knew it was one-to-one (throwing exceptions on seeing the same value twice), or values to lists of keys if it wasn't. That way, instead of having to construct two dicts at the same time each time I wanted the inverse look-up, I could create my dicts as normal and just call the generic reversedict function at the end.
However, it seems that the bidict solution that Jon mentioned in the comments is probably the better one. (My reversedict function seems to be his bidict's ~ operator).
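The reversedict function is only described, not shown, so the following is a guess at its shape rather than the author's actual code. The one_to_one parameter is an assumption introduced here to cover both behaviours mentioned.

```python
def reversedict(mapping, one_to_one=True):
    """Hypothetical helper: invert a dict.

    one_to_one=True  -> {value: key}, raising ValueError on a repeated value
    one_to_one=False -> {value: [keys...]}
    """
    inverse = {}
    for key, value in mapping.items():
        if one_to_one:
            if value in inverse:
                raise ValueError(f"value {value!r} appears more than once")
            inverse[value] = key
        else:
            inverse.setdefault(value, []).append(key)
    return inverse


codes = {"a": 1, "b": 2}
inv = reversedict(codes)
grouped = reversedict({"a": 1, "b": 1, "c": 2}, one_to_one=False)
```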
If you want O(log(n)) time for accessing values, you will need both a representation of the map and a representation of the inverse map.
Otherwise the best you can do is O(log(n)) in one direction and O(n) in the other.
Edit: not O(log(n)) but O(1) on average for a dict, thanks Claudiu. You are still going to need two data structures for quick access in both directions, and they will take more or less the same space as a dict plus an inverse dict.
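The "two data structures" idea could look roughly like this. It is a minimal sketch: Bijection here is a hypothetical wrapper built by composition (not the dict subclass floated in the question), and it supports only setting, forward lookup and inverse lookup.

```python
class Bijection:
    """Hypothetical two-dict wrapper keeping forward and inverse maps in sync."""

    def __init__(self):
        self._forward = {}
        self._inverse = {}

    def __setitem__(self, key, value):
        if value in self._inverse and self._inverse[value] != key:
            raise ValueError(f"value {value!r} is already mapped")
        if key in self._forward:
            # Re-assigning a key: drop its old value from the inverse map.
            del self._inverse[self._forward[key]]
        self._forward[key] = value
        self._inverse[value] = key

    def __getitem__(self, key):
        return self._forward[key]

    def inverse(self, value):
        return self._inverse[value]

    def __len__(self):
        return len(self._forward)


b = Bijection()
b["one"] = 1
b["two"] = 2
b["one"] = 11   # replaces the pair ("one", 1) with ("one", 11)
```

Both lookups are ordinary dict lookups, so each direction stays fast at the cost of storing every pair twice.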

Dictionary into dictionary in python

Ok, this one should be simple. I have 3 dictionaries. They are all made, ordered, and filled to my satisfaction but I would like to put them all in an overarching dictionary so I can reference and manipulate them more easily and efficiently.
Layer0 = {}
Layer1 = {}
Layer2 = {}
Here they are when created. Afterwards I feebly tried different things based on SO questions:
Layers = {Layer0, Layer1, Layer2}
which raised a syntax error
Layers = {'Layer0', 'Layer1', 'Layer2'}
which raised another syntax error
(Layers is the Dictionary I'm trying to create that will have all the previously made dictionaries within it)
All the other examples I found on SO have been related to creating dictionaries within dictionaries in order to fill them (or filling them simultaneously) and since I already coded a large number of lines to make these dictionaries, I'd rather put them into a dictionary after the fact instead of re-writing code.
It would be best if the order of the dictionaries is preserved when put into Layers.
Does anyone know if this is possible and how I should do it?
Dictionary items have both a key and a value.
Layers = {'Layer0': Layer0, 'Layer1': Layer1, 'Layer2': Layer2}
Keep in mind that a dictionary is a hash table: each key is hashed, and the hash decides where the entry is stored (hash values are not guaranteed to be unique, so collisions are resolved internally). In Python 2, and in Python 3 before 3.7, a dictionary makes no promise about order; .keys() and .values() return a list in Python 2 and a view object in Python 3. Since Python 3.7, dictionaries preserve insertion order.
So when you say "It would be best if the order of the dictionaries is preserved when put into Layers", on those older versions you cannot rely on it. For example, if you rename your dictionaries from "Layer0, Layer1, Layer2" to "A, B, C," Layers.keys() may print in the order "A, C, B" regardless of the order you used when building the dictionary. All this shows is where the hashes of "B" and "C" happened to place the entries in the table; it doesn't tell you anything about the structure of your dictionary. If you need a guaranteed order on such a version, use collections.OrderedDict.
Note that you can iterate over a dictionary directly, which yields its keys; use .items() to get the keys and values together.
As a side note, this hash function is what allows a dictionary to do crazy fast lookups. A good hash function will give you constant time [O(1)] lookup on average, meaning you can check if a given item is in your dictionary in roughly the same amount of time whether the dictionary contains ten items or ten million. Pretty cool.
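Putting the existing dictionaries into an outer one could look like this. The contents of the three layers are made up, since the originals are not shown, and the ordering check assumes Python 3.7 or later, where insertion order is preserved.

```python
Layer0 = {"name": "base"}      # hypothetical contents; the originals are not shown
Layer1 = {"name": "middle"}
Layer2 = {"name": "top"}

Layers = {"Layer0": Layer0, "Layer1": Layer1, "Layer2": Layer2}

# The outer dict stores references, not copies: changes show up in both places.
Layers["Layer1"]["visible"] = True
shared = Layer1["visible"]

# Iterating a dict directly yields its keys; .items() yields (key, value) pairs.
names = [name for name in Layers]
layer_names = [layer["name"] for _, layer in Layers.items()]
```

Because only references are stored, none of the code that already fills Layer0, Layer1 and Layer2 needs to be rewritten.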
