I have a dictionary and a list of values such as:
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
liste = [2, 3]
I would like:
result = ['b', 'c']
If I have a very large dictionary, what is the most optimal way to do this?
The keys have unique values.
The idea here is to build a reverse_dict for efficient lookup; otherwise the complexity can be O(mn), where m is the number of keys and n is the length of liste. Several keys may share the same value, so each value maps to a list of keys.
import collections

result = []
reverse_dict = collections.defaultdict(list)
for key, value in dictionary.items():
    reverse_dict[value].append(key)
for v in liste:
    result.extend(reverse_dict[v])
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
liste = [2, 3]
result = []
for key, value in dictionary.items():
    if value in liste:
        result.append(key)
As a list comprehension:
result = [key for key, value in dictionary.items() if value in liste]
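Because `value in liste` scans the list on every check, one possible refinement (assuming the values are hashable) is to convert liste to a set first. This is only a sketch; the name `wanted` is made up for illustration.

```python
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
liste = [2, 3]

# Set membership is O(1) on average, so the whole pass is roughly
# O(m + n) instead of O(m * n). `wanted` is a hypothetical name.
wanted = set(liste)
result = [key for key, value in dictionary.items() if value in wanted]
print(result)  # ['b', 'c']
```

Note that the keys come back in dictionary order, not in the order of liste.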
Lookup by value is exactly what dictionaries are not meant for.
Since you said the dictionary is very large, the most efficient solution relies on how many times you're operating on it.
If it's going to be a frequent task, you may need to create a reverse dictionary. This can be done with a dict comprehension:
rev_dict = {v: k for k, v in dictionary.items()}
Then you can look up values in the dictionary the way it's designed for.
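As a rough illustration of that lookup, assuming the values are unique and hashable (the sample data and the name `liste` are borrowed from the question):

```python
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
liste = [2, 3]

# Build the reverse mapping once.
rev_dict = {v: k for k, v in dictionary.items()}

# Each lookup is now a plain dict access; values missing from
# the dictionary are skipped instead of raising KeyError.
result = [rev_dict[v] for v in liste if v in rev_dict]
print(result)  # ['b', 'c']
```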
Conversely, an isolated lookup does not justify creating a copy of a very large dictionary, which can be costly in both memory and time. So I came up with this awful, messy, un-Pythonic construct, which exploits the ordered feature of Python 3.7+ dictionaries:
list(dictionary.keys())[list(dictionary.values()).index(your_value)]
Mind you, use this only if you're desperate.
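For a one-off lookup, a lazier variant that avoids building intermediate lists might look like the sketch below. It is not benchmarked here, and `your_value` is just the placeholder name used above.

```python
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
your_value = 3

# The generator stops at the first match; the second argument to
# next() is returned when the value is not present at all.
key = next((k for k, v in dictionary.items() if v == your_value), None)
print(key)  # c

missing = next((k for k, v in dictionary.items() if v == 99), None)
print(missing)  # None
```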
Obviously, the best solution is the one everybody knows but no-one wants: hardcode a reverse dictionary before running the script.
This can also bring some issues to your attention that the above solutions are unable to avoid at runtime (not without being explicitly handled), e.g. reversing your dict may result in having duplicate keys, which is not illegal, but will result in a sadly shortened dictionary.
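A small example of that shortening, using made-up sample data in which two keys share a value:

```python
dictionary = {'a': 1, 'b': 2, 'c': 2}

# 'b' and 'c' both map to 2, so the later pair overwrites the earlier one.
rev_dict = {v: k for k, v in dictionary.items()}
print(rev_dict)  # {1: 'a', 2: 'c'}

# A length check is one simple way to notice the loss.
lost = len(dictionary) - len(rev_dict)
print(lost)  # 1
```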
Related
I have a very large nested dictionary and below I am showing a sample of it.
tmp_dict = {1: {'A': 1, 'B': 2},
2: {'A': 0, 'B': 0}}
The question is: is there a better or more efficient way to add a new key-value pair to every inner dict of my existing nested dict? I am currently looping through the keys to do so. Here is an example:
>>> for k in tmp_dict.keys():
...     tmp_dict[k].update({'C': 1})
A simple method would be like so:
for key in tmp_dict:
    tmp_dict[key]['C'] = 1
Or you could use a dictionary comprehension, as sushanth suggested:
tmp_dict = {k: {**v, 'C': 1} for k, v in tmp_dict.items()}
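The double asterisk in {**v, 'C': 1} unpacks the key-value pairs of v into the new dictionary literal (see PEP 448), which is why this works.
In terms of complexity, both loops are O(N) in the number of outer keys. The dict comprehension is also a single pass, but it copies every inner dictionary, so it costs O(N·M) for inner dictionaries of size M and uses extra memory. Either way, your solution should have a relatively quick run time.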
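One practical difference between the two approaches, sketched here with the sample data from the question: the comprehension builds new inner dicts, while the loop mutates the existing ones. The names `inner`, `before` and `after` are illustrative only.

```python
tmp_dict = {1: {'A': 1, 'B': 2}, 2: {'A': 0, 'B': 0}}
inner = tmp_dict[1]  # keep a reference to one of the inner dicts

# Comprehension: creates brand-new inner dicts, originals are untouched.
new_dict = {k: {**v, 'C': 1} for k, v in tmp_dict.items()}
before = 'C' in inner
print(before)  # False

# In-place loop: mutates the existing inner dicts.
for key in tmp_dict:
    tmp_dict[key]['C'] = 1
after = 'C' in inner
print(after)  # True
```

If other parts of a program hold references to the inner dicts, only the in-place loop will be visible to them.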
Here's a function that is supposed to swap dictionary keys and values. {'a': 3} is supposed to become {3: 'a'}.
def change_keys_values(d):
    for key in d:
        value = d[key]
        del d[key]
        d[value] = key
    return d
I've realized that this function shouldn't work because I'm changing dictionary keys during iteration. This is the error I get: "dictionary keys changed during iteration". However, I don't get this error on a dictionary with three key-value pairs. So, while {'a': 3, 't': 8, 'r': 2, 'z': 44, 'u': 1, 'b': 4} results in the above-mentioned error, {'a': 3, 't': 8, 'r': 2} is handled without any issues. I'm using Python 3. What is causing this?
You should never add or remove keys in a dictionary while iterating over it. The reason lies in the way dictionaries are usually implemented.
Hash Tables
Basically, when you create a dictionary, each item is indexed using the hash value of the key.
Dictionaries are implemented sparsely
Another implementation detail involves the fact that dictionaries are implemented in a sparse manner. Namely, when you create a dictionary, there are empty places in memory (called buckets). When you add or remove elements from a dictionary, it may hit a threshold where the dictionary key hashes are re-evaluated and as a consequence, the indexes are changed.
Roughly speaking, these two points are the reason behind the problem you are observing.
Moral Point: Never add or remove keys in a dictionary while looping over it.
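A minimal sketch of the problem and of the usual workaround, assuming CPython 3. It triggers the closely related "changed size" error rather than the exact message from the question, because that one is raised reliably whenever a key is added mid-iteration.

```python
d = {'a': 3, 't': 8}

# Adding a key while iterating over the dict itself raises RuntimeError.
raised = False
try:
    for key in d:
        d[key + '_copy'] = d[key]
except RuntimeError as exc:
    raised = True
    print(exc)  # dictionary changed size during iteration

# Iterating over a snapshot of the keys is safe.
d = {'a': 3, 't': 8}
for key in list(d):
    d[key + '_copy'] = d[key]
print(d)  # {'a': 3, 't': 8, 'a_copy': 3, 't_copy': 8}
```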
Here's some simple code to do what you want:
def change_keys_values(d):
    new_dict = {value: key for key, value in d.items()}
    return new_dict
You need to verify that the values are unique, after that, no problem :)
But be sure not to change a dictionary while iterating over it. Otherwise, you could encounter an already changed entry that gets processed twice or even more often. I suggest building a new dictionary instead:
def invert(dict_: dict) -> dict:
    # Inversion is only possible if the values are unique
    if len(set(dict_.values())) == len(dict_):
        return {b: a for a, b in dict_.items()}
    else:
        raise ValueError("Dictionary values contain duplicates. Inversion not possible!")

print(invert({"a": 1, "b": 2, "c": 3, "d": 4}))  # works
print(invert({"a": 1, "b": 2, "c": 3, "d": 3}))  # raises ValueError
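To fix your issue, iterate over a copy of the keys rather than over the original dict:
def change_keys_values(d):
    # list(d) takes a snapshot of the keys, so d can be modified safely.
    # This assumes no value is also a key of d.
    for key in list(d):
        value = d[key]
        del d[key]
        d[value] = key
    return d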
A good alternative using zip would be:
def change_keys_values(d):
    keys, values = zip(*d.items())
    return dict(zip(values, keys))
If I have dict like this:
some_dict = {'a': 1, 'b': 2, 'c': 2}
How can I get the keys that have the value 2, like this:
some_dict.search_keys(2)
This is just an example. Assume some_dict has many thousands of keys or more.
You can do it like this:
[key for key, value in some_dict.items() if value == 2]
This uses a list comprehension to iterate through the pairs of (key, value) items, selecting those keys whose value equals 2.
Note that this requires a linear search through the dictionary, so it is O(n). If this performance is not acceptable, you will probably need to create and maintain another data structure that indexes your dictionary by value.
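One possible shape for such an index, as a sketch only: a second dictionary mapping each value to the list of keys that hold it. The name `keys_by_value` is made up, and the values are assumed to be hashable.

```python
from collections import defaultdict

some_dict = {'a': 1, 'b': 2, 'c': 2}

# Build the index once in O(n). `keys_by_value` is a hypothetical name.
keys_by_value = defaultdict(list)
for key, value in some_dict.items():
    keys_by_value[value].append(key)

# Each later query is an average O(1) lookup.
found = keys_by_value.get(2, [])
print(found)  # ['b', 'c']
```

The index has to be updated whenever some_dict changes, which is the maintenance cost mentioned above.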
You can also use a dictionary comprehension if you want the result to be a dictionary:
{x: y for x, y in some_dict.items() if y == 2}
Output:
{'b': 2, 'c': 2}
You can also use a generator to produce the matching keys one by one, instead of returning all of them at once.
The function search_keys returns a generator:
def search_keys(in_dict, query_val):
    return (key for key, val in in_dict.items() if val == query_val)

# get keys, one by one
for found_key in search_keys(some_dict, 2):
    print(found_key)
Python language.
I know how to remove keys in a dictionary, for example:
def remove_zeros(dict):
    dict = {'B': 0, 'C': 7, 'A': 1, 'D': 0, 'E': 5}
    del dict['B']  # removes a single key
    return dict
I want to know how to remove all values with zero from the dictionary and then sort the keys alphabetically. Using the example above, I'd want to get ['A', 'C', 'E'] as a result, eliminating key values B and D completely.
To sort do I just use dict.sort() ?
Is there a special function I must use?
sorted(k for (k, v) in D.items() if v)
Sometimes it helps to step back and focus on your intent rather than on one specific operation. Python has list and dictionary comprehensions, which you can use to filter the input down to the results you want. To filter out all entries in your dictionary whose value is 0, it's simply this:
{k: v for k, v in d.items() if v != 0}
Now, dictionaries are hash tables and have no sort method of their own (although since Python 3.7 they do preserve insertion order). There is a class in collections that can help you with this. Using the OrderedDict class to hold the sorted result, the code will end up like this:
OrderedDict(sorted(((k, v) for k, v in d.items() if v != 0), key=lambda t: t[0]))
Also, it's highly inadvisable to name your variables with the same name as a builtin type or method, such as dict.
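Putting the pieces together, a minimal sketch of the function from the question could look like this. The parameter is named `d` rather than `dict`, and `scores` is an illustrative name for the sample data.

```python
def remove_zeros(d):
    """Return the keys whose value is non-zero, sorted alphabetically."""
    return sorted(k for k, v in d.items() if v != 0)


scores = {'B': 0, 'C': 7, 'A': 1, 'D': 0, 'E': 5}
result = remove_zeros(scores)
print(result)  # ['A', 'C', 'E']
```

The original dictionary is left unchanged; only a new sorted list of keys is returned.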
I want to copy pairs from this dictionary based on their values so they can be assigned to new variables. From my research it seems easy to do this based on keys, but in my case the values are what I'm tracking.
things = ({'alpha': 1, 'beta': 2, 'cheese': 3, 'delta': 4})
And in made-up language I can assign variables like so -
smaller_things = all values =3 in things
You can use .items() to traverse through the pairs and make changes like this:
smaller_things = {}
for k, v in things.items():
    if v == 3:
        smaller_things[k] = v
If you want a one liner and only need the keys back, list comprehension will do it:
smaller_things = [k for k, v in things.items() if v == 3]
>>> things = { 'a': 3, 'b': 2, 'c': 3 }
>>> [k for k, v in things.items() if v == 3]
['a', 'c']
You can just reverse the dictionary and pull from that:
keys_values = {1: "a", 2: "b"}
values_keys = dict(zip(keys_values.values(), keys_values.keys()))
print(values_keys)
# {'a': 1, 'b': 2}
That way you can do whatever you need to with standard dictionary syntax.
The potential drawback is if you have non-unique values in the original dictionary; items in the original with the same value will have the same key in the reversed dictionary, so you can't guarantee which of the original keys would be the new value. And potentially some values are unhashable (such as lists).
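To illustrate the unhashable case, here is a small sketch with made-up data whose values are lists. Converting each list to a tuple is one assumed workaround, not the only one.

```python
keys_values = {1: ["a", "x"], 2: ["b"]}

# Lists cannot be dictionary keys, so a direct reversal fails.
failed = False
try:
    values_keys = {v: k for k, v in keys_values.items()}
except TypeError as exc:
    failed = True
    print(exc)  # unhashable type: 'list'

# Assumed workaround: turn each list into a hashable tuple first.
values_keys = {tuple(v): k for k, v in keys_values.items()}
print(values_keys)  # {('a', 'x'): 1, ('b',): 2}
```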
Unless you have a compulsive need to be clever, iterating over items is easier:
for key, val in my_dict.items():
    if matches_condition(val):
        do_something(key)
This answer reflects my understanding of your question.
A dictionary is a kind of hash table; its main purpose is to provide non-integer indexing for values. The keys of a dictionary act like indexes.
Consider an array: its elements are addressed by index, so we go from an index to an element, not from an element to its index. In the same way, a dictionary goes from keys (non-integer indexes) to values.
There is one more implication: keys must be hashable (in practice, immutable), whereas values can be any object, including mutable, unhashable ones, and can be changed at any time.
In short, addressing anything by its value in a dictionary is not a good approach.