I'd like to build a dictionary in Python in which different keys refer to the same element. I have this dictionary:
persons = {"George":'G.MacDonald', "Luke":'G.MacDonald', "Larry":'G.MacDonald'}
All the keys refer to identical strings, but the strings are stored at different memory locations inside the program. I'd like to make a dictionary in which all these keys refer to the same object. Is that possible?
You could do something like:
import itertools as it
unique_dict = {}
value_key=lambda x: x[1]
sorted_items = sorted(your_current_dict.items(), key=value_key)
for value, group in it.groupby(sorted_items, key=value_key):
    # each element of group is a (key, value) pair
    for key, _ in group:
        unique_dict[key] = value
This transforms your dictionary into one where equal values of any kind (as long as they are comparable) are represented by a single shared object. If your values are not comparable (but are hashable), you could use a temporary dict:
from collections import defaultdict
unique_dict = {}
tmp_dict = defaultdict(list)
for key, value in your_current_dict.items():
    tmp_dict[value].append(key)
for value, keys in tmp_dict.items():
    unique_dict.update(zip(keys, [value] * len(keys)))
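A shorter variant of the same idea, as a minimal sketch: dict.setdefault keeps the first object seen for each distinct value and hands it back for every later equal value. The names sample and canonical are made up for illustration, and the values only need to be hashable.

```python
# Hypothetical sample data: equal tuples built at runtime,
# so they may start out as distinct objects
sample = {name: tuple(["G", "MacDonald"]) for name in ("George", "Luke", "Larry")}

# Maps each distinct value to the first object seen with that value
canonical = {}
unique_dict = {key: canonical.setdefault(value, value) for key, value in sample.items()}

# Every key now refers to one and the same object
values = list(unique_dict.values())
print(all(v is values[0] for v in values))
```

This needs only one pass over the items and no sorting, at the cost of keeping the small canonical dict around while it runs.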
If you happen to be using Python 3, sys.intern offers a very elegant solution:
import sys
for k in persons:
    persons[k] = sys.intern(persons[k])
In Python 2.7, you can do roughly the same thing with one extra step:
interned = { v:v for v in set(persons.itervalues()) }
for k in persons:
    persons[k] = interned[persons[k]]
In 2.x (< 2.7), you can write interned = dict( (v, v) for … ) instead.
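To check that the values really are shared afterwards, you can compare object identity with `is`. This is a minimal sketch assuming Python 3; the strings are built at runtime with join only so that they are not already shared by the compiler, and the variable names are illustrative.

```python
import sys

# Equal strings built at runtime (illustrative data)
persons = {name: "".join(["G.", "MacDonald"]) for name in ("George", "Luke", "Larry")}

# Replace each value with its interned (canonical) copy
for k in persons:
    persons[k] = sys.intern(persons[k])

# After interning, all equal strings are the same object
values = list(persons.values())
print(all(v is values[0] for v in values))
```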
The code below generates random key, value pairs in a dictionary and then sorts the dictionary. I am wondering how to insert 100 random key, value pairs into the sorted dictionary and keep it sorted.
from random import randrange
mydict = {}
for i in range(10):
    mydict['key'+str(i)] = randrange(10)
sort_mydic=sorted(mydict.items(), key=lambda x: x[1])
Is there any reason you can't use OrderedDict for this purpose? That's what it is meant for. Maybe something like this (haven't checked that it compiles yet):
from random import randrange
from collections import OrderedDict
mydict = {}
for i in range(10):
    mydict['key'+str(i)] = randrange(10)
sort_mydic = OrderedDict(sorted(mydict.items(), key=lambda x: x[1]))
OrderedDict behaves like any other dictionary except that it is guaranteed to be in insertion order and can be rearranged as necessary. In fact, there's probably a "better" way than the above to do what you want that does not involve an intermediate dict.
Based on the description above, I would skip the step of creating the dictionary. If you start from a non-empty dictionary, run the code above first. Once it is set up, use bisect and insert as below. Note that bisect assumes sort_mydic is a list of (key, value) tuples sorted by key.
import bisect
new_item = randrange(10)
new_key = 'key' + str(new_item)
# find the index where to insert
idx = bisect.bisect(sort_mydic, (new_key, -1))  # the second element should be smaller than any value
# check if the key is already used or not
if idx == len(sort_mydic) or sort_mydic[idx][0] != new_key:
    sort_mydic.insert(idx, (new_key, new_item))
else:
    # update the value -- as in dictionary
    sort_mydic[idx] = (new_key, new_item)
In case you need to retrieve an item:
def get_item(key):
    idx = bisect.bisect(sort_mydic, (key, -1))
    if idx == len(sort_mydic) or sort_mydic[idx][0] != key:
        return None  # there is no such item
    else:
        return sort_mydic[idx][1]
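If the list should stay ordered by value rather than by key (as in the question), one option is to store (value, key) tuples and let bisect.insort find the position. This is only a sketch with made-up variable names, not a drop-in replacement for the code above; note that lookups by key are then no longer a binary search.

```python
import bisect
from random import randrange

# Start with 10 random pairs, stored as (value, key) so tuple order is by value
pairs = sorted((randrange(10), 'key' + str(i)) for i in range(10))

# Insert 100 more pairs; insort keeps the list sorted after every insertion
for i in range(10, 110):
    bisect.insort(pairs, (randrange(10), 'key' + str(i)))

# Verify the ordering invariant
is_sorted = all(pairs[i] <= pairs[i + 1] for i in range(len(pairs) - 1))
print(len(pairs), is_sorted)
```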
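Dictionaries keep their keys in insertion order in Python 3.6+ (an implementation detail of CPython 3.6, guaranteed by the language from 3.7).
In older versions a plain dict has no defined order, so it cannot be kept sorted; use collections.OrderedDict there.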
How do I sort a dictionary by value?
Solution:
Insert new key-value pairs
Sort
from random import randrange
def make_rand_dict(prefix, n_items=2):
    keys = [prefix + str(i) for i in range(n_items)]
    return {k: randrange(10) for k in keys}
def sort_dict(d):
    return {k: v for k, v in sorted(d.items())}
def main():
    # Make dict with sorted and random
    # key-value pair
    original = make_rand_dict("key", n_items=2)
    original = sort_dict(original)
    # New random key-value pairs
    # Set n_items=100 for more pairs
    update = make_rand_dict("aye", n_items=2)
    # Insert the random key-value pair from
    # the update dict into the original dict
    original.update(update)
    print("before sort:", original)
    # Sort the dict
    sorted_combined = sort_dict(original)
    print("after sort:", sorted_combined)
if __name__ == '__main__':
    main()
Result:
before sort: {'key0': 9, 'key1': 7, 'aye0': 2, 'aye1': 4}
after sort: {'aye0': 2, 'aye1': 4, 'key0': 9, 'key1': 7}
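Note that sort_dict above orders by key. If you want the order by value instead, as in the original question, a sketch could pass a key function to sorted. The helper name sort_dict_by_value is an assumption for illustration, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def sort_dict_by_value(d):
    # sorted() yields (key, value) pairs ordered by the value;
    # dict() then preserves that order (Python 3.7+)
    return dict(sorted(d.items(), key=lambda kv: kv[1]))

combined = {'key0': 9, 'key1': 7, 'aye0': 2, 'aye1': 4}
by_value = sort_dict_by_value(combined)
print(by_value)  # {'aye0': 2, 'aye1': 4, 'key1': 7, 'key0': 9}
```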
I have a dict that looks like the following:
d = {"employee": ['PER', 'ORG']}
I have a list of tags ('PER', 'ORG',....) that is extracted from the specific entity list.
for t in entities_with_tag:  # it includes words with a tag such as: [PER(['Bill']), ORG(['Microsoft']),
    f = t.tag  # this extracts only the tag, like: {'PER, ORG'}
    s = str(f)
    q.add(s)
Now, if {'PER, ORG'} is in q and matches one of d.values(), I want to get the corresponding key, which is 'employee'. I tried this, but it does not work.
for x in q:
    if str(x) in str(d.values()):
        print(d.keys())  # this prints all the keys of the dict
If I understand correctly, you should loop over the dictionary instead of the tag list. You can check whether the dictionary's tags are in the list using sets.
d = {"employee": ['PER', 'ORG'],
"located": ["ORG", "LOC"]}
q = ["PER", "ORG", "DOG", "CAT"]
qset = set(q)
for key, value in d.items():
    if set(value).issubset(qset):
        print(key)
Output:
employee
You mean with... nothing?
for x in q:
    if str(x) in d.values():
        print(d.keys())
What you can do is swap the keys and values in the dict and then look up by the tags.
tags = ('PER', 'ORG')
data = dict((tuple(val), key) for key, val in d.items())
print(data[tags])
Just be careful to convert the lists to tuples, since lists are not hashable.
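One caveat with swapping keys and values: if two keys share the same tag list, the later one overwrites the earlier one. A sketch of an inverse index that keeps all of them, using defaultdict(list); the "staff" entry is hypothetical and only added to show the collision.

```python
from collections import defaultdict

# "staff" is a made-up entry that shares its tags with "employee"
d = {"employee": ['PER', 'ORG'],
     "located": ['ORG', 'LOC'],
     "staff": ['PER', 'ORG']}

# Map each tag tuple to every key that has exactly those tags
by_tags = defaultdict(list)
for key, tags in d.items():
    by_tags[tuple(tags)].append(key)

print(by_tags[('PER', 'ORG')])  # ['employee', 'staff']
```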
Another solution would be to extract both key and value in a loop, but that is not efficient at all.
for x in q:
    if str(x) in str(d.values()):
        for key, val in d.items():
            if val == x:
                print(key)  # this prints only the keys whose value equals x
What you can do is make two lists: one containing the keys and one containing the values. Then look up the index of the required value in the list of values and use it to get the key from the list of keys.
d = {"employee": ['PER', 'ORG']}
key_list = list(d.keys())
val_list = list(d.values())
print(key_list[val_list.index(['PER', 'ORG'])])
Refer: https://www.geeksforgeeks.org/python-get-key-from-value-in-dictionary/
This is an example of a complex data structure. The depth of the structure is not fixed. To reference a specific datum in the structure I need an unknown number of indices (for list()) and keys (for dict()).
>>> x = [{'child': [{'text': 'ass'}, {'group': 'wef'}]}]
>>> x[0]['child'][0]['text']
'ass'
Now I want to have single keys for the values like this.
keys = {'ID01': [0]['child'][0]['text'],
'ID02': [1]['group']}
But this is not possible. Is there another pythonic way?
I think you need a couple of things here. First is a custom lookup function:
def lookup(obj, keys):
    for k in keys:
        obj = obj[k]
    return obj
Then a dictionary of keys to key list tuples:
keys = {'ID01': (0,'child',0,'text'),
'ID02': (1,'group')}
then you can do this:
lookup(x, keys['ID01']) # returns 'ass'
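The same path tuples can also be used to write values. Below is a minimal sketch that expresses lookup with functools.reduce and adds a hypothetical assign helper; the sample data and the name assign are assumptions for illustration only.

```python
from functools import reduce
from operator import getitem

def lookup(obj, keys):
    # Apply obj[k] for each k in turn, starting from obj
    return reduce(getitem, keys, obj)

def assign(obj, keys, value):
    # Walk to the parent container, then set the final key or index
    parent = reduce(getitem, keys[:-1], obj)
    parent[keys[-1]] = value

# Illustrative nested structure and path table
data = [{'child': [{'text': 'hello'}, {'group': 'wef'}]}]
paths = {'ID01': (0, 'child', 0, 'text'),
         'ID02': (0, 'child', 1, 'group')}

print(lookup(data, paths['ID02']))  # wef
assign(data, paths['ID01'], 'changed')
print(lookup(data, paths['ID01']))  # changed
```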
I've hit a bit of a problem with creating empty dictionaries within dictionaries while using fromkeys(); they all link to the same one.
Here's a quick bit of code to demonstrate what I mean:
a = dict.fromkeys( range( 3 ), {} )
for key in a:
    a[key][0] = key
The output I want is a[0][0]=0, a[1][0]=1, a[2][0]=2, yet they all equal 2, since it's editing the same dictionary 3 times.
If I define the dictionary like a = {0: {}, 1: {}, 2: {}}, it works, but that's not very practical if you need to build it from a bigger list.
With fromkeys, I've tried {}, dict(), dict.copy() and b={}; b.copy(). How would I go about doing this?
The problem is that {} is a single value to fromkeys, and not a factory. Therefore you get the single mutable dict, not individual copies of it.
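A small demonstration of the difference, as a sketch (the names shared and separate are illustrative): with fromkeys every key points at one dict object, while a comprehension evaluates {} once per key.

```python
# fromkeys evaluates {} once and stores that single object under every key
shared = dict.fromkeys(range(3), {})
for key in shared:
    shared[key][0] = key

# A dict comprehension evaluates {} once per key, giving independent dicts
separate = {key: {} for key in range(3)}
for key in separate:
    separate[key][0] = key

print(shared)    # {0: {0: 2}, 1: {0: 2}, 2: {0: 2}}
print(separate)  # {0: {0: 0}, 1: {0: 1}, 2: {0: 2}}
print(shared[0] is shared[1], separate[0] is separate[1])  # True False
```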
defaultdict is one way to create a dict that has a builtin factory.
from collections import defaultdict as dd
from pprint import pprint as pp
a = dd(dict)
for key in range(3):
    a[key][0] = key
pp(a)
If you want something more strictly evaluated, you will need to use a dict comprehension or map.
a = {key: {} for key in range(3)}
But then, if you're going to do that, you may as well get it all done
a = {key: {0: key} for key in range(3)}
Just iterate over keys and insert a dict for each key:
{k: {0: k} for k in keys}
Here, keys is an iterable of hashable values such as range(3) in your example.
I have a dictionary like this :
d = {'v03':["elem_A","elem_B","elem_C"],'v02':["elem_A","elem_D","elem_C"],'v01':["elem_A","elem_E"]}
How would you return a new dictionary containing, for each key, the elements that are not in the list of the highest key?
In this case:
d2 = {'v02':['elem_D'],'v01':["elem_E"]}
Thank you,
I prefer to do differences with the builtin data type designed for it: sets.
It is also preferable to write loops rather than elaborate comprehensions. One-liners are clever, but understandable code that you can return to and understand is even better.
d = {'v03':["elem_A","elem_B","elem_C"],'v02':["elem_A","elem_D","elem_C"],'v01':["elem_A","elem_E"]}
last = None
d2 = {}
for key in sorted(d.keys()):
    if last:
        if set(d[last]) - set(d[key]):
            d2[last] = sorted(set(d[last]) - set(d[key]))
    last = key
print(d2)
{'v01': ['elem_E'], 'v02': ['elem_D']}
from collections import defaultdict
myNewDict = defaultdict(list)
all_keys = sorted(d.keys())  # d.keys().sort() only works in Python 2
max_value = all_keys[-1]
for key in d:
    if key != max_value:
        for value in d[key]:
            if value not in d[max_value]:
                myNewDict[key].append(value)
You can get fancier with set operations by taking the set difference between the values in d[max_value] and each of the other keys but first I think you should get comfortable working with dictionaries and lists.
defaultdict(<type 'list'>, {'v01': ['elem_E'], 'v02': ['elem_D']})
One reason not to use sets is that the solution does not generalize: sets can only contain hashable objects. If your values are lists of lists, the members (sublists) are not hashable, so you can't use set operations.
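For example, with lists of lists a plain membership test still works, because `in` on a list only needs equality, not hashing. A minimal sketch with made-up data:

```python
# Values are lists of lists, so set(d[key]) would raise TypeError
d = {'v03': [['a'], ['b']],
     'v02': [['a'], ['d']],
     'v01': [['a'], ['e']]}

max_key = max(d)
reference = d[max_key]

d2 = {}
for key, values in d.items():
    if key == max_key:
        continue
    # List membership compares with ==, which works for unhashable items
    missing = [v for v in values if v not in reference]
    if missing:
        d2[key] = missing

print(d2)  # {'v02': [['d']], 'v01': [['e']]}
```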
Depending on your python version, you may be able to get this done with only one line, using dict comprehension:
>>> d2 = {k:[v for v in values if not v in d.get(max(d.keys()))] for k, values in d.items()}
>>> d2
{'v01': ['elem_E'], 'v02': ['elem_D'], 'v03': []}
This builds a copy of dict d in which each list is stripped of all items stored under the max key. The resulting dict looks more or less like what you are going for.
If you don't want the empty list at key v03, filter the result with another dict comprehension:
>>> {k:v for k,v in d2.items() if len(v) > 0}
{'v01': ['elem_E'], 'v02': ['elem_D']}
EDIT:
In case your original dict has a very large key set (or the operation is required frequently), you might also want to replace the expression d.get(max(d.keys())) with a previously assigned list variable for performance; otherwise it is re-evaluated for every element. This roughly halves the run time: the following runs 100,000 times in 1.5 secs on my machine, whereas the unsubstituted expression takes more than 3 seconds.
>>> bl = d.get(max(d.keys()))
>>> d2 = {k:v for k,v in {k:[v for v in values if not v in bl] for k, values in d.items()}.items() if len(v) > 0}
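If you want to measure this yourself, a sketch with timeit could look like the following. The function names inline and hoisted are made up, and the timings depend on your machine, so none are shown here.

```python
import timeit

d = {'v03': ["elem_A", "elem_B", "elem_C"],
     'v02': ["elem_A", "elem_D", "elem_C"],
     'v01': ["elem_A", "elem_E"]}

def inline():
    # max(d.keys()) is re-evaluated for every element v
    return {k: [v for v in values if v not in d.get(max(d.keys()))]
            for k, values in d.items()}

def hoisted():
    # The reference list is looked up once, before the comprehension
    bl = d.get(max(d.keys()))
    return {k: [v for v in values if v not in bl] for k, values in d.items()}

# Both variants must produce the same result
same_result = inline() == hoisted()

t_inline = timeit.timeit(inline, number=1000)
t_hoisted = timeit.timeit(hoisted, number=1000)
print(same_result, t_inline, t_hoisted)
```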