I'm just practising with python. I have a dictionary in the form:
my_dict = [{'word': 'aa', 'value': 2},
{'word': 'aah', 'value': 6},
{'word': 'aahed', 'value': 9}]
How would I go about ordering this dictionary such that if I had thousands of words I would then be able to select the top 100 based on their value ranking? e.g., from just the above example:
scrabble_rank = [{'word': 'aahed', 'rank': 1},
{'word': 'aah', 'rank': 2},
{'word': 'aa', 'rank': 3}]
Firstly, that's not a dictionary; it's a list of dictionaries. Which is good, because a list can be sorted in place and sliced by position, whereas a dictionary (even though it keeps insertion order since Python 3.7) cannot.
You can sort the list by each entry's 'value', highest first, by passing a key function to sort:
my_dict.sort(key=lambda x: x['value'], reverse=True)
Is this what you are looking for:
scrabble_rank = [{'word':it[1], 'rank':idx+1} for idx,it in enumerate(sorted([[item['value'],item['word']] for item in my_dict],reverse=True))]
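A more readable sketch of the same idea, in case it helps. The function name rank_words and the top_n parameter are made up for illustration; it assumes every record has 'word' and 'value' keys.

```python
def rank_words(records, top_n=100):
    """Return the top_n records as {'word', 'rank'} dicts, highest value first."""
    # sorted() is stable, so records with equal values keep their input order.
    ordered = sorted(records, key=lambda rec: rec['value'], reverse=True)
    # Slice first, then number the survivors starting at 1.
    return [{'word': rec['word'], 'rank': pos}
            for pos, rec in enumerate(ordered[:top_n], start=1)]


my_dict = [{'word': 'aa', 'value': 2},
           {'word': 'aah', 'value': 6},
           {'word': 'aahed', 'value': 9}]

scrabble_rank = rank_words(my_dict)
print(scrabble_rank)
```

Note that ties are handled differently from the one-liner: here equal values keep their original order, while sorting [value, word] pairs in reverse breaks ties by word, descending.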
Using Pandas Library:
import pandas as pd
There is this one-liner:
scrabble_rank = pd.DataFrame(my_dict).sort_values('value', ascending=False).reset_index(drop=True).reset_index().to_dict(orient='records')
It outputs:
[{'index': 0, 'value': 9, 'word': 'aahed'},
{'index': 1, 'value': 6, 'word': 'aah'},
{'index': 2, 'value': 2, 'word': 'aa'}]
Basically, it reads your records into a DataFrame, sorts them by value in descending order, drops the original index (order), adds the new 0-based index as a column, and exports the rows as records (your previous format).
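If you want a 1-based 'rank' field rather than the 0-based 'index' column, one possible sketch is below. It assumes pandas is installed; the head(100) cut-off is taken from the question's "top 100".

```python
import pandas as pd

my_dict = [{'word': 'aa', 'value': 2},
           {'word': 'aah', 'value': 6},
           {'word': 'aahed', 'value': 9}]

# Sort descending and renumber the rows 0..n-1.
df = pd.DataFrame(my_dict).sort_values('value', ascending=False).reset_index(drop=True)

# Turn the 0-based row position into a 1-based rank column.
df['rank'] = df.index + 1

# Keep at most the first 100 rows and only the columns the question asked for.
scrabble_rank = df[['word', 'rank']].head(100).to_dict(orient='records')
print(scrabble_rank)
```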
You can use heapq:
import heapq
my_dict = [{'word': 'aa', 'value': 2},
{'word': 'aah', 'value': 6},
{'word': 'aahed', 'value': 9}]
# Select the top 3 records based on `value`
values_sorted = heapq.nlargest(3,                         # fetch top 3
                               my_dict,                   # list of dicts to search
                               key=lambda x: x['value'])  # rank records by `value`
print(values_sorted)
[{'word': 'aahed', 'value': 9}, {'word': 'aah', 'value': 6}, {'word': 'aa', 'value': 2}]
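For what it's worth, heapq.nlargest(n, iterable, key=...) is documented as equivalent to sorted(iterable, key=key, reverse=True)[:n]. A small check of that, using a slightly longer sample list (the two extra words are only there for illustration):

```python
import heapq

words = [{'word': 'aa', 'value': 2},
         {'word': 'aah', 'value': 6},
         {'word': 'aahed', 'value': 9},
         {'word': 'aahing', 'value': 10},
         {'word': 'aal', 'value': 3}]


def by_value(record):
    return record['value']


# Top 2 via a heap: avoids sorting the whole list.
top_heap = heapq.nlargest(2, words, key=by_value)

# Top 2 via a full sort followed by a slice.
top_sorted = sorted(words, key=by_value, reverse=True)[:2]

print(top_heap)
print(top_heap == top_sorted)
```

nlargest tends to pay off when n is small relative to the list; for larger n, a plain sorted() and slice is usually the better choice.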
Related
I have a question about converting the keys of a dictionary.
First, I have this type of word count in Data Frame.
[Example]
dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}
I want to get this result.
[Result]
result = {'name': 'forest', 'value': 10,
'name': 'station', 'value': 3,
'name': 'office', 'value': 7,
'name': 'park', 'value': 2}
Please check this issue.
As Rakesh said:
dict cannot have duplicate keys
The closest you can get to what you want is a list of dictionaries, built like this:
my_dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}
result = list(map(lambda x: {'name': x[0], 'value': x[1]}, my_dict.items()))
You will get
result = [
{'name': 'forest', 'value': 10},
{'name': 'station', 'value': 3},
{'name': 'office', 'value': 7},
{'name': 'park', 'value': 2},
]
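The same list can also be built with a list comprehension, which some people find easier to read than map with a lambda. A minimal sketch, assuming Python 3.7+ so that the dictionary keeps its insertion order:

```python
my_dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}

# One small dictionary per (key, value) pair of the original.
result = [{'name': name, 'value': count} for name, count in my_dict.items()]
print(result)
```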
As Rakesh said, you can't have duplicate keys in a dictionary.
You can simply try this, which keys each record by its position instead:
# Named my_dict rather than dict, so the built-in dict type is not shadowed
my_dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}
result = {}
for count, key in enumerate(my_dict):
    result[count] = {'name': key, 'value': my_dict[key]}
print(result)
I have a nested dictionary mydict = {'Item1': {'name': 'pen', 'price': 2}, 'Item2': {'name': 'apple', 'price': 0.69}}. How do I get all the values of the same key? For example, I want to get a list [2, 0.69] corresponding to the key 'price'. What is the best way to do that without using a loop?
I doubt it is possible literally without any loop, so here is a solution using a list comprehension:
mydict = {'Item1': {'name': 'pen', 'price': 2}, 'Item2': {'name': 'apple', 'price': 0.69}}
output = [v["price"] for v in mydict.values()]
print(output)
Or a solution using map:
output = list(map(lambda v: v["price"], mydict.values()))
print(output)
All outputs:
[2, 0.69]
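A variant of the map approach that avoids the lambda by using operator.itemgetter from the standard library. It still iterates internally, and it assumes every inner dictionary has a 'price' key (a missing one raises KeyError):

```python
from operator import itemgetter

mydict = {'Item1': {'name': 'pen', 'price': 2},
          'Item2': {'name': 'apple', 'price': 0.69}}

# itemgetter('price') builds a callable equivalent to lambda v: v['price'].
prices = list(map(itemgetter('price'), mydict.values()))
print(prices)
```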
I am trying to create a list of dictionaries comprised of the same key paired up with sequentially selected values from a different list.
The solutions here did not help me:
Creating a unique list of dictionaries from a list of dictionaries which contains same keys but different values,
One liner: creating a dictionary from list with indices as keys,
Create dictionary from list python
ids = [8, 9, 10, 11, 12]
field = ['person']
container = []
The closest I've gotten is:
for i in ids:
    container.append(dict(zip(field, [i for i in ids])))
Which results in:
[{'person': 8}, {'person': 8}, {'person': 8}, {'person': 8}, {'person': 8}]
What I need:
[{'person': 8}, {'person': 9}, {'person': 10}, {'person': 11}, {'person': 12}]
You're already iterating over ids with the for loop, you don't need a list comprehension as well.
And you don't need zip. It's not doing anything useful, because it always stops when it reaches the end of the shortest sequence. Since field only has one element, it just uses the first element of [i for i in ids], which is why you always get 8.
for i in ids:
    container.append({field[0]: i})
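To see the truncation for yourself, here is a small sketch using the ids and field from the question. zip pairs items up only until the shorter input runs out, so a one-element field yields a single pair:

```python
ids = [8, 9, 10, 11, 12]
field = ['person']

# zip stops after one pair because field has only one element.
pairs = list(zip(field, ids))
print(pairs)

# Building one dictionary per id directly, without zip.
container = []
for i in ids:
    container.append({field[0]: i})
print(container)
```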
Why bother with zip if field only has one element, and you are already doing a list comprehension?
container = [{field[0]: i} for i in ids]
If the ids are consecutive (you mentioned sequentially selected values), you can use range. And if the field list has only one element, why not use the string directly?
Then this is a simple one-liner:
print([{'person': i} for i in range(8, 13)])
#[{'person': 8}, {'person': 9}, {'person': 10}, {'person': 11}, {'person': 12}]
If the list has multiple elements, you still don't need zip:
fields = ['person', 'animal']
print([{item: i} for i in range(8, 13) for item in fields])
#[{'person': 8}, {'animal': 8}, {'person': 9}, {'animal': 9}, {'person': 10}, {'animal': 10}, {'person': 11}, {'animal': 11}, {'person': 12}, {'animal': 12}]
Another alternative is itertools.product
from itertools import product
ids = [8, 9, 10, 11, 12]
field = ['person']
print([{item[0]: item[1]} for item in product(field, ids)])
#[{'person': 8}, {'person': 9}, {'person': 10}, {'person': 11}, {'person': 12}]
from itertools import product
ids = [8, 9, 10, 11, 12]
field = ['person', 'field']
print([{item[0]: item[1]} for item in product(field, ids)])
#[{'person': 8}, {'person': 9}, {'person': 10}, {'person': 11}, {'person': 12}, {'field': 8}, {'field': 9}, {'field': 10}, {'field': 11}, {'field': 12}]
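One thing to watch with product is the argument order, since it decides how the results are grouped. A short sketch with a smaller ids list (shortened here only to keep the output readable):

```python
from itertools import product

ids = [8, 9]
fields = ['person', 'animal']

# Fields first: all ids for 'person', then all ids for 'animal'.
by_field = [{f: i} for f, i in product(fields, ids)]

# Ids first: every field for id 8, then every field for id 9.
by_id = [{f: i} for i, f in product(ids, fields)]

print(by_field)
print(by_id)
```

The second form gives the same ordering as a nested comprehension that loops over the ids on the outside and the fields on the inside.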
You can do something like this if you have one field:
for i in ids:
    container.append(dict(zip(field, [i])))
For multiple field items, you can do something like this:
from itertools import product
for i, j in product(ids, field):
    container.append(dict(zip([j], [i])))  # field name is the key, id is the value
I have a dataframe in pandas as follows:
df = pd.DataFrame({'key1': ['abcd', 'defg', 'hijk', 'abcd'],
'key2': ['zxy', 'uvq', 'pqr', 'lkj'],
'value': [1, 2, 4, 5]})
I am trying to create a dictionary with a key of key1 and a nested dictionary of key2 and value. I have tried the following:
dct = df.groupby('key1')[['key2', 'value']].apply(lambda x: x.set_index('key2').to_dict(orient='index')).to_dict()
dct
{'abcd': {'zxy': {'value': 1}, 'lkj': {'value': 5}},
'defg': {'uvq': {'value': 2}},
'hijk': {'pqr': {'value': 4}}}
Desired output:
{'abcd': {'zxy': 1, 'lkj': 5}, 'defg': {'uvq': 2}, 'hijk': {'pqr': 4}}
Using collections.defaultdict, you can construct a defaultdict of dict objects and add elements while iterating your dataframe:
from collections import defaultdict
d = defaultdict(dict)
for row in df.itertuples(index=False):
    d[row.key1][row.key2] = row.value
print(d)
defaultdict(dict,
{'abcd': {'lkj': 5, 'zxy': 1},
'defg': {'uvq': 2},
'hijk': {'pqr': 4}})
As defaultdict is a subclass of dict, this should require no further work.
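One caveat worth knowing: looking up a missing key on a defaultdict silently inserts it. If that matters, dict(d) gives a plain dictionary. The sketch below uses a list of tuples as a stand-in for the DataFrame rows, so it runs without pandas:

```python
from collections import defaultdict

# Stand-in for the (key1, key2, value) rows of the DataFrame.
rows = [('abcd', 'zxy', 1), ('defg', 'uvq', 2), ('hijk', 'pqr', 4), ('abcd', 'lkj', 5)]

d = defaultdict(dict)
for key1, key2, value in rows:
    d[key1][key2] = value

# Convert the outer mapping; the inner values are already plain dicts.
plain = dict(d)
print(isinstance(d, dict))
print(plain)

# Reading a missing key adds an empty entry to d, but plain is unaffected.
d['missing']
print('missing' in d, 'missing' in plain)
```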
I am looking for the most efficient way to extract items from a list of dictionaries. I have a list of about 5k dictionaries. I need to extract the records for which grouping by a particular field gives more than a threshold T number of records. For example, if T = 2 and the dictionary key is 'id':
list = [{'name': 'abc', 'id' : 1}, {'name': 'bc', 'id' : 1}, {'name': 'c', 'id' : 1}, {'name': 'bbc', 'id' : 2}]
The result should be:
list = [{'name': 'abc', 'id' : 1}, {'name': 'bc', 'id' : 1}, {'name': 'c', 'id' : 1}]
i.e. all the records whose id is shared by at least 3 records.
l = [{'name': 'abc', 'id' : 1}, {'name': 'bc', 'id' : 1}, {'name': 'c', 'id' : 1}, {'name': 'bbc', 'id' : 2}]
from collections import defaultdict
from itertools import chain
d = defaultdict(list)
T = 2
for dct in l:
    d[dct["id"]].append(dct)
print(list(chain.from_iterable(v for v in d.values() if len(v) > T)))
[{'name': 'abc', 'id': 1}, {'name': 'bc', 'id': 1}, {'name': 'c', 'id': 1}]
If you want to keep them in groups don't chain just use each value:
[v for v in d.values() if len(v) > T] # itervalues for python2
[[{'name': 'abc', 'id': 1}, {'name': 'bc', 'id': 1}, {'name': 'c', 'id': 1}]]
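If you only need the flat result, another option is to count the ids first with collections.Counter and then filter in a second pass. This is a sketch of an alternative, not the approach above; the names records, counts and kept are made up for illustration:

```python
from collections import Counter

records = [{'name': 'abc', 'id': 1}, {'name': 'bc', 'id': 1},
           {'name': 'c', 'id': 1}, {'name': 'bbc', 'id': 2}]
T = 2

# First pass: how many records share each id.
counts = Counter(rec['id'] for rec in records)

# Second pass: keep records whose id occurs more than T times.
kept = [rec for rec in records if counts[rec['id']] > T]
print(kept)
```

Because the filter walks the original list, the surviving records stay in their original order rather than being grouped by id.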
Avoid using list as a variable name, as it shadows the Python list type. If you had a variable named list, then d = defaultdict(list) in the code above would cause you problems.
To start out, I would make a dictionary to group the records by id:
control = {}
for d in list:
    control.setdefault(d['id'], []).append(d)
From here, all you have to do is check the length of each group in control to see if it's greater than your specified threshold.
Put it in a function like so:
def find_by_id(obj, threshold):
    control = {}
    for d in obj:
        control.setdefault(d['id'], []).append(d)
    for val in control.values():
        if len(val) > threshold:
            print(val)