Dictionary of Dictionaries : Sorting by a specific key - python

I have a dictionary that looks like this
{'Africa': {'Name': 'Africa',
'men': 33333,
'priority': 3,
'women': 30000},
'America': {'Name': 'USA',
'men': 1114444411333L,
'priority': 4,
'women': 44430000},
'Asia': {'Name': 'China',
'men': 444433333,
'priority': 2,
'women': 444430000},
'Europe': {'Name': 'UK',
'men': 11111333,
'priority': 1,
'women': 1111430000}}
I need to sort this dictionary by the 'priority' key of each inner dictionary.
I'm using Python 2.7 and have tried a few options (which don't look very elegant). Any suggestions?

>>> d = {"Africa" :
{ "Name" : "Africa", "men": 33333, "women" : 30000, "priority" :3},
"Asia":
{ "Name" : "China", "men": 444433333, "women" : 444430000, "priority" :2},
"Europe":
{ "Name" : "UK", "men": 11111333, "women" : 1111430000, "priority" :1},
"America":
{ "Name" : "USA", "men": 1114444411333, "women" : 44430000, "priority" :4}
}
>>> from collections import OrderedDict
>>> OrderedDict(sorted(d.items(), key=lambda x: x[1]['priority']))
OrderedDict([('Europe', {'priority': 1, 'men': 11111333, 'Name': 'UK', 'women': 1111430000}), ('Asia', {'priority': 2, 'men': 444433333, 'Name': 'China', 'women': 444430000}), ('Africa', {'priority': 3, 'men': 33333, 'Name': 'Africa', 'women': 30000}), ('America', {'priority': 4, 'men': 1114444411333L, 'Name': 'USA', 'women': 44430000})])

In Python 2.7 a plain dict has no order, so it cannot be sorted itself; you can, however, convert it to a sorted list of (key, value) tuples. Here is another way to sort it:
data={'Africa': {'Name': 'Africa',
'men': 33333,
'priority': 3,
'women': 30000},
'America': {'Name': 'USA',
'men': 1114444411333L,
'priority': 4,
'women': 44430000},
'Asia': {'Name': 'China',
'men': 444433333,
'priority': 2,
'women': 444430000},
'Europe': {'Name': 'UK',
'men': 11111333,
'priority': 1,
'women': 1111430000}}
# Sort the (key, value) pairs by the inner dict's 'priority' value
listOfTuple = sorted(data.items(), key=lambda kv: kv[1]['priority'])
print listOfTuple
>>>[('Europe', {'priority': 1, 'men': 11111333, 'Name': 'UK', 'women': 1111430000}), ('Asia', {'priority': 2, 'men': 444433333, 'Name': 'China', 'women': 444430000}), ('Africa', {'priority': 3, 'men': 33333, 'Name': 'Africa', 'women': 30000}), ('America', {'priority': 4, 'men': 1114444411333L, 'Name': 'USA', 'women': 44430000})]
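On Python 3.7 and later, plain dicts preserve insertion order, so the sorted pairs can be passed straight back to dict() without OrderedDict. A minimal sketch, assuming Python 3.7+ (not the 2.7 used in the question) and a trimmed copy of the data:

```python
# Trimmed copy of the question's data; only the fields needed for the sort.
data = {
    "Africa": {"Name": "Africa", "priority": 3},
    "America": {"Name": "USA", "priority": 4},
    "Asia": {"Name": "China", "priority": 2},
    "Europe": {"Name": "UK", "priority": 1},
}

# Sort the (key, value) pairs by priority, then rebuild a dict in that order.
by_priority = dict(sorted(data.items(), key=lambda kv: kv[1]["priority"]))

print(list(by_priority))  # ['Europe', 'Asia', 'Africa', 'America']
```

The result is an ordinary dict, so adding a new key later appends it at the end rather than keeping the dict sorted.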

Related

Store rows of DataFrame with certain value in list

I have a DataFrame like:
id  country  city   amount  duplicated
1   France   Paris  200     1
2   France   Paris  200     1
3   France   Lyon   50      2
4   France   Lyon   50      2
5   France   Lyon   50      2
And I would like to store a list per distinct value in duplicated, like:
list 1
[
{
"id": 1,
"country": "France",
"city": "Paris",
"amount": 200,
},
{
"id": 2,
"country": "France",
"city": "Paris",
"amount": 200,
}
]
list 2
[
{
"id": 3,
"country": "France",
"city": "Lyon",
"amount": 50,
},
{
"id": 4,
"country": "France",
"city": "Lyon",
"amount": 50,
},
{
"id": 5,
"country": "France",
"city": "Lyon",
"amount": 50,
}
]
I tried filtering duplicates with
df[df.duplicated(['country','city','amount', 'duplicated'], keep = False)]
but it just returns the same df.
You can use groupby:
lst = (df.groupby(['country', 'city', 'amount']) # or .groupby('duplicated')
.apply(lambda x: x.to_dict('records'))
.tolist())
Output:
>>> lst
[[{'id': 3,
'country': 'France',
'city': 'Lyon',
'amount': 50,
'duplicated': 2},
{'id': 4,
'country': 'France',
'city': 'Lyon',
'amount': 50,
'duplicated': 2},
{'id': 5,
'country': 'France',
'city': 'Lyon',
'amount': 50,
'duplicated': 2}],
[{'id': 1,
'country': 'France',
'city': 'Paris',
'amount': 200,
'duplicated': 1},
{'id': 2,
'country': 'France',
'city': 'Paris',
'amount': 200,
'duplicated': 1}]]
Another solution if you want a dict indexed by duplicated key:
data = {k: v.to_dict('records') for k, v in df.set_index('duplicated').groupby(level=0)}
>>> data[1]
[{'id': 1, 'country': 'France', 'city': 'Paris', 'amount': 200},
{'id': 2, 'country': 'France', 'city': 'Paris', 'amount': 200}]
>>> data[2]
[{'id': 3, 'country': 'France', 'city': 'Lyon', 'amount': 50},
{'id': 4, 'country': 'France', 'city': 'Lyon', 'amount': 50},
{'id': 5, 'country': 'France', 'city': 'Lyon', 'amount': 50}]
If I understand you correctly, you can use DataFrame.to_dict('records') to make your lists:
list_1 = df[df['duplicated'] == 1].to_dict('records')
list_2 = df[df['duplicated'] == 2].to_dict('records')
Or for an arbitrary number of values in the column, you can make a dict:
result = {}
for value in df['duplicated'].unique():
    result[value] = df[df['duplicated'] == value].to_dict('records')
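If the rows are already available as a list of dicts, the same grouping can be sketched without pandas. This is only an illustration under that assumption; the `rows` list below is a hand-written copy of the table from the question, and `groups` is a name chosen here.

```python
from collections import defaultdict

# Hand-written copy of the question's table, one dict per row.
rows = [
    {"id": 1, "country": "France", "city": "Paris", "amount": 200, "duplicated": 1},
    {"id": 2, "country": "France", "city": "Paris", "amount": 200, "duplicated": 1},
    {"id": 3, "country": "France", "city": "Lyon", "amount": 50, "duplicated": 2},
    {"id": 4, "country": "France", "city": "Lyon", "amount": 50, "duplicated": 2},
    {"id": 5, "country": "France", "city": "Lyon", "amount": 50, "duplicated": 2},
]

# Map each distinct 'duplicated' value to the list of its records.
groups = defaultdict(list)
for row in rows:
    # Drop the grouping column from the stored record, as in the desired output.
    record = {k: v for k, v in row.items() if k != "duplicated"}
    groups[row["duplicated"]].append(record)

print([r["id"] for r in groups[1]])  # [1, 2]
print([r["id"] for r in groups[2]])  # [3, 4, 5]
```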

python sort list by custom criteria

I have the following data read from a CSV file:
venues =[{'capacity': 700, 'id': 1, 'name': 'AMD'},
{'capacity': 2000, 'id': 2, 'name': 'Honda'},
{'capacity': 2300, 'id': 3, 'name': 'Austin Kiddie Limits'},
{'capacity': 2000, 'id': 4, 'name': 'Austin Ventures'}]
I get the unique keys with:
b= list({k for d in venues for k in d.keys()})
which comes back in arbitrary order:
['name', 'capacity', 'id']
I would like to sort the unique keys in the following order:
sorted_keys = ['id','name','capacity']
How can I achieve this?
In Python, tuples are compared element-wise, so a key function that builds a tuple from each dictionary should do the trick.
>>> sorted(venues, key=lambda row: (row['id'], row['name'], row['capacity']))
To be slightly more concise, you could use operator.itemgetter.
>>> from operator import itemgetter
>>> sorted(venues, key=itemgetter('id','name','capacity'))
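Because the ids in the sample data are unique, the later tuple elements never come into play there. A small sketch of the tie-breaking behaviour, assuming we sort on capacity first and name second instead (that ordering is chosen here purely for illustration):

```python
from operator import itemgetter

venues = [
    {'capacity': 700, 'id': 1, 'name': 'AMD'},
    {'capacity': 2000, 'id': 2, 'name': 'Honda'},
    {'capacity': 2300, 'id': 3, 'name': 'Austin Kiddie Limits'},
    {'capacity': 2000, 'id': 4, 'name': 'Austin Ventures'},
]

# itemgetter with two keys returns a (capacity, name) tuple per row;
# rows with equal capacity are then ordered by name.
by_capacity_then_name = sorted(venues, key=itemgetter('capacity', 'name'))

print([v['name'] for v in by_capacity_then_name])
# ['AMD', 'Austin Ventures', 'Honda', 'Austin Kiddie Limits']
```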
You can use the list's sort() method and its key argument to apply specific criteria when sorting your list:
venues =[{'capacity': 700, 'id': 1, 'name': 'AMD'},
{'capacity': 2000, 'id': 2, 'name': 'Honda'},
{'capacity': 2300, 'id': 3, 'name': 'Austin Kiddie Limits'},
{'capacity': 2000, 'id': 4, 'name': 'Austin Ventures'}]
venues.sort(key=lambda x: x["capacity"])
print(venues)
Output (in this case the list is sorted by capacity):
[{'capacity': 700, 'id': 1, 'name': 'AMD'}, {'capacity': 2000, 'id': 2, 'name': 'Honda'}, {'capacity': 2000, 'id': 4, 'name': 'Austin Ventures'}, {'capacity': 2300, 'id': 3, 'name': 'Austin Kiddie Limits'}]
You can also sort on several keys at once by returning a tuple from the key function:
venues =[{'capacity': 700, 'id': 1, 'name': 'AMD'},
{'capacity': 2000, 'id': 2, 'name': 'Honda'},
{'capacity': 2300, 'id': 3, 'name': 'Austin Kiddie Limits'},
{'capacity': 2000, 'id': 4, 'name': 'Austin Ventures'}]
venues.sort(key=lambda x: (x["id"], x["name"], x["capacity"]))
print(venues)
To get your sort order you could use name length as the key.
b = sorted(b, key=lambda x: len(x))
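Sorting by length only works here because the three key names happen to have increasing lengths. If the target order is known up front, one option is to rank each key by its position in that list. A minimal sketch, where `desired_order` and `rank` are names introduced for illustration and keys missing from `desired_order` are assumed to belong at the end:

```python
venues = [
    {'capacity': 700, 'id': 1, 'name': 'AMD'},
    {'capacity': 2000, 'id': 2, 'name': 'Honda'},
    {'capacity': 2300, 'id': 3, 'name': 'Austin Kiddie Limits'},
    {'capacity': 2000, 'id': 4, 'name': 'Austin Ventures'},
]

desired_order = ['id', 'name', 'capacity']
# Map each key to its position in the desired order.
rank = {key: position for position, key in enumerate(desired_order)}

unique_keys = {k for d in venues for k in d}
# Unknown keys get rank len(rank), so they sort last (alphabetically among themselves).
sorted_keys = sorted(unique_keys, key=lambda k: (rank.get(k, len(rank)), k))

print(sorted_keys)  # ['id', 'name', 'capacity']
```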

keep duplicates by key in a list of dictionaries

I have a list of dictionaries, and I would like to obtain those that have the same value in a key:
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
I want to keep those items that have the same 'name', so, I would like to obtain something like:
duplicates: [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
}, {
'id': 7,
'name': 'John'
}
]
I'm trying (not successfully):
duplicates = [item for item in my_list_of_dicts if len(my_list_of_dicts.get('name', None)) > 1]
I can see what is wrong with this code, but I can't work out the right expression.
Another concise way using collections.Counter:
from collections import Counter
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
c = Counter(x['name'] for x in my_list_of_dicts)
duplicates = [x for x in my_list_of_dicts if c[x['name']] > 1]
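The Counter approach can be wrapped in a small reusable function. This is only a sketch; `keep_duplicates` and `people` are hypothetical names, not part of any library or of the original answer.

```python
from collections import Counter


def keep_duplicates(items, key):
    """Return the items whose value under `key` occurs more than once, in original order."""
    # One pass to count every value, one pass to filter.
    counts = Counter(item[key] for item in items)
    return [item for item in items if counts[item[key]] > 1]


people = [
    {'id': 3, 'name': 'John'}, {'id': 5, 'name': 'Peter'}, {'id': 2, 'name': 'Peter'},
    {'id': 6, 'name': 'Mariah'}, {'id': 7, 'name': 'John'}, {'id': 1, 'name': 'Louis'},
]

duplicates = keep_duplicates(people, 'name')
print([p['id'] for p in duplicates])  # [3, 5, 2, 7]
```

Counting once up front keeps the work linear in the length of the list, whereas recounting the whole list for every item grows quadratically.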
You could use the following list comprehension:
>>> [d for d in my_list_of_dicts if len([e for e in my_list_of_dicts if e['name'] == d['name']]) > 1]
[{'id': 3, 'name': 'John'},
{'id': 5, 'name': 'Peter'},
{'id': 2, 'name': 'Peter'},
{'id': 7, 'name': 'John'}]
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
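import pandas as pd

df = pd.DataFrame(my_list_of_dicts)
# Keep rows whose name occurs more than once, then serialize them as JSON records
df[df.name.isin(df[df.name.duplicated()]['name'])].to_json(orient='records')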
An attempt similar to @cucuru's; hopefully helpful.
The comments explain what I did differently.
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
# Create a list of names
names = [person.get('name') for person in my_list_of_dicts]
# Add item to list if the name occurs more than once in names
duplicates = [item for item in my_list_of_dicts if names.count(item.get('name')) > 1]
print(duplicates)
produces
[{'id': 3, 'name': 'John'}, {'id': 5, 'name': 'Peter'}, {'id': 2, 'name': 'Peter'}, {'id': 7, 'name': 'John'}]

Most Common (Max) Items in JSON data

How can I split this JSON data into groups according to 'name' and sum the number of 'items' in each group, in order to find the most common name (based on number of items)? The JSON data I am working with is as follows:
json_data= [
{'code': '0101010G0AAABAB',
'items': 2,
'practice': 'N81013',
'name': 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F',
'nic': 5.98,
'act_cost': 5.56,
'quantity': 1000},
{'code': '0101021B0AAAHAH',
'items': 1,
'practice': 'N81013',
'name': 'Alginate_Raft-Forming Oral Susp S/F',
'nic': 1.95,
'act_cost': 1.82,
'quantity': 500},
{'code': '0101021B0AAALAL',
'items': 12,
'practice': 'N81013',
'name': 'Sod Algin/Pot Bicarb_Susp S/F',
'nic': 64.51,
'act_cost': 59.95,
'quantity': 6300},
{'code': '0101021B0AAAPAP',
'items': 3,
'practice': 'N81013',
'name': 'Sod Alginate/Pot Bicarb_Tab Chble 500mg',
'nic': 9.21,
'act_cost': 8.55,
'quantity': 180},
{'code': '0101021B0BEADAJ',
'items': 6,
'practice': 'N81013',
'name': 'Gaviscon Advance_Liq (Peppermint) S/F',
'nic': 28.92,
'act_cost': 26.84,
'quantity': 90},
{'code': '0101021B0BEAIAL',
'items': 15,
'practice': 'N81013',
'name': 'Gaviscon Advance_Liq (Peppermint) S/F',
'nic': 82.62,
'act_cost': 76.67,
'quantity': 7800},
{'code': '0101021B0BEAQAP',
'items': 5,
'practice': 'N81013',
'name': 'Gaviscon Advance_Liq (Peppermint) S/F',
'nic': 13.47,
'act_cost': 12.93,
'quantity': 116},
{'code': '0101021B0BEBEAL',
'items': 10,
'practice': 'N81013',
'name': 'Gaviscon Advance_Liq (Peppermint) S/F',
'nic': 64.0,
'act_cost': 59.45,
'quantity': 6250},
{'code': '0101021B0BIABAH',
'items': 2,
'practice': 'N81013',
'name': 'Sod Algin/Pot Bicarb_Susp S/F',
'nic': 3.9,
'act_cost': 3.64,
'quantity': 1000},
{'code': '0102000A0AAAAAA',
'items': 1,
'practice': 'N81013',
'name': 'Alverine Cit_Cap 60mg',
'nic': 19.48,
'act_cost': 18.05,
'quantity': 100}]
I have been able to identify the number of unique values for 'name' but I do not know how to proceed from there. Here is the code I used:
names =[]
for item in range(len(json_data)):
    names.append(json_data[item]['name'])
names=set(names)
names=list(names)
print(len(names))
I expect the output to be in the following format:
most_common = [("", 0)]
with the name in quotes followed by the total sum of items.
e.g:
most_common = [("Gaviscon Advance_Liq (Peppermint) S/F", 36)]
Please bear with me. I am new to Stack Overflow and this is my first question, so I'm still getting used to how to ask a question here.
You can use a Counter. It is a class from the collections module that makes it easy to count how many times each item appears in a collection; here it accumulates the 'items' total for each name instead of a simple count.
>>> from collections import Counter
>>> name_numbers = Counter()
>>> for item in json_data:
...     name_numbers[item['name']] += item['items']
...
>>> name_numbers
Counter({'Gaviscon Advance_Liq (Peppermint) S/F': 36, 'Sod Algin/Pot Bicarb_Susp S/F': 14, 'Sod Alginate/Pot Bicarb_Tab Chble 500mg': 3, 'Co-Magaldrox_Susp 195mg/220mg/5ml S/F': 2, 'Alginate_Raft-Forming Oral Susp S/F': 1, 'Alverine Cit_Cap 60mg': 1})
>>> name_numbers.most_common(1)
[('Gaviscon Advance_Liq (Peppermint) S/F', 36)]
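The argument to most_common() controls how many entries come back, so the same totals can yield a top-N ranking. A short sketch on a reduced subset of the question's records (only five rows and two fields are copied here, so the totals differ from the full data set):

```python
from collections import Counter

# Subset of the question's records, reduced to the two fields that matter.
records = [
    {'name': 'Sod Algin/Pot Bicarb_Susp S/F', 'items': 12},
    {'name': 'Gaviscon Advance_Liq (Peppermint) S/F', 'items': 15},
    {'name': 'Gaviscon Advance_Liq (Peppermint) S/F', 'items': 10},
    {'name': 'Sod Algin/Pot Bicarb_Susp S/F', 'items': 2},
    {'name': 'Alverine Cit_Cap 60mg', 'items': 1},
]

# Accumulate the 'items' total per name.
totals = Counter()
for record in records:
    totals[record['name']] += record['items']

# The two names with the highest totals, largest first.
top_two = totals.most_common(2)
print(top_two)
# [('Gaviscon Advance_Liq (Peppermint) S/F', 25), ('Sod Algin/Pot Bicarb_Susp S/F', 14)]
```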

Find a minimal value in each array in Python

Suppose I have the following data in Python 3.3:
my_array =
[{'man_id': 1, '_id': ObjectId('1234566'), 'type': 'worker', 'value': 11},
{'man_id': 1, '_id': ObjectId('1234577'), 'type': 'worker', 'value': 12}],
[{'man_id': 2, '_id': ObjectId('1234588'), 'type': 'worker', 'value': 11},
{'man_id': 2, '_id': ObjectId('3243'), 'type': 'worker', 'value': 7},
{'man_id': 2, '_id': ObjectId('54'), 'type': 'worker', 'value': 99},
{'man_id': 2, '_id': ObjectId('9879878'), 'type': 'worker', 'value': 135}],
#.............................
[{'man_id': 13, '_id': ObjectId('111'), 'type': 'worker', 'value': 1},
{'man_id': 13, '_id': ObjectId('222'), 'type': 'worker', 'value': 2},
{'man_id': 13, '_id': ObjectId('3333'), 'type': 'worker', 'value': 9}]
There are 3 arrays. How do I find an element in each array with minimal value?
[min(arr, key=lambda s:s['value']) for arr in my_array]
Maybe something like this would work for you:
for arr in my_array:
    minVal = min(row['value'] for row in arr)
    print([row for row in arr if row['value'] == minVal])
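The two answers differ when a group contains ties: min() with a key returns only the first minimal element, while filtering on the minimal value returns all of them. A runnable sketch of both, assuming Python 3.4+ for the `default` argument of min(); the `groups` data is a simplified, made-up variant of the question's (no ObjectId, one tie and one empty group added on purpose):

```python
from operator import itemgetter

# Simplified stand-in data: second group has a tie on 7, third group is empty.
groups = [
    [{'man_id': 1, 'value': 11}, {'man_id': 1, 'value': 12}],
    [{'man_id': 2, 'value': 7}, {'man_id': 2, 'value': 99}, {'man_id': 2, 'value': 7}],
    [],
]

get_value = itemgetter('value')

# One element per group; default=None avoids ValueError on an empty group.
first_minima = [min(group, key=get_value, default=None) for group in groups]

# Every element that shares the minimal value, per group.
all_minima = []
for group in groups:
    if not group:
        all_minima.append([])
        continue
    lowest = min(map(get_value, group))
    all_minima.append([row for row in group if row['value'] == lowest])

print(first_minima)
print(all_minima)
```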
