Iterating through a nested dictionary with loop and conditional statement - python

I have the following nested dictionary:
Student_dict = {'A123':
{'Student_Name': 'Lisa Simpson',
'Class_Year': 1,
'CGPA': 3.98},
'A125':
{'Student_Name': 'Bart Simpson',
'Class_Year': 3,
'CGPA': 2.51},
'A234': {'Student_Name': 'Milhouse Houten',
'Class_Year': 3,
'CGPA': 3.62}}
I have to add the key "Honors" to each of the inner dictionaries, with the value "Yes" if the student's CGPA is higher than 3.7 and "No" otherwise, using loops and an if statement.
The expected output is the following:
Student_dict = {
'A123': {
'Student_Name': 'Lisa Simpson',
'Class_Year': 1,
'CGPA': 3.98,
'Honors': 'Yes'
},
'A125': {
'Student_Name': 'Bart Simpson',
'Class_Year': 3,
'CGPA': 2.51,
'Honors': 'No'
},
'A234': {
'Student_Name': 'Milhouse Houten',
'Class_Year': 3,
'CGPA': 3.62,
'Honors': 'No'}}
I've tried various things, but my problem is accessing the different student ids (A123, A125, A234) to create my loop.
Could someone help me please?
Thank you in advance!

You can reach the inner dictionaries with .values(); this way, you do not need to know the student ids explicitly.
for value in Student_dict.values():
    if value['CGPA'] > 3.7:
        value['Honors'] = "Yes"
    else:
        value['Honors'] = "No"
print(Student_dict)
Output:
{'A123': {'Student_Name': 'Lisa Simpson', 'Class_Year': 1, 'CGPA': 3.98, 'Honors': 'Yes'}, 'A125': {'Student_Name': 'Bart Simpson', 'Class_Year': 3, 'CGPA': 2.51, 'Honors': 'No'}, 'A234': {'Student_Name': 'Milhouse Houten', 'Class_Year': 3, 'CGPA': 3.62, 'Honors': 'No'}}
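If the student id is also needed inside the loop, .items() yields the id and the inner dictionary together. The following is a minimal sketch; the names student_dict, HONORS_CUTOFF and honors_ids are only illustrative and do not come from the question.

```python
# Same data as in the question, under an illustrative lowercase name.
student_dict = {
    'A123': {'Student_Name': 'Lisa Simpson', 'Class_Year': 1, 'CGPA': 3.98},
    'A125': {'Student_Name': 'Bart Simpson', 'Class_Year': 3, 'CGPA': 2.51},
    'A234': {'Student_Name': 'Milhouse Houten', 'Class_Year': 3, 'CGPA': 3.62},
}

HONORS_CUTOFF = 3.7  # hypothetical constant name for the threshold

honors_ids = []
# .items() gives (student id, inner dict) pairs, so no id has to be hard-coded.
for student_id, record in student_dict.items():
    record['Honors'] = 'Yes' if record['CGPA'] > HONORS_CUTOFF else 'No'
    if record['Honors'] == 'Yes':
        honors_ids.append(student_id)

print(honors_ids)
```

Because each record is the same object that is stored in the outer dictionary, assigning record['Honors'] updates student_dict in place.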

Looping through a dict in Python works like this:
students = {'A123': {'Student_Name': 'Lisa'}}
# Iterating over a dict yields its keys; use each key to reach the inner dict
for student_id in students:
    name = students[student_id]['Student_Name']
# Assigning a new value:
value = 3
for student_id in students:
    students[student_id]['CGPA'] = value
I would do it this way:
def treat_dict(students):
    for student_id in students:
        if students[student_id]['CGPA'] > 3.7:
            students[student_id]['Honors'] = 'Yes'  # Add new value
        else:
            students[student_id]['Honors'] = 'No'  # Add new value
    return students
In a more compact way:
def treat_dict(students):
    for student_id in students:
        students[student_id]['Honors'] = 'Yes' if students[student_id]['CGPA'] > 3.7 else 'No'
    return students
Edit: I didn't see that someone had already replied, but I'm leaving this here anyway.


Would it be considered "pythonic" to use a nested defaultdict where bottom level is defaulted to 0 for counting?

I am building something to sort and add values from an API response. I ended up going with an interesting structure, and I just want to make sure there's nothing inherently wrong with it.
from collections import defaultdict
# Helps create a unique nested default dict object
# for code readability
def dict_counter():
    return defaultdict(lambda: 0)
# Creates the nested defaultdict object
ad_data = defaultdict(dict_counter)
# Sorts each instance into its channel, and
# adds the dict values incrementally
for ad in example:
    # Collects channel and metrics
    channel = ad['ad_group']['type_']
    metrics = dict(
        impressions=int(ad['metrics']['impressions']),
        clicks=int(ad['metrics']['clicks']),
        cost=int(ad['metrics']['cost_micros'])
    )
    # Adds the variables
    ad_data[channel]['impressions'] += metrics['impressions']
    ad_data[channel]['clicks'] += metrics['clicks']
    ad_data[channel]['cost'] += metrics['cost']
The output is as desired. Again, I just want to make sure I'm not reinventing the wheel or doing something really inefficient here.
defaultdict(<function __main__.dict_counter()>,
{'DISPLAY_STANDARD': defaultdict(<function __main__.dict_counter.<locals>.<lambda>()>,
{'impressions': 14, 'clicks': 4, 'cost': 9}),
'SEARCH_STANDARD': defaultdict(<function __main__.dict_counter.<locals>.<lambda>()>,
{'impressions': 6, 'clicks': 2, 'cost': 4})})
Here's what my input data would look like:
example = [
{
'campaign':
{
'resource_name': 'customers/12345/campaigns/12345',
'status': 'ENABLED',
'name': 'test_campaign_2'
},
'ad_group': {
'resource_name': 'customers/12345/adGroups/12345',
'type_': 'DISPLAY_STANDARD'},
'metrics': {
'clicks': '1', 'cost_micros': '3', 'impressions': '5'
},
'ad_group_ad': {
'resource_name': 'customers/12345/adGroupAds/12345~12345',
'ad': {
'resource_name': 'customers/12345/ads/12345'
}
}
},
{
'campaign':
{
'resource_name': 'customers/12345/campaigns/12345',
'status': 'ENABLED',
'name': 'test_campaign_2'
},
'ad_group': {
'resource_name': 'customers/12345/adGroups/12345',
'type_': 'SEARCH_STANDARD'},
'metrics': {
'clicks': '2', 'cost_micros': '4', 'impressions': '6'
},
'ad_group_ad': {
'resource_name': 'customers/12345/adGroupAds/12345~12345',
'ad': {
'resource_name': 'customers/12345/ads/12345'
}
}
},
{
'campaign':
{
'resource_name': 'customers/12345/campaigns/12345',
'status': 'ENABLED',
'name': 'test_campaign_2'
},
'ad_group': {
'resource_name': 'customers/12345/adGroups/12345',
'type_': 'DISPLAY_STANDARD'},
'metrics': {
'clicks': '3', 'cost_micros': '6', 'impressions': '9'
},
'ad_group_ad': {
'resource_name': 'customers/12345/adGroupAds/12345~12345',
'ad': {
'resource_name': 'customers/12345/ads/12345'
}
}
}
]
Thanks!
There's nothing wrong with the code you have, but the code for copying the values from one dict to another is a bit repetitive and a little vulnerable to mis-pasting a key name. I'd suggest putting the mapping between the keys in a dict so that there's a single source of truth for what keys you're copying from the input metrics dicts and what keys that data will live under in the output:
fields = {
# Map input metrics dicts to per-channel metrics dicts.
'impressions': 'impressions', # same
'clicks': 'clicks', # same
'cost_micros': 'cost', # different
}
Since each dict in your output is going to contain the keys from fields.values(), you have the option of creating these as plain dicts with their values initialized to zero rather than as defaultdicts (this doesn't have any major benefits over defaultdict(int), but it does make pretty-printing a bit easier):
# Create defaultdict of per-channel metrics dicts.
ad_data = defaultdict(lambda: dict.fromkeys(fields.values(), 0))
and then you can do a simple nested iteration to populate ad_data:
# Aggregate input metrics into per-channel metrics.
for ad in example:
    channel = ad['ad_group']['type_']
    for k, v in ad['metrics'].items():
        ad_data[channel][fields[k]] += int(v)
which for your example input produces:
{'DISPLAY_STANDARD': {'impressions': 14, 'clicks': 4, 'cost': 9},
'SEARCH_STANDARD': {'impressions': 6, 'clicks': 2, 'cost': 4}}
I think you overthought this one a bit. Consider this simple function that sums two dicts (the a | b dict union requires Python 3.9 or later):
def add_dicts(a, b):
    return {
        k: int(a.get(k, 0)) + int(b.get(k, 0))
        for k in a | b
    }
Using this function, the main loop becomes trivial:
stats = {}
for obj in example:
    t = obj['ad_group']['type_']
    stats[t] = add_dicts(stats.get(t, {}), obj['metrics'])
That's it. No defaultdicts needed.
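On the "reinventing the wheel" part of the question: collections.Counter already behaves like a dict that defaults to 0, and its update() method adds counts instead of replacing them. Below is a minimal sketch of that alternative. It is not what either answer above proposes, the input is trimmed to the two sub-dicts that matter, and the output keeps the input key name cost_micros rather than renaming it to cost.

```python
from collections import Counter, defaultdict

# Trimmed version of the question's input: only the fields used here.
example = [
    {'ad_group': {'type_': 'DISPLAY_STANDARD'},
     'metrics': {'clicks': '1', 'cost_micros': '3', 'impressions': '5'}},
    {'ad_group': {'type_': 'SEARCH_STANDARD'},
     'metrics': {'clicks': '2', 'cost_micros': '4', 'impressions': '6'}},
    {'ad_group': {'type_': 'DISPLAY_STANDARD'},
     'metrics': {'clicks': '3', 'cost_micros': '6', 'impressions': '9'}},
]

# One Counter per channel; missing channels are created on first access.
ad_data = defaultdict(Counter)
for ad in example:
    channel = ad['ad_group']['type_']
    # Counter.update() with a mapping adds the values to the running totals.
    ad_data[channel].update({k: int(v) for k, v in ad['metrics'].items()})

print({channel: dict(totals) for channel, totals in ad_data.items()})
```

Converting each Counter with dict() before printing gives the same easy-to-read output as the plain-dict approach.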

create new list and append to dictionary

I'm trying to learn Python. Assume I have the two dicts below.
The first dict contains user info and the reporting-line structure.
The second dict contains the item count that belongs to each individual.
I want to compare these two dicts, sum up the total item count, and display the result under name_S. The input and the expected outcome are shown below:
data_set_1 = { 'id': 'mid',
'name': 'name_S',
'directory': [
{
'id': 'eid_A',
'name': 'name_A',
'directory': []
},
{ 'id': 'eid_B',
'name': 'name_B',
'directory': []
},
{ 'id': 'eid_C',
'name': 'name_C',
'directory': [
{'id': 'eid_C1',
'name': 'name_C1',
'directory': []},
{'id': 'eid_C2',
'name': 'name_C2',
'directory': []}]
}]}
data_set_2 = { 'eid_A': 5,
'eid_F': 3,
'eid_G': 0,
'eid_C': 1,
'eid_C1': 10,
'eid_C2': 20
}
Result:
{'name_S': 36}
I'm able to get the result if I do it this way:
def combine_compare(data_set_1, data_set_2):
    combine_result = dict()
    combine_id = data_set_1['id']
    combine_name = data_set_1['name']
    combine_directory = data_set_1['directory']
    if combine_directory:
        combine_item_sum = 0
        combine_item_count = data_set_2.get(combine_id, 0)
        for combine_user in combine_directory:
            # Recursion starts
            for i in [combine_compare(combine_user, data_set_2)]:
                for key, value in i.items():
                    combine_item_sum += value
        combine_result[combine_name] = combine_item_sum + combine_item_count
    else:
        combine_result[combine_name] = data_set_2.get(combine_id, 0)
    return combine_result
Now I want to include the ids that have item counts in the final result, something like this:
# if there is a result and the directory under name_S is not empty
{'name_S': [36, ('eid_A', 'eid_C', 'eid_C1', 'eid_C2')]}
# if there is no result and the directory under name_S is not empty, display a default str
{'name_S': [0, ('Null')]}
# if there is a result and the directory under name_S is empty, display the id of name_S
{'name_S': [1, ('mid')]}
My original idea was to create a list and append the counts and ids to it, but I'm struggling to accomplish this. Here is what I tried; the list does not contain the outcome I'm expecting, and I'm not sure how to append the count to it. Any help would be greatly appreciated.
def combine_compare(data_set_1, data_set_2):
    combine_result = dict()
    # Creating a new list
    combine_list = list()
    combine_id = data_set_1['id']
    combine_name = data_set_1['name']
    combine_directory = data_set_1['directory']
    if combine_directory:
        combine_item_sum = 0
        combine_item_count = data_set_2.get(combine_id, 0)
        for combine_user in combine_directory:
            # Recursion starts
            for i in [combine_compare(combine_user, data_set_2)]:
                for key, value in i.items():
                    combine_item_sum += value
            # Trying to append the ids where count > 0
            if data_set_2.get(combine_user['id'], 0) != 0:
                combine_list.append(combine_user['id'])
        combine_result[combine_name] = combine_item_sum + combine_item_count
    else:
        combine_result[combine_name] = data_set_2.get(combine_id, 0)
    # Trying to print the list to see the results
    print(combine_list)
    return combine_result
What I understood from your question is that you want to know which ids from data_set_2 are present in data_set_1, and what their sum is. The following code is my version of a solution to that problem.
data_set_1 = { 'id': 'mid',
'name': 'name_S',
'directory': [
{
'id': 'eid_A',
'name': 'name_A',
'directory': []
},
{ 'id': 'eid_B',
'name': 'name_B',
'directory': []
},
{ 'id': 'eid_C',
'name': 'name_C',
'directory': [
{'id': 'eid_C1',
'name': 'name_C1',
'directory': []},
{'id': 'eid_C2',
'name': 'name_C2',
'directory': []
}
]
}
]
}
data_set_2 = { 'eid_A': 5,
'eid_F': 3,
'eid_G': 0,
'eid_C': 1,
'eid_C1': 10,
'eid_C2': 20
}
value = 0
final_result = {}
def compare(d1, d2):
    global value
    temp_result = {}
    if d1['name'] not in temp_result:
        temp_result[d1['name']] = []
    for items in d1['directory']:
        if items['directory']:
            temp_value = compare(items, d2)
            temp_result[d1['name']].append(temp_value)
        result = check_for_value(items, d2)
        if result:
            value += result
            temp_result[d1['name']].append(items['id'])
    return temp_result
def check_for_value(d, d2):
    if d['id'] in d2:
        return d2[d['id']]
final_result = compare(data_set_1, data_set_2)
final_result['value'] = value
print("final_result:", final_result)
This gives you an output dict that tells you exactly which name each of those ids falls under.
The output is as follows:
final_result: {'value': 36, 'name_S': ['eid_A', {'name_C': ['eid_C1', 'eid_C2']}, 'eid_C']}
You can change the structure of the final result to make it exactly how you want it to be. Let me know if you have any trouble understanding the program.
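For the exact output shape asked for in the question, one option is to let the recursion return both the running total and the list of ids, which avoids the global variable. This is only a sketch under a few assumptions: the helper names collect and summarize are made up for illustration, the ids are returned in depth-first order, and the default is the one-element tuple ('Null',) because ('Null') without the trailing comma is just a string.

```python
data_set_1 = {
    'id': 'mid', 'name': 'name_S',
    'directory': [
        {'id': 'eid_A', 'name': 'name_A', 'directory': []},
        {'id': 'eid_B', 'name': 'name_B', 'directory': []},
        {'id': 'eid_C', 'name': 'name_C', 'directory': [
            {'id': 'eid_C1', 'name': 'name_C1', 'directory': []},
            {'id': 'eid_C2', 'name': 'name_C2', 'directory': []},
        ]},
    ],
}
data_set_2 = {'eid_A': 5, 'eid_F': 3, 'eid_G': 0,
              'eid_C': 1, 'eid_C1': 10, 'eid_C2': 20}


def collect(node, counts):
    """Return (total, ids) for this node and everything below it."""
    own = counts.get(node['id'], 0)
    total = own
    ids = [node['id']] if own else []  # keep only ids with a non-zero count
    for child in node['directory']:
        child_total, child_ids = collect(child, counts)
        total += child_total
        ids.extend(child_ids)
    return total, ids


def summarize(node, counts):
    """Wrap the totals in the {'name': [total, (ids...)]} shape."""
    total, ids = collect(node, counts)
    return {node['name']: [total, tuple(ids) if ids else ('Null',)]}


print(summarize(data_set_1, data_set_2))
```

When the top-level node has an empty directory and a count of its own, the same code reports the node's own id, which covers the third case in the question.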

Compare key and values of one nested dictionary in other nested dictionary

I am trying to recursively compare below two python dictionaries:
expectededr = {'uid': 'e579b8cb-7d9f-4c0b-97de-a03bb52a1ec3', 'attempted': {'smpp': {'registeredDelivery': 0}, 'status': 'success', 'OATON': 1, 'OANPI': 1, 'DATON': 1, 'DANPI': 1, 'OA': '12149921220', 'DA': '1514525404'}, 'customerID': 'customer01', 'productID': 'product'}
edr = {'Category': 'NO', 'Type': 'mt', 'uid': 'e579b8cb-7d9f-4c0b-97de-a03bb52a1ec3', 'protocolID': 'smpp', 'direction': 'attempted', 'attempted': {'status': 'success', 'OANPI': 1, 'DATON': 1, 't2': 1512549691602, 'DANPI': 1, 'OA': '12149921220', 'DA': '1514525404', 'smpp': {'fragmented': False, 'sequenceID': 1, 'registeredDelivery': 0, 'messageID': '4e7b48ad-b39e-4e91-a7bb-2de463e4a6ee', 'srcPort': 39417, 'messageType': 4, 'Status': 0, 'ESMClass': 0, 'dstPort': 0, 'size': 0}, 'OATON': 1, 'PID': 0, 't1': 1512549691602}, 'customerID': 'customer01', 'productID': 'product'}
I want to look up each key and value of the first dictionary in the second one and print PASS if they match, or FAIL otherwise.
for key in expectededr:
    if expectededr[key] == edr[key]:
        print("PASS")
    else:
        print("FAIL")
Output:
FAIL
PASS
PASS
PASS
The code above cannot compare all the keys and values, because the dictionaries are nested.
As you can see below, if I print the values that match, the loop does not descend into the sub-dictionary and misses its keys:
for key in expectededr:
    if expectededr[key] == edr[key]:
        print(expectededr[key])
        print(edr[key])
Output:
customer01
customer01
e579b8cb-7d9f-4c0b-97de-a03bb52a1ec3
e579b8cb-7d9f-4c0b-97de-a03bb52a1ec3
product
product
Could someone help me update this code so that I can do the comparison on these nested dictionaries?
One way is to flatten the dictionaries and then compare the values under matching keys.
So let's initialize your dicts first:
In [23]: expectededr = {'uid': 'e579b8cb-7d9f-4c0b-97de-a03bb52a1ec3', 'attempted': {'smpp': {'registeredDelivery': 0}, 'status': 'success', 'OATON': 1, 'OANPI': 1, 'DATON': 1, 'DANPI': 1, 'OA': '12149921220', 'DA': '1514525404'}, 'customerID': 'customer01', 'productID': 'product'}
    ...:
    ...: edr = {'Category': 'NO', 'Type': 'mt', 'uid': 'e579b8cb-7d9f-4c0b-97de-a03bb52a1ec3', 'protocolID': 'smpp', 'direction': 'attempted', 'attempted': {'status': 'success', 'OANPI': 1, 'DATON': 1, 't2': 1512549691602, 'DANPI': 1, 'OA': '12149921220', 'DA': '1514525404', 'smpp': {'fragmented': False, 'sequenceID': 1, 'registeredDelivery': 0, 'messageID': '4e7b48ad-b39e-4e91-a7bb-2de463e4a6ee', 'srcPort': 39417, 'messageType': 4, 'Status': 0, 'ESMClass': 0, 'dstPort': 0, 'size': 0}, 'OATON': 1, 'PID': 0, 't1': 1512549691602}, 'customerID': 'customer01', 'productID': 'product'}
    ...:
For flattening your dictionaries, we can use the approach suggested in Flatten nested Python dictionaries, compressing keys:
In [24]: from collections.abc import MutableMapping  # collections.MutableMapping was removed in Python 3.10
    ...:
    ...: def flatten(d, parent_key='', sep='_'):
    ...:     items = []
    ...:     for k, v in d.items():
    ...:         new_key = parent_key + sep + k if parent_key else k
    ...:         if isinstance(v, MutableMapping):
    ...:             items.extend(flatten(v, new_key, sep=sep).items())
    ...:         else:
    ...:             items.append((new_key, v))
    ...:     return dict(items)
    ...:
Then generate the flattened dicts:
In [25]: flat_expectededr = flatten(expectededr)
In [26]: flat_edr = flatten(edr)
Now it's a simple comparison:
In [27]: for key in flat_expectededr:
    ...:     if flat_edr.get(key) == flat_expectededr[key]:
    ...:         print("PASS")
    ...:     else:
    ...:         print("FAIL")
PASS
PASS
PASS
PASS
PASS
PASS
PASS
PASS
PASS
PASS
PASS
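If you would rather not flatten first, the same check can be written as a recursive walk over the expected dict. The sketch below is an alternative to the flatten approach, not part of it: the function name compare, the dotted key paths and the small sample dicts are all made up for illustration. It also reports which key failed, which a plain list of PASS/FAIL lines does not.

```python
_MISSING = object()  # sentinel so a missing key never equals an expected None


def compare(expected, actual, path=''):
    """Yield (dotted_key, passed) for every leaf value in `expected`."""
    for key, want in expected.items():
        here = f'{path}.{key}' if path else key
        got = actual.get(key, _MISSING)
        if isinstance(want, dict):
            # Descend; if the other side is not a dict, every leaf below fails.
            yield from compare(want, got if isinstance(got, dict) else {}, here)
        else:
            yield here, got == want


# Small illustrative dicts in the same shape as the question's data.
expected = {'uid': 'x',
            'attempted': {'smpp': {'registeredDelivery': 0}, 'status': 'success'}}
actual = {'uid': 'x', 'Type': 'mt',
          'attempted': {'status': 'failed',
                        'smpp': {'registeredDelivery': 0, 'size': 0}}}

results = dict(compare(expected, actual))
for key, passed in results.items():
    print(key, 'PASS' if passed else 'FAIL')
```

Extra keys that exist only in the second dict (such as Type or size above) are ignored, which matches the "find the first dict's keys in the second" requirement.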
A simple way (this only checks the top-level keys):
for i in edr.keys():
    if i in expectededr.keys():
        print('true : key' + i)
    else:
        print('fail : key' + i)

limit() and sort() order pymongo and mongodb

Despite reading people's answers stating that the sort is done first, the evidence I see suggests that the limit is done before the sort. Is there a way to force the sort to always happen first?
views = mongo.db.view_logging.find().sort([('count', 1)]).limit(10)
Whether I use .sort().limit() or .limit().sort(), the limit takes precedence. I wonder if this is something to do with pymongo...
According to the documentation, regardless of which comes first in your chain of calls, sort() is always applied before limit().
You can also study the .explain() results of your query and look at the execution stages: you will find that the sorting input stage examines all of the filtered documents (in your case, all documents in the collection), and only then is the limit applied.
Let's go through an example.
Imagine there is a foo database with a test collection having 6 documents:
>>> col = db.foo.test
>>> for doc in col.find():
...     print(doc)
{'time': '2016-03-28 12:12:00', '_id': ObjectId('56f9716ce4b05e6b92be87f2'), 'value': 90}
{'time': '2016-03-28 12:13:00', '_id': ObjectId('56f971a3e4b05e6b92be87fc'), 'value': 82}
{'time': '2016-03-28 12:14:00', '_id': ObjectId('56f971afe4b05e6b92be87fd'), 'value': 75}
{'time': '2016-03-28 12:15:00', '_id': ObjectId('56f971b7e4b05e6b92be87ff'), 'value': 72}
{'time': '2016-03-28 12:16:00', '_id': ObjectId('56f971c0e4b05e6b92be8803'), 'value': 81}
{'time': '2016-03-28 12:17:00', '_id': ObjectId('56f971c8e4b05e6b92be8806'), 'value': 90}
Now, let's execute queries with different order of sort() and limit() and check the results and the explain plan.
Sort and then limit:
>>> from pprint import pprint
>>> cursor = col.find().sort([('time', 1)]).limit(3)
>>> sort_limit_plan = cursor.explain()
>>> pprint(sort_limit_plan)
{u'executionStats': {u'allPlansExecution': [],
u'executionStages': {u'advanced': 3,
u'executionTimeMillisEstimate': 0,
u'inputStage': {u'advanced': 6,
u'direction': u'forward',
u'docsExamined': 6,
u'executionTimeMillisEstimate': 0,
u'filter': {u'$and': []},
u'invalidates': 0,
u'isEOF': 1,
u'nReturned': 6,
u'needFetch': 0,
u'needTime': 1,
u'restoreState': 0,
u'saveState': 0,
u'stage': u'COLLSCAN',
u'works': 8},
u'invalidates': 0,
u'isEOF': 1,
u'limitAmount': 3,
u'memLimit': 33554432,
u'memUsage': 213,
u'nReturned': 3,
u'needFetch': 0,
u'needTime': 8,
u'restoreState': 0,
u'saveState': 0,
u'sortPattern': {u'time': 1},
u'stage': u'SORT',
u'works': 13},
u'executionSuccess': True,
u'executionTimeMillis': 0,
u'nReturned': 3,
u'totalDocsExamined': 6,
u'totalKeysExamined': 0},
u'queryPlanner': {u'indexFilterSet': False,
u'namespace': u'foo.test',
u'parsedQuery': {u'$and': []},
u'plannerVersion': 1,
u'rejectedPlans': [],
u'winningPlan': {u'inputStage': {u'direction': u'forward',
u'filter': {u'$and': []},
u'stage': u'COLLSCAN'},
u'limitAmount': 3,
u'sortPattern': {u'time': 1},
u'stage': u'SORT'}},
u'serverInfo': {u'gitVersion': u'6ce7cbe8c6b899552dadd907604559806aa2e9bd',
u'host': u'h008742.mongolab.com',
u'port': 53439,
u'version': u'3.0.7'}}
Limit and then sort:
>>> cursor = col.find().limit(3).sort([('time', 1)])
>>> limit_sort_plan = cursor.explain()
>>> pprint(limit_sort_plan)
{u'executionStats': {u'allPlansExecution': [],
u'executionStages': {u'advanced': 3,
u'executionTimeMillisEstimate': 0,
u'inputStage': {u'advanced': 6,
u'direction': u'forward',
u'docsExamined': 6,
u'executionTimeMillisEstimate': 0,
u'filter': {u'$and': []},
u'invalidates': 0,
u'isEOF': 1,
u'nReturned': 6,
u'needFetch': 0,
u'needTime': 1,
u'restoreState': 0,
u'saveState': 0,
u'stage': u'COLLSCAN',
u'works': 8},
u'invalidates': 0,
u'isEOF': 1,
u'limitAmount': 3,
u'memLimit': 33554432,
u'memUsage': 213,
u'nReturned': 3,
u'needFetch': 0,
u'needTime': 8,
u'restoreState': 0,
u'saveState': 0,
u'sortPattern': {u'time': 1},
u'stage': u'SORT',
u'works': 13},
u'executionSuccess': True,
u'executionTimeMillis': 0,
u'nReturned': 3,
u'totalDocsExamined': 6,
u'totalKeysExamined': 0},
u'queryPlanner': {u'indexFilterSet': False,
u'namespace': u'foo.test',
u'parsedQuery': {u'$and': []},
u'plannerVersion': 1,
u'rejectedPlans': [],
u'winningPlan': {u'inputStage': {u'direction': u'forward',
u'filter': {u'$and': []},
u'stage': u'COLLSCAN'},
u'limitAmount': 3,
u'sortPattern': {u'time': 1},
u'stage': u'SORT'}},
u'serverInfo': {u'gitVersion': u'6ce7cbe8c6b899552dadd907604559806aa2e9bd',
u'host': u'h008742.mongolab.com',
u'port': 53439,
u'version': u'3.0.7'}}
As you can see, in both cases the sort is applied first and affects all the 6 documents and then the limit limits the results to 3.
And, the execution plans are exactly the same:
>>> from copy import deepcopy # just in case
>>> cursor = col.find().sort([('time', 1)]).limit(3)
>>> sort_limit_plan = deepcopy(cursor.explain())
>>> cursor = col.find().limit(3).sort([('time', 1)])
>>> limit_sort_plan = deepcopy(cursor.explain())
>>> sort_limit_plan == limit_sort_plan
True
Also see:
How do you tell Mongo to sort a collection before limiting the results?
The MongoDB documentation states that for a find() query the server applies sort() first, then skip(), which controls the starting point of the result set, and finally limit().
This is regardless of the order in your code. The reason is that the cursor collects all the options for the query, the server applies sort, skip and limit in that fixed order, and only then is the query run.
I suspect you're passing the wrong key in the sort parameter, something like "$key_name" instead of just "key_name".
Refer to "How do you tell Mongo to sort a collection before limiting the results?" for a solution to the same problem as yours.
Logically, whatever comes first in the chain should be applied first, but MongoDB always sorts before it limits.
In my test, the sort operation does take precedence regardless of whether it comes before or after the limit. However, this behavior seems very strange to me.
My sample dataset is:
[
{
"_id" : ObjectId("56f845fea524b4d098e0ef81"),
"number" : 48.98052410874508
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef82"),
"number" : 50.98747461471063
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef83"),
"number" : 81.32911244349772
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef84"),
"number" : 87.95549919039071
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef85"),
"number" : 81.63582683594402
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef86"),
"number" : 43.25696270026136
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef87"),
"number" : 88.22046335409453
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef88"),
"number" : 64.00556739160076
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef89"),
"number" : 16.09353150244296
},
{
"_id" : ObjectId("56f845fea524b4d098e0ef8a"),
"number" : 17.46667776660574
}
]
Python test code:
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017")
database = client.get_database("test")
collection = database.get_collection("collection")
print("----------------[limit -> sort]--------------------------")
result = collection.find().limit(5).sort([("number", pymongo.ASCENDING)])
for r in result:
    print(r)
print("----------------[sort -> limit]--------------------------")
result = collection.find().sort([("number", pymongo.ASCENDING)]).limit(5)
for r in result:
    print(r)
Result:
----------------[limit -> sort]--------------------------
{u'_id': ObjectId('56f845fea524b4d098e0ef89'), u'number': 16.09353150244296}
{u'_id': ObjectId('56f845fea524b4d098e0ef8a'), u'number': 17.46667776660574}
{u'_id': ObjectId('56f845fea524b4d098e0ef86'), u'number': 43.25696270026136}
{u'_id': ObjectId('56f845fea524b4d098e0ef81'), u'number': 48.98052410874508}
{u'_id': ObjectId('56f845fea524b4d098e0ef82'), u'number': 50.98747461471063}
----------------[sort -> limit]--------------------------
{u'_id': ObjectId('56f845fea524b4d098e0ef89'), u'number': 16.09353150244296}
{u'_id': ObjectId('56f845fea524b4d098e0ef8a'), u'number': 17.46667776660574}
{u'_id': ObjectId('56f845fea524b4d098e0ef86'), u'number': 43.25696270026136}
{u'_id': ObjectId('56f845fea524b4d098e0ef81'), u'number': 48.98052410874508}
{u'_id': ObjectId('56f845fea524b4d098e0ef82'), u'number': 50.98747461471063}
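To see why this output shows that the sort ran first in both cases, the two possible orders can be simulated in plain Python on the same ten numbers. This sketch does not talk to MongoDB at all; it assumes the documents would be returned in the order listed in the sample dataset if no sort were applied, which MongoDB does not guarantee.

```python
# The "number" values from the sample dataset, in the order listed above.
numbers = [
    48.98052410874508, 50.98747461471063, 81.32911244349772,
    87.95549919039071, 81.63582683594402, 43.25696270026136,
    88.22046335409453, 64.00556739160076, 16.09353150244296,
    17.46667776660574,
]

# Sort the whole collection, then keep the first five.
sort_then_limit = sorted(numbers)[:5]

# Keep the first five as stored, then sort only those.
limit_then_sort = sorted(numbers[:5])

print(sort_then_limit)
print(limit_then_sort)
```

Both queries in the test return the five smallest values of the whole collection, which is the sort_then_limit list. If the limit had been applied first, the values 16.09... and 17.46... could not appear, because they are the last two documents in the dataset.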
The accepted answer didn't work for me, but this does:
last5 = db.collection.find( {'key': "YOURKEY"}, sort=[( '_id', pymongo.DESCENDING )] ).limit(5)
with the limit outside and the sort inside of the find argument.

search array of dictionaries and extract value

I got the following data:
[{'name': 'SqueezePlay', 'power': '1', 'playerid': '91:e2:b5:24:49:63', 'ip': '192.168.1.144:51346', 'canpoweroff': 1, 'displaytype': 'none', 'seq_no': '10', 'connected': 1, 'isplayer': 1, 'model': 'squeezeplay', 'uuid': 'fgh79fg7h98789798978'}, {'name': "FLX's iPhone", 'power': '1', 'playerid': '4c:32:d1:45:6c:4e', 'ip': '84.105.161.205:53972', 'canpoweroff': 1, 'displaytype': 'none', 'seq_no': 0, 'connected': 1, 'isplayer': 1, 'model': 'iPengiPod', 'uuid': '9791c009e3e7fghfg346456456'}]
I changed the values for privacy reasons.
I'd like to search the array by "name" ("SqueezePlay", for example) and retrieve the "playerid" ("91:e2:b5:24:49:63", for example).
What would be the most efficient way to do this in Python? Thanks!
If your list of dicts is data, then you can try this:
next(d for d in data if d['name'] == 'SqueezePlay')['playerid']
This returns '91:e2:b5:24:49:63' (for the first occurrence of the given name).
You have to decide what should happen if the given name is not in your data; as written, next() raises StopIteration in that case.
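One way to handle a missing name is the second argument of next(), which is returned instead of raising StopIteration when the generator is empty. A minimal sketch, with the data trimmed to the two fields that matter and an illustrative helper name playerid_for:

```python
# Trimmed version of the question's data: only name and playerid are kept.
data = [
    {'name': 'SqueezePlay', 'playerid': '91:e2:b5:24:49:63'},
    {'name': "FLX's iPhone", 'playerid': '4c:32:d1:45:6c:4e'},
]


def playerid_for(players, name):
    """Return the playerid of the first player with this name, or None."""
    # The second argument to next() is the default used when nothing matches.
    match = next((d for d in players if d['name'] == name), None)
    return match['playerid'] if match is not None else None


print(playerid_for(data, 'SqueezePlay'))
print(playerid_for(data, 'Unknown player'))
```

Returning None is just one choice; raising a KeyError with a clear message would work as well if a missing player is an error in your program.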
Define a function to find a player based on its name:
def find_player(all_players, name):
    for player in all_players:
        if player['name'] == name:
            return player['playerid']
This way (I'm guessing name is unique) you don't have to loop over the whole list of players; as soon as you find the player, you return its playerid:
>>> p = [{'name': 'SqueezePlay', 'power': '1', 'playerid': '91:e2:b5:24:49:63', 'ip': '192.168.1.144:51346', 'canpoweroff': 1, 'displaytype': 'none', 'seq_no': '10', 'connected': 1, 'isplayer': 1, 'model': 'squeezeplay', 'uuid': 'fgh79fg7h98789798978'}, {'name': "FLX's iPhone", 'power': '1', 'playerid': '4c:32:d1:45:6c:4e', 'ip': '84.105.161.205:53972', 'canpoweroff': 1, 'displaytype': 'none', 'seq_no': 0, 'connected': 1, 'isplayer': 1, 'model': 'iPengiPod', 'uuid': '9791c009e3e7fghfg346456456'}]
>>> find_player(p, 'SqueezePlay')
'91:e2:b5:24:49:63'
The solutions posted by others work great if you are only searching the list once or a few times. If you will be searching it frequently, or if the list is more than a few items, and the names are guaranteed to be unique, it might pay off to make a dictionary from that list once, and then access the items by name in that dictionary. Or, if your program is making the list, put them in a dictionary to begin with. (If the order is important, i.e. you want to display the items in the order they were entered by a user, use a collections.OrderedDict.)
lyst = [{'name': 'SqueezePlay', 'power': '1', 'playerid': '91:e2:b5:24:49:63',
'ip': '192.168.1.144:51346', 'canpoweroff': 1, 'displaytype': 'none',
'seq_no': '10', 'connected': 1, 'isplayer': 1, 'model': 'squeezeplay',
'uuid': 'fgh79fg7h98789798978'}, {'name': "FLX's iPhone", 'power': '1',
'playerid': '4c:32:d1:45:6c:4e', 'ip': '84.105.161.205:53972',
'canpoweroff': 1, 'displaytype': 'none', 'seq_no': 0, 'connected': 1,
'isplayer': 1, 'model': 'iPengiPod', 'uuid': '9791c009e3e7fghfg346456456'}]
dyct = dict((item.pop("name"), item) for item in lyst)
# Python 3: {item.pop("name"): item for item in lyst}
print(dyct["SqueezePlay"])
Note that the resulting dictionary no longer has name as a key of the nested dictionaries; it has been popped to avoid duplicating data in two places (if you keep it in two places, it's twice as much work to update it, and if you forget somewhere, they get out of sync). If you want to keep it, write this instead:
dyct = dict((item["name"], item) for item in lyst)
# Python 3: {item["name"]: item for item in lyst}
