update values in a list of dictionaries - python

I have the following list:
list_dict = [
{'name': 'Old Ben', 'age': 71, 'country': 'Space', 'hobbies': ['getting wise']},
{'name': 'Han', 'age': 26, 'country': 'Space', 'hobbies': ['shooting']},
{'name': 'Luke', 'age': 24, 'country': 'Space', 'hobbies': ['being arrogant']},
{'name': 'R2', 'age': 'unknown', 'country': 'Space', 'hobbies': []}
]
I would like to add a hobby to R2:
for i in range(len(list_dict)):
    people = list_dict[i]
    if people['name'] == 'R2':
        people['hobbies'] = ['lubrication']
print(list_dict)
I got what I was expecting but as a newbie I'd like to learn a few easy tricks to make it shorter.

I'd express it as:
people = {person['name']: person for person in list_dict}
people['R2']['hobbies'] += ['lubrication'] # reads nicely as "add a hobby to R2"
For those hung up on the academic concern of "potential people with the same name", the "Leetcode answer" would be:
for person in list_dict:
    if person['name'] == 'R2':
        person['hobbies'] += ['lubrication']
But in practice, remodeling data to have & use primary keys is probably what you want in most cases.
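One thing to keep in mind with the name-keyed dict above: when two people share a name, the dict comprehension silently keeps only the last one. A minimal sketch of one way to keep a lookup by name while tolerating duplicates, assuming it is acceptable to update every match (the `by_name` variable and the shortened sample data are illustrative, not from the original post):

```python
from collections import defaultdict

list_dict = [
    {'name': 'Luke', 'age': 24, 'country': 'Space', 'hobbies': ['being arrogant']},
    {'name': 'R2', 'age': 'unknown', 'country': 'Space', 'hobbies': []},
    {'name': 'R2', 'age': 15, 'country': 'Alduin', 'hobbies': []},
]

# Group records by name; each key maps to a list of the matching dicts.
by_name = defaultdict(list)
for person in list_dict:
    by_name[person['name']].append(person)

# The grouped dicts are the same objects as in list_dict, so mutating them
# here updates the original list too.
for person in by_name['R2']:
    person['hobbies'].append('lubrication')

print([p['hobbies'] for p in list_dict])
# [['being arrogant'], ['lubrication'], ['lubrication']]
```

If names are truly unique in your data, the plain dict comprehension is simpler and reads better.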

You can just iterate over the list and condense the if statement:
for person in list_dict:
    person['hobbies'] = ['lubrication'] if person['name'] == 'R2' else person['hobbies']

There are some other answers here, but in my eyes they don't really help a newbie who wants to shorten their code. I don't recommend the shortening proposed below, apart from looping over the items of the list instead of using indices. It is only meant to show a newbie a few 'tricks' that may be worth knowing: more than one name can point to the same object, and you can iterate directly over the items of a list instead of using indices. Here is a shortened version of the code (l, n and h are short names defined in the full listing further down):
for p in l:
    if p[n]=='R2': p[h]=['lubrication']
print(l)
Below is all of the code using the proposed shortening, with comments pointing out the 'tricks':
list_dict = [
{'name': 'Old Ben', 'age': 71, 'country': 'Space', 'hobbies': ['getting wise']},
{'name': 'Han', 'age': 26, 'country': 'Space', 'hobbies': ['shooting']},
{'name': 'Luke', 'age': 24, 'country': 'Space', 'hobbies': ['being arrogant']},
{'name': 'R2', 'age': 'unknown', 'country': 'Space', 'hobbies': []},
{'name': 'R2', 'age': 15, 'country': 'Alduin', 'hobbies': []}
]
l = list_dict; n='name'; h='hobbies'
# I would like to add a hobby to R2:
"""
for i in range(len(list_dict)):
people = list_dict[i]
if people['name'] == 'R2':
people['hobbies'] = ['lubrication']
"""
# I got what I was expecting but as a newbie I'd like to learn a few
# easy tricks to make it shorter:
for p in l: # loop over items in list instead of using indices
    if p[n]=='R2': p[h]=['lubrication'] # use short variable names
#        ^-- give ---^ long string constants short names
print(l)
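The first 'trick' above, that more than one name can point to the same object, can be checked directly. A small sketch with a cut-down one-entry list (the `copy_of_list` name is only for illustration):

```python
list_dict = [{'name': 'R2', 'hobbies': []}]

l = list_dict                      # a second name for the SAME list, not a copy
print(l is list_dict)              # True

# A change made through one name is visible through the other.
l[0]['hobbies'].append('lubrication')
print(list_dict)                   # [{'name': 'R2', 'hobbies': ['lubrication']}]

# list(...) builds a new outer list, but it still shares the inner dicts.
copy_of_list = list(list_dict)
print(copy_of_list is list_dict)         # False
print(copy_of_list[0] is list_dict[0])   # True
```

This is why the short name `l` in the listing costs nothing: no data is copied, only a second label is attached.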

There is no need to loop over the length; you can loop through the list directly and condense the if statement into a one-liner:
#!/usr/bin/python3
from pprint import pprint
for person in list_dict:
    person['hobbies'].append('lubrication') if person['name'] == 'R2' else ...
pprint(list_dict)
>>> [
{'name': 'Old Ben', 'age': 71, 'country': 'Space', 'hobbies': ['getting wise']},
{'name': 'Han', 'age': 26, 'country': 'Space', 'hobbies': ['shooting']},
{'name': 'Luke', 'age': 24, 'country': 'Space', 'hobbies': ['being arrogant']},
{'name': 'R2', 'age': 'unknown', 'country': 'Space', 'hobbies': ['lubrication']}
]
You can also do this with a comprehension:
[person['hobbies'].append('lubrication') for person in list_dict if person['name'] == 'R2']
But if you just want to change one entry, you can use this code:
m = next(i for i, person in enumerate(list_dict) if person["name"] == "R2")
list_dict[m]["hobbies"].append("lubrication")
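One caveat with the `next(...)` approach: when no entry is named 'R2', `next` raises `StopIteration`. A minimal sketch, assuming you would rather get a boolean back than an exception, passes a default as the second argument to `next` (the `add_hobby` helper is a hypothetical name introduced here):

```python
list_dict = [
    {'name': 'Han', 'age': 26, 'country': 'Space', 'hobbies': ['shooting']},
    {'name': 'R2', 'age': 'unknown', 'country': 'Space', 'hobbies': []},
]

def add_hobby(people, name, hobby):
    """Append hobby to the first person called name; return True if found."""
    # With a default, next() returns None instead of raising StopIteration.
    person = next((p for p in people if p['name'] == name), None)
    if person is None:
        return False
    person['hobbies'].append(hobby)
    return True

print(add_hobby(list_dict, 'R2', 'lubrication'))   # True
print(add_hobby(list_dict, 'C3PO', 'etiquette'))   # False
print(list_dict[1]['hobbies'])                     # ['lubrication']
```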

Related

Restrict adding bad entry to python dictionary?

We have a sample Python dictionary, given below, which can be modified by multiple developers in our project.
What would be the best way to prevent developers from adding keys other than the given sample keys, and from adding an incorrect zip entry to a person's address (the zip should be validated against post codes before it is accepted)? Likewise, no currency 'AAA' exists, so it is an invalid key with an invalid value. I think we could do it with special functions like dict.add or dict.key; is there a better way to do it?
people = {1: {'name': 'John', 'age': '27', 'sex': 'Male', 'Address': { 'Door': '9-8-11', 'street': 'John-street', 'city': 'NewYork', 'zip':'99705'}, 'ExchangeCurrency':{'USD':'INR'}},
2: {'name': 'Marie', 'age': '22', 'sex': 'Female', 'Address': { 'Door': '9-8-11', 'street': 'John-street', 'city': 'NewYork', 'zip': '99705'}, 'ExchangeCurrency':{'INR':'EUR'}}}
Example of a bad entry:
{1: {'name12': 'John', 'age': '27', 'sex': 'Male', 'Address': { 'Door': '9-8-11', 'street': 'John-street', 'city': 'NewYork', 'zip':'000000'}, 'ExchangeCurrency':{'AAA':'CCC'}}}
One approach that you might want to consider is creating your own class for each of the people for which the "given sample keys" (e.g. name, age, sex, and address) are object attributes and keyword arguments to the initialization function. For instance, you could do the following.
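class Person(object):
    def __init__(self, name = '', age = '0', sex = None, address = None):
        self.name = name
        self.age = age
        self.sex = sex
        # Default to None: a mutable default ({}) would be shared by all instances.
        self.address = {} if address is None else address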
With that, you can enter person one into the system with
person_1 = Person(name = 'John',
                  age = '27',
                  sex = 'Male',
                  address = { 'Door': '9-8-11', 'street': 'John-street', 'city': 'NewYork', 'zip':'99705'})
In fact, you can very nicely unpack your dictionary version of a person to create a Person instance. For example, we can change your dictionary of dictionaries to a dictionary of Person objects as follows.
people_dic_1 = {1: {'name': 'John', 'age': '27', 'sex': 'Male', 'address': { 'door': '9-8-11', 'street': 'John-street', 'city': 'NewYork', 'zip':'99705'}},
                2: {'name': 'Marie', 'age': '22', 'sex': 'Female', 'address': { 'door': '9-8-11', 'street': 'John-street', 'city': 'NewYork', 'zip': '99705'}}}
people_dic_2 = {k:Person(**v) for k,v in people_dic_1.items()}
In the above, Person(**v) is a person object with attributes determined by the dictionary v. Notably, calling Person(**v) will only work if each key corresponds to a (correctly named) keyword argument of the __init__ method.
To go the other direction, you can call vars on a Person object to produce the kind of dictionary that you've been using. For instance, calling vars(people_dic_2[2]) after running the above code yields the dictionary
{'name': 'Marie',
 'age': '22',
 'sex': 'Female',
 'address': {'door': '9-8-11',
             'street': 'John-street',
             'city': 'NewYork',
             'zip': '99705'}}
You could similarly create a class for addresses, if you're so inclined.
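The class-based idea can be pushed a little further so that bad zip codes and unknown currencies are rejected as well. The following is only a sketch under stated assumptions: `VALID_ZIPS` and `VALID_CURRENCIES` are hypothetical stand-ins for a real post code database and currency list, and the `exchange_currency` field name is invented here. It uses the standard `dataclasses` module, whose generated `__init__` already rejects misspelled keyword arguments.

```python
from dataclasses import dataclass, field

# Hypothetical reference data; a real project would load these from a
# post code database and a currency list.
VALID_ZIPS = {'99705'}
VALID_CURRENCIES = {'USD', 'INR', 'EUR'}

@dataclass
class Person:
    name: str
    age: str
    sex: str
    address: dict = field(default_factory=dict)
    exchange_currency: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject zip codes that are not in the reference set.
        zip_code = self.address.get('zip')
        if zip_code not in VALID_ZIPS:
            raise ValueError(f'invalid zip: {zip_code!r}')
        # Both sides of every currency pair must be known codes.
        for source, target in self.exchange_currency.items():
            if source not in VALID_CURRENCIES or target not in VALID_CURRENCIES:
                raise ValueError(f'invalid currency pair: {source!r} -> {target!r}')

good = Person('John', '27', 'Male', {'zip': '99705'}, {'USD': 'INR'})

try:
    Person(name12='John', age='27', sex='Male')   # misspelled key
except TypeError as exc:
    print('rejected:', exc)

try:
    Person('John', '27', 'Male', {'zip': '000000'}, {'AAA': 'CCC'})
except ValueError as exc:
    print('rejected:', exc)
```

Note that these checks run only when the object is created; later assignments such as `good.address['zip'] = '000000'` are not validated in this sketch.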

How to check if specific keys and values are also in a dictionary?

The following is a subset of a nested dictionary that I have:
data = {
'1': {'Address': '10/3 Beevers St',
'Age': '27',
'Job': 'Doctor',
'Married': 'No',
'Name': 'John',
'Sex': 'Male',
'Suburb': 'Marine'},
'2': {'Address': '11/2 Sayers St',
'Age': '22',
'Job': 'Lawyer',
'Married': 'Yes',
'Name': 'Marie',
'Sex': 'Female',
'Suburb': 'Raffles'},
'3': {'Address': '5/1 Swamphen St',
'Age': '24',
'Job': 'Manager',
'Married': 'No',
'Name': 'Luna',
'Sex': 'Female',
'Suburb': 'Eunos'},
'4': {'Address': '25/12 Swamphen St',
'Age': '35',
'Job': 'Teacher',
'Married': 'Yes',
'Name': 'Larry',
'Sex': 'Male',
'Suburb': 'Eunos'}
}
And here is a JSON string:
json_str = '[{"Suburb": "Marine", "Address": "3 Beevers St"},\
{"Suburb": "Raffles", "Address": "11/2 Sayers St"},\
{"Suburb": "Eunos", "Address": "Swamphen St"}]'
My task is to check if a house ("Suburb" and "Address") in json_str is also in the original dataset (nested dictionary called data). If it is, then add the key/value for 'Age' and 'Name' to the JSON string for that house.
The output looks something like this:
[{'Age': 27,
'Address': '10/3 Beevers St',
'Name': 'John',
'Suburb': 'Marine'},
{'Age': 22,
'Address': '11/2 Sayers St',
'Name': 'Marie',
'Suburb': 'Raffles'}]
I was wondering if I could get some help on how to approach this question. I have tried writing some code, but I keep getting errors, so I believe my approach is incorrect:
def add_additional_info(data, json_str):
    python_dict = json.loads(json_str)
    new_dict = {}
    for some_id, sales_data in data.items():
        for houses in python_dict:
            if houses["Suburb"] == sales_data["Suburb"] and \
               houses["Address"] == sales_data["Address"]:
                new_dict[houses] = {'Age': sales_data['Age'],
                                    'Address': sales_data['Address'],
                                    'Name': sales_data['Name'],
                                    'Suburb': sales_data['Suburb']}
            else:
                new_dict.remove(houses)
    return new_dict.dumps
I think this does what you said you want, including the clarifications you mentioned in comments about ignoring duplicate matches and the specified ordering of the keys in the results.
First, the test data:
data = {
'1': {'Address': '10/3 Beevers St',
'Age': '27',
'Job': 'Doctor',
'Married': 'No',
'Name': 'John',
'Sex': 'Male',
'Suburb': 'Marine'},
'2': {'Address': '11/2 Sayers St',
'Age': '22',
'Job': 'Lawyer',
'Married': 'Yes',
'Name': 'Marie',
'Sex': 'Female',
'Suburb': 'Raffles'},
'3': {'Address': '5/1 Swamphen St',
'Age': '24',
'Job': 'Manager',
'Married': 'No',
'Name': 'Luna',
'Sex': 'Female',
'Suburb': 'Eunos'},
'4': {'Address': '25/12 Swamphen St',
'Age': '35',
'Job': 'Teacher',
'Married': 'Yes',
'Name': 'Larry',
'Sex': 'Male',
'Suburb': 'Eunos'}
}
json_str = '''[{"Suburb": "Marine", "Address": "3 Beevers St"},
{"Suburb": "Raffles", "Address": "11/2 Sayers St"},
{"Suburb": "Eunos", "Address": "Swamphen St"}]'''
The code:
import json

SUBSET = 'Age', 'Address', 'Name', 'Suburb'  # Keys from data to put in result.

def add_additional_info(data, json_str):
    houses = json.loads(json_str)
    updated_houses = []
    for house in houses:
        # Search original dataset for house and add it to results, but only if
        # there's a single match.
        found = None
        for some_id, sales_data in data.items():
            if (house["Suburb"] == sales_data["Suburb"] and
                    house["Address"] in sales_data["Address"]):  # Might ignore unit number.
                if not found:  # First match?
                    found = {key: sales_data[key] for key in SUBSET}
                else:
                    found = None  # Ignore multiple matches.
                    break  # Halt search for house.
        if found:
            updated_houses.append(found)
    return json.dumps(updated_houses, indent=4)

result = add_additional_info(data, json_str)
print(result)
Printed results:
[
{
"Age": "27",
"Address": "10/3 Beevers St",
"Name": "John",
"Suburb": "Marine"
},
{
"Age": "22",
"Address": "11/2 Sayers St",
"Name": "Marie",
"Suburb": "Raffles"
}
]
The way you're doing it now is comparing the complete dict entry for the JSON data with the complete dict entry of the other data. This will never match as these have separate keys, so the dicts are by definition different.
You need to compare exact keys for both dict entries, for example:
if houses['Address'] == sales_data['Address'] and houses['Suburb'] == sales_data['Suburb']:
Also, never delete entries from a dict you are looping through. This leads to weird results. Create a new dict to which you add the ones you need to return, and add the matching entries to that.
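The advice above (compare specific keys, and collect matches in a new container instead of deleting while looping) can be sketched as follows. This is an illustration under assumptions, not the asker's required behaviour: it matches on an exact (Suburb, Address) pair, so the sample JSON address is written out in full, and the `by_house` and `matches` names are invented here.

```python
import json

data = {
    '1': {'Address': '10/3 Beevers St', 'Age': '27', 'Name': 'John', 'Suburb': 'Marine'},
    '2': {'Address': '11/2 Sayers St', 'Age': '22', 'Name': 'Marie', 'Suburb': 'Raffles'},
}
json_str = '''[{"Suburb": "Marine", "Address": "10/3 Beevers St"},
               {"Suburb": "Eunos", "Address": "Swamphen St"}]'''

# Index the dataset once by the (Suburb, Address) pair used for matching.
by_house = {(rec['Suburb'], rec['Address']): rec for rec in data.values()}

matches = []  # a new list; neither input is modified while looping
for house in json.loads(json_str):
    rec = by_house.get((house['Suburb'], house['Address']))
    if rec is not None:
        # Copy the house and add the two extra fields from the dataset.
        matches.append({**house, 'Age': rec['Age'], 'Name': rec['Name']})

print(json.dumps(matches))
# [{"Suburb": "Marine", "Address": "10/3 Beevers St", "Age": "27", "Name": "John"}]
```

Building the index first also avoids rescanning the whole dataset for every house.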

understanding nested python dict comprehension

I am getting along with dict comprehensions and trying to understand how the below 2 dict comprehensions work:
select_vals = ['name', 'pay']
test_dict = {'data': [{'name': 'John', 'city': 'NYC', 'pay': 70000}, {'name': 'Mike', 'city': 'NYC', 'pay': 80000}, {'name': 'Kate', 'city': 'Houston', 'pay': 65000}]}
dict_comp1 = [{key: item[key] for key in select_vals } for item in test_dict['data'] if item['pay'] > 65000 ]
The above line gets me
[{'name': 'John', 'pay': 70000}, {'name': 'Mike', 'pay': 80000}]
dict_comp2 = [{key: item[key]} for key in select_vals for item in test_dict['data'] if item['pay'] > 65000 ]
The above line gets me
[{'name': 'John'}, {'name': 'Mike'}, {'pay': 70000}, {'pay': 80000}]
How do the two outputs differ when written as for loops? When I write it as a for loop:
dict_comp3 = []
for key in select_vals:
    for item in test_dict['data']:
        if item['pay'] > 65000:
            dict_comp3.append({key: item[key]})
print(dict_comp3)
The loop above gives me the same result as dict_comp2:
[{'name': 'John'}, {'name': 'Mike'}, {'pay': 70000}, {'pay': 80000}]
How do I get the output of dict_comp1 with a for loop?
The iteration over select_vals should be the inner one:
result = []
for item in test_dict['data']:
    if item['pay'] > 65000:
        aux = {}
        for key in select_vals:
            aux[key] = item[key]
        result.append(aux)
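The difference between the two comprehensions comes down to where the second `for` sits. A short sketch with a cut-down two-row dataset (the `rows`, `nested` and `flat` names are illustrative):

```python
select_vals = ['name', 'pay']
rows = [
    {'name': 'John', 'city': 'NYC', 'pay': 70000},
    {'name': 'Kate', 'city': 'Houston', 'pay': 65000},
]

# Nested: the inner {...} is a complete dict comprehension evaluated once
# per row, so each row yields ONE dict holding every selected key.
nested = [{key: row[key] for key in select_vals} for row in rows if row['pay'] > 65000]

# Flat: two `for` clauses in one list comprehension act like nested loops,
# read left to right, so each (key, row) pair yields its OWN one-key dict.
flat = [{key: row[key]} for key in select_vals for row in rows if row['pay'] > 65000]

print(nested)  # [{'name': 'John', 'pay': 70000}]
print(flat)    # [{'name': 'John'}, {'pay': 70000}]
```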

Removing duplicates of a sorted python list while iterating

I have sorted data as follows. I want to compare the entries and remove any duplicates. Here I do a simple comparison of fields to test the code. The original requirement is a more complex comparison, so I need to compare each entry with its successor explicitly.
The comparison is not that simple; this is just to show what I am trying to achieve. There are several fields that need to be compared (but NOT all). If they have the same values, the previous entry should be removed and the newer one, which has an incremental number, should be kept. Hence an explicit comparison is required. What is the problem with pop() and append(), even though I don't iterate over the deque?
I used both a list and a deque, but the duplicates are still there. Is anything wrong with the code?
import collections
data = [
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 29},
]
dq = collections.deque()
for i in range(1, len(data)):
    prev_name = data[i-1]['name']
    prev_age = data[i-1]['age']
    next_name = data[i]['name']
    next_age = data[i]['age']
    dq.append(data[i-1])
    if prev_name == next_name and prev_age == next_age:
        dq.pop()
        dq.append(data[i])
    else:
        dq.append(data[i])
print(dq)
Output (actual): deque([{'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 29}])
Output (expected): deque([{'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 29}])
The problem with your code is that you append the previous element first, and then, if the current and previous elements are the same, you remove the last element. What you are not considering is that once you add the current element after removing the previous one in:
dq.pop()
dq.append(data[i])
in the next iteration you add that same element again in:
dq.append(data[i-1])
So when the "if" condition is satisfied, pop() only removes the data[i-1] that was just appended, not the copy that was added to dq in the previous iteration. That earlier copy stays in dq, which is why the same element ends up duplicated.
You can try this code:
import collections
data = [
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 29},
{'name': 'Atomic', 'age': 29},
{'name': 'Atomic', 'age': 30},
]
dq = collections.deque()
dq.append(data[0])
for i in range(1, len(data)):
    prev_name = dq[-1]['name']
    prev_age = dq[-1]['age']
    next_name = data[i]['name']
    next_age = data[i]['age']
    if prev_name == next_name and prev_age == next_age:
        continue
    else:
        dq.append(data[i])
print(dq)
Output:
deque([{'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 29}, {'name': 'Atomic', 'age': 30}])
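Since the question asks to compare only some fields and to keep the newer entry of each duplicate run, `itertools.groupby` may be a compact alternative for sorted data. This is a sketch, not the answerer's method; the `seq` field is a hypothetical incremental number added to make "newer" visible.

```python
from itertools import groupby
from operator import itemgetter

# 'seq' is a hypothetical incremental field that is NOT part of the comparison.
data = [
    {'name': 'Atomic', 'age': 28, 'seq': 1},
    {'name': 'Atomic', 'age': 28, 'seq': 2},
    {'name': 'Atomic', 'age': 29, 'seq': 3},
    {'name': 'Atomic', 'age': 29, 'seq': 4},
    {'name': 'Atomic', 'age': 30, 'seq': 5},
]

# Only these fields decide whether two neighbouring records are duplicates.
same_record = itemgetter('name', 'age')

# groupby batches consecutive items with equal keys; keep the last of each run.
deduped = [list(run)[-1] for _, run in groupby(data, key=same_record)]

print([d['seq'] for d in deduped])  # [2, 4, 5]
```

`groupby` only merges neighbouring items, so this relies on the data being sorted by the compared fields, as in the question.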
You can try this code :
import collections
data = [
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 29},
]
dq = collections.deque()
for i in range(0, len(data)):
    if data[i] not in dq:
        dq.append(data[i])
print(dq)
output:
deque([{'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 29}])
data = [
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 28},
{'name': 'Atomic', 'age': 29},
]
unique = set((tuple(x.items()) for x in data))
print([dict(x) for x in unique])
[{'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 29}]
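Two things are worth knowing about the set-based version: a set does not promise any particular order for the result, and `tuple(x.items())` treats two dicts with the same content but different key order as different. A minimal order-preserving sketch, assuming all values are hashable (the `fingerprint` and `seen` names are illustrative):

```python
data = [
    {'name': 'Atomic', 'age': 28},
    {'age': 28, 'name': 'Atomic'},   # same content, different key order
    {'name': 'Atomic', 'age': 29},
    {'name': 'Atomic', 'age': 28},
]

seen = set()
unique = []
for record in data:
    # Sorting the items makes the fingerprint independent of key order.
    fingerprint = tuple(sorted(record.items()))
    if fingerprint not in seen:
        seen.add(fingerprint)
        unique.append(record)   # first occurrence wins, input order is kept

print(unique)
# [{'name': 'Atomic', 'age': 28}, {'name': 'Atomic', 'age': 29}]
```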

Extract multiple key:value pairs from one dict to a new dict

I have a list of dicts with some data, and I would like to extract certain key:value pairs into a new list of dicts. I know one way that I could do this would be to use del i['unwantedKey'], however, I would rather not delete any data but instead create a new dict with the needed data.
The column order might change, so I need something to extract the two key:value pairs from the larger dict into a new dict.
Current Data Format
[{'Speciality': 'Math', 'Name': 'Matt', 'Location': 'Miami'},
{'Speciality': 'Science', 'Name': 'Ben', 'Location': 'Las Vegas'},
{'Speciality': 'Language Arts', 'Name': 'Sarah', 'Location': 'Washington DC'},
{'Speciality': 'Spanish', 'Name': 'Tom', 'Location': 'Denver'},
{'Speciality': 'Chemistry', 'Name': 'Jim', 'Location': 'Dallas'}]
Code to delete key:value from dict
import csv
data = []
with open('data.csv', newline='') as csv_file:
    for line in csv.DictReader(csv_file):
        data.append(line)
for i in data:
    del i['Speciality']
print(data)
Desired Data Format without using del i['Speciality']
[{'Name': 'Matt', 'Location': 'Miami'},
{'Name': 'Ben', 'Location': 'Las Vegas'},
{'Name': 'Sarah', 'Location': 'Washington DC'},
{'Name': 'Tom', 'Location': 'Denver'},
{'Name': 'Jim', 'Location': 'Dallas'}]
If you want to give a positive list of keys to copy over into the new dictionaries:
import csv
with open('data.csv', newline='') as csv_file:
    data = list(csv.DictReader(csv_file))
keys = ['Name', 'Location']
new_data = [dict((k, d[k]) for k in keys) for d in data]
print(new_data)
Suppose we have:
l1 = [{'Location': 'Miami', 'Name': 'Matt', 'Speciality': 'Math'},
      {'Location': 'Las Vegas', 'Name': 'Ben', 'Speciality': 'Science'},
      {'Location': 'Washington DC', 'Name': 'Sarah', 'Speciality': 'Language Arts'},
      {'Location': 'Denver', 'Name': 'Tom', 'Speciality': 'Spanish'},
      {'Location': 'Dallas', 'Name': 'Jim', 'Speciality': 'Chemistry'}]
To create a new list of dictionaries that do not contain the key 'Speciality', we can do:
l2 = []
for oldd in l1:
    newd = {}
    for k,v in oldd.items():
        if k != 'Speciality':
            newd[k] = v
    l2.append(newd)
Now l2 will be your desired output. In general, you can exclude an arbitrary list of keys like so:
exclude_keys = ['Speciality', 'Name']
l2 = []
for oldd in l1:
    newd = {}
    for k,v in oldd.items():
        if k not in exclude_keys:
            newd[k] = v
    l2.append(newd)
The same can be done with an include_keys variable:
include_keys = ['Name', 'Location']
l2 = []
for oldd in l1:
    newd = {}
    for k,v in oldd.items():
        if k in include_keys:
            newd[k] = v
    l2.append(newd)
You can create a new list of dicts limited to the keys you want with one line of code (Python 2.7+, where dict comprehensions were introduced):
NLoD=[{k:d[k] for k in ('Name', 'Location')} for d in LoD]
Try it:
>>> LoD=[{'Speciality': 'Math', 'Name': 'Matt', 'Location': 'Miami'},
{'Speciality': 'Science', 'Name': 'Ben', 'Location': 'Las Vegas'},
{'Speciality': 'Language Arts', 'Name': 'Sarah', 'Location': 'Washington DC'},
{'Speciality': 'Spanish', 'Name': 'Tom', 'Location': 'Denver'},
{'Speciality': 'Chemistry', 'Name': 'Jim', 'Location': 'Dallas'}]
>>> [{k:d[k] for k in ('Name', 'Location')} for d in LoD]
[{'Name': 'Matt', 'Location': 'Miami'}, {'Name': 'Ben', 'Location': 'Las Vegas'}, {'Name': 'Sarah', 'Location': 'Washington DC'}, {'Name': 'Tom', 'Location': 'Denver'}, {'Name': 'Jim', 'Location': 'Dallas'}]
Since you are using csv, you can limit the columns that you read in the first place to the desired columns so you do not need to delete the undesired data:
import csv
dc=('Name', 'Location')
with open(fn) as f:  # fn is the path to the CSV file
    reader=csv.DictReader(f)
    LoD=[{k:row[k] for k in dc} for row in reader]
keys_lst = ['Name', 'Location']
# event stands for a single source dict here
new_data={key:val for key,val in event.items() if key in keys_lst}
print(new_data)
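The answers above differ in how they react to a row that lacks one of the wanted keys: the `d[k]` forms raise `KeyError`, while the `if key in keys_lst` filter simply leaves the key out. A small sketch of the two lenient options, using a hypothetical row without 'Location' (the `rows`, `wanted`, `skip_missing` and `fill_missing` names are invented for illustration):

```python
rows = [
    {'Speciality': 'Math', 'Name': 'Matt', 'Location': 'Miami'},
    {'Speciality': 'Science', 'Name': 'Ben'},          # no 'Location'
]
wanted = ('Name', 'Location')

# Lenient variant 1: skip keys the row does not have.
skip_missing = [{k: d[k] for k in wanted if k in d} for d in rows]

# Lenient variant 2: always emit every wanted key, filling gaps with None.
fill_missing = [{k: d.get(k) for k in wanted} for d in rows]

print(skip_missing)
# [{'Name': 'Matt', 'Location': 'Miami'}, {'Name': 'Ben'}]
print(fill_missing)
# [{'Name': 'Matt', 'Location': 'Miami'}, {'Name': 'Ben', 'Location': None}]
```

Which variant fits depends on whether downstream code expects every dict to have the same keys.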
