I am looking to read one list that consists of column names and another list of lists that consists of the data to be mapped to those columns. Each list in the list of lists is one row of data to later be pushed into the database.
I've tried to use the following code to join these two lists:
dict(zip(column_names, data)) but I receive an error:
TypeError: unhashable type: 'list'
How would I join a list of lists and another list together into a dict?
column_names = ['id', 'first_name', 'last_name', 'city', 'dob']
data = [
['1', 'Mike', 'Walters', 'New York City', '1998-12-01'],
['2', 'Daniel', 'Strange', 'Baltimore', '1992-08-12'],
['3', 'Sarah', 'McNeal', 'Miami', '1990-05-05'],
['4', 'Steve', 'Breene', 'Philadelphia', '1988-02-06']
]
The result I'm seeking is:
dict_items = {{'id': '1', 'first_name': 'Mike', 'last_name': 'Walters',
'city': 'New York City', 'dob': '1998-12-01'},
{'id': '2', ...}}
Later I'm looking to push this dict of dicts to the database with SQLAlchemy.
You can create a list of dictionaries, one per row, like this:
result = [dict(zip(column_names, row)) for row in data]
Note that the outer brackets are square, not curly as you specified: the result is a list of dicts.
zip alone will not work in your case, because it maps its input arguments one to one.
Zip Documentation
Demo:
>>> l1 = ["key01", "key02", "key03"]
>>> l2 = ["value01", "value02", "value03"]
>>> list(zip(l1, l2))
[('key01', 'value01'), ('key02', 'value02'), ('key03', 'value03')]
>>> dict(zip(l1, l2))
{'key01': 'value01', 'key02': 'value02', 'key03': 'value03'}
>>>
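As a minimal sketch of the one-to-one pairing described above (the shortened column list and rows here are illustrative, not the full data from the question): zipping the column names with the outer list pairs each name with a whole row, while zipping them with a single row pairs each name with one value.

```python
column_names = ['id', 'first_name', 'last_name']
data = [
    ['1', 'Mike', 'Walters'],
    ['2', 'Daniel', 'Strange'],
]

# Zipping the names with the outer list pairs each name with a whole row,
# and stops after two pairs because data has only two rows.
pairs = list(zip(column_names, data))

# Zipping the names with a single row pairs each name with one value.
first_row = dict(zip(column_names, data[0]))

print(pairs)
print(first_row)
```

This is why the zip call has to be applied to each row separately rather than to the list of rows as a whole.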
Use normal iteration and the list append method to create the final output:
Demo:
>>> list_data_items = []
>>> for item in data:
...     list_data_items.append(dict(zip(column_names, item)))
...
All the other answers above worked fine. Just for the sake of completeness you could also use pandas (and it might be convenient if your data is coming from say a csv file).
Just create a data frame with your data and then convert it to dict:
import pandas as pd
df = pd.DataFrame(data, columns=column_names)
df.to_dict(orient='records')
Two simple for-loops:
column_names = ['id', 'first_name', 'last_name', 'city', 'dob']
data = [
['1', 'Mike', 'Walters', 'New York City', '1998-12-01'],
['2', 'Daniel', 'Strange', 'Baltimore', '1992-08-12'],
['3', 'Sarah', 'McNeal', 'Miami', '1990-05-05'],
['4', 'Steve', 'Breene', 'Philadelphia', '1988-02-06']
]
db_result = []
for data_row in data:
    new_db_row = {}
    for i, data_value in enumerate(data_row):
        new_db_row[column_names[i]] = data_value
    db_result.append(new_db_row)
print(db_result)
The first for statement loops over all data rows.
The second uses enumerate to get both the index (i) and the data_value of each row. The index is used to look up the column name in the list column_names.
I hope this explanation does not make it more complicated.
The printed result follows:
[{'id': '1', 'first_name': 'Mike', 'last_name': 'Walters', 'city': 'New York City', 'dob': '1998-12-01'}, {'id': '2', 'first_name': 'Daniel', 'last_name': 'Strange', 'city': 'Baltimore', 'dob': '1992-08-12'}, {'id': '3', 'first_name': 'Sarah', 'last_name': 'McNeal', 'city': 'Miami', 'dob': '1990-05-05'}, {'id': '4', 'first_name': 'Steve', 'last_name': 'Breene', 'city': 'Philadelphia', 'dob': '1988-02-06'}]
Since you want to construct multiple dictionaries, you have to zip your column names with each list in data and pass the result to the dict constructor. Your result dict_items also needs to be a collection that can store unhashable types such as dictionaries. We cannot use a set for this (which you say you are seeking), but we can use a list (or a tuple).
Employ a simple list comprehension in order to build one dictionary for each sublist in data.
>>> [dict(zip(column_names, sublist)) for sublist in data]
[{'dob': '1998-12-01', 'city': 'New York City', 'first_name': 'Mike', 'last_name': 'Walters', 'id': '1'}, {'dob': '1992-08-12', 'city': 'Baltimore', 'first_name': 'Daniel', 'last_name': 'Strange', 'id': '2'}, {'dob': '1990-05-05', 'city': 'Miami', 'first_name': 'Sarah', 'last_name': 'McNeal', 'id': '3'}, {'dob': '1988-02-06', 'city': 'Philadelphia', 'first_name': 'Steve', 'last_name': 'Breene', 'id': '4'}]
I also assumed that {'id':'2'} in your expected result is a typo.
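To illustrate the point about sets, here is a small sketch (the row below is a shortened, illustrative record): a dict is mutable and unhashable, so it cannot be stored in a set, whereas a list or tuple accepts it.

```python
row = {'id': '1', 'first_name': 'Mike'}

# A dict is mutable and therefore unhashable, so it cannot be a set element.
try:
    rows_as_set = set([row])
    set_failed = False
except TypeError:
    set_failed = True

# A list (or a tuple) has no such restriction.
rows_as_list = [row]
rows_as_tuple = (row,)

print(set_failed)
```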
Using Pandas:
>>> column_names
['id', 'first_name', 'last_name', 'city', 'dob']
>>> data
[['1', 'Mike', 'Walters', 'New York City', '1998-12-01'], ['2', 'Daniel', 'Strange', 'Baltimore', '1992-08-12'], ['3', 'Sarah', 'McNeal', 'Miami', '1990-05-05'], ['4', 'Steve', 'Breene', 'Philadelphia', '1988-02-06']]
>>> import pandas as pd
>>> list(pd.DataFrame(data, columns=column_names).T.to_dict().values())
[{'dob': '1998-12-01', 'city': 'New York City', 'first_name': 'Mike', 'last_name': 'Walters', 'id': '1'}, {'dob': '1992-08-12', 'city': 'Baltimore', 'first_name': 'Daniel', 'last_name': 'Strange', 'id': '2'}, {'dob': '1990-05-05', 'city': 'Miami', 'first_name': 'Sarah', 'last_name': 'McNeal', 'id': '3'}, {'dob': '1988-02-06', 'city': 'Philadelphia', 'first_name': 'Steve', 'last_name': 'Breene', 'id': '4'}]
column_names = ['id', 'first_name', 'last_name', 'city', 'dob']
data = [
['1', 'Mike', 'Walters', 'New York City', '1998-12-01'],
['2', 'Daniel', 'Strange', 'Baltimore', '1992-08-12'],
['3', 'Sarah', 'McNeal', 'Miami', '1990-05-05'],
['4', 'Steve', 'Breene', 'Philadelphia', '1988-02-06']
]
destinationList = []
for value in data:
    destinationList.append(dict(zip(column_names, value)))
print(destinationList)
#
# zip(column_names,value)
# [('id', '1'), ('first_name', 'Mike'), ('last_name', 'Walters'), ('city', 'New York City'), ('dob', '1998-12-01')]
# dict(zip(column_names,value))
# {'last_name': 'Walters', 'dob': '1998-12-01','id': '1','first_name': 'Mike','city': 'New York City'}
Related
I have three different lists of dictionaries, as shown below. All three share the same first name and last name. I need to combine these lists into one without duplicating the first name and last name; that is, for each first name and last name, a single dictionary combining the entries from all three lists:
list one
[{'First Name': 'Justin',
'lastName': 'Walker',
'Age (Years)': '29',
'Sex': 'Male',
'Vehicle Make': 'Toyota',
'Vehicle Model': 'Continental',
'Vehicle Year': '2012',
'Vehicle Type': 'Sedan'},
{'First Name': 'Maria',
'lastName': 'Jones',
'Age (Years)': '66',
'Sex': 'Female',
'Vehicle Make': 'Mitsubishi',
'Vehicle Model': 'Yukon XL 2500',
'Vehicle Year': '2014',
'Vehicle Type': 'Van/Minivan'},
{'First Name': 'Samantha',
'lastName': 'Norman',
'Age (Years)': '19',
'Sex': 'Female',
'Vehicle Make': 'Aston Martin',
'Vehicle Model': 'Silverado 3500 HD Regular Cab',
'Vehicle Year': '1995',
'Vehicle Type': 'SUV'}]
list two
[{'firstName': 'Justin',
'lastName': 'Walker',
'age': 71,
'iban': 'GB43YKET96816855547287',
'credit_card_number': '2221597849919620',
'credit_card_security_code': '646',
'credit_card_start_date': '03/18',
'credit_card_end_date': '06/26',
'address_main': '462 Marilyn radial',
'address_city': 'Lynneton',
'address_postcode': 'W4 0GW'},
{'firstName': 'Maria',
'lastName': 'Jones',
'age': 91,
'iban': 'GB53QKRK45175204753504',
'credit_card_number': '4050437758955103343',
'credit_card_security_code': '827',
'credit_card_start_date': '11/21',
'credit_card_end_date': '01/27',
'address_main': '366 Brenda radial',
'address_city': 'Ritafurt',
'address_postcode': 'NE85 1RG'}]
list three
[{'firstName': 'Justin',
'lastName': 'Walker',
'age': '64',
'sex': 'Male',
'retired': 'False',
'dependants': '2',
'marital_status': 'single',
'salary': '56185',
'pension': '0',
'company': 'Hudson PLC',
'commute_distance': '14.1',
'address_postcode': 'G2J 0FH'},
{'firstName': 'Maria',
'lastName': 'Jones',
'age': '69',
'sex': 'Female',
'retired': 'False',
'dependants': '1',
'marital_status': 'divorced',
'salary': '36872',
'pension': '0',
'company': 'Wall, Reed and Whitehouse',
'commute_distance': '10.47',
'address_postcode': 'TD95 7FL'}]
This is what I am trying, but it is incomplete and does not work:
for i in range(0,2):
    dict1 = list_one[i]
    dict2 = list_two[i]
    dict3 = list_three[i]
    combine_file = list_three.copy()
    for k, v in dict1.items():
        if k == "firstname" or "lastname":
            for k1, v1 in combine_file.items():
                if dict1.get(k) == combine_file.v1:
This is what I'm expecting
print(combine_file)
{'firstName': 'Justin',
'lastName': 'Walker',
'age': '64',
'sex': 'Male',
'retired': 'False',
'dependants': '2',
'marital_status': 'single',
'salary': '56185',
'pension': '0',
'company': 'Hudson PLC',
'commute_distance': '14.1',
'iban': 'GB43YKET96816855547287',
'credit_card_number': '2221597849919620',
'credit_card_security_code': '646',
'credit_card_start_date': '03/18',
'credit_card_end_date': '06/26',
'address_main': '462 Marilyn radial',
'address_city': 'Lynneton',
'address_postcode': 'W4 0GW',
'Vehicle Make': 'Mitsubishi',
'Vehicle Model': 'Yukon XL 2500',
'Vehicle Year': '2014',
'Vehicle Type': 'Van/Minivan'},
{'firstName': 'Maria',
'lastName': 'Jones',
'age': '69',
'sex': 'Female',
'retired': 'False',
'dependants': '1',
'marital_status': 'divorced',
'salary': '36872',
'pension': '0',
'company': 'Wall, Reed and Whitehouse',
'commute_distance': '10.47',
'iban': 'GB53QKRK45175204753504',
'credit_card_number': '4050437758955103343',
'credit_card_security_code': '827',
'credit_card_start_date': '11/21',
'credit_card_end_date': '01/27',
'address_main': '366 Brenda radial',
'address_city': 'Ritafurt',
'address_postcode': 'NE85 1RG',
'Vehicle Make': 'Aston Martin',
'Vehicle Model': 'Silverado 3500 HD Regular Cab',
'Vehicle Year': '1995',
'Vehicle Type': 'SUV'}
Create a new dictionary keyed on a composite of the first and last name (the first name is stored under either 'firstName' or 'First Name'). Then you can do this:
master = {}
for _list in list_one, list_two, list_three:
    for d in _list:
        # list one uses 'First Name'; the other lists use 'firstName'
        if not (firstname := d.get('firstName')):
            firstname = d['First Name']
        name_key = f'{firstname}_{d["lastName"]}'
        for k, v in d.items():
            master.setdefault(name_key, {})[k] = v
print(list(master.values()))
Python's dict.update() functionality might be what you are looking for.
For example:
dict1 = { 'a' : 0,
'b' : 1,
'c' : 2}
dict2 = { 'c' : 0,
'd' : 1,
'e' : 2}
dict2.update(dict1)
dict2 is now:
{'c': 2, 'd': 1, 'e': 2, 'a': 0, 'b': 1}
Notice how 'c' was overwritten with the updated value from dict1.
You can't merge dictionaries that belong to different people into one, but if you run through your lists beforehand you can group the dictionaries so that each group belongs to one person.
You can create a new dictionary, called people, and then iterate through your lists of dictionaries and extract the person's name from those dictionaries and turn it into a key in the new "people" dictionary.
If that person's name is not in people yet, you can add that dictionary, so that people[name] points to that dictionary.
If people[name] does exist, then you can use the people[name].update() function on the new dictionary to add the new values.
After this process, you will have a dictionary whose keys are the names of people and the values point to a dictionary containing those people's attributes.
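The steps above could be sketched roughly as follows. The function name merge_by_person, the tuple key, and the shortened sample records are assumptions made for illustration; the sketch also assumes the first name is stored under either 'firstName' or 'First Name', as in the question's data.

```python
def merge_by_person(*lists_of_dicts):
    """Group dictionaries by (first name, last name) and merge each group."""
    people = {}
    for dicts in lists_of_dicts:
        for d in dicts:
            # Assumption: the first name is under 'firstName' or 'First Name'.
            first = d.get('firstName', d.get('First Name'))
            name = (first, d['lastName'])
            if name not in people:
                people[name] = dict(d)  # copy so the input is not mutated
            else:
                people[name].update(d)  # later lists overwrite shared keys
    return people


# Shortened, illustrative records (not the full data from the question).
list_one = [{'First Name': 'Justin', 'lastName': 'Walker', 'Vehicle Make': 'Toyota'}]
list_two = [{'firstName': 'Justin', 'lastName': 'Walker', 'age': 71}]
list_three = [{'firstName': 'Justin', 'lastName': 'Walker', 'age': '64'}]

people = merge_by_person(list_one, list_two, list_three)
print(people[('Justin', 'Walker')])
```

Because update overwrites existing keys, the order in which the lists are passed decides which value wins for shared keys such as 'age'.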
Imagine I have the following list of dictionaries. For every record (row of data), I want to merge the dictionaries of sub-fields into a single dictionary, so that in the end I have a list of dictionaries, one per record.
Data = [{'Name': 'bob', 'age': '40'},
{'Name': 'tom', 'age': '30'},
{'Country': 'US', 'City': 'Boston'},
{'Country': 'US', 'City': 'New York'},
{'Email': 'bob#fake.com', 'Phone': 'bob phone'},
{'Email': 'tom#fake.com', 'Phone': 'none'}]
Output = [
{'Name': 'bob', 'age': '40', 'Country': 'US', 'City': 'Boston', 'Email': 'bob#fake.com', 'Phone': 'bob phone'},
{'Name': 'tom', 'age': '30', 'Country': 'US', 'City': 'New York', 'Email': 'tom#fake.com', 'Phone': 'none'}
]
Related: How do I merge a list of dicts into a single dict?
I understand you know which dictionary relates to Bob and which dictionary relates to Tom by their position: dictionaries at even positions relate to Bob, while dictionaries at odd positions relate to Tom.
You can check whether a number is odd or even using % 2:
Data = [{'Name': 'bob', 'age': '40'},
{'Name': 'tom', 'age': '30'},
{'Country': 'US', 'City': 'Boston'},
{'Country': 'US', 'City': 'New York'},
{'Email': 'bob#fake.com', 'Phone': 'bob phone'},
{'Email': 'tom#fake.com', 'Phone': 'none'}]
bob_dict = {}
tom_dict = {}
for i, d in enumerate(Data):
    if i % 2 == 0:
        bob_dict.update(d)
    else:
        tom_dict.update(d)
Output = [bob_dict, tom_dict]
Or alternatively:
Output = [{}, {}]
for i, d in enumerate(Data):
    Output[i % 2].update(d)
This second approach is not only shorter to write, it's also faster to execute and easier to scale if you have more than 2 people.
Splitting the list into more than 2 dictionaries
k = 4 # number of dictionaries you want
Data = [{'Name': 'Alice', 'age': '40'},
{'Name': 'Bob', 'age': '30'},
{'Name': 'Charlie', 'age': '30'},
{'Name': 'Diane', 'age': '30'},
{'Country': 'US', 'City': 'Boston'},
{'Country': 'US', 'City': 'New York'},
{'Country': 'UK', 'City': 'London'},
{'Country': 'UK', 'City': 'Oxford'},
{'Email': 'alice#fake.com', 'Phone': 'alice phone'},
{'Email': 'bob#fake.com', 'Phone': '12345'},
{'Email': 'charlie#fake.com', 'Phone': '0000000'},
{'Email': 'diane#fake.com', 'Phone': 'none'}]
Output = [{} for j in range(k)]
for i, d in enumerate(Data):
    Output[i % k].update(d)
# Output = [
# {'Name': 'Alice', 'age': '40', 'Country': 'US', 'City': 'Boston', 'Email': 'alice#fake.com', 'Phone': 'alice phone'},
# {'Name': 'Bob', 'age': '30', 'Country': 'US', 'City': 'New York', 'Email': 'bob#fake.com', 'Phone': '12345'},
# {'Name': 'Charlie', 'age': '30', 'Country': 'UK', 'City': 'London', 'Email': 'charlie#fake.com', 'Phone': '0000000'},
# {'Name': 'Diane', 'age': '30', 'Country': 'UK', 'City': 'Oxford', 'Email': 'diane#fake.com', 'Phone': 'none'}
#]
Additionally, instead of hardcoding k = 4:
If you know the number of fields but not the number of people, you can compute k by dividing the initial number of dictionaries by the number of dictionary types:
fields = ['Name', 'Country', 'Email']
assert(len(Data) % len(fields) == 0) # make sure Data is consistent with number of fields
k = len(Data) // len(fields)
Or alternatively, you can compute k by counting how many dictionaries contain the 'Name' field:
k = sum(1 for d in Data if 'Name' in d)
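Putting the pieces together, a possible sketch that computes k from the 'Name' field and then merges by position. The function name merge_interleaved and its marker parameter are hypothetical, and the sketch assumes the dictionaries are interleaved in the same person order for every field group.

```python
def merge_interleaved(data, marker='Name'):
    """Merge positionally interleaved dictionaries into one dict per record."""
    # Count the records by counting the dictionaries that carry the marker key.
    k = sum(1 for d in data if marker in d)
    if k == 0 or len(data) % k != 0:
        raise ValueError('data is not consistent with the number of records')
    output = [{} for _ in range(k)]
    for i, d in enumerate(data):
        output[i % k].update(d)
    return output


Data = [{'Name': 'bob', 'age': '40'},
        {'Name': 'tom', 'age': '30'},
        {'Country': 'US', 'City': 'Boston'},
        {'Country': 'US', 'City': 'New York'},
        {'Email': 'bob#fake.com', 'Phone': 'bob phone'},
        {'Email': 'tom#fake.com', 'Phone': 'none'}]

Output = merge_interleaved(Data)
print(Output)
```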
def remove_repeated_lines(data):
    lines_seen = set()  # holds lines already seen
    d = []
    for t in data:
        if t not in lines_seen:  # check if line is not duplicate
            d.append(t)
            lines_seen.add(t)
    return d
a=[{'name': 'paul', 'age': '26.', 'hometown': 'AU', 'gender': 'male'},
{'name': 'mei', 'age': '26.', 'hometown': 'NY', 'gender': 'female'},
{'name': 'smith', 'age': '16.', 'hometown': 'NY', 'gender': 'male'},
{'name': 'raj', 'age': '13.', 'hometown': 'IND', 'gender': 'male'}]
age = []
for line in a:
    for key, value in line.items():
        if key == 'age':
            age.append(remove_repeated_lines(value.replace('.', '___')))
print(age)
the output is
[['2', '6', '___'], ['2', '6', '___'], ['1', '6', '___'], ['1', '3', '___']]
My desired output is ['26___','16___','13___']
Here is my code to remove repeated lines from the values of a dictionary. After I run the code, the repeated lines are not removed.
In [37]: a=[{'name': 'paul', 'age': '26.', 'hometown': 'AU', 'gender': 'male'},
...: {'name': 'mei', 'age': '26.', 'hometown': 'NY', 'gender': 'female'},
...: {'name': 'smith', 'age': '16.', 'hometown': 'NY', 'gender': 'male'},
...: {'name': 'raj', 'age': '13.', 'hometown': 'IND', 'gender': 'male'}]
In [40]: set(i["age"].replace(".","")+"_" for i in a)
Out[40]: {'13_', '16_', '26_'}
You can use a set comprehension to do it concisely and readably (note that a set is unordered, so the order of the resulting list may vary):
age = list({
    line['age'].replace('.', '___')
    for line in a
    if 'age' in line
})
Output:
['26___', '16___', '13___']
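If the order of first appearance matters, one option is dict.fromkeys, which keeps insertion order in Python 3.7 and later. This is a sketch rather than the original answer's method, and the records are shortened to the two keys that matter here. It also sidesteps the problem in the question's code, where the helper was called on each string and therefore iterated over its characters.

```python
a = [{'name': 'paul', 'age': '26.'},
     {'name': 'mei', 'age': '26.'},
     {'name': 'smith', 'age': '16.'},
     {'name': 'raj', 'age': '13.'}]

# dict.fromkeys drops duplicates while keeping the first occurrence of each key.
age = list(dict.fromkeys(
    line['age'].replace('.', '___')
    for line in a
    if 'age' in line
))

print(age)  # ['26___', '16___', '13___']
```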
How can I convert this JSON into a DataFrame in Python, dropping the fields entry? I just need the employees data in my DataFrame.
{'fields': [{'id': 'displayName', 'type': 'text', 'name': 'Display name'},
{'id': 'firstName', 'type': 'text', 'name': 'First name'},
{'id': 'gender', 'type': 'gender', 'name': 'Gender'}],
'employees': [{'id': '123', 'displayName': 'abc', 'firstName': 'abc','gender': 'Female'},
{'id': '234', 'displayName': 'xyz.', 'firstName': 'xyz','gender': 'Female'},
{'id': '345', 'displayName': 'pqr', 'firstName': 'pqr', 'gender': 'Female'}]}
If you want the employee information, you can index the dictionary by its 'employees' key:
JSON = {var:[...],'employees':[{}]}
employee_info = JSON['employees']
employee_info will be a list of dictionaries, from which you can create a DataFrame as described in this answer: Convert list of dictionaries to a pandas DataFrame
I'm making a script with Python and have one small question.
I have 2 lists:
['name', 'age', 'sex', 'addr', 'city']
['Jack 24 male no23 NY', 'Jane 25 female no24 NY', 'Dane 14 male no14 NY']
So I want to have:
dictofJack = {'name': 'Jack', 'age': '24', 'sex': 'male', 'addr': 'no23', 'city':'NY'}
dictofJane = {'name': 'Jane', 'age': '25', 'sex': 'female', 'addr': 'no24', 'city':'NY'}
dictofDane = {'name': 'Dane', 'age': '14', 'sex': 'male', 'addr': 'no14', 'city':'NY'}
In this case, how can I use zip to make it get the dictionaries automatically in a for loop?
Using a list comprehension or generator expression:
>>> header = ['name', 'age', 'sex', 'addr', 'city']
>>> values = ['Jack 24 male no23 NY',
...           'Jane 25 female no24 NY',
...           'Dane 14 male no14 NY']
>>> dictofJack, dictofJane, dictofDane = (
...     dict(zip(header, value.split())) for value in values
... )
>>> dictofJack
{'addr': 'no23', 'age': '24', 'city': 'NY', 'name': 'Jack', 'sex': 'male'}
>>> dictofJane
{'addr': 'no24', 'age': '25', 'city':'NY', 'name': 'Jane', 'sex': 'female'}
>>> dictofDane
{'addr': 'no14', 'age': '14', 'city': 'NY', 'name': 'Dane', 'sex': 'male'}
BTW, instead of making multiple dictionary variables, I recommend using a dictionary of dictionaries (think of a case where 100 dictionaries are required), built with a dictionary comprehension:
>>> {value.split()[0]: dict(zip(header, value.split())) for value in values}
{'Jane': {'addr': 'no24', 'age': '25', 'city': 'NY', 'name': 'Jane', 'sex': 'female'},
'Dane': {'addr': 'no14', 'age': '14', 'city': 'NY', 'name': 'Dane', 'sex': 'male'},
'Jack': {'addr': 'no23', 'age': '24', 'city': 'NY', 'name': 'Jack', 'sex': 'male'}}