Dictionary transformation and counter - python

Object:
data = [{'key': 11, 'country': 'USA'},{'key': 21, 'country': 'Canada'},{'key': 12, 'country': 'USA'}]
the result should be:
{'USA': {0: {'key':11}, 1: {'key': 12}}, 'Canada': {0: {'key':21}}}
I started experimenting with:
result = {}
for i in data:
    k = 0
    result[i['country']] = dict(k = dict(key=i['key']))
and I get:
{'Canada': {'k': {'key': 21}}, 'USA': {'k': {'key': 12}}}
So how can I use the counter instead of k? Is there perhaps a more elegant way to create the dictionary?

I used the len() of the existing result item:
>>> import collections
>>> data = [{'key': 11, 'country': 'USA'},{'key': 21, 'country': 'Canada'},{'key': 12, 'country': 'USA'}]
>>> result = collections.defaultdict(dict)
>>> for item in data:
...     country = item['country']
...     result[country][len(result[country])] = {'key': item['key']}
...
>>> dict(result)
{'Canada': {0: {'key': 21}}, 'USA': {0: {'key': 11}, 1: {'key': 12}}}
There may be a more efficient way to do this, but I thought this would be most readable.

@zigg's answer is better.
Here's an alternative way:
import itertools as it, operator as op
def dict_transform(dataset, key_name=None, group_by=None):
    result = {}
    # groupby only merges adjacent items, so sort by the grouping key first
    sorted_dataset = sorted(dataset, key=op.itemgetter(group_by))
    for k, g in it.groupby(sorted_dataset, key=op.itemgetter(group_by)):
        result[k] = {i: {key_name: j[key_name]} for i, j in enumerate(g)}
    return result
if __name__ == '__main__':
    data = [{'key': 11, 'country': 'USA'},
            {'key': 21, 'country': 'Canada'},
            {'key': 12, 'country': 'USA'}]
    expected_result = {'USA': {0: {'key': 11}, 1: {'key': 12}},
                       'Canada': {0: {'key': 21}}}
    result = dict_transform(data, key_name='key', group_by='country')
    assert result == expected_result

To add the number, use the {key:value} syntax
result = {}
for i in data:
    k = 0
    result[i['country']] = dict({k : dict(key=i['key'])})

dict(k = dict(key=i['key']))
This passes i['key'] as the key keyword argument to the dict constructor (which is what you want - since that results in the string "key" being used as a key), and then passes the result of that as the k keyword argument to the dict constructor (which is not what you want) - that's how parameter passing works in Python. The fact that you have a local variable named k is irrelevant.
To make a dict where the value of k is used as a key, the simplest way is to use the literal syntax for dictionaries: {1:2, 3:4} is a dict where the key 1 is associated with the value 2, and the key 3 is associated with the value 4. Notice that here we're using arbitrary expressions for keys and values - not names - so we can use a local variable and the resulting dictionary will use the named value.
Thus, you want {k: {'key': i['key']}}.
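To make the difference concrete, here is a minimal sketch comparing the two spellings. The sample values for k and i are assumptions chosen to mirror the first item in the question's data.

```python
k = 0
i = {'key': 11, 'country': 'USA'}  # sample item, assumed for illustration

# Keyword-argument form: the parameter name 'k' becomes the key.
by_keyword = dict(k={'key': i['key']})

# Literal form: the expression k is evaluated, so the key is 0.
by_literal = {k: {'key': i['key']}}

print(by_keyword)  # {'k': {'key': 11}}
print(by_literal)  # {0: {'key': 11}}
```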
Maybe there is a more elegant way to create the dictionary?
You could create a list by appending items, and then transform the list into a dictionary with dict(enumerate(the_list)). That at least saves you from having to do the counting manually, but it's pretty indirect.
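One way the list-then-enumerate idea might look for the data in the question is sketched below. Grouping with collections.defaultdict is my own choice here, not something the answer prescribes.

```python
from collections import defaultdict

data = [{'key': 11, 'country': 'USA'},
        {'key': 21, 'country': 'Canada'},
        {'key': 12, 'country': 'USA'}]

# Step 1: collect one list of {'key': ...} dicts per country.
grouped = defaultdict(list)
for item in data:
    grouped[item['country']].append({'key': item['key']})

# Step 2: let enumerate() supply the 0, 1, 2, ... counters.
result = {country: dict(enumerate(items)) for country, items in grouped.items()}
print(result)
# {'USA': {0: {'key': 11}, 1: {'key': 12}}, 'Canada': {0: {'key': 21}}}
```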

Related

Python: Way to build a dictionary with a variable key and append to a list as the value inside a loop

I have a list of dictionaries. I want to loop through this list and, for each specific name (an attribute inside each dictionary), build a dictionary where the key is the name and the value is a list that is appended to dynamically according to a specific condition.
For example, I have
d = [{'Name': 'John', 'id': 10},
     {'Name': 'Mark', 'id': 21},
     {'Name': 'Matthew', 'id': 30},
     {'Name': 'Luke', 'id': 11},
     {'Name': 'John', 'id': 20}]
I then built a list with only the names using names=[i['Name'] for i in dic1] so I have a list of names. Notice John will appear twice in this list (at the beginning and end). Then, I want to create a for-loop (for name in names), which creates a dictionary 'ID' that for its value is a list which appends this id field as it goes along.
So in the end I'm looking for this ID dictionary to have:
John: [10,20]
Mark: [21]
Matthew: [30]
Luke: [11]
Notice that John has a list length of two because his name appears twice in the list of dictionaries.
But I can't figure out a way to dynamically append these values to a list inside the for-loop. I tried:
ID={[]} #I also tried with just {}
for name in names:
    ID[names].append([i['id'] for i in dic1 if i['Name'] == name])
Please let me know how one can accomplish this. Thanks.
Don't loop over the list of names and go searching for every one in the list; that's very inefficient, since you're scanning the whole list all over again for every name. Just loop over the original list once and update the ID dict as you go. Also, if you build the ID dict first, then you can get the list of names from it and avoid another list traversal:
names = ID.keys()
The easiest solution for ID itself is a dictionary with a default value of the empty list; that way ID[name].append will work for names that aren't in the dict yet, instead of blowing up with a KeyError.
from collections import defaultdict
ID = defaultdict(list)
for item in d:
    ID[item['Name']].append(item['id'])
You can treat a defaultdict like a normal dict for almost every purpose, but if you need to, you can turn it into a plain dict by calling dict on it:
plain_id = dict(ID)
The Thonnu has a solution using get and list concatenation which works without defaultdict. Here's another take on a no-import solution:
ID = {}
for item in d:
    name, number = item['Name'], item['id']
    if name in ID:
        ID[name].append(number)
    else:
        ID[name] = [number]
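A third no-import variant, not mentioned in the answers here, is dict.setdefault, which folds the if/else into a single call. A minimal sketch using the list d from the question:

```python
d = [{'Name': 'John', 'id': 10},
     {'Name': 'Mark', 'id': 21},
     {'Name': 'Matthew', 'id': 30},
     {'Name': 'Luke', 'id': 11},
     {'Name': 'John', 'id': 20}]

ID = {}
for item in d:
    # setdefault returns the existing list, or stores and returns a new empty one
    ID.setdefault(item['Name'], []).append(item['id'])

print(ID)  # {'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
```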
Using collections.defaultdict:
from collections import defaultdict
out = defaultdict(list)
for item in dic1:
    out[item['Name']].append(item['id'])
print(dict(out))
Or, without any imports:
out = {}
for item in dic1:
    out[item['Name']] = out.get(item['Name'], []) + [item['id']]
print(out)
Or, with a list comprehension:
out = {}
[out.update({item['Name']: out.get(item['Name'], []) + [item['id']]}) for item in dic1]
print(out)
Output:
{'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
dic1 = [{'Name': 'John', 'id': 10}, {'Name': 'Mark', 'id': 21}, {'Name': 'Matthew', 'id': 30}, {'Name': 'Luke', 'id': 11}, {'Name': 'John', 'id': 20}]
id_dict = {}
for dic in dic1:
    key = dic['Name']
    if key in id_dict:
        id_dict[key].append(dic['id'])
    else:
        id_dict[key] = [dic['id']]
print(id_dict)  # {'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
You can use defaultdict to create a dictionary with a default value. In this case the default value is an empty list.
from collections import defaultdict
d = defaultdict(list)
for item in dic1:
    d[item['Name']].append(item['id'])
Output (after converting to a plain dict with dict(d), which is not required):
{'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
You can also do it in a simple way:
dic1 = [{'Name': 'John', 'id': 10}, {'Name': 'Mark', 'id': 21}, {'Name': 'Matthew', 'id': 30}, {'Name': 'Luke', 'id': 11}, {'Name': 'John', 'id': 20}]
names = [i['Name'] for i in dic1]
ID = {}
for i, name in enumerate(names):
    if name in ID:
        ID[name].append(dic1[i]['id'])
    else:
        ID[name] = [dic1[i]['id']]
print(ID)

Parse output from json python

I have a json below, and I want to parse out value from this dict.
I can do something like this to get one specific value
print(abc['everything']['A']['1']['tree']['value'])
But what is the best way to parse out every "value"?
I want to output good, bad, good.
abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}
If you are willing to use pandas, you could just use pd.json_normalize, which is actually quite fast:
import pandas as pd
abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}
df = pd.json_normalize(abc)
print(df.values[0])
['good' 'bad' 'good']
Without any extra libraries, you will have to iterate through your nested dictionary:
values = [abc['everything'][e][k][k1]['value'] for e in abc['everything'] for k in abc['everything'][e] for k1 in abc['everything'][e][k]]
print(values)
['good', 'bad', 'good']
Provided your keys and dictionaries have a value somewhere, you can try this:
Create a function (or reuse the code) that gets the first element of the dictionary until the value key exists, then return that. Note that there are other ways of doing this.
Iterate through, getting the result under each value key and return.
# Define function
def get(d):
    # keep descending into the first child until a dict with a "value" key is reached
    while "value" not in d:
        d = list(d.values())[0]
    return d["value"]
# Get the results from your example
results = [get(v) for v in list(abc["everything"].values())]
['good', 'bad', 'good']
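A recursive way:
def fun(my_dict, values=None):
    # use None as the default: a mutable default list would be shared across calls
    if values is None:
        values = []
    if not isinstance(my_dict, dict):
        return values
    for i, j in my_dict.items():
        if i == 'value':
            values.append(j)
        else:
            values = fun(j, values)
    return values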
abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}
data = fun(abc)
print(data)
Output:
['good', 'bad', 'good']
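The same recursive idea can also be written as a generator, which avoids passing an accumulator list around. This is only a sketch: the name iter_values, the target parameter, and the handling of nested lists are my own additions, not part of the answer above.

```python
def iter_values(node, target='value'):
    """Yield every value stored under the key `target`, at any depth."""
    if isinstance(node, dict):
        for key, child in node.items():
            if key == target:
                yield child
            else:
                yield from iter_values(child, target)
    elif isinstance(node, list):
        # assumption: parsed JSON may also contain lists of nested objects
        for child in node:
            yield from iter_values(child, target)

abc = {'everything': {'A': {'1': {'tree': {'value': 'good'}}},
                      'B': {'5': {'tree1': {'value': 'bad'}}},
                      'C': {'30': {'tree2': {'value': 'good'}}}}}

print(list(iter_values(abc)))  # ['good', 'bad', 'good']
```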
Firstly, the syntax you are using is incorrect.
If you are using pandas, you can write:
import pandas as pd
df4 = pd.DataFrame({"TreeType": ["Tree1", "Tree2", "Tree3"],
                    "Values": ["Good", "Bad", "Good"]})
df4.index = ["A", "B", "C"]
Then evaluate df4 (or call print(df4)) to display the table.
Output:
  TreeType Values
A    Tree1   Good
B    Tree2    Bad
C    Tree3   Good

Convert pandas.DataFrame to list of dictionaries in Python

I have a dictionary which is converted from a dataframe as below :
a = d.to_json(orient='index')
Dictionary :
{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
What I need is it be in a list, so essentially a list of dictionary.
So I just wrap it in [] because that is the format used in the rest of the code.
input_dict = [a]
input_dict :
['{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}']
I need to get the single quotes removed just after the [ and just before the ]. Also, have the PKID values in form of list.
How can this be achieved ?
Expected Output :
[ {"yr":2017,"PKID":[58306, 57011],"Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":[1234,54321],"Subject":"XYZ","ID":"T002"} ]
NOTE: The PKID column holds multiple integer values, which have to come out as a list of integers. A string is not acceptable.
So we need "PKID":[58306, 57011] and not "PKID":"[58306, 57011]".
pandas.DataFrame.to_json returns a string (JSON string), not a dictionary. Try to_dict instead:
>>> df
col1 col2
0 1 3
1 2 4
>>> [df.to_dict(orient='index')]
[{0: {'col1': 1, 'col2': 3}, 1: {'col1': 2, 'col2': 4}}]
>>> df.to_dict(orient='records')
[{'col1': 1, 'col2': 3}, {'col1': 2, 'col2': 4}]
Here is one way:
from collections import OrderedDict
d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
list(OrderedDict(sorted(d.items())).values())
# [{'ID': 'T001', 'PKID': '58306, 57011', 'Subject': 'ABC', 'yr': 2017},
# {'ID': 'T002', 'PKID': '1234,54321', 'Subject': 'XYZ', 'yr': 2018}]
Note the ordered dictionary is ordered by text string keys, as supplied. You may wish to convert these to integers first before any processing via d = {int(k): v for k, v in d.items()}.
You are converting your dictionary to JSON, which is a string. Then you wrap the resulting string in a list. So, naturally, the result is a string inside a list.
Try instead: [d] where d is your raw dictionary (not the converted JSON).
You can use a list comprehension
Ex:
d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
print([{k: v} for k, v in d.items()])
Output (Python 3.7+, where dicts keep insertion order):
[{'0': {'yr': 2017, 'PKID': '58306, 57011', 'Subject': 'ABC', 'ID': 'T001'}}, {'1': {'yr': 2018, 'PKID': '1234,54321', 'Subject': 'XYZ', 'ID': 'T002'}}]
What about something like this:
from operator import itemgetter
d = {"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},"1":
{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}
sorted_d = sorted(d.items(), key=lambda x: int(x[0]))
print(list(map(itemgetter(1), sorted_d)))
Which Outputs:
[{'yr': 2017, 'PKID': '58306, 57011', 'Subject': 'ABC', 'ID': 'T001'},
{'yr': 2018, 'PKID': '1234,54321', 'Subject': 'XYZ', 'ID': 'T002'}]
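None of the snippets in this thread turns the PKID strings into lists of integers, which the question also asks for. A minimal sketch of that step follows, assuming every PKID is a comma-separated string of integers, as in the sample. The names rows and records are mine.

```python
import json

# JSON string in the shape that to_json(orient='index') produced in the question
a = ('{"0":{"yr":2017,"PKID":"58306, 57011","Subject":"ABC","ID":"T001"},'
     '"1":{"yr":2018,"PKID":"1234,54321","Subject":"XYZ","ID":"T002"}}')

rows = json.loads(a)  # back to a real dict keyed by "0", "1", ...

records = []
for _, row in sorted(rows.items(), key=lambda kv: int(kv[0])):
    row = dict(row)  # copy so the parsed input is left untouched
    # int() tolerates the surrounding spaces left over after split(',')
    row['PKID'] = [int(part) for part in row['PKID'].split(',')]
    records.append(row)

print(records)
# [{'yr': 2017, 'PKID': [58306, 57011], 'Subject': 'ABC', 'ID': 'T001'},
#  {'yr': 2018, 'PKID': [1234, 54321], 'Subject': 'XYZ', 'ID': 'T002'}]
```

If the DataFrame itself is still available, applying the same split to the PKID column before calling to_dict(orient='records') would avoid the round trip through JSON.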

How to iterate through a list of dictionaries

My code is
index = 0
for key in dataList[index]:
    print(dataList[index][key])
Seems to work fine for printing the values of dictionary keys for index = 0. However, I can't figure out how to iterate through an unknown number of dictionaries in dataList.
You could iterate over the indices, using range with the length of your list:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
for index in range(len(dataList)):
    for key in dataList[index]:
        print(dataList[index][key])
Or you could use a while loop with an index counter:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
index = 0
while index < len(dataList):
    for key in dataList[index]:
        print(dataList[index][key])
    index += 1
You could even iterate over the elements in the list directly:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
for dic in dataList:
    for key in dic:
        print(dic[key])
You can also avoid the lookups entirely by iterating over the values of the dictionaries:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
for dic in dataList:
    for val in dic.values():
        print(val)
Or wrap the iterations inside a list comprehension or a generator and unpack them later:
dataList = [{'a': 1}, {'b': 3}, {'c': 5}]
print(*[val for dic in dataList for val in dic.values()], sep='\n')
The possibilities are endless. Which one you prefer is a matter of choice.
You can easily do this:
for dict_item in dataList:
    for key in dict_item:
        print(dict_item[key])
It will iterate over the list, and for each dictionary in the list, it will iterate over the keys and print its values.
use = [{'id': 29207858, 'isbn': '1632168146', 'isbn13': '9781632168146', 'ratings_count': 0}]
for dic in use:
    for val, cal in dic.items():
        print(f'{val} is {cal}')
def extract_fullnames_as_string(list_of_dictionaries):
    return list(map(lambda e: "{} {}".format(e['first'], e['last']), list_of_dictionaries))
names = [{'first': 'Zhibekchach', 'last': 'Myrzaeva'}, {'first': 'Gulbara', 'last': 'Zholdoshova'}]
print(extract_fullnames_as_string(names))
The shortest way (a single line) to extract data from a list of dictionaries is to combine map with a lambda.
The approach that offers the most flexibility, and seems more appropriate to me, is to loop through the list inside a function:
def extract_fullnames_as_string(list_of_dictionaries):
    result = [val for dic in list_of_dictionaries for val in dic.values()]
    return 'My Dictionary List is = {}'.format(result)
dataList = [{'first': 3, 'last': 4}, {'first': 5, 'last': 7},
            {'first': 15, 'last': 9}, {'first': 51, 'last': 71},
            {'first': 53, 'last': 79}]
print(extract_fullnames_as_string(dataList))
This way, dataList can hold dictionaries in any format; otherwise, I found, you can end up dealing with format issues. Try the following and it will still work:
dataList1 = [{'a': 1}, {'b': 3}, {'c': 5}]
dataList2 = [{'first': 'Zhibekchach', 'last': 'Myrzaeva'},
             {'first': 'Gulbara', 'last': 'Zholdoshova'}]
print(extract_fullnames_as_string(dataList1))
print(extract_fullnames_as_string(dataList2))
Another Pythonic solution uses the collections module.
Here is an example where I want to generate a dict that maps each 'Name' to its 'Last Name':
from collections import defaultdict
test_dict = [{'Name': 'Maria', 'Last Name': 'Bezerra', 'Age': 31},
             {'Name': 'Ana', 'Last Name': 'Mota', 'Age': 31},
             {'Name': 'Gabi', 'Last Name': 'Santana', 'Age': 31}]
collect = defaultdict(dict)
# on each iteration, 'key' is one dict from the list of dicts
for key in test_dict:
    collect[key['Name']] = key['Last Name']
print(dict(collect))
The output is:
{'Maria': 'Bezerra', 'Ana': 'Mota', 'Gabi': 'Santana'}
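If the goal is instead a list of dictionaries that keep only the 'Name' and 'Last Name' keys, a nested comprehension is one way to get there. This is a sketch; the names wanted and trimmed are my own.

```python
test_dict = [{'Name': 'Maria', 'Last Name': 'Bezerra', 'Age': 31},
             {'Name': 'Ana', 'Last Name': 'Mota', 'Age': 31},
             {'Name': 'Gabi', 'Last Name': 'Santana', 'Age': 31}]

wanted = ('Name', 'Last Name')

# Build a new, smaller dict for every person, copying only the wanted keys.
trimmed = [{k: person[k] for k in wanted} for person in test_dict]
print(trimmed)
# [{'Name': 'Maria', 'Last Name': 'Bezerra'},
#  {'Name': 'Ana', 'Last Name': 'Mota'},
#  {'Name': 'Gabi', 'Last Name': 'Santana'}]
```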
There are multiple ways to iterate through a list of dictionaries. However, if you are into Pythonic code, consider the following ways, but first, let's use data_list instead of dataList because in Python snake_case is preferred over camelCase.
Way #1: Iterating over a dictionary's keys
# let's assume that data_list is the following list of dictionaries
data_list = [{'Alice': 10}, {'Bob': 7}, {'Charlie': 5}]
for element in data_list:
    for key in element:
        print(key, element[key])
Output
Alice 10
Bob 7
Charlie 5
Explanation:
for element in data_list: -> element will be a dictionary in data_list at each iteration, i.e., {'Alice': 10} in the first iteration, {'Bob': 7} in the second iteration, and {'Charlie': 5} in the third iteration.
for key in element: -> key will be a key of element at each iteration, so when element is {'Alice': 10}, the value of key will be 'Alice'. Keep in mind that element could contain more keys, but in this particular example it has just one.
print(key, element[key]) -> it prints key and the value that element stores under key, i.e., it accesses the value of key in element.
Way #2: Iterating over a dictionary's keys and values
# let's assume that data_list is the following list of dictionaries
data_list = [{'Alice': 10}, {'Bob': 7}, {'Charlie': 5}]
for element in data_list:
    for key, value in element.items():
        print(key, value)
The output for this code snippet is the same as the previous one.
Explanation:
for element in data_list: -> it has the same explanation as the one in the code before.
for key, value in element.items(): -> at each iteration, element.items() yields a tuple that contains two elements: first the key, then the value associated with that key. So when element is {'Alice': 10}, key will be 'Alice' and value will be 10. Keep in mind that this dictionary has only one key-value pair.
print(key, value) -> it prints key and value.
As stated before, there are multiple ways to iterate through a list of dictionaries, but to keep your code more Pythonic, avoid using indices or while loops.
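When the position of each dictionary in the list is actually needed, enumerate provides it without manual index bookkeeping. A minimal sketch; the names position and rows are my own.

```python
data_list = [{'Alice': 10}, {'Bob': 7}, {'Charlie': 5}]

rows = []
for position, element in enumerate(data_list):
    for key, value in element.items():
        rows.append((position, key, value))
        print(position, key, value)
# 0 Alice 10
# 1 Bob 7
# 2 Charlie 5
```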
I had a similar issue and fixed it by using a single for loop to iterate over the list; see the code snippet:
de = {"file_name":"jon","creation_date":"12/05/2022","location":"phc","device":"s3","day":"1","time":"44692.5708703703","year":"1900","amount":"3000","entity":"male"}
se = {"file_name":"bone","creation_date":"13/05/2022","location":"gar","device":"iphone","day":"2","time":"44693.5708703703","year":"2022","amount":"3000","entity":"female"}
re = {"file_name":"cel","creation_date":"12/05/2022","location":"ben car","device":"galaxy","day":"1","time":"44695.5708703703","year":"2022","amount":"3000","entity":"male"}
te = {"file_name":"teiei","creation_date":"13/05/2022","location":"alcon","device":"BB","day":"2","time":"44697.5708703703","year":"2022","amount":"3000","entity":"female"}
ye = {"file_name":"js","creation_date":"12/05/2022","location":"woji","device":"Nokia","day":"1","time":"44699.5708703703","year":"2022","amount":"3000","entity":"male"}
ue = {"file_name":"jsdjd","creation_date":"13/05/2022","location":"town","device":"M4","day":"5","time":"44700.5708703703","year":"2022","amount":"3000","entity":"female"}
d_list = [de,se,re,te,ye,ue]
for dic in d_list:
    print(dic['file_name'], dic['creation_date'])

Can I use set comprehension to create a list of dict from a bigger list of dict?

I'm working with denormalized tables which provides a bit of challenge when it comes to extracting unique information. If the tables were normalized:
unique_data = list({d['value'] for d in mydata})
would do the trick.
But the tables aren't normalized.
Can I create a set of dict that I can then turn into list? Something like (this gives me an error):
unique_data_with_id = list({{'id':d['id'], 'value':d['value']} for d in mydata})
Dictionaries are mutable and therefore not hashable, so you can't put them in a set. One way around this is to use a namedtuple instead of a dictionary:
import collections
IdValueTuple = collections.namedtuple("IdValueTuple", "id value")
unique_data_with_id = list({IdValueTuple(d["id"], d["value"]) for d in mydata})
{{'id':d['id'], 'value':d['value']} for d in mydata}
tries to create a set of dicts. Because dicts are mutable, they aren't hashable, and a set needs hashable elements.
Try tuple instead:
{(d['id'], d['value']) for d in mydata}
Note that I quite like Sven Marnach's usage of a namedtuple here.
Mostly because it's occasionally useful in other contexts, you could also use a frozenset as an intermediate object:
>>> pprint.pprint(mydata)
[{'id': 1, 'ignore': 92, 'value': 'a'},
{'id': 2, 'ignore': 92, 'value': 'b'},
{'id': 1, 'ignore': 92, 'value': 'a'}]
>>> keep_keys = "id", "value"
>>> [dict(s) for s in {frozenset((k, d[k]) for k in keep_keys) for d in mydata}]
[{'id': 1, 'value': 'a'}, {'id': 2, 'value': 'b'}]
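Sets do not guarantee any particular order, so if the first-seen order of the rows matters, one possible variation is to deduplicate with dict.fromkeys on tuples. This is a sketch reusing the sample mydata above; the name unique_rows is my own.

```python
mydata = [{'id': 1, 'ignore': 92, 'value': 'a'},
          {'id': 2, 'ignore': 92, 'value': 'b'},
          {'id': 1, 'ignore': 92, 'value': 'a'}]

keep_keys = ("id", "value")

# Tuples are hashable, and dict keys keep insertion order (Python 3.7+),
# so duplicates collapse while the first occurrence keeps its position.
unique_rows = dict.fromkeys(tuple(d[k] for k in keep_keys) for d in mydata)

unique_data_with_id = [dict(zip(keep_keys, row)) for row in unique_rows]
print(unique_data_with_id)
# [{'id': 1, 'value': 'a'}, {'id': 2, 'value': 'b'}]
```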
