Python extracting values from nested dict keys - python

I have a nested dict (as below). The goal is to extract the values of "req_key1" and "rkey1" and append them to a list.
raw_json = {
"first_key": {
"f_sub_key1": "some_value",
"f_sub_key2": "some_value"
},
"second_key": {
"another_key": [{
"s_sub_key1": [{
"date": "2022-01-01",
"day": {
"key1": "value_1",
"keyn": "value_n"
}
}],
"s_sub_key2": [{
"req_key1": "req_value1",
"req_key2": {
"rkey1": "rvalue_1",
"rkeyn": "rvalue_n"
}
}]
}]
}
}
I am able to append the values to a list and below is my approach.
emp_ls = []
filtered_key = raw_json["second_key"]["another_key"]
for i in filtered_key:
    for k in i.get("s_sub_key2"):
        emp_ls.append({"first_val": k.get("req_key1"), "second_val": k["req_key2"].get("rkey1")})
print(emp_ls)
Is this a good approach, i.e. one that can be used in production, or is there a better way to do this task?
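One concern for production use is that a missing key or a None value at any level raises KeyError or TypeError in the loop above. Below is a minimal sketch of a more defensive version, assuming that missing levels should simply be skipped. The function name extract_required_values and the trimmed sample data are hypothetical, not part of the original question.

```python
def extract_required_values(raw_json):
    """Collect req_key1 and req_key2.rkey1 from every s_sub_key2 entry."""
    results = []
    # `or {}` / `or []` guard against keys that are missing or set to None.
    entries = (raw_json.get("second_key") or {}).get("another_key") or []
    for entry in entries:
        for item in entry.get("s_sub_key2") or []:
            results.append({
                "first_val": item.get("req_key1"),
                "second_val": (item.get("req_key2") or {}).get("rkey1"),
            })
    return results


# Hypothetical, trimmed-down sample in the same shape as raw_json.
sample = {
    "second_key": {
        "another_key": [
            {"s_sub_key2": [{"req_key1": "req_value1",
                             "req_key2": {"rkey1": "rvalue_1"}}]},
            {"s_sub_key1": []},  # no s_sub_key2 here: skipped, no exception
        ]
    }
}
print(extract_required_values(sample))
```

Whether silently skipping missing data is acceptable depends on the application; raising a descriptive error may be the better choice when the structure is supposed to be guaranteed.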

Related

Add same key but different values (coming from list) to a nested dictionary

I managed to scrape some content and organize it as a nested dictionary like this.
country_data = {
"US": {
"People": [
{
"Title": "Pres.",
"Name": "Joe"
},
{
"Title": "Vice",
"Name": "Harris"
}
]
}
}
Then I have this list
tw_usernames = ['#user1', '#user2']
that I'd like to use to add each item to the matching entry of the nested People list. I have done some research on list and dict comprehensions but could not find anything that works, so I tried this basic code. Of course it is not exactly what I want, as it assigns the whole list to every entry.
for firstdict in country_data.values():
    for dictpeople in firstdict.values():
        for name in dictpeople:
            name['Twitter'] = tw_usernames
            print(name)
So how would you get a dictionary like this?
country_data = {
"US": {
"People": [
{
"Title": "Pres.",
"Name": "Joe",
"Twitter": "#user1"
},
{
"Title": "Vice",
"Name": "Harris",
"Twitter": "#user2"
}
]
}
}
Thanks in advance for any tip that could teach me.
Try this. I have just added an index to your logic, so each person gets the username at the same position in the list.
idx = 0
tw_usernames = ['#user1', '#user2']
for firstdict in country_data.values():
    for dictpeople in firstdict.values():
        for name in dictpeople:
            name['Twitter'] = tw_usernames[idx]
            idx += 1
print(country_data)
Output:
{
"US":{
"People":[
{
"Title":"Pres.",
"Name":"Joe",
"Twitter":"#user1"
},
{
"Title":"Vice",
"Name":"Harris",
"Twitter":"#user2"
}
]
}
}
You can zip the list of sub-dicts with the list of usernames and iterate over the resulting sequence of pairs to add a Twitter key to each sub-dict:
for person, username in zip(country_data['US']['People'], tw_usernames):
    person['Twitter'] = username
With your sample input, country_data would become:
{'US': {'People': [{'Title': 'Pres.', 'Name': 'Joe', 'Twitter': '#user1'}, {'Title': 'Vice', 'Name': 'Harris', 'Twitter': '#user2'}]}}
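Note that zip stops at the shorter of its inputs, so a missing username would go unnoticed. Here is a small sketch that checks the lengths first, assuming a mismatch should be treated as an error. The helper name add_twitter is hypothetical.

```python
def add_twitter(people, usernames):
    """Attach one username to each person, pairing them by position."""
    if len(people) != len(usernames):
        raise ValueError("people and usernames must have the same length")
    for person, username in zip(people, usernames):
        person["Twitter"] = username


country_data = {"US": {"People": [{"Title": "Pres.", "Name": "Joe"},
                                  {"Title": "Vice", "Name": "Harris"}]}}
add_twitter(country_data["US"]["People"], ["#user1", "#user2"])
print(country_data["US"]["People"])
```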
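Source code
def add_twitter_usernames_to_users(country_data: dict, tw_usernames: list[str]) -> dict:
    # enumerate yields each username's position directly; list.index() would
    # rescan the list every time and return the wrong position for duplicates.
    for index, tw_username in enumerate(tw_usernames):
        country_data["US"]["People"][index]["Twitter"] = tw_username
    return country_data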
Test
def test_add_twitter_usernames_to_users():
    country_data = {
        "US": {
            "People": [
                {
                    "Title": "Pres.",
                    "Name": "Joe"
                },
                {
                    "Title": "Vice",
                    "Name": "Harris"
                }
            ]
        }
    }
    tw_usernames = ['#user1', '#user2']
    updated_country_data: dict = so.add_twitter_usernames_to_users(country_data, tw_usernames)
    assert updated_country_data["US"]["People"][0]["Twitter"] == "#user1"
    assert updated_country_data["US"]["People"][1]["Twitter"] == "#user2"

Loop through 1st + 3rd level of a dictionary

I'm wondering if there's a Pythonic way to squash this nested for loop:
dict = {
"keyA": { "subkey1": { "A1a": "frog", "A1b": "dog", "A1c": "airplane" } },
"keyA": { "subkey2": { "A2a": "cat" } },
"keyB": { "subkey1": { "B1a": "Zorba", "B1q": ["popcorn", -34] } },
"keyB": { "subkey2": { "B2z": "A Man A Plan A Canal", "B2e": "armadillo", "B2w": [1, 3, "jump"] } },
"keyC": { "subkey1": { "C1a": 3.14, "C1z": { "aaa": "dishwater", "bbb": "Dishwalla" }, "C1x": "bat" } },
"keyC": { "subkey2": { "C2a": None, "C2b": 123 } }
}
for key in dict.keys():
    for subsubkey in dict[key]["subkey2"].keys():
        print(key+":"+subsubkey)
Output:
keyA:A2a
keyB:B2z
keyB:B2e
keyB:B2w
keyC:C2a
keyC:C2b
One Pythonic way to solve this is to use a list comprehension. This lets you build the list in a single expression that follows the for loop structure you have already laid out. A working version may look something like:
final_keys = [(first_key, second_key) for first_key in dict.keys() for second_key in dict[first_key]['subkey2'].keys()]
Outputting (from your dataset):
[('keyA', 'A2a'), ('keyB', 'B2z'), ('keyB', 'B2e'), ('keyB', 'B2w'), ('keyC', 'C2a'), ('keyC', 'C2b')]
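A variation on the same comprehension, sketched under two assumptions: each top-level key holds both subkeys in one inner dict (a dict literal that repeats a key keeps only the last value for it), and a top-level key without "subkey2" should simply contribute nothing. The variable is called data here to avoid shadowing the built-in dict, and the sample values are shortened.

```python
data = {
    "keyA": {"subkey1": {"A1a": "frog"}, "subkey2": {"A2a": "cat"}},
    "keyB": {"subkey1": {"B1a": "Zorba"}, "subkey2": {"B2z": "x", "B2e": "armadillo"}},
    "keyC": {"subkey1": {"C1a": 3.14}},  # no subkey2: contributes nothing
}

# .items() avoids a second lookup; .get(..., {}) tolerates a missing subkey2.
pairs = [
    (key, subsubkey)
    for key, sub in data.items()
    for subsubkey in sub.get("subkey2", {})
]
print(pairs)
```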

Is there a way to eliminate all duplicates from a collection?

I have a collection where the objects have a structure similar to
{'_id': ObjectId('5e691cb9e73282f624362221'),
'created_at': 'Tue Mar 10 09:23:54 +0000 2020',
'id': 1237308186757120001,
'id_str': '1237308186757120001',
'full_text': 'See you in July'}
I am struggling to keep only the objects that have a unique full text. Using distinct only gives me a list of the distinct full_text field values, whereas I want to keep only the objects in the collection with unique full texts.
There is. The code should look like this:
dict = {"a": 1, "b": 2, "c": 3, "a": 5, "d": 4, "e": 5, "c": 8}
# New clean dictionary
unique = {}
# Go through the original dictionary's items
for key, value in dict.items():
    if key in unique:
        # If the key already exists in the new dictionary
        continue
    else:
        # Otherwise
        unique[key] = value
print(unique)
I hope this helps you!
There are 2 ways:
MongoDB way
We perform a MongoDB aggregation that groups records by full_text, keeps only the unique documents, and inserts them into a new collection (run it in the shell).
db.collection.aggregate([
{
$group: {
_id: "$full_text",
data: {
$push: "$$ROOT"
},
count: {
$sum: 1
}
}
},
{
$match: {
count: {
$eq: 1
}
}
},
{
$addFields: {
data: {
$arrayElemAt: [
"$data",
0
]
}
}
},
{
$replaceRoot: {
newRoot: "$data"
}
},
{
$out: "tmp"
}
])
When you run this query, it creates a new collection (tmp) that contains only the documents with unique full_text values. You can then drop the old collection and rename this one.
You can also put your own collection name into the $out operator, like {$out: "collection"}, but this overwrites the collection and there is no going back.
Python way
We perform a MongoDB aggregation that groups by the full_text field, keeps only the duplicated documents, and builds a single array with all the _id values to be removed. Once MongoDB returns the results, we execute a delete command for the duplicate documents.
db.collection.aggregate([
{
$group: {
_id: "$full_text",
data: {
$push: "$_id"
},
count: {
$sum: 1
}
}
},
{
$match: {
count: {
$gt: 1
}
}
},
{
$group: {
_id: null,
data: {
$push: "$data"
}
}
},
{
$addFields: {
data: {
$reduce: {
input: "$data",
initialValue: [],
in: {
$concatArrays: [
"$$value",
"$$this"
]
}
}
}
}
}
])
MongoPlayground
Pseudocode
data = list(collection.aggregate(...))
if len(data) > 0:
    # delete_many replaces Collection.remove, which was removed in PyMongo 4
    collection.delete_many({'_id': {'$in': data[0]["data"]}})
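The pipeline above collects the _id of every document whose full_text occurs more than once, so deleting those ids removes all copies. If the intent is to keep one copy of each text instead, the ids to delete can be chosen client-side. The following is a plain-Python sketch, assuming the documents fit in memory and that the first occurrence is the one to keep. The function name duplicate_ids and the integer ids are hypothetical.

```python
def duplicate_ids(documents, field="full_text"):
    """Return the _id of every document whose field value was already seen."""
    seen = set()
    to_delete = []
    for doc in documents:
        value = doc.get(field)
        if value in seen:
            to_delete.append(doc["_id"])  # a later copy: mark for deletion
        else:
            seen.add(value)  # first occurrence: keep it
    return to_delete


# Hypothetical documents standing in for the result of collection.find().
docs = [
    {"_id": 1, "full_text": "See you in July"},
    {"_id": 2, "full_text": "Hello"},
    {"_id": 3, "full_text": "See you in July"},
]
print(duplicate_ids(docs))
```

The returned list could then be passed to delete_many with an $in filter, in the same way as in the pseudocode above.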

How do I extract keys from a dictionary that has {"key":[{"A":"1"},{"B":"2"}]}?

I have a python dictionary,
dict = {
"A": [{
"264": "0.1965"
}, {
"289": "0.1509"
}, {
"192": "0.1244"
}]
}
I have a collection in mongoDB that has,
{
"_id": ObjectId("5d5a7f474c55b68a873f9602"),
"A": [{
"264": "0.5700"
}, {
"175": "0.321"
}]
}
{
"_id": ObjectId("5d5a7f474c55b68a873f9610"),
"B": [{
"152": "0.2826"
}, {
"012": "0.1234"
}]
}
I want to see if the key "A" from dict is available in mongodb. If yes, I want to loop over the keys in the list i.e.
[{
"264": "0.19652049960139123"
}, {
"289": "0.1509138215380371"
}, {
"192": "0.12447470015715734"
}]
and check whether each key (e.g. 264) is already present in mongodb. If it is, update its value; otherwise append the entry.
Expected output in mongodb:
{
"_id": ObjectId("5d5a7f474c55b68a873f9602"),
"A": [{
"264": "0.1965"
}, {
"175": "0.321"
}, {
"289": "0.1509"
}, {
"192": "0.1244"
}]
}
{
"_id": ObjectId("5d5a7f474c55b68a873f9610"),
"B": [{
"152": "0.2826"
}, {
"012": "0.1234"
}]
}
The value for key 264 is updated. Kindly help.
Assuming you are looking for the Python part and not the MongoDB part, try:
for entry in dict['A']:  # dict['A'] is a list of single-key dicts
    for k, v in entry.items():  # k is key, v is value
        process_entry(k, v)  # do what you want with the database
Assuming your MongoDB collection is called your_collection:
data = your_collection.find_one({'A': {'$exists': 1}})
if data:
    # loop over the keys
    for item in data['A']:
        # check whether a certain key is available
        if 'some_key' not in item:
            do_something()  # update
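The update-or-append step itself can be sketched in plain Python, independent of the database. This assumes every list element is a single-key dict, as in the question. The helper name merge_entries is hypothetical, and writing the merged list back (for example with update_one and $set) is left out.

```python
def merge_entries(existing, incoming):
    """Update matching keys in place; append entries whose key is new."""
    # Map each key to the position of the single-key dict that holds it.
    position = {key: i for i, entry in enumerate(existing) for key in entry}
    for entry in incoming:
        for key, value in entry.items():
            if key in position:
                existing[position[key]][key] = value  # key present: update
            else:
                position[key] = len(existing)
                existing.append({key: value})  # key absent: append
    return existing


stored = [{"264": "0.5700"}, {"175": "0.321"}]  # list read from MongoDB
new = [{"264": "0.1965"}, {"289": "0.1509"}, {"192": "0.1244"}]  # dict["A"]
print(merge_entries(stored, new))
```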

Modify value for specified key in list of dictionaries

I have a list of json objects (dictionaries) ds_list
ds_list = [ { "status": "NEW" }, { "status": "UP_TO_DATE" }]
I need to modify an attribute of each object.
So here is my solution:
if we_are_processing:
    result = list(map(lambda ds: ds.update({'status': 'PROCESSING'}) or ds, ds_list))
result = [ { "status": "PROCESSING" }, { "status": "PROCESSING" }]
It works, but I don't like it very much, in particular update() and or ds.
What is a more Pythonic (readable) way of implementing it?
The Pythonic way is to use a for loop:
ds_list = [ { "status": "NEW" }, { "status": "UP_TO_DATE" }]
for item in ds_list:
    item['status'] = 'PENDING'
# [{'status': 'PENDING'}, {'status': 'PENDING'}]
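If the original dictionaries should stay untouched, a list comprehension that builds new dicts is another option. This is a short sketch, assuming a shallow copy of each dict is enough:

```python
ds_list = [{"status": "NEW"}, {"status": "UP_TO_DATE"}]

# {**ds, ...} copies each dict and overrides "status" in the copy only.
result = [{**ds, "status": "PROCESSING"} for ds in ds_list]

print(result)
print(ds_list)  # the originals are unchanged
```

The for loop mutates the existing objects, which is usually what is wanted when other code holds references to them. The comprehension suits cases where the input must be preserved.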
