Extracting matching values from nested dictionary - python

I'm trying to extract values from a nested dictionary if a value matches the value in a list.
data = [
{
"id": 12345678,
"list_id": 12345,
"creator_id": 1234567,
"entity_id": 1234567,
"created_at": "2020-01-30T00:43:55.256-08:00",
"entity": {
"id": 123456,
"type": 0,
"first_name": "John",
"last_name": "Doe",
"primary_email": "john#fakemail.com",
"emails": [
"john#fakemail.com"
]
}
},
{
"id": 12345678,
"list_id": 12345,
"creator_id": 1234567,
"entity_id": 1234567,
"created_at": "2020-01-30T00:41:54.375-08:00",
"entity": {
"id": 123456,
"type": 0,
"first_name": "Jane",
"last_name": "Doe",
"primary_email": "jane#fakemail.com",
"emails": [
"jane#fakemail.com"
]
}
}
]
The code is as follows.
match_list = ['jane#fakemail.com',[]]
first_names = []
email = []
for i in match_list:
    for record in data:
        if 'primary_email' == i:
            email.append(record.get('entity',{}).get('primary_email', None))
            first_names.append(record.get('entity',{}).get('first_name', None))
print(first_names)
print(email)
Instead of returning the matching values this only returns empty lists. Any help here would be much appreciated.
The expected output is
first_names = ['Jane'] and email = ['jane#fakemail.com']

Store temporary values in variables, to make your code easier to handle:
emails = []
names = []
match_list = ['jane#fakemail.com',[]]
for item in data:
    entry = item.get('entity', {})
    fName = entry.get('first_name', '')
    pMail = entry.get('primary_email', '')
    if pMail in match_list:
        print(fName)
        print(pMail)
        emails.append(pMail)
        names.append(fName)
Output:
Jane
jane#fakemail.com

In the 6th line of your code
if 'primary_email' == i:
you are comparing each element of match_list (that is, i) with the literal string 'primary_email' rather than with the record's actual email. Since 'jane#fakemail.com' is never equal to the literal string 'primary_email', the condition is always False and nothing is appended.
Instead use
if record['entity']['primary_email'] == i:
and your code should work as expected.

Your code always produces empty lists because the comparison 'primary_email' == i is always False.
Change it to record['entity']['primary_email'] == i.
There is also no need to use get here, as long as every record has an entity with a primary_email: if the mail does not match any primary_email, nothing happens, and values are appended only when d['entity']['primary_email'] == mail holds.
Try this; I refactored your code a little.
In [25]: first_name, emails = [], []
    ...: for mail in match_list:
    ...:     for d in data:
    ...:         if d['entity']['primary_email'] == mail:
    ...:             first_name.append(d['entity']['first_name'])
    ...:             emails.append(d['entity']['primary_email'])
Output:
In [26]: emails
Out[26]: ['jane#fakemail.com']
In [27]: first_name
Out[27]: ['Jane']
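If match_list grows large, the membership test can also be done against a set. This is only a sketch, assuming the same record layout as in the question; the helper name extract_matches is made up for illustration, and non-string entries such as the stray [] are skipped because a list cannot be stored in a set:

```python
def extract_matches(records, match_list):
    """Return (first_names, emails) for records whose primary_email is wanted."""
    # Keep only string entries; the question's match_list also contains [],
    # which is unhashable and cannot be an email anyway.
    wanted = {m for m in match_list if isinstance(m, str)}
    first_names, emails = [], []
    for record in records:
        entity = record.get("entity", {})
        mail = entity.get("primary_email")
        if mail in wanted:
            first_names.append(entity.get("first_name"))
            emails.append(mail)
    return first_names, emails


# Trimmed-down copy of the question's data (only the fields used here).
data = [
    {"entity": {"first_name": "John", "primary_email": "john#fakemail.com"}},
    {"entity": {"first_name": "Jane", "primary_email": "jane#fakemail.com"}},
    {"id": 1},  # hypothetical record without an entity, to show the .get fallback
]

first_names, emails = extract_matches(data, ["jane#fakemail.com", []])
print(first_names)
print(emails)
```

The .get calls keep the loop from raising KeyError on records that lack an entity, which direct indexing would not.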

Related

Create a nested data dictionary in Python

I have the data as below
{
"employeealias": "101613177",
"firstname": "Lion",
"lastname": "King",
"date": "2022-04-21",
"type": "Thoughtful Intake",
"subject": "Email: From You Success Coach"
}
{
"employeealias": "101613177",
"firstname": "Lion",
"lastname": "King",
"date": "2022-04-21",
"type": null,
"subject": "Call- CDL options & career assessment"
}
I need to create a dictionary like the below:

Create a new dictionary that holds a list, then loop over the rows and check whether an item with the same employeealias, firstname, and lastname already exists. If it does, append the remaining information to that item's activity sublist. If it does not, create a new item with employeealias, firstname, lastname, and the first activity.
data = [
    {"employeealias":"101613177","firstname":"Lion","lastname":"King","date":"2022-04-21","type":"Thoughtful Intake","subject":"Email: From You Success Coach"},
    {"employeealias":"101613177","firstname":"Lion","lastname":"King","date":"2022-04-21","type":"null","subject":"Call- CDL options & career assessment"},
]
result = {'interactions': []}
for row in data:
    found = False
    for item in result['interactions']:
        if (row["employeealias"] == item["employeealias"]
                and row["firstname"] == item["firstname"]
                and row["lastname"] == item["lastname"]):
            item["activity"].append({
                "date": row["date"],
                "subject": row["subject"],
                "type": row["type"],
            })
            found = True
            break
    if not found:
        result['interactions'].append({
            "employeealias": row["employeealias"],
            "firstname": row["firstname"],
            "lastname": row["lastname"],
            "activity": [{
                "date": row["date"],
                "subject": row["subject"],
                "type": row["type"],
            }]
        })
print(result)
EDIT:
You read the lines as plain text, but each line has to be converted to a dictionary with the json module:
import json
data = []
with open("/Users/Downloads/amazon_activity_feed_0005_part_00.json") as a_file:
    for line in a_file:
        line = line.strip()
        dictionary = json.loads(line)
        data.append(dictionary)
print(data)
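The search through result['interactions'] for every row can also be replaced by a dictionary keyed on the three identifying fields. This is a sketch of that variation, not the code above; the function name group_interactions is made up, and it assumes every row has all six keys:

```python
def group_interactions(rows):
    """Group flat rows into one entry per employee with an 'activity' list."""
    grouped = {}  # (employeealias, firstname, lastname) -> entry
    for row in rows:
        key = (row["employeealias"], row["firstname"], row["lastname"])
        if key not in grouped:
            grouped[key] = {
                "employeealias": row["employeealias"],
                "firstname": row["firstname"],
                "lastname": row["lastname"],
                "activity": [],
            }
        grouped[key]["activity"].append(
            {"date": row["date"], "subject": row["subject"], "type": row["type"]}
        )
    return {"interactions": list(grouped.values())}


rows = [
    {"employeealias": "101613177", "firstname": "Lion", "lastname": "King",
     "date": "2022-04-21", "type": "Thoughtful Intake",
     "subject": "Email: From You Success Coach"},
    {"employeealias": "101613177", "firstname": "Lion", "lastname": "King",
     "date": "2022-04-21", "type": None,  # JSON null becomes None after json.loads
     "subject": "Call- CDL options & career assessment"},
]

result = group_interactions(rows)
print(result)
```

Each row then costs one dictionary lookup instead of a scan over everything grouped so far.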

You can create a nested dictionary inside Python like this:
student = {"name": "Suman", "age": 20, "gender": "male", "school": {"class": 11, "roll_no": 12}}  # "school" is an illustrative key; the nested dictionary needs one

Convert multiple string stored in a variable into a single list in python

I hope everyone is doing well.
I need to collect all the strings held by a variable into a single list in Python.
For example -
I have a JSON file from which I read IDs. Each one is assigned to a variable called id, and print(id) shows the following:
17298626-991c-e490-bae6-47079c6e2202
17298496-19bd-2f89-7b5f-881921abc632
17298698-3e17-7a9b-b337-aacfd9483b1b
172986ac-d91d-c4ea-2e50-d53700480dd0
172986d0-18aa-6f51-9c62-6cb087ad31e5
172986f4-80f0-5c21-3aee-12f22a5f4322
17298712-a4ac-7b36-08e9-8512fa8322dd
17298747-8cc6-d9d0-8d05-50adf228c029
1729875c-050f-9a99-4850-bb0e6ad35fb0
1729875f-0d50-dc94-5515-b4891c40d81c
17298761-c26b-3ce5-e77e-db412c38a5b4
172987c8-2b5d-0d94-c365-e8407b0a8860
1729881a-e583-2b54-3a52-d092020d9c1d
1729881c-64a2-67cf-d561-6e5e38ed14cb
172987ec-7a20-7eb6-3ebe-a9fb621bb566
17298813-7ac4-258b-d6f9-aaf43f9147b1
17298813-f1ef-d28a-0817-5f3b86c3cf23
17298828-b62b-9ee6-248b-521b0663226e
17298825-7449-2fcb-378e-13671cb4688a
I want all of these values to be stored in a single list.
Can someone please help me out with this?
Below is the code I am using:
import json
with open('requests.json') as f:
    data = json.load(f)
print(type(data))
for i in data:
    if 'traceId' in i:
        id = i['traceId']
        newid = id.split()
        #print(type(newid))
        print(newid)
And this is what my JSON file looks like:
[
{
"id": "376287298-hjd8-jfjb-khkf-6479280283e9",
"submittedTime": 1591692502558,
"traceId": "17298626-991c-e490-bae6-47079c6e2202",
"userName": "ABC",
"onlyChanged": true,
"description": "Not Required",
"startTime": 1591694487929,
"result": "NONE",
"state": "EXECUTING",
"paused": false,
"application": {
"id": "16b22a09-a840-f4d9-f42a-64fd73fece57",
"name": "XYZ"
},
"applicationProcess": {
"id": "dihihdosfj9279278yrie8ue",
"name": "Deploy",
"version": 12
},
"environment": {
"id": "fkjdshkjdshglkjdshgldshldsh03r937837",
"name": "DEV"
},
"snapshot": {
"id": "djnglkfdglki98478yhgjh48yr844h",
"name": "DEV_snapshot"
}
},
{
"id": "17298495-f060-3e9d-7097-1f86d5160789",
"submittedTime": 1591692844597,
"traceId": "17298496-19bd-2f89-7b5f-881921abc632",
"userName": "UYT",
"onlyChanged": true,
"startTime": 1591692845543,
"result": "NONE",
"state": "EXECUTING",
"paused": false,
"application": {
"id": "osfodsho883793hgjbv98r3098w",
"name": "QA"
},
"applicationProcess": {
"id": "owjfoew028r2uoieroiehojehfoef",
"name": "EDC",
"version": 5
},
"environment": {
"id": "16cf69c5-4194-e557-707d-0663afdbceba",
"name": "DTESTU"
}
}
]
This is the file I am trying to get the traceId values from.

You could use the simple split method, like the following:
ids = '''17298626-991c-e490-bae6-47079c6e2202 17298496-19bd-2f89-7b5f-881921abc632 17298698-3e17-7a9b-b337-aacfd9483b1b 172986ac-d91d-c4ea-2e50-d53700480dd0 172986d0-18aa-6f51-9c62-6cb087ad31e5 172986f4-80f0-5c21-3aee-12f22a5f4322 17298712-a4ac-7b36-08e9-8512fa8322dd 17298747-8cc6-d9d0-8d05-50adf228c029 1729875c-050f-9a99-4850-bb0e6ad35fb0 1729875f-0d50-dc94-5515-b4891c40d81c 17298761-c26b-3ce5-e77e-db412c38a5b4 172987c8-2b5d-0d94-c365-e8407b0a8860 1729881a-e583-2b54-3a52-d092020d9c1d 1729881c-64a2-67cf-d561-6e5e38ed14cb 172987ec-7a20-7eb6-3ebe-a9fb621bb566 17298813-7ac4-258b-d6f9-aaf43f9147b1 17298813-f1ef-d28a-0817-5f3b86c3cf23 17298828-b62b-9ee6-248b-521b0663226e 17298825-7449-2fcb-378e-13671cb4688a'''
l = ids.split(" ")
print(l)
This gives the following result. I assumed the separator is a single space; adjust it as needed:
['17298626-991c-e490-bae6-47079c6e2202', '17298496-19bd-2f89-7b5f-881921abc632', '17298698-3e17-7a9b-b337-aacfd9483b1b', '172986ac-d91d-c4ea-2e50-d53700480dd0', '172986d0-18aa-6f51-9c62-6cb087ad31e5', '172986f4-80f0-5c21-3aee-12f22a5f4322', '17298712-a4ac-7b36-08e9-8512fa8322dd', '17298747-8cc6-d9d0-8d05-50adf228c029', '1729875c-050f-9a99-4850-bb0e6ad35fb0', '1729875f-0d50-dc94-5515-b4891c40d81c', '17298761-c26b-3ce5-e77e-db412c38a5b4', '172987c8-2b5d-0d94-c365-e8407b0a8860', '1729881a-e583-2b54-3a52-d092020d9c1d', '1729881c-64a2-67cf-d561-6e5e38ed14cb', '172987ec-7a20-7eb6-3ebe-a9fb621bb566', '17298813-7ac4-258b-d6f9-aaf43f9147b1', '17298813-f1ef-d28a-0817-5f3b86c3cf23', '17298828-b62b-9ee6-248b-521b0663226e', '17298825-7449-2fcb-378e-13671cb4688a']
Edit
You get a separate one-element list on each iteration because each iteration handles only one id. Instead, initialize an empty list before the loop and append each id to it:
l = []
for i in data:
    if 'traceId' in i:
        id = i['traceId']
        l.append(id)
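The same loop can be written as a list comprehension. A small sketch, assuming the file parses to a list of objects as in the question; the inline string stands in for requests.json and keeps only the relevant fields:

```python
import json

# Shortened stand-in for requests.json; only the fields relevant here are kept.
raw = """
[
  {"id": "a", "traceId": "17298626-991c-e490-bae6-47079c6e2202"},
  {"id": "b", "traceId": "17298496-19bd-2f89-7b5f-881921abc632"},
  {"id": "c"}
]
"""

data = json.loads(raw)

# One list, one entry per object that has a traceId.
trace_ids = [item["traceId"] for item in data if "traceId" in item]
print(trace_ids)
```

No split() is needed at all, because each traceId is already a separate string.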

You can append each id to a list, for example:
# declare the list once, before the loop
l1 = []
# this line goes inside your loop
l1.append(id)

I'm assuming id is a single str value that holds all the ids. id.split() then returns them as one Python list, since the ids in your example are separated by whitespace.
id = """17298626-991c-e490-bae6-47079c6e2202 17298496-19bd-2f89-7b5f-881921abc632
17298698-3e17-7a9b-b337-aacfd9483b1b 172986ac-d91d-c4ea-2e50-d53700480dd0
172986d0-18aa-6f51-9c62-6cb087ad31e5 172986f4-80f0-5c21-3aee-12f22a5f4322
17298712-a4ac-7b36-08e9-8512fa8322dd 17298747-8cc6-d9d0-8d05-50adf228c029
1729875c-050f-9a99-4850-bb0e6ad35fb0 1729875f-0d50-dc94-5515-b4891c40d81c
17298761-c26b-3ce5-e77e-db412c38a5b4 172987c8-2b5d-0d94-c365-e8407b0a8860
1729881a-e583-2b54-3a52-d092020d9c1d 1729881c-64a2-67cf-d561-6e5e38ed14cb
172987ec-7a20-7eb6-3ebe-a9fb621bb566 17298813-7ac4-258b-d6f9-aaf43f9147b1
17298813-f1ef-d28a-0817-5f3b86c3cf23 17298828-b62b-9ee6-248b-521b0663226e
17298825-7449-2fcb-378e-13671cb4688a"""
id_list = id.split()
print(id_list)
Output:
['17298626-991c-e490-bae6-47079c6e2202', '17298496-19bd-2f89-7b5f-881921abc632',
'17298698-3e17-7a9b-b337-aacfd9483b1b', '172986ac-d91d-c4ea-2e50-d53700480dd0',
'172986d0-18aa-6f51-9c62-6cb087ad31e5', '172986f4-80f0-5c21-3aee-12f22a5f4322',
'17298712-a4ac-7b36-08e9-8512fa8322dd', '17298747-8cc6-d9d0-8d05-50adf228c029',
'1729875c-050f-9a99-4850-bb0e6ad35fb0', '1729875f-0d50-dc94-5515-b4891c40d81c',
'17298761-c26b-3ce5-e77e-db412c38a5b4', '172987c8-2b5d-0d94-c365-e8407b0a8860',
'1729881a-e583-2b54-3a52-d092020d9c1d', '1729881c-64a2-67cf-d561-6e5e38ed14cb',
'172987ec-7a20-7eb6-3ebe-a9fb621bb566', '17298813-7ac4-258b-d6f9-aaf43f9147b1',
'17298813-f1ef-d28a-0817-5f3b86c3cf23', '17298828-b62b-9ee6-248b-521b0663226e',
'17298825-7449-2fcb-378e-13671cb4688a']
With no arguments, split() splits on any run of whitespace (spaces, tabs, newlines). Pass the sep argument to use a different separator.
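The difference between split() and split(" ") matters as soon as the ids are separated by newlines rather than spaces. A short illustration with made-up placeholder ids:

```python
text = "id-1 id-2\nid-3\t id-4"

# No argument: split on any run of whitespace, never producing empty strings.
by_whitespace = text.split()
print(by_whitespace)

# Explicit single space: newlines and tabs stay inside the pieces.
by_space = text.split(" ")
print(by_space)

# splitlines() is an option when there is exactly one id per line.
by_line = "id-1\nid-2\nid-3".splitlines()
print(by_line)
```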

Python: How can I add extra item in dictionary created via a loop?

I'm creating a table like this:
data = json.load(f)
num_of_certificates = (len(data.get('certificates')))
new_data = sorted([{n: f"{data.get('certificates', [{}])[i].get(n, 0)}"
for n in some_dic}
for i in range(num_of_items)], key=lambda x: (int(x['exp_date_year']), int(x['exp_date_month']), int(x['exp_date_day'])))
and I want to add an extra item. Imagine having one more "n" to loop through, but I can't just add it to "some_dic", for reasons that don't matter here. I tried
data = json.load(f)
num_of_certificates = (len(data.get('certificates')))
new_data = sorted([{n: f"{data.get('certificates', [{}])[i].get(n, 0)}",
'test': 'test value'
for n in some_dic}
for i in range(num_of_items)], key=lambda x: (int(x['exp_date_year']), int(x['exp_date_month']), int(x['exp_date_day'])))
but it doesn't work. I made it work doing it like this:
data = json.load(f)
num_of_certificates = (len(data.get('certificates')))
new_data = sorted([{n: f"{data.get('certificates', [{}])[i].get(n, 0)}" if n is not "remaining_days" else "new_value"
for n in some_dic}
for i in range(num_of_certificates)], key=lambda x: (int(x['exp_date_year']), int(x['exp_date_month']), int(x['exp_date_day'])))
basically adding another empty entry to "some_dic", but this creates other issues and I feel like there's a much easier way to do this.
Here's the "some_dic" dictionary
some_dic = {
"name": False,
"type": False,
"exp_date_day": False,
"exp_date_month": False,
"exp_date_year": False,
"color": False,
"remaining_days": True
}
Here's the JSON file I'm parsing:
{
"certificates": [
{
"exp_date_year": "2020",
"name": "1",
"type": "1",
"exp_date_day": "1",
"exp_date_month": "1",
"color": "1",
"exp_date_day_of_year": 1
},
{
"exp_date_year": "2020",
"name": "2",
"type": "2",
"exp_date_day": "2",
"exp_date_month": "2",
"color": "2",
"exp_date_day_of_year": 33
},
{
"exp_date_year": "2022",
"name": "3",
"type": "3",
"exp_date_day": "3",
"exp_date_month": "3",
"color": "3",
"exp_date_day_of_year": "62"
}
]
}

First, you should probably use regular loops to avoid an overcomplicated expression like this.
Second, your f-string adds nothing:
f"{data.get('certificates', [{}])[i].get(n, 0)}"
is the same as:
str(data.get('certificates', [{}])[i].get(n, 0))
Third, you can replace:
[{n: str(data.get('certificates', [{}])[i].get(n, 0))
  for n in some_dic}
 for i in range(num_of_items)]
By:
[{n: str(c.get(n, 0))
  for n in some_dic}
 for c in data.get('certificates', [{}])]
Because c will iterate over the elements of data['certificates'].
Fourth, since you are using only the keys of some_dic, you can write:
some_keys = {"name", "type", "exp_date_day", "exp_date_month", "exp_date_year", "color"}
L = [{n: str(c.get(n, 0))
      for n in some_keys}
     for c in data.get('certificates', [{}])]
new_data = sorted(L, ...)
Fifth, I would test whether certificates is in data before the list comprehension.
Note that you get an empty list if data['certificates'] is an empty list, but with the [{}] default a missing certificates key silently yields a single row of default values. I would rather raise an error in the second case:
if 'certificates' not in data:
    raise ValueError("Data is corrupted")
certificates = data['certificates']
L = [{k: str(c.get(k, 0)) for k in some_keys} for c in certificates]
...
And sixth, the actual answer to your question: you want to add an element to the dict in one expression. One way is to unpack the comprehension into a new dict literal with **:
L = [{**{k: str(c.get(k, 0)) for k in some_keys}, 'remaining_days': 'new_value'} for c in certificates]
Again, prefer regular loops unless this part of your code is a bottleneck.
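For comparison, here is roughly what the regular-loop version could look like. It is a sketch under a few assumptions: the data dict is a shortened, deliberately unsorted copy of the question's JSON, some_keys is a list rather than a set so the column order is stable, and the helper name expiry is made up:

```python
some_keys = ["name", "type", "exp_date_day", "exp_date_month", "exp_date_year", "color"]

# Two certificates in the shape of the question's JSON, deliberately out of order.
data = {
    "certificates": [
        {"exp_date_year": "2022", "name": "3", "type": "3",
         "exp_date_day": "3", "exp_date_month": "3", "color": "3"},
        {"exp_date_year": "2020", "name": "1", "type": "1",
         "exp_date_day": "1", "exp_date_month": "1", "color": "1"},
    ]
}

if "certificates" not in data:
    raise ValueError("Data is corrupted")

rows = []
for cert in data["certificates"]:
    row = {key: str(cert.get(key, 0)) for key in some_keys}
    row["remaining_days"] = "new_value"  # the extra item, added as a plain statement
    rows.append(row)


def expiry(row):
    """Sort key: (year, month, day) as integers."""
    return (int(row["exp_date_year"]), int(row["exp_date_month"]), int(row["exp_date_day"]))


new_data = sorted(rows, key=expiry)
print(new_data)
```

Adding a second or third extra item is then just another assignment inside the loop.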

Python find obj value based on condition within dictionary

I get the JSON data below from a Python request:
{
"results": [
{
"name": "virtual-machine-1",
"guest": "Microsoft Windows Server 2016 (64-bit)",
"status": "green",
"id": "567890-004"
},
{
"name": "virtual-machine-2",
"guest": "CoreOS Linux (64-bit)",
"status": "green",
"id": "567890-005"
}
]
}
How can I get the "id" values of all dictionaries based on their "name" values?
I have seen solutions for finding values based on keys but not on a conditional basis within same dictionary and iterating it for multiple dictionaries. Appreciate your help.

That's re-keying off the id:
>>> {result['id']: result['name'] for result in data['results']}
{'567890-004': 'virtual-machine-1', '567890-005': 'virtual-machine-2'}
This technique is called a dictionary comprehension.

d = {result['name']: result['id'] for result in request.dict['results'] if 'name' in result}
# request.dict is the name of the object that contains the list "results"

Convert your JSON to a list of dictionaries, then use a list comprehension.
results = [
{ "name": "virtual-machine-1",
"guest": "Microsoft Windows Server 2016 (64-bit)",
"status": "green",
"id": "567890-004"},
{"name": "virtual-machine-2",
"guest": "CoreOS Linux (64-bit)",
"status": "green",
"id": "567890-005"}
]
ids = [x['id'] for x in results if x['name'] == "virtual-machine-1"]
print(ids)  # prints ['567890-004']
ids = [x['id'] for x in results if x['name'] == "virtual-machine-2"]
print(ids)  # prints ['567890-005']
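If you need to look up several names, building a name-to-id dictionary once avoids rescanning the list each time. A minimal sketch, assuming the names are unique (a repeated name would overwrite the earlier id); the variable name id_by_name is made up:

```python
data = {
    "results": [
        {"name": "virtual-machine-1", "guest": "Microsoft Windows Server 2016 (64-bit)",
         "status": "green", "id": "567890-004"},
        {"name": "virtual-machine-2", "guest": "CoreOS Linux (64-bit)",
         "status": "green", "id": "567890-005"},
    ]
}

# Build the lookup once: name -> id.
id_by_name = {result["name"]: result["id"] for result in data["results"]}

print(id_by_name["virtual-machine-2"])
# .get returns None instead of raising KeyError for an unknown name.
print(id_by_name.get("virtual-machine-9"))
```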

Index JSON searches python

I have the following JSON, which I get from a URL:
[{
"id": 1,
"version": 23,
"external_id": "2312",
"url": "https://example.com/432",
"type": "typeA",
"date": "2",
"notes": "notes",
"title": "title",
"abstract": "dsadasdas",
"details": "something",
"accuracy": 0,
"reliability": 0,
"severity": 12,
"thing": "32132",
"other": [
"aaaaaaaaaaaaaaaaaa",
"bbbbbbbbbbbbbb",
"cccccccccccccccc",
"dddddddddddddd",
"eeeeeeeeee"
],
"nana": 8
},
{
"id": 2,
"version": 23,
"external_id": "2312",
"url": "https://example.com/432",
"type": "typeA",
"date": "2",
"notes": "notes",
"title": "title",
"abstract": "dsadasdas",
"details": "something",
"accuracy": 0,
"reliability": 0,
"severity": 12,
"thing": "32132",
"other": [
"aaaaaaaaaaaaaaaaaa",
"bbbbbbbbbbbbbb",
"cccccccccccccccc",
"dddddddddddddd",
"eeeeeeeeee"
],
"nana": 8
}]
My code:
import json
import urllib2
data = json.load(urllib2.urlopen('http://someurl/path/to/json'))
print data
I want to know how to access the "abstract" field of the object whose "id" equals 2, for example. The "id" is unique, so I can use it to index my searches.
Thanks!

Here's one way to do it. You can create a generator via a generator expression, call next to iterate that generator once, and get back the desired object.
item = next((item for item in data if item['id'] == 2), None)
if item:
    print(item['abstract'])
See also Python: get a dict from a list based on something inside the dict
EDIT : If you'd like access to all elements of the list that have a given key value (for example, id == 2) you can do one of two things. You can either create a list via comprehension (as shown in the other answer), or you can alter my solution:
my_gen = (item for item in data if item['id'] == 2)
for item in my_gen:
    print(item)
In the loop, item will iterate over those items in your list which satisfy the given condition (here, id == 2).

You can use a list comprehension to filter:
import json
j = """[{"id":1,"version":23,"external_id":"2312","url":"https://example.com/432","type":"typeA","date":"2","notes":"notes","title":"title","abstract":"dsadasdas","details":"something","accuracy":0,"reliability":0,"severity":12,"thing":"32132","other":["aaaaaaaaaaaaaaaaaa","bbbbbbbbbbbbbb","cccccccccccccccc","dddddddddddddd","eeeeeeeeee"],"nana":8},{"id":2,"version":23,"external_id":"2312","url":"https://example.com/432","type":"typeA","date":"2","notes":"notes","title":"title","abstract":"dsadasdas","details":"something","accuracy":0,"reliability":0,"severity":12,"thing":"32132","other":["aaaaaaaaaaaaaaaaaa","bbbbbbbbbbbbbb","cccccccccccccccc","dddddddddddddd","eeeeeeeeee"],"nana":8}]"""
dicto = json.loads(j)
results = [x for x in dicto if "id" in x and x["id"]==2]
And then you can print the 'abstract' values like so:
for result in results:
    if "abstract" in result:
        print(result["abstract"])

import urllib2
import json
data = json.load(urllib2.urlopen('http://someurl/path/to/json'))
# raw_input returns a string, so convert it before comparing with the integer ids
your_id = int(raw_input('enter the id'))
for each in data:
    if each['id'] == your_id:
        print(each['abstract'])
In the above code, data is a list and each is a dict, so you can access its keys directly.
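Since the id is unique, another option is to index the list once by id and then look records up directly. A small sketch; the inline JSON is a shortened stand-in for the response from the URL, and its abstract values are made up:

```python
import json

# Shortened stand-in for the JSON returned by the URL; the abstracts are made up.
raw = '[{"id": 1, "abstract": "first"}, {"id": 2, "abstract": "second"}]'
data = json.loads(raw)

# Index once by the unique id, then look records up directly.
by_id = {item["id"]: item for item in data}

record = by_id.get(2)
if record is not None:
    print(record["abstract"])
```

This pays off when you search by id more than once, because each lookup no longer walks the whole list.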
