How to deserialize JSON coming from a DynamoDB stream - python

event = event = {'Records': [{'eventID': '2339bc590c21035b84f8cc602b12c1d2', 'eventName': 'INSERT', 'eventVersion': '1.1', 'eventSource': 'aws:dynamodb', 'awsRegion': 'us-east-1', 'dynamodb': {'ApproximateCreationDateTime': 1595908037.0, 'Keys': {'id': {'S': '9'}}, 'NewImage': {'last_name': {'S': 'Hus'}, 'id': {'S': '9'}, 'age': {'S': '95'}}, 'SequenceNumber': '3100000000035684810908', 'SizeBytes': 23, 'StreamViewType': 'NEW_IMAGE'}, 'eventSourceARN': 'arn:aws:dynamodb:us-east-1:656441365658:table/glossary/stream/2020-07-28T00:26:55.462'}, {'eventID': 'bbd4073256ef3182b3c00f13ead09501', 'eventName': 'MODIFY', 'eventVersion': '1.1', 'eventSource': 'aws:dynamodb', 'awsRegion': 'us-east-1', 'dynamodb': {'ApproximateCreationDateTime': 1595908037.0, 'Keys': {'id': {'S': '2'}}, 'NewImage': {'last_name': {'S': 'JJ'}, 'id': {'S': '2'}, 'age': {'S': '5'}}, 'SequenceNumber': '3200000000035684810954', 'SizeBytes': 21, 'StreamViewType': 'NEW_IMAGE'}, 'eventSourceARN': 'arn:aws:dynamodb:us-east-1:656441365658:table/glossary/stream/2020-07-28T00:26:55.462'}, {'eventID': 'a9c90c0c4a5a4b64d0314c4557e94e28', 'eventName': 'INSERT', 'eventVersion': '1.1', 'eventSource': 'aws:dynamodb', 'awsRegion': 'us-east-1', 'dynamodb': {'ApproximateCreationDateTime': 1595908037.0, 'Keys': {'id': {'S': '10'}}, 'NewImage': {'last_name': {'S': 'Hus'}, 'id': {'S': '10'}, 'age': {'S': '95'}}, 'SequenceNumber': '3300000000035684810956', 'SizeBytes': 25, 'StreamViewType': 'NEW_IMAGE'}, 'eventSourceARN': 'arn:aws:dynamodb:us-east-1:656441365658:table/glossary/stream/2020-07-28T00:26:55.462'}, {'eventID': '288f4a424992e5917af0350b53f754dc', 'eventName': 'MODIFY', 'eventVersion': '1.1', 'eventSource': 'aws:dynamodb', 'awsRegion': 'us-east-1', 'dynamodb': {'ApproximateCreationDateTime': 1595908037.0, 'Keys': {'id': {'S': '1'}}, 'NewImage': {'last_name': {'S': 'V'}, 'id': {'S': '1'}, 'age': {'S': '2'}}, 'SequenceNumber': '3400000000035684810957', 'SizeBytes': 20, 'StreamViewType': 'NEW_IMAGE'}, 
'eventSourceARN': 'arn:aws:dynamodb:us-east-1:656441365658:table/glossary/stream/2020-07-28T00:26:55.462'}]}
The event above comes from a DynamoDB stream. I need to extract some values from it.
My code is below, but it returns nothing:
def deserialize(event):
    data = {}
    data["M"] = event
    return extract_some(data)

def extract_some(event):
    for key, value in list(event.items()):
        if (key == "NULL"):
            return None
        if (key == "S" or key == "BOOL"):
            return value

for record in event['Records']:
    doc = deserialize(record['dynamodb']['NewImage'])
    print (doc)
Expected output:
{'last_name': 'Hus', 'id': '9', 'age': '95'}
{'last_name': 'JJ', 'id': '2', 'age': '5'}
{'last_name': 'Hus', 'id': '10', 'age': '95'}
{'last_name': 'V', 'id': '1', 'age': '2'}

Try this:
from pprint import pprint

result = []
for r in event['Records']:
    tmp = {}
    for k, v in r['dynamodb']['NewImage'].items():
        if "S" in v.keys() or "BOOL" in v.keys():
            tmp[k] = v.get('S', v.get('BOOL', False))
        elif 'NULL' in v:
            tmp[k] = None
    result.append(tmp)
pprint(result)
Output:
[{'age': '95', 'id': '9', 'last_name': 'Hus'},
{'age': '5', 'id': '2', 'last_name': 'JJ'},
{'age': '95', 'id': '10', 'last_name': 'Hus'},
{'age': '2', 'id': '1', 'last_name': 'V'}]
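The loop above covers only the S, BOOL and NULL type tags. As a minimal sketch, assuming the stream images may also contain numbers, maps, lists and sets, a small recursive helper can handle the other tags too. The names deserialize_attr, deserialize_image and sample_image are made up for this illustration. If boto3 is available, its TypeDeserializer class in boto3.dynamodb.types is meant for the same job and is probably the better choice.

```python
from decimal import Decimal

def deserialize_attr(attr):
    """Convert one DynamoDB-typed value, e.g. {'S': 'Hus'}, to a plain Python value."""
    (type_tag, value), = attr.items()  # each attribute value carries exactly one type tag
    if type_tag == "NULL":
        return None
    if type_tag in ("S", "BOOL", "B"):
        return value
    if type_tag == "N":
        return Decimal(value)  # numbers arrive as strings
    if type_tag == "M":
        return {k: deserialize_attr(v) for k, v in value.items()}
    if type_tag == "L":
        return [deserialize_attr(v) for v in value]
    if type_tag in ("SS", "BS"):
        return set(value)
    if type_tag == "NS":
        return {Decimal(v) for v in value}
    raise ValueError(f"unknown DynamoDB type tag: {type_tag!r}")

def deserialize_image(image):
    """Convert a whole NewImage/OldImage mapping to a plain dict."""
    return {name: deserialize_attr(attr) for name, attr in image.items()}

# One NewImage taken from the event in the question.
sample_image = {'last_name': {'S': 'Hus'}, 'id': {'S': '9'}, 'age': {'S': '95'}}
doc = deserialize_image(sample_image)
print(doc)  # {'last_name': 'Hus', 'id': '9', 'age': '95'}
```

In a handler this would be called as deserialize_image(record['dynamodb']['NewImage']) for each record in event['Records'].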

Related

Comparing 2 uneven lists consists of dictionaries with unique keys in python and search key value in a but not in b

Hi all, I'm still learning Python and am looking for help with lists of dictionaries.
I have two lists of dictionaries, a1 and p1. I want to pick out the entries of a1 whose "Code" is present in a1 but not in p1, and only those whose IsRemoved value is 0, i.e. [IsRemoved] == 0.
a1 = [{'Code': '1', 'Name': 'ven1', 'DomainName': 'xyz.com', 'IsRemoved': 0}, {'Code': '2', 'Name': 'ven2', 'DomainName': 'abc.co.in', 'IsRemoved': 1}, {'Code': '3', 'Name': 'ven3', 'DomainName': 'abc.com', 'IsRemoved': 0}, {'Code': 'v001', 'Name': 'ven1', 'DomainName': 'xyz.com|abc.com', 'IsRemoved': 0},{'Code': '4', 'Name': 'ven4', 'DomainName': 'xyz.com', 'IsRemoved': 0}, {'Code': '5', 'Name': 'ven5', 'DomainName': 'abc.com', 'IsRemoved': 0}, {'Code': '6', 'Name': 'ven6', 'DomainName': 'xyz.com', 'IsRemoved': 0}, {'Code': '7', 'Name': 'ven7', 'DomainName': 'xyz.com', 'IsRemoved': 1}, {'Code': '8', 'Name': 'ven8', 'DomainName': 'abc.co.in', 'IsRemoved': 0}, {'Code': '9', 'Name': 'ven9', 'DomainName': 'xyz.com', 'IsRemoved': 1}, {'Code': '10', 'Name': 'ven10', 'DomainName': 'xyz.com', 'IsRemoved': 0}, {'Code': '11', 'Name': 'ven6', 'DomainName': 'xyz.com', 'IsRemoved': 0}, {'Code': 'v001', 'Name': 'ven1', 'DomainName': 'xyz.com|abc.com', 'IsRemoved': 1}, {'Code': 'v002', 'Name': 'ven2', 'DomainName': 'xyz.com|abc.com', 'IsRemoved': 0}]
p1 = [{'Code': '1', 'Name': 'ven1', 'Domain': ['xyz.com']}, {'Code': '2', 'Name': 'ven2', 'Domain': ['abc.co.in']}, {'Code': '3', 'Name': 'ven3', 'Domain': ['abc.com']}, {'Code': '4', 'Name': 'ven4', 'Domain': ['xyz.com']}, {'Code': '5', 'Name': 'ven5', 'DomainName': 'abc.com'}]
So here I'm expecting output as:
{'Code': 'v001', 'Name': 'ven1', 'DomainName': 'xyz.com|abc.com', 'IsRemoved': 0}
{'Code': '6', 'Name': 'ven6', 'DomainName': 'xyz.com', 'IsRemoved': 0}
{'Code': '8', 'Name': 'ven8', 'DomainName': 'abc.co.in', 'IsRemoved': 0}
{'Code': '10', 'Name': 'ven10', 'DomainName': 'xyz.com', 'IsRemoved': 0}
{'Code': '11', 'Name': 'ven6', 'DomainName': 'xyz.com', 'IsRemoved': 0}
{'Code': 'v002', 'Name': 'ven2', 'DomainName': 'xyz.com|abc.com', 'IsRemoved': 0}
I tried it using a for loop with != in the following way:
for x in a1:
    for y in p1:
        if x["Code"] != y["Code"] and x["IsRemoved"] == 0:
            print(x)
When I test with == it works fine and gives the correct result, but strangely it does not work correctly with !=.
Please have a look and advise whether I can use some other method to get the desired result.

Here is one approach you can try: build a set of the unique codes from p1, then use a list comprehension to filter a1.
unique = {p['Code'] for p in p1} # use set for faster lookup
[a for a in a1 if a['Code'] not in unique and a['IsRemoved'] == 0]
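The nested loop with != misbehaves because the test runs once per pair: a row of a1 is printed for every row of p1 whose code differs, so codes that do exist in p1 still get printed and missing codes are printed several times. A small sketch of the difference, using shortened stand-ins (a_rows, p_rows) for the lists in the question:

```python
# Shortened stand-ins for a1 and p1 from the question.
a_rows = [{'Code': '1', 'IsRemoved': 0},
          {'Code': '6', 'IsRemoved': 0},
          {'Code': '7', 'IsRemoved': 1}]
p_rows = [{'Code': '1'}, {'Code': '2'}]

# Pairwise != test: one hit for every p row that differs from x.
naive = [x['Code'] for x in a_rows for y in p_rows
         if x['Code'] != y['Code'] and x['IsRemoved'] == 0]

# "Not present in p_rows" means the code differs from ALL p rows.
missing = [x for x in a_rows
           if x['IsRemoved'] == 0
           and all(x['Code'] != y['Code'] for y in p_rows)]

print(naive)    # ['1', '6', '6']
print(missing)  # [{'Code': '6', 'IsRemoved': 0}]
```

The set-based comprehension above expresses the same "differs from all" condition and avoids rescanning p1 for every row.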

create dictionary of values based on matching keys in list from nested dictionary

I have a nested dictionary called mainlookup with up to 300 items, from TYPE1 to TYPE300:
mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
The input list is searched against the lookup based on the strings TYPE1, TYPE2 and so on:
input_list = ['thissong-fav-user:type1-chan-44-John',
'thissong-fav-user:type1-chan-45-kelly-md',
'thissong-fav-user:type2-rock-45-usa',
'thissong-fav-user:type737-chan-45-patrick-md',
'thissong-fav-user:type37-chan-45-kelly-md']
I want to find the TYPE string in each input_list entry and then create a dictionary as shown below:
Output_Desired = {'thissong-fav-user:type1-chan-44-John': [{'Song': 'Rock', 'Type': 'Hard',
'Price':'10'}],
'thissong-fav-user:type1-chan-45-kelly-md': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'thissong-fav-user:type2-rock-45-usa': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
'thissong-fav-user:type37-chan-45-kelly-md': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
Note: thissong-fav-user:type737-chan-45-patrick-md in the list has no match, so I want to create a separate list for values that are not found in mainlookup:
Notfound_list = ['thissong-fav-user:type737-chan-45-patrick-md', and so on..]
Appreciate your help.

You can try this:
mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}], 'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
input_list = ['thissong-fav-user:type1-chan-44-John',
'thissong-fav-user:type1-chan-45-kelly-md', 'thissong-fav-user:type737-chan-45-kelly-md']
dct={i:mainlookup[i.split(':')[1].split('-')[0].upper()] for i in input_list if i.split(':')[1].split('-')[0].upper() in mainlookup.keys()}
Notfoundlist=[i for i in input_list if i not in dct.keys() ]
print(dct)
print(Notfoundlist)
Output:
{'thissong-fav-user:type1-chan-44-John': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}], 'thissong-fav-user:type1-chan-45-kelly-md': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}]}
['thissong-fav-user:type737-chan-45-kelly-md']

An answer using regular expressions:
import re
from pprint import pprint
input_list = ['thissong-fav-user:type1-chan-44-John', 'thissong-fav-user:type1-chan-45-kelly-md', 'thissong-fav-user:type2-rock-45-usa', 'thissong-fav-user:type737-chan-45-patrick-md', 'thissong-fav-user:type37-chan-45-kelly-md']
mainlookup = {'TYPE2': {'Song': 'Reggaeton', 'Type': 'Hard', 'Price': '30'}, 'TYPE1': {'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}, 'TYPE737': {'Song': 'Jazz', 'Type': 'Hard', 'Price': '99'}, 'TYPE37': {'Song': 'Rock', 'Type': 'Soft', 'Price': '1'}}
pattern = re.compile('type[0-9]+')
matches = [re.search(pattern, x).group(0) for x in input_list]
result = {x: [mainlookup[matches[i].upper()]] for i, x in enumerate(input_list)}
pprint(result)
Output:
{'thissong-fav-user:type1-chan-44-John': [{'Price': '10',
'Song': 'Rock',
'Type': 'Hard'}],
'thissong-fav-user:type1-chan-45-kelly-md': [{'Price': '10',
'Song': 'Rock',
'Type': 'Hard'}],
'thissong-fav-user:type2-rock-45-usa': [{'Price': '30',
'Song': 'Reggaeton',
'Type': 'Hard'}],
'thissong-fav-user:type37-chan-45-kelly-md': [{'Price': '1',
'Song': 'Rock',
'Type': 'Soft'}],
'thissong-fav-user:type737-chan-45-patrick-md': [{'Price': '99',
'Song': 'Jazz',
'Type': 'Hard'}]}
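The regular-expression lookup above raises an error when an entry has no type token or its key is absent from mainlookup. A sketch that keeps the regex idea but also builds the not-found list the question asks for, using the question's own mainlookup. The names type_pattern, found and not_found are chosen for this illustration, and the pattern assumes the type token always sits between ':' and the next '-'.

```python
import re

mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
              'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
              'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
input_list = ['thissong-fav-user:type1-chan-44-John',
              'thissong-fav-user:type1-chan-45-kelly-md',
              'thissong-fav-user:type2-rock-45-usa',
              'thissong-fav-user:type737-chan-45-patrick-md',
              'thissong-fav-user:type37-chan-45-kelly-md']

# Capture the token between ':' and the following '-', e.g. 'type37'.
type_pattern = re.compile(r':(type\d+)-', re.IGNORECASE)

found = {}
not_found = []
for item in input_list:
    match = type_pattern.search(item)
    key = match.group(1).upper() if match else None
    if key in mainlookup:
        found[item] = mainlookup[key]
    else:
        not_found.append(item)  # no type token, or key absent from mainlookup

print(found)
print(not_found)  # ['thissong-fav-user:type737-chan-45-patrick-md']
```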

Python: Remove Duplicates From List of Dicts Based on DateTime Key

I want to reduce this list of dictionaries to take the most current record of the duplicates, where duplicates are determined by same project_name and same feature_group_name. How do I go about doing that?
The way I'm doing it right now is as follows, but I'm sure there is a better way w/o the need for pandas:
import pandas as pd

d = pd.DataFrame(l)
sorted_df = d.sort_values('datetime', ascending=False).drop_duplicates(['project_name', 'feature_group_name'],
                                                                       keep='first')
sorted_df.to_dict(orient='records')
Example:
l = [{'project_name': 'test', 'feature_group_name': 'dnb_features',
'datetime': '2020-02-10T21:24:29Z', 'id': '1'},
{'project_name': 'test', 'feature_group_name': 'dnb_features2',
'datetime': '2020-02-10T21:24:29Z', 'id': '2'},
{'project_name': 'test', 'feature_group_name': 'dnb_features',
'datetime': '2020-02-09T21:24:29Z', 'id': '3'},
{'project_name': 'test', 'feature_group_name': 'dnb_features',
'datetime': '2020-02-08T21:24:29Z', 'id': '4'},
{'project_name': 'test', 'feature_group_name': 'dnb_features2',
'datetime': '2020-02-08T21:24:29Z', 'id': '5'},
{'project_name': 'test', 'feature_group_name': 'dnb_features3',
'datetime': '2020-02-08T21:24:29Z', 'id': '6'}]
Desired Result:
[{'project_name': 'test', 'feature_group_name': 'dnb_features',
'datetime': '2020-02-10T21:24:29Z', 'id': '1'},
{'project_name': 'test', 'feature_group_name': 'dnb_features2',
'datetime': '2020-02-10T21:24:29Z', 'id': '2'},
{'project_name': 'test', 'feature_group_name': 'dnb_features3',
'datetime': '2020-02-08T21:24:29Z', 'id': '6'}]

l = [i for n, i in enumerate(l) if i not in l[n + 1:]]
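The one-liner above drops only dictionaries that are equal in every field. Because each record in the example has its own id, it leaves the list unchanged. A sketch without pandas that groups by (project_name, feature_group_name) and keeps the newest record per group follows. It assumes every datetime is an ISO 8601 UTC string in the same format, so plain string comparison orders them chronologically. The names records, latest and result are chosen for this illustration.

```python
records = [
    {'project_name': 'test', 'feature_group_name': 'dnb_features',
     'datetime': '2020-02-10T21:24:29Z', 'id': '1'},
    {'project_name': 'test', 'feature_group_name': 'dnb_features2',
     'datetime': '2020-02-10T21:24:29Z', 'id': '2'},
    {'project_name': 'test', 'feature_group_name': 'dnb_features',
     'datetime': '2020-02-09T21:24:29Z', 'id': '3'},
    {'project_name': 'test', 'feature_group_name': 'dnb_features',
     'datetime': '2020-02-08T21:24:29Z', 'id': '4'},
    {'project_name': 'test', 'feature_group_name': 'dnb_features2',
     'datetime': '2020-02-08T21:24:29Z', 'id': '5'},
    {'project_name': 'test', 'feature_group_name': 'dnb_features3',
     'datetime': '2020-02-08T21:24:29Z', 'id': '6'},
]

latest = {}
for rec in records:
    key = (rec['project_name'], rec['feature_group_name'])
    # Keep the record with the greatest timestamp seen so far for this key.
    if key not in latest or rec['datetime'] > latest[key]['datetime']:
        latest[key] = rec

result = list(latest.values())
print([rec['id'] for rec in result])  # ['1', '2', '6']
```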

Finding missing value in JSON using python

I am facing this problem: I want to separate the records that are complete from those that are not.
So I want to put a flag like 'completed' in the JSON, as in the example output below.
This is the data that I have:
data=[{'id': 'abc001',
'demo':{'gender':'1',
'job':'6',
'area':'3',
'study':'3'},
'ex_data':{'fam':'small',
'scholar':'2'}},
{'id': 'abc002',
'demo':{'gender':'1',
'edu':'6',
'qual':'3',
'living':'3'},
'ex_data':{'fam':'',
'scholar':''}},
{'id': 'abc003',
'demo':{'gender':'1',
'edu':'6',
'area':'3',
'sal':'3'}
'ex_data':{'fam':'big',
'scholar':NaN}}]
How can I add the flag and also detect NaN and NULL in the JSON?
Desired output:
Output=[{'id': 'abc001',
'completed':'yes',
'demo':{'gender':'1',
'job':'6',
'area':'3',
'study':'3'},
'ex_data':{'fam':'small',
'scholar':'2'}},
{'id': 'abc002',
'completed':'no',
'demo':{'gender':'1',
'edu':'6',
'qual':'3',
'living':'3'},
'ex_data':{'fam':'',
'scholar':''}},
{'id': 'abc003',
'completed':'no',
'demo':{'gender':'1',
'edu':'6',
'area':'3',
'sal':'3'}
'ex_data':{'fam':'big',
'scholar':NaN}}]

Something like this should work for you:
data = [
{
'id': 'abc001',
'demo': {
'gender': '1',
'job': '6',
'area': '3',
'study': '3'},
'ex_data': {'fam': 'small',
'scholar': '2'}
},
{
'id': 'abc002',
'demo': {
'gender': '1',
'edu': '6',
'qual': '3',
'living': '3'},
'ex_data': {'fam': '',
'scholar': ''}},
{
'id': 'abc003',
'demo': {
'gender': '1',
'edu': '6',
'area': '3',
'sal': '3'},
'ex_data': {'fam': 'big',
'scholar': None}
}
]
def browse_dict(dico):
    empty_values = 0
    for key in dico:
        if dico[key] is None or dico[key] == "":
            empty_values += 1
        if isinstance(dico[key], dict):
            for k in dico[key]:
                if dico[key][k] is None or dico[key][k] == "":
                    empty_values += 1
    if empty_values == 0:
        dico["completed"] = "yes"
    else:
        dico["completed"] = "no"

for d in data:
    browse_dict(d)
    print(d)
Output:
{'id': 'abc001', 'demo': {'gender': '1', 'job': '6', 'area': '3', 'study': '3'}, 'ex_data': {'fam': 'small', 'scholar': '2'}, 'completed': 'yes'}
{'id': 'abc002', 'demo': {'gender': '1', 'edu': '6', 'qual': '3', 'living': '3'}, 'ex_data': {'fam': '', 'scholar': ''}, 'completed': 'no'}
{'id': 'abc003', 'demo': {'gender': '1', 'edu': '6', 'area': '3', 'sal': '3'}, 'ex_data': {'fam': 'big', 'scholar': None}, 'completed': 'no'}
Note that I changed NaN to None, because what you are showing is most likely a Python dictionary rather than a JSON file, since you are using data =.
A bare NaN is not defined in Python, so None stands in for the missing value here.
If you need to convert your JSON to a dictionary, refer to the json module documentation.
Also, please check your dictionary syntax: several commas separating the entries are missing.

You could try the following.
The input is:
data = [{'demo': {'gender': '1', 'job': '6', 'study': '3', 'area': '3'}, 'id': 'abc001', 'ex_data': {'scholar': '2', 'fam': 'small'}}, {'demo': {'living': '3', 'gender': '1', 'qual': '3', 'edu': '6'}, 'id': 'abc002', 'ex_data': {'scholar': '', 'fam': ''}}, {'demo': {'gender': '1', 'area': '3', 'sal': '3', 'edu': '6'}, 'id': 'abc003', 'ex_data': {'scholar': None, 'fam': 'big'}}]
Also, NaN is not defined in Python, so None is used in its place.
for item in data:
    item["completed"] = 'yes'
    for key in item.keys():
        if isinstance(item[key], dict):
            for inner_key in item[key].keys():
                if (not item[key][inner_key]):
                    item["completed"] = "no"
                    break
        else:
            if (not item[key]):
                item["completed"] = "no"
                break
The output will be:
data = [{'demo': {'gender': '1', 'job': '6', 'study': '3', 'area': '3'}, 'completed': 'yes', 'id': 'abc001', 'ex_data': {'scholar': '2', 'fam': 'small'}}, {'demo': {'living': '3', 'edu': '6', 'qual': '3', 'gender': '1'}, 'completed': 'no', 'id': 'abc002', 'ex_data': {'scholar': '', 'fam': ''}}, {'demo': {'edu': '6', 'gender': '1', 'sal': '3', 'area': '3'}, 'completed': 'no', 'id': 'abc003', 'ex_data': {'scholar': None, 'fam': 'big'}}]
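Both answers replace NaN with None. If the data really does contain floating-point NaN values (for example after parsing JSON that holds a NaN token), a check for them can be added. This is a sketch under that assumption. The helper names is_missing and has_missing and the shortened rows list are made up for this illustration.

```python
import math

def is_missing(value):
    """True for None, the empty string, or a float NaN."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)

def has_missing(obj):
    """Recursively look for a missing value in nested dicts and lists."""
    if isinstance(obj, dict):
        return any(has_missing(v) for v in obj.values())
    if isinstance(obj, list):
        return any(has_missing(v) for v in obj)
    return is_missing(obj)

# Shortened records modelled on the question's data.
rows = [
    {'id': 'abc001', 'ex_data': {'fam': 'small', 'scholar': '2'}},
    {'id': 'abc002', 'ex_data': {'fam': '', 'scholar': ''}},
    {'id': 'abc003', 'ex_data': {'fam': 'big', 'scholar': float('nan')}},
]

for row in rows:
    row['completed'] = 'no' if has_missing(row) else 'yes'

print([row['completed'] for row in rows])  # ['yes', 'no', 'no']
```

Unlike a plain truthiness test, this does not treat 0 or False as missing.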

How to get/filter values in python3 json list dictionary response?

Below is the result I got from an API query:
[{'type':'book','title': 'example1', 'id': 12456, 'price': '8.20', 'qty': '12', 'status': 'available'},
{'type':'book','title': 'example2', 'id': 12457, 'price': '10.50', 'qty': '5', 'status': 'none'}]
How do I specify in code that I want only the title, price and status pairs?
The result should look like:
[{'title': 'example1', 'price': '8.20', 'status': 'available'},
{'title': 'example2', 'price': '10.50', 'status': 'none'}]

You can use a dictionary comprehension within a list comprehension:
L = [{'type':'book','title': 'example1', 'id': 12456, 'price': '8.20', 'qty': '12', 'status': 'available'},
{'type':'book','title': 'example2', 'id': 12457, 'price': '10.50', 'qty': '5', 'status': 'none'}]
keys = ['title', 'price', 'status']
res = [{k: d[k] for k in keys} for d in L]
print(res)
[{'price': '8.20', 'status': 'available', 'title': 'example1'},
{'price': '10.50', 'status': 'none', 'title': 'example2'}]
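The comprehension above raises KeyError if a record lacks one of the requested keys. A small variant that skips absent keys, assuming the API may omit fields in some records. The second row below is a hypothetical record without 'status', and the names rows, wanted and filtered are chosen for this illustration.

```python
rows = [
    {'type': 'book', 'title': 'example1', 'id': 12456,
     'price': '8.20', 'qty': '12', 'status': 'available'},
    # Hypothetical record that lacks the 'status' field.
    {'type': 'book', 'title': 'example2', 'id': 12457, 'price': '10.50'},
]
wanted = ('title', 'price', 'status')

# Copy only the wanted keys that are actually present in each record.
filtered = [{k: row[k] for k in wanted if k in row} for row in rows]
print(filtered)
# [{'title': 'example1', 'price': '8.20', 'status': 'available'},
#  {'title': 'example2', 'price': '10.50'}]
```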
