Searching through nested JSON data with Python - python

I have a sample JSON file from a webhook response, and I want to extract just two values from it. How can I do that using Python? Assume I want to get the subscription_code and plan_code values. Thanks in anticipation.
{
"event": "subscription.create",
"data": {
"domain": "test",
"status": "active",
"subscription_code": "SUB_vsyqdmlzble3uii",
"amount": 50000,
"cron_expression": "0 0 28 * *",
"next_payment_date": "2016-05-19T07:00:00.000Z",
"open_invoice": null,
"createdAt": "2016-03-20T00:23:24.000Z",
"plan": {
"name": "Monthly retainer",
"plan_code": "PLN_gx2wn530m0i3w3m",
"description": null,
"amount": 50000,
"interval": "monthly",
"send_invoices": true,
"send_sms": true,
"currency": "NGN"
},
"authorization": {
"authorization_code": "AUTH_96xphygz",
"bin": "539983",
"last4": "7357",
"exp_month": "10",
"exp_year": "2017",
"card_type": "MASTERCARD DEBIT",
"bank": "GTBANK",
"country_code": "NG",
"brand": "MASTERCARD"
},
"customer": {
"first_name": "BoJack",
"last_name": "Horseman",
"email": "bojack@horsinaround.com",
"customer_code": "CUS_xnxdt6s1zg1f4nx",
"phone": "",
"metadata": {},
"risk_action": "default"
},
"created_at": "2016-10-01T10:59:59.000Z"
}
}

You can use the built-in json library. For example:
import json
# If your JSON is in a file
with open("foo.json") as f:
    dict_from_file = json.load(f)
# If your JSON is in a string
dict_from_string = json.loads(string)
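Once the JSON is loaded it is an ordinary nested dict, so the two values asked about can be read by chaining keys. The following is a minimal sketch that assumes the webhook body is a complete JSON object (wrapped in braces) and uses a trimmed copy of the payload from the question; the variable names are only illustrative.

```python
import json

# Trimmed copy of the webhook payload from the question.
payload = """{
  "event": "subscription.create",
  "data": {
    "subscription_code": "SUB_vsyqdmlzble3uii",
    "plan": {"plan_code": "PLN_gx2wn530m0i3w3m"}
  }
}"""

event = json.loads(payload)

# Each pair of brackets goes one level deeper into the nesting.
subscription_code = event["data"]["subscription_code"]
plan_code = event["data"]["plan"]["plan_code"]

print(subscription_code)
print(plan_code)
```

If a key might be missing from some webhook events, event.get("data", {}) returns an empty dict instead of raising KeyError.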

Related

What is the best way for me to iterate over this dataset to return all matching values from another key value pair if I match a separate key?

I want to search through this list of dicts (see bottom of post; I think that is what this particular arrangement is called) for every ['address'] that matches '0xd2'. For each match, I want to return/print the corresponding ['id'].
So in this case I would like to return:
632, 315, 432
I'm able to extract individual values like this:
none = None
print(my_dict['result'][2]["id"])
432
I'm struggling with how to get a loop to do this properly.
{
"total": 4,
"page": 0,
"page_size": 100,
"result": [
{
"address": "0xd2",
"id": "632",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "315",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "432",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0x44",
"id": "100",
"amount": "1",
"name": "Suicide Squad",
"group": "DC",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
}
],
"status": "SYNCED"
}
Welcome to StackOverflow.
You can try a list comprehension:
[res["id"] for res in my_dict["result"] if res["address"] == "0xd2"]
If you'd like to use a for loop:
ids = []
for res in my_dict["result"]:
    if res["address"] == "0xd2":
        ids.append(res["id"])
You can use a list comprehension.
import json
json_string = """{
"total": 4,
"page": 0,
"page_size": 100,
"result": [
{
"address": "0xd2",
"id": "632",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "315",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0xd2",
"id": "432",
"amount": "1",
"name": "Avengers",
"group": "Marvel",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
},
{
"address": "0x44",
"id": "100",
"amount": "1",
"name": "Suicide Squad",
"group": "DC",
"uri": "https://google.com/",
"metadata": null,
"synced_at": "2022-05-26T22:52:34.113Z",
"last_sync": "2022-05-26T22:52:34.113Z"
}
],
"status": "SYNCED"
}"""
json_dict = json.loads(json_string)
result = [elem['id'] for elem in json_dict['result'] if elem['address'] == '0xd2']
print(result)
Output:
['632', '315', '432']
This stores the matching ids in a list:
ids = []
for r in dataset.get('result', []):
    if r.get('address') == '0xd2':
        ids.append(r.get('id'))
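If the same lookup is needed for more than one address, the loop can be wrapped in a small function. This is only a sketch: ids_for_address is a hypothetical helper name, and the sample data is a shortened version of the response in the question.

```python
def ids_for_address(response, address):
    """Return the "id" of every entry in response["result"] whose "address" matches."""
    return [
        entry["id"]
        for entry in response.get("result", [])
        if entry.get("address") == address
    ]


# Shortened version of the response from the question.
sample = {
    "result": [
        {"address": "0xd2", "id": "632"},
        {"address": "0xd2", "id": "315"},
        {"address": "0xd2", "id": "432"},
        {"address": "0x44", "id": "100"},
    ]
}

print(ids_for_address(sample, "0xd2"))  # ['632', '315', '432']
print(ids_for_address(sample, "0x44"))  # ['100']
```

Using .get() for "result" and "address" means a response without those keys yields an empty list rather than an exception.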

JSON to Angular

I am using the MS Graph API with Python as a backend and Angular as a frontend to pull data, but the output is shown as raw JSON with all the metadata. I want to show only specific fields from it. How can I do that?
Python Code:
@app.route('/getChannel', methods=['GET', 'POST'])
def Channels():
    token = _get_token_from_cache(app_config.SCOPE)
    if not token:
        return redirect(url_for("login"))
    channel_data = requests.get(
        app_config.ChannelURL,
        headers={'Authorization': 'Bearer ' + token['access_token']},
    ).json()['value']
    return render_template('team.html', result=channel_data)
JSON output:
[
{
"createdDateTime": "2021-05-08T04:47:39.67Z",
"description": null,
"displayName": "General",
"email": "",
"id": "19:c9f316c845794a21a1ff2ba50be2bb1e@thread.tacv2",
"isFavoriteByDefault": null,
"membershipType": "standard",
"webUrl":
},
{
"createdDateTime": "2021-05-08T05:00:11.348Z",
"description": null,
"displayName": "Developement",
"email": "",
"id": "19:da1aff92f51f416c9390881e2fa70716@thread.tacv2",
"isFavoriteByDefault": true,
"membershipType": "standard",
"webUrl":
},
{
"createdDateTime": "2021-05-09T15:16:46.27Z",
"description": "testing",
"displayName": "channel1",
"email": "",
"id": "19:e81758e4858e4f598d31214282d6c100@thread.tacv2",
"isFavoriteByDefault": false,
"membershipType": "standard",
"webUrl":
}
]
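One possible approach is to trim each channel down to the fields you care about in the Python backend, before the data reaches the template or the Angular client. The sketch below is an assumption rather than part of the original post: pick_fields is a hypothetical helper, the chosen field names are just an example, and the ids are shortened.

```python
def pick_fields(channels, fields=("displayName", "description", "membershipType")):
    """Return a copy of each channel dict containing only the requested keys."""
    return [{key: channel.get(key) for key in fields} for channel in channels]


# Two channels shaped like the output in the question (ids shortened for the example).
channel_data = [
    {"displayName": "General", "description": None, "id": "19:aaa", "membershipType": "standard"},
    {"displayName": "channel1", "description": "testing", "id": "19:bbb", "membershipType": "standard"},
]

trimmed = pick_fields(channel_data)
print(trimmed)
```

In the view from the question, the trimmed list could be passed on in place of the raw one, for example render_template('team.html', result=pick_fields(channel_data)).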

Ignore specific JSON keys when extracting data in Python

I'm extracting certain keys from several JSON files and then converting the result to a CSV in Python. I'm able to define a key list when I run my code and get the information I need.
However, there are certain sub-keys that I want to ignore from the JSON file. For example, if we look at the following snippet:
JSON Sample
[
{
"callId": "abc123",
"errorCode": 0,
"apiVersion": 2,
"statusCode": 200,
"statusReason": "OK",
"time": "2020-12-14T12:00:32.744Z",
"registeredTimestamp": 1417731582000,
"UID": "_guid_abc123==",
"created": "2014-12-04T22:19:42.894Z",
"createdTimestamp": 1417731582000,
"data": {},
"preferences": {},
"emails": {
"verified": [],
"unverified": []
},
"identities": [
{
"provider": "facebook",
"providerUID": "123",
"allowsLogin": true,
"isLoginIdentity": true,
"isExpiredSession": true,
"lastUpdated": "2014-12-04T22:26:37.002Z",
"lastUpdatedTimestamp": 1417731997002,
"oldestDataUpdated": "2014-12-04T22:26:37.002Z",
"oldestDataUpdatedTimestamp": 1417731997002,
"firstName": "John",
"lastName": "Doe",
"nickname": "John Doe",
"profileURL": "https://www.facebook.com/John.Doe",
"age": 50,
"birthDay": 31,
"birthMonth": 12,
"birthYear": 1969,
"city": "City, State",
"education": [
{
"school": "High School Name",
"schoolType": "High School",
"degree": null,
"startYear": 0,
"fieldOfStudy": null,
"endYear": 0
}
],
"educationLevel": "High School",
"favorites": {
"music": [
{
"name": "Music 1",
"id": "123",
"category": "Musician/band"
},
{
"name": "Music 2",
"id": "123",
"category": "Musician/band"
}
],
"movies": [
{
"name": "Movie 1",
"id": "123",
"category": "Movie"
},
{
"name": "Movie 2",
"id": "123",
"category": "Movie"
}
],
"television": [
{
"name": "TV 1",
"id": "123",
"category": "Tv show"
}
]
},
"followersCount": 0,
"gender": "m",
"hometown": "City, State",
"languages": "English",
"likes": [
{
"name": "Like 1",
"id": "123",
"time": "2014-10-31T23:52:53.0000000Z",
"category": "TV",
"timestamp": "1414799573"
},
{
"name": "Like 2",
"id": "123",
"time": "2014-09-16T08:11:35.0000000Z",
"category": "Music",
"timestamp": "1410855095"
}
],
"locale": "en_US",
"name": "John Doe",
"photoURL": "https://graph.facebook.com/123/picture?type=large",
"timezone": "-8",
"thumbnailURL": "https://graph.facebook.com/123/picture?type=square",
"username": "john.doe",
"verified": "true",
"work": [
{
"companyID": null,
"isCurrent": null,
"endDate": null,
"company": "Company Name",
"industry": null,
"title": "Company Title",
"companySize": null,
"startDate": "2010-12-31T00:00:00"
}
]
}
],
"isActive": true,
"isLockedOut": false,
"isRegistered": true,
"isVerified": false,
"lastLogin": "2014-12-04T22:26:33.002Z",
"lastLoginTimestamp": 1417731993000,
"lastUpdated": "2014-12-04T22:19:42.769Z",
"lastUpdatedTimestamp": 1417731582769,
"loginProvider": "facebook",
"loginIDs": {
"emails": [],
"unverifiedEmails": []
},
"rbaPolicy": {
"riskPolicyLocked": false
},
"oldestDataUpdated": "2014-12-04T22:19:42.894Z",
"oldestDataUpdatedTimestamp": 1417731582894,
"registered": "2014-12-04T22:19:42.956Z",
"regSource": "",
"socialProviders": "facebook"
}
]
I want to extract data from created and identities but ignore identities.favorites and identities.likes, along with all the data nested under them.
This is what I have so far, below. I defined the JSON keys that I want to extract in the key_list variable:
Current Code
import json
import pandas
from flatten_json import flatten
# Enter the path to the JSON and the filename without appending '.json'
file_path = r'C:\Path\To\file_name'
# Open and load the JSON file
with open(file_path + '.json', 'r', encoding='utf-8', errors='ignore') as f:
    json_list = json.load(f)
# Extract data from the defined key names
key_list = ['created', 'identities']
json_list = [{k: d[k] for k in key_list} for d in json_list]
# Flatten and convert to a data frame
json_list_flattened = (flatten(d, '.') for d in json_list)
df = pandas.DataFrame(json_list_flattened)
# Export to CSV in the same directory with the original file name
df.to_csv(file_path + '.csv', sep=',', encoding='utf-8', index=False, header=True)
Similar to key_list, I suspect I could define an ignore list and factor it into the json_list comprehension that I already have. Something like:
key_ignore = ['identities.favorites', 'identities.likes']
Then I could use dict.pop(), which looks like it will remove the unwanted sub-keys if they match. I'm just not sure how to implement that correctly.
Expected Output
As a result, the code should extract data from the keys defined in key_list and ignore the sub-keys defined in key_ignore, which are identities.favorites and identities.likes. The rest of the code will then convert it into a CSV:
created | identities.0.provider | identities.0.providerUID | identities...
2014-12-04T19:23:05.191Z | site | cb8168b0cf734b70ad541f0132763761 | ...
If the keys are always present, you can delete them directly (here d is the loaded list of records):
del d[0]['identities'][0]['likes']
del d[0]['identities'][0]['favorites']
Or, if you would rather remove the columns from the DataFrame after reading in all the JSON data, you can use:
df.drop(df.filter(regex=r'identities\.0\.favorites|identities\.0\.likes').columns, axis=1, inplace=True)
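The del statements above raise KeyError when a key is missing and only touch the first identity of the first record. A minimal sketch of the dict.pop() idea from the question, applied to every identity in every record, could look like this. drop_identity_keys is a hypothetical helper, and it assumes the ignored keys always sit directly under each entry of "identities".

```python
import copy


def drop_identity_keys(records, ignore=("favorites", "likes")):
    """Return a deep copy of records with the ignored keys removed from every identity."""
    cleaned = copy.deepcopy(records)
    for record in cleaned:
        for identity in record.get("identities", []):
            for key in ignore:
                # pop() with a default does nothing when the key is absent.
                identity.pop(key, None)
    return cleaned


# Shortened record shaped like the JSON sample in the question.
records = [
    {
        "created": "2014-12-04T22:19:42.894Z",
        "identities": [
            {"provider": "facebook", "favorites": {"music": []}, "likes": []},
            {"provider": "site"},
        ],
    }
]

cleaned = drop_identity_keys(records)
print(cleaned[0]["identities"])  # [{'provider': 'facebook'}, {'provider': 'site'}]
```

In the script from the question this would be called on json_list before the flatten step, so the unwanted columns never reach the DataFrame.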

Python parsing json data to Netezza table

[
{
"id": 1,
"name": "Lea",
"username": "Bret",
"email": "hhaa#gma",
"address": {
"street": "Light",
"suite": "Apt. 5",
"city": "Gwen",
"zipcode": "3874",
"geo": {
"lat": "-37.3159",
"lng": "81.1496"
}
},
"phone": "1-770",
"website": "hilde.org",
"company": {
"name": "Roma",
"catchPhrase": "net",
"bs": "markets"
}
},
{
"id": 2,
"name": "Er",
"username": "Ant",
"email": "Sh",
"address": {
"street": "Vis",
"suite": "89",
"city": "Wibrugh",
"zipcode": "905",
"geo": {
"lat": "-43.9509",
"lng": "-34.4618"
}
},
"phone": "010-69",
"website": "ansia.net",
"company": {
"name": "Deist",
"catchPhrase": "contingency",
"bs": " supply-chains"
}
}
]
I am getting this data from web scraping and would like to store it in a Netezza database. Could you please give me some sample code? Do I need to correct the JSON first? If so, how would I do it?
Also, when I try to iterate over the items in the list, I only get the details of the last user id.
I would suggest a different approach, because it scales better:
1) Load the raw text data into a (temporary) table with Netezza's ‘external table’ syntax.
2) Use these functions to parse the JSON data into table columns: https://developer.ibm.com/articles/i-json-table-trs/
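On the Python side, getting only the last user is often a sign that a single variable is overwritten on each pass of the loop instead of being appended to a list, although the loop itself is not shown, so that is a guess. The sketch below is not part of the answer above: it flattens each nested record into one flat row and writes all rows as delimited text, which is one plausible shape for a file to be loaded afterwards. The flatten helper, the underscore-joined column names, and the shortened sample records are all assumptions.

```python
import csv
import io


def flatten(record, parent=""):
    """Flatten nested dicts into a single dict with underscore-joined keys."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}_{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Shortened records shaped like the scraped data in the question.
users = [
    {"id": 1, "name": "Lea", "address": {"city": "Gwen", "geo": {"lat": "-37.3159"}}},
    {"id": 2, "name": "Er", "address": {"city": "Wibrugh", "geo": {"lat": "-43.9509"}}},
]

# Build one row per user; appending inside the loop keeps every user,
# not just the last one.
rows = []
for user in users:
    rows.append(flatten(user))

# Write the rows as delimited text (to a string here; a real file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```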

What's wrong with my paypal invoice?

I have a small problem creating PayPal invoices using the PayPal SDK in my Django project. When I try to execute this code:
invoice_id = ''
invoice = Invoice({
"merchant_info": {
"email": '', # You must change this to your sandbox email account
"first_name": str(merchant.first_name),
"last_name": str(merchant.last_name),
"business_name": str(merchant_full_name),
"phone": {
"country_code": "001",
"national_number": str(merchant.phone)
},
"address": {
"line1": str(merchant_address.address),
"city": str(merchant_address.city),
"state": str(merchant_address.state),
"postal_code": str(merchant_address.zip_code),
"country_code": "US"
}
},
"billing_info": [{"email": buyer.paypal_address}],
"items": [
{
"name": "Slab",
"quantity": 1,
"unit_price": {
"currency": "USD",
"value": float(slab.price)
}
}
],
"note": "Invoice for slab",
"payment_term": {
"term_type": "NET_45"
},
"shipping_info": {
"first_name": buyer.first_name,
"last_name": buyer.last_name,
"business_name": str(buyer_full_name),
"phone": {
"country_code": "001",
"national_number": str(buyer.phone)
},
"address": {
"line1": str(buyer_address.address),
"city": str(buyer_address.city),
"state": str(buyer_address.state),
"postal_code": str(buyer_address.zip_code),
"country_code": "US"
}
},
"shipping_cost": {
"amount": {
"currency": "USD",
"value": 0
}
}
})
if invoice.create():
    print(json.dumps(invoice.to_dict(), sort_keys=False, indent=4))
    invoice_id = invoice['id']
    return invoice_id
else:
    print(invoice.error)
My server returns these errors:
INFO:paypalrestsdk.api:Request[POST]: https://api.sandbox.paypal.com/v1/oauth2/token
INFO:paypalrestsdk.api:Response[200]: OK, Duration: 1.156739s.
INFO:paypalrestsdk.api:PayPal-Request-Id: 660f7807-961f-4460-bafd-18412b489a91
INFO:paypalrestsdk.api:Request[POST]: https://api.sandbox.paypal.com/v1/invoicing/invoices
INFO:paypalrestsdk.api:Response[401]: Unauthorized, Duration: 1.213655s.
INFO:paypalrestsdk.api:Request[POST]: https://api.sandbox.paypal.com/v1/oauth2/token
INFO:paypalrestsdk.api:Response[200]: OK, Duration: 1.53491s.
As I understand it, this is a problem with the PayPal token, right?
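You need to authenticate your request:
import base64
import requests
# b64encode only accepts bytes-like objects
credentials = b"userid:password"
# Decode the result so the header holds plain text rather than "b'...'"
b64_val = base64.b64encode(credentials).decode("ascii")
# api_URL and payload are placeholders for your endpoint and request body
r = requests.post(api_URL,
                  headers={"Authorization": "Basic %s" % b64_val},
                  data=payload)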
