I have a JSON file like the one below, and the keys in custom_fields can vary for each id. I need to import this data into BigQuery, which does not allow field names to begin with a number. Using Python 3.7, how can I dynamically prepend a value to each of those keys within custom_fields without manually specifying each field name?
{
"response":[{
"id": "123",
"custom_fields":{
"5c30673efc89f7000400001d":"val1",
"5e34770a8e3d1b010a757981":"val2",
"5e3477d28e3d1b0140757993":"val3"
}},
{
"id": "456",
"custom_fields":{
"5c30673efc89f7000400001d":"val1",
"5e34770a8e3d1b010a757981":"val2",
"5e3477d28e3d1b0140757993":"val3"
}}]
}
The data is coming from an API and saved to cloud storage, with the output being retrieved and formatted to JSON with this:
import json
import urllib.request

response = urllib.request.Request('https://www.test.com')
result = urllib.request.urlopen(response)
resulttext = result.read()
jsonResponse = json.loads(resulttext.decode('utf-8'))
Desired output would be like:
{
"response":[{
"id": "123",
"custom_fields":{
"_5c30673efc89f7000400001d":"val1",
"_5e34770a8e3d1b010a757981":"val2",
"_5e3477d28e3d1b0140757993":"val3"
}},
{
"id": "456",
"custom_fields":{
"_5c30673efc89f7000400001d":"val1",
"_5e34770a8e3d1b010a757981":"val2",
"_5e3477d28e3d1b0140757993":"val3"
}}]
}
If jsonResponse looks like what you've shown in your post, then this should do the job:
for d in jsonResponse["response"]:
    d["custom_fields"] = {f"_{k}": v for k, v in d["custom_fields"].items()}
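Since the BigQuery restriction only concerns names that begin with a number, a variation on the comprehension above could prefix just those keys and tolerate records that have no custom_fields. This is a minimal sketch, not a definitive implementation: the helper name prefix_digit_keys, the "notes" key and the record without custom_fields are made up for illustration.

```python
def prefix_digit_keys(fields, prefix="_"):
    """Return a copy of `fields` with `prefix` added to keys that start with a digit."""
    return {
        (prefix + key if key[:1].isdigit() else key): value
        for key, value in fields.items()
    }


json_response = {
    "response": [
        {"id": "123", "custom_fields": {"5c30673efc89f7000400001d": "val1", "notes": "val2"}},
        {"id": "456"},  # hypothetical record with no custom_fields at all
    ]
}

for record in json_response["response"]:
    # .get() with a default keeps records without custom_fields from raising KeyError
    record["custom_fields"] = prefix_digit_keys(record.get("custom_fields", {}))

print(json_response["response"][0]["custom_fields"])
```

Keys that already start with a letter, such as the hypothetical "notes", are left untouched.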
import pprint
a_dict = {
"id": "123",
"custom_fields":{
"5c30673efc89f7000400001d":"val1",
"5e34770a8e3d1b010a757981":"val2",
"5e3477d28e3d1b0140757993":"val3"
}
}
print('before')
pprint.pprint(a_dict)
# Iterate over a snapshot of the keys, because the dict is modified inside the loop
for key in list(a_dict['custom_fields']):
    k_new = '_' + key
    a_dict['custom_fields'][k_new] = a_dict['custom_fields'].pop(key)
print('after')
pprint.pprint(a_dict)
outputs:
before
{'custom_fields': {'5c30673efc89f7000400001d': 'val1',
'5e34770a8e3d1b010a757981': 'val2',
'5e3477d28e3d1b0140757993': 'val3'},
'id': '123'}
after
{'custom_fields': {'_5c30673efc89f7000400001d': 'val1',
'_5e34770a8e3d1b010a757981': 'val2',
'_5e3477d28e3d1b0140757993': 'val3'},
'id': '123'}
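If keys beginning with a digit might also show up at other nesting levels, the same renaming could be applied recursively. The sketch below assumes that situation, which the question does not state; the function name prefix_keys_deep and the sample data are hypothetical. It builds a new structure rather than modifying the input.

```python
def prefix_keys_deep(node, prefix="_"):
    """Recursively copy `node`, prefixing dict keys that start with a digit."""
    if isinstance(node, dict):
        return {
            (prefix + k if k[:1].isdigit() else k): prefix_keys_deep(v, prefix)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [prefix_keys_deep(item, prefix) for item in node]
    return node  # strings, numbers, booleans and None are returned unchanged


# Hypothetical sample with digit-leading keys at several depths
sample = {"response": [{"id": "123", "custom_fields": {"5c30": "val1", "nested": {"9x": [{"1a": 1}]}}}]}
cleaned = prefix_keys_deep(sample)
print(cleaned)
```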
I have some json that I would like to transform from this:
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
...
{
"name":"fieldN",
"intValue":"N"
}
]
into this:
{ "field1" : "1",
"field2" : "2",
...
"fieldN" : "N",
}
For each pair, I need to change the value of the name field to a key, and the values of the intValue field to a value. This doesn't seem like flattening or denormalizing. Are there any tools that might do this out-of-the-box, or will this have to be brute-forced? What's the most pythonic way to accomplish this?
parameters = [ # assuming this is loaded already
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]
field_int_map = dict()
for p in parameters:
    field_int_map[p['name']] = p['intValue']
yields {'field1': '1', 'field2': '2', 'fieldN': 'N'}
or as a dict comprehension
field_int_map = {p['name']:p['intValue'] for p in parameters}
This works to combine the name attribute with the intValue as key:value pairs, but the result is a dictionary instead of the original input type which was a list.
Use dictionary comprehension:
json_dct = {"parameters":
[
{
"name":"field1",
"intValue":"1"
},
{
"name":"field2",
"intValue":"2"
},
{
"name":"fieldN",
"intValue":"N"
}
]}
dct = {d["name"]: d["intValue"] for d in json_dct["parameters"]}
print(dct)
# {'field1': '1', 'field2': '2', 'fieldN': 'N'}
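If the input arrives as JSON text rather than as an already-loaded Python list, the comprehension can sit between json.loads and json.dumps. A small sketch, assuming a two-item input string named raw (made up for this example):

```python
import json

# Hypothetical JSON text in the same shape as the question's list
raw = '[{"name": "field1", "intValue": "1"}, {"name": "field2", "intValue": "2"}]'

pairs = json.loads(raw)                               # list of dicts
mapping = {p["name"]: p["intValue"] for p in pairs}   # name -> intValue
as_json = json.dumps(mapping)                         # back to JSON text

print(as_json)
```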
I have the below python dictionary stored as dictPython
{
"paging": {"count": 10, "start": 0, "links": []},
"elements": [
{
"organizationalTarget~": {
"vanityName": "vv",
"localizedName": "ViV",
"name": {
"localized": {"en_US": "ViV"},
"preferredLocale": {"country": "US", "language": "en"},
},
"primaryOrganizationType": "NONE",
"locations": [],
"id": 109,
},
"role": "ADMINISTRATOR",
},
],
}
I need to get the values of vanityName, localizedName and also the values from name->localized and name->preferredLocale.
I tried dictPython.keys() and it returned dict_keys(['paging', 'elements']).
I also tried dictPython.values(), and it returned everything inside the outer braces ({}).
I need to get [vv, ViV, ViV, US, en]
I am writing this as an answer so that I can explain it better without the comment character limit.
A dict in Python is an efficient key/value data structure.
For example, given dict_ = {'key1': 'val1', 'key2': 'val2'}, we can fetch the value of key1 in two different ways:
dict_.get('key1') returns the value of the key, in this case val1. Its advantage is that if the key is wrong or not found it returns None, so no exception is raised. You can also supply a fallback: dict_.get('key1', 'returning this string if the key is not found').
dict_['key1'] does the same as .get(...) but raises a KeyError if the key is not found.
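The difference between the two lookups can be seen in a short sketch; the key name "missing" and the fallback "n/a" are arbitrary choices for the demonstration.

```python
dict_ = {"key1": "val1", "key2": "val2"}

found = dict_.get("key1")               # key exists, so its value is returned
absent = dict_.get("missing")           # key does not exist, so None is returned
fallback = dict_.get("missing", "n/a")  # key does not exist, so the given default is returned

try:
    dict_["missing"]                    # square brackets raise instead of returning None
    raised = False
except KeyError:
    raised = True

print(found, absent, fallback, raised)
```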
So, to answer your question after this introduction:
your dict can be thought of as dictionaries and/or other objects nested inside one another.
To get your values, you can do the following:
# Fetch the base dictionary to make the code more readable
base_dict = dictPython["elements"][0]["organizationalTarget~"]
# Fetch name_dict following the same approach as above
name_dict = base_dict["name"]
localized_dict = name_dict["localized"]
preferred_locale_dict = name_dict["preferredLocale"]
Now that we have fetched the wanted data from their corresponding locations in your dictionary, we can collect and print the results as follows:
# zip() would stop at the shorter dict and drop values, so extend with each dict's values instead
results_arr = [base_dict["vanityName"], base_dict["localizedName"]]
results_arr.extend(localized_dict.values())
results_arr.extend(preferred_locale_dict.values())
print(results_arr)
What about:
dic = {
"paging": {"count": 10, "start": 0, "links": []},
"elements": [
{
"organizationalTarget~": {
"vanityName": "vv",
"localizedName": "ViV",
"name": {
"localized": {"en_US": "ViV"},
"preferredLocale": {"country": "US", "language": "en"},
},
"primaryOrganizationType": "NONE",
"locations": [],
"id": 109,
},
"role": "ADMINISTRATOR",
},
],
}
base = dic["elements"][0]["organizationalTarget~"]
c = base["name"]["localized"]
d = base["name"]["preferredLocale"]
output = [base["vanityName"], base["localizedName"]]
output.extend([c[key] for key in c])
output.extend([d[key] for key in d])
print(output)
outputs:
['vv', 'ViV', 'ViV', 'US', 'en']
So something like this?
[[x['organizationalTarget~']['vanityName'],
x['organizationalTarget~']['localizedName'],
x['organizationalTarget~']['name']['localized']['en_US'],
x['organizationalTarget~']['name']['preferredLocale']['country'],
x['organizationalTarget~']['name']['preferredLocale']['language'],
] for x in dictPython['elements']]
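Chained square-bracket lookups like these raise as soon as one level is missing. If some elements might lack a key, a small helper that walks a path and falls back to a default could be used instead. This is only a sketch: the helper name dig, the trimmed-down sample data and the "unknown" default are assumptions for illustration.

```python
def dig(obj, *path, default=None):
    """Follow `path` (dict keys or list indexes) into `obj`; return `default` if any step is missing."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


# Trimmed-down version of the question's dictionary, without preferredLocale
dict_python = {
    "elements": [
        {"organizationalTarget~": {"vanityName": "vv", "name": {"localized": {"en_US": "ViV"}}}}
    ]
}

target = dig(dict_python, "elements", 0, "organizationalTarget~", default={})
vanity = dig(target, "vanityName")
localized = dig(target, "name", "localized", "en_US")
country = dig(target, "name", "preferredLocale", "country", default="unknown")
print(vanity, localized, country)
```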
I am using Python and getting data from an API. The data is formatted as in the example below, and I have a problem getting cust_id and name out of the API response.
Below is one of the things I tried, which is also what SimonR suggested in an answer. I am sure I am doing something really dumb right now, but I get the error
TypeError: the JSON object must be str, bytes or bytearray, not dict. Thank you everyone in advance for your answers.
import json
a = {
"count": 5,
"Customers": {
"32759": {
"cust_id": "1234",
"name": "Mickey Mouse"
},
"11053": {
"cust_id": "1235",
"name": "Mini Mouse"
},
"21483": {
"cust_id": "1236",
"name": "Goofy"
},
"12441": {
"cust_id": "1237",
"name": "Pluto"
},
"16640": {
"cust_id": "1238",
"name": "Donald Duck"
}
}
}
d = json.loads(a)
customers = {v["cust_id"]: v["name"] for v in d["Customers"].values()}
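Is this what you're trying to do?
import json
# json.loads expects JSON text (str, bytes or bytearray); if a is already a dict, use it directly instead
d = json.loads(a)
customers = {v["cust_id"]: v["name"] for v in d["Customers"].values()}
outputs: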
{'1234': 'Mickey Mouse',
'1235': 'Mini Mouse',
'1236': 'Goofy',
'1237': 'Pluto',
'1238': 'Donald Duck'}
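The TypeError in the question comes from passing an already-decoded dict to json.loads, which only accepts JSON text. One possible way to handle both cases is sketched below; the helper name as_dict is hypothetical and the data is shortened to two customers.

```python
import json


def as_dict(payload):
    """Decode `payload` if it is JSON text; return it unchanged if it is already a dict."""
    if isinstance(payload, (str, bytes, bytearray)):
        return json.loads(payload)
    return payload


a = {"count": 2, "Customers": {"32759": {"cust_id": "1234", "name": "Mickey Mouse"},
                               "11053": {"cust_id": "1235", "name": "Mini Mouse"}}}

for payload in (a, json.dumps(a)):   # first a dict, then the same data as JSON text
    d = as_dict(payload)
    customers = {v["cust_id"]: v["name"] for v in d["Customers"].values()}
    print(customers)
```

Both iterations produce the same mapping, because the dict is used directly and the string is decoded first.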
Well if I understood correctly you can do this:
# d is the API response in your post
# This will give you the dictionary of customers
customers = d['Customers']
Then you can iterate over the customers dictionary and save them to any data structure you want:
# This will print out the cust_id and name
for k, v in customers.items():
    print(v['cust_id'], v['name'])
Hope it helps!
import json
# convert the JSON string to a Python dict (json_string holds the raw API response text)
response = json.loads(json_string)
# loop through all customers
for key, customer in response['Customers'].items():
    # get customer id
    cust_id = customer['cust_id']
    # get customer name
    name = customer['name']
    print(cust_id, name)
I have a nested JSON data like this of about 5000 records.
{
"data": {
"attributes": [
{
"alert_type": "download",
"severity_level": "med",
"user": "10.1.1.16"
},
{
"alert_type": "download",
"severity_level": "low",
"user": "10.2.1.18"
}
]
}
}
Now I need to parse this JSON and get only certain fields in CSV format. Let's say we need alert_type and user.
I tried to parse this JSON dictionary:
>>> import json
>>> resp = '{"data":{"attributes":[{"alert_type":"download","severity_level":"med","user":"10.1.1.16"},{"alert_type":"download","severity_level":"low","user":"10.2.1.18"}]}}'
>>> user_dict = json.loads(resp)
>>> event_cnt = user_dict['data']['attributes']
>>> print event_cnt[0]['alert_type']
download
>>> print event_cnt[0]['user']
10.1.1.16
>>> print event_cnt[0]['alert_type'] + "," + event_cnt[0]['user']
download,10.1.1.16
>>>
How can I get all the values of particular keys in CSV format in a single iteration?
Output:
download,10.1.1.16
download,10.2.1.18
Simple list comprehension:
>>> jdict=json.loads(resp)
>>> ["{},{}".format(d["alert_type"],d["user"]) for d in jdict["data"]["attributes"]]
['download,10.1.1.16', 'download,10.2.1.18']
Which you can join for your desired output:
>>> li=["{},{}".format(d["alert_type"],d["user"]) for d in jdict["data"]["attributes"]]
>>> print '\n'.join(li)
download,10.1.1.16
download,10.2.1.18
Since {"data":{"attributes": is a list, you can loop over it and print the values for desired keys (d is the user dict):
for item in d['data']['attributes']:
    print(item['alert_type'], ',', item['user'], sep='')
You could make it somewhat data-driven like this:
import json
DESIRED_KEYS = 'alert_type', 'user'
resp = '''{ "data": {
"attributes": [
{
"alert_type": "download",
"severity_level": "med",
"user": "10.1.1.16"
},
{
"alert_type": "download",
"severity_level": "low",
"user": "10.2.1.18"
}
]
}
}
'''
user_dict = json.loads(resp)
for attribute in user_dict['data']['attributes']:
    print(','.join(attribute[key] for key in DESIRED_KEYS))
To handle attributes that don't have all the keys, you could instead use this as the last line which will assign missing values a default value (such as a blank string as shown) instead of it causing an exception.
print(','.join(attribute.get(key, '') for key in DESIRED_KEYS))
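Joining with ',' works for these values, but it would not quote a value that itself contains a comma. The standard library's csv module handles that, and csv.DictWriter can pick out the wanted keys. The sketch below assumes Python 3 and writes to an in-memory io.StringIO buffer as a stand-in for a real file; the header row is an addition that the question did not ask for.

```python
import csv
import io
import json

DESIRED_KEYS = ("alert_type", "user")
resp = '{"data":{"attributes":[{"alert_type":"download","severity_level":"med","user":"10.1.1.16"},{"alert_type":"download","severity_level":"low","user":"10.2.1.18"}]}}'

buffer = io.StringIO()  # stand-in for a real file opened with newline=""
writer = csv.DictWriter(
    buffer,
    fieldnames=DESIRED_KEYS,
    extrasaction="ignore",  # silently skip keys such as severity_level
    restval="",             # blank string for records missing a wanted key
)
writer.writeheader()
writer.writerows(json.loads(resp)["data"]["attributes"])

csv_text = buffer.getvalue()
print(csv_text)
```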
Using jq, a one-line solution is straightforward:
$ jq -r '.data.attributes[] | [.alert_type, .user] | @csv' input.json
"download","10.1.1.16"
"download","10.2.1.18"
If you don't want the strings to be quoted, use join(",") instead of @csv.
I want to append the data to the JSON below:
meta = [{
"output_metadata": {
"api_URL": apiURL,
"query_execution_time": queryExecTime,
"api_execution_time": apiExecTime,
}
}]
jsondata = json.dumps([dict(ix) for ix in Data], default=str)
json data:
{"data": [{"id": "1234", "name": "jhon", "dept": "APA"}]}
meta.append(jsondata)
expected result:
{"output_metadata": {"api_url": "xxxxx", "query_execution_time":"xxxxx", "api_execution_time":"xxxxx"}},{"data": "[{"id": "1234", "name": "jhon", "dept": "APA"}]}
output:
{"output_metadata": {"api_url": "XXXXXX", "query_execution_time": "XXXXXX", "api_execution_time":"XXXXXX" }},{"data": "[{"\id": "1234\", "\name": "\jhon", "\dept": "\APA"}]}
How to remove \ from the final output?
If what you wrote above is Python and you want to build the JSON as a string by hand, you have to put an escape character before every " and a backslash at the end of every line that continues. For example, you would write:
meta = ["{\
\"output_metadata\": {\
\"api_URL\": apiURL,\
\"query_execution_time\": queryExecTime,\
\"api_execution_time\": apiExecTime, \
}\
}"]
data = ["{\"data\": [{\"id\": \"1234\", \"name\": \"jhon\", \"dept\": \"APA\"}]}"]
meta.append(data)
Here the JSON documents are handled as strings and then appended to one list. Is this what you want?
EDIT: if you run something like
data = [{"id": 1234, "name": "jhon", "dept": "APA" }]
jdata= json.dumps([dict(ix) for ix in data], default=str)
apiURL = 'url'
queryExecTime = 1
apiExecTime = 1
meta = [{ "output_metadata": { "api_url": apiURL,
"query_execution_time": queryExecTime,
"api_execution_time": apiExecTime, } }]
jdata = { "data": jdata }
meta.append(jdata)
res = json.dumps(meta)
print(res)
the result will be:
'[{"output_metadata": {"api_url": "url", "query_execution_time": 1, "api_execution_time": 1}}, {"data": "[{\\"id\\": 1234, \\"name\\": \\"jhon\\", \\"dept\\": \\"APA\\"}]"}]'
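The \ characters are escape characters for the ". They appear because jdata was already a JSON string when it was passed to json.dumps a second time, so the quotes inside it had to be escaped; what you see is the literal string.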
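One way to get output without the backslashes is to keep everything as plain Python objects and call json.dumps only once, at the very end. A minimal sketch, assuming the query result is available as a list of dicts (the rows variable and the metadata values here are stand-ins):

```python
import json

rows = [{"id": "1234", "name": "jhon", "dept": "APA"}]  # stand-in for the query result

result = [
    {"output_metadata": {"api_url": "url", "query_execution_time": 1, "api_execution_time": 1}},
    {"data": rows},  # nest the Python list itself, not json.dumps(rows)
]

res = json.dumps(result, default=str)  # encode once, at the very end
print(res)
```

Because rows is nested as a list rather than as a pre-encoded string, there are no inner quotes for json.dumps to escape.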