Code used for extraction from JSON
import json
string = json.loads(data)
string['Body']
import base64
base64.b64decode(string['Body'])
bytes_data = base64.b64decode(string['Body'])
str(bytes_data, encoding='utf-8')
I have the following format, extracted from JSON:
"[{"id":"XXXX_U2_170216:XXXX_U2_170216:FBE_23015.Air","values":[{"v":"46","q":192,"t":"2021-10-28T13:47:59.7880096Z"}]},
{"id":"XXXX_U2_170216:XXXX_U2_170216:FBE_23015.Atomise","values":[{"v":"3.1","q":192,"t":"2021-10-28T13:47:59.7880096Z"}]}]"
Any idea how to convert it to an actual list?
[{"id":"XXXX_U2_170216:XXXX_U2_170216:FBE_23015.Air","values":[{"v":"46","q":192,"t":"2021-10-28T13:47:59.7880096Z"}]},
{"id":"XXXX_U2_170216:XXXX_U2_170216:FBE_23015.Atomise","values":[{"v":"3.1","q":192,"t":"2021-10-28T13:47:59.7880096Z"}]}]
Things I have tried:
list(bytearray(bytes_data))
A for loop over the output string, but that is a convoluted way to do it.
Some other conversions. I am looking for something compact.
Reverse engineering your question....
Given a JSON file with base64 data
$ cat /tmp/data.json
{
"Body": "W3siaWQiOiJYWFhYX1UyXzE3MDIxNjpYWFhYX1UyXzE3MDIxNjpGQkVfMjMwMTUuQWlyIiwidmFsdWVzIjpbeyJ2IjoiNDYiLCJxIjoxOTIsInQiOiIyMDIxLTEwLTI4VDEzOjQ3OjU5Ljc4ODAwOTZaIn1dfSwKeyJpZCI6IlhYWFhfVTJfMTcwMjE2OlhYWFhfVTJfMTcwMjE2OkZCRV8yMzAxNS5BdG9taXNlIiwidmFsdWVzIjpbeyJ2IjoiMy4xIiwicSI6MTkyLCJ0IjoiMjAyMS0xMC0yOFQxMzo0Nzo1OS43ODgwMDk2WiJ9XX1dCg=="
}
When read and extracted
import json
import base64
with open('/tmp/data.json') as f:
    string = json.load(f)
body = string['Body']
Then decoded... a list is returned
import pprint
l = json.loads(base64.b64decode(body))
pprint.pprint(l)
[{'id': 'XXXX_U2_170216:XXXX_U2_170216:FBE_23015.Air',
'values': [{'v': '46', 'q': 192, 't': '2021-10-28T13:47:59.7880096Z'}]},
{'id': 'XXXX_U2_170216:XXXX_U2_170216:FBE_23015.Atomise',
'values': [{'v': '3.1', 'q': 192, 't': '2021-10-28T13:47:59.7880096Z'}]}]
Use the built-in json module:
import json
data = json.loads(bytes_data)
It seems like you have a json inside a json, so load it twice:
import json
import base64
string = json.loads(data)
bytes_data = base64.b64decode(string['Body'])
output = json.loads(bytes_data)
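As a minimal, self-contained sketch of the two-step decode (the sample payload here is made up for illustration and is not the asker's real data):

```python
import base64
import json

# Hypothetical payload: a JSON list, base64-encoded, wrapped in an outer JSON object.
inner = [{"id": "demo.Air", "values": [{"v": "46", "q": 192}]}]
encoded = base64.b64encode(json.dumps(inner).encode("utf-8")).decode("ascii")
data = json.dumps({"Body": encoded})

outer = json.loads(data)                      # first load: outer JSON text -> dict
bytes_data = base64.b64decode(outer["Body"])  # base64 text -> raw bytes
output = json.loads(bytes_data)               # second load: bytes -> list of dicts

print(type(output).__name__, output[0]["id"])
```

json.loads accepts bytes directly on Python 3.6 and later, so no explicit str(bytes_data, encoding='utf-8') step should be needed.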
Use the json.loads method like this. Suppose you have a JSON array and want to convert it to a list; then do the following:
import json
array = '{"Items": ["IPhone", "Earphone", "Powerbackup"]}'
data = json.loads(array)
print (data['Items'])
Related
Given a single-lined string of multiple, arbitrary nested json-files without separators, like for example:
contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'
How can contents be parsed into a list of dicts? I tried:
df = pd.read_json(contents, lines=True)
But only got a ValueError response:
ValueError: Unexpected character found when decoding array value (2)
You can split the string, then parse each JSON string into a dictionary:
import json
contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'
json_strings = contents.replace('}{', '}|{').split('|')
json_dicts = [json.loads(string) for string in json_strings]
Output:
[{'payload': {'device': {'serial': 213}}}, {'payload': {'device': {'serial': 123}}}]
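Splitting on '}{' can misfire if that sequence ever appears inside a string value. A possible alternative, sketched here with a hypothetical helper name parse_concatenated, is to let json.JSONDecoder.raw_decode consume one document at a time:

```python
import json

def parse_concatenated(text):
    """Parse back-to-back JSON documents in one string into a list."""
    decoder = json.JSONDecoder()
    results = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it manually.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        results.append(obj)
    return results

contents = r'{"payload":{"device":{"serial":213}}}{"payload":{"device":{"serial":123}}}'
json_dicts = parse_concatenated(contents)
print(json_dicts)
```

This variant does not depend on what characters appear inside the values, at the cost of a few more lines.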
I'm new to Python. Please help me pass a JSON value as a parameter instead of loading it from a file. See the code below for reference.
import json
filename = input("Enter your train data filename : ")
print(filename)
with open(filename) as train_data:
    train = json.load(train_data)
TRAIN_DATA = []
for data in train:
    ents = [tuple(entity) for entity in data['entities']]
    TRAIN_DATA.append((data['content'],{'entities':ents}))
with open('{}'.format(filename.replace('json','txt')),'w') as write:
    write.write(str(TRAIN_DATA))
In the code above, the JSON is loaded from a file. Instead of a file, I want to pass the JSON value directly and load it.
Ex:
train_data=[{"content":"what is the price of polo?","entities":[[21,25,"PrdName"]]}]
with open(filename) as train_data:
    train = json.load(train_data)
Thanks,
"json value" doesn't mean anything. Json is a text format, not a data type, and what json.loads() do is to transform the json text to python objects - dicts, lists etc - according to the json syntax and what exact type makes sense in Python (json object -> dict, json array -> list etc). You can check this by yourself in your Python shell:
>>> import json
>>> jsonstr = '{"foo":"bar", "baaz":[1, 2, 3]}'
>>> json_data = json.loads(jsonstr)
>>> json_data
{'foo': 'bar', 'baaz': [1, 2, 3]}
>>> type(json_data)
<class 'dict'>
IOW, if you already have the correct Python dict, you have nothing else to do.
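One way to apply this to the question's code is sketched below. The function name build_train_data is hypothetical; it accepts either JSON text or an already-parsed list, so the file is no longer required:

```python
import json

def build_train_data(train):
    """Convert records into (content, {'entities': [...]}) tuples.

    `train` may be JSON text or an already-parsed list of dicts.
    """
    if isinstance(train, str):
        train = json.loads(train)  # JSON text -> Python list
    return [
        (item["content"], {"entities": [tuple(e) for e in item["entities"]]})
        for item in train
    ]

# JSON passed as a string instead of being read from a file.
json_text = '[{"content":"what is the price of polo?","entities":[[21,25,"PrdName"]]}]'
TRAIN_DATA = build_train_data(json_text)
print(TRAIN_DATA)
```

If the value is already a Python list, as in the question's example, it can be passed straight in and json.loads is skipped.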
I want to extract data from file into a dictionary via json.loads. Example:
{725: 'pitcher, ewer',
726: "plane, carpenter's plane, woodworking plane"}
json.loads can't handle numeric keys.
Some values use double quotes and others use single quotes.
Any suggestions?
Code
import re
import requests
url = url
r = requests.get(url)
response = r.text.replace('\n','')
response = re.sub(r':(\d+):*', r'"\1"', response)
The file you supplied seems like a valid Python dict, so I suggest an alternative approach, with literal_eval.
from ast import literal_eval
data = literal_eval(r.text)
print(data[726])
Output: plane, carpenter's plane, woodworking plane
If you still like json, then you can try replacing the numbers with strings using regex.
import json
import re
s = re.sub(r"(?m)^(\W*)(\d+)\b", r'\1"\2"', r.text)
data = json.loads(s)
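Quoting the keys alone may not be enough for the sample shown, because json.loads also rejects single-quoted strings such as 'pitcher, ewer'. A minimal sketch that avoids the regex, assuming the text is a trustworthy Python dict literal (the sample text below is copied from the question, not fetched from a URL): parse with literal_eval, then emit real JSON with json.dumps, which turns the integer keys into strings.

```python
import json
from ast import literal_eval

# Sample text in the same shape as the question's file.
text = """{725: 'pitcher, ewer',
 726: "plane, carpenter's plane, woodworking plane"}"""

data = literal_eval(text)      # safe evaluation of Python literals only
print(data[726])

as_json = json.dumps(data)     # int keys become JSON string keys
round_trip = json.loads(as_json)
print(round_trip["725"])
```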
How do I go about extracting more than one JSON key at a time given this script - the script cycles through a list of message ids and extracts the JSON response. I only want to extract certain keys from the response.
import urllib3
import json
import csv
from progressbar import ProgressBar
import time
pbar = ProgressBar()
base_url = 'https://api.pipedrive.com/v1/mailbox/mailMessages/'
fields = {"include_body": "1", "api_token": "token"}
json_arr = []
http = urllib3.PoolManager()
with open('ten.csv', newline='') as csvfile:
    for x in pbar(csv.reader(csvfile, delimiter=' ', quotechar='|')):
        r = http.request('GET', base_url + "".join(x), fields=fields)
        mails = json.loads(r.data.decode('utf-8'))
        json_arr.append(mails['data']['from'][0]['id'])
print(json_arr)
This works as intended. But I want to do the following.
json_arr.append(mails(['data']['from'][0]['id'],['data']['to'][0]['id'])
Which results in TypeError: list indices must be integers or slices, not str
Did you mean:
json_arr.append(mails['data']['from'][0]['id'])
json_arr.append(mails['data']['to'][0]['id'])
The answer already posted looks good but I'll share the one-liner equivalent, using extend() instead of append():
json_arr.extend([mails['data']['from'][0]['id'], mails['data']['to'][0]['id']])
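If the sender and recipient ids should stay paired per message rather than flattened into one list, a tuple per message is another option. This is only a sketch; the mails dict below is a made-up stand-in for the API response, using the same keys as the question:

```python
# Hypothetical response shape, mirroring the keys used in the question.
mails = {"data": {"from": [{"id": 11}], "to": [{"id": 22}]}}

json_arr = []
msg = mails["data"]
# One (from_id, to_id) tuple per message keeps the two values together.
json_arr.append((msg["from"][0]["id"], msg["to"][0]["id"]))
print(json_arr)
```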
I am trying to read a file from a compressed archive and convert the data into JSON or a dictionary, but there is a Unicode issue that I have been struggling with for a while. Can anyone help?
exfile_obj = tar.extractfile(member)
data = exfile_obj.read()
print(type(data)) ## shows str
print(data) ## it is something like: "{u'building': False, u'displayName': u'Tam\\xe1s Kosztol\\xe1nczi', u'changeSet': {u'items': u'comment'}}"
json_obj = json.loads(data) # it is a unicode object.
That data is a string representation of a Python dictionary. You can convert it to a dictionary using ast.literal_eval, and you can convert that dict to a JSON string using json.dumps.
import ast
import json
src = "{u'building': False, u'displayName': u'Tam\\xe1s Kosztol\\xe1nczi', u'changeSet': {u'items': u'comment'}}"
data = ast.literal_eval(src)
print(data)
j = json.dumps(data)
print(j)
Output:
{'building': False, 'displayName': 'Tamás Kosztolánczi', 'changeSet': {'items': 'comment'}}
{"building": false, "displayName": "Tam\u00e1s Kosztol\u00e1nczi", "changeSet": {"items": "comment"}}
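If the \u00e1 escapes in the JSON output are unwanted, json.dumps takes an ensure_ascii flag. A short sketch reusing the same src string:

```python
import ast
import json

src = "{u'building': False, u'displayName': u'Tam\\xe1s Kosztol\\xe1nczi', u'changeSet': {u'items': u'comment'}}"
data = ast.literal_eval(src)

# ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX escapes.
j = json.dumps(data, ensure_ascii=False)
print(j)
```

Both forms are valid JSON and load back to the same dict; the flag only changes how the text looks.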