Unable to read MongoDB JSON file from Python

I have a Mongo collection called test.json, and I am unable to read test.json from Python. If I run the code below, it gives this error:
ValueError: No JSON object could be decoded
from bson import ObjectId
import json
from pprint import pprint

with open('E:/Work/Paths/Production/test.json') as data_file:
    data = json.load(data_file)
pprint(data)
test.json
{
"_id" : ObjectId("582c2011fe5dc80c8f2f8077"),
"menuNumber" : NumberInt(14603),
"imageurl" : "menu/test.png",
"imageurl_thumb" : "master/14603_thumb.png"
}
{
"_id" : ObjectId("582c2018fe5dc80c8f2f8078"),
"menuNumber" : NumberInt(14614),
"imageurl" : "menu/test1.png",
"imageurl_thumb" : "master/14614_thumb.png"
}

Actually, the test.json file you have posted is not valid JSON. It is a sequence of separate objects, each starting with '{' and ending with '}', rather than a single JSON document, and values such as ObjectId(...) and NumberInt(...) are mongo shell syntax, not JSON. You should read it as a normal text file and then convert each object before loading it as JSON.
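A minimal sketch of one such technique, assuming the file only uses the two shell constructs shown in the question (ObjectId("...") and NumberInt(...)). The helper name parse_shell_dump is hypothetical. It rewrites those constructs into plain JSON with regular expressions and then decodes the objects one after another with json.JSONDecoder.raw_decode. A regex rewrite like this would also alter string values that happen to contain the same text, so treat it as a starting point rather than a general parser.

```python
import json
import re


def parse_shell_dump(text):
    """Parse concatenated mongo-shell-style documents into a list of dicts."""
    # ObjectId("abc") becomes the plain string "abc"; NumberInt(5) becomes 5.
    text = re.sub(r'ObjectId\(\s*("[0-9a-fA-F]+")\s*\)', r'\1', text)
    text = re.sub(r'NumberInt\(\s*(-?\d+)\s*\)', r'\1', text)

    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        doc, pos = decoder.raw_decode(text, pos)
        docs.append(doc)
    return docs
```

It could be used as docs = parse_shell_dump(open(path).read()), where path points at the dump file.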

Related

How to access different JSON keys?

I have a JSON file as the following, and I'm trying to access those different keys with Python.
My JSON file format:
{
"spider":[
{
"t":"Spider-Man: No Way Home (2021)",
"u":"movie\/spider-man-no-way-home-2021",
"i":"c2NJbHBJYWNtbW1ibW12Tmptb1JjdndhY05FbXZhS1A"
},
{
"t":"Spider-Man: Far from Home (2019)",
"u":"movie\/spider-man-far-from-home-2019",
"i":"c2NJbHBJYWNtTGNtdm1qbXZtYm1FRWNtcEV4bWJ4bWJteGo"
},
{
"t":"Spider-Man: Homecoming (2017)",
"u":"movie\/spider-man-homecoming-2017",
"i":"c2NJbHBJYWN2TllqbVRibXVjbWJ2d3h2dGNtam1idmM"
},
{
"t":"Spider-Man: Into the Spider-Verse (2018)",
"u":"movie\/spider-man-into-the-spider-verse-2018",
"i":"c2NJbHBJYWNtVEVtdnZjbXZtdm1qRWNtYnhtR1VURXZjY3c"
},
{
"t":"Spider-Man (2002)",
"u":"movie\/spider-man-2002",
"i":"c2NJbHBJYWNtam1ZanZjbWptakVjbXZtdm1oenh2Y3htSQ"
},
{
"t":"The Spiderwick Chronicles (2008)",
"u":"movie\/the-spiderwick-chronicles-2008",
"i":"c2NJbHBJYWNtVG9Oam1qbWJFY21ibWJ2d1BtYm1tbUhj"
}
]
}
How can I access the t, u, and i keys?
I tried:
print(json_file['t'])
but it failed with this error:
Traceback (most recent call last):
  File "/home/werz/Desktop/trying/programming/nutflix/flask-nutflix/test.py", line 38, in <module>
    print (json_file['t'])
KeyError: 't'

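The top-level object only has the key "spider", whose value is a list of films, so index into that list first, for example:
print(json_file["spider"][1]["t"])
You can use a for loop to print all of them.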

You can use Python's built-in json module and iterate through the list under the spider key of your JSON object.
import json  # import the built-in json library

with open('file_path') as f:  # open the file
    text = f.read()  # read the contents of the file
json_data = json.loads(text)  # parse the text into a Python dict
t = []  # list of the t values
u = []  # list of the u values
i = []  # list of the i values
for film in json_data['spider']:  # iterate through the films
    t.append(film['t'])  # store the data for each film
    u.append(film['u'])
    i.append(film['i'])

You can use the json module to load and read JSON files. Here is an example that gets the 't' values; write the same for 'u' and 'i'.
import json

# Open the JSON file; the with block closes it automatically
with open('myJson.json') as f:
    # json.load returns the JSON object as a dictionary
    data = json.load(f)
# Iterate through the list under 'spider'
for item in data['spider']:
    print(item['t'])
Hope this will help. :)

TypeError: a bytes-like object is required, not '_io.BufferedReader' : While passing file in the request parameter

I want to pass xlsx file as one of the request parameters (file) as below.
fields = {
    "file": ('1.xlsx', open("file.xlsx", "rb"), 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'),
    "payload": ""
}
But when I pass the file like this, I get the following exception in Python:
TypeError: a bytes-like object is required, not '_io.BufferedReader'
Can anyone help with this?

open() only opens the file for reading; you need to actually read the file's bytes. I cannot tell fully from the limited context, but if you don't need base64, just drop that part. The generic MIME type for binary data is "application/octet-stream".
Try this:
import base64

with open("file.xlsx", "rb") as xl_file:
    fields = {
        # base64.encodestring was removed in Python 3.9; encodebytes replaces it
        "file": ('1.xlsx', base64.encodebytes(xl_file.read()), 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'),
        "payload": ""
    }
    # do something with fields
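If base64 is not needed, a sketch of the same idea with the raw bytes could look like the following. The helper build_fields and the constant XLSX_MIME are hypothetical names, and the sketch assumes that whatever consumes fields accepts a (filename, bytes, content type) tuple, which the question does not show.

```python
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def build_fields(path, upload_name):
    """Return the request fields with the file content as bytes."""
    # Read the whole file up front so the tuple holds bytes, not a file object.
    with open(path, "rb") as xl_file:
        content = xl_file.read()
    return {
        "file": (upload_name, content, XLSX_MIME),
        "payload": "",
    }
```

For the question's case this would be called as fields = build_fields("file.xlsx", "1.xlsx").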

pyArango bulkImport_json is complaining about improper indices

I'm testing the ability to store PyTest results, generated by the json plugin for that test harness, into ArangoDB. I am attempting to import as follows
import pyArango.connection as adbConn
dbConn = adbConn.Connection(...)
db = dbConn['mydb']
collection = db.collections['PyTestResults']
collection.bulkImport_json('/path/to/results.json')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/site-packages/pyArango/collection.py", line 777, in bulkImport_json
errorMessage = "At least: %d errors. The first one is: '%s'\n\n more in <this_exception>.data" %
(len(data), data[0]["errorMessage"])
TypeError: string indices must be integers
What isn't making sense is that the JSON file is properly formed. In fact, using the regular Python JSON module, it works just fine:
import json
with open('/path/to/results.json') as fd:
    data = json.load(fd)
print(data)
This works. The beginning of the file is
{"report":
{"environment":
{
"Python": "3.6.9", "Platform": "Linux-4.4.0-17763-Microsoft-x86_64-with-Ubuntu-18.04-bionic"
},
It seems that the pyArango library wants the keys to be integers. I tried changing "report" to 0, but this invalidated the JSON structure.
How is one to use the pyArango library to import JSON? The overall structure of this JSON file doesn't look much different than any of the examples in this page. Any pointers are greatly appreciated.

TypeError when trying to get data from JSON

I would like to print specific data in a JSON but I get the following error:
Traceback (most recent call last):
File "script.py", line 47, in <module>
print(link['data.file.url.short'])
TypeError: 'int' object has no attribute '__getitem__'
Here is the JSON:
{
"status":true,
"data":{
"file":{
"url":{
"full":"https://anonfile.com/y000H35fn3/yuh_txt",
"short":"https://anonfile.com/y000H35fn3"
},
"metadata":{
"id":"y000H35fn3",
"name":"yuh.txt",
"size":{
"bytes":0,
"readable":"0 Bytes"
}
}
}
}
}
I'm trying to get data.file.url.short which is the short value of the url
Here is the script in question:
post = os.system('curl -F "file=@' + save_file + '" https://anonfile.com/api/upload')
link = json.loads(str(post))
print(link['data.file.url.short'])
Thanks

Apart from the os.system() return value mentioned by @John Gordon, I think the correct syntax to access data.file.url.short is link['data']['file']['url']['short'], since json.loads returns a dict and each level has to be indexed separately.
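To keep the dotted notation from the question, a small helper can walk the nested dictionaries one key at a time. This is only a sketch: get_path is a hypothetical name, not part of the json module, and the response text is a shortened copy of the JSON shown in the question.

```python
import json
from functools import reduce


def get_path(obj, dotted):
    """Follow a dotted key path such as 'data.file.url.short' through nested dicts."""
    return reduce(lambda current, key: current[key], dotted.split("."), obj)


# Shortened copy of the response from the question.
response_text = (
    '{"status": true, "data": {"file": {"url": '
    '{"short": "https://anonfile.com/y000H35fn3"}}}}'
)
link = json.loads(response_text)
print(get_path(link, "data.file.url.short"))
```

A missing key still raises KeyError, just as the chained link['data']['file'] form would.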

os.system() does not return the output of the command; it returns the exit status of the command, which is an integer.
If you want to capture the command's output, see this question.
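A minimal sketch of capturing a command's output with the standard subprocess module (Python 3.7 or later for capture_output). To stay runnable offline, it runs a stand-in command, the current Python interpreter printing a small JSON document, instead of the curl call from the question. run_and_parse is a hypothetical helper name.

```python
import json
import subprocess
import sys


def run_and_parse(command):
    """Run a command, capture its standard output, and parse it as JSON."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Stand-in for the real curl call: a child process that prints JSON.
demo_command = [sys.executable, "-c", "print('{\"data\": {\"ok\": true}}')"]
link = run_and_parse(demo_command)
print(link["data"]["ok"])
```

For the upload in the question, the command list would presumably be the curl invocation split into arguments, for example ["curl", "-F", "file=@" + save_file, "https://anonfile.com/api/upload"].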

You are capturing the return code of the process created by os.system, which is an integer.
Why don't you use the urllib.request module to perform that action within Python?
import urllib.request
import json

# urlretrieve performs a GET and writes the response body to save_file;
# it does not upload a file
urllib.request.urlretrieve('https://anonfile.com/api/upload', save_file)
with open(save_file) as f:
    json_dict = json.load(f)  # json.load needs a file object, not a path
print(json_dict['data']['file']['url']['short']) # https://anonfile.com/y000H35fn3
Or, to upload the file the way curl -F does, you can use the requests library:
import requests

with open(save_file, 'rb') as f:
    json_dict = requests.post('https://anonfile.com/api/upload', files={'file': f}).json()
print(json_dict['data']['file']['url']['short']) # https://anonfile.com/y000H35fn3

Extra data: line 2 column 1 - line 341211 -- Error in json.load

I am trying to load a JSON file using Python in PyCharm, but it seems that json.load() doesn't accept my JSON format.
My JSON looks like this:
{"User_id":"304062","First_name":"client1_first_name ","Last_name":"client1_last_name","Email":"client1emailemailemail#gmail.com","City":"vitoria","Country":"country_code","Reservas":"0","Unsubscribe":"0"}
{"User_id":"1372","First_name"client2firstname".","Last_name":"client2lastname","Email":"tralala#blabla.com","City":"nop","Country":"bra","Reservas":"0","Unsubscribe":"0"}
The code I am using is as follows:
import json
from pprint import pprint

with open('path_to_my_json/my_json.json',) as data_file:
    data = json.load(data_file)
print(data)
pprint(data[0])
The error I am receiving is:
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 341211 column 1 (char 163 - 58075195)

This is not valid JSON:
{"User_id":"1372","First_name"client2firstname".","Last_name":"client2lastname","Email":"tralala#blabla.com","City":"nop","Country":"bra","Reservas":"0","Unsubscribe":"0"}
Validate your JSON before loading it:
https://jsonlint.com/

It's been a while since this was asked, but I just ran across the same problem and solved it by parsing the file one line at a time:
import json
tweets = []
with open('tweets.json', 'r') as f:
    for line in f:
        tweets.append(json.loads(line))
I found this solution here: https://izziswift.com/python-json-loads-shows-valueerror-extra-data/ (it's solution 2).
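Building on the line-by-line approach, here is a sketch that also reports which lines fail to parse instead of stopping at the first bad one. load_json_lines is a hypothetical helper, and it assumes one JSON object per line, as in the question's file.

```python
import json


def load_json_lines(lines):
    """Parse an iterable of lines; return (records, errors)."""
    records, errors = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Remember the 1-based line number and the parser's message.
            errors.append((number, str(exc)))
    return records, errors
```

Because an open file iterates line by line, it can be passed directly: with open('tweets.json') as f: records, errors = load_json_lines(f).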
