I want to parse a JSON file in Python. I don't know the contents of the file; I downloaded it from a website in JSON format.
As far as I know, this is the code needed to parse a JSON file:
import json
sourcefile=open("News_Category_Dataset_v2.json","r")
json_data=json.load(sourcefile)
print (json_data)
But I got the error shown below. jsonparse.py is my script, which is saved on my computer in D:/algorithms.
D:\python\envs\algorithms\python.exe D:/algorithms/jsonparse.py
Traceback (most recent call last):
File "D:/algorithms/jsonparse.py", line 4, in <module>
json_data=json.load(sourcefile)
File "D:\python\envs\algorithms\lib\json\__init__.py", line 299, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "D:\python\envs\algorithms\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "D:\python\envs\algorithms\lib\json\decoder.py", line 342, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 366)
Process finished with exit code 1
How could I fix the problem?
Your file is not a single JSON document; instead, each of its lines is a separate JSON document.
This snippet should help you:
import json
json_list = []
for i in open('test.json'):
    json_line = json.loads(i)  # parse one line as its own JSON document
    json_list.append(json_line)
print(json_list)
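A slightly more defensive variant of the same idea is sketched below. It assumes the file holds one complete JSON document per line; the helper name read_json_lines, the blank-line handling, and the error message are illustrative additions, not part of the snippet above.

```python
import json


def read_json_lines(path, encoding="utf-8"):
    """Parse a file holding one JSON document per line (hypothetical helper)."""
    records = []
    with open(path, encoding=encoding) as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines instead of failing on them
            try:
                records.append(json.loads(line))
            except ValueError as exc:
                # json.JSONDecodeError is a subclass of ValueError
                raise ValueError(
                    "line %d of %s is not valid JSON: %s" % (line_number, path, exc)
                )
    return records
```

Calling read_json_lines("News_Category_Dataset_v2.json") would then return a list with one parsed object per non-empty line, provided the file really has that layout.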
I'd like to download a YAML file from GitLab using Python requests. I'm almost there, but I haven't quite got to the cigar stage.
I am doing the following:
GITLAB_FILE ="https://my_url/api/v4/projects/my_id/repository/files/path_and_filename.yaml/raw?ref=master&private_token=mytoken"
g=requests.request("GET", GITLAB_FILE, verify=False)
print(g.json())
This works inasmuch as I can reach the file, but when it comes to accessing the data, print(g.json()) throws an error and then continues to print out the file contents as I'd hoped. The error is:
Traceback (most recent call last):
File "/home/myproj/edm/lib/python3.7/site-packages/requests/models.py", line 910, in json
return complexjson.loads(self.text, **kwargs)
File "/usr/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./http_test.py", line 73, in <module>
print (g.json())
File "/home/myproj/edm/lib/python3.7/site-packages/requests/models.py", line 917, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: [Errno Expecting value]
Once this is printed, it then proceeds to print out the contents of my yaml file correctly.
I suspect it's something to do with
print(g.json())
expecting a json format, and encountering a yaml file?
Any pointers as to how I can get the file contents error free would be helpful.
I've answered my own question.
Quite simply instead of
print(g.json())
I used
print(g.text)
Which did exactly what it said on the tin.
I am trying to remove the first key and its value from each record of a JSON file using Python. Running the following program raises an error:
import json
with open('testing') as json_data:
    data = json.load(json_data)
    for element in data:
        del element['url']
Error:
Traceback (most recent call last):
File "p.py", line 3, in <module>
data = json.load(json_data)
File "/usr/lib/python3.5/json/__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.5/json/decoder.py", line 342, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 180)
The file input is something like this:
{"url":"example.com","original_url":"http://example.com","text":"blah...blah"...}
{"url":"example1.com","original_url":"http://example1.com","text":"blah...blah"...}
.
.
.
.
{"url":"exampleN.com","original_url":"http://exampleN.com","text":"blah...blah"...}
I don't know why this problem is occurring.
You have to read the file line by line, since it contains lines of JSON data rather than a single valid JSON structure.
Here's my line-by-line proposal:
import json
data = []
with open('testing') as f:
    for json_data in f:
        element = json.loads(json_data)  # load from current line as string
        del element['url']
        data.append(element)
Valid JSON in that case would be:
[{"url":"example.com","original_url":"http://example.com","text":"blah...blah"...},
{"url":"example1.com","original_url":"http://example1.com","text":"blah...blah"...}]
As per my comment, the input file is not valid JSON.
This answer ("multiple json dictionaries python") tells you how to successfully read such a file, which consists of a concatenation of valid JSON entities rather than a JSON list of such entities.
The alternative, if and only if you can rely on the line structure of the file, is to read line by line and decode each line separately.
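For the concatenated case, one possible approach (not necessarily the one in the linked answer) is to decode one value at a time with json.JSONDecoder.raw_decode, which returns the parsed value together with the index where it stopped. The function name iter_concatenated_json and the sample string are made up for illustration.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns (value, index just past the value)
        value, pos = decoder.raw_decode(text, pos)
        yield value


# hypothetical sample: three objects separated by a newline or a space
sample = '{"url": "example.com"}\n{"url": "example1.com"} {"url": "exampleN.com"}'
for element in iter_concatenated_json(sample):
    print(element["url"])
```

Unlike the line-by-line approach, this does not depend on each document sitting on its own line, at the cost of reading the whole file into memory first.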
json_data is a file object, not the file's content, so first call read() on it to get the data and pass the resulting string to json.loads (json.load expects a file object, not a string). Second, write the full file name if you are reading a JSON file; your file should be testing.json. Third, specify the file-opening mode. You can use this code:
import json
with open('testing.json', 'r') as json_data:
    data = json.loads(json_data.read())
    for element in data:
        del element['url']
I am trying to manipulate the Chrome Bookmarks file in Python, but have fallen at the first hurdle. I have this code:
import json
import os
input_filename = os.getenv("APPDATA") + "\..\Local\Google\Chrome\User Data\Default\history"
with open(input_filename) as data_file:
    bookmark_data = json.load(data_file)
When I run this code I get the following error:
Traceback (most recent call last):
File "C:/Users/David/PycharmProjects/MyBookmarks/myBookmarks.py", line 17, in <module>
bookmark_data = json.load(data_file)
File "C:\Python27\lib\json\__init__.py", line 290, in load
**kw)
File "C:\Python27\lib\json\__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Python27\lib\json\decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
Process finished with exit code 1
I am not that familiar with JSON, but given this is the chrome bookmarks file, I doubt it is a problem with the structure of the file, and I am stumped as to what to try next! Any ideas?
Thanks in advance.
Bookmarks is the name of the JSON file you want to open. History is a database file that contains information on URL visits and downloaded files.
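If it is unclear which kind of file a path points to, a quick check of the leading bytes can help before handing it to json.load. The sketch below assumes the History file is an SQLite 3 database, whose files start with a fixed 16-byte header; the function name looks_like_sqlite is hypothetical.

```python
# Every SQLite 3 database file starts with these 16 bytes.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as f:
        return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
```

A file for which this returns True cannot be parsed as JSON and would need the sqlite3 module instead.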
Specifying the encoding worked for me:
with open(input_filename, "r", encoding='utf-8') as data_file:
    bookmark_data = json.load(data_file)
Inspiration: https://pypi.org/project/chrome-bookmarks/
I'm getting the following error when attempting to open a json file.
Traceback (most recent call last):
File "C:\Python34\test.py", line 5, in <module>
data = json.load(data_file)
File "C:\Python34\lib\json\__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "C:\Python34\lib\json\__init__.py", line 318, in loads
return _default_decoder.decode(s)
File "C:\Python34\lib\json\decoder.py", line 346, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 8300 column 1 (char 157 - 30292811)
This is what I'm doing to open the file in IDLE:
import json
with open('three_minutes_tweets.json','r', encoding="utf-8") as data_file:
    data = json.load(data_file)
    print(data_file)
The file is a tweet sample file and looks like simple dictionaries of dictionaries. Thank you.
The error message is telling you exactly what the problem is. There is extra data starting at character 157. In other words, you have invalid JSON data. There is nothing wrong with your code.
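To see what the reported position refers to, the exception itself can be inspected. This is a minimal sketch assuming Python 3.5 or later, where json.loads raises json.JSONDecodeError (a ValueError subclass) with msg, lineno, colno, and pos attributes; on Python 3.4, as in the question, only a plain ValueError with the message is available. The function name and the two-line sample are invented for illustration.

```python
import json


def describe_decode_error(text):
    """Return (msg, lineno, colno, pos) for invalid JSON, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.msg, exc.lineno, exc.colno, exc.pos
    return None


# hypothetical sample: two complete documents back to back
text = '{"id": 1}\n{"id": 2}\n'
info = describe_decode_error(text)
print(info)

# Everything before the reported position is a complete document by itself.
msg, lineno, colno, pos = info
print(json.loads(text[:pos]))
```

An "Extra data" error at "line 2 column 1" therefore usually means the first line parsed fine and a second document starts on the next line.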
I am trying to parse some text files containing JSON objects in Python using the json.load() method. It's working for one set of them, but for this one it will not:
{
"mapinfolist":{
"mapinfo":[
{"sku":"00028-0059","price":"38.35","percent":"50","basepercent":"50","exact":0,"match":0,"roundup":0}
,{"sku":"77826-7230","price":"4.18","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-1310","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-2020","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-3360","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-4060","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-4510","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-7230","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
],
"count":2
}
}
It is in a file called 'map.txt' - I open it using open('map.txt') and then call json.load(). When I run my test program (test.py), the following error trace is generated:
Traceback (most recent call last):
File "test.py", line 28, in <module>
main()
File "test.py", line 23, in main
map_list = json.load(f1)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/__init__.py", line 318, in loads
return _default_decoder.decode(s)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/decoder.py", line 343, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/decoder.py", line 361, in raw_decode
raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)
The JSON object is valid - when I put it into https://www.jsoneditoronline.org/ it is parsed and displayed correctly, so I am having trouble identifying what could be stopping it from working when I try to do it in Python. Any advice would be much appreciated. Thanks!
EDIT: Here's my code.
import json
def main():
    with open('map.txt') as f1:
        map_list = json.load(f1)
Trying map_list = json.loads(f1.read()) also does not work and gives me an almost identical error trace.
EDIT - RESOLVED:
I just copied and pasted FROM map.txt into a new TextEdit file map2.txt and used the new file instead, and it works now. I copied directly from the old file and made no changes - the only difference is that it is a different file. I can't make heads or tails of why that would be - any ideas? I would like to understand what may have happened so I can avoid the problem in the future.
Does the following solution work for you?
import json
f = open("map.txt")
map = json.loads(f.read())
Python Docs
Maybe try reading the whole file into a string and then using json.loads:
def yourfunc():
    with open('map.txt') as file:
        json_string = file.read()
    map = json.loads(json_string)
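The thread does not establish why the re-saved copy of map.txt worked. One way to investigate a file that fails at "line 1 column 1 (char 0)" is to look at its first raw bytes, since invisible leading content such as a byte order mark does not show up in an editor. The sketch below is only a diagnostic aid under that assumption; both helper names are hypothetical, and the utf-8-sig codec is used because it strips a leading UTF-8 BOM if one is present.

```python
import json


def peek_bytes(path, count=32):
    """Return the first `count` raw bytes of a file for inspection."""
    with open(path, "rb") as f:
        return f.read(count)


def load_json_utf8_sig(path):
    """Load JSON from a UTF-8 file, ignoring a leading BOM if there is one."""
    with open(path, encoding="utf-8-sig") as f:
        return json.load(f)
```

Printing repr(peek_bytes("map.txt")) should show an opening brace first for the file in the question; anything else before it, or an empty result, would point to the file rather than the code.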