What is the issue with Python while processing my JSON file? - python

I tried to remove the first key and its value from a JSON file using Python. Running the program produces the error shown below:
import json
with open('testing') as json_data:
    data = json.load(json_data)
for element in data:
    del element['url']
Error:
Traceback (most recent call last):
File "p.py", line 3, in <module>
data = json.load(json_data)
File "/usr/lib/python3.5/json/__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.5/json/decoder.py", line 342, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 180)
The file input is something like this:
{"url":"example.com","original_url":"http://example.com","text":"blah...blah"...}
{"url":"example1.com","original_url":"http://example1.com","text":"blah...blah"...}
.
.
.
.
{"url":"exampleN.com","original_url":"http://exampleN.com","text":"blah...blah"...}
I don't know why this problem is occurring.

You have to read the file line by line, because it contains one JSON document per line rather than a single valid JSON structure.
Here's my line-by-line proposal:
import json
data = []
with open('testing') as f:
    for json_data in f:
        element = json.loads(json_data)  # load from current line as string
        del element['url']
        data.append(element)
In that case, valid JSON would be:
[{"url":"example.com","original_url":"http://example.com","text":"blah...blah"...},
{"url":"example1.com","original_url":"http://example1.com","text":"blah...blah"...}]
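To finish the original task of removing the key and saving the result, a minimal sketch follows. It assumes the file holds one JSON object per line, as in the question; the helper name strip_key and the output path in the usage line are hypothetical.

```python
import json

def strip_key(src_path, dst_path, key):
    """Copy a one-object-per-line JSON file, dropping `key` from every record."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            element = json.loads(line)
            element.pop(key, None)  # no error if the key is absent
            dst.write(json.dumps(element) + "\n")
```

For the file in the question this could be called as strip_key('testing', 'testing_without_url', 'url'). Writing to a second file keeps the original intact in case a line fails to parse partway through.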

As per my comment, the input file is not valid JSON.
The answer to "multiple json dictionaries python" explains how to read such a file, which consists of a concatenation of valid JSON entities rather than a JSON list of such entities.
The alternative, if and only if you can rely on the line structure of the file, is to read it line by line and decode each line separately.
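Reading a concatenation of JSON values without relying on line breaks can be sketched with json.JSONDecoder.raw_decode from the standard library, which returns the decoded value together with the index where it ended. The helper name iter_json_values is hypothetical, and this is not necessarily the approach taken in the linked answer.

```python
import json

def iter_json_values(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # raw_decode does not skip leading whitespace, so skip it here
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        value, idx = decoder.raw_decode(text, idx)
        yield value
```

With the file from the question, list(iter_json_values(open('testing').read())) would give one dictionary per record, whether or not each record sits on its own line.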

json_data is the file object, not its content. json.load() accepts the file object directly; if you call read() on it first, pass the resulting string to json.loads() instead. It also helps to write the full file name (for example testing.json, if that is what the file is called) and to state the file mode explicitly. You can use this code:
import json
with open('testing.json', 'r') as json_data:
    data = json.loads(json_data.read())
for element in data:
    del element['url']

Related

Can only load json file from python terminal and not from script, neither works with ujson package

I originally only used ujson as follows. This code has been working for some time and I'm not sure how I broke it.
import ujson as json
with open('performance_data.json', 'r') as f:
    data = json.load(f)
It just today started throwing a ValueError
ValueError: Expected object or value
I tried loading the .json file using python in terminal with ujson and I got the same error. Then I tried loading it using json package instead of ujson, and it worked fine, in python terminal. So I added in a try except to use json instead of ujson so now my code looks like this
import json
import ujson
with open('performance_data.json', 'r') as f:
    try:
        data = ujson.load(f)
    except ValueError:
        data = json.load(f)
However this is still giving me problems.
json Traceback:
File "live_paper.py", line 141, in main
data = json.load(f)
File "/usr/lib/python3.8/json/__init__.py", line 293, in load
return loads(fp.read(),
File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I would normally assume that this means the file is empty. However I can run the following code from script and see the file content.
with open('performance_data.json', 'r') as f:
    print(f.readline())
I've checked os.getcwd() is correct from script.
So, summarizing: json.load(f) works from the terminal but not when the script is run. In the terminal I can sift through my data and everything looks as it should.
ujson.load() works neither in the terminal nor from the script, and json.load() doesn't work from the script.
The problem is there is no valid json in your file for the module to load. You can verify this by trying to print the contents of the file using f.read() inside of your with statement. I know you have said that you tried this but there is a difference between the file not being empty and having valid json. The function being called will fail if there is not a valid json object found.
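One possible contributor to the json traceback in the question, offered as an assumption rather than a diagnosis: if ujson.load(f) reads the file before raising, the fallback json.load(f) in the except branch starts at end-of-file and sees an empty string, which produces "Expecting value: line 1 column 1 (char 0)". The sketch below reproduces that effect with the standard library only, using io.StringIO as a stand-in for the open file; the content is illustrative.

```python
import io
import json

# Stand-in for an open file; the content is illustrative only.
f = io.StringIO('{"score": 1}')

f.read()  # a first parser consumes the whole stream
try:
    json.load(f)  # nothing is left to read at this point
except json.JSONDecodeError as exc:
    error_message = str(exc)

f.seek(0)  # rewind before retrying
data = json.load(f)
```

Calling f.seek(0) before the fallback, or reading the text once and passing the same string to both parsers, avoids the empty second read. It would not explain why ujson rejects the file in the first place.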

Parsing a json file in python create difficulties

I want to parse a json file in python. I don't know the content of the file. I downloaded this file from a website in json format.
As far as I know, this is the code needed to parse a JSON file:
import json
sourcefile=open("News_Category_Dataset_v2.json","r")
json_data=json.load(sourcefile)
print (json_data)
But I got the error shown below. jsonparse.py is the name of my script, which is saved on my computer in D:/algorithms.
D:\python\envs\algorithms\python.exe D:/algorithms/jsonparse.py
Traceback (most recent call last):
File "D:/algorithms/jsonparse.py", line 4, in <module>
json_data=json.load(sourcefile)
File "D:\python\envs\algorithms\lib\json\__init__.py", line 299, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "D:\python\envs\algorithms\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "D:\python\envs\algorithms\lib\json\decoder.py", line 342, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 366)
Process finished with exit code 1
How could I fix the problem?
Your file as a whole is not JSON, but each of its lines is a valid JSON document.
This snippet should help you:
import json
json_list = []
with open('test.json') as f:
    for i in f:
        json_line = json.loads(i)
        json_list.append(json_line)
print(json_list)

Trouble parsing JSON object in Python

I am trying to parse some text files containing JSON objects in Python using the json.load() method. It's working for one set of them, but for this one it will not:
{
"mapinfolist":{
"mapinfo":[
{"sku":"00028-0059","price":"38.35","percent":"50","basepercent":"50","exact":0,"match":0,"roundup":0}
,{"sku":"77826-7230","price":"4.18","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-1310","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-2020","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-3360","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-4060","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-4510","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
,{"sku":"77827-7230","price":"2.36","percent":"60","basepercent":"60","exact":1,"match":0,"roundup":0}
],
"count":2
}
}
It is in a file called 'map.txt' - I open it using open('map.txt') and then call json.load(). When I run my test program (test.py), the following error trace is generated:
Traceback (most recent call last):
File "test.py", line 28, in <module>
main()
File "test.py", line 23, in main
map_list = json.load(f1)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/__init__.py", line 318, in loads
return _default_decoder.decode(s)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/decoder.py", line 343, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/json/decoder.py", line 361, in raw_decode
raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)
The JSON object is valid - when I put it into https://www.jsoneditoronline.org/ it is parsed and displayed correctly, so I am having trouble identifying what could be stopping it from working when I try to do it in Python. Any advice would be much appreciated. Thanks!
EDIT: Here's my code.
import json
def main():
    with open('map.txt') as f1:
        map_list = json.load(f1)
Trying map_list = json.loads(f1.read()) also does not work and gives me an almost identical error trace.
EDIT - RESOLVED:
I just copied and pasted FROM map.txt into a new TextEdit file map2.txt and used the new file instead, and it works now. I copied directly from the old file and made no changes - the only difference is that it is a different file. I can't make heads or tails of why that would be - any ideas? I would like to understand what may have happened so I can avoid the problem in the future.
Does the following solution work for you?
import json
f = open("map.txt")
map = json.loads(f.read())
Python Docs
Maybe try reading the whole file into a string and then using json.loads:
import json

def yourfunc():
    with open('map.txt') as file:
        json_string = file.read()
    map = json.loads(json_string)
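The thread does not establish why the copied file worked. One common cause of "Expecting value ... (char 0)" on content that looks valid is invisible bytes at the start of the file, such as a byte order mark; that is a guess here, not a confirmed diagnosis. A minimal sketch for checking, with hypothetical helper names:

```python
import codecs
import json

def describe_start(path, n=4):
    """Return the first n raw bytes of a file and whether it starts with a UTF-8 BOM."""
    with open(path, "rb") as fh:
        head = fh.read(n)
    return head, head.startswith(codecs.BOM_UTF8)

def load_json_tolerating_bom(path):
    """Decode with utf-8-sig, which drops a leading UTF-8 BOM if one is present."""
    with open(path, encoding="utf-8-sig") as fh:
        return json.load(fh)
```

Printing the result of describe_start('map.txt') shows whether the file really begins with "{" or with something an editor would hide. If the leading bytes turn out to be something else entirely (a different encoding or a rich-text header, for instance), the second helper will not help and the file needs to be re-saved as plain text.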

Reading JSON from a file [duplicate]

A simple looking, easy statement is throwing some errors in my face.
I have a JSON file called strings.json like this:
"strings": [{"-name": "city", "#text": "City"}, {"-name": "phone", "#text": "Phone"}, ...,
{"-name": "address", "#text": "Address"}]
I only want to read the JSON file for now. I found the statements below, but they are not working:
import json
from pprint import pprint
with open('strings.json') as json_data:
    d = json.loads(json_data)
    json_data.close()
    pprint(d)
The error displayed on the console was this:
Traceback (most recent call last):
File "/home/.../android/values/manipulate_json.py", line 5, in <module>
d = json.loads(json_data)
File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
[Finished in 0.1s with exit code 1]
If I use json.load instead of json.loads, I get this error:
Traceback (most recent call last):
File "/home/.../android/values/manipulate_json.py", line 5, in <module>
d = json.load(json_data)
File "/usr/lib/python2.7/json/__init__.py", line 278, in load
**kw)
File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 369, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 829 column 1 - line 829 column 2 (char 18476 - 18477)
[Finished in 0.1s with exit code 1]
The json.load() method (without "s" in "load") can read a file directly:
import json
with open('strings.json') as f:
    d = json.load(f)
    print(d)
You were using the json.loads() method, which is used for string arguments only.
The error you get with json.load is a completely different problem: there is some invalid JSON content in the file. For that, I would recommend running the file through a JSON validator.
There are also solutions for fixing JSON, for example "How do I automatically fix an invalid JSON string?".
Here is a copy of code which works fine for me,
import json
with open("test.json") as json_file:
    json_data = json.load(json_file)
    print(json_data)
with the data
{
"a": [1,3,"asdf",true],
"b": {
"Hello": "world"
}
}
You may want to wrap your json.load line in a try/except block, because invalid JSON raises an exception and prints a traceback.
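A minimal sketch of that wrapping, assuming it is acceptable to report the problem and carry on; the helper name load_json_or_none is hypothetical. Catching ValueError covers both the plain ValueError raised by Python 2.7 and json.JSONDecodeError in Python 3, which is a subclass of it.

```python
import json

def load_json_or_none(path):
    """Return the parsed JSON from `path`, or None if the content is not valid JSON."""
    try:
        with open(path) as json_file:
            return json.load(json_file)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        print("Could not parse {}: {}".format(path, exc))
        return None
```

The printed message includes the line and column reported by the decoder, which is usually enough to locate the offending spot in the file. A missing file still raises its own error, since only parsing failures are caught here.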
The problem is using the with statement:
with open('strings.json') as json_data:
    d = json.load(json_data)
    pprint(d)
The file is going to be implicitly closed already. There is no need to call json_data.close() again.
In Python 3, we can use the below method.
Read from a file and convert to JSON
import json
from pprint import pprint
# Considering "json_list.json" is a JSON file
with open('json_list.json') as fd:
    json_data = json.load(fd)
    pprint(json_data)
The with statement automatically closes the opened file descriptor.
String to JSON
import json
from pprint import pprint
json_data = json.loads('{"name" : "myName", "age":24}')
pprint(json_data)
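You can use the Pandas library to read the JSON file. Note that lines=True tells pandas to expect one JSON object per line; omit it for a file that holds a single JSON document.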
import pandas as pd
df = pd.read_json('strings.json', lines=True)
print(df)
To add to this, you can now use pandas to import JSON with pandas.read_json.
Pay careful attention to the orient parameter.
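import json

def read_JSON():
    # "FILE PATH" is a placeholder for the path to the JSON file
    with open("FILE PATH", "r") as i:
        JSON_data = json.load(i)  # parse the file contents
    print(JSON_data)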

Parsing through a JSON file with Python 2.x

I'm currently trying to parse through a text file containing a number of Facebook chat fragments. The fragments are stored as below:-
{"t":"msg","c":"p_100002239013747","s":14,"ms":[{"msg":{"text":"2what is the best restauran
t in hong kong? ","time":1303115825598,"clientTime":1303115824391,"msgID":"1862585188"},"from":10000
2239013747,"to":635527479,"from_name":"David Robinson","from_first_name":"David","from_gender":1,"to_name":"Jason Yeung","to_first_name":"Jason","to_gender":2,"type":"msg"}]}
I've tried a number of ways to parse / open the JSON file, but to no avail. Here is what I've tried thus far:-
import json
data = []
with open("C:\\Users\\Me\\Desktop\\facebookchat.txt", 'r') as json_string:
    for line in json_string:
        data.append(json.loads(line))
error:
Traceback (most recent call last):
File "C:/Users/Amy/Desktop/facebookparser.py", line 6, in <module>
data.append(json.loads(line))
File "C:\Program Files\Python27\lib\json\__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "C:\Program Files\Python27\lib\json\decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\Python27\lib\json\decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 91 (char 91)
and also:
import json
with open("C:\\Users\\Me\\Desktop\\facebookchat.txt", 'r') as json_file:
    data = json.load(json_file)
... but I get exactly the same error as above.
Any suggestions? I've searched previous posts on here and tried the alternative solutions but to no avail. I'm aware I need to treat it as a dictionary file with for example, 'time' being a key and '1303115825598' being the respective time value but if I can't even process the json file into memory, there's no way I can parse it.
Where am I going wrong? Thanks
Your data contains newlines where JSON would not allow these. You'll have to stitch the lines back together again:
data = []
with open("C:\\Users\\Me\\Desktop\\facebookchat.txt", 'r') as json_string:
    partial = ''
    for line in json_string:
        partial += line.rstrip('\n')
        try:
            data.append(json.loads(partial))
            partial = ''
        except ValueError:
            continue  # Not yet a complete JSON value
The code collects lines into partial, but minus the newline, and tries to decode the JSON. If that succeeds, partial is set to the empty string again to process the next entry. If it fails, we loop to the next line to append, until there is a complete JSON value to decode.
