Python dumps "\n" instead of a newline in a json file - python

I have been pulling some data through the Facebook Graph API and saving it in JSON format in a new file. However, whenever I save it, the newlines don't actually show as newlines; they show as "\n". Moreover, a backslash is also inserted before every quote.
For example,
I want the data to be saved in this format:
{
    "feed": {
        "data": [
            {
                "message": "XYZ",
                "created_time": "0000-00-0000:00:00+0000",
                "id": "ABC"
            }
But it is being saved in this format (on a single line):
"{\n\"feed\": {\n\"data\": [\n{\n\"message\": \"XYZ\",\n\"created_time\": \"0000-00-0000:00:00+0000\",\n\"id\": \"ABC\"\n}
How do I save it in the first format and not the second?
I have been using this code:
import json
import os
import requests

url2 = '{0}?fields={1}&access_token={2}'.format(url, fields, token)  # the format in which the API receives the request
# token is the access token, url is to connect to fb and fields is the data I want
size = os.path.getsize('new.json')  # gets the size of the file
content = requests.get(url2).json()  # obtaining the content
obj = json.dumps(content, indent=4)
with open('new.json', 'r+') as f:  # if file size is > 0, delete all content and rewrite
    if size > 0:
        f.truncate(0)
    json.dump(obj, f)
Even though I have used indent, it does not pretty-print in the way I want it to. Help appreciated!

You're using json.dumps to create a JSON representation of your data. Then you're using json.dump to create a JSON representation of that representation. You're double-JSONifying it. Just use one or the other.
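A minimal sketch of the fix: serialize exactly once with json.dump (the content dict below stands in for the API response, since the real request isn't reproducible here).

```python
import json

# Stands in for requests.get(url2).json() from the question.
content = {"feed": {"data": [{"message": "XYZ", "id": "ABC"}]}}

with open('new.json', 'w') as f:
    json.dump(content, f, indent=4)  # one serialization step, no double-encoding

with open('new.json') as f:
    text = f.read()
# text now contains real newlines and unescaped quotes
```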

Related

Append json files while iterating through for loop

I want to iterate through a range of pages and save all of them into one JSON file: append page 2 to page 1, then page 3 to the already-appended pages 1 and 2, and so on.
for i in range(4):
    response = requests.post("https://API&page=" + str(i))
    data = response.json()
    my_data = json.load(open("data.json"))
    my_data.update(my_data)
    json.dump(data, open("data.json", 'w'))
Based on some answers to similar questions I wrote something like this, but it overwrites instead of appending one page to another.
The JSON data structure is as follows, ending with a page number that increments every page.
Any idea what I did wrong?
What is it that you are trying to achieve?
You are overwriting the file data.json each time with the result of the response saved in the variable data.
Your code has two issues: you are updating a dictionary with itself, and you are overwriting a file. Fixing either could solve your problem, depending on what you want to achieve.
It looks like you instead want to save the contents of my_data like that:
json.dump(my_data, open( "data.json", 'w' ))
Anyway, my_data will be a dictionary that gets its contents overwritten each time. Depending on the structure of data, this may not be what you want.
I'll explain better: if your structure is, for any page, something like
{
"username": "retne",
"page": <page-number>
}
my_data will just be equal to the last data page.
Moreover, about the second issue, if you open the file in 'w' mode, you will always overwrite it.
If you open it in 'a' mode, you will append data to it, obtaining something like this:
{
"username": "retne",
"page": 1
}
{
"username": "pentracchiano",
"page": 2
}
{
"username": "foo",
"page": 3
}
but this is not a valid .json file, because it contains multiple objects with no delimiters.
Try being clearer about your intents and I can provide additional support.
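One way to keep the result valid JSON is to collect every page in a single list and dump the list once at the end. A sketch (the page dict is a stand-in for response.json(), since the real API URL isn't shown):

```python
import json

pages = []
for i in range(4):
    data = {"username": "retne", "page": i}  # stand-in for response.json()
    pages.append(data)

# One top-level array: the file stays valid JSON, unlike repeated appends.
with open("data.json", "w") as f:
    json.dump(pages, f, indent=4)
```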
Your code is overwriting the contents of the data.json file on each iteration of the loop. This is because you are using the 'w' mode when calling json.dump, which will overwrite the contents of the file.
To append the data to the file, you can use the 'a' mode instead of 'w' when calling json.dump. This will append the data to the end of the file, rather than overwriting the contents.
Like this
for i in range(4):
    response = requests.post("https://API&page=" + str(i))
    data = response.json()
    my_data = json.load(open("data.json"))
    my_data.update(my_data)
    json.dump(data, open("data.json", 'a'))

Iterate json, look in text file and append data if not exists

I have this ever changing json file with users/viewers of a live stream who join and leave. I want to append all users who is currently viewing and the new users who join the live stream to a text file. So the script has to run and append new viewers until the live stream ends and the viewer list becomes empty.
My script doesn't behave as expected, so I am doing something horribly wrong, which is why I need your help. :)
First, my while loop doesn't work as intended: it doesn't stop when the list is empty.
Second, it just keeps running in an infinite loop, appending everything over and over again. So the check for whether a user is already in the text file does not work.
I hope you can help me solve this. I am still pretty much a newbie learning, so treat me like a child who needs it well explained :D In advance, thank you for your assistance.
Expected behavior:
Look if stream is active (while any viewers are present) else if empty, break
Iterate the json file.
Open a text file.
Test if the userid's from json are already present in the text file.
If not already present append userid and nickname.
If already present skip them since they are already in the text file.
while data['result']['list'] != '':  # keep it running while list is not empty
    with open('test.txt', 'a+') as viewers:  # open text file
        for users in data['result']['list']:  # iterate json
            for line in viewers:  # iterate text file
                if users['userId'] in line:  # look if userId is already in text file
                    break  # all users have already been added to text file; no new users to add
                else:  # append users to file
                    viewers.write(users['nickName'] + '\n')
                    viewers.write(str(users['userId']) + '\n')
The json output looks like this and is changing whenever a viewer join or leave the live stream:
{
    "code": 1,
    "result": {
        "liveType": 0,
        "watchNum": 140,
        "rank": 0,
        "duringF": 0,
        "list": [
            {
                "userId": 294782,
                "nickName": "user1"
            },
            {
                "userId": 200829,
                "nickName": "user2"
            }
        ],
        "earning": 4183,
        "likeNum": 233
    },
    "msg": "OK"
}
There are lots of problems with this code.
You're testing for an empty list incorrectly. A list is not a string.
You're not rereading the JSON file, so data never changes.
When you open a file in append mode, you're positioned at the end of the file, so trying to read the file won't read anything. You need to seek to the beginning of the file first.
You're writing the new user to the file for every line that doesn't match. You should wait until the end of the loop, and only write if the user was never found.
users['userId'] in line will match when the userID is a substring of the line. So if the userID is 10, it will match if 101 or 110 are in the line. You need to do an exact match of the line.
users = set(open("test.txt").read().splitlines()[1::2])  # set of userIds (as strings) from file
with open("test.txt", "a") as viewers:
    while True:
        with open("json_file.json") as j:
            data = json.load(j)
        for user in data['result']['list']:
            if str(user['userId']) not in users:  # compare as strings, since the file stores text
                users.add(str(user['userId']))
                viewers.write(user['nickName'] + '\n')
                viewers.write(str(user['userId']) + '\n')
An improvement to this would be to check whether the JSON file's modification time has changed since the previous iteration, and skip the rest of the loop if it hasn't.
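That modification-time check might be sketched like this (a small helper; the polling details around it are up to the caller):

```python
import os

def file_changed(path, last_mtime):
    """Return (changed, new_mtime) so callers can skip re-reading unchanged files."""
    mtime = os.path.getmtime(path)
    return mtime != last_mtime, mtime

# Inside the answer's while-loop this could look like:
#   changed, last_mtime = file_changed("json_file.json", last_mtime)
#   if not changed:
#       time.sleep(1)
#       continue
```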

parsing a deeply nested JSON data present in a .dms file

I am trying to parse a deeply nested JSON file which is saved as a .dms file. I saved some transactions from the file as a .json file. When I use the json.load() function to read the .json file, I get this error:
JSONDecodeError: Extra data: line 2 column 1 (char 4392)
Opening the .dms file in a text editor, I copied 3 transactions from it and saved them as a .json file. The transactions in the file are not separated by commas; they are separated by newlines. When I used 1 transaction as a .json file, json.load() read it successfully. But when I try the JSON file with 3 transactions, it shows the error.
import json
d = json.load(open('t3.json'))
# or
with open('t3.json') as f:
    data = json.load(f)
    print(data)
The example transaction is:
{
    "header": {
        "msgType": "SOURCE_EVENT"
    },
    "content": {
        "txntype": "ums",
        "ISSUE": {
            "REQUEST": {
                "messageTime": "2019-06-06 21:54:11.492",
                "Code": "655400"
            },
            "RESPONSE": {
                "Time": "2019-06-06 21:54:11.579"
            }
        },
        "DATA": {
            "UserId": "021"
        }
    }
}
{header:{.....}}}
{header:{......}}}
This is how my JSON data from the API looks. I wrote it in a readable way here, but in the file it is all written continuously, and each transaction's header starts on a new line. The .dms file has 3500 transactions. Two transactions are not even separated by commas, only by newlines, but within a transaction there can be extra spaces in a value, e.g. "company": "Target Chips 123 CA".
The output I need:
I need to make a CSV by extracting the values of the keys messageType, messageTime, and userid from each transaction.
Please help clear the error, and suggest ways to extract the data I need from every transaction and put it in a .csv file for further analysis and machine-learning modeling.
If each object is contained within a single line, then read one line at a time and decode each line separately:
import json

with open(fileName, 'r') as file_to_read:
    for line in file_to_read:
        json_line = json.loads(line)
If objects are spread over multiple lines, then ideally try and fix the source of the data, otherwise use my library jsonfinder. Here is an example answer that may help.
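Once each line parses, the CSV step the question asks about might look like this. The key paths below are taken from the sample transaction and are assumptions; the two-line raw string stands in for the real .dms file:

```python
import csv
import io
import json

# Stand-in for the .dms file: one JSON object per line, as described above.
raw = (
    '{"header": {"msgType": "SOURCE_EVENT"}, "content": {"ISSUE": {"REQUEST": '
    '{"messageTime": "2019-06-06 21:54:11.492"}}, "DATA": {"UserId": "021"}}}\n'
) * 2

rows = []
for line in io.StringIO(raw):  # swap for open("file.dms") on the real file
    tx = json.loads(line)
    rows.append({
        "msgType": tx["header"]["msgType"],
        "messageTime": tx["content"]["ISSUE"]["REQUEST"]["messageTime"],
        "UserId": tx["content"]["DATA"]["UserId"],
    })

with io.StringIO() as out:  # swap for open("out.csv", "w", newline="") on a real file
    writer = csv.DictWriter(out, fieldnames=["msgType", "messageTime", "UserId"])
    writer.writeheader()
    writer.writerows(rows)
    csv_text = out.getvalue()
```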

Python: Json.load large json file MemoryError

I'm trying to load a large JSON file (300 MB) to parse into Excel. I started running into a MemoryError when I do json.load(file). Similar questions have been posted but have not answered my specific question. I want to be able to return all the data from the JSON file in one block, as I did in the code. What is the best way to do that? The code and JSON structure are below.
The code looks like this:
def parse_from_file(filename):
    """Load the given and verified json file and return the data
    that was in it so it can actually be read.

    Args:
        filename (string): full branch location, used to grab the json file plus '_metrics.json'
    Returns:
        data: whatever data is being loaded from the json file
    """
    print("STARTING PARSE FROM FILE")
    with open(filename) as json_file:  # the with-block closes the file automatically
        d = json.load(json_file)
    return d
The structure looks like this.
[
    {
        "analysis_type": "test_one",
        "date": 1505900472.25,
        "_id": "my_id_1.1.1",
        "content": {
            ...
        }
    },
    {
        "analysis_type": "test_two",
        "date": 1605939478.91,
        "_id": "my_id_1.1.2",
        "content": {
            ...
        }
    },
    ...
]
Inside "content" the information is not consistent, but it has 3 distinct possible templates that can be predicted based on analysis_type.
I did it this way; hope it helps you. You may need to skip the first line ("[") and remove the trailing "," when a line ends with "},".
import ujson  # pip install ujson, or substitute the standard-library json module

with open(file) as f:
    for line in f:
        while True:
            try:
                jfile = ujson.loads(line)
                break
            except ValueError:
                # Not yet a complete JSON value
                line += next(f)
        # do something with jfile
If all the tested libraries are giving you memory problems, my approach would be splitting the file into one file per object inside the array.
If the file has the newlines and padding you said it has in the OP, I would read it line by line, discarding lines that are just [ or ], and starting a new output file every time you find a },, where you also need to remove the commas. Then try to load every file, printing a message as each one finishes, to see where it fails, if it does.
If the file has no newlines or is not properly padded, you would need to read char by char, keeping two counters: increase them when you find [ or { and decrease them when you find ] or }, respectively. Also take into account that you may need to discard any curly or square bracket that appears inside a string, though that may not be needed.
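The bracket-counting approach in the last paragraph can be sketched like this (a toy version that, as noted, does not handle braces inside strings):

```python
import json

def split_top_level_objects(text):
    """Yield each top-level {...} object found inside a JSON array string."""
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == '{':
            if depth == 0:
                start = i  # remember where this object began
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth == 0:
                yield text[start:i + 1]  # a complete, balanced object

sample = ('[{"analysis_type": "test_one", "_id": "my_id_1.1.1"}, '
          '{"analysis_type": "test_two", "_id": "my_id_1.1.2"}]')
# Each chunk can now be parsed (or written to its own file) independently.
objs = [json.loads(chunk) for chunk in split_top_level_objects(sample)]
```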

Python 3 how to add text to chosen line in a txt document using append

In my document I would like to add scores to a chosen line using:
file = open('scores.txt', 'a')
because I don't want to read all the lines into a list just to change that one line.
I'm basically asking whether you can pass something to .write() or .writelines() to choose the line it writes to. If not, are there other ways? Failing that, a simple way of reading into a list and then changing one line would do.
basically what the txt document looks like:
User 1 scores:
15,
User 2 Scores:
20,
The best way to store this kind of user information is serialization. You have several modules for it, including pickle (which serializes the object into a binary file, and can later deserialize it by reading the file back, giving you exactly the same object as before serialization) and json.
If you're looking for a good solution but still want to be able to read the data clearly, use JSON (since pickle stores the data in a binary format, you can't read it directly).
It's very simple for this basic use:
If you have for example this same structure as you showed before:
User 1 scores:
15,
User 2 scores:
20,
You can make it as a dictionary, for example like that:
scores = {
    'User 1': 15,
    'User 2': 20,
}
Which makes it even easier to edit and use, since to access the User 1 score, all you need to type is scores['User 1'].
Now, for the serialization part: to save the dictionary to a file, you can obtain a serialized string representing the dict with the json.dumps(<your dict>) function, used like this:
import json
dictionary = { 'User 1': 15, 'User 2': 20 }
print(json.dumps(dictionary))
The print will show you how it's represented. As a JSON compliant file.
You just need a simple file.write() to save it, and a file.read() to retrieve it.
(After testing, using json.dumps on your example gives me: {"User 2": 20, "User 1": 15})
To retrieve that data, you'll need to use the json.loads(<json string>) function.
import json
# As you can see, string representation of json.
dictionary = json.loads('{"User 2": 20, "User 1": 15}')
Your dict is now loaded & saved !
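Putting the pieces together for the scores example above (a sketch; the file name scores.json is arbitrary):

```python
import json

scores = {'User 1': 15, 'User 2': 20}

# Save the dict as a JSON string.
with open('scores.json', 'w') as f:
    f.write(json.dumps(scores))

# Load it back, update one user's score, and save again.
with open('scores.json') as f:
    scores = json.loads(f.read())
scores['User 1'] += 5

with open('scores.json', 'w') as f:
    f.write(json.dumps(scores))
```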
More infos on File IO here: http://www.tutorialspoint.com/python3/python_files_io.htm
More infos on JSON Python native API here: https://docs.python.org/3/library/json.html
Hope that helps !
EDIT:
For your information, Python's native file IO only gives you one function to change the position of the read buffer in a file: file.seek(), which moves you to a specific byte in the file (you pass the byte offset as a parameter; for example, file.seek(0, 0) positions you at the beginning of the file).
You can get your position in the file using file.tell() function.
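A quick illustration of seek() and tell() together (the file name is arbitrary):

```python
with open("demo.txt", "w+") as f:
    f.write("User 1 scores:\n15,\n")
    pos = f.tell()    # position after what we just wrote (end of file)
    f.seek(0, 0)      # jump back to the beginning of the file
    first_line = f.readline()
```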
