Iterate json, look in text file and append data if not exists - python

I have a constantly changing JSON file listing the users/viewers of a live stream as they join and leave. I want to append all users who are currently viewing, plus any new users who join the live stream, to a text file. So the script has to keep running and appending new viewers until the live stream ends and the viewer list becomes empty.
My script doesn't behave as expected, so I am doing something horribly wrong, which is why I need your help. :)
First, my while loop doesn't work as intended: it doesn't stop when the list is empty.
Second, it just keeps running in an infinite loop, appending everything over and over again, so the test for whether a user is already in the text file does not work.
I hope you can help me solve this. I am still pretty much a newbie learning, so treat me like a child who needs it well explained :D In advance, thank you for your assistance.
Expected behavior:
Look if stream is active (while any viewers are present) else if empty, break
Iterate the json file.
Open a text file.
Test if the userid's from json are already present in the text file.
If not already present append userid and nickname.
If already present skip them since they are already in the text file.
    while data['result']['list'] != '':  # keep it running while list is not empty.
        with open('test.txt', 'a+') as viewers:  # open text file.
            for users in data['result']['list']:  # iterate json.
                for line in viewers:  # iterate text file
                    if users['userId'] in line:  # look if userId is already in textfile.
                        break  # all users has already been added to text file. No new users to add.
                    else:  # append users to file.
                        viewers.write(users['nickName'] + '\n')
                        viewers.write(str(users['userId']) + '\n')
The JSON output looks like this and changes whenever a viewer joins or leaves the live stream:
    {
        "code": 1,
        "result": {
            "liveType": 0,
            "watchNum": 140,
            "rank": 0,
            "duringF": 0,
            "list": [
                {
                    "userId": 294782,
                    "nickName": "user1"
                },
                {
                    "userId": 200829,
                    "nickName": "user2"
                }
            ],
            "earning": 4183,
            "likeNum": 233
        },
        "msg": "OK"
    }

There are lots of problems with this code.
You're testing for an empty list incorrectly. A list is not a string.
You're not rereading the JSON file, so data never changes.
When you open a file in append mode, you're positioned at the end of the file, so trying to read the file won't read anything. You need to seek to the beginning of the file first.
You're writing the new user to the file for every line that doesn't match. You should wait until the end of the loop, and only write if the user was never found.
users['userId'] in line will match when the userID is a substring of the line. So if the userID is 10, it will match if 101 or 110 are in the line. You need to do an exact match of the line.
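    import json
    import time

    with open("test.txt", "a+") as viewers:
        viewers.seek(0)  # append mode starts at the end, so rewind before reading
        users = set(viewers.read().splitlines()[1::2])  # userIds (as strings) already in the file
        while True:
            with open("json_file.json") as j:
                data = json.load(j)  # reread the JSON on every pass
            if not data['result']['list']:  # an empty list means the stream has ended
                break
            for user in data['result']['list']:
                user_id = str(user['userId'])  # the file stores IDs as text
                if user_id not in users:
                    users.add(user_id)
                    viewers.write(user['nickName'] + '\n')
                    viewers.write(user_id + '\n')
            time.sleep(1)  # polling interval is an arbitrary choice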
An improvement to this would be to check whether the JSON file's modification time has changed since the previous iteration, and skip the rest of the loop if it hasn't.
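A minimal sketch of that modification-time check, assuming the file is polled with os.path.getmtime. The names watch_json, handle, poll_seconds and max_polls are made up for illustration and are not part of the code above; handle is whatever you want to do with the freshly loaded data, and it returns False to stop the loop.

```python
import json
import os
import time


def watch_json(path, handle, poll_seconds=1.0, max_polls=None):
    """Reload the JSON file only when its modification time changes.

    handle(data) is a caller-supplied (hypothetical) callback; returning
    False stops the loop. max_polls is an optional safety limit.
    """
    last_mtime = None
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        mtime = os.path.getmtime(path)
        if mtime != last_mtime:
            # The file changed since the previous pass, so reread it.
            last_mtime = mtime
            with open(path) as f:
                data = json.load(f)
            if not handle(data):
                break
        # Unchanged file: skip the work and just wait for the next poll.
        time.sleep(poll_seconds)
```

If the writer is part-way through rewriting the file when you read it, json.load may still raise an error, so a real script would probably want to catch that and retry on the next poll.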

Related

Append json files while iterating through for loop

I want to iterate through some range of pages and save all of them into one json file, that is append page 2 to page 1 and page 3 to already appended page2 to page1.
    for i in range(4):
        response = requests.post("https://API&page="+str(i))
        data = response.json()
        my_data = json.load(open( "data.json" ))
        my_data.update(my_data)
        json.dump(data, open( "data.json", 'w' ))
Basing on some answers from similar question I wrote something like that, but it overwrites instead of appending one page to another.
The json data structure is as follows:
ending with page number that increments every page.
Any idea what I did wrong?
What is it that you are trying to achieve?
You are overwriting the file data.json each time with the result of the response saved in the variable data.
Your code has two issues: you are updating a dictionary with itself (my_data.update(my_data) never merges in the new page, data), and you are overwriting the file. Fixing either one could solve your problem, depending on what you want to achieve.
It looks like you instead want to save the contents of my_data like that:
json.dump(my_data, open( "data.json", 'w' ))
Anyway, my_data will be a dictionary that gets its contents overwritten each time. Depending on the structure of data, this might not be what you want.
I'll explain better: if your structure is, for any page, something like
    {
        "username": "retne",
        "page": <page-number>
    }
my_data will just be equal to the last data page.
Moreover, about the second issue, if you open the file in 'w' mode, you will always overwrite it.
If you will open it in 'a' mode, you will append data to it, obtaining something like this:
    {
        "username": "retne",
        "page": 1
    }
    {
        "username": "pentracchiano",
        "page": 2
    }
    {
        "username": "foo",
        "page": 3
    }
but this is not a valid .json file, because it contains multiple objects with no delimiters.
Try being clearer about your intents and I can provide additional support.
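If the goal is a single valid JSON file containing every page, one option is to collect the pages in a list and write the file once, after the loop. This is only a sketch: save_pages and fake_fetch_page are hypothetical names, and fake_fetch_page stands in for the real request (in your case, something like requests.post(...).json()).

```python
import json


def save_pages(fetch_page, page_numbers, path):
    """Fetch every page, then write them all as one JSON array."""
    pages = [fetch_page(n) for n in page_numbers]
    # Open in 'w' mode exactly once, so the file holds one valid document.
    with open(path, "w") as f:
        json.dump(pages, f, indent=2)
    return pages


def fake_fetch_page(n):
    """Stand-in for the real API call; returns made-up data."""
    return {"username": "user%d" % n, "page": n}
```

Calling save_pages(fake_fetch_page, range(4), "data.json") leaves a file that json.load can read back as a list of four page dictionaries.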
Your code overwrites the contents of data.json on each iteration of the loop. This is because the file is opened in 'w' mode before json.dump is called, which truncates it.
To append instead, open the file in 'a' mode. Each page is then added to the end of the file rather than replacing it. Note that the result is a sequence of JSON documents, not one valid JSON document, so write one page per line and read the file back line by line rather than with a single json.load.
Like this:
    import json
    import requests

    for i in range(4):
        response = requests.post("https://API&page="+str(i))
        data = response.json()
        # 'a' mode never truncates; each page goes on its own line
        with open("data.json", 'a') as f:
            json.dump(data, f)
            f.write('\n')

How to load text file in python?

I have a text file like this:
[0.52, '1_1man::army'], stack
[0.45, '3_3man::army'], flow
[0.52, '1_1man::army'], testing
[0.52, '2_2man:army'], expert
How can I load the file and print all the values for
'1_1man::army', '3_3man::army', '1_1man::army' and '2_2man:army'
My code:
text = open("text.txt", "r").readlines()
print(text[1])
I want to implement the solutions that some good people shared for my earlier question, but I can't use their code directly, since the file I have now is different from the one I posted there (I wish to try out this new example):
How can I arrange the list according to similar item in certain location
If that format is rigid throughout the file, you could simply use split() to extract the values between the quotes:
    with open("text.txt", "r") as file:
        for line in file:
            print(line.split("'")[1])
line.split("'") slices the string up whenever it sees a '. In your case, every line would be sliced into a list of 3 elements:
[0.52,
1_1man::army
], stack
You want the middle one, which has index [1]. So line.split("'")[1] gives you exactly that.
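If you also want the word after each bracket, the same split can be taken one step further. The sketch below groups those trailing words by the quoted name; group_by_name is a hypothetical helper, and it assumes every useful line has exactly the shape shown in the question.

```python
from collections import defaultdict


def group_by_name(lines):
    """Map each quoted name to the list of words that follow its bracket."""
    groups = defaultdict(list)
    for line in lines:
        parts = line.split("'")
        if len(parts) < 3:
            continue  # skip blank lines or lines without a quoted name
        name = parts[1]
        # parts[2] looks like "], stack\n": drop the bracket, comma and spaces.
        label = parts[2].lstrip("], ").strip()
        groups[name].append(label)
    return dict(groups)


sample = [
    "[0.52, '1_1man::army'], stack",
    "[0.45, '3_3man::army'], flow",
    "[0.52, '1_1man::army'], testing",
    "[0.52, '2_2man:army'], expert",
]
print(group_by_name(sample))
```

With a real file you would pass the open file object instead of the sample list, since iterating over a file yields its lines.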
An easier approach would be to store the data as JSON instead. Python has a good built-in json library. A JSON object should not repeat a key (with duplicates, json.load keeps only the last value), and a trailing comma is not allowed, so group the values for each key in a list. This is what the JSON would look like:
    {
        "1_1man::army": ["stack", "testing"],
        "3_3man::army": ["flow"],
        "2_2man:army": ["expert"]
    }
You would enter this and change the file extension from .txt to .json. You can read it like this:
    import json

    with open("YourText/JsonFileHere.json") as f:
        data = json.load(f)

    # Get first 1_1man::army value
    data["1_1man::army"][0]
    # Get 3_3man::army value
    data["3_3man::army"][0]
    # Get second 1_1man::army value
    data["1_1man::army"][1]
    # Get all 1_1man::army values
    data["1_1man::army"]
    # in order to add things to the json do this:
    data["What you want the new key to be called"] = ["What the value is"]
Let me know if this helps!

Python: Json.load large json file MemoryError

I'm trying to load a large JSON file (300 MB) so that I can parse it into Excel. I just started running into a MemoryError when I do a json.load(file). Similar questions have been posted, but they have not answered my specific question. I want to be able to return all the data from the JSON file in one block, as I do in the code below. What is the best way to do that? The code and JSON structure are below:
The code looks like this.
    import json

    def parse_from_file(filename):
        """Load the given (already verified) JSON file and return its data.

        Args:
            filename (string): full branch location, used to grab the json file plus '_metrics.json'
        Returns:
            data: whatever data is being loaded from the json file
        """
        print("STARTING PARSE FROM FILE")
        with open(filename) as json_file:
            d = json.load(json_file)
        # the with block closes the file, so no explicit close() is needed
        return d
The structure looks like this.
    [
        {
            "analysis_type": "test_one",
            "date": 1505900472.25,
            "_id": "my_id_1.1.1",
            "content": {
                .
                .
                .
            }
        },
        {
            "analysis_type": "test_two",
            "date": 1605939478.91,
            "_id": "my_id_1.1.2",
            "content": {
                .
                .
                .
            }
        },
        .
        .
        .
    ]
Inside "content" the information is not consistent, but it follows one of 3 distinct templates, which can be predicted based on analysis_type.
I did it this way; I hope it helps you. You may need to skip the first line ("[") and remove the trailing "," from any line that ends with "},".
    import ujson

    with open(file) as f:
        for line in f:
            while True:
                try:
                    jfile = ujson.loads(line)
                    break
                except ValueError:
                    # Not yet a complete JSON value
                    line += next(f)
            # do something with jfile
If all the tested libraries are giving you memory problems, my approach would be to split the file into one file per object inside the array.
If the file has the newlines and padding shown in the OP, I would read it line by line, discarding any line that is [ or ] and writing the lines to a new file every time you find a }, where you also need to remove the commas. Then try to load every file, printing a message when you finish reading each one, to see where it fails, if it does.
If the file has no newlines or is not properly padded, you would need to read it char by char while keeping two counters, increasing each of them when you find [ or { and decreasing them when you find ] or } respectively. Also take into account that you may need to ignore any curly or square bracket that is inside a string, though that may not be needed.
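For a file that is one big top-level array, like the structure in the question, the standard library's json.JSONDecoder.raw_decode can pull one element at a time out of a text buffer, so you do not have to count brackets yourself and only the current element (plus a read buffer) has to be in memory. A rough sketch, assuming the top-level value is an array; iter_json_array and chunk_size are names I made up:

```python
import json


def iter_json_array(path, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time."""
    decoder = json.JSONDecoder()
    with open(path) as f:
        buf = f.read(chunk_size).lstrip()
        if not buf.startswith("["):
            raise ValueError("expected a top-level JSON array")
        buf = buf[1:]
        while True:
            # Drop whitespace and the comma separating two elements.
            buf = buf.lstrip(" \t\r\n,")
            if buf.startswith("]"):
                return
            try:
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                obj, end = None, None  # element is not complete yet
            if end is None or end == len(buf):
                # Incomplete element, or one that might continue in the
                # next chunk (e.g. a number cut in half): read more first.
                more = f.read(chunk_size)
                if more:
                    buf += more
                    continue
                if end is None:
                    raise ValueError("truncated or malformed JSON array")
            yield obj
            buf = buf[end:]
```

Note that this only helps if you process each element as it arrives (for example, writing its row to Excel) rather than collecting them all in a list again. It also re-parses a partial element each time more text is read, so very large individual elements will be slow.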

My program sometimes writes an extra ] or } at the end of data in json file?

I have written a note-taking tool for myself as my first program. It's actually working really well for the most part; however, sometimes the program will write an extra ] or } at the end of the list or dict stored inside the JSON file.
It doesn't happen often and I think it is only happening when I am writing new lines of code or changing existing lines that read/write to said files. I am not 100% sure but that is what it looks like.
For example, I have a single list stored in a file, and I use the indent="" flag so that the files are a little more readable if I ever have to edit them. Sometimes, when running my program after changing or adding code, I get an error stating that a file has "extra data" in it.
The error looks something like this:
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 6 column 2 (char 5791)
and the cause of the error would be something like this:
    [
    "Help",
    "DataTypes",
    "test",
    "Variables",
    ]] # the error would be caused by this extra ] at the end of the list
What I don't understand is why the program sometimes adds an extra ] or } at the end of the data in my JSON files.
Is there something I am doing wrong when I open the file or dump to the file?
Here are some sections of code I have that are used to open files and dump to files:
    path = "./NotesKeys/"
    notebook = dict()
    currentWorkingLib = ""
    currentWorkingKeys = ""
    #~~~~~~~~~~~~~~~~~~~< USE TO open all files in Directory >~~~~~~~~~~~~~~~~~~~
    with open("%s%s"%(path,"list_of_all_filenames"), "r") as listall:
        list_of_all_filenames = json.load(listall)
    def openAllFiles(event=None):
        global path
        for filename in os.listdir(path):
            with open(path+filename, "r+") as f:
                notebook[filename] = json.load(f)
    openAllFiles()
And here is how I am updating the data in the file. Just ignore e1Current, e1allcase, and e2Current: they keep the user's input for filenames (dict keys) lower case in the dictionaries where the notes are stored, while preserving the case the user entered for a display list. This should not be related to the file read/write issue:
Edit: removed unrelated code per commenters request.
    #~~~~~~~~~~~~~~~~~~~< UPDATE selected_notes! >~~~~~~~~~~~~~~~~~~~
    dict_to_be_updated = notebook[currentWorkingLib]
    dict_to_be_updated[e1Current] = e2Current
    with open("%s%s"%(path,currentWorkingLib),"r+") as working_temp_var:
        json.dump(dict_to_be_updated, working_temp_var, indent = "")
I am aware of how to open a file and use the data and how to dump data to said file and update the content loaded in the variables of the program based off the newly dumped data.
Am I missing something important during this process? Should I be doing something to ensure data integrity in the json files?
You are opening files in read-write mode, r+:
with open("%s%s"%(path,currentWorkingLib),"r+") as working_temp_var:
This means you'll be writing to a file that already has data in it, and sometimes the existing data is longer than what you are now writing to the file. That means you'll end up with some trailing data at the end.
You can see this by writing a demo string to a file, then using r+ to write a shorter string to the same file, then reading it again:
    >>> with open('/tmp/demo', 'w') as init:
    ...     init.write('The quick brown fox jumps over the lazy dog\n')
    ...
    44
    >>> with open('/tmp/demo', 'r+') as readwrite:
    ...     readwrite.write("Monty Python's flying circus\n")
    ...
    29
    >>> with open('/tmp/demo', 'r') as result:
    ...     print(result.read())
    ...
    Monty Python's flying circus
    r the lazy dog
Don't do this. Use w write mode so the file is truncated first:
with open("%s%s"%(path,currentWorkingLib), "w") as working_temp_var:
This ensures your file is cut back to size 0 before you write a new JSON document.
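On the data integrity question: a common pattern is to write the new document to a temporary file in the same directory and then swap it into place with os.replace, so that a crash part-way through a write cannot leave a half-written file behind. A minimal sketch; save_json is a hypothetical helper name, not something from your program:

```python
import json
import os
import tempfile


def save_json(path, obj):
    """Write obj to path as JSON, fully replacing any previous content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(obj, tmp, indent="")
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        os.remove(tmp_path)  # clean up the partial temporary file
        raise
```

You would then call save_json(path + currentWorkingLib, dict_to_be_updated) instead of opening the file with r+. Because the temporary file is created in the notes directory, a loop over os.listdir(path) could see it if the program dies mid-write, so you may prefer to keep temporary files somewhere else on the same filesystem.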

Python quiz not writing to txt file

I am trying to write a simple arithmetic quiz. Once the user has completed the quiz, I want to write their name and score to a text file. However, if they have already completed the quiz, then their new score should be appended on the same line as their previous score is on.
Currently the text file contains: Raju,Joyal : 10
However, when completing the test under the same surname, the new score is not appended to this line, and when completing the test under a different surname no new line is written to the text file at all.
This is my code:
    rewrite = False
    flag = True
    while flag == True:
        try:
            # opening src in a+ mode will allow me to read and append to file
            with open("Class {0} data.txt".format(classNo),"a+") as src:
                # list containing all data from file, one line is one item in list
                data = src.readlines()
                for ind,line in enumerate(data):
                    if surname.lower() in line.lower():
                        # overwrite the relevant item in data with the updated score
                        data[ind] = "{0} {1}\n".format(line.rstrip(), ", ",score)
                        rewrite = True
                    else:
                        src.write("{0},{1} : {2}{3} ".format(surname, firstName, score,"\n"))
            if rewrite == True:
                # reopen src in write mode and overwrite all the records with the items in data
                with open("Class {} data.txt".format(classNo),"w") as src2:
                    src2.writelines(data)
            flag = False
        except IOError:
            errorHandle("Data file not found. Please ensure data files are the in same folder as the program")
You're opening the file but, because you're in "append" mode (a+), your read/write pointer is positioned at the end of the file. So when you call readlines() you get nothing: even if the file is not empty, there are no more lines past where you currently are. As a result, your for loop is iterating over a list of length 0, so its body never runs.
You should read up on working with files (look for the keywords seek and tell).
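To make the a+ behaviour concrete, here is a small self-contained demonstration. The temporary file and its contents are made up for the example:

```python
import os
import tempfile

# Create a throwaway file with two lines in it.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("first\nsecond\n")

with open(path, "a+") as f:
    at_end = f.readlines()      # positioned at the end: nothing left to read
    f.seek(0)                   # rewind to the start of the file
    from_start = f.readlines()  # now every line is returned
    f.write("third\n")          # in append mode, writes still go to the end

print(at_end)
print(from_start)
```

The first list printed is empty and the second holds both lines, which is why the loop in the question has nothing to iterate over until you seek back to the beginning.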
Note that even if you're positioned in the right place in the middle of the file, overwriting what's already there in an existing file will not be a good way to go: if the data you want to write are a different number of bytes from what you want to overwrite, you'll get problems. Instead you'll probably want to open one copy of the file for reading and create a new one to write to. When they're both finished and closed, move the newer file to replace the older one.
Finally note that if surname.lower() in line.lower() is not watertight logic. What happens if your file has the entry Raju,Joyal: 10 and someone else has the surname "Joy" ?
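Putting those points together, one possible shape for the update is sketched below: read the whole file, match on the exact name before the colon rather than on a substring, and write the result to a new file that then replaces the old one. record_score is a hypothetical helper, and it assumes the "Surname,FirstName : score" line format from the question:

```python
import os


def record_score(path, surname, first_name, score):
    """Append score to the student's existing line, or add a new line."""
    key = "{0},{1}".format(surname, first_name)
    try:
        with open(path) as src:
            lines = src.read().splitlines()
    except FileNotFoundError:
        lines = []  # first run: no data file yet
    for i, line in enumerate(lines):
        # Compare the whole name before the colon, not a substring.
        name = line.split(":", 1)[0].strip()
        if name.lower() == key.lower():
            lines[i] = "{0}, {1}".format(line.rstrip(), score)
            break
    else:
        # The loop finished without a match, so this is a new student.
        lines.append("{0} : {1}".format(key, score))
    # Write a fresh copy, then move it over the original.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as dst:
        dst.write("\n".join(lines) + "\n")
    os.replace(tmp_path, path)
```

With this, a student called "Joy" gets a line of their own instead of being matched against "Raju,Joyal", and a repeat attempt by Raju,Joyal turns "Raju,Joyal : 10" into "Raju,Joyal : 10, 7".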
This is from my own project but I don't know if it helps:
file=open("Mathematics Test Results (v2.5).txt","a")
file.write("Name: "+name+", Score: "+str(score)+", Class: "+cls+"."+"\n")
file.close()
