Python write operation not writing quickly enough

I don't know why this started happening recently. I have a function that opens a new text file, writes a URL to it, then closes it, but the file is not created immediately after f.close() executes. The problem is that the function after it, open_url(), needs to read the URL from that text file, but since nothing is there, my program errors out.
Ironically, after my program errors out and I stop it, the url.txt file appears, haha. Does anyone know why this is happening with Python's .write() method? Is there another way to create a text file and write a line of text to it faster?
@staticmethod
def write_url():
    if not path.exists('url.txt'):
        url = UrlObj().url
        print(url)
        with open('url.txt', 'w') as f:
            f.write(url)
            f.close  # missing parentheses, so this is a no-op; the with block closes the file anyway
    else:
        pass
@staticmethod
def open_url():
    x = open('url.txt', 'r')
    y = x.read()
    return y
def main():
    scraper = Job()
    scraper.write_url()
    url = scraper.open_url()
    results = scraper.load_craigslist_url(url)
    scraper.kill()
    dictionary_of_listings = scraper.organizeResults(results)
    scraper.to_csv(dictionary_of_listings)

if __name__ == '__main__':
    main()
    scheduler = BlockingScheduler()
    scheduler.add_job(main, 'interval', hours=1)
    scheduler.start()
There is another class called url that prompts the user to add attributes to a bare URL for Selenium to use. UrlObj().url gives you the URL that is written to the new text file. If url.txt already exists, then pass and go to open_url() to get the URL from url.txt, which is passed to the url variable used to start the scraping.
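If the worry is that the bytes have not reached the disk before the next read, one thing worth trying is an explicit flush and fsync inside the with block (a sketch assuming buffering is the cause, which the question does not confirm; requires import os):

with open('url.txt', 'w') as f:
    f.write(url)
    f.flush()             # push Python's internal buffer to the OS
    os.fsync(f.fileno())  # ask the OS to commit the bytes to disk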

Just found a workaround: if the file does not exist, return the URL directly to be fed to load_craigslist_url; if the text file already exists, just read from the text file.
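A minimal sketch of that workaround, keeping the names from the question's code (the exact class layout is assumed):

@staticmethod
def write_url():
    if not path.exists('url.txt'):
        url = UrlObj().url
        with open('url.txt', 'w') as f:
            f.write(url)
        return url  # hand the URL straight back, no re-read needed
    with open('url.txt', 'r') as f:
        return f.read()

main() would then call url = scraper.write_url() and skip open_url() entirely.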


Python doesn't release file after it is closed

What I need to do is write some messages to a .txt file, close it, and send it to a server. This happens in an infinite loop, so the code should look more or less like this:
import time

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

num = 0
while True:
    num += 1
    filename = f"example{num}.txt"
    with open(filename, "w") as f:
        f.write("Hello")
        f.close()  # redundant: the with block already closes the file
    mp_encoder = MultipartEncoder(
        fields={
            'file': ("file", open(filename, 'rb'), 'text/plain')
        }
    )
    r = requests.post("my_url/save_file", data=mp_encoder, headers=my_headers)
    time.sleep(10)
The POST works if the file is created manually inside my working directory, but if I try to create it and write to it through code, I receive this response message:
500 - Internal Server Error
System.IO.IOException: Unexpected end of Stream, the content may have already been read by another component.
I don't see the file appearing in the project window of PyCharm. I even used time.sleep(10) because at first I thought it could be a time-related problem, but that didn't solve it. In fact, the file appears in my working directory only when I stop the code, so it seems the file is held by the program even after I explicitly called f.close(). I know the with statement should take care of closing files, but it didn't look like that was happening, so I added a close() to check whether that was the problem (spoiler: it was not).
I solved the problem by using another file:
# copy the original file's contents into a fresh file
with open(filename, "r") as firstfile, open("new.txt", "a+") as secondfile:
    secondfile.write(firstfile.read())
# truncate the original to save space
with open(filename, 'w'):
    pass
# mp_encoder is built as before, but around "new.txt"
r = requests.post("my_url/save_file", data=mp_encoder, headers=my_headers)
if r.status_code == requests.codes.ok:
    os.remove("new.txt")
else:
    print("File not saved")
I make a copy of the file, empty the original file to save space, and send the copy to the server (then delete the copy). It looks like the problem was that the original file was being held open by the Python logging module.
First, change open(f, 'rb') to open("example.txt", 'rb'): open() expects a file name, not a closed file pointer.
Also, you can use os.path.abspath to find out where the file is being written:
import os
os.path.abspath('.')
Third point: when you are using the with context manager to open a file, you don't close the file yourself; the context manager is supposed to do it:
with open("example.txt", "w") as f:
f.write("Hello")

How to write txt files inside an infinite loop?

I am trying to make a loop that writes the date to a text file each time an event happens, but I can't get it to work since I need an infinite loop to run the program. If I put myfile.close() inside the loop, even inside the if x[14] == "track": block, I get:
myfile.write(wri)
ValueError: I/O operation on closed file.
However, if I place it outside the loop, the file doesn't close and nothing is written to the output file.
Here is the code:
import os
import re
import subprocess
from time import sleep

import requests
from bs4 import BeautifulSoup

# headers, date and myfile are defined earlier (omitted in the question)
while 1:
    print("yes")
    response = requests.get('https://api.spotify.com/v1/me/player/currently-playing', headers=headers)
    soup2 = BeautifulSoup(response.text, "html.parser")
    x = re.findall('"([^"]*)"', str(soup2))
    if isinstance(x, list):
        if len(x) >= 15:
            print(x[14])
            if x[14] == "track":
                os.system("TASKKILL /IM spotify.exe")
                sleep(2)
                subprocess.Popen("C:/Users/nebbu/AppData/Roaming/Spotify/Spotify.exe")
                sleep(2)
                import pyautogui
                pyautogui.press("playpause")
                pyautogui.press("l")
                print(x)
                wri = str(date) + "- -" + str(x[13] + ": " + str(x[14]))
                myfile.write(wri)
                myfile.close()  # closing here makes the next iteration's write fail
The loop never ends; I don't know if it has to end to close the file, or if there is another way of doing it.
Simply make a custom function and call it every time you want to add a new line to your text file. For example:
def f(dump):
    file = open('myfile.txt', 'a')
    file.write(dump)
    file.write('\n')
    file.close()
and then pass it the values you want to write on the fly.
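Inside the question's loop, the call would replace the myfile.write/myfile.close pair (a sketch reusing the question's variables):

wri = str(date) + "- -" + str(x[13]) + ": " + str(x[14])
f(wri)  # opens, appends one line, and closes the file on every call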

Regarding file downloading in Python

I wrote this code to download an srt subtitle file, but it doesn't work. Please review this problem and help me with the code; I need to find the mistake I'm making. Thanks.
from urllib import request

srt_url = "https://subscene.com/subtitle/download?mac=LkM2jew_9BdbDSxdwrqLkJl7hDpIL_HnD-s4XbfdB9eqPHsbv3iDkjFTSuKH0Ee14R-e2TL8NQukWl82yNuykti8b_36IoaAuUgkWzk0WuQ3OyFyx04g_vHI_rjnb2290"

def download_srt_file(srt_url):
    response = request.urlopen(srt_url)
    srt = response.read()
    srt_str = str(srt)
    lines = srt_str.split('\\n')
    dest_url = r'srtfile.srt'
    fx = open('dest_url', 'w')
    for line in lines:
        fx.write(line)
    fx.close()
    download_srt_file(srt_url)
A number of things are wrong or can be improved.
You are missing the return statement on your function.
You are calling the function from within the function, so you are not actually calling it at all; you never enter it to begin with.
In fx = open('dest_url', 'w') you are passing the literal string 'dest_url' instead of the variable dest_url, so the data is written to a file literally named dest_url.
To avoid handling the closing and flushing of the file you are writing, just use the with statement.
Your split('\\n') is also wrong: escaping the backslash makes it split on the literal characters \n in the text rather than on newlines. You want to split the lines, so it has to be split('\n').
Finally, urlopen's read() returns bytes, so decode it rather than wrapping it in str(): srt = response.read().decode('utf-8').
Below is a modified and hopefully functioning version of your code with the above implemented.
from urllib import request

def download_srt_file(srt_url):
    response = request.urlopen(srt_url)
    srt = response.read().decode('utf-8')  # urlopen returns bytes, so decode first
    lines = srt.split('\n')
    dest_url = 'srtfile.srt'
    with open(dest_url, 'w') as fx:
        for line in lines:
            fx.write(line + '\n')  # re-add the newline that split() removed
    return

srt_url = "https://subscene.com/subtitle/download?mac=LkM2jew_9BdbDSxdwrqLkJl7hDpIL_HnD-s4XbfdB9eqPHsbv3iDkjFTSuKH0Ee14R-e2TL8NQukWl82yNuykti8b_36IoaAuUgkWzk0WuQ3OyFyx04g_vHI_rjnb2290"
download_srt_file(srt_url)
Tell me if it works for you.
A final remark: you are not setting a target directory for the file you are writing, so it lands in the current working directory. Are you sure that is what you want?
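Since the goal is just to save the download to disk, a simpler sketch skips the decode/split dance entirely and writes the raw bytes (file name kept from the question):

from urllib import request

def download_srt_file(srt_url, dest_path='srtfile.srt'):
    # read the raw bytes and write them out unchanged
    with request.urlopen(srt_url) as response, open(dest_path, 'wb') as fx:
        fx.write(response.read())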

How do I get my program to write all the data it generates to a text file? I'm trying to do a sentiment analysis of Twitter feeds

OK, so I'm trying to do a sentiment analysis of Twitter tweets, and all my code works perfectly for getting a response of live tweets. However, the shell deletes the tweets after a certain amount is reached. I have been messing around with my code to try to write all the tweets to a text file, but after the last five hours of struggling I cannot figure it out. The comment symbol # marks code I added to try to write the information to my text file. I'm fairly new to Python, so if someone can help me out I would very much appreciate it.
I would use Git because I know how to write all the data to a text file in that program, but I can't figure out how to get it to run my Python files.
def twitterreq(url, http_method, parameters):  # renamed from "method" so the body's http_method references resolve
    req = oauth.Request.from_consumer_and_token(oauth_consumer,
                                                token=oauth_token,
                                                http_method=http_method,
                                                http_url=url,
                                                parameters=parameters)
    req.sign_request(signature_method_hmac_sha1, oauth_consumer, oauth_token)
    headers = req.to_header()
    if http_method == "POST":
        encoded_post_data = req.to_postdata()
    else:
        encoded_post_data = None
    url = req.to_url()
    opener = urllib.OpenerDirector()
    opener.add_handler(http_handler)
    opener.add_handler(https_handler)
    response = opener.open(url, encoded_post_data)
    return response

def fetchsamples():
    url = "https://stream.twitter.com/1/statuses/sample.json"
    parameters = []
    response = twitterreq(url, "GET", parameters)
    f = open("C:\\Users\\name\\Desktop\\datasci_course_materials\\assignment1", "w") # my attempt
    for line in response:
        f.write(str(line) + "\n") # 100% sure im not using this command properly
        print line.strip()

if __name__ == '__main__':
    fetchsamples()
I have left out the top of my code because you shouldn't need my access and consumer keys to answer this question. This code is in Python 2.7.
You could try something along the lines of:
try:
    with open("filename.txt", "a") as f:
        for n in response:
            f.write(n + "\n")
except IOError as e:
    print e
except TypeError as t:
    print t
This will attempt to open filename.txt and append each item in response on a new line, catching IO and type errors as it goes.
The line f = open("<filename>", "w") # my attempt means that every time your program runs it erases the file and then opens it, so whatever was written before is lost completely.
Try changing the mode to "a", which means that each subsequent call will just add data to the end.
f = open("<filename>", "a") # Appending instead of overwriting.
Extra information: https://docs.python.org/2/library/functions.html#open
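Putting both suggestions together, fetchsamples might end up looking like this (a sketch against the question's Python 2.7 code; the file name tweets.txt is made up):

def fetchsamples():
    url = "https://stream.twitter.com/1/statuses/sample.json"
    response = twitterreq(url, "GET", [])
    # append so repeated runs keep earlier data, and let the with
    # block close (and flush) the file even if the stream errors out
    with open("tweets.txt", "a") as f:
        for line in response:
            f.write(str(line) + "\n")
            print line.strip()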

How to structure Python function so that it continues after error?

I am new to Python, and with some really great assistance from StackOverflow, I've written a program that:
1) Looks in a given directory, and for each file in that directory:
Runs an HTML-cleaning program, which:
Opens each file with BeautifulSoup
Removes blacklisted tags & content
Prettifies the remaining content
Runs Bleach to remove all non-whitelisted tags & attributes
Saves out as a new file
It works very well, except when it hits a certain kind of file content that throws up a bunch of BeautifulSoup errors and aborts the whole thing. I want it to be robust against that, as I won't have control over what sort of content winds up in this directory.
So, my question is: How can I re-structure the program so that when it errors on one file within the directory, it reports that it was unable to process that file, and then continues to run through the remaining files?
Here is my code so far (with extraneous detail removed):
def clean_dir(directory):
    os.chdir(directory)
    for filename in os.listdir(directory):
        clean_file(filename)

def clean_file(filename):
    tag_black_list = ['iframe', 'script']
    tag_white_list = ['p', 'div']
    attr_white_list = {'*': ['title']}
    with open(filename, 'r') as fhandle:
        text = BeautifulSoup(fhandle)
        text.encode("utf-8")
        print "Opened " + filename
        # Step one, with BeautifulSoup: remove tags in tag_black_list, destroy contents.
        [s.decompose() for s in text(tag_black_list)]
        pretty = text.prettify()
        print "Prettified"
        # Step two, with Bleach: remove tags and attributes not in whitelists, leave tag contents.
        cleaned = bleach.clean(pretty, strip="TRUE", attributes=attr_white_list, tags=tag_white_list)
        fout = open("../posts-cleaned/" + filename, "w")
        fout.write(cleaned.encode("utf-8"))
        fout.close()
    print "Saved " + filename + " in /posts-cleaned"

print "Done"
clean_dir("../posts/")
I'm looking for any guidance on how to write this so that it will keep running after hitting a parsing/encoding/content/attribute/etc. error within the clean_file function.
You can handle the errors using try-except-finally.
You can do the error handling inside clean_file or in the for loop:
for filename in os.listdir(directory):
    try:
        clean_file(filename)
    except:
        print "Error processing file %s" % filename
If you know what exception gets raised you can use a more specific catch.
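For example (a sketch; the exception types shown are guesses at what BeautifulSoup or the file I/O might raise, not something the question pins down):

for filename in os.listdir(directory):
    try:
        clean_file(filename)
    except (UnicodeDecodeError, IOError) as e:
        # report the failing file and keep going with the rest
        print "Error processing file %s: %s" % (filename, e)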
