file.readable() overwriting external text files - bug or bad code? - python

I am taking a YouTube tutorial on Python 3. In this exercise the code runs fine if I open the file with the "r" option for reading. If I switch the "r" to a "w", it runs fine the first time and tells me that it cannot read the file, which is the expected outcome. But when I look at employees.txt again, or rerun the code with the "r" option, the text file is empty.
I have tried the syntax in every way I can think of, but so far I have had no luck figuring out why it empties the text file.
How would you go about troubleshooting something like this? Or is there something obvious in my code that you see? (The code is exactly like the course example.)
#!/usr/bin/env python3
# Script Name - reading-files.fcc.py
employee_file = open("employees.txt", "r")
if employee_file.readable() == True:
    print(employee_file.read())
    employee_file.close()
    exit
else:
    print("Cannot read file.")
    employee_file.close()
    exit
Here is employees.txt:
Jim - Sales
Dwight - Sales
Pam - Sales
Michael - Manager
Oscar - Accounting

This might be a good post for your consideration.
Opening the file in w mode truncates it, so unless you write something to the file afterwards, its contents are effectively erased. In your case nothing is written, so the file is left blank.

When you open the file with the 'w' option, the file is emptied as soon as it is opened and employee_file.readable() returns False, so the script prints the message, closes the file, and leaves an empty file behind.
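A minimal sketch of the behaviour described above, assuming a throwaway file in a temporary directory (the path and the helper name `demo_truncate` are made up for illustration). It shows that the truncation happens at `open()` time, before anything is read or written:

```python
import os
import tempfile


def demo_truncate():
    """Show that opening with "w" empties the file before any write happens."""
    # Hypothetical scratch file standing in for employees.txt.
    path = os.path.join(tempfile.mkdtemp(), "employees.txt")
    with open(path, "w") as f:
        f.write("Jim - Sales\nDwight - Sales\n")
    size_before = os.path.getsize(path)

    # Opening in "w" mode truncates immediately, and the handle is write-only.
    f = open(path, "w")
    can_read = f.readable()
    f.close()
    size_after = os.path.getsize(path)
    return size_before, can_read, size_after


if __name__ == "__main__":
    print(demo_truncate())
```

If the goal is only to check whether a file can be read, opening it with "r" (or "r+" to read and write without truncating) avoids losing the contents.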

Related

Strange behavior when trying to create and write to a text file on macOS [duplicate]

This question already has answers here:
Convert UTF-8 with BOM to UTF-8 with no BOM in Python
I'm opening a plain text file, parsing it, and adding different lines to existing, empty string variables. I add these variables into a new variable that is a multi-line f-string. Trying to write the data to a new text file is not behaving as expected.
Reading the original file works fine. Text is properly parsed, variables populated.
The multi-line f-string variable seems fine. It prints normally. I even tried formatting it different ways, which I show below.
When writing to a new file, that's where the strangeness starts. I've tried 2 ways:
Straight coding the open function with w or w+
Adding the above to a function and using that inside main()
The file is saved to disk with the correct name. Trying to double-click open in Finder produces nothing. Right-click to open produces nothing. Trying to move it to the trash with Command+Delete gives an error (error -43):
It sounds like the file goes to trash, but as the file disappears from the folder a new one is created with the same name in its place.
If I try to open in TextMate via File > Open, it opens as a blank file with no errors.
Since I can't get rid of the file, I have to delete the directory and create the directory again with the same name, or force delete in Terminal using rm. Restarting the system does not help. Relaunching Finder does nothing. Saving text files from other apps works fine. Directory is chmod 755.
If I copy an existing text file into the output directory, rename it to what the file is expected to be named, and let python overwrite the contents, it doesn't work either. The file modification date changes (and I see the file "blink" in Finder) but the contents remain the same. However, the file is not corrupted and opens normally.
If I do the same but delete the text inside of the copied file first, then run the script, python writes no data to the file, I can't open it by double-clicking on it, and I get error -43 again with the odd non-trashing behavior.
The strangest thing is this: if I add another with open() at the end of the script, open the file that was just created and supposedly written to, and print its contents, the contents print. It's as if the file contents are being removed, or the file is being corrupted somehow, when the script ends. I tried closing the file inside the script even though it's not needed, but the same behavior persists.
Code:
Here's the code for writing:
import os

FORMAT = 'utf-8'
OUTPUT_DIR = '/Path/To/SaveFolder'

# as a function
def write_to_file(content, fpath, name):
    the_file = os.path.join(fpath, name)
    with open(the_file, 'w+', encoding=FORMAT) as t:
        t.write(content)

def main():
    print(f" Writing File...\n")
    filename = f"{pcode}_{author}_{title}_text.txt"
    write_to_file(multiline_var, OUTPUT_DIR, filename)

# or hard coded in main()
def main():
    print(f" Writing File...\n")
    filename = f"{pcode}_{author}_{title}_text.txt"
    the_file = os.path.join(OUTPUT_DIR, filename)
    with open(the_file, 'w+', encoding=FORMAT) as t:
        t.write(multiline_var)
I have tried using w, w+, wt, and wt+, with and without encoding='utf-8'.
Here is an example of the multi-line f-string variable:
# using triple quotes
multiline_var = f"""
[PROJ-{pcode}] {full_title} by {author}
{description}
{URL}
{DIVIDER_1}
{TEXT_BLURB}
Some text here and then {SOME_MORE_TEXT}"
{DIVIDER_1}
{SOME_LINK}
"""
# or inside parens
multiline_var = (
    f"[PROJ-{pcode}] {full_title} by {author}\n"
    f"{description}\n\n"
    f"{URL}\n"
    f"{DIVIDER_1}\n"
    f"{TEXT_BLURB}\n\n"
    f"Some text here and then {SOME_MORE_TEXT}\n"
    f"{DIVIDER_1}\n\n"
    f"{SOME_LINK}"
)
Using exiftool on the text file shows the following, so it looks like the data is there but must be corrupted:
File Size : 1797 bytes
File Modification Date/Time : 2021:12:31 15:55:39-05:00
File Access Date/Time : 2021:12:31 15:58:13-05:00
File Inode Change Date/Time : 2021:12:31 15:55:39-05:00
File Permissions : -rw-r--r--
File Type : TXT
File Type Extension : txt
MIME Type : text/plain
MIME Encoding : utf-8
Byte Order Mark : No
Newlines : Unix LF
Line Count : 55
Word Count : 181
Not sure what I'm doing wrong. VS Code shows no syntax errors in the script. There are no errors in Terminal when running the script. Have I made some simple mistake in the above code? Maybe the f-string variable is causing a problem?
Thanks to @bnaecker for leading me to the solution to this problem.
It appeared that when creating/writing to a text file with a long name, Python can corrupt it. Not sure why, as I save long names for images with Python image libraries all the time. Using a short name like "MyFile.txt" it worked just fine, but that was a red herring.
I have updated this post with my journey to the final solution for using the long names that are needed for my project, though I'm not sure why the problem exists.
First Attempts:
So far creating using a short name and then renaming to a long one.... attempts have failed. I did notice that python is locking the file it creates and never unlocks it. Not sure if this is the problem. Setting chflags with os.system('chflags nouchg') command does not work, not even with sudo, and not even in the Terminal doing it manually.
Using os.rename() in Python corrupts the file
Using os.system('mv oldFile.txt newFile.txt') corrupts the file
Manually using mv command in Terminal corrupts the file
Manually changing the filename in the Finder does not (wtf?)
I kept looking for workarounds but nothing did the job.
Round 2:
Progress!
After much tinkering, I discovered a hidden character inside the file. I ran cat /path/longfilename.txt in Terminal, selected and copied the output and pasted into VScode. Here is what I saw:
Somehow a hidden character is getting into the project code number.
Pasting it into a Unicode search engine, it came up as ZERO WIDTH NO-BREAK SPACE, whose UTF-8 encoding is the bytes EF BB BF. However, when pasting this symbol into TextMate it shows up as <U+FEFF>, which is?...
The Byte Order Mark!
Opening a normal utf-8 text file in a hex editor also shows the files starting with EFBBBF for the BOM.
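The relationship between U+FEFF and the bytes EF BB BF can be checked from Python itself. This is a small sketch using only the standard library; the helper names `bom_as_utf8_hex` and `starts_with_bom` are made up for illustration:

```python
import codecs

# U+FEFF is ZERO WIDTH NO-BREAK SPACE, used as the byte order mark.
BOM_CHAR = "\ufeff"


def bom_as_utf8_hex():
    """Return the UTF-8 encoding of U+FEFF as a hex string."""
    return BOM_CHAR.encode("utf-8").hex()


def starts_with_bom(raw):
    """Report whether a bytes object begins with the UTF-8 BOM."""
    return raw.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    print(bom_as_utf8_hex())
    print(starts_with_bom(b"\xef\xbb\xbf12345\n"))
```

Reading a suspect file in binary mode (`open(path, "rb")`) and passing its first bytes to a check like this is one way to confirm a BOM without a hex editor.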
Now, the text file being read and parsed at first has no blank lines to start the file, so I added a line break, and also tried adding some spaces. This time when writing the file I could open it, however, after sending it to the trash, the same behavior occurred and the file was broken again. It seems that because other corrupted versions were in the trash, it added the symbol back to the file name for some reason.
So what appears to be happening, for whatever reason, when Python opens the text file I'm parsing that has no line break at the top, it seems to be grabbing the BOM from the file and adding that to the first variable which is grabbing the first line of the text file. Since that text is a number code that starts the file name, the BOM symbol is being added to the file name as well as the code inside the text file.
Just... wow
The Current Solution:
I have to leave a blank line at the start of the text file that I'm opening and parsing and a simple line break won't do it. I have no idea why this is. I added some spaces for good measure because randomly the BOM would be added to the variable and filename again. So far (knock on wood) as long as the first line of that initial file has some spaces and then a line break, and previous corrupted files have been deleted from the trash, a long file name can be used for all the files I'm creating and writing to without any problems.
This corruption even persists if I remove the encoding flag from both of the open functions I'm using (one to read and parse, the other to create and write).
If anyone knows why this is happening, please share. I've never seen it mentioned before. I'm not sure if it's a python 3.8 bug, a mac OS bug, the way TextMate wrote the original file, or a combination of these.
Correct Solution:
Thanks to @tripleee for the proper way to handle this; I don't remember seeing it before, though I haven't been using Python for very long.
To ignore the BOM, read the text file to be parsed with encoding='utf-8-sig'. That seems to be why it exists. :)
Problem solved.
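A minimal sketch of the difference, assuming a scratch file that starts with a UTF-8 BOM (the file name, its contents, and the helper names are made up for illustration). With plain 'utf-8' the BOM ends up as the first character of the first line; with 'utf-8-sig' it is dropped:

```python
import os
import tempfile


def read_first_line(path, encoding):
    """Return the first line of the file, decoded with the given encoding."""
    with open(path, encoding=encoding) as f:
        return f.readline().rstrip("\n")


def demo_bom():
    # Hypothetical input file: a BOM followed by a project code line.
    path = os.path.join(tempfile.mkdtemp(), "source.txt")
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbf12345\nTitle\n")
    return read_first_line(path, "utf-8"), read_first_line(path, "utf-8-sig")


if __name__ == "__main__":
    with_bom, without_bom = demo_bom()
    print(repr(with_bom), repr(without_bom))
```

'utf-8-sig' also reads files that have no BOM, so it is a reasonable default when the source of the file is not under your control.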

Local Blast empty xml file python

I am trying to write a small script to automate a local BLAST alignment.
I ran the commands in the terminal and they work perfectly. However, when I try to automate this, I get a message like: Empty XML file.
Do we have to add a "system" waiting time to let the file be written, or did I do something wrong?
The code :
import os
from Bio.Blast import NCBIXML

# sequence identifier as key, sequence as value.
for element in dictionnaryOfSequence:
    # I make a little temporary fasta file because the blast command needs a fasta file as input.
    out_fasta = open("tmp.fasta", 'w')
    query = ">" + element + "\n" + str(dictionnaryOfSequence[element])
    out_fasta.write(query)  # And I have this file with my sequence correctly filled
    out_fasta.close()  # EDIT : It was out of my loop....
    # Now the blast command, which works well in the terminal, I have my tmp.xml file well filled.
    os.system("blastn -db reads.fasta -query tmp.fasta -out tmp.xml -outfmt 5 -max_target_seqs 5000")
    # Parsing of the xml file.
    handle = open("tmp.xml", 'r')
    blast_records = NCBIXML.read(handle)
    print blast_records
I get an error: "Your XML file was empty", and the blast_records object doesn't exist.
Did I do something wrong with the handles?
I welcome any advice. Thank you very much for your ideas and help.
EDIT: Problem solved, sorry for the useless question. I handled the file handle incorrectly and did not open the file in the right location. Same thing with the closing.
Sorry.
Try opening the file "tmp.xml" in Internet Explorer. Are all the tags closed?
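The asker's edit points at where the handle was closed. A minimal sketch of why that matters, assuming a small write to a scratch file (the function name is made up for illustration): data written through an open handle can still sit in Python's buffer, so an external program such as blastn may see an empty or partial input file until the handle is closed.

```python
import os
import tempfile


def sizes_before_and_after_close(text):
    """Return the on-disk size of a file while its handle is open and after close."""
    path = os.path.join(tempfile.mkdtemp(), "tmp.fasta")
    handle = open(path, "w")
    handle.write(text)
    # A small write usually stays in the buffer, so the size on disk lags behind.
    size_while_open = os.path.getsize(path)
    handle.close()  # flushes the buffer to disk
    size_after_close = os.path.getsize(path)
    return size_while_open, size_after_close


if __name__ == "__main__":
    print(sizes_before_and_after_close(">seq1\nACGT\n"))
```

Writing the temporary file inside a `with open(...)` block, and only then launching the external command, guarantees the file is complete before the other program reads it.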

python will not let another program write test report to .txt file while its running

I have an automated RAM tester that writes a test report for each module it tests. The RAM tester keeps adding to the test report indefinitely. What I want to do is have Python read the report and look for the word "PASS" and the speed of the RAM.
Once the two words are found, I need Python to write to the serial port and clear the report so there is nothing in the .txt file. That way it is ready to loop around and read the next report from the next module tested.
The code is all written, except that while Python is running, the RAM tester will not write its report to the .txt file. I have created a small program that takes a test report I captured from the RAM tester and writes it to the .txt file every 3 seconds, and that works perfectly.
The program I am working on opens the .txt file, finds the text my other program wrote to it, finds the two keywords, deletes them, and loops around until I close the program, like I want it to. I have done some troubleshooting by commenting out chunks of code, and everything works until it runs the
file = open("yup.txt", "r+")
txt = file.read()
part, then the RAM tester fails to write the report. I think that loop is screwing it up by constantly accessing/reading the .txt file, but I'm not too sure. Also, Python does not crash at all; it just sits there in the loop, so I have no problems as far as that goes.
Here is the code I'm having troubles with:
cache_size = os.lstat("yup.txt").st_size
print '\nsearching for number of characters in cache\n'
time.sleep(2)
if cache_size == 0:
    print ('0 characters found in cache!\n')
    time.sleep(1.5)
    print ('there is no data to process!\n')
    time.sleep(1.5)
    print ('waiting for RAMBot\n')
if cache_size > 0:
    print '%d characters found in cache!' % (cache_size)
    time.sleep(1.5)
    print ('\ndata analysis will now begin\n')
    print('________________________________________________________________________________')
x = 1
while x == 1:
    file = open("yup.txt" , "r+")
    txt = file.read()
    # Both substrings must be tested with "in"; '"PASS" and X in txt' only tests X.
    if "PASS" in txt and '#2x400MHZ' in txt:
        ser.write('4')
        print('DDR2 PC-6400 (800MHz) module detected')
        open('yup.txt' , 'w')
        file.close()
    if "PASS" in txt and '#2x333MHZ' in txt:
        ser.write('3')
        print('DDR2 PC-5300 (667MHz) module detected')
        open('yup.txt' , 'w')
        file.close()
    if "PASS" in txt and '#2x266MHZ' in txt:
        ser.write('2')
        print('DDR2 PC-4200 (533MHz) module detected')
        open('yup.txt' , 'w')
        file.close()
    if "PASS" in txt and '#2x200MHZ' in txt:
        ser.write('1')
        print('DDR2 PC-3200 (400MHz) module detected')
        open('yup.txt' , 'w')
        file.close()
Here is one of the test reports from the RAM tester:
Test No.: 1
Module : DDR2 256Mx72 2GB 2R(8)#2x333MHZ 1.8V
(Tested at 2x400MHz)
Addr.(rowxcol.) : 14 x 10
Data (rankxbit) : 2 x 72
Internal Banks : 8
Burst : Mode=Sequential, Length=8
AC parameters : CL=5, AL=0, Trcd=5, Trp=5
S/N from SPD : a128f4f3
Test Loop # : 1
Test Pattern : wA, wD, mt, mX, mC, mY, S.O.E
## PASS: Loop 1 ##
Elapsed Time : 00:00:53.448
Date : 09/26/2014, 16:07:40
I am not sure if this helps or not, but here is the small program that I wrote to simulate the RAM tester writing its test reports to the .txt file. I am still confused about why this works while the RAM tester has problems writing the test report...
import os
import time
Q = '''Test No.: 1
Module : DDR2 256Mx72 2GB 2R(8)#2x333MHZ 1.8V
(Tested at 2x400MHz)
Addr.(rowxcol.) : 14 x 10
Data (rankxbit) : 2 x 72
Internal Banks : 8
Burst : Mode=Sequential, Length=8
AC parameters : CL=5, AL=0, Trcd=5, Trp=5
S/N from SPD : a128f4f3
Test Loop # : 1
Test Pattern : wA, wD, mt, mX, mC, mY, S.O.E
## PASS: Loop 1 ##
Elapsed Time : 00:00:53.448
Date : 09/26/2014, 16:07:40'''
x = 1
while x == 1:
    host = open('yup.txt' , 'w')
    host.write(Q)
    host.close()
    time.sleep(3)
Thank you very much in advance, I really need to get this to work so it is much appreciated.
The problem is that on Windows, two programs generally can't have the same file open at the same time. When you try to open the file in w or r+ mode, you're asking it to open the file for exclusive access, meaning it will fail if someone else already has the file open, and it will block anyone else from opening the file.
If you want the specifics on sharing and locks in Windows, see the dwShareMode explanation in the CreateFile function on MSDN. (Of course you're not calling CreateFile, you're just using Python's open, which calls CreateFile for you—or, in older versions, calls fopen, which itself calls CreateFile.)
So, how do you work around this?
The simplest thing to do is just not keep the file open. Open the file, write it, and close it again. (Also, since you never write to file, why open it in r+ mode in the first place?)
You will also have to add some code that handles an OSError caused by the race condition of both programs trying to open and write the file at the exact same time, but that's just a simple try:/except: with a loop around it.
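The "open, use, close, and retry on failure" idea could look roughly like this. It is a sketch in Python 3 rather than the Python 2 of the question, and the function name, attempt count, and delay are assumptions chosen for illustration:

```python
import time


def read_with_retry(path, attempts=5, delay=0.5):
    """Open, read and close the file, retrying if the OS refuses access."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            # The handle is held only for the duration of the read.
            with open(path, "r") as f:
                return f.read()
        except OSError as exc:
            # Another program may have the file open right now; wait and retry.
            last_error = exc
            time.sleep(delay)
    raise last_error
```

Keeping the handle open only for the read leaves the other program as large a window as possible to write its report. Note that this retries on any OSError, including a missing file, so a real script may want to narrow the exception it catches.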
Could you just open the file with more permissive sharing?
Sure. You could, for example, use pywin32 to call CreateFile and WriteFile instead of using Python's open and write wrappers, and then you can pass any parameters you want for dwShareMode.
But think about what this means. What happens if both programs try to write the file at the same time? Who wins? If you're lucky, you lose one test output. If you're unlucky, script A blanks the file halfway through script B writing its test output, and you get a garbage file that you can't parse and throw an indecipherable and hard-to-reproduce exception. So, is that really what you want?
Meanwhile, you've got some other weird stuff in your code.
Why are you opening another handle to the same path just to truncate it? Why not just, say, file.truncate(0)? Doing another open while you still have file open in r+ mode means you end up conflicting with yourself, even if no other program was trying to use the same file.
You're also relying on some pretty odd behavior of the file pointer. You've read everything in file. You haven't seeked back to the start, or reopened the file. You've truncated the file and overwritten it with about the same amount of data. So when you read() again, you should get nothing, or maybe a few lines if the test reports aren't always the exact same length. The fact that you're actually getting the whole file is an unexpected consequence of some weird things Windows does in its C stdio library.
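A minimal sketch of reading and then clearing the report through a single handle, as suggested above. It is written in Python 3, and the scratch file and function names are made up for illustration:

```python
import os
import tempfile


def read_and_clear(path):
    """Return the file's contents and leave the file empty, using one handle."""
    with open(path, "r+") as f:
        text = f.read()
        f.seek(0)     # move back to the start before truncating
        f.truncate()  # cut the file off at the current position (0)
    return text


def demo_clear():
    # Hypothetical report file standing in for yup.txt.
    path = os.path.join(tempfile.mkdtemp(), "yup.txt")
    with open(path, "w") as f:
        f.write("## PASS: Loop 1 ##\n")
    text = read_and_clear(path)
    return text, os.path.getsize(path)


if __name__ == "__main__":
    print(demo_clear())
```

Seeking back to the start before truncating keeps the file position and the file length consistent, so a later write through the same handle would begin at offset 0.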

python not writing to file for initial run

I have a super simple code and when run the first time it does not write to the file. But when run a second/multiple times later it writes to the file. The same thing happens when using "w" instead of "a" as well.
It also seems that the file is not closed after fh.close is run because I am unable to delete it - and a message appears saying that python is using the file. Any suggestions? Thanks!
fh = open("hello.txt","a")
fh.write("hello world again")
fh.close
fh.close doesn't call close, it just refers to the function. You need to do fh.close() to call the function.
You need to put the brackets after fh.close, or else you aren't actually calling the function, and if you are running interactively (e.g. with IDLE) the interpreter keeps the file open.
So change your last line to:
fh.close()
James
The other posters are correct.
Also, I would suggest using the "with" statement when working with files, because then they are closed automatically when the with block ends.
with open("hello.txt","a") as fh:
    fh.write("hello world again")
# Code that doesn't use the file continues here
If you use this, you never have to worry about closing your file. Even if runtime errors occur, the file will still always be closed.
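The difference between the three forms can be observed through the file object's `closed` attribute. This is a small sketch using a scratch file in a temporary directory (the path and function name are made up for illustration):

```python
import os
import tempfile


def demo_close():
    """Compare fh.close, fh.close() and a with block via the closed attribute."""
    path = os.path.join(tempfile.mkdtemp(), "hello.txt")

    fh = open(path, "a")
    fh.write("hello world again")
    fh.close  # only looks the method up; nothing is called
    still_open = not fh.closed

    fh.close()  # actually closes the file
    closed_after_call = fh.closed

    with open(path, "a") as fh2:
        fh2.write("!")
    closed_after_with = fh2.closed

    return still_open, closed_after_call, closed_after_with


if __name__ == "__main__":
    print(demo_close())
```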

Problem with exiting a Word doc using Python

This is my first time using this, so be kind :) I am making a program that opens many Microsoft Word 2007 docs, reads from a certain table in each document, and writes that info to an Excel file. There are well in excess of 1000 Word docs. I have all of this working, but when I run my code it does not close MS Word after opening each doc; I have to do this manually at the end of the program run by opening Word and selecting the Exit Word option from the Home menu. Another problem is that if I run this program consecutively, on the second run everything goes to hell: it prints the same thing repeatedly no matter which doc is selected. I think this may have to do with how MS Word decides which doc is active, e.g. is it still opening the last active document that was not closed from the last run? Anyway, here is my code for the opening and closing part; I won't bore you guys with the rest:
MSWord = win32com.client.Dispatch("Word.Application")
MSWord.Visible = 0
# Open a specific file
#myWordDoc = tkFileDialog.askopenfilename()
MSWord.Documents.Open("C:\\Documents and Settings\\fdosier" + chosen_doc)
#Get the textual content
docText = MSWord.Documents[0].Content
charText = MSWord.Documents[0].Characters
# Get a list of tables
ListTables = MSWord.Documents[0].Tables
------Main Code---------
MSWord.Documents.Close
MSWord.Documents.Quit
del MSWord
Basically, Python is not VBA, so this:
MSWord.Documents.Close
is equivalent to:
getattr(MSWord.Documents, "Close")
i.e. you just get some method object and do nothing with it. You need to call the method with the call operator (the parentheses :) :
MSWord.Documents.Close()
Accordingly for .Quit.
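The same point can be shown without Word installed. The class below is a hypothetical stand-in written for this illustration; it is not part of win32com and only mimics an object with a Quit method:

```python
class FakeWordApp:
    """Hypothetical stand-in for a COM application object (not win32com)."""

    def __init__(self):
        self.running = True

    def Quit(self):
        self.running = False


def demo_quit():
    app = FakeWordApp()
    app.Quit  # attribute lookup only; the method is never called
    running_after_reference = app.running
    app.Quit()  # the call operator actually runs the method
    running_after_call = app.running
    return running_after_reference, running_after_call


if __name__ == "__main__":
    print(demo_quit())
```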
Before your MSWord.Quit, did you try using:
MSWord.ActiveWindow.Close()
Or even more simply, just doing:
MSWord.Quit()
I don't really understand whether you are trying to close a document or the application.
I think you need an MSWord.Quit() at the end (before and/or instead of the del).
