File content is not read without seek in Python

In my case, I write some content to a file as a bytearray and then try to read back what I have written. The problem is that if I don't call seek, the content read is empty. My understanding was that by default the reference point is at the beginning of the file, which is equivalent to seek(0). Please help me understand this problem. Both scenarios are shown below.
Without seek command
filename = "my_file"
Arr = [0x1, 0x2]
file_handle = open(filename, "wb+")
binary_format = bytearray(Arr)
file_handle.write(binary_format)
#file_handle.seek(0) #Here commenting the seek(0) part
print("file_handle-",file_handle.read())
file_handle.close()
Output in the console
file_handle- b''
With seek command
filename = "my_file"
Arr = [0x1, 0x2]
file_handle = open(filename, "wb+")
binary_format = bytearray(Arr)
file_handle.write(binary_format)
file_handle.seek(0)
print("file_handle-",file_handle.read())
file_handle.close()
Output in the console is
file_handle- b'\x01\x02'
Is seek(0) mandatory here, even if by default the position points to the beginning of the file?
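One way to check the assumption in the question is to print the file position with tell(). The sketch below uses a hypothetical scratch path created with tempfile (not the asker's file). It suggests that the position does start at 0, but that write() moves it past the written bytes, which would explain why read() returns b'' until seek(0) is called.

```python
import os
import tempfile

# Hypothetical scratch file so the demo does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "my_file")

with open(path, "wb+") as fh:
    start = fh.tell()            # position right after opening
    fh.write(bytearray([0x1, 0x2]))
    after_write = fh.tell()      # position after writing two bytes
    at_end = fh.read()           # reads from the current position onward
    fh.seek(0)                   # rewind to the first byte
    from_start = fh.read()       # now the written bytes are returned

print(start, after_write, at_end, from_start)  # 0 2 b'' b'\x01\x02'
```

So the default position is only "the beginning" at the moment the file is opened; every read or write moves it.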

Related

How can I manipulate a txt file to be all in lowercase in python?

Let's say that I have a txt file that I have to get all in lowercase. I tried this
def lowercase_txt(file):
    file = file.casefold()
    with open(file, encoding = "utf8") as f:
        f.read()
Here I get "'str' object has no attribute 'read'"
then I tried
def lowercase_txt(file):
    with open(poem_filename, encoding="utf8") as f:
        f = f.casefold()
        f.read()
and here '_io.TextIOWrapper' object has no attribute 'casefold'
What can I do?
EDIT: I re-ran this exact code and now there are no errors (dunno why), but the file doesn't change at all; all the letters stay the way they are.
This will rewrite the file. Warning: if there is some type of error in the middle of processing (power failure, you spill coffee on your computer, etc.) you could lose your file. So, you might want to first make a backup of your file:
def lowercase_txt(file_name):
    """
    file_name is the full path to the file to be opened
    """
    with open(file_name, 'r', encoding = "utf8") as f:
        contents = f.read() # read contents of file
    contents = contents.lower() # convert to lower case
    with open(file_name, 'w', encoding = "utf8") as f: # open for output
        f.write(contents)
For example:
lowercase_txt('/mydirectory/test_file.txt')
Update
The following version opens the file for reading and writing. After the file is read, the file position is reset to the start of the file before the contents are rewritten. This might be a safer option.
def lowercase_txt(file_name):
    """
    file_name is the full path to the file to be opened
    """
    with open(file_name, 'r+', encoding = "utf8") as f:
        contents = f.read() # read contents of file
        contents = contents.lower() # convert to lower case
        f.seek(0, 0) # position back to start of file
        f.write(contents)
        f.truncate() # in case new encoded content is shorter than older
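If losing the file on a mid-write failure is the main worry, another approach (not taken from the answer above, so treat it as a sketch) is to write the new contents to a temporary file in the same directory and then swap it in with os.replace. The function name lowercase_txt_atomic is made up for this example.

```python
import os
import tempfile

def lowercase_txt_atomic(file_name):
    """Hypothetical variant: write lowercased text to a temp file, then swap it in."""
    with open(file_name, "r", encoding="utf8") as f:
        contents = f.read().lower()
    # Keep the temp file in the same directory so the final rename
    # does not have to cross filesystems.
    directory = os.path.dirname(os.path.abspath(file_name))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf8") as tmp:
            tmp.write(contents)
        os.replace(tmp_path, file_name)  # rename over the original in one step
    except BaseException:
        os.remove(tmp_path)  # clean up the temp file if anything failed
        raise
```

With this sketch the original stays untouched until the new contents are fully written. One trade-off to be aware of: the replacement file is created by mkstemp, so it may not keep the original file's permissions.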

How to make a program that replaces newlines in python file with a string [duplicate]

This question already has answers here:
Why doesn't calling a string method (such as .replace or .strip) modify (mutate) the string?
(3 answers)
Closed 3 years ago.
I am trying to display my Python file in HTML, so I would like to replace every newline in the file with <br>, but the program I've written is not working.
I've looked on here and tried changing the code around a bit. I have gotten different results, but not the ones I need.
with open(path, "r+") as file:
    contents = file.read()
contents.replace("\n", "<br>")
print(contents)
file.close()
I want the file to contain <br> wherever there is a new line, but instead the code doesn't change anything in the file.
Here is an example program that works:
path = "example"
contents = ""
with open(path, "r") as file:
    contents = file.read()
new_contents = contents.replace("\n", "<br>")
with open(path, "w") as file:
    file.write(new_contents)
Your program doesn't work because the replace method does not modify the original string; it returns a new string.
Also, you need to write the new string to the file; python won't do it automatically.
Hope this helps :)
P.S. a with statement automatically closes the file stream.
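The point about replace returning a new string can be seen without any file at all. A minimal sketch, using a made-up two-line string:

```python
contents = "line one\nline two\n"

contents.replace("\n", "<br>")                 # result is discarded; contents is unchanged
unchanged = contents

new_contents = contents.replace("\n", "<br>")  # keep the returned string instead

print(repr(unchanged))     # 'line one\nline two\n'
print(repr(new_contents))  # 'line one<br>line two<br>'
```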
Your code reads from the file, saves the contents to a variable and replaces the newlines. But the result is not saved anywhere. And to write the result into a file you must open the file for writing.
with open(path, "r+") as file:
    contents = file.read()
contents = contents.replace("\n", "<br>")
with open(path, "w+") as file:
    file.write(contents)
There are some issues in this code snippet.
contents.replace("\n", "<br>") returns a new object in which \n is replaced with <br>, so you can use html_contents = contents.replace("\n", "<br>") and then print(html_contents).
When you use with, the file is closed automatically when execution leaves the indented block.
Try this:
import re
with open(path, "r") as f:
    contents = f.read()
contents = re.sub("\n", "<br>", contents)
print(contents)
Borrowed from this post:
import tempfile
def modify_file(filename):
    #Create temporary file read/write
    t = tempfile.NamedTemporaryFile(mode="r+")
    #Open input file read-only
    i = open(filename, 'r')
    #Copy input file to temporary file, modifying as we go
    for line in i:
        t.write(line.rstrip()+"\n")
    i.close() #Close input file
    t.seek(0) #Rewind temporary file to beginning
    o = open(filename, "w") #Reopen input file writable
    #Overwriting original file with temporary file contents
    for line in t:
        o.write(line)
    o.close() #Close output file so the data is flushed to disk
    t.close() #Close temporary file, will cause it to be deleted

Unable to find string in text file

I am trying to simply find whether a string exists in a text file, but I am having issues. I assume something is on the wrong line, but I am baffled.
def extract(mPath, frequency):
    if not os.path.exists('history.db'):
        f = open("history.db", "w+")
        f.close()
    for cFile in fileList:
        with open('history.db', "a+") as f:
            if cFile in f.read():
                print("File found - skip")
            else:
                #with ZipFile(cFile, 'r') as zip_ref:
                    #zip_ref.extractall(mPath)
                print("File Not Found")
                f.writelines(cFile + "\n")
                print(cFile)
Output:
File Not Found
C:\Users\jefhill\Desktop\Python Stuff\Projects\autoExtract\Test1.zip
File Not Found
C:\Users\jefhill\Desktop\Python Stuff\Projects\autoExtract\test2.zip
Text within the history.db file:
C:\Users\jefhill\Desktop\Python Stuff\Projects\autoExtract\Test1.zip
C:\Users\jefhill\Desktop\Python Stuff\Projects\autoExtract\test2.zip
What am I missing? Thanks in advance
Note: cFile is the file path shown in the output and fileList is the list of both the paths from the output.
You're using the wrong flags for what you want to do. open(file, 'a') opens a file for append-writing, meaning that it seeks to the end of the file. Adding the + modifier means that you can also read from the file, but you're doing so from the end of the file; so read() returns nothing, because there's nothing beyond the end of the file.
You can use r+ to read from the start of the file while having the option of writing to it. But keep in mind that anytime you write you'll be writing to the reader's current position in the file.
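The behavior described above can be checked with a small experiment. This is only a sketch: it uses a hypothetical scratch file created with tempfile in place of the real history.db, and made-up file names as the stored lines.

```python
import os
import tempfile

# Hypothetical scratch file standing in for history.db.
path = os.path.join(tempfile.mkdtemp(), "history.db")
with open(path, "w") as f:
    f.write("Test1.zip\n")

with open(path, "a+") as f:
    at_end = f.read()       # '': the position starts at the end of the file
    f.seek(0)               # move the read position to the start
    from_start = f.read()   # 'Test1.zip\n'
    f.write("test2.zip\n")  # the position is at the end again, so this appends

with open(path) as f:
    final = f.read()

print(repr(at_end), repr(from_start), repr(final))
```

So if "a+" is kept, calling f.seek(0) before f.read() should be enough for the membership check to see the existing lines.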
I haven't tested the code but this should put you on the right track!
def extract(mPath, frequency):
    if not os.path.exists('history.db'):
        f = open("history.db", "w+")
        f.close()
    with open('history.db', "r") as f: # text mode, so lines compare equal to str paths
        data = f.readlines()
    for line in data:
        if line.rstrip() in fileList: #assuming fileList is a list of strings
            pass #do everything else here

How can I create, read, and write files in Python?

All the tutorials I can find follow the same format, which isn't working. I don't get an error message, but I don't get normal output. What I get appears to be the file description at some memory location.
# file_test
ftpr= open("file","w")
ftpr.write("This is a sample line/n")
a=open("file","r")
print a
#This is the result
<open file 'file', mode 'r' at 0x00000000029DDDB0>
>>>
Do you want to read the contents of the file? Try print(a.readlines()).
I.e.:
with open('file', 'w') as f:
    f.write("Hello, world!\nGoodbye, world!\n")
with open('file', 'r') as f:
    print(f.readlines()) # ['Hello, world!\n', 'Goodbye, world!\n']
FYI, the with blocks, if you're unfamiliar with them, ensure that the open()-d files are close()-d.
This is not the correct way to read the file. You are printing the return value of the open call, which is a file object. Read and write like this instead.
For writing:
f=open("myfile","w")
f.write("hello\n")
f.write("This is a sample line\n")
f.close()
For reading:
f=open("myfile","r")
string = f.read()
print(string)
f.close()

Slow Python file I/O; Ruby runs better than this; got the wrong language?

Please advise - I'm going to use this as a learning point. I'm a beginner.
I'm splitting a 25 MB file into several smaller files.
A kindly guru here gave me a Ruby script. It works beautifully fast. So, in order to learn, I mimicked it with a Python script. This runs like a three-legged cat (slow). I wonder if anyone can tell me why?
My python script
##split a file into smaller files
###########################################
def splitlines (file) :
    fileNo=0001
    outFile=open("C:\\Users\\dunner7\\Desktop\\Textomics\\Media\\LexisNexus\\ele\\newdocs\%s.txt" % fileNo, 'a') ## open file to append
    fh = open(file, "r") ## open the file for reading
    mylines = fh.readlines() ### read in lines
    for line in mylines: ## for each line
        if re.search("Copyright ", line): # if the line is equal to the regex
            outFile.close() ## close the file
            fileNo +=1 #and add one to the filename, starting to read lines in again
        else: # otherwise
            outFile=open("C:\\Users\\dunner7\\Desktop\\Textomics\\Media\\LexisNexus\\ele\\newdocs\%s.txt" % fileNo, 'a') ## open file to append
            outFile.write(line) ## then append it to the open outFile
    fh.close()
The guru's Ruby 1.9 script
g=0001
f=File.open(g.to_s + ".txt","w")
open("corpus1.txt").each do |line|
  if line[/\d+ of \d+ DOCUMENTS/]
    f.close
    f=File.open(g.to_s + ".txt","w")
    g+=1
  end
  f.print line
end
There are many reasons why your script is slow -- the main reason being that you reopen the output file for almost every line you write. Since the old file gets implicitly closed on opening a new one (due to Python garbage collection), the write buffer is flushed for every single line you write, which is quite expensive.
A cleaned up and corrected version of your script would be
def file_generator():
    file_no = 1
    while True:
        f = open(r"C:\Users\dunner7\Desktop\Textomics\Media"
                 r"\LexisNexus\ele\newdocs\%s.txt" % file_no, 'a')
        yield f
        f.close()
        file_no += 1

def splitlines(filename):
    files = file_generator()
    out_file = next(files)
    with open(filename) as in_file:
        for line in in_file:
            if "Copyright " in line:
                out_file = next(files)
            out_file.write(line)
    out_file.close()
I guess the reason your script is so slow is that you open a new file descriptor for each line. If you look at your guru's ruby script, it closes and opens the output file only if your separator matches.
In contrast to that, your python script opens a new file descriptor for every line you read (and btw, does not close them). Opening a file requires talking to the kernel, so this is relatively slow.
Another change I would suggest is to change
fh = open(file, "r") ## open the file for reading
mylines = fh.readlines() ### read in lines
for line in mylines: ## for each line
to
fh = open(file, "r")
for line in fh:
With this change, you do not read the whole file into memory, but only block after block. Although it should not matter with a 25MiB file, it will hurt you with big files and is good practice (and less code ;)).
The Python code might be slow due to regex and not IO. Try
import re

def splitlines(file):
    fileNo = 1
    outFile = open("newdocs/%s.txt" % fileNo, 'a')  ## open file to append
    reg = re.compile("Copyright ")
    for line in open(file, "r"):
        if reg.search(line):  # if the line matches the regex
            outFile.close()  ## close the file
            fileNo += 1  # move on to the next file number
            outFile = open("newdocs/%s.txt" % fileNo, 'a')  ## open file to append
        outFile.write(line)  ## then append it to the open outFile
    outFile.close()
Several notes
Always use / instead of \ for path name
If regex is used repeatedly, compile it
Do you need re.search? or re.match?
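To make the last two notes concrete, here is a small sketch of compiling the pattern once and of how search and match differ. The sample line is made up for illustration and is not taken from the asker's corpus.

```python
import re

reg = re.compile("Copyright ")  # compile once, reuse for every line

# Made-up sample line where the keyword is not at the start.
line = "LOAD-DATE: 2010  Copyright 2010 Example Press\n"

found_anywhere = reg.search(line) is not None  # search scans the whole line
found_at_start = reg.match(line) is not None   # match only tries position 0

print(found_anywhere, found_at_start)  # True False
```

For a fixed substring like this one, a plain "Copyright " in line test gives the same answer as search without involving the regex engine at all.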
UPDATE:
@Ed. S: point taken
@Winston Ewert: code updated to be closer to the original Ruby code
rosser,
Don't use names of built-in objects as identifiers in your code (file, splitlines).
The following code preserves the behavior of your own code: an out_file is closed without writing the line containing 'Copyright ' that signals the close.
writelines() is used in the hope of faster execution than repeated calls to out_file.write(line).
The if li: block is there to trigger the closing of out_file in case the last line of the read file doesn't contain 'Copyright '.
def splitfile(filename, wordstop, destrep, file_no = 1, li = None):
    if li is None:  # avoid sharing one mutable default list between calls
        li = []
    with open(filename) as in_file:
        for line in in_file:
            if wordstop in line:
                with open(destrep+str(file_no)+'.txt','w') as f:
                    f.writelines(li)
                file_no += 1
                li = []
            else:
                li.append(line)
    if li:
        with open(destrep+str(file_no)+'.txt','w') as f:
            f.writelines(li)
