python - open, seek, write to a file, from another file - python

I guess I am doing something wrong.
I am not sure what it is though, but I keep getting TypeError: expected a character buffer object
I just want to open a file, seek to certain offsets and overwrite data from patch1 and patch2.
Here is the code I am using, please help me and show me what I am doing wrong:
patch1 = open("patch1", "r");
patch2 = open("patch2", "r");
main = open("patchthis.bin", "w");
main.seek(0xC0010);
main.write(patch1);
main.seek(0x7C0010);
main.write(patch1);
main.seek(0x40000);
main.write(patch2);
main.close();
I am a noob when it comes to file handling with Python, even though I have read up about it.
I really want to start learning more, but I need some good examples and any help sure would be appreciated :)

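You are trying to write a file object into the file, not a string. Read the contents first and write those instead:
patch1_text = patch1.read()
main.write(patch1_text)
and so on for patch2.
Also use a with statement when operating on files. It closes the file for you when the block ends, so no explicit close() is needed:
with open('patch1', 'r') as patch1:
    patch1_text = patch1.read()
And don't put semicolons at the end of lines; Python doesn't need them.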
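Putting this together, a minimal sketch of the whole patching script might look like the following. Two things here are assumptions beyond the answer above: the files are opened in binary mode, since patchthis.bin is presumably binary data, and the target is opened with "r+b" rather than "w", because "w" truncates the file to zero length before anything is written. The helper name apply_patches is made up for illustration.

```python
def apply_patches(target_path, patches):
    """Overwrite bytes in target_path in place.

    patches is a list of (offset, patch_path) pairs.
    """
    # "r+b" opens an existing file for reading and writing without truncating it
    with open(target_path, "r+b") as main:
        for offset, patch_path in patches:
            with open(patch_path, "rb") as patch:
                data = patch.read()
            main.seek(offset)   # jump to the absolute byte offset
            main.write(data)    # overwrites len(data) bytes starting there
```

With the file names from the question, the call would be apply_patches("patchthis.bin", [(0xC0010, "patch1"), (0x7C0010, "patch1"), (0x40000, "patch2")]).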

python script does not rewrite the file on itself

so first and foremost i wanted to connect multiple lines into one and add ","
so in example
line1
line2
line3
to
line1,line2,line3
i managed to make it work with this script right here
filelink = input("Enter link here ")
fix = file = open(filelink, "r")
data=open(filelink).readlines()
for n,line in enumerate(data):
    if line.startswith("line"):
        data[n] = "\n"+line.rstrip()
    else:
        data[n]=line.rstrip()
print(','.join(data))
However, the terminal shows the expected output, but the text file itself remains the same: no joined lines and no commas.
Side note: I would love an explanation of how the loop works, what "enumerate" does, and why the code is written this way. I tried googling each piece separately to understand the code, but I didn't find what I was looking for. If anyone is keen to explain the code briefly, line by line, I would be very appreciative.
Thanks in advance <3
This is somewhat superfluous:
fix = file = open(filelink, "r")
That's assigning two names to the same file object, and you don't even use fix, so at least drop that part.
For handling files, you would be better off using a context manager. That means that you can open a resource and it will automatically get closed for you once you're done (usually).
In any case, you opened the file in read mode with open(filelink, "r"), so you'll never change the file contents. print(','.join(data)) will probably show you what you expect, but print() writes to stdout, so the change will only appear in your terminal. You will not modify the underlying file this way. But I think you're sufficiently close that I'll try to fill in the missing piece.
In this case, you need to:
1. Open the file first in read mode to pull the data out
2. Do the transform to the data in Python
3. Open the file again in write mode (which wipes the existing contents)
4. Write the transformed data
So, like this:
filelink = input("Enter link here ")
with open(filelink) as infile: # context manager, by default in "r" mode
    data = [item.strip() for item in infile.readlines()]
data = ','.join(data)
# Now write it back out
with open(filelink, "w") as outfile:
    outfile.write(data)
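Since the question also asks what enumerate does: it pairs each item of a sequence with its index, which is why the original loop can assign back into data[n]. A small sketch with made-up sample lines:

```python
data = ["line1\n", "line2\n", "line3\n"]  # made-up sample, as readlines() would return it

# enumerate yields (index, item) pairs: (0, "line1\n"), (1, "line2\n"), ...
for n, line in enumerate(data):
    data[n] = line.rstrip()  # replace the item at position n with a stripped copy

joined = ",".join(data)
print(joined)  # line1,line2,line3
```

The list comprehension in the answer does the same stripping in one step, without needing the index.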

Opening, edit/rewrite string, save back to a new or same file

I want to open a file, decode the format of data (from base64 to ASCII), rewrite or save the decoded string, either back to the same file, or new one.
I have it opening, reading, decoding (and printing as a test) the decoded base64 string into readable format (ASCII I believe)
My goal is to now save this output to: either a "newfile.txt" document or back to the original "test.mcz" file ready for the next steps of my mission...
I know there are great online base64 decoders and they do work well for what I am doing - I use them often, but my goal is to write my own program as a learning exercise more than anything (also when my internet plays up I need an offline program)
Here's where I am so far (the original file is .mcz format it is a game save)
# PYTHON 3
import base64
f = open('test.mcz', 'r')
f_read = f.read()
# print(f_read) # was just as a test
new_f_read = base64.b64decode(f_read)
print (new_f_read)
This prints a butt-load of readable code that is what I need, but I don't want to have to just copy and paste this output from the Python shell into another editor, I want to save it to a file...for convenience.
Either back into the same test.mcz (I will be re-encoding to base64 again later on anyway) or to a new file - thus leaving my original as it was.
The problem arises when I want to save the decoded output stored in the new_f_read variable. It's been a headache: before I started I could visualise how it needed to be written, but I got tripped up when I had to switch it all over to Python 3 (don't ask...), and I have tried so many variations from online examples that I wouldn't know where to start explaining what I've tried so far. I can't open the original file as both "r" AND "w" together, and once I've opened and decoded it I can't reopen the original file as "w" because that just wipes the contents (which are still encoded anyway).
I think I need to write functions to handle:
1. Open, read, save string to a variable
2. Manipulate string - decode
3. Write the new string to new or existing file
Sounds easy I know, but I am stuck...so here I am. If anyone shows examples, please take the time to explain what is going on, it seems pointless to me having code I don't understand. Apologies if this seems like a simple thing, help would be appreciated..Thanks
First, you can absolutely open a file for both reading and writing without truncating the contents. That's what the r+ mode is for (see https://docs.python.org/3/library/functions.html#open). If you do this, the model is (a) open the file, (b) read it, (c) seek back to the beginning with e.g. f.seek(0), (d) write it.
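A minimal sketch of that r+ pattern, assuming binary mode ("r+b") because b64decode returns bytes; the function name decode_in_place is made up for illustration. The truncate() call is an addition to the steps listed: the decoded data is shorter than the base64 text, so without it the tail of the old contents would be left behind.

```python
import base64

def decode_in_place(path):
    with open(path, "r+b") as f:      # read and write, no truncation on open
        encoded = f.read()            # (b) read everything
        decoded = base64.b64decode(encoded)
        f.seek(0)                     # (c) back to the beginning
        f.write(decoded)              # (d) overwrite from the start
        f.truncate()                  # cut the file off at the current position
```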
Secondly, you can simply open the file, read it, then close the file, and then reopen it, write it, and close it again, like this:
import base64
# open the file for reading, read the data, then close the file
with open('test.mcz', 'rb') as f:
    f_read = f.read()
new_f_read = base64.b64decode(f_read)
# open the file for writing, write the data, then close the file
with open('test.mcz', 'wb') as f:
    f.write(new_f_read)
This is probably the easiest solution.
The easiest thing is to open a read file handle first, close it, and then open a write handle. Read/write handles are complicated because they have to keep a pointer to where you are in the file, and they add overhead that you don't need here. You could do it if you wanted, but it's a waste of time in this case.
Using the with statement to open files is recommended, since the file will automatically close when you leave the with block.
import base64
with open('test.mcz', 'r') as f:
    decoded = base64.b64decode(f.read())
with open('test.mcz', 'wb') as f:
    f.write(decoded)
This is roughly the same as
import base64
f = open('test.mcz', 'r')
decoded = base64.b64decode(f.read())
f.close()
f = open('test.mcz', 'wb')
f.write(decoded)
f.close()

Why won't a single line print from a file?

As part of a bigger project, I would simply like to make sure that a file can be opened and Python can read and use it. So after I opened up the txt file, I said:
data = txtfile.read()
first_line = data.split('\n',1)[0]
print(first_line)
I also tried
print(f1.readline())
where f1 is the txt file. This, again, did nothing.
I am using the spyder IDE, and it just says running file, and doesn't print anything. Is it because my file is too large? It is 4.6 gigs.
Does anyone have any idea what's going on?
"and it just says running file, and doesn't print anything. Is it because my file is too large? It is 4.6 gigs."
Yes.
data = txtfile.read()
This call reads the entire file into memory. Since you stated that the file is 4.6GB, it is going to take a long time to load the whole file and then split it on the newline character.
See this: Read large text files in Python
I don't know your use case, but if you can process the file line by line, it would be simpler. Even reading in chunks would be simpler than reading the entire file.
with open('myfile.txt', 'r') as f:
    first_line = f.readline()
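As a sketch of the line-by-line approach: iterating over the file object reads one line at a time, so memory use stays small however large the file is. The function name count_matching and the predicate argument are made up for illustration.

```python
def count_matching(path, predicate):
    """Count lines for which predicate(line) is true, without loading the whole file."""
    count = 0
    with open(path, "r") as f:
        for line in f:          # the file object yields one line per iteration
            if predicate(line):
                count += 1
    return count
```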

Python open() modes and file writing

I'm learning PyGTK and I'm making a Text Editor (That seems to be the hello world of pygtk :])
Anyways, I have a "Save" function that writes the TextBuffer to a file. Looks something like
try:
    f = open(self.working_file_path, "rw+")
    buff = self._get_buffer()
    f.write(self._get_text())
    #update modified flag
    buff.set_modified(False)
    f.close()
except IOError as e:
    print "File Doesnt Exist so bring up Save As..."
    ......
Basically, if the file exists, write the buffer to it; if not, bring up the Save As dialog.
My question is: what is the best way to "update" a file? I seem to only be able to append to the end of a file. I've tried various file modes, but I'm sure I'm missing something.
Thanks in advance!
You can open a file in "r+" mode, which allows you to both read and write to the file, and to seek to particular positions and write there. This probably doesn't help you do what I think you want though; it sounds like you're wanting to only write out the changed data?
Remember that on the disk the file isn't stored as a series of extensible lines, it's just a sequence of bytes; some of those bytes indicate line-endings, but the next line follows on immediately. So if you edit the first line in the file and you write the new first line out, unless the new one happens to be exactly the same length as the old one the second line now won't be in the right place, so you'll need to move it (and have taken a copy of it first if the new line you wrote out was longer than the original). And this now means that the next line isn't in the right position either... and so on until you've had to read in and write out the entire rest of the file.
In practice you almost never write only part of an existing file unless you can simply append more data; if you need to "alter" a file you read it in, alter it in memory, and write it back out or you read in the file in pieces (often line by line) and then write out to a new file as you go (and then possibly move the new file over the top of the original). The first approach is easiest, the second is better for not having to hold the whole thing in memory at once.
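The second approach (read in pieces, write to a new file, then move it over the original) could be sketched as follows. This is only an illustration: the function name rewrite_lines and the transform callback are hypothetical, and the temporary file is created next to the original on the assumption that both should live on the same filesystem.

```python
import os
import tempfile

def rewrite_lines(path, transform):
    """Stream path through transform(line) into a temp file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # create the temp file next to the original so os.replace stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as out, open(path, "r") as src:
            for line in src:                 # only one line is held in memory at a time
                out.write(transform(line))
        os.replace(tmp_path, path)           # move the new file over the original
    except BaseException:
        os.remove(tmp_path)                  # clean up the partial temp file
        raise
```

Because the original is only replaced after the new file is completely written, a failure partway through leaves the original untouched.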
At the point where you write to the file, your location is at the end of the file, so you need to seek back to the beginning. Then, you will overwrite the file, but this may leave old content at the end, so you also need to truncate the file.
Additionally, the mode you're specifying ('rw+') is invalid, and I get IOErrors when I try to do some operations on files opened with it. I believe that you want mode 'r+' ("Open for reading and writing. The stream is positioned at the beginning of the file."). 'w+' also opens for reading and writing, but it truncates an existing file to zero length and creates the file if it doesn't exist.
So, what you're looking for might be code like this:
try:
    f = open(self.working_file_path, "r+")
    buff = self._get_buffer()
    f.seek(0)
    f.truncate()
    f.write(self._get_text())
    #update modified flag
    buff.set_modified(False)
    f.close()
except IOError as e:
    print "File Doesnt Exist so bring up Save As..."
    ......
However, you may want to modify this code to correctly catch and handle errors while truncating and writing the file, rather than assuming that all IOErrors in this section are nonexistent-file errors from the call to open.
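One way to separate the two cases, sketched here in Python 3, where a missing file raises FileNotFoundError (a subclass of OSError). The function name save_text and its return convention are made up for illustration and are not part of the original code.

```python
def save_text(path, text):
    """Overwrite an existing file with text; return False if the file is missing."""
    try:
        f = open(path, "r+")
    except FileNotFoundError:
        return False          # caller can bring up the Save As dialog
    with f:
        f.seek(0)
        f.truncate()          # drop the old contents
        f.write(text)         # other I/O errors propagate to the caller
    return True
```

Only the open() call sits inside the try block, so an error during truncate() or write() is not mistaken for a missing file.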
Read the file in as a list, add an element to the start of it, write it all out. Something like this.
with open(self.working_file_path, "r+") as f:
    flist = f.readlines()
    flist.insert(0, self._get_text())
    f.seek(0)
    f.writelines(flist)

How do I remove lines from a big file in Python, within limited environment

Say I have an Ubuntu VPS in the USA with a 10GB HDD (and I live somewhere else), and I have a 9GB text file on the hard drive. I have 512MB of RAM, and about the same amount of swap.
Given the fact that I cannot add more HDD space and cannot move the file to somewhere else to process, is there an efficient method to remove some lines from the file using Python (preferably, but any other language will be acceptable)?
How about this? It edits the file in place. I've tested it on some small text files (in Python 2.6.1), but I'm not sure how well it will perform on massive files because of all the jumping around, but still...
I've used an infinite while loop with a manual EOF check, because for line in f: didn't work correctly (presumably all the jumping around messes up the normal iteration). There may be a better way to check this, but I'm relatively new to Python, so someone please let me know if there is.
Also, you'll need to define the function isRequired(line).
writeLoc = 0
readLoc = 0
with open( "filename" , "r+" ) as f:
    while True:
        line = f.readline()
        #manual EOF check; not sure of the correct
        #Python way to do this manually...
        if line == "":
            break
        #save how far we've read
        readLoc = f.tell()
        #if we need this line write it and
        #update the write location
        if isRequired(line):
            f.seek( writeLoc )
            f.write( line )
            writeLoc = f.tell()
            f.seek( readLoc )
    #finally, chop off the rest of file that's no longer needed
    f.truncate( writeLoc )
Try this:
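currentReadPos = 0
removedLinesLength = 0
for line in file:
    currentReadPos = file.tell()
    if remove(line):
        removedLinesLength += len(line)
    else:
        # move back to where this line should start once the removed lines are gone
        file.seek(currentReadPos - len(line) - removedLinesLength)
        file.write(line)  # line already ends with "\n"
        file.flush()
        file.seek(currentReadPos)
I have not run this, but the idea is to modify the file in place by overwriting the lines you want to remove with the lines you want to keep. I am not sure how seeking and writing interact with iterating over the file. In Python 3, calling tell() inside a for line in file loop on a text file raises an error, so a readline() loop may be needed instead. The leftover tail also has to be cut off with truncate() once the loop finishes.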
Update:
I have tried fileinput with inplace by creating a 1GB file. What I expected was different from what happened. I read the documentation properly this time.
"Optional in-place filtering: if the keyword argument inplace=1 is passed to fileinput.input() or to the FileInput constructor, the file is moved to a backup file and standard output is directed to the input file (if a file of the same name as the backup file already exists, it will be replaced silently)."
from docs/fileinput
So, this doesn't seem to be an option now for you. Please check other answers.
Before Edit:
If you are looking to edit the file in place, then check out Python's fileinput module - Docs.
I am really not sure about its efficiency when used with a 10GB file. But, to me, this seemed to be the only option you have using Python.
Just sequentially read and write to the files.
"f.readlines() returns a list containing all the lines of data in the file. If given an optional parameter sizehint, it reads that many bytes from the file and enough more to complete a line, and returns the lines from that. This is often used to allow efficient reading of a large file by lines, but without having to load the entire file in memory. Only complete lines will be returned."
Source
Process the file in chunks of 10 to 20 MB or more.
This would be the fastest way.
Another way of doing this is to stream the file and filter it using AWK, for example.
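Example sketch in Python (keep() is a placeholder for whatever test decides that a line stays):
def process_file(path, keep, chunk_bytes=20 * 1024 * 1024):
    read_offset = 0
    write_offset = 0
    with open(path, "r+b") as f:
        while True:
            f.seek(read_offset)
            # readlines(hint) returns whole lines totalling roughly chunk_bytes
            lines = f.readlines(chunk_bytes)
            if not lines:
                break
            read_offset = f.tell()
            kept = [line for line in lines if keep(line)]
            # the write position never passes the read position,
            # so no unread data is overwritten
            f.seek(write_offset)
            f.writelines(kept)
            write_offset = f.tell()
        # cut off the leftover tail
        f.truncate(write_offset)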
To resize file at the end use truncate(size=None)
