I have some trouble reading the .text section of a binary file.
The binary is compiled by gcc.
readelf -S binary_file
This command shows that
.text PROGBITS 0000831C 00031C 000340
The address of the .text section is 0000831c, the file offset is 00031c, and the size is 000340 (all hexadecimal).
I have tried
file = open('binary_file')
content = file.readlines()
But Capstone could not recognize the content.
If the .text content looks like
f102 030e 0000 a0e3
how can I read it as
content = b'\xf1\x02\x03\x0e\x00\x00\xa0\xe3'
By default, open() opens a file in text mode. To open a file in binary mode, you need to supply the appropriate mode: 'rb' - which means open for reading in binary mode.
readlines() splits a file into a list of text lines, so it does not make sense to use it for reading from a binary file.
You want something like:
file = open('binary_file', 'rb')
content = file.read()
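Reading the whole file also returns the ELF headers and every other section. If only .text is wanted, one option is to seek to the offset that readelf reported and read exactly the reported size. The sketch below assumes the offset (0x31C) and size (0x340) from the question; read_section is a hypothetical helper name, not part of any library.

```python
def read_section(path, offset, size):
    """Return `size` bytes starting at byte `offset` of the file."""
    with open(path, 'rb') as f:
        f.seek(offset)        # jump to the section's file offset
        return f.read(size)   # read only the section's bytes

# Values readelf printed for .text in the question (hexadecimal).
TEXT_OFFSET = 0x31C
TEXT_SIZE = 0x340

# The hex dump from the question, as the same kind of bytes object.
sample = bytes.fromhex('f102 030e 0000 a0e3')
```

Calling read_section('binary_file', TEXT_OFFSET, TEXT_SIZE) should then give a bytes object that can be handed to a disassembler, with 0x831C as the section's load address.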
I have another question about reading a .dat file.
The file format is DAT file (.dat)
The content inside the file is in bytes.
When I ran the code that opens the file, the program built and ran successfully. However, the Python shell shows no output (I can't see the contents of the file).
Since the content inside the file is in bytes, should I modify the code? What code should I use for bytes?
Thank you.
There is no "DAT" file format and, as you say, the file contains bytes - as do all files.
It's possible that the file contains binary data for which it's best to open the file in binary mode. You do that by specifying b as part of the mode parameter to open(), like this:
f = open('file.dat', 'rb')
data = f.read() # read the entire file into data
print(data)
f.close()
Note that the full mode parameter is set to rb which means open the file in binary mode for reading.
A better way is to use with:
with open('file.dat', 'rb') as f:
    data = f.read()
    print(data)
No need to explicitly close the file.
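Printing a bytes object shows a mix of characters and \x escapes, which can be hard to read for real binary data. A hex view is one alternative. This is only a sketch, and hex_preview is a made-up helper name:

```python
def hex_preview(data, width=16):
    """Format bytes as rows of two-digit hex values, `width` bytes per row."""
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        rows.append(' '.join('{:02x}'.format(b) for b in chunk))
    return '\n'.join(rows)
```

For example, print(hex_preview(data[:64])) would show the first 64 bytes of the file as four rows of hex.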
If you know that the file contains text, possibly encoded in some specific encoding, e.g. UTF8, then you can specify the encoding when you open the file (Python 3):
with open('file.dat', encoding='UTF8') as f:
    for line in f:
        print(line)
In Python 2 you can use io.open().
Before you say "There's already a thread covering that": read further, there isn't.
I simply need to "address" the very first sector of an NTFS filesystem and read it byte by byte (raw data). I do NOT need a program which does this, I need the code.
What I got so far:
drive = r"\\.\PhysicalDrive1"
pyLog = "C:\\ohMyPy\mft.txt"
hd = open(drive,encoding='cp850')
mft = hd.readlines(1024*10000)
with open(pyLog,'w',encoding='cp850') as f:
    f.writelines(mft)
    f.close
You need to open the files in binary mode ('rb'/'wb'); otherwise Python will modify newline characters on Windows and, in Python 3, try to decode the raw bytes as text. No encoding is needed when a file is opened in binary mode. Also, you can open both files in the same context manager (with):
drive_filename = r'\\.\PhysicalDrive1'
log_filename = r'C:\ohMyPy\mft.txt'
with open(drive_filename, 'rb') as drive, open(log_filename, 'wb') as logfile:
    logfile.write(drive.read(1024*10000))
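If holding the whole 10 MB read in memory is a concern, the copy can also be done in sector-sized chunks. The following is a sketch only: the 512-byte sector size is an assumption (some drives use 4096), and copy_sectors is a hypothetical helper name.

```python
SECTOR_SIZE = 512  # assumed logical sector size; check the actual drive

def copy_sectors(src_path, dst_path, sector_count, sectors_per_read=2048):
    """Copy the first `sector_count` sectors of src to dst; return bytes copied."""
    remaining = sector_count * SECTOR_SIZE
    chunk_size = sectors_per_read * SECTOR_SIZE
    copied = 0
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:  # source ended before the requested size
                break
            dst.write(chunk)
            copied += len(chunk)
            remaining -= len(chunk)
    return copied
```

With the names from the answer, copy_sectors(drive_filename, log_filename, 20000) would copy roughly the same 10 MB, one megabyte at a time.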
I can read my MBR as follows;
drive = r"\\.\PhysicalDrive0"
hd = open(drive,'rb')
mbr = hd.read(512)
The magic is in 'rb': it opens the file for reading in binary mode, i.e. line-end characters are not changed.
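A quick sanity check on the 512 bytes is to look for the boot signature, the bytes 0x55 0xAA at offsets 510 and 511 of an MBR. A minimal sketch, where has_boot_signature is a hypothetical helper name:

```python
import struct

def has_boot_signature(sector):
    """Return True if a 512-byte sector ends with the bytes 0x55 0xAA."""
    if len(sector) < 512:
        return False
    # '<H' reads two bytes as a little-endian integer: 55 AA -> 0xAA55.
    (signature,) = struct.unpack_from('<H', sector, 510)
    return signature == 0xAA55
```

After the read above, has_boot_signature(mbr) should be True for a valid MBR.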
Here is the code
def main():
    f = open("image.jpg", "rb")
    filedata = f.read()
    f.close()
    print "Creating Test Image"
    f = open("ftp_test.jpg", "w+")
    f.write(filedata)
    f.close()
    print "Done!"

if __name__ == '__main__':
    main()
I'm not sure why, but here is the original image
and here is the resulting picture from the code
I'm not sure what to do, so I decided to come to the experts since I'm only 14. I am also adding more to it, like TCP communication, so I can send files over the internet.
You're reading the file in binary with rb, so write back in binary too, by using wb.
f = open("ftp_test.jpg", "wb+")
From the official docs:
On Windows, 'b' appended to the mode opens the file in binary mode, so
there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows
makes a distinction between text and binary files; the end-of-line
characters in text files are automatically altered slightly when data
is read or written. This behind-the-scenes modification to file data
is fine for ASCII text files, but it’ll corrupt binary data like that
in JPEG or EXE files. Be very careful to use binary mode when reading
and writing such files. On Unix, it doesn’t hurt to append a 'b' to
the mode, so you can use it platform-independently for all binary
files.
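The effect the docs describe can be reproduced on any platform. The sketch below is written for Python 3 (the question's code is Python 2) and assumes that newline='\r\n' is a fair stand-in for Windows text mode. The latin-1 round trip is only there to push raw bytes through a text-mode file, and both function names are made up for illustration.

```python
def write_through_text_mode(data, path):
    """Write bytes via a text-mode file that turns every LF into CR LF."""
    text = data.decode('latin-1')  # latin-1 maps each byte to one character
    with open(path, 'w', encoding='latin-1', newline='\r\n') as f:
        f.write(text)

def read_raw(path):
    """Read the file back exactly as stored on disk."""
    with open(path, 'rb') as f:
        return f.read()
```

Writing b'\xff\xd8\n\x00' this way and reading it back with read_raw gives b'\xff\xd8\r\n\x00': one extra byte, which is enough to corrupt an image.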
So I have a .jpg/.png and I opened it up in Text Edit which I provided below:
Is there any way I can save these exotic symbols to a string in Python to later write that to a file to produce an image?
I tried to import a string that had the beta symbol in it and I got an error that said Non-ASCII, so I am assuming the same would happen for this.
Is there any way to get around this problem?
Thanks
Portion of Image.png in Text Edit:
What you are looking at in Text Edit is a binary file that the editor is trying to represent as human-readable characters.
Just open the file as binary in Python:
with open('picture.png', 'rb') as f:
    data = f.read()
with open('picture_out.png', 'wb') as f:
    f.write(data)
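If the image really has to live in a text string (for example inside a source file or a text-only format), Base64 is one common way to do that without hitting non-ASCII problems. A minimal sketch using the standard base64 module; the two function names are made up for illustration:

```python
import base64

def bytes_to_text(data):
    """Encode arbitrary bytes as a plain ASCII string."""
    return base64.b64encode(data).decode('ascii')

def text_to_bytes(text):
    """Recover the original bytes from a Base64 string."""
    return base64.b64decode(text)
```

Writing text_to_bytes(bytes_to_text(data)) to a file opened with 'wb' should reproduce the original image.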
You can read the file in binary mode by passing the 'rb' flag to open and then save whatever comes out of the file into a text file. I don't know what the point of this would be, but there you go:
# read in image data
fh = open('test.png','rb')
data = fh.read()
fh.close()
# write gobbledygook to text file
# (works in Python 2; in Python 3, writing bytes to a text-mode file raises TypeError)
fh = open('test.txt','w')
fh.write(data)
fh.close()
According to Pydocs,
fp = file('blah.xml', 'w+b')
or
fp = file('blah.xml', 'wb')
means open the file in write and binary mode. This is an XML file, however, so why do these two chaps
http://www.pixelmender.com/2010/10/12/scraping-data-using-scrapy-framework/
and
http://doc.scrapy.org/topics/exporters.html#scrapy.contrib.exporter.XmlItemExporter
recommend doing so in their tutorial/docs pages about exporting Scrapy items? In other words, why would anyone open a new xml file in 'b' mode?
It just doesn't make sense with plain XML files.
In Python 2 on Unix, there is no difference between binary and non-binary mode. On Windows, each '\n' you write gets translated to '\r\n' in non-binary mode.
It would make a difference if you embedded binary BLOBs, but I don't see those on the sites you mentioned.
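One further reason binary mode can matter, at least in Python 3, is that XML writers often emit already-encoded bytes rather than text. The sketch below uses the standard library's xml.etree.ElementTree, not Scrapy's exporter, so whether the same reasoning applies to the linked tutorials is an assumption. write_xml is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def write_xml(path, name):
    """Write a tiny XML document as UTF-8 bytes to a file opened with 'wb'."""
    root = ET.Element('items')
    item = ET.SubElement(root, 'item')
    item.text = name
    with open(path, 'wb') as fp:
        # With an explicit byte encoding, ElementTree writes bytes, so the
        # file object has to be in binary mode.
        ET.ElementTree(root).write(fp, encoding='utf-8', xml_declaration=True)
```

Because the encoding is chosen by the writer and stated in the XML declaration, the file object does not need to do any text translation of its own.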