Python reading a PE file and changing resource section - python

I am trying to open a Windows PE file and alter some strings in the resource section.
f = open('c:\test\file.exe', 'rb')
file = f.read()
if b'A'*10 in file:
s = file.replace(b'A'*10, newstring)
In the resource section I have a string that is just:
AAAAAAAAAA
And I want to replace that with something else. When I read the file I get:
\x00A\x00A\x00A\x00A\x00A\x00A\x00A\x00A\x00A\x00A
I have tried opening with UTF-16 and decoding as UTF-16 but then I run into a error:
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 1604-1605: illegal encoding
Everyone I seen who had the same issue fixed by decoding to UTF-16. I am not sure why this doesn't work for me.

If resource inside binary file is encoded to utf-16, you shouldn't change encoding.
try this
f = open('c:\\test\\file.exe', 'rb')
file = f.read()
unicode_str = u'AAAAAAAAAA'
encoded_str = unicode_str.encode('UTF-16')
if encoded_str in file:
s = file.replace(encoded_str, new_utf_string.encode('UTF-16'))
inside binary file everything is encoded, keep in mind

Related

How to convert cp866 in utf-8 coding in python

I have file in cp866 encoding, i open it:
input_file = open(file_name, 'r', encoding = 'cp866')
How i can print() lines from this file like utf-8
I need decode this file to utf-8 and print
Well, the characters read from the file will be decoded and stored in memory as Python strings. You can print them on screen and they should be correct. You can then save the data as utf-8.
You can try:
result = text.encode('cp866').decode('cp866').encode('utf8') - simple convert
data = myfile.read()
b = bytes(data,"KOI8-R")
data_encoding = str(b,"cp1251")
to cp1251, usualy from cp866 you want to decode to cp1251"
If you need decode just once - try this web-site

Can't reopen Django File as rb

Why does reopening a django.core.files File as binary not work?
from django.core.files import File
f = open('/home/user/test.zip')
test_file = File(f)
test_file.open(mode="rb")
test_file.read()
This gives me the error 'utf-8' codec can't decode byte 0x82 in position 14: invalid start byte so opening in 'rb' obviously didn't work. The reason I need this is because I want to open a FileField as binary
You need to open(…) [Python-doc] the underlying file handler in binary mode, so:
with open('/home/user/test.zip', mode='rb') as f:
test_file = File(f)
test_file.open(mode='rb')
test_file.read()
Without opening it in binary mode, the underlying reader will try to read this as text, and thus error on bytes that are not a utf-8 code points.

unable to decode this string using python

I have this text.ucs file which I am trying to decode using python.
file = open('text.ucs', 'r')
content = file.read()
print content
My result is
\xf\xe\x002\22
I tried doing decoding with utf-16, utf-8
content.decode('utf-16')
and getting error
Traceback (most recent call last): File "", line 1, in
File "C:\Python27\lib\encodings\utf_16.py", line 16, in
decode
return codecs.utf_16_decode(input, errors, True) UnicodeDecodeError: 'utf16' codec can't decode bytes in position
32-33: illegal encoding
Please let me know if I am missing anything or my approach is wrong
Edit: Screenshot has been asked
The string is encoded as UTF16-BE (Big Endian), this works:
content.decode("utf-16-be")
oooh, as i understand you using python 2.x.x but encoding parameter was added only in python 3.x.x as I know, i am doesn't master of python 2.x.x but you can search in google about io.open for example try:
file = io.open('text.usc', 'r',encoding='utf-8')
content = file.read()
print content
but chek do you need import io module or not
You can specify which encoding to use with the encoding argument:
with open('text.ucs', 'r', encoding='utf-16') as f:
text = f.read()
your string need to Be Uncoded With The Coding utf-8 you can do What I Did Now for decode your string
f = open('text.usc', 'r',encoding='utf-8')
print f

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 0: invalid start byte

This is my code.
stock_code = open('/home/ubuntu/trading/456.csv', 'r')
csvReader = csv.reader(stock_code)
for st in csvReader:
eventcode = st[1]
print(eventcode)
I want to know content in excel.
But there are unicodeDecodeError.
How can i fix it?
The CSV docs say,
Since open() is used to open a CSV file for reading, the file will by default be decoded into unicode using the system default encoding...
The error message shows that your system is expecting the file to be using UTF-8 encoding.
Solutions:
Make sure the file is using the correct encoding.
For example, open the file using NotePad++, select Encoding from the menu
and select UTF-8. Then resave the file.
Alternatively, specify the encoding of the file when calling open(), like this
my_encoding = 'UTF-8' # or whatever is the encoding of the file.
with open('/home/ubuntu/trading/456.csv', 'r', encoding=my_encoding) as stock_code:
stock_code = open('/home/ubuntu/trading/456.csv', 'r')
csvReader = csv.reader(stock_code)
for st in csvReader:
eventcode = st[1]
print(eventcode)

Python 3: JSON File Load with Non-ASCII Characters

just trying to load this JSON file(with non-ascii characters) as a python dictionary with Unicode encoding but still getting this error:
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 90: ordinal not in range(128)
JSON file content = "tooltip":{
"dxPivotGrid-sortRowBySummary": "Sort\"{0}\"byThisRow",}
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
for line in f:
data.append(json.loads(line.encode('utf-8','replace')))
You have several problems as near as I can tell. First, is the file encoding. When you open a file without specifying an encoding, the file is opened with whatever sys.getfilesystemencoding() is. Since that may vary (especially on Windows machines) its a good idea to explicitly use encoding="utf-8" for most json files. Because of your error message, I suspect that the file was opened with an ascii encoding.
Next, the file is decoded from utf-8 into python strings as it is read by the file system object. The utf-8 line has already been decoded to a string and is already ready for json to read. When you do line.encode('utf-8','replace'), you encode the line back into a bytes object which the json loads (that is, "load string") can't handle.
Finally, "tooltip":{ "navbar":"Operações de grupo"} isn't valid json, but it does look like one line of a pretty-printed json file containing a single json object. My guess is that you should read the entire file as 1 json object.
Putting it all together you get:
import json
with open('/Users/myvb/Desktop/Automation/pt-PT.json', encoding="utf-8") as f:
data = json.load(f)
From its name, its possible that this file is encoded as a Windows Portugese code page. If so, the "cp860" encoding may work better.
I had the same problem, what worked for me was creating a regular expression, and parsing every line from the json file:
REGEXP = '[^A-Za-z0-9\'\:\.\;\-\?\!]+'
new_file_line = re.sub(REGEXP, ' ', old_file_line).strip()
Having a file with content similar to yours I can read the file in one simple shot:
>>> import json
>>> fname = "data.json"
>>> with open(fname) as f:
... data = json.load(f)
...
>>> data
{'tooltip': {'navbar': 'Operações de grupo'}}
You don't need to read each line. You have two options:
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
data.append(json.load(f))
Or, you can load all lines and pass them to the json module:
import sys
import json
data = []
with open('/Users/myvb/Desktop/Automation/pt-PT.json') as f:
data.append(json.loads(''.join(f.readlines())))
Obviously, the first suggestion is the best.

Categories