Scipy: UnicodeDecodeError while loading image file

def image_to_laplacian(filename):
    with open(filename, 'r', encoding="latin-1") as f:
        s = f.read()
        img = sc.misc.imread(f)

image_to_laplacian('images/bw_3x3.png')
Produces:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
"images/bw_3x3.png" is a 3x3 image I produced in Pinta. I tried opening a cat.jpg I got from Google Images, but I got the same error.
I also tried passing encoding="latin-1" as an argument to open, based on something I read on SO. I was able to open the file, but imread failed with the exception
OSError: cannot identify image file <_io.TextIOWrapper name='images/bw_3x3.png' mode='r' encoding='latin-1'>

The line that is causing the error is
s = f.read()
In 'r' mode this tries to read the data as text, but it is an image file, so decoding fails. Use 'rb' instead, and remove encoding="latin-1", which is only relevant for text files.
Also, note that according to the documentation:
name : str or file object
The file name or file object to be read.
So you can dispense with opening a file and just give it a filepath as a string. The following should work:
img = sc.misc.imread(filename)
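The difference between the two modes can be reproduced without SciPy. The following is a minimal sketch using only the standard library; the temporary file and its contents (just the 8-byte PNG signature) are stand-ins for a real image and are not part of the original question.

```python
import os
import tempfile

# The 8-byte PNG signature; a stand-in for a real image file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fake.png")
    with open(path, "wb") as f:
        f.write(PNG_SIGNATURE)

    # Text mode decodes the bytes; 0x89 cannot start a UTF-8 sequence.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_mode_error = None
    except UnicodeDecodeError as exc:
        text_mode_error = exc

    # Binary mode returns the raw bytes untouched.
    with open(path, "rb") as f:
        raw = f.read()

print(type(text_mode_error).__name__)
print(raw == PNG_SIGNATURE)
```

An image library needs exactly these raw bytes, which is why handing it either a path or a file opened with 'rb' works.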

Related

Can't reopen Django File as rb

Why does reopening a django.core.files File as binary not work?
from django.core.files import File
f = open('/home/user/test.zip')
test_file = File(f)
test_file.open(mode="rb")
test_file.read()
This gives me the error 'utf-8' codec can't decode byte 0x82 in position 14: invalid start byte, so reopening in 'rb' obviously didn't work. I need this because I want to open a FileField as binary.

You need to open(…) the underlying file handle in binary mode in the first place:
with open('/home/user/test.zip', mode='rb') as f:
    test_file = File(f)
    test_file.open(mode='rb')
    test_file.read()
If the file is not opened in binary mode, the underlying reader tries to decode it as text and fails on byte sequences that are not valid UTF-8.

Python reading a PE file and changing resource section

I am trying to open a Windows PE file and alter some strings in the resource section.
f = open(r'c:\test\file.exe', 'rb')  # raw string so \t and \f are not treated as escapes
file = f.read()
if b'A'*10 in file:
    s = file.replace(b'A'*10, newstring)
In the resource section I have a string that is just:
AAAAAAAAAA
And I want to replace that with something else. When I read the file I get:
\x00A\x00A\x00A\x00A\x00A\x00A\x00A\x00A\x00A\x00A
I have tried opening with UTF-16 and decoding as UTF-16, but then I run into an error:
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 1604-1605: illegal encoding
Everyone I have seen with the same issue fixed it by decoding as UTF-16. I am not sure why this doesn't work for me.

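If the resource inside the binary file is encoded as UTF-16, do not decode the whole file. Encode your search string instead and work on the raw bytes. Try this:
f = open('c:\\test\\file.exe', 'rb')
file = f.read()
unicode_str = u'AAAAAAAAAA'
# 'UTF-16' would prepend a byte order mark, so name the byte order explicitly.
# PE resources are normally little endian; use 'utf-16-be' if your bytes differ.
encoded_str = unicode_str.encode('utf-16-le')
if encoded_str in file:
    s = file.replace(encoded_str, new_utf_string.encode('utf-16-le'))
Keep in mind that everything inside a binary file is encoded bytes.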
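To illustrate why the choice of codec name matters here, the following sketch builds a small stand-in for the file contents in memory. The surrounding bytes and the replacement string are hypothetical, and little-endian storage is an assumption about the file.

```python
needle = "AAAAAAAAAA"
replacement = "BBBBBBBBBB"  # hypothetical; same length keeps the file size unchanged

# Stand-in for the bytes of a PE file: binary data around a UTF-16-LE string.
data = b"\x4d\x5a\x90\x00" + needle.encode("utf-16-le") + b"\xff\xd8\x00"

# The generic "utf-16" codec prepends a byte order mark when encoding,
# so its output is not found in the middle of the file.
with_bom = needle.encode("utf-16")
print(with_bom[:2] in (b"\xff\xfe", b"\xfe\xff"))
print(with_bom in data)

# Naming the byte order explicitly gives exactly the bytes stored in the file.
pattern = needle.encode("utf-16-le")
patched = data.replace(pattern, replacement.encode("utf-16-le"))
print(pattern in data, pattern in patched)
```

Decoding the whole file as UTF-16 fails because most of a binary file is not UTF-16 text at all; only the search pattern needs to be encoded.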

UnicodeDecodeError using PIL module

I am getting this error message:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
From this code:
from PIL import Image
import os
image_name = "mypic"
count = 0
for filename in os.listdir('pictures'):
    if filename.endswith('.jpg'):
        image_file = open('pictures/' + filename)
        image = Image.open(image_file)
        # next 3 lines strip exif
        data = list(image.getdata())
        image_without_exif = Image.new(image.mode, image.size)
        image_without_exif.putdata(data)
        image_without_exif.save('new-pictures/' + image_name + str(count) + '.jpg')
        count = count + 1
Not sure why, as this was working yesterday...

I think you need to open the file in binary mode:
image_file = open('pictures/' +filename, 'rb')

This happens because open() without a mode opens the file in text mode and tries to decode it. You can avoid this by passing the path directly to Image.open():
img = Image.open('pictures/' + filename)
This works because PIL opens the file in binary mode for you; see its documentation for details:
https://pillow.readthedocs.io/en/latest/reference/Image.html#PIL.Image.open
It also makes sense to use Image.open as a context manager, so the image file is closed when you are done:
with Image.open('pictures/' + filename) as img:
    pass  # process img here
# the image file is closed after leaving the with block

When you call open(filename) without any further arguments, the file is opened in text mode.
Python assumes the file contains text and decodes it while reading. A byte with the value 255 (0xFF) can never appear in valid UTF-8, so decoding fails as soon as it is encountered.
To fix this, open the file in binary mode:
open(filename, "rb")
This tells Python not to decode the contents; the file handle returns the raw bytes instead.
Because this is a common use-case, PIL already has opening images by filename built in:
Image.open(filename)

unable to decode this string using python

I have this text.ucs file which I am trying to decode using python.
file = open('text.ucs', 'r')
content = file.read()
print content
My result is
\xf\xe\x002\22
I tried doing decoding with utf-16, utf-8
content.decode('utf-16')
and getting error
Traceback (most recent call last):
  File "", line 1, in
  File "C:\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 32-33: illegal encoding
Please let me know if I am missing anything or my approach is wrong
Edit: Screenshot has been asked

The string is encoded as UTF-16-BE (big endian). This works:
content.decode("utf-16-be")
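The same idea in Python 3 syntax, as a small sketch: the sample text is hypothetical and stands in for the contents of text.ucs. It shows that naming the byte order works on data without a byte order mark, while the generic "utf-16" codec relies on a BOM to pick the byte order.

```python
import codecs

text = "sample text"  # hypothetical content standing in for text.ucs

# Big-endian UTF-16 without a byte order mark, as in the answer above.
raw = text.encode("utf-16-be")

# Naming the byte order explicitly recovers the text.
decoded = raw.decode("utf-16-be")

# With a BOM in front, the generic codec detects the byte order itself
# and strips the BOM from the result.
with_bom = codecs.BOM_UTF16_BE + raw
decoded_via_bom = with_bom.decode("utf-16")

print(decoded == text, decoded_via_bom == text)
```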

As I understand it, you are using Python 2.x, where the built-in open() has no encoding parameter; it was only added in Python 3. I am not an expert in Python 2, but io.open accepts it, for example:
import io
file = io.open('text.ucs', 'r', encoding='utf-8')  # pass the file's actual encoding here
content = file.read()
print content
The io module is part of the standard library, but it must be imported before use.

You can specify which encoding to use with the encoding argument:
with open('text.ucs', 'r', encoding='utf-16') as f:
    text = f.read()

Your string needs to be decoded as UTF-8. This is what I did to decode it:
f = open('text.ucs', 'r', encoding='utf-8')
print(f.read())

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 0: invalid start byte

This is my code.
import csv
stock_code = open('/home/ubuntu/trading/456.csv', 'r')
csvReader = csv.reader(stock_code)
for st in csvReader:
    eventcode = st[1]
    print(eventcode)
I want to read the content of the Excel file, but I get a UnicodeDecodeError.
How can I fix it?

The CSV docs say,
Since open() is used to open a CSV file for reading, the file will by default be decoded into unicode using the system default encoding...
The error message shows that your system is expecting the file to be using UTF-8 encoding.
Solutions:
Make sure the file is using the correct encoding.
For example, open the file in Notepad++, choose Convert to UTF-8 from the Encoding menu,
and save the file again.
Alternatively, specify the encoding of the file when calling open(), like this
import csv
my_encoding = 'UTF-8'  # or whatever the file's actual encoding is
with open('/home/ubuntu/trading/456.csv', 'r', encoding=my_encoding) as stock_code:
    csvReader = csv.reader(stock_code)
    for st in csvReader:
        eventcode = st[1]
        print(eventcode)
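As a self-contained illustration of the second solution, the sketch below writes a small CSV file in cp1252 and reads it back. The file name, the rows, and the choice of cp1252 are assumptions made for the example; the real encoding of 456.csv is not known from the question. In cp1252 the character Á is stored as the single byte 0xC1, which is never valid in UTF-8.

```python
import csv
import os
import tempfile

# Hypothetical rows; "\u00c1" is the character Á, stored as byte 0xC1 in cp1252.
rows = [["1", "\u00c1-stock"], ["2", "plain"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")
    with open(path, "w", encoding="cp1252", newline="") as f:
        csv.writer(f).writerows(rows)

    # Reading with the wrong encoding fails on the 0xC1 byte.
    try:
        with open(path, "r", encoding="utf-8", newline="") as f:
            list(csv.reader(f))
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True

    # Reading with the encoding the file was written in succeeds.
    with open(path, "r", encoding="cp1252", newline="") as f:
        event_codes = [row[1] for row in csv.reader(f)]

print(utf8_failed)
print(event_codes)
```

Passing newline="" follows the csv module's recommendation for files handed to csv.reader and csv.writer.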
