UnicodeDecodeError: while writing into a file

UnicodeDecodeError: while writing into a file - python

I get this error while writing into a file. How can I handle this.
Traceback (most recent call last):
File "C:\Python27\AureusBAXProjectFB.py", line 278, in <module>
rows = [[unicode(x) for x in row] for row in outlist]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)
>>>
Code for writing into a file
class UnicodeWriter:
"""
A CSV writer which will write rows to CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
# Redirect output to a queue
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
self.writer.writerow([s.encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
with open('C:/Users/Desktop/fboutput.csv', 'wb') as f:
writer = UnicodeWriter(f)
rows = [[unicode(x) for x in row] for row in outlist]
writer.writerows(rows)
I am using BeautifulSoup to parse the html data and thats working fine. I get an error only while writing into a file.

unicode() constructor defined as unicode(string[, encoding, errors]) and encoding has default is ascii. If multi-byte string is in outlist, you should appoint unicode encode like utf-8.

Related

how to write a unicode csv in Python 2.7

I want to write data to files where a row from a CSV should look like this list (directly from the Python console):
row = ['\xef\xbb\xbft_11651497', 'http://kozbeszerzes.ceu.hu/entity/t/11651497.xml', "Szabolcs Mag '98 Kft.", 'ny\xc3\xadregyh\xc3\xa1za', 'ny\xc3\xadregyh\xc3\xa1za', '4400', 't\xc3\xbcnde utca 20.', 47.935175, 21.744975, u'Ny\xedregyh\xe1za', u'Borb\xe1nya', u'Szabolcs-Szatm\xe1r-Bereg', u'Ny\xedregyh\xe1zai', u'20', u'T\xfcnde utca', u'Magyarorsz\xe1g', u'4405']
Py2k does not do Unicode, but I had a UnicodeWriter wrapper:
import cStringIO, codecs
class UnicodeWriter:
"""
A CSV writer which will write rows to CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
# Redirect output to a queue
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
However, these lines still produce the dreaded encoding error message below:
f.write(codecs.BOM_UTF8)
writer = UnicodeWriter(f)
writer.writerow(row)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 9: ordinal not in range(128)
What is there to do? Thanks!

You are passing bytestrings containing non-ASCII data in, and these are being decoded to Unicode using the default codec at this line:
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
unicode(bytestring) with data that cannot be decoded as ASCII fails:
>>> unicode('\xef\xbb\xbft_11651497')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)
Decode the data to Unicode before passing it to the writer:
row = [v.decode('utf8') if isinstance(v, str) else v for v in row]
This assumes that your bytestring values contain UTF-8 data instead. If you have a mix of encodings, try to decode to Unicode at the point of origin; where your program first sourced the data. You really want to do so anyway, regardless of where the data came from or if it already was encoded to UTF-8 as well.

Read and Write CSV files including unicode with Python 2.7

I am new to Python, and I have a question about how to use Python to read and write CSV files. My file contains like Germany, French, etc. According to my code, the files can be read correctly in Python, but when I write it into a new CSV file, the unicode becomes some strange characters.
The data is like:
And my code is:
import csv
f=open('xxx.csv','rb')
reader=csv.reader(f)
wt=open('lll.csv','wb')
writer=csv.writer(wt,quoting=csv.QUOTE_ALL)
wt.close()
f.close()
And the result is like:
What should I do to solve the problem?

Another alternative:
Use the code from the unicodecsv package ...
https://pypi.python.org/pypi/unicodecsv/
>>> import unicodecsv as csv
>>> from io import BytesIO
>>> f = BytesIO()
>>> w = csv.writer(f, encoding='utf-8')
>>> _ = w.writerow((u'é', u'ñ'))
>>> _ = f.seek(0)
>>> r = csv.reader(f, encoding='utf-8')
>>> next(r) == [u'é', u'ñ']
True
This module is API compatible with the STDLIB csv module.

Make sure you encode and decode as appropriate.
This example will roundtrip some example text in utf-8 to a csv file and back out to demonstrate:
# -*- coding: utf-8 -*-
import csv
tests={'German': [u'Straße',u'auslösen',u'zerstören'],
'French': [u'français',u'américaine',u'épais'],
'Chinese': [u'中國的',u'英語',u'美國人']}
with open('/tmp/utf.csv','w') as fout:
writer=csv.writer(fout)
writer.writerows([tests.keys()])
for row in zip(*tests.values()):
row=[s.encode('utf-8') for s in row]
writer.writerows([row])
with open('/tmp/utf.csv','r') as fin:
reader=csv.reader(fin)
for row in reader:
temp=list(row)
fmt=u'{:<15}'*len(temp)
print fmt.format(*[s.decode('utf-8') for s in temp])
Prints:
German Chinese French
Straße 中國的 français
auslösen 英語 américaine
zerstören 美國人 épais

There is an example at the end of the csv module documentation that demonstrates how to deal with Unicode. Below is copied directly from that example. Note that the strings read or written will be Unicode strings. Don't pass a byte string to UnicodeWriter.writerows, for example.
import csv,codecs,cStringIO
class UTF8Recoder:
def __init__(self, f, encoding):
self.reader = codecs.getreader(encoding)(f)
def __iter__(self):
return self
def next(self):
return self.reader.next().encode("utf-8")
class UnicodeReader:
def __init__(self, f, dialect=csv.excel, encoding="utf-8-sig", **kwds):
f = UTF8Recoder(f, encoding)
self.reader = csv.reader(f, dialect=dialect, **kwds)
def next(self):
'''next() -> unicode
This function reads and returns the next line as a Unicode string.
'''
row = self.reader.next()
return [unicode(s, "utf-8") for s in row]
def __iter__(self):
return self
class UnicodeWriter:
def __init__(self, f, dialect=csv.excel, encoding="utf-8-sig", **kwds):
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
'''writerow(unicode) -> None
This function takes a Unicode string and encodes it to the output.
'''
self.writer.writerow([s.encode("utf-8") for s in row])
data = self.queue.getvalue()
data = data.decode("utf-8")
data = self.encoder.encode(data)
self.stream.write(data)
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
with open('xxx.csv','rb') as fin, open('lll.csv','wb') as fout:
reader = UnicodeReader(fin)
writer = UnicodeWriter(fout,quoting=csv.QUOTE_ALL)
for line in reader:
writer.writerow(line)
Input (UTF-8 encoded):
American,美国人
French,法国人
German,德国人
Output:
"American","美国人"
"French","法国人"
"German","德国人"

Because str in python2 is bytes actually. So if want to write unicode to csv, you must encode unicode to str using utf-8 encoding.
def py2_unicode_to_str(u):
# unicode is only exist in python2
assert isinstance(u, unicode)
return u.encode('utf-8')
Use class csv.DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds):
py2
The csvfile: open(fp, 'w')
pass key and value in bytes which are encoded with utf-8
writer.writerow({py2_unicode_to_str(k): py2_unicode_to_str(v) for k,v in row.items()})
py3
The csvfile: open(fp, 'w')
pass normal dict contains str as row to writer.writerow(row)
Finally code
import sys
is_py2 = sys.version_info[0] == 2
def py2_unicode_to_str(u):
# unicode is only exist in python2
assert isinstance(u, unicode)
return u.encode('utf-8')
with open('file.csv', 'w') as f:
if is_py2:
data = {u'Python中国': u'Python中国', u'Python中国2': u'Python中国2'}
# just one more line to handle this
data = {py2_unicode_to_str(k): py2_unicode_to_str(v) for k, v in data.items()}
fields = list(data[0])
writer = csv.DictWriter(f, fieldnames=fields)
for row in data:
writer.writerow(row)
else:
data = {'Python中国': 'Python中国', 'Python中国2': 'Python中国2'}
fields = list(data[0])
writer = csv.DictWriter(f, fieldnames=fields)
for row in data:
writer.writerow(row)
Conclusion
In python3, just use the unicode str.
In python2, use unicode handle text, use str when I/O occurs.

I had the very same issue. The answer is that you are doing it right already. It is the problem of MS Excel. Try opening the file with another editor and you will notice that your encoding was successful already. To make MS Excel happy, move from UTF-8 to UTF-16. This should work:
class UnicodeWriter:
def __init__(self, f, dialect=csv.excel_tab, encoding="utf-16", **kwds):
# Redirect output to a queue
self.queue = StringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
# Force BOM
if encoding=="utf-16":
import codecs
f.write(codecs.BOM_UTF16)
self.encoding = encoding
def writerow(self, row):
# Modified from original: now using unicode(s) to deal with e.g. ints
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = data.encode(self.encoding)
# strip BOM
if self.encoding == "utf-16":
data = data[2:]
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)

I couldn't respond to Mark above, but I just made one modification which fixed the error that was caused if data in the cells was not unicode, e.g. float or int data. I replaced this line into the UnicodeWriter function: "self.writer.writerow([s.encode("utf-8") if type(s)==types.UnicodeType else s for s in row])" so that it became:
class UnicodeWriter:
def __init__(self, f, dialect=csv.excel, encoding="utf-8-sig", **kwds):
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
'''writerow(unicode) -> None
This function takes a Unicode string and encodes it to the output.
'''
self.writer.writerow([s.encode("utf-8") if type(s)==types.UnicodeType else s for s in row])
data = self.queue.getvalue()
data = data.decode("utf-8")
data = self.encoder.encode(data)
self.stream.write(data)
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
You will also need to "import types".

I don't think this is the best answer, but it's probably the most self-contained answer and also the funniest.
UTF7 is a 7-bit ASCII encoding of unicode. It just so happens that UTF7 makes no special use of commas, quotes, or whitespace. It just passes them through from input to output. So really it makes no difference if you UTF7-encode first and then parse as CSV, or if you parse as CSV first and then UTF7-encode. Python 2's CSV parser can't handle unicode, but python 2 does have a UTF-7 encoder. So you can encode, parse, and then decode, and it's as if you had a unicode-capable parser.
import csv
import io
def read_csv(path):
with io.open(path, 'rt', encoding='utf8') as f:
lines = f.read().split("\r\n")
lines = [l.encode('utf7').decode('ascii') for l in lines]
reader = csv.reader(lines, dialect=csv.excel)
for row in reader:
yield [x.encode('ascii').decode('utf7') for x in row]
for row in read_csv("lol.csv"):
print(repr(row))
lol.csv
foo,bar,foo∆bar,"foo,bar"
output:
[u'foo', u'bar', u'foo\u2206bar', u'foo,bar']

python csv unicode 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)

I have copied this script from [python web site][1] This is another question but now problem with encoding:
import sqlite3
import csv
import codecs
import cStringIO
import sys
class UTF8Recoder:
"""
Iterator that reads an encoded stream and reencodes the input to UTF-8
"""
def __init__(self, f, encoding):
self.reader = codecs.getreader(encoding)(f)
def __iter__(self):
return self
def next(self):
return self.reader.next().encode("utf-8")
class UnicodeReader:
"""
A CSV reader which will iterate over lines in the CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
f = UTF8Recoder(f, encoding)
self.reader = csv.reader(f, dialect=dialect, **kwds)
def next(self):
row = self.reader.next()
return [unicode(s, "utf-8") for s in row]
def __iter__(self):
return self
class UnicodeWriter:
"""
A CSV writer which will write rows to CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
# Redirect output to a queue
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
self.writer.writerow([s.encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
This time problem with encoding, when I ran this it gave me this error:
Traceback (most recent call last):
File "makeCSV.py", line 87, in <module>
uW.writerow(d)
File "makeCSV.py", line 54, in writerow
self.writer.writerow([s.encode("utf-8") for s in row])
AttributeError: 'int' object has no attribute 'encode'
Then I converted all integers to string, but this time I got this error:
Traceback (most recent call last):
File "makeCSV.py", line 87, in <module>
uW.writerow(d)
File "makeCSV.py", line 54, in writerow
self.writer.writerow([str(s).encode("utf-8") for s in row])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
I have implemented above to deal with unicode characters, but it gives me such error. What is the problem and how to fix it?

Then I converted all integers to string,
You converted both integers and strings to byte strings. For strings this will use the default character encoding which happens to be ASCII, and this fails when you have non-ASCII characters. You want unicode instead of str.
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
It might be better to convert everything to unicode before calling that method. The class is designed specifically for parsing Unicode strings. It was not designed to support other data types.

From the documentation:
http://docs.python.org/library/stringio.html?highlight=cstringio#cStringIO.StringIO
Unlike the StringIO module, this module is not able to accept Unicode strings that cannot be encoded as plain ASCII strings.
I.e. only 7-bit clean strings can be stored.

If you are using Python 2:
make encoding as : str(s.encode("utf-8"))
i.e.
def writerow(self, row):
self.writer.writerow([str(s.encode("utf-8")) for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)

Python - SQLite to CSV Writer Error - ASCII values not parsed

Afternoon,
I am having some trouble with a SQLite to CSV python script. I have searched high and I have searched low for an answer but none have worked for me, or I am having a problem with my syntax.
I want to replace characters within the SQLite database which fall outside of the ASCII table (larger than 128).
Here is the script I have been using:
#!/opt/local/bin/python
import sqlite3
import csv, codecs, cStringIO
class UnicodeWriter:
"""
A CSV writer which will write rows to CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
# Redirect output to a queue
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
conn = sqlite3.connect('test.db')
c = conn.cursor()
# Select whichever rows you want in whatever order you like
c.execute('select ROWID, Name, Type, PID from PID')
writer = UnicodeWriter(open("ProductListing.csv", "wb"))
# Make sure the list of column headers you pass in are in the same order as your SELECT
writer.writerow(["ROWID", "Product Name", "Product Type", "PID", ])
writer.writerows(c)
I have tried to add the 'replace' as indicated here but have got the same error. Python: Convert Unicode to ASCII without errors for CSV file
The error is the UnicodeDecodeError.
Traceback (most recent call last):
File "SQLite2CSV1.py", line 53, in <module>
writer.writerows(c)
File "SQLite2CSV1.py", line 32, in writerows
self.writerow(row)
File "SQLite2CSV1.py", line 19, in writerow
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 65: ordinal not in range(128)
Obviously I want the code to be robust enough that if it encounters characters outside of these bounds that it replaces it with a character such as '?' (\x3f).
Is there a way to do this within the UnicodeWriter class? And a way I can make the code robust that it won't produce these errors.
Your help is greatly appreciated.

If you just want to write an ASCII CSV, simply use the stock csv.writer(). To ensure that all values passed are indeed ASCII, use encode('ascii', errors='replace').
Example:
import csv
rows = [
[u'some', u'other', u'more'],
[u'umlaut:\u00fd', u'euro sign:\u20ac', '']
]
with open('/tmp/test.csv', 'wb') as csvFile:
writer = csv.writer(csvFile)
for row in rows:
asciifiedRow = [item.encode('ascii', errors='replace') for item in row]
print '%r --> %r' % (row, asciifiedRow)
writer.writerow(asciifiedRow)
The console output for this is:
[u'some', u'other', u'more'] --> ['some', 'other', 'more']
[u'umlaut:\xfd', u'euro sign:\u20ac', ''] --> ['umlaut:?', 'euro sign:?', '']
The resulting CSV file contains:
some,other,more
umlaut:?,euro sign:?,

With access to a unix environment, here's what worked for me
sqlite3.exe a.db .dump > a.sql;
tr -d "[\\200-\\377]" < a.sql > clean.sql;
sqlite3.exe clean.db < clean.sql;
(It's not a python solution, but maybe it will help someone else due to its brevity. This solution STRIPS OUT all non ascii characters, doesn't try to replace them.)

Given a unicode error I don't understand

Here is my code, I'm sure it looks terrible but it all works as it should, only problem I'm having is with the last line...
import pyPdf
import os
import csv
class UnicodeWriter:
"""
A CSV writer which will write rows to CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
# Redirect output to a queue
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
self.writer.writerow([s.encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
PDFWriter = csv.writer(open('/home/nick/TAM_work/text/text.doc', 'a'), delimiter=' ', quotechar='|', quoting=csv.QUOTE_ALL)
def getPDFContent(path):
content = ""
# Load PDF into pyPDF
pdf = pyPdf.PdfFileReader(file(path, "rb"))
# Iterate pages
for i in range(0, pdf.getNumPages()):
# Extract text from page and add to content
content += pdf.getPage(i).extractText() + "\n"
# Collapse whitespace
content = " ".join(content.replace(u"\xa0", " ").strip().split())
return content
for word in os.listdir("/home/nick/TAM_work/TAM_pdfs"):
print getPDFContent("/home/nick/TAM_work/TAM_pdfs/" + word)
PDFWriter.writerow ([getPDFContent("/home/nick/TAM_work/TAM_pdfs/" + word)])
When I run it everything works until it hits this...
Traceback (most recent call last):
File "Saving_fuction_added.py", line 52, in <module>
PDFWriter.writerow ([getPDFContent("/home/nick/TAM_work/TAM_pdfs/" + word)])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in position 81: ordinal not in range(128)
I'd love any help. Thanks guys.
Matt

Here's the code that answered that question. But now it only writes the last file.
import pyPdf
import os
import csv
class UnicodeWriter:
"""
A CSV writer which will write rows to CSV file "f",
which is encoded in the given encoding.
"""
def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
# Redirect output to a queue
self.queue = cStringIO.StringIO()
self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
self.stream = f
self.encoder = codecs.getincrementalencoder(encoding)()
def writerow(self, row):
self.writer.writerow([s.encode("utf-8") for s in row])
# Fetch UTF-8 output from the queue ...
data = self.queue.getvalue()
data = data.decode("utf-8")
# ... and reencode it into the target encoding
data = self.encoder.encode(data)
# write to the target stream
self.stream.write(data)
# empty queue
self.queue.truncate(0)
def writerows(self, rows):
for row in rows:
self.writerow(row)
PDFWriter = csv.writer(open('/home/nick/TAM_work/text/text.doc', 'a'), delimiter=' ', quotechar='|', quoting=csv.QUOTE_ALL)
def getPDFContent(path):
content = ""
# Load PDF into pyPDF
pdf = pyPdf.PdfFileReader(file(path, "rb"))
# Iterate pages
for i in range(0, pdf.getNumPages()):
# Extract text from page and add to content
content += pdf.getPage(i).extractText() + "\n"
# Collapse whitespace
content = " ".join(content.replace(u"\xa0", " ").strip().split())
return content
for word in os.listdir("/home/nick/TAM_work/TAM_pdfs"):
print getPDFContent("/home/nick/TAM_work/TAM_pdfs/" + word)
PDFWriter.writerow ([getPDFContent("/home/nick/TAM_work/TAM_pdfs/" + word).encode("ascii", "ignore")])

as I Underestand you put a large number in a small varible and its throw an exception.
I introduce you a C# tool that work very fine with unicode , you can find it at http://unicode.codeplex.com
in your case I recommand to change the
for i in range(0, pdf.getNumPages()):
pdf.getNumPages() is above than 128 just controll it.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

UnicodeDecodeError: while writing into a file - python

unicode() constructor defined as unicode(string[, encoding, errors]) and encoding has default is ascii. If multi-byte string is in outlist, you should appoint unicode encode like utf-8.

Related

how to write a unicode csv in Python 2.7

Read and Write CSV files including unicode with Python 2.7

python csv unicode 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)

Python - SQLite to CSV Writer Error - ASCII values not parsed

Given a unicode error I don't understand

Categories

Resources