Python write to json file gives UnicodeDecodeError [duplicate] - python

I know that this is a very common error, but it's the first time I've encountered it when trying to write a file.
I'm using networkx to work with graphs for network analysis, and when I try to write into any format:
nx.write_gml(G, "Graph.gml")
nx.write_pajek(G, "Graph.net")
nx.write_gexf(G, "graph.gexf")
I get:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2, in write_pajek
  File "/Library/Python/2.7/site-packages/networkx/utils/decorators.py", line 263, in _open_file
    result = func(*new_args, **kwargs)
  File "/Library/Python/2.7/site-packages/networkx/readwrite/pajek.py", line 100, in write_pajek
    path.write(line.encode(encoding))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
I haven't found any documentation on this, so I'm quite confused.

You may be able to solve this with the codecs module: create the file object with codecs.open() and pass it to networkx in place of the filename. For example:
import codecs
# Open the target file with an explicit UTF-8 encoding
f = codecs.open("graph.gml", "w", "utf-8")

Related

error Traceback (most recent call last): after reading .txt file in Python

This is my first time trying to program in Python.
with open('/Users/solidaneziri/Downloads/Data_Exercise_1.txt') as infile:
    for line in infile:
        print(line.split()[0])
This is the code I wrote to read the file. It ran fine the first time, but since then I keep getting this error and I don't know how to fix it:
/usr/bin/python3 /Users/solidaneziri/IdeaProjects/Abgabe1/src/BonusAufgabe/aufgabe1.py
Traceback (most recent call last):
  File "/Users/solidaneziri/IdeaProjects/Abgabe1/src/BonusAufgabe/aufgabe1.py", line 2, in <module>
    for line in infile:
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9f in position 190: invalid start byte
I replaced the accented letters with plain ones and it worked. This is not the optimal solution, but it worked.
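The decode error above means the file is not valid UTF-8, which is the encoding Python 3 assumed here. Rather than stripping the accents, the encoding can be passed to open() explicitly. A minimal sketch, assuming the file is Mac Roman. That is only a guess: byte 0x9f is "ü" in Mac Roman and "Ÿ" in cp1252, and the real encoding of Data_Exercise_1.txt is unknown. The helper name first_words and the demo file are hypothetical:

```python
import os
import tempfile


def first_words(path, encoding):
    """Return the first whitespace-separated token of each non-empty line."""
    words = []
    with open(path, encoding=encoding) as infile:
        for line in infile:
            parts = line.split()
            if parts:  # skip blank lines instead of raising IndexError
                words.append(parts[0])
    return words


# Build a small stand-in file (made-up contents) encoded as Mac Roman.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("Müller 42\nSchmidt 17\n".encode("mac_roman"))
    demo_path = tmp.name

# Assumption: the encoding is known (or guessed) to be Mac Roman.
result = first_words(demo_path, "mac_roman")
os.remove(demo_path)
print(result)
```

If the guess is wrong the text may decode without an error but show the wrong characters, so it is worth checking a few accented words by eye.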

Python read text

I am simply trying to read a text file that has 4000+ lines of nouns in a single column, and I’m getting an error:
Traceback (most recent call last):
  File "/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/nouns.py", line 4, in <module>
    for i in nouns_file:
  File "/var/containers/Bundle/Application/107074CD-03B1-4FB3-809A-CBD44D6CF245/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/encodings/ascii.py", line 27, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2241: ordinal not in range(128)
With code:
with open("nounlist.txt", "r") as nouns_file:
    for i in nouns_file:
        print(i)
I’m not sure what’s causing this. I would think that it would just output all of the nouns from my nounlist.txt file.
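The traceback shows the file being decoded with the 'ascii' codec, so open() picked ASCII as its default encoding in this environment, while byte 0xc3 is typical of UTF-8 encoded accented letters. A minimal sketch, assuming nounlist.txt is UTF-8 (the question does not confirm this); the demo file and its contents are made up:

```python
import os
import tempfile

# Create a small UTF-8 stand-in for nounlist.txt (hypothetical contents).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("apple\ncafé\nzebra\n".encode("utf-8"))
    path = tmp.name

# Decoding as ASCII reproduces the error: "é" is stored as bytes 0xc3 0xa9.
try:
    with open(path, "r", encoding="ascii") as nouns_file:
        nouns_file.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Naming the encoding explicitly avoids depending on the platform default.
with open(path, "r", encoding="utf-8") as nouns_file:
    nouns = [line.rstrip("\n") for line in nouns_file]

os.remove(path)
print(ascii_failed, nouns)
```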

Python num2words bug in french translation

I am using the num2words library for Python to translate numbers into French words.
However, I have an encoding issue in the following case:
num2words((5.05),lang='fr')
output:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/num2words/__init__.py", line 50, in num2words
    return converter.to_cardinal(number)
  File "/usr/local/lib/python2.7/dist-packages/num2words/base.py", line 93, in to_cardinal
    return self.to_cardinal_float(value)
  File "/usr/local/lib/python2.7/dist-packages/num2words/base.py", line 127, in to_cardinal_float
    out.append(str(self.to_cardinal(curr)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
It seems to be coming from the "zéro" string. But I do not have the problem when there is no dot:
num2words((0),lang='fr')
output:
u'z\xe9ro'
Can you please help?
Thanks!
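The traceback points at str(self.to_cardinal(curr)): in Python 2, calling str() on a unicode object encodes it with the default ASCII codec, which cannot represent "é". The sketch below reproduces the same kind of failure in Python 3 with an explicit encode. It does not call num2words, and the variable names are illustrative only:

```python
word = "z\xe9ro"  # "zéro", the value shown in the question

# Explicitly encoding to ASCII fails the same way the implicit
# Python 2 str() conversion did.
try:
    word.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# Encoding to UTF-8 works, because UTF-8 can represent "é".
encoded = word.encode("utf-8")
print(message)
print(encoded)
```

This would also explain why the integer case works: there the unicode result is returned directly and never passed through str().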

representing µs in Python 2.7

I'm parsing a CSV and writing part of its contents to an xls file using xlwt.
Every time µs pops up in the original file, I get a UnicodeDecodeError from xlwt:
  File "C:\SW_DevSandbox\E2\FlightTestInstrumentation\ICDforFTI\ICDforFTI.py", line 243, in generateICD
    icd.write(icdLine,icdTitle.index('Unit'),entry['Unit'])
  File "C:\espressoE2\tools\OpenVIB\1.2\python\lib\site-packages\xlwt\Worksheet.py", line 1030, in write
    self.row(r).write(c, label, style)
  File "C:\espressoE2\tools\OpenVIB\1.2\python\lib\site-packages\xlwt\Row.py", line 240, in write
    StrCell(self.__idx, col, style_index, self.__parent_wb.add_str(label))
  File "C:\espressoE2\tools\OpenVIB\1.2\python\lib\site-packages\xlwt\Workbook.py", line 326, in add_str
    return self.__sst.add_str(s)
  File "C:\espressoE2\tools\OpenVIB\1.2\python\lib\site-packages\xlwt\BIFFRecords.py", line 24, in add_str
    s = unicode(s, self.encoding)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 0: invalid start byte
I think the root problem is the following:
In python 3, I can easily represent µs:
>>> '\xb5s'
'µs'
>>>
In python 2, apparently not:
>>> '\xb5s'
'\xb5s'
>>> u'\xb5s'
u'\xb5s'
>>> unicode('\xb5s')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 0: ordinal not in range(128)
>>> unicode('\xb5s','utf8')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "C:\espressoE2\tools\OpenVIB\1.2\python\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 0: invalid start byte
>>>
Edit: print u'\xb5s' works in Python 2, thanks @cdarke. But print does not solve the problem, because it does not give me an internal representation I can then feed to xlwt.
end of Edit.
So how can I represent µs in Python 2?
Notepad++ displays the CSV file fine, with µs. The "Encoding" menu shows its encoding as "ANSI", and if I change it to "UTF-8" I start seeing "B5" all over the text.
Python 2 Unicode has no encoding called "ANSI".
Is there a Python 2 Unicode encoding equivalent to what Notepad++ calls "ANSI"?
What Notepad++ calls "ANSI" is the native Windows code page for your locale. On US Windows that is cp1252, so your file is probably encoded in cp1252 rather than UTF-8. On other versions of Windows, locale.getpreferredencoding() will tell you what Windows considers ANSI.
>>> '\xb5s'.decode('cp1252')
u'\xb5s'
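The same decoding step in Python 3 terms, as a sketch: the raw bytes from the file are decoded with cp1252 instead of UTF-8. That the CSV is cp1252 is the answer's assumption, not something the question confirms.

```python
import locale

raw = b"\xb5s"  # the bytes as they appear in the "ANSI" file

# 0xb5 on its own is not valid UTF-8, which is why the utf8 decode failed.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# cp1252 maps byte 0xb5 to the micro sign U+00B5.
unit = raw.decode("cp1252")

# The encoding the current platform prefers; on Windows this is the "ANSI" code page.
preferred = locale.getpreferredencoding(False)
print(utf8_ok, unit, preferred)
```

Once decoded, the value is a text string rather than raw bytes, which is the kind of internal representation the question was asking for.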

UnicodeDecodeError reading string in CSV

I'm having a problem reading some characters in Python.
I have a CSV file in UTF-8 format, and when my script reads this value:
Preußen Münster-Kaiserslautern II
I get this error:
Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 515, in __call__
    handler.get(*groups)
  File "/Users/fermin/project/gae/cuotastats/controllers/controllers.py", line 50, in get
    f.name = unicode( row[1])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
I tried to use Unicode functions and convert string to Unicode, but I haven't found the solution. I tried to use sys.setdefaultencoding('utf8') but that doesn't work either.
Try the unicode_csv_reader() generator described in the Python 2 csv module docs; it decodes each UTF-8 cell to unicode as it yields the rows.
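unicode_csv_reader() is a Python 2 recipe. In Python 3 the csv module works on text, so a comparable result comes from opening the file with an explicit encoding. A sketch, assuming the data is UTF-8 as the question states; the in-memory sample row and its columns are invented for the demo:

```python
import csv
import io

# Invented sample: one UTF-8 encoded CSV row containing the team name from the question.
raw = "1,Preußen Münster-Kaiserslautern II\n".encode("utf-8")

# Wrap the bytes in a text layer; for a real file this would be
# open(path, encoding="utf-8", newline="").
text_stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline="")

# csv.reader now yields already-decoded str cells.
rows = list(csv.reader(text_stream))
name = rows[0][1]
print(name)
```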
