How to write unicode characters to files correctly

If I run the following code:
text = 'سلام عزیزم! عزیزم سلام!'
with open('temp.txt', 'w') as out_file:
    print(text)
    out_file.write(text)
with open('temp.txt', 'r') as in_file:
    print(in_file.read())
I get this output:
Traceback (most recent call last):
سلام عزیزم! عزیزم سلام!
File "Z:/my files/projects/programing/python/courseAssist/gui.py", line 190, in <module>
out_file.write(text)
File "C:\Users\aran\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-3: character maps to <undefined>
How can I fix it?

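Specify the encoding explicitly by passing encoding='utf-8' to open(). Without it, open() uses the platform default (cp1252 in the traceback above), which cannot represent these characters: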
text = 'سلام عزیزم! عزیزم سلام!'
with open('temp.txt', 'w', encoding='utf-8') as out_file:
    print(text)
    out_file.write(text)
with open('temp.txt', 'r', encoding='utf-8') as in_file:
    print(in_file.read())
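A minimal sketch of the same round trip that can be run anywhere. The temporary directory is an assumption added for illustration so the example does not touch the working directory; `locale.getpreferredencoding(False)` reports the encoding that open() would typically fall back to when none is given.

```python
import locale
import os
import tempfile

text = 'سلام عزیزم! عزیزم سلام!'

# Encoding that open() typically falls back to when none is passed
# (cp1252 on the Windows machine in the traceback above).
default_encoding = locale.getpreferredencoding(False)

# Illustrative temporary location; the original code writes temp.txt
# in the current directory instead.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'temp.txt')
    with open(path, 'w', encoding='utf-8') as out_file:
        out_file.write(text)
    with open(path, 'r', encoding='utf-8') as in_file:
        round_trip = in_file.read()

print(default_encoding)
print(round_trip == text)
```

Passing the same encoding on both the write and the read side is what makes the result independent of the machine's locale.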

Related

UnicodeEncodeError: 'charmap' codec can't encode character '\u4e00' in position 28: character maps to <undefined>

I am trying to read a .csv file in Python.
The .csv contains Chinese (e.g. 三) and Japanese (e.g. さん) characters.
Here is how I am reading the file:
import csv

fname = "data.csv"
rows = []
with open(fname, 'r', encoding="utf-8") as file:
    csvreader = csv.reader(file)
    for row in csvreader:
        rows.append(row)
print(rows)
It gives me this error even though the file is already decoded as UTF-8:
Traceback (most recent call last):
File "c:\Users\m8\Desktop\programing_stuff\python-stuff\handwritten_kanji_recognition - 30-08-2022\app.py", line 25, in <module>
print(rows)
File "C:\Users\m8\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u4e00' in position 28: character maps to <undefined>
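The traceback points at print(rows), not at the read: the file decodes fine, but the console stream encodes its output with cp1252. One possible workaround, sketched here under that assumption, is to escape whatever the console encoding cannot represent before printing. The helper name `safe_for_console` and the sample rows are hypothetical, not part of the question.

```python
import sys


def safe_for_console(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by backslash escapes such as \\u4e00."""
    # Fall back to the console's own encoding, then to UTF-8.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "utf-8"
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


# Hypothetical rows standing in for the contents of data.csv.
rows = [["三", "さん"]]
print(safe_for_console(repr(rows)))
```

On Python 3.7 and later, calling `sys.stdout.reconfigure(encoding="utf-8")` once at startup is another option, provided the terminal itself can display UTF-8.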

Program crashes during reading text file

def process_file(self):
    error_flag = 0
    line_count = 0
    log_file = self.file_name
    pure_name = log_file.strip()
    # print('Before opening file ', pure_name)
    logfile_in = open(pure_name, 'r')  # Read file
    lines = logfile_in.readlines()
    # print('After reading file entries ', pure_name)
Error Message
Traceback (most recent call last):
File "C:\Users\admin\PycharmProjects\BackupLogCheck\main.py", line 49, in <module>
backupLogs.process_file()
File "C:\Users\admin\PycharmProjects\BackupLogCheck\main.py", line 20, in process_file
lines = logfile_in.readlines()
File "C:\Users\admin\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 350: character maps to <undefined>
Process finished with exit code 1
Line 49 is where I call the method above, and I have traced the crash to the point where the file is read. I have checked the file; it contains only text. I don't know whether it contains characters that cannot be handled when reading the entries. I am running on Windows 10.
I am new to Python. Any suggestions on how to find and correct the issue?
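Pass the variable itself to open() rather than the quoted string 'pure_name' (which would look for a file literally named pure_name), and state the encoding explicitly. Byte 0x90 has no mapping in cp1252, so the file is not in the Windows default encoding. Assuming it is UTF-8:
logfile_in = open(pure_name, 'r', encoding='utf-8')  # Read file
lines = logfile_in.readlines()
print(lines)
Output for a two-line test file:
['test line one\n', 'test line two']
or
logfile_in = open(pure_name, 'r', encoding='utf-8')  # Read file
lines = logfile_in.readlines()
for line in lines:
    print(line, end='')
output
test line one
test line two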
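If the file's encoding is unknown, one way to find it is to try a few candidates in order. This is only a sketch: the helper name `read_with_fallback`, the candidate list, and the sample file are assumptions for illustration. Note that latin-1 maps every byte, so it never fails and should be treated as a last resort that may show the wrong characters.

```python
import os
import tempfile


def read_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes the whole file."""
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as handle:
                return handle.read(), encoding
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the file; try the next one
    raise ValueError("none of the candidate encodings could decode " + path)


# Demo with a hypothetical log file containing byte 0x90, which is not
# valid UTF-8 in this position and has no mapping in cp1252.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.log")
    with open(sample_path, "wb") as raw:
        raw.write(b"backup ok \x90 done\n")
    text, used_encoding = read_with_fallback(sample_path)

print(used_encoding)
```

The position reported in the UnicodeDecodeError (350 in the traceback above) is a byte offset, so opening the file in 'rb' mode and inspecting the bytes around that offset is another way to see what the offending character is.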

having problems in module

I am making a Discord bot. Here is the code and the error:
f = open("rules.txt","r")
rules = f.readlines()
error:
Traceback (most recent call last):
File "C:\Users\Windows10\OneDrive\Desktop\YourBot\bot.py", line 8, in <module>
rules = f.readlines()
File "C:\Users\Windows10\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 7: character maps to <undefined>
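Please help me.
Try one of the following.
First, state the file's encoding explicitly:
f = open('rules.txt', 'r', encoding='utf8')
rules = f.readlines()
Second, if the encoding is unknown, skip the bytes that cannot be decoded (this silently drops them from the text):
f = open('rules.txt', 'r', errors = 'ignore')
rules = f.readlines()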
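To make the trade-off of the second option visible, here is a small sketch comparing errors='ignore' with errors='replace'. The file content is made up for the demonstration and cp1252 is passed explicitly so the result does not depend on the machine's default.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rules.txt")
    # Hypothetical content: byte 0x81 has no mapping in cp1252.
    with open(path, "wb") as raw:
        raw.write(b"rule \x81 one\n")

    # 'ignore' drops the undecodable byte without any trace.
    with open(path, "r", encoding="cp1252", errors="ignore") as f:
        ignored = f.readlines()

    # 'replace' substitutes U+FFFD, so the damage stays visible.
    with open(path, "r", encoding="cp1252", errors="replace") as f:
        replaced = f.readlines()

print(ignored == ["rule  one\n"])
print(replaced == ["rule \ufffd one\n"])
```

Using 'replace' while debugging makes it easier to spot where the file's real encoding differs from the one being used.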

Converting JSON file to CSV file

I am trying to convert a JSON file into a CSV file. My code is down below. However, I keep getting this error:
Traceback (most recent call last):
File "C:\Users\...\PythonParse.py", line 42, in <module>
writer.writerow(data)
File "C:\Documents and Settings\...\Python37\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 38409-38412: character maps to <undefined>
import json
import gzip
import csv
outfile = open("VideoGamesMeta.csv","w")
writer = csv.writer(outfile)
data = []
items = []
names = []
checkItems = False;
checkUsers = False;
numItems = []
numUsers = []
for line in open("meta_Video_Games.json","r",encoding="utf-8"):
    results = (json.loads(line))
    if 'title' in results:
        if 'asin' in results:
            name = results['title']
            item = results['asin']
            data = [item,name]
            writer.writerow(data)
            items.append(item)
            names.append(name)
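The traceback shows the failure in writer.writerow, and the output file is the one opened without an encoding. A possible fix, sketched here with made-up sample records and a hypothetical helper name `json_lines_to_csv`, is to open the CSV with encoding='utf-8' and newline='' (the latter is what the csv module documentation recommends for files passed to csv.writer).

```python
import csv
import json
import os
import tempfile


def json_lines_to_csv(json_path, csv_path):
    """Write one (asin, title) CSV row per JSON line that has both keys."""
    with open(json_path, "r", encoding="utf-8") as infile, \
            open(csv_path, "w", encoding="utf-8", newline="") as outfile:
        writer = csv.writer(outfile)
        for line in infile:
            record = json.loads(line)
            if "title" in record and "asin" in record:
                writer.writerow([record["asin"], record["title"]])


# Demo with hypothetical records; the second one lacks a title and is skipped.
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "sample.json")
    csv_path = os.path.join(tmp_dir, "sample.csv")
    with open(json_path, "w", encoding="utf-8") as f:
        f.write(json.dumps({"asin": "A1", "title": "ゼルダ"}, ensure_ascii=False) + "\n")
        f.write(json.dumps({"asin": "A2"}) + "\n")
    json_lines_to_csv(json_path, csv_path)
    with open(csv_path, "r", encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f))

print(len(rows))
```

Using with blocks also guarantees the output file is flushed and closed, which the original script never does explicitly.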

how to write my terminal in a text file using python

Part of my code is below. I want to export the terminal output to a text file, but I get the error below:
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-2-c7d647fa741c> in <module>()
34 text_file = open("Output.txt", "w")
35
---> 36 text_file.write(data)
37 #print (data)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 150-151: ordinal not in range(128)
# data is multi line text
data = ''.join(soup1.findAll('p', text=True))
text_file = open("Output.txt", "w")
text_file.write(data)
# print (data)
Encode your text before you write it to the file (this works on Python 2, or on Python 3 if the file is opened in binary mode 'wb'):
text_file.write(data.encode("utf-8"))
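On Python 3, a file opened in text mode accepts only str, so writing encoded bytes to it raises a TypeError. A sketch of the Python 3 equivalent, assuming a made-up multi-line string in place of the scraped text, passes the encoding to open() and writes the str directly:

```python
import os
import tempfile

# Hypothetical multi-line text standing in for the scraped paragraphs.
data = "caf\u00e9 \u2013 r\u00e9sum\u00e9\nsecond line"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Output.txt")
    # The file object does the encoding; no manual .encode() is needed.
    with open(path, "w", encoding="utf-8") as text_file:
        text_file.write(data)
    # Read the raw bytes back to confirm they are valid UTF-8.
    with open(path, "rb") as raw:
        written_bytes = raw.read()

decoded = written_bytes.decode("utf-8")
print(len(written_bytes) > len(data))
```

The byte count is larger than the character count because each non-ASCII character takes two or three bytes in UTF-8.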
