Encode/decode documents to base64 dynamically

How do I encode the PDF and Word files in a folder to base64, then decode them and save the results into the same folder?
The PDF and Word files are generated dynamically through a web service.
I would like to use Python to do so.
I tried this:
base64.encode(open("hello.pdf"), open("hello1.b64", "w"))
But it gives this error:
Traceback (most recent call last):
  File "sample.py", line 7, in <module>
    base64.encode(open("hello.pdf"), open("hello1.b64", "w"))
  File "C:\Python34\lib\base64.py", line 496, in encode
    s = input.read(MAXBINSIZE)
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1340: character maps to <undefined>

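Use the base64 module, which is included in the standard library; see its documentation in the Python standard library reference. The error in the question comes from opening the files in text mode: Python then tries to decode the PDF bytes as cp1252 text before base64 ever sees them. Open the input with "rb" and the output with "wb" instead.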
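As a rough sketch of the folder workflow asked about above: the helper names encode_folder and decode_folder, the ".b64" suffix, the "decoded_" prefix, and the list of extensions are all assumptions made for illustration, not anything prescribed by the base64 module. Only base64.encode and base64.decode are the module's own functions.

```python
import base64
from pathlib import Path

# Hypothetical choice of extensions; adjust to whatever the web service produces.
DOC_SUFFIXES = {".pdf", ".doc", ".docx"}


def encode_folder(folder):
    """Write a <name>.b64 file next to each PDF/Word file in folder."""
    written = []
    # sorted() takes a snapshot, so files created below are not re-visited.
    for src in sorted(Path(folder).iterdir()):
        if src.is_file() and src.suffix.lower() in DOC_SUFFIXES:
            dst = src.with_name(src.name + ".b64")
            # Binary mode on both sides: base64.encode reads and writes bytes.
            with src.open("rb") as fin, dst.open("wb") as fout:
                base64.encode(fin, fout)
            written.append(dst)
    return written


def decode_folder(folder, prefix="decoded_"):
    """Decode each *.b64 file in folder back into a document in the same folder."""
    written = []
    for src in sorted(Path(folder).glob("*.b64")):
        # "hello.pdf.b64" -> "decoded_hello.pdf"; the prefix avoids
        # overwriting the original document.
        dst = src.with_name(prefix + src.stem)
        with src.open("rb") as fin, dst.open("wb") as fout:
            base64.decode(fin, fout)
        written.append(dst)
    return written
```

Calling encode_folder on the folder and then decode_folder on the same folder should leave a byte-for-byte copy of each document next to the original. If the documents arrive continuously from the web service, the same two functions can be run again on each new batch.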

Related

unable to debug error produced while reading csv in python

I am trying to read a CSV file. The code I've written below gives an error (shown after the code block). I'm not sure what I am missing or doing wrong.
import csv
file = open('AlfaRomeo.csv')
csvreader = csv.reader(file)
for j in csvreader:
    print(j)
Traceback (most recent call last):
  File "C:\Users\Pratik\PycharmProjects\AkraScraper\Transform_Directory\Developer_Sandbox.py", line 39, in <module>
    for j in csvreader:
  File "C:\Users\Pratik\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 402: character maps to <undefined>

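The error means that your input file contains a byte that cannot be decoded with the codec Python chose (cp1252, according to the traceback). Its value is 0x8d (141 decimal), and it was found at position 402, counting from zero, in the data being decoded. I suggest loading the file in a text editor and searching forward until you find it. So you know what you're looking for, it's in the Extended ASCII code section of https://www.asciitable.com/. Byte 0x8d has no character assigned in cp1252, so the file was probably saved in a different encoding; if so, passing that encoding to open() (for example encoding='utf-8', assuming the file really is UTF-8) avoids the error.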
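If searching in a text editor is awkward, the bytes around the reported position can also be inspected from Python. This is only a sketch: bytes_around is a hypothetical helper, and it assumes the position in the error message is an offset from the start of the file, which should hold when the file is small enough to be decoded in a single chunk.

```python
def bytes_around(path, position, context=20):
    """Return the raw bytes surrounding a byte offset reported in a UnicodeDecodeError."""
    start = max(0, position - context)
    # Binary mode: nothing is decoded here, so this cannot raise UnicodeDecodeError.
    with open(path, "rb") as f:
        f.seek(start)
        # Read the bytes before the offset, the offending byte, and the bytes after it.
        return f.read(position - start + context + 1)
```

For the file in the question this would be called as bytes_around('AlfaRomeo.csv', 402). Seeing the neighbouring bytes often makes it obvious which word contains the problem character, and a pair such as 0xc3 or 0xc4 followed by 0x8d would hint at UTF-8.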

Python read text

I am simply trying to read a text file that has 4000+ lines of nouns, all in a single column, and I’m getting an error:
Traceback (most recent call last):
  File "/private/var/mobile/Library/Mobile Documents/iCloud~com~omz-software~Pythonista3/Documents/nouns.py", line 4, in <module>
    for i in nouns_file:
  File "/var/containers/Bundle/Application/107074CD-03B1-4FB3-809A-CBD44D6CF245/Pythonista3.app/Frameworks/Py3Kit.framework/pylib/encodings/ascii.py", line 27, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2241: ordinal not in range(128)
With code:
with open("nounlist.txt", "r") as nouns_file:
    for i in nouns_file:
        print(i)
I’m not sure what’s causing this. I would think that it would just output all of the nouns from my nounlist.txt file.
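One likely cause, judging from the traceback: the file is being decoded as ASCII, and 0xc3 is a typical first byte of a two-byte UTF-8 sequence (as in é or ü). A minimal sketch of a fix, assuming nounlist.txt is UTF-8 encoded; read_nouns is a hypothetical helper name:

```python
def read_nouns(path, encoding="utf-8"):
    """Return the nouns in the file, one per line, without trailing newlines."""
    # Passing encoding explicitly stops Python from falling back to the
    # platform default (ASCII in the traceback above).
    with open(path, "r", encoding=encoding) as nouns_file:
        return [line.rstrip("\n") for line in nouns_file]
```

Printing each item of read_nouns("nounlist.txt") would then give the output the question expects. If the file turns out not to be UTF-8, the same call raises UnicodeDecodeError again and a different encoding has to be tried.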

json.load() function gives strange 'UnicodeDecodeError: 'ascii' codec can't decode' error

I'm trying to read a JSON file I have saved in a text file using Python's json.load() function. I will later parse the JSON to obtain a specific value.
I keep getting this error message. When I google it, there are no results.
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 85298: ordinal not in range(128)
Here is the full error message:
Traceback (most recent call last):
  File ".../FirstDegreeKanyeScript.py", line 10, in <module>
    data=json.load(data_file)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py", line 265, in load
    return loads(fp.read(),
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 85298: ordinal not in range(128)
Here is my code:
import json
from pprint import pprint
with open("/Users/.../KanyeAllSongs.txt") as data_file:
    data=json.load(data_file)
pprint(data)
I've tried adding data.decode('utf-8') under the json.load, but I still get the same error.
Any ideas what could be the issue?

Specify the encoding in the open call.
# encoding is a keyword argument
with open("/Users/.../KanyeAllSongs.txt", encoding='utf-8') as data_file:
    data=json.load(data_file)

Python ignores encoding argument in favor of cp1252

I have a lengthy json file that contains utf-8 characters (and is encoded in utf-8). I want to read it in python using the built-in json module.
My code looks like this:
dat = json.load(open("data.json"), "utf-8")
I understand that the "utf-8" argument should be unnecessary, as it is assumed as the default. However, I get this error:
Traceback (most recent call last):
  File "winratio.py", line 9, in <module>
    dat = json.load(open("data.json"), "utf-8")
  File "C:\Python33\lib\json\__init__.py", line 271, in load
    return loads(fp.read(),
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 28519: character maps to <undefined>
My question is: Why does python seem to ignore my encoding specification and try to load the file in cp1252?

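The traceback shows the failure inside fp.read(): the file is decoded by the object that open() returned, before json.load() parses anything, so the encoding has to be given when the file is opened. Try this:
import codecs
dat = json.load(codecs.open("data.json", "r", "utf-8"))
In Python 3, the built-in open("data.json", encoding="utf-8") has the same effect.
For tips on writing files with the codecs library, see: Write to UTF-8 file in Python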
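To make the mechanism concrete, here is a small sketch that decodes the same JSON file twice, once as cp1252 and once as UTF-8. The helper names load_json and try_load are hypothetical, and the sketch uses the built-in open() with the encoding keyword rather than codecs.open().

```python
import json


def load_json(path, encoding):
    """Parse a JSON file, decoding its bytes with the given encoding."""
    # The encoding belongs to open(); json.load() only sees already-decoded text.
    with open(path, encoding=encoding) as f:
        return json.load(f)


def try_load(path, encoding):
    """Return the parsed data, or the UnicodeDecodeError raised while reading."""
    try:
        return load_json(path, encoding)
    except UnicodeDecodeError as exc:
        return exc
```

For a UTF-8 file that contains a byte such as 0x90, try_load(path, "cp1252") should hand back the same kind of UnicodeDecodeError as in the question, while try_load(path, "utf-8") returns the parsed data. Some UTF-8 files decode as cp1252 without any error and simply produce wrong characters, so the absence of an exception does not prove that the encoding is right.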

UnicodeDecodeError reading string in CSV

I'm having a problem reading some characters in Python.
I have a CSV file in UTF-8 format. When the script reads this value:
Preußen Münster-Kaiserslautern II
I get this error:
Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 515, in __call__
    handler.get(*groups)
  File "/Users/fermin/project/gae/cuotastats/controllers/controllers.py", line 50, in get
    f.name = unicode( row[1])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
I tried to use Unicode functions and convert string to Unicode, but I haven't found the solution. I tried to use sys.setdefaultencoding('utf8') but that doesn't work either.

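Try the unicode_csv_reader() generator described in the Python 2 csv module docs. It decodes every cell of each row from UTF-8 into a unicode object, so unicode(row[1]) no longer has to fall back to the ASCII codec.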
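The recipe in the docs is written for Python 2. For readers on Python 3, a rough analogue of the same idea is sketched below. It reuses the name unicode_csv_reader for familiarity but is not the documented recipe: it assumes the input is an iterable of UTF-8 encoded byte lines (for example a file opened in "rb" mode) and decodes each line before the csv module sees it.

```python
import csv


def unicode_csv_reader(byte_lines, encoding="utf-8", **kwargs):
    """Yield CSV rows as lists of str from an iterable of encoded byte lines."""
    # Decode each line to text first; csv.reader in Python 3 works on str.
    text_lines = (line.decode(encoding) for line in byte_lines)
    for row in csv.reader(text_lines, **kwargs):
        yield row
```

Extra keyword arguments such as delimiter are passed straight through to csv.reader. When the data comes from a regular file, opening it in text mode with an explicit encoding and newline="" is the simpler route; a generator like this is mainly useful when only bytes are available, for instance from an upload handler.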
