I have installed Postgres.app and started it.
I have pip-installed pypyodbc.
I have copied the hello-world lines from the pypyodbc docs and received the error below. Any ideas what the issue might be?
Here is my code:
from __future__ import print_function
import pypyodbc
import datetime
conn = pypyodbc.connect("DRIVER={psqlOBDC};SERVER=localhost")
And I receive this error:
File "/ob/pkg/python/dan27/lib/python2.7/site-packages/pypyodbc.py", line 975, in ctrl_err
err_list.append((from_buffer_u(state), from_buffer_u(Message), NativeError.value))
File "/ob/pkg/python/dan27/lib/python2.7/site-packages/pypyodbc.py", line 482, in UCS_dec
uchar = buffer.raw[i:i + ucs_length].decode(odbc_decoding)
File "/ob/pkg/python/dan27/lib/python2.7/encodings/utf_32.py", line 11, in decode
return codecs.utf_32_decode(input, errors, True)
UnicodeDecodeError: 'utf32' codec can't decode bytes in position 0-1: truncated data
What am I doing wrong?
Do I need to initialize the DB/tables first somehow? It is a weird error if that is the issue.
I copied your code to my Fedora machine, and it started working when I changed the connection string to something like:
conn = pypyodbc.connect("Driver={PostgreSQL};Server=IP address;Port=5432;Database=myDataBase;Uid=myUsername;Pwd=myPassword;")
You can find more connection strings for PostgreSQL and ODBC at https://connectionstrings.com/postgresql-odbc-driver-psqlodbc/
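A minimal sketch of a full connect-and-query round trip with that style of connection string (the driver name, database, and credentials below are placeholders; the driver name must match an entry in your odbcinst.ini, often "PostgreSQL" or "PostgreSQL Unicode" depending on how the driver was installed):

import pypyodbc

# Placeholder values - adjust the driver name, host, database and credentials
# to match your own ODBC setup.
conn = pypyodbc.connect(
    "Driver={PostgreSQL};Server=localhost;Port=5432;"
    "Database=myDataBase;Uid=myUsername;Pwd=myPassword;"
)
cursor = conn.cursor()
cursor.execute("SELECT version()")
print(cursor.fetchone())
conn.close()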
Python 3.7 on Windows 10. Camelot 0.8.2
I'm using the following code to convert a PDF file to HTML:
import camelot
import os

def CustomScript(args):
    path_to_pdf = "C:\PDFfolder\abc.pdf"
    folder_to_pdf = os.path.dirname(path_to_pdf)
    tables = camelot.read_pdf(os.path.normpath(path_to_pdf), flavor='stream', pages='1-end')
    tables.export(os.path.normpath(os.path.join(folder_to_pdf, "temp", "foo.html")), f='html')
    return CustomScriptReturn.Empty();
I receive the following error at the tables.export line:
"UnicodeEncodeError -'charmap' codec can't encode character '\u2010'
in position y: character maps to undefined.
This code runs without issue on Mac. This error seems to pertain to Windows, which is the environment I will need to run this on.
I have now spent two entire days researching this error ad nauseam. I have tried many of the solutions offered here on Stack Overflow in the several related posts, but the error persists. The problem with the lines of code suggested in those solutions is that they are all arguments to be added to vanilla Python methods, and those arguments are not available in Camelot's export method.
EDIT 1: Updated post to specify which line is throwing the error.
EDIT 2: PDF file used: http://tsbde.texas.gov/78i8ljhbj/Fiscal-Year-2014-Disciplinary-Actions.pdf
EDIT 3: Here is the full traceback from the Windows console:
Traceback (most recent call last):
  File "main.py", line 18, in <module>
    tables.export(os.path.normpath(os.path.join(folder_to_pdf, "foo.html")), f='html')
  File "C:\Users\stpete\AppData\Local\Programs\Python\Python37\lib\site-packages\camelot\core.py", line 737, in export
    self._write_file(f=f, **kwargs)
  File "C:\Users\stpete\AppData\Local\Programs\Python\Python37\lib\site-packages\camelot\core.py", line 699, in _write_file
    to_format(filepath)
  File "C:\Users\stpete\AppData\Local\Programs\Python\Python37\lib\site-packages\camelot\core.py", line 636, in to_html
    f.write(html_string)
  File "C:\Users\stpete\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 5737: character maps to <undefined>
The problem you are facing is related to the method camelot.core.Table.to_html:
def to_html(self, path, **kwargs):
    """Writes Table to an HTML file.

    For kwargs, check :meth:`pandas.DataFrame.to_html`.

    Parameters
    ----------
    path : str
        Output filepath.

    """
    html_string = self.df.to_html(**kwargs)
    with open(path, "w") as f:
        f.write(html_string)
Here, the output file should be opened with UTF-8 encoding, and it is not.
This is my solution, which uses a monkey patch to replace the original Camelot method:
import camelot
import os

# here I define the corrected method
def to_html(self, path, **kwargs):
    """Writes Table to an HTML file.

    For kwargs, check :meth:`pandas.DataFrame.to_html`.

    Parameters
    ----------
    path : str
        Output filepath.

    """
    html_string = self.df.to_html(**kwargs)
    with open(path, "w", encoding="utf-8") as f:
        f.write(html_string)

# monkey patch: I replace the original method with the corrected one
camelot.core.Table.to_html = to_html

def CustomScript(args):
    path_to_pdf = "C:\PDFfolder\abc.pdf"
    folder_to_pdf = os.path.dirname(path_to_pdf)
    tables = camelot.read_pdf(os.path.normpath(path_to_pdf), flavor='stream', pages='1-end')
    tables.export(os.path.normpath(os.path.join(folder_to_pdf, "temp", "foo.html")), f='html')
    return CustomScriptReturn.Empty();
I tested this solution and it works for Python 3.7, Windows 10, Camelot 0.8.2.
You're getting UnicodeEncodeError, which in this case means that the output to be written to the file contains a character that cannot be encoded in the default encoding for your platform, cp1252.
camelot does not seem to support setting an encoding when writing an HTML file.
A workaround might be to set the PYTHONIOENCODING environment variable to "UTF-8" when running your program:
C:\> set PYTHONIOENCODING=UTF-8 && python myprog.py
to force outputting the file(s) with UTF-8 encoding.
I'm trying to display results from a Firebird 3.x database, but I get:
  File "/...../Envs/pos/lib/python3.6/site-packages/fdb/fbcore.py", line 479, in b2u
    return st.decode(charset)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 9: invalid continuation byte
This happens despite my setting UTF-8 everywhere:
# -*- coding: UTF-8 -*-
import os
os.environ["PYTHONIOENCODING"] = "utf8"

from sqlalchemy import *

SERVIDOR = "localhost"
BASEDATOS_1 = "db.fdb"

PARAMS = dict(
    user="SYSDBA",
    pwd="masterkey",
    host="localhost",
    port=3050,
    path=BASEDATOS_1,
    charset='utf-8'
)

firebird = create_engine("firebird+fdb://%(user)s:%(pwd)s@%(host)s:%(port)d/%(path)s?charset=%(charset)s" % PARAMS, encoding=PARAMS['charset'])

def select(eng, sql):
    with eng.connect() as con:
        return eng.execute(sql)

for row in select(firebird, "SELECT * from clientes"):
    print(row)
I had the same problem.
In my situation the database was not in UTF-8.
After setting the correct charset in the connection string, it worked: ?charset=ISO8859_1
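With the PARAMS dict from the question, that amounts to something like this (a sketch; ISO8859_1 is only correct if that is the character set the database was actually created with):

PARAMS["charset"] = "ISO8859_1"  # must match the database's actual character set

firebird = create_engine(
    "firebird+fdb://%(user)s:%(pwd)s@%(host)s:%(port)d/%(path)s?charset=%(charset)s" % PARAMS
)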
I would try to use the module unidecode.
Your script is crashing when it tries to convert, so this module can help you. As the module documentation says:
The module exports a single function that takes an Unicode object
(Python 2.x) or string (Python 3.x) and returns a string (that can be
encoded to ASCII bytes in Python 3.x)
First install it with pip, then try this:
import unidecode
...
if type(line) is unicode:
    line = unidecode.unidecode(line)
I hope it solves your problem.
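Since the traceback in the question comes from Python 3.6, which has no unicode type, applying the same idea to the query results might look roughly like this (a sketch; it only helps once the rows can actually be fetched):

import unidecode

for row in select(firebird, "SELECT * from clientes"):
    # replace non-ASCII characters with ASCII approximations in each string column
    cleaned = [unidecode.unidecode(v) if isinstance(v, str) else v for v in row]
    print(cleaned)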
I can read from a MSSQL database by sending queries in python through pypyodbc.
Mostly unicode characters are handled correctly, but I've hit a certain character that causes an error.
The field in question is of type nvarchar(50) and begins with this character "" which renders for me a bit like this...
-----
|100|
|111|
-----
If that number is hex 0x100111, then it's the character Supplementary Private Use Area-B U+100111. Interestingly, if it's binary 0b100111, then it's an apostrophe. Could it be that the wrong encoding was used when the data was uploaded? This field stores part of a Chinese postal address.
The error message includes
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 0-1: unexpected end of data
Here it is in full...
Traceback (most recent call last):
  File "question.py", line 19, in <module>
    results.fetchone()
  File "/VIRTUAL_ENVIRONMENT_DIR/local/lib/python2.7/site-packages/pypyodbc.py", line 1869, in fetchone
    value_list.append(buf_cvt_func(from_buffer_u(alloc_buffer)))
  File "/VIRTUAL_ENVIRONMENT_DIR/local/lib/python2.7/site-packages/pypyodbc.py", line 482, in UCS_dec
    uchar = buffer.raw[i:i + ucs_length].decode(odbc_decoding)
  File "/VIRTUAL_ENVIRONMENT_DIR/lib/python2.7/encodings/utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 0-1: unexpected end of data
Here's some minimal reproducing code...
import pypyodbc

connection_string = (
    "DSN=sqlserverdatasource;"
    "UID=REDACTED;"
    "PWD=REDACTED;"
    "DATABASE=obi_load")

connection = pypyodbc.connect(connection_string)
cursor = connection.cursor()

query_sql = (
    "SELECT address_line_1 "
    "FROM address "
    "WHERE address_id = 'REDACTED' ")

with cursor.execute(query_sql) as results:
    row = results.fetchone()  # This is the line that raises the error.
    print row
Here is a chunk of my /etc/freetds/freetds.conf
[global]
; tds version = 4.2
; dump file = /tmp/freetds.log
; debug flags = 0xffff
; timeout = 10
; connect timeout = 10
text size = 64512
[sqlserver]
host = REDACTED
port = 1433
tds version = 7.0
client charset = UTF-8
I've also tried with client charset = UTF-16 and with omitting that line altogether.
Here's the relevant chunk from my /etc/odbc.ini
[sqlserverdatasource]
Driver = FreeTDS
Description = ODBC connection via FreeTDS
Trace = No
Servername = sqlserver
Database = REDACTED
Here's the relevant chunk from my /etc/odbcinst.ini
[FreeTDS]
Description = TDS Driver (Sybase/MS SQL)
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
CPTimeout =
CPReuse =
UsageCount = 1
I can work around this issue by fetching results in a try/except block, throwing away any rows that raise a UnicodeDecodeError, but is there a solution? Can I throw away just the undecodable character, or is there a way to fetch this line without raising an error?
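Roughly, that workaround looks like this (a sketch; it assumes the cursor still advances past the bad row when the conversion fails):

rows = []
while True:
    try:
        row = results.fetchone()
    except UnicodeDecodeError:
        continue  # discard the row whose text can't be decoded and keep going
    if row is None:
        break
    rows.append(row)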
It's not inconceivable that some bad data has ended up on the database.
I've Googled around and checked this site's related questions, but have had no luck.
I fixed the issue myself by calling
conn.setencoding('utf-8')
immediately before creating a cursor, where conn is the connection object.
I was fetching tens of millions of rows with fetchall(), in the middle of a transaction that would have been extremely expensive to undo manually, so I couldn't afford to simply skip the invalid ones.
Source where I found the solution: https://github.com/mkleehammer/pyodbc/issues/112#issuecomment-264734456
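For placement, a rough sketch (this assumes your connection object exposes setencoding, which is the API discussed in the linked pyodbc issue):

connection = pypyodbc.connect(connection_string)
connection.setencoding('utf-8')  # assumed available, per the linked pyodbc issue
cursor = connection.cursor()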
This problem was eventually worked around. I suspect the cause was that text in one encoding was hammered into a field with a different declared encoding through some hacky method when the table was being set up.
I am using the Azure Python SDK to upload an image file as an Azure Block Blob. I'd like to use the "put_block_blob_from_bytes" method, and not the "put_block_blob_from_file" method.
I am getting the following error on the last line of code:
"UnicodeDecodeError was unhandled by user code
Message: 'ascii' codec can't decode byte 0x89 in position 0: ordinal not in range(128)"
It seems I need to change the content encoding to "utf-8" somewhere, but I can't figure out the correct place to put this in the method signature for "put_block_blob_from_bytes".
I tried this, but still receive the same error:
blob_service.put_block_blob_from_bytes("testcontainer", "myimage.png", data, 0, None, "utf-8")
Here is the full code sample. Note: I removed the storage account name and key for the sake of publishing.
from azure.storage.blob import BlobService

azureStorageAccountName = ""  # REMOVED for this question
azureStorageAccountKey = ""   # REMOVED for this question

with open("c:\\temp\\image.png", "rb") as f:
    data = f.read()

blob_service = BlobService(account_name=azureStorageAccountName, account_key=azureStorageAccountKey)
blob_service.put_block_blob_from_bytes("testcontainer", "myimage.png", data)
Thank you!
I ran:
pip install azure --upgrade
This upgraded a few components. I then ran my code again, and everything worked. Thanks to Gaurav Mantri for the tip to make sure I had the latest version of the SDK.
I am trying to make an app similar to StumbleUpon, using Python as a back end, for a personal project. From the database I retrieve a website name, and then I open that website with webbrowser.open("http://www.website.com"). Sounds pretty straightforward, right? But there is a problem. When I try to open the website with webbrowser.open("website.com"), it returns the following error:
File "fetchall.py", line 18, in <module>
webbrowser.open(x)
File "/usr/lib/python2.6/webbrowser.py", line 61, in open
if browser.open(url, new, autoraise):
File "/usr/lib/python2.6/webbrowser.py", line 190, in open
for arg in self.args]
TypeError: expected a character buffer object
Here is my code:
import sqlite3
import webbrowser
conn = sqlite3.connect("websites.sqlite")
cur = conn.cursor()
cur.execute("SELECT WEBSITE FROM COLUMN")
x = cur.fetchmany(1)
webbrowser.open(x)
EDIT
Okay, thanks for the reply, but now I'm receiving this: "Error showing URL: Error stating file '/home/user/(u'http:bbc.co.uk,)': No such file or directory".
What's going on?
webbrowser.open is expecting a character buffer (a string), but fetchmany returns a list of rows, and each row is a tuple. So you need the first column of the first row: webbrowser.open(x[0][0]) should do the trick.
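Putting it together with the code from the question, a minimal sketch (same table and column names as above, using fetchone since only one row is wanted):

import sqlite3
import webbrowser

conn = sqlite3.connect("websites.sqlite")
cur = conn.cursor()
cur.execute("SELECT WEBSITE FROM COLUMN")

row = cur.fetchone()         # a single row tuple, e.g. (u'http://www.bbc.co.uk',)
if row is not None:
    webbrowser.open(row[0])  # pass the URL string itself, not the tuple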