I am trying to use the ceODBC library to improve some query times, but I have a problem getting started with it.
I import the library, connect, and execute a SELECT statement, but when I run cursor.fetchall(), or similar, I get the error below.
The error seems to happen only with labels that contain spaces or special characters:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 26: invalid start byte
Sample code:
import ceODBC
conn = ceODBC.connect('cnx string', autocommit=False)
cursor = conn.cursor()
cursor.execute("select [name], [label] from table1")
# error because the label column holds a value like "héllo world"
# neither of these works:
for row in cursor:
    print(row)
print(cursor.fetchall())
I looked for decode/encode methods on ceODBC but found none.
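Since 0xC1 can never be a valid UTF-8 start byte, the bytes in that column are most likely in a single-byte encoding such as cp1252/Latin-1. I don't know of a decoding hook in ceODBC, but pyodbc exposes one; here is a minimal sketch of the same query with pyodbc, assuming the data really is cp1252 (the connection string is a placeholder):

import pyodbc

conn = pyodbc.connect('cnx string', autocommit=False)
# Tell pyodbc how to decode CHAR/VARCHAR bytes returned by the driver.
conn.setdecoding(pyodbc.SQL_CHAR, encoding='cp1252')
cursor = conn.cursor()
cursor.execute("select [name], [label] from table1")
for row in cursor:
    print(row)  # "héllo world" should now decode cleanly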
Related
I am trying to execute this statement without success because I get:
"UnicodeEncodeError: 'ascii' codec can't encode character '\xc9' in position 276: ordinal not in range(128)"
My code is:
import sqlalchemy as sa
from sqlalchemy.sql import text
engine = sa.create_engine("my connection (cannot show it)")
conn = engine.connect()
q = text("SELECT * FROM STORES WHERE CADENA = 'ÉLIAS'")
result = conn.execute(q).fetchall()
print(result)
As you can see, the WHERE clause of the SQL query contains an "É" which cannot be encoded.
What can I do to solve this? If I call .encode('utf-8') on the text clause, I get
AttributeError: 'TextClause' object has no attribute 'encode'
Thanks in advance!
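Two things usually fix this; a hedged sketch below, assuming a MySQL backend (the URL is a placeholder): declare the connection character set on the engine URL, and pass the accented literal as a bound parameter so the driver handles the encoding instead of the ascii codec.

import sqlalchemy as sa
from sqlalchemy.sql import text

# ?charset=utf8 tells the MySQL driver how to encode outgoing queries.
engine = sa.create_engine("mysql+mysqldb://user:pass@host/db?charset=utf8")
conn = engine.connect()

# A bound parameter lets the driver encode 'ÉLIAS' for you.
q = text("SELECT * FROM STORES WHERE CADENA = :cadena")
result = conn.execute(q, {"cadena": u"ÉLIAS"}).fetchall()
print(result)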
I request your kind assistance in tackling an error. I am trying to save MS Access database tables as CSV files using Python, but I am running into an error I do not know how to fix. I have looked through different posts on Stack Overflow and tried their suggestions, but nothing has worked. Please provide your kind assistance.
import pyodbc
import csv
conn_string = ("DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\\Access\\permissions.accdb")
conn = pyodbc.connect(conn_string)
cursor = conn.cursor()
cursor.execute("select * from [Perm_Site Info];")
with open('C:\\Desktop\\Python Files\\Perms_Site_Info.csv','wb') as csvfile:
writer = csv.writer(csvfile)
rest_array = [text.encode("utf8") for text in cursor]
writer.writerow(rest_array)
writer.writerow([i[0] for i in cursor.description])
writer.writerows(cursor)
cursor.close()
conn.close()
print 'All done for now'
The error:
writer.writerows(cursor)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 4: ordinal not in range(128)
You should probably install and use the unicodecsv module.
https://pypi.python.org/pypi/unicodecsv/0.14.1
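A minimal sketch of the export with unicodecsv (Python 2, matching the question; connection string, table, and path are taken from the question above):

import pyodbc
import unicodecsv

conn = pyodbc.connect(conn_string)
cursor = conn.cursor()
cursor.execute("select * from [Perm_Site Info];")

with open('C:\\Desktop\\Python Files\\Perms_Site_Info.csv', 'wb') as csvfile:
    # unicodecsv encodes each cell on write, so u'\u2013' is no longer a problem.
    writer = unicodecsv.writer(csvfile, encoding='utf-8')
    writer.writerow([col[0] for col in cursor.description])  # header row first
    writer.writerows(cursor)

cursor.close()
conn.close()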
I can read from a MSSQL database by sending queries in python through pypyodbc.
Unicode characters are mostly handled correctly, but I've hit a certain character that causes an error.
The field in question is of type nvarchar(50) and begins with this character "" which renders for me a bit like this...
-----
|100|
|111|
-----
If that number is hex 0x100111 then it's the character supplementary private use area-b u+100111. Though interestingly, if it's binary 0b100111 then it's an apostrophe, could it be that the wrong encoding was used when the data was uploaded? This field is storing part of a Chinese postal address.
The error message includes
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 0-1: unexpected end of data
Here it is in full...
Traceback (most recent call last):
  File "question.py", line 19, in <module>
    results.fetchone()
  File "/VIRTUAL_ENVIRONMENT_DIR/local/lib/python2.7/site-packages/pypyodbc.py", line 1869, in fetchone
    value_list.append(buf_cvt_func(from_buffer_u(alloc_buffer)))
  File "/VIRTUAL_ENVIRONMENT_DIR/local/lib/python2.7/site-packages/pypyodbc.py", line 482, in UCS_dec
    uchar = buffer.raw[i:i + ucs_length].decode(odbc_decoding)
  File "/VIRTUAL_ENVIRONMENT_DIR/lib/python2.7/encodings/utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 0-1: unexpected end of data
Here's some minimal reproducing code...
import pypyodbc
connection_string = (
"DSN=sqlserverdatasource;"
"UID=REDACTED;"
"PWD=REDACTED;"
"DATABASE=obi_load")
connection = pypyodbc.connect(connection_string)
cursor = connection.cursor()
query_sql = (
"SELECT address_line_1 "
"FROM address "
"WHERE address_id == 'REDACTED' ")
with cursor.execute(query_sql) as results:
row = results.fetchone() # This is the line that raises the error.
print row
Here is a chunk of my /etc/freetds/freetds.conf
[global]
; tds version = 4.2
; dump file = /tmp/freetds.log
; debug flags = 0xffff
; timeout = 10
; connect timeout = 10
text size = 64512
[sqlserver]
host = REDACTED
port = 1433
tds version = 7.0
client charset = UTF-8
I've also tried client charset = UTF-16 and omitting that line altogether.
Here's the relevant chunk from my /etc/odbc.ini
[sqlserverdatasource]
Driver = FreeTDS
Description = ODBC connection via FreeTDS
Trace = No
Servername = sqlserver
Database = REDACTED
Here's the relevant chunk from my /etc/odbcinst.ini
[FreeTDS]
Description = TDS Driver (Sybase/MS SQL)
Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so
Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so
CPTimeout =
CPReuse =
UsageCount = 1
I can work around this issue by fetching results in a try/except block and throwing away any rows that raise a UnicodeDecodeError (see the sketch below), but is there a better solution? Can I throw away just the undecodable character, or is there a way to fetch this row without raising an error?
It's not inconceivable that some bad data has ended up on the database.
I've Googled around and checked this site's related questions, but have had no luck.
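For reference, a minimal sketch of the row-skipping workaround mentioned above. It assumes pypyodbc has already consumed the bad row by the time it raises, and it silently loses data, hence the question:

rows = []
while True:
    try:
        row = results.fetchone()
    except UnicodeDecodeError:
        continue  # discard the row that failed to decode and carry on
    if row is None:
        break
    rows.append(row)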
I fixed the issue myself by using this:
conn.setencoding('utf-8')
immediately before creating a cursor.
Where conn is the connection object.
I was fetching tens of millions of rows with fetchall(), and in the middle of a transaction that would be extremely expensive to undo manually, so I couldn't afford to simply skip invalid ones.
Source where I found the solution: https://github.com/mkleehammer/pyodbc/issues/112#issuecomment-264734456
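Note that setencoding/setdecoding are pyodbc connection methods (the question above uses pypyodbc, which to my knowledge does not provide them). A sketch of where the calls belong; the exact keyword form depends on your pyodbc version:

import pyodbc

conn = pyodbc.connect(connection_string)
conn.setencoding(encoding='utf-8')  # how string parameters are sent
conn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-16le')  # how results are decoded
cursor = conn.cursor()  # create the cursor only after setting the encodings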
This problem was eventually worked around. I suspect that text in one encoding was hammered into a field with a different declared encoding by some hacky method when the table was set up.
I am having trouble with encoding in Python while using xlrd and MySQLdb.
I am reading an excel file which contains Turkish characters in it.
When I print the value, like print sheet.cell(rownum,19).value, it writes İstanbul to the console, which is correct (Win7, Lucida Console font, encoding cp1254).
However, if I want to insert that value to database like
sql = "INSERT INTO city (name) VALUES('"+sheet.cell(rownum,19).value+"')"
cursor.execute (sql)
db.commit()
gives error as
Traceback (most recent call last):
  File "excel_employer.py", line 112, in <module>
    cursor.execute(sql_deneme)
  File "C:\Python27\lib\site-packages\MySQLdb\cursors.py", line 157, in execute
    query = query.encode(charset)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0130' in position 41: ordinal not in range(256)
If I change the sql as
sql = "INSERT INTO city (name) VALUES('"+sheet.cell(rownum,19).value.encode('utf8')+"')"
the value is inserted without any error but it becomes Ä°stanbul
Could you give me any idea how I can put the value İstanbul into the database as it is?
Just as @Kazark said, maybe the encoding of your MySQL connection is not set.
conn = MySQLdb.connect(
host="localhost",
user="root",
passwd="root",
port=3306,
db="test1",
init_command="set names utf8"
)
Try this when you initialize your MySQL connection in Python. But be sure the content being inserted is UTF-8.
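Equivalently, MySQLdb accepts a charset argument, which issues SET NAMES for you; combined with a parameterized query it also removes the need for the manual .encode('utf8') that produced Ä°stanbul. A sketch under the question's setup:

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="root", passwd="root",
                       port=3306, db="test1",
                       charset="utf8", use_unicode=True)
cursor = conn.cursor()
# Let the driver encode the unicode value; this also avoids SQL injection.
cursor.execute("INSERT INTO city (name) VALUES (%s)",
               (sheet.cell(rownum, 19).value,))
conn.commit()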
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 2: ordinal not in range(128)
I changed my database default to be utf-8, not "latin", but this error still occurs. Why?
This is in my.cnf. Am I doing this wrong? I just want EVERYTHING TO BE UTF-8.
init_connect='SET collation_connection = utf8_general_ci'
init_connect='SET NAMES utf8'
default-character-set=utf8
character-set-server = utf8
collation-server = utf8_general_ci
default-character-set=utf8
MySQLdb.connect(read_default_*) options won't set the character set from default-character-set. You will need to set this explicitly:
MySQLdb.connect(..., charset='utf8')
Or use the equivalent setting in your Django DATABASES settings.
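For Django, that would be the OPTIONS key of the database settings; a sketch with placeholder names and credentials:

# settings.py
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "mydb",
        "USER": "user",
        "PASSWORD": "password",
        "OPTIONS": {"charset": "utf8"},
    }
}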
If you get an exception from Python then it's nothing to do with MySQL -- the error happens before the expression is sent to MySQL. I would presume that the MySQLdb driver doesn't handle unicode.
If you are dealing with the raw MySQLdb interface this will be somewhat annoying (database wrappers like SQLAlchemy will handle this stuff for you), but you might want to create a function like this:
def exec_sql(conn_or_cursor, sql, *args, **kw):
    # Accept either a connection or a cursor.
    if hasattr(conn_or_cursor, 'cursor'):
        cursor = conn_or_cursor.cursor()
    else:
        cursor = conn_or_cursor
    cursor.execute(_convert_utf8(sql),
                   *(_convert_utf8(a) for a in args),
                   **dict((n, _convert_utf8(v)) for n, v in kw.iteritems()))
    return cursor

def _convert_utf8(value):
    # Encode unicode strings to UTF-8 byte strings; pass everything else through.
    if isinstance(value, unicode):
        return value.encode('utf8')
    return value