I'm working with Python and MySQL and I want to verify that a certain entry is compressed in the database, i.e.:
cur = db.getCursor()
cur.execute('''select compressed_column from table where id=12345''')
res = cur.fetchall()
At this point I would like to verify that the entry is compressed (i.e. to work with the data you would have to use select uncompress(compressed_column) ...). Any ideas?
COMPRESS() on MySQL uses zlib, so you can try the following to see if the string is compressed:
try:
    out = s.decode('zlib')
except zlib.error:
    out = s
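One caveat: MySQL stores a non-empty COMPRESS() result as a 4-byte little-endian length of the uncompressed string followed by the zlib stream, so the prefix has to be skipped before decompressing. Below is a minimal sketch of such a check. The names mysql_uncompress and mysql_compress are hypothetical helpers made up for this illustration; the second one only mimics the storage layout so the check can be tried without a database.

```python
import struct
import zlib


def mysql_uncompress(blob):
    """Return the decompressed bytes if blob looks like COMPRESS() output, else None."""
    if len(blob) < 5:
        return None  # too short to hold the 4-byte length prefix plus any zlib data
    # First 4 bytes: length of the uncompressed data, low byte first.
    (expected_len,) = struct.unpack("<I", blob[:4])
    try:
        data = zlib.decompress(blob[4:])
    except zlib.error:
        return None  # the remainder is not a valid zlib stream
    # A matching length makes a false positive very unlikely.
    return data if len(data) == expected_len else None


def mysql_compress(data):
    """Hypothetical helper that mimics the COMPRESS() layout, for the demo only."""
    return struct.pack("<I", len(data)) + zlib.compress(data)


original = b"hello world " * 10
roundtrip = mysql_uncompress(mysql_compress(original))
plain = mysql_uncompress(b"plain text value")
```

Here roundtrip holds the original bytes, while plain is None because the value was never compressed. In the question's code, each fetched column value (res[0][0]) would be the blob to pass in.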
I'm querying a relational database and I need the result as a CSV string. I can't save it to disk because the code is running in a serverless environment (I don't have access to a disk).
Any ideas?
My solution was using PyGreSQL library and defining this function:
import pg
def get_csv_from_db(query, cols):
    """
    Given the SQL #query and the expected #cols,
    a string formatted CSV (containing headers) is returned
    :param str query:
    :param list of str cols:
    :return str:
    """
    connection = pg.DB(
        dbname=my_db_name,
        host=my_host,
        port=my_port,
        user=my_username,
        passwd=my_password)
    header = ','.join(cols) + '\n'
    records_list = []
    for row in connection.query(query).dictresult():
        record = []
        for c in cols:
            record.append(str(row[c]))
        records_list.append(",".join(record))
    connection.close()
    return header + "\n".join(records_list)
Unfortunately this solution expects the column names as input (which is not too bad IMHO) and iterates over the dictionary result in Python code.
Other solutions (especially out-of-the-box ones) using other packages are more than welcome.
This is another solution based on psycopg2 and pandas:
import psycopg2
import pandas as pd
def get_csv_from_db(query):
    """
    Given the SQL #query a string formatted CSV (containing headers) is returned
    :param str query:
    :return str:
    """
    conn = psycopg2.connect(
        dbname=my_db_name,
        host=my_host,
        port=my_port,
        user=my_username,
        password=my_password)  # psycopg2 expects "password", not "passwd"
    cur = conn.cursor()
    cur.execute(query)  # run the query passed in, not the literal string "query"
    df = pd.DataFrame(cur.fetchall(), columns=[desc[0] for desc in cur.description])
    cur.close()
    conn.close()  # a read-only query needs no commit
    return df.to_csv(index=False)  # index=False leaves out the DataFrame row index
I haven't had a chance to test it yet, though.
Here is a different approach from the other answers, using pandas.
I assume you already have a database connection.
I'm using an Oracle database in this example, but the same can be done with the respective library for your relational database.
These two lines do the trick:
df = pd.read_sql(query, con)
df.to_csv("file_name.csv")
If you cannot write to disk, call df.to_csv() without a path; it then returns the CSV as a string instead of writing a file.
Here is a full example using Oracle database:
import cx_Oracle
import pandas as pd
dsn = cx_Oracle.makedsn(ip, port, service_name)
con = cx_Oracle.connect("user", "password", dsn)
query = """select * from YOUR_TABLE"""
df = pd.read_sql(query, con)
df.to_csv("file_name.csv")
PyGreSQL's Cursor has a copy_to method. It accepts a file-like object as its stream (which must have a write() method). io.StringIO meets this condition and does not need access to disk, so it should be possible to do:
import io
csv_io = io.StringIO()
# here connect to your DB and get cursor
cursor.copy_to(csv_io, "SELECT * FROM table", format="csv", decode=True)
csv_io.seek(0)
csv_str = csv_io.read()
Explanation: many Python modules accept file-like objects, meaning you can use io.StringIO() or io.BytesIO() in place of true file handles. These mimic files opened in text and bytes mode respectively. As with files, there is a current read position, so I seek back to the beginning after writing. The last line creates csv_str, which is just a plain str. Remember to adjust the SQL query to your needs.
Note: I have not tested the above code, so please try it yourself and report whether it works as intended.
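The same in-memory file idea also works with the standard csv module and any DB-API cursor, which additionally takes care of quoting values that contain commas. This is only a sketch under assumptions: query_to_csv is a hypothetical helper name, and an in-memory sqlite3 database with a made-up demo table stands in for the real database so the snippet runs on its own.

```python
import csv
import io
import sqlite3


def query_to_csv(cursor, query, params=()):
    """Run query on a DB-API cursor and return the rows as a CSV string with a header."""
    cursor.execute(query, params)
    buffer = io.StringIO(newline="")  # in-memory text "file", nothing touches the disk
    writer = csv.writer(buffer)
    # cursor.description holds one entry per column; the name is its first item.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


# Stand-in database for the demo (assumption: any DB-API connection works the same way).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO demo VALUES (?, ?)", [(1, "plain"), (2, "with, comma")])
csv_text = query_to_csv(cur, "SELECT id, name FROM demo ORDER BY id")
conn.close()
print(csv_text)
```

Using getvalue() avoids the seek(0)/read() pair, and the column names come from the cursor, so they do not have to be passed in.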
This is the Oracle command I am using:
query = '''SELECT DBMS_METADATA.GET_DDL('TABLE', 'MY_TABLE', 'MY_SCHEMA') FROM DUAL;'''
cur.execute(query)
How do I get the DDL of the table using cx_Oracle and Python 3?
Please help, I am unable to extract the DDL.
The following code can be used to fetch the contents of the DDL from dbms_metadata:
import cx_Oracle
conn = cx_Oracle.connect("username/password@hostname/myservice")
cursor = conn.cursor()
def OutputTypeHandler(cursor, name, defaultType, size, precision, scale):
    if defaultType == cx_Oracle.CLOB:
        return cursor.var(cx_Oracle.LONG_STRING, arraysize = cursor.arraysize)
cursor.outputtypehandler = OutputTypeHandler
cursor.execute("select dbms_metadata.get_ddl('TABLE', :tableName) from dual",
        tableName="THE_TABLE_NAME")
text, = cursor.fetchone()
print("DDL fetched of length:", len(text))
print(text)
The use of the output type handler is to eliminate the need to process the CLOB. Without it you would need to do str(lob) or lob.read() in order to get at its contents. Either way, however, you are not limited to 4,000 characters.
cx_Oracle has native support for calling PL/SQL.
Assuming you have connection and cursor object, the snippet will look like this:
binds = dict(object_type='TABLE', name='DUAL', schema='SYS')
ddl = cursor.callfunc('DBMS_METADATA.GET_DDL', keywordParameters=binds, returnType=cx_Oracle.CLOB)
print(ddl)
You don't need to split the DDL in 4k/32k chunks, as Python's str doesn't have this limitation. In order to get the DDL in one chunk, just set returnType to cx_Oracle.CLOB. You can later convert it to str by doing str(ddl).
I have a list of items which I would like to store in my Firebird database.
So far I have written the following code:
Sens=278.3
DSens=1.2
Fc10=3.8
Bw10=60.0
Fc20=4.2
Bw20=90.0
ResultArray = (Sens,DSens,Fc10,Bw10,Fc20,Bw20,t6,t20,Nel,Nsub)
con = fdb.connect(dsn="192.168.0.2:/database/us-database/usdb.gdb", user="sysdba", password="#########")
cur = con.cursor()
InsertStatement="insert into Tosh_Probe (TestResults ) Values (?)"
cur.execute(InsertStatement, (ResultArray,))
con.commit()
Here the TestResults field is a blob field in my database.
This gives a TypeError (???)
What is the correct syntax to store these values in a blob?
Another option I tried is to write the list of items into a StringIO and store that in the database. Now a new entry is made in the database, but no data is added to the blob field.
Here is the code for adding the fields to the StringIO:
ResultArray = StringIO.StringIO()
ResultArray.write = Sens
ResultArray.write = DSens
#ResultArray.close #tried with and without this line but with the same result
I've tested this with Python 3.5.1 and FDB 1.6. The following variants of writing all work (into a blob sub_type text):
import fdb
import io
con = fdb.connect(dsn='localhost:testdatabase', user='sysdba', password='masterkey')
cur = con.cursor()
statement = "insert into blob_test2 (text_blob) values (?)"
cur.execute(statement, ("test blob as string",))
cur.execute(statement, (io.StringIO("test blob as StringIO"),))
streamwrites = io.StringIO()
streamwrites.write("streamed write1,")
streamwrites.write("streamed write2,")
streamwrites.seek(0)
cur.execute(statement, (streamwrites,))
con.commit()
con.close()
The major differences from your code in the case of the writes to StringIO are:
Use of write(...) instead of write = ...
Use of seek(0) to position the stream at the start, otherwise you read nothing, as the stream is positioned after the last write.
I haven't tried binary IO, but I expect that to work in a similar fashion.
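The two points about StringIO can be tried without a database. The sketch below uses only the standard library. Serializing the values as JSON is my own assumption for illustration, not something FDB requires; any text representation would do for a text blob.

```python
import io
import json

# A few of the measured values from the question, packed into a dict.
values = {"Sens": 278.3, "DSens": 1.2, "Fc10": 3.8, "Bw10": 60.0}

stream = io.StringIO()
stream.write(json.dumps(values))  # write(...) is a method call, not an assignment

at_end = stream.read()    # the position is after the last write, so nothing is read
stream.seek(0)            # rewind to the start of the stream
payload = stream.read()   # now the full text comes back

restored = json.loads(payload)
```

After seek(0) the stream can be passed as the parameter of cur.execute(...) as in the tested code above, and json.loads() turns the stored text back into the values when reading the blob.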
My simple test code is listed below. I created the table already and can query it using the SQLite Manager add-in on Firefox, so I know the table and data exist. When I run the query in Python (and using the Python shell) I get the "no such table" error.
def TroyTest(self, acctno):
    conn = sqlite3.connect('TroyData.db')
    curs = conn.cursor()
    v1 = curs.execute('''
    SELECT acctvalue
    FROM balancedata
    WHERE acctno = ? ''', acctno)
    print v1
    conn.close()
When you pass SQLite a non-existing path, it'll happily open a new database for you, instead of telling you that the file did not exist before. When you do that, it'll be empty and you'll instead get a "No such table" error.
You are using a relative path to the database, meaning it'll try to open the database in the current working directory, and that is probably not where you think it is.
The remedy is to use an absolute path instead:
conn = sqlite3.connect('/full/path/to/TroyData.db')
You need to loop over the cursor to see results:
curs.execute('''
    SELECT acctvalue
    FROM balancedata
    WHERE acctno = ? ''', (acctno,))  # parameters must be passed as a sequence
for row in curs:
    print(row[0])
or call fetchone():
print(curs.fetchone())  # prints whole row tuple
The problem is the SQL statement. You must specify the database name followed by the table name:
'''SELECT * FROM db_name.table_name WHERE acctno = ? '''
I am using Ubuntu 9.04
I have installed the following package versions:
unixodbc and unixodbc-dev: 2.2.11-16build3
tdsodbc: 0.82-4
libsybdb5: 0.82-4
freetds-common and freetds-dev: 0.82-4
python2.6-dev
I have configured /etc/unixodbc.ini like this:
[FreeTDS]
Description = TDS driver (Sybase/MS SQL)
Driver = /usr/lib/odbc/libtdsodbc.so
Setup = /usr/lib/odbc/libtdsS.so
CPTimeout =
CPReuse =
UsageCount = 2
I have configured /etc/freetds/freetds.conf like this:
[global]
tds version = 8.0
client charset = UTF-8
text size = 4294967295
I have grabbed pyodbc revision 31e2fae4adbf1b2af1726e5668a3414cf46b454f from http://github.com/mkleehammer/pyodbc and installed it using "python setup.py install"
I have a windows machine with Microsoft SQL Server 2000 installed on my local network, up and listening on the local ip address 10.32.42.69. I have an empty database created with name "Common". I have the user "sa" with password "secret" with full privileges.
I am using the following python code to setup the connection:
import pyodbc
odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()
cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'testing')
DROP TABLE testing
""")
cur.execute('''
CREATE TABLE testing (
id INTEGER NOT NULL IDENTITY(1,1),
myimage IMAGE NULL,
PRIMARY KEY (id)
)
''')
con.commit()
Everything WORKS up to this point. I have used SQLServer's Enterprise Manager on the server and the new table is there.
Now I want to insert some data on the table.
cur = con.cursor()
# using web data for exact reproduction of the error by all.
# I'm actually reading a local file in my real code.
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()
sql = "INSERT INTO testing (myimage) VALUES (?)"
In my original question I was having trouble using cur.execute(sql, (data,)). I have since edited the question because, following Vinay Sajip's answer below (THANKS), I changed it to:
cur.execute(sql, (pyodbc.Binary(data),))
con.commit()
And insertion is working perfectly. I can confirm the size of the inserted data using the following test code:
cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)
Which passes perfectly!!!
Now the problem is on retrieval of the data back.
I am trying the common approach:
cur.execute('SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0]) # transforming buffer object
print 'Original: %d; Returned: %d' % (len(data), len(returned_data))
assert data == returned_data
However that fails!!
Original: 4744611; Returned: 4096
Traceback (most recent call last):
File "/home/nosklo/devel/teste_mssql_pyodbc_unicode.py", line 53, in <module>
assert data == returned_data
AssertionError
I've put all the code above in a single file here, for easy testing of anyone that wants to help.
Now for the question:
I want python code to insert an image file into mssql. I want to query the image back and show it to the user.
I don't care about the column type in mssql. I am using the "IMAGE" column type on the example, but any binary/blob type would do, as long as I get the binary data for the file I inserted back unspoiled. Vinay Sajip said below that this is the preferred data type for this in SQL SERVER 2000.
The data is now being inserted without errors; however, when I retrieve the data, only 4 KB are returned (the data is truncated at 4096 bytes).
How can I make that work?
EDITS: Vinay Sajip's answer below gave me a hint to use pyodbc.Binary on the field. I have updated the question accordingly. Thanks Vinay Sajip!
Alex Martelli's comment gave me the idea of using the DATALENGTH MS SQL function to test if the data is fully loaded on the column. Thanks Alex Martelli !
Huh, just after offering the bounty, I've found out the solution.
You have to use SET TEXTSIZE 2147483647 in the query, in addition to the text size configuration option in /etc/freetds/freetds.conf.
I have used
cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')
And everything worked fine.
What is strange is what the FreeTDS documentation says about the text size configuration option:
default value of TEXTSIZE, in bytes. For text and image datatypes, sets the maximum width of any returned column. Cf. set TEXTSIZE in the T-SQL documentation for your server.
The documentation also says that the maximum value (and the default) is 4,294,967,295. However, when I try to use that value in the query I get an error; the largest number I could use in the query is 2,147,483,647 (about half).
From that explanation I thought that setting this configuration option alone would be enough. It turns out I was wrong: setting TEXTSIZE in the query is what fixed the issue.
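The two limits are the largest unsigned and signed 32-bit integers respectively, which is presumably why the server rejects the larger one (my assumption: SET TEXTSIZE takes a signed 32-bit int). A small sketch of the arithmetic, with a hypothetical helper with_textsize that builds the prefixed query used above:

```python
FREETDS_DOCUMENTED_MAX = 4294967295  # value quoted for the freetds.conf option
QUERY_MAX = 2147483647               # largest value the query accepted

UINT32_MAX = 2**32 - 1  # largest unsigned 32-bit integer
INT32_MAX = 2**31 - 1   # largest signed 32-bit integer


def with_textsize(query, size=INT32_MAX):
    """Prefix a query with SET TEXTSIZE so large text/image columns are not truncated."""
    return "SET TEXTSIZE %d %s" % (size, query)


sql = with_textsize("SELECT myimage FROM testing WHERE id = 1")
print(sql)
```

The resulting string is the same statement passed to cur.execute() above.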
Below is the complete working code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pyodbc
import urllib2
odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()
cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'testing')
DROP TABLE testing
""")
cur.execute('''
CREATE TABLE testing (
id INTEGER NOT NULL IDENTITY(1,1),
myimage IMAGE NULL,
PRIMARY KEY (id)
)
''')
con.commit()
cur = con.cursor()
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()
sql = "INSERT INTO testing (myimage) VALUES (?)"
cur.execute(sql, (pyodbc.Binary(data),))
con.commit()
cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)
cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0])
print 'Original: %d; Returned: %d' % (len(data), len(returned_data))
assert data == returned_data
I think you should be using a pyodbc.Binary instance to wrap the data:
cur.execute('INSERT INTO testing (myimage) VALUES (?)', (pyodbc.Binary(data),))
Retrieving should be
cur.execute('SELECT myimage FROM testing')
print "image bytes: %r" % str(cur.fetchall()[0][0])
UPDATE: The problem is in insertion. Change your insertion SQL to the following:
"""DECLARE @txtptr varbinary(16)
INSERT INTO testing (myimage) VALUES ('')
SELECT @txtptr = TEXTPTR(myimage) FROM testing
WRITETEXT testing.myimage @txtptr ?
"""
I've also updated the mistake I made in using the value attribute in the retrieval code.
With this change, I'm able to insert and retrieve a 320K JPEG image into the database (retrieved data is identical to inserted data).
N.B. The image data type is deprecated, and is replaced by varbinary(max) in later versions of SQL Server. The same logic for insertion/retrieval should apply, however, for the newer column type.
I had a similar 4096 truncation issue on TEXT fields, which SET TEXTSIZE 2147483647 fixed for me, but this also fixed it for me:
import os
os.environ['TDSVER'] = '8.0'