"with" statement works in Windows but not Ubuntu - python

I have a script (see below) that runs perfectly in Windows that I'm trying to move to an Ubuntu environment. I have set up the PostgreSQL database exactly the same, with the exact same tables and usernames, etc. However, when I try to run the script in Ubuntu it fails when it parses the "with" statement.
Here is the "with" statement:
with con:
    cur = con.cursor()
    cur.executemany(final_str, symbols)
I get the following error:
INSERT INTO symbol (ticker, instrument, name, sector, currency, created_date, last_updated_date) VALUES (%s, %s, %s, %s, %s, %s, %s) 502
Traceback (most recent call last):
  File "loadSPX.py", line 60, in <module>
    insert_snp500_symbols(symbols)
  File "loadSPX.py", line 54, in insert_snp500_symbols
    with con:
AttributeError: __exit__
However, if I remove the "with" and change it to the following, it works perfectly:
cur = con.cursor()
cur.executemany(final_str, symbols)
con.commit()
Any ideas what is causing this? Here is the full script below:
#!/usr/bin/python
# -*- coding: utf-8 -*-

import datetime
import lxml.html
import psycopg2 as mdb
import psycopg2.extras
from math import ceil


def obtain_parse_wiki_snp500():
    """Download and parse the Wikipedia list of S&P500
    constituents using requests and libxml.

    Returns a list of tuples to add to the database."""
    # Stores the current time, for the created_at record
    now = datetime.datetime.utcnow()

    # Use libxml to download the list of S&P500 companies and obtain the symbol table
    page = lxml.html.parse('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    symbolslist = page.xpath('//table[1]/tr')[1:503]

    # Obtain the symbol information for each row in the S&P500 constituent table
    symbols = []
    for symbol in symbolslist:
        tds = symbol.getchildren()
        sd = {'ticker': tds[0].getchildren()[0].text,
              'name': tds[1].getchildren()[0].text,
              'sector': tds[3].text}
        # Create a tuple (for the DB format) and append to the grand list
        symbols.append((sd['ticker'], 'stock', sd['name'],
                        sd['sector'], 'USD', now, now))
    return symbols
def insert_snp500_symbols(symbols):
    """Insert the S&P500 symbols into the database."""
    # Connect to the PostgreSQL instance
    db_host = 'localhost'
    db_user = 'sec_user'
    db_pass = 'XXXXXXX'
    db_name = 'securities_master'
    con = mdb.connect(host=db_host, dbname=db_name, user=db_user, password=db_pass)

    # Create the insert strings
    column_str = "ticker, instrument, name, sector, currency, created_date, last_updated_date"
    insert_str = ("%s, " * 7)[:-2]
    final_str = "INSERT INTO symbol (%s) VALUES (%s)" % (column_str, insert_str)
    print final_str, len(symbols)

    # Using the PostgreSQL connection, carry out an INSERT INTO for every symbol
    with con:
        cur = con.cursor()
        cur.executemany(final_str, symbols)
if __name__ == "__main__":
    symbols = obtain_parse_wiki_snp500()
    insert_snp500_symbols(symbols)

Your psycopg2 library on Ubuntu is too old; you need to upgrade to version 2.5 or newer. In older versions connections do not yet support being used as context managers.
See the Psycopg 2.5 release announcement:
Connections and cursors as context managers
A recent DBAPI extension has standardized the use of connections and cursors as context managers: it is now possible to use an idiom such as:
with psycopg2.connect(DSN) as conn:
    with conn.cursor() as curs:
        curs.execute(SQL)
with the intuitive behaviour: when the cursor block exits, the cursor is closed; when the connection block exits normally, the current transaction is committed; if it exits with an exception instead, the transaction is rolled back. In either case the connection is ready to be used again.
If you installed the python-psycopg2 system package you are most likely using 2.4.5; only Utopic Unicorn (14.10) ships a more recent version (2.5.3). To install the newer version from source, you'll need the Python development headers (python-dev) plus the PostgreSQL client library headers (libpq-dev).
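As a quick check, you can confirm which psycopg2 version your Ubuntu interpreter actually picks up before and after upgrading (a minimal sketch; the apt/pip commands are the usual ones and may need adjusting to your environment):
import psycopg2
# Anything below 2.5 lacks context-manager support on connections
# (no __enter__/__exit__), which is exactly the AttributeError above.
print(psycopg2.__version__)
# If it is older than 2.5, upgrade outside Python, for example:
#   sudo apt-get install python-dev libpq-dev
#   pip install --upgrade psycopg2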

Related

Can't store a pdf file in a MySql table

I need to store a PDF file in MySQL. Whether I use escape_string or not, I always get the same error.
b_blob = open(dir + fname_only, "rb")
myblob = b_blob.read()  # <- b'%PDF-1.4\n%\xaa\xab\xac\xad\n4 0 obj\n<<\n/Producer (Apache FOP Version 0.94)\
try:
    conn = mysql.connector.connect( usual stuff )
    cursor = conn.cursor(buffered=True, dictionary=True)
    newblob = conn._cmysql.escape_string(myblob)
    query = """INSERT INTO `mytable` (`storing`) VALUES('%s')""" % (newblob)
    cursor.execute(query)
except Exception as exc:
    Functions.error_handler(exc)
    return
b_blob.close()
...MySQL server version for the right syntax to use near '\n%\xaa\xab\xac\xad\n4 0 obj\n<<\n/Producer (Apache FOP Version 0.94)\n/Creation' at line 1
So it looks like your problem is arising from the quotes at the start of your string. I would consider putting double quotes around the newblob variable. It should look like this:
query = """INSERT INTO `mytable` (`storing`) VALUES("%s")""" %(newblob)
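A more robust option (not part of the original answer; a sketch using the same hypothetical table and column) is to skip manual escaping entirely and pass the blob as a query parameter, letting mysql.connector do the quoting:
# Sketch: parameterized insert, so the connector escapes the bytes itself.
with open(dir + fname_only, "rb") as b_blob:
    myblob = b_blob.read()
query = "INSERT INTO `mytable` (`storing`) VALUES (%s)"
cursor.execute(query, (myblob,))
conn.commit()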

Importing a .db file into a PostgreSQL database

I am currently working on a script to import a .db file into a PostgreSQL database, including the data. Is there any way to do so in Python, without using third-party tools?
You can do it with Django for sure.
python manage.py dumpdata > db.json
Change the database settings to point at the new database, e.g. PostgreSQL.
python manage.py migrate
python manage.py shell
Enter the following in the shell
from django.contrib.contenttypes.models import ContentType
ContentType.objects.all().delete()
python manage.py loaddata db.json
Otherwise, if you want to do it by hand:
You need to install psycopg2
$ pip install psycopg2
Then you connect to Postgres.
import psycopg2
conn = psycopg2.connect("host=localhost dbname=postgres user=postgres")
This is how you insert values.
cur = conn.cursor()
insert_query = "INSERT INTO users VALUES {}".format("(10, 'hello@dataquest.io', 'Some Name', '123 Fake St.')")
cur.execute(insert_query)
conn.commit()
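For anything beyond a throwaway script, a parameterized insert is safer than building the SQL with format(); this sketch reuses the conn and cur objects and the hypothetical users table from above:
# Let psycopg2 handle quoting by passing the row as parameters.
row = (10, 'hello@dataquest.io', 'Some Name', '123 Fake St.')
cur.execute("INSERT INTO users VALUES (%s, %s, %s, %s)", row)
conn.commit()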
Now, with the built-in sqlite3 module you can easily open the SQLite file.
import sqlite3
conn = sqlite3.connect('database.db')
Fetch the data.
r = conn.execute("""SELECT * FROM books""")
r.fetchall()
Here is how to fetch all the table names within your SQLite database:
SELECT name FROM sqlite_master WHERE type = 'table'
sqlite_master can be thought of as a table that contains information about your databases (metadata).
A quick but most likely inefficient way (because it will run 700 queries with 700 separate result sets) to get the list of table names, loop through those tables, and return the rows where columnA = "-":
for row in conn.execute('SELECT name FROM sqlite_master WHERE type = "table" ORDER BY name').fetchall():
    for result in conn.execute('SELECT * FROM ' + row[0] + ' WHERE "columnA" = "-"').fetchall():
        # do something with the results
Here is another approach:
import sqlite3

try:
    conn = sqlite3.connect('/home/rolf/my.db')
except sqlite3.Error as e:
    print('Db not found', str(e))

db_list = []
mycursor = conn.cursor()
for db_name in mycursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'"):
    db_list.append(db_name)

for x in db_list:
    print('Searching', x[0])
    try:
        mycursor.execute('SELECT * FROM ' + x[0] + ' WHERE "columnA" = "-"')
        stats = mycursor.fetchall()
        for stat in stats:
            print(stat, 'found in', x)
    except sqlite3.Error as e:
        continue
conn.close()

Using SQLite with WAL

I've been following the Python documentation's SQLite tutorial, and I managed to create an Employee table and write to it.
import sqlite3

conn = sqlite3.connect('employee.db')
c = conn.cursor()

firstname = "Ann Marie"
lastname = "Smith"
email = "ams@cia.com"

employee = (email, firstname, lastname)
c.execute('INSERT INTO Employee VALUES (?,?,?)', employee)
conn.commit()

# Print the table contents
for row in c.execute("select * from Employee"):
    print(row)

conn.close()
I've been reading about the Write-Ahead Logging, but I can't find a tutorial that explains how to implement it. Can someone provide an example?
I notice Firefox, which uses SQLite, locks the file in such a way that if you attempt to delete the sqlite file while Firefox is running, it fails saying "file is open or being used" (or something similar). How do I achieve this? I'm running Python under Windows 10.
conn = sqlite3.connect('app.db', isolation_level=None)
Set journal mode to WAL:
conn.execute('pragma journal_mode=wal')
Or the other way around (this just shows how to turn WAL mode off):
cur = conn.cursor()
cur.execute('pragma journal_mode=DELETE')
The PRAGMA journal_mode documentation says:
If the journal mode could not be changed, the original journal mode is returned. […]
Note also that the journal_mode cannot be changed while a transaction is active.
So you have to ensure that the database library does not try to be clever and automatically start a transaction.
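Putting this together, a minimal sketch (the file name is just an example) that enables WAL and checks that the pragma actually took effect:
import sqlite3

# isolation_level=None puts sqlite3 into autocommit mode, so no implicit
# transaction is open when the pragma runs.
conn = sqlite3.connect('employee.db', isolation_level=None)

# PRAGMA journal_mode returns the mode that is actually in effect.
mode = conn.execute('pragma journal_mode=wal').fetchone()[0]
assert mode == 'wal', 'journal mode could not be changed: %s' % mode

conn.close()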

Python MySQL: how do I loop through a table and regex-replace a field?

I am trying to iterate a table, fetch the rows in which a field has a pattern, then update the same row with a match group.
The following code runs without error, and the two print lines before the update clause output the correct values. I have followed similar answers to come up with the update clause, and the logic seems right to me. However, the code does not work, i.e., no rows are updated. Where did I go wrong? Thanks.
# -*- coding: utf-8 -*-
import re
import MySQLdb

pattern = re.compile('#(.*)#.*$')

conn = MySQLdb.connect(
    host='localhost', user='root',
    passwd='password', db='j314', charset='utf8')
cursor = conn.cursor()
cursor.execute(
    """select `id`, `created_by_alias` from w0z9v_content where `catid` = 13 AND `created_by_alias` regexp "^#.*#.*$" limit 400""")
aliases = cursor.fetchall()

for alias in aliases:
    newalias = pattern.match(alias[1])
    if newalias.group(1) is not None:
        # print alias[0]
        # print newalias.group(1)
        cursor.execute("""
            update w0z9v_content set created_by_alias = %s where id = %s""", (newalias.group(1), alias[0]))
conn.close
autocommit is probably globally disabled on the server.
Execute either COMMIT after your updates, or SET autocommit=1 at the beginning of the session.
http://dev.mysql.com/doc/refman/5.0/en/commit.html
Also, you're not actually closing the connection, you forgot to call close:
conn.close()
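A minimal adjustment to the loop from the question (a sketch reusing the same table and column names) that applies both fixes:
for alias in aliases:
    newalias = pattern.match(alias[1])
    if newalias and newalias.group(1) is not None:
        cursor.execute(
            "update w0z9v_content set created_by_alias = %s where id = %s",
            (newalias.group(1), alias[0]))

conn.commit()  # make the updates permanent; needed when autocommit is off
conn.close()   # note the parentheses: conn.close alone does nothing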

Using pyodbc on Ubuntu to insert an image field on SQL Server

I am using Ubuntu 9.04
I have installed the following package versions:
unixodbc and unixodbc-dev: 2.2.11-16build3
tdsodbc: 0.82-4
libsybdb5: 0.82-4
freetds-common and freetds-dev: 0.82-4
python2.6-dev
I have configured /etc/unixodbc.ini like this:
[FreeTDS]
Description = TDS driver (Sybase/MS SQL)
Driver = /usr/lib/odbc/libtdsodbc.so
Setup = /usr/lib/odbc/libtdsS.so
CPTimeout =
CPReuse =
UsageCount = 2
I have configured /etc/freetds/freetds.conf like this:
[global]
tds version = 8.0
client charset = UTF-8
text size = 4294967295
I have grabbed pyodbc revision 31e2fae4adbf1b2af1726e5668a3414cf46b454f from http://github.com/mkleehammer/pyodbc and installed it using "python setup.py install"
I have a windows machine with Microsoft SQL Server 2000 installed on my local network, up and listening on the local ip address 10.32.42.69. I have an empty database created with name "Common". I have the user "sa" with password "secret" with full privileges.
I am using the following python code to setup the connection:
import pyodbc
odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)
cur = con.cursor()
cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'testing')
DROP TABLE testing
""")
cur.execute('''
CREATE TABLE testing (
id INTEGER NOT NULL IDENTITY(1,1),
myimage IMAGE NULL,
PRIMARY KEY (id)
)
''')
con.commit()
Everything WORKS up to this point. I have used SQL Server's Enterprise Manager on the server and the new table is there.
Now I want to insert some data on the table.
cur = con.cursor()
# using web data for exact reproduction of the error by all.
# I'm actually reading a local file in my real code.
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()
sql = "INSERT INTO testing (myimage) VALUES (?)"
Here in my original question I was having trouble using cur.execute(sql, (data,)), but I have since edited the question: following Vinay Sajip's answer below (thanks!), I changed it to:
cur.execute(sql, (pyodbc.Binary(data),))
con.commit()
And insertion is working perfectly. I can confirm the size of the inserted data using the following test code:
cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)
Which passes perfectly!!!
Now the problem is on retrieval of the data back.
I am trying the common approach:
cur.execute('SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0]) # transforming buffer object
print 'Original: %d; Returned: %d' % (len(data), len(returned_data))
assert data == returned_data
However that fails!!
Original: 4744611; Returned: 4096
Traceback (most recent call last):
  File "/home/nosklo/devel/teste_mssql_pyodbc_unicode.py", line 53, in <module>
    assert data == returned_data
AssertionError
I've put all the code above in a single file here, for easy testing by anyone who wants to help.
Now for the question:
I want python code to insert an image file into mssql. I want to query the image back and show it to the user.
I don't care about the column type in mssql. I am using the "IMAGE" column type on the example, but any binary/blob type would do, as long as I get the binary data for the file I inserted back unspoiled. Vinay Sajip said below that this is the preferred data type for this in SQL SERVER 2000.
The data is now being inserted without errors, however when I retrieve the data, only 4k are returned. (Data is truncated on 4096).
How can I make that work?
EDITS: Vinay Sajip's answer below gave me a hint to use pyodbc.Binary on the field. I have updated the question accordingly. Thanks Vinay Sajip!
Alex Martelli's comment gave me the idea of using the DATALENGTH MS SQL function to test if the data is fully loaded on the column. Thanks Alex Martelli !
Huh, just after offering the bounty, I found the solution.
You have to use SET TEXTSIZE 2147483647 in the query, in addition to the text size configuration option in /etc/freetds/freetds.conf.
I have used
cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')
And everything worked fine.
Strange is what FreeTDS documentation says about the text size configuration option:
default value of TEXTSIZE, in bytes. For text and image datatypes, sets the maximum width of any returned column. Cf. set TEXTSIZE in the T-SQL documentation for your server.
The configuration docs also say that the maximum value (and the default) is 4,294,967,295. However, when trying to use that value in the query I got an error; the largest number I could use in the query is 2,147,483,647 (half of it, i.e. the maximum signed 32-bit integer).
From that explanation I thought that setting the configuration option alone would be enough. It turns out I was wrong; setting TEXTSIZE in the query fixed the issue.
Below is the complete working code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pyodbc
import urllib2

odbcstring = "SERVER=10.32.42.69;UID=sa;PWD=secret;DATABASE=Common;DRIVER=FreeTDS"
con = pyodbc.connect(odbcstring)

cur = con.cursor()
cur.execute("""
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
          WHERE TABLE_NAME = 'testing')
    DROP TABLE testing
""")
cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1),
    myimage IMAGE NULL,
    PRIMARY KEY (id)
)
''')
con.commit()

cur = con.cursor()
url = 'http://www.forestwander.com/wp-content/original/2009_02/west-virginia-mountains.jpg'
data = urllib2.urlopen(url).read()

sql = "INSERT INTO testing (myimage) VALUES (?)"
cur.execute(sql, (pyodbc.Binary(data),))
con.commit()

cur.execute('SELECT DATALENGTH(myimage) FROM testing WHERE id = 1')
data_inside = cur.fetchone()[0]
assert data_inside == len(data)

cur.execute('SET TEXTSIZE 2147483647 SELECT myimage FROM testing WHERE id = 1')
result = cur.fetchone()
returned_data = str(result[0])
print 'Original: %d; Returned: %d' % (len(data), len(returned_data))
assert data == returned_data
I think you should be using a pyodbc.Binary instance to wrap the data:
cur.execute('INSERT INTO testing (myimage) VALUES (?)', (pyodbc.Binary(data),))
Retrieving should be
cur.execute('SELECT myimage FROM testing')
print "image bytes: %r" % str(cur.fetchall()[0][0])
UPDATE: The problem is in insertion. Change your insertion SQL to the following:
"""DECLARE #txtptr varbinary(16)
INSERT INTO testing (myimage) VALUES ('')
SELECT #txtptr = TEXTPTR(myimage) FROM testing
WRITETEXT testing.myimage #txtptr ?
"""
I've also fixed the mistake I made in using the value attribute in the retrieval code.
With this change, I'm able to insert and retrieve a 320K JPEG image into the database (retrieved data is identical to inserted data).
N.B. The image data type is deprecated, and is replaced by varbinary(max) in later versions of SQL Server. The same logic for insertion/retrieval should apply, however, for the newer column type.
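For reference, here is a sketch (not part of the original answer) of how the test table could be declared on a newer SQL Server; the con, cur, and data objects from the question are assumed:
# Sketch: use VARBINARY(MAX) in place of the deprecated IMAGE-based
# CREATE TABLE shown earlier.
cur.execute('''
CREATE TABLE testing (
    id INTEGER NOT NULL IDENTITY(1,1),
    myimage VARBINARY(MAX) NULL,
    PRIMARY KEY (id)
)
''')
con.commit()

# A plain parameterized insert works; no WRITETEXT/TEXTPTR tricks are needed.
cur.execute("INSERT INTO testing (myimage) VALUES (?)", (pyodbc.Binary(data),))
con.commit()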
I had a similar 4096-byte truncation issue on TEXT fields. SET TEXTSIZE 2147483647 fixed it, but so did setting the TDS version via an environment variable:
import os
os.environ['TDSVER'] = '8.0'
