I am trying to connect to a FoxPro database using the Python win32com module.
The code currently looks like this:
import win32com.client
conn = win32com.client.Dispatch('ADODB.Connection')
dsn = r'Provider=vfpoledb;Data Source=C:\MyDbFolder\MyDbContainer.dbc;'  # raw string so the backslashes are not treated as escapes
conn.Open(dsn)
print('ok')
However, it says that it could not find the provider, even though I have successfully installed the latest version of the Microsoft OLE DB Provider for Visual FoxPro 9.0 from the Microsoft website:
'Provider cannot be found. It may not be properly installed.'
I have tried this with both the 32-bit and the 64-bit versions of Python on different PCs. With 32-bit Python it works; however, there are cases where one needs 64-bit Python, and it seems it should work there as well.
Did anyone get this working without issues?
One possible workaround for the lack of a 64-bit VFPOLEDB driver might be setting up the VFP database as a linked server in a 32-bit instance of MS SQL Server (Express is free and should work). SQL Server 2014 seems to be the last version for which a 32-bit edition is available. There are plenty of 64-bit OLE DB drivers for SQL Server, and they don't care about the bitness of the instance.
There are step-by-step instructions in "How to successfully connect to Foxpro database files using MSSQL Linked Server feature and ODBC?" over on ServerFault.
Note: using Fox data via a linked server is severely limiting and nowhere near as powerful as using Fox directly or via VFPOLEDB. However, sometimes limited access is better than no access at all.
The queries have to use SQL Server syntax and are limited by it. For example, boolean fields get mapped to the bit data type (0 or 1) because SQL Server has no concept of booleans. But inside an OpenQuery call you can use full Fox syntax. Assuming the linked server is called FOX and the table StoffPZN has a boolean field op:
select * from FOX...StoffPZN where op = 1; -- T-SQL rules
select * from openquery(FOX, 'sele * from StoffPZN wher op'); -- Fox rules
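From 64-bit Python, the linked server can then be reached through any 64-bit SQL Server driver. A minimal sketch with pyodbc, assuming a local SQL Server Express instance named SQLEXPRESS, Windows authentication, and the linked server FOX from above (the driver and instance names are placeholders; adjust to your setup):
import pyodbc
# 64-bit pyodbc talking to the 32-bit SQL Server instance;
# driver name and instance name are assumptions, not fixed values.
conn = pyodbc.connect(
    'DRIVER={SQL Server};'
    r'SERVER=localhost\SQLEXPRESS;'
    'DATABASE=master;Trusted_Connection=yes;'
)
cur = conn.cursor()
# Full Fox syntax is allowed inside openquery; T-SQL rules apply outside.
cur.execute("select * from openquery(FOX, 'sele * from StoffPZN wher op')")
for row in cur.fetchall():
    print(row)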
Is it possible to have a database driver written in pure python that doesn't need an underlying system library/ shared object to connect to a database?
Apologies for the necro-bump, but this still comes up in a Google search for pure Python drivers. So:
Implementing a database driver in pure Python is conceptually quite straightforward, but only if the wire protocol it uses is documented. Then you (just) write a handler for each type of message to and from the database server in byte format and away you go. The devil is in the details, of course, which is why you need the protocol documented, unless you are patient enough to reverse-engineer it (and handle undocumented changes!).
There has been a pure Python driver for MSSQL (called python-tds) for a long time (v1.0, Jan 2013). There are also pure Python drivers for PostgreSQL (pg8000) and MySQL (can't remember the name). I haven't done an exhaustive search for other databases as I don't generally use them.
Pure Python drivers are excellent for cross-platform development, for alternative Python implementations, and for simplifying packaging. I especially like them for putting a Python program onto Android: you don't need to worry about how to cross-compile DB client libraries.
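As an illustration, using such a driver looks like any other DB-API module. A minimal sketch against python-tds with placeholder host and credentials (since it speaks TDS directly over a socket, no native client library is involved):
import pytds  # pip install python-tds
# 'dbhost' and the credentials below are placeholders.
conn = pytds.connect(dsn='dbhost', database='mydb',
                     user='sa', password='secret')
cur = conn.cursor()
cur.execute('SELECT @@VERSION')
print(cur.fetchone()[0])
conn.close()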
Yes, it is possible to implement the Python database API as stated in PEP 249.
Even more: such database API implementations exist.
E.g. nuodb-python
I'm planning on using the Teradata Python module, which can use either the Teradata REST API or ODBC to connect to Teradata. I'm wondering what the performance would be like for REST vs. ODBC connection methods for fairly large data pulls (> 1 million rows, > 1 GB of results).
Information on Teradata's site suggests that the use case for the REST API is more for direct access of Teradata by browsers or web applications, which implies to me that it may not be optimized for queries that return more data than a browser would be expected to handle. I also wonder if JSON overhead will make it less efficient than the ODBC data format for sending query results over the network.
Does anyone have experience with Teradata REST services performance or can point to any comparisons between REST and ODBC for Teradata?
I had exactly the same question. As the REST web server is active for us, I just ran a few tests. I tested PyTd with the REST and ODBC back ends, and JDBC using JayDeBeApi + JPype1. I used Python 3.5 on a CentOS 7 machine; I got similar results with Python 3.6 on CentOS and on Windows.
REST was the fastest and JDBC was the slowest. That is interesting, because in R, JDBC was really fast, which probably means JPype is the bottleneck. REST was also very fast for writing, but my guess is that JDBC could be improved there by using prepared statements appropriately.
We are now going to switch to REST for production. Let's see how it goes; it is surely not problem-free either. Another advantage is that our analysts also want to work on their own PCs/Macs, and REST is the easiest to install, particularly on Windows (you do pip install teradata and you are done, while for ODBC and JayDeBeApi + JPype you need a compiler, and with ODBC you spend some time getting it configured right).
If speed is critical, I guess another way would be to write a Java command-line app that fetches the rows and writes them to a CSV, and then read the CSV from Python. I did not test this, but based on my previous experience with this kind of issue, I bet it would be faster than anything else.
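For reference, a PyTd connection looks roughly like this; the system name and credentials are placeholders, and method='rest' assumes the Teradata REST services are running on the target system:
import teradata  # pip install teradata
udaExec = teradata.UdaExec(appName='perf-test', version='1.0',
                           logConsole=False)
# Swap method='rest' for method='odbc' to compare the two back ends.
with udaExec.connect(method='rest', system='tdhost',
                     username='dbuser', password='dbpass') as session:
    for row in session.execute('SELECT InfoKey, InfoData FROM DBC.DBCInfo'):
        print(row)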
Selecting 1M rows
Python 3 - JDBC: 24 min
Python 3 - ODBC: 6.5 min
Python 3 - REST: 4 min
R - JDBC: 35 s
Selecting 100K rows
Python 3 - JDBC: 141 s
Python 3 - ODBC: 41 s
Python 3 - REST: 16 s
R - JDBC: 5 s
Inserting 100K rows
Python 3 - JDBC: got errors, too lazy to correct them
Python 3 - ODBC: 7 min
Python 3 - REST: 8 s (batch), 9 min (no batch)
R - JDBC: 8 min
I want to run PostgreSQL queries and return the results through a Python API.
Basically, I want to integrate/connect Python with PostgreSQL: for specific Python API calls, execute the corresponding queries and return the results.
I also want to abstract away the PostgreSQL database.
Thanks.
To add to klin's comment:
psycopg2 -
This is the most popular PostgreSQL adapter for Python. It was built to address heavy concurrency issues with PostgreSQL database usage. Several extensions are available for added functionality with the DB API.
asyncpg -
A more recent PostgreSQL adapter that seeks to address shortfalls in functionality and performance that exist with psycopg2. It doubles the speed of psycopg2's text-based data exchange protocol by using binary I/O (which also adds generic support for container types). A major plus is that it has zero dependencies. I have no personal experience with this adapter but will test it soon.
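For orientation, a minimal psycopg2 round-trip looks like this; the host, database, and credentials are placeholders:
import psycopg2  # pip install psycopg2-binary
conn = psycopg2.connect(host='localhost', dbname='mydb',
                        user='dbuser', password='dbpass')
with conn, conn.cursor() as cur:  # the conn block commits or rolls back
    cur.execute('SELECT version()')
    print(cur.fetchone()[0])
conn.close()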
I have been at this for a while, trying all kinds of different packages from open source, IBM, and many others. I have not yet found one that works without some sort of confusing install method I cannot get to work, or some sort of integration with other third-party pieces I cannot seem to get working.
I am simply trying to perform SQL statements on an Informix server using Python, no different than MySQL and other tools. Using cursors or full result dumps, I really do not care. I want to be able to build a query string, statically or dynamically, and then tell whatever tool/module to execute said query and return the results (if any).
I have tried:
ibm_db 2.0.5.1 (https://pypi.python.org/pypi/ibm_db)
IBM Informix Client SDK
pymssql
unixODBC
Looked at but do not want to use Jython (JPython).
What I have managed:
I have been able to install and get the IBM Informix Client SDK installed and working. I can connect to my Informix DB server and perform queries.
I have mySQL working and connecting and querying.
I have written a Java program to perform queries using a Java driver, compiled it, combined it with a bash script to perform queries and email results.
I am just stumped. I am looking for assistance on what to download (URLs) and how to go about installing it (tips and tricks, environment variables, where to install it, etc.). I want something that does not depend on Java or writing Java. I am looking for a solution that will give me the ability to write Python to query, insert, update, and delete from an Informix database and its tables. I want to combine my previously written Java program and Bash script into a Python script.
Frustrated and looking for any assistance.
Thank you for listening and please ask questions if you do not understand my plea.
Informix on Linux is a bag of pain. My personal setup to get Informix Connect working with CPython 3 stacks the Informix Client SDK with unixODBC and pyodbc. There are some hoops to jump through, none of which are documented. Almost all of the setup is completely useless yet required to keep some parts of the Informix driver from bailing out. Note that some options are case- and space-sensitive (Description=Informix != description = Informix).
Install the Informix Client SDK. You don't need all the garbage that comes in the package, just Informix Connect. I assume you use the default path /opt/IBM/informix.
Add /opt/IBM/informix/lib/cli and /opt/IBM/informix/lib/esql to your dynamic linker lookup paths. On Fedora you can do this by putting them in a new file /etc/ld.so.conf.d/informix.conf.
Create a new /etc/odbc.ini and add the following:
[ODBC Data Sources]
Infdrv1=IBM INFORMIX ODBC DRIVER
[Infdrv1]
Driver=/opt/IBM/informix/lib/cli/iclit09b.so
Description=Informix
Database=WHATEVER_YOUR_DB_NAME_IS
Servername=WHATEVER_YOUR_SERVER_NAME_IS
CLIENT_LOCALE=en_us.8859-1 # MAY BE DIFFERENT
DB_LOCALE=en_us.819 # MAY BE DIFFERENT
[ODBC]
UNICODE=UCS-2
Create a new /etc/odbcinst.ini and add the following:
[IBM INFORMIX ODBC DRIVER]
Description=Informix Driver
Driver=libifcli.so
You need to set the environment variables INFORMIXDIR and ODBCINI. On Fedora you may add a new file /etc/profile.d/informix.sh containing:
export INFORMIXDIR=/opt/IBM/informix
export ODBCINI=/etc/odbc.ini
Edit /opt/IBM/informix/etc/sqlhosts and put your basic connection information there. In the simplest case it has only one line, which reads:
YOUR_SERVER_NAME\tonsoctcp\tYOUR_DB_NAME\tpdap-np
Note that pdap-np is actually port 1526, which is also the Informix "Turbo"-driver TCP port; see your /etc/services.
Create an empty .odbc.ini in your $HOME, e.g. by touch $HOME/.odbc.ini. It needs to be there. It needs to be 0 bytes. I love this part.
Install unixODBC and pyodbc from your favorite repository.
Remember to get your env changes applied, e.g. via reboot. You can now connect like this:
import pyodbc

DRIVER = 'IBM INFORMIX ODBC DRIVER'
SERVER = 'YOUR_SERVER_NAME'
DATABASE = 'YOUR_DB_NAME'
USER = 'YOUR_USER'      # credentials were implied but undefined above
PASS = 'YOUR_PASSWORD'
constr = 'DRIVER={%s};SERVER=%s;DATABASE=%s;UID=%s;PWD=%s' % (
    DRIVER, SERVER, DATABASE, USER, PASS)
con = pyodbc.connect(constr, autocommit=False)
From there on you can get your cursor, execute queries, fetch results, and so on (a short example follows the notes below). Note that there are numerous bugs and quirks in IBM's ODBC driver; off the top of my head:
Rows that contain NULLs may cause a segfault, because the IBM driver puts a 32-bit int where a 64-bit int is expected to signal that the value is null. If you are affected by this, you need to patch unixODBC to deal with it for all possible column types.
Columns without names cause the driver to segfault (e.g. SELECT COUNT(*) FROM foobar needs to be SELECT COUNT(*) AS c FROM foobar).
Make sure your encoding actually works as expected. UTF-8 is apparently not enterprise-enough for IBM; UCS-2 is the only thing I got to work.
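For completeness, a minimal round-trip continuing from the connection above, with an explicit column alias to sidestep the unnamed-column segfault:
cur = con.cursor()
cur.execute('SELECT COUNT(*) AS c FROM systables')  # systables is the Informix catalog
print(cur.fetchone())
cur.close()
con.close()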
I am working on a program to automate parsing data from XML files and storing it in several databases. (Specifically the USGS realtime water quality service, if anyone's interested, at http://waterservices.usgs.gov/rest/WaterML-Interim-REST-Service.html) It's written in Python 2.5.1 using lxml and pyodbc. The databases are in Microsoft Access 2000.
The connection function is as follows:
import pyodbc

def get_AccessConnection(db):
    connString = 'DRIVER={Microsoft Access Driver (*.mdb)};DBQ=' + db
    cnxn = pyodbc.connect(connString, autocommit=False)
    cursor = cnxn.cursor()
    return cnxn, cursor
where db is the filepath to the database.
The program:
a) opens the connection to the database
b) parses 2 to 8 XML files for that database and builds the values from them into a series of records to insert into the database (using a nested dictionary structure, not a user-defined type)
c) loops through the series of records, cursor.execute()-ing an SQL query for each one
d) commits and closes the database connection
If the cursor.execute() call throws an error, it writes the traceback and the query to the log file and moves on (roughly as in the sketch below).
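The insert loop in steps (c) and (d) is shaped roughly like this; the record structure and log handling here are illustrative, not the original code:
import traceback

def run_inserts(cnxn, cursor, records, logfile):
    # records: iterable of (sql, params) pairs built from the parsed XML
    for sql, params in records:
        try:
            cursor.execute(sql, params)
        except pyodbc.Error:
            # log the traceback and the failing query, then move on
            logfile.write(traceback.format_exc())
            logfile.write(sql + '\n')
    cnxn.commit()
    cnxn.close()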
When my coworker runs it on his machine, for one particular database, specific records will simply not be there, with no errors recorded. When I run the exact same code on the exact same copy of the database over the exact same network path from my machine, all the data that should be there is there.
My coworker and I are both on Windows XP computers with Microsoft Access 2000 and the same versions of Python, lxml, and pyodbc installed. I have no idea how to check whether we have the same version of the Microsoft ODBC drivers. I haven't been able to find any difference between the records that are there and the records that aren't. I'm in the process of testing whether the same problem happens with the other databases, and whether it happens on a third coworker's computer as well.
What I'd really like to know is ANYTHING anyone can think of that would cause this, because it doesn't make sense to me. To summarize: Python code executing SQL queries silently fails on half of them on one computer and works perfectly on another.
Edit:
No more problem. I just had my coworker run it again, and the database was updated completely with no missing records. Still no idea why it failed in the first place, nor whether or not it will happen again, but "problem solved."
"I have no idea how to check whether we have the same version of the Microsoft ODBC drivers."
I think you're looking for Control Panel | Administrative Tools | Data Sources (ODBC). Click the "Drivers" tab.
I think either Access 2000 or Office 2000 shipped with a desktop edition of SQL Server called "MSDE". Might be worth installing that for testing. (Or production, for that matter.)