How do you connect to an Oracle ADW cloud database via Python?

I found this website and it suggests that I might be able to connect to my Oracle ADW cloud database using Python. I tried running the code below but keep running into the same error. Does anyone have any insight on how to resolve this? Note: the password has been changed for obvious reasons.
Code in Jupyter Notebooks:
import cx_Oracle as cx
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
pswd = 'ABC'
#Connect to Autonomous Data Warehouse
con = cx.connect(user = 'ADMIN', password = pswd)
query = 'SELECT * from TEST123'
data_train = pd.read_sql(query, con=con)
Error:
DatabaseError: Error while trying to retrieve text for error ORA-01804
I get the same error when I run the below code:
...
#Connect to Autonomous Data Warehouse
con = cx.connect('ADMIN',pswd,"mltest_high")
query = 'SELECT * from TEST123'
data_train = pd.read_sql(query, con=con)

It took a lot of learning to figure this out, especially how Oracle wallets work together with the sqlnet.ora and tnsnames.ora files and the system environment variables, but this website did get my Python (in an .ipynb) in Visual Studio Code to connect to Oracle's cloud ADW system. It is almost exactly what I did to get it working on my machine, except that I didn't use the virtual environment. I had to work around that for the items above, and instead used the system path to my wallet as the directory.
These are the things you need to do to get it to work. When you download the wallet from ADW, copy the high/medium/low entries from the wallet's tnsnames.ora and paste them into your Oracle/network/admin/tnsnames.ora file. You will also need to take the wallet location and SSL server settings from the wallet's sqlnet.ora file and put them in the sqlnet.ora file in the Oracle/network/admin/ directory. If you choose not to use the virtual environment as demonstrated in the post, then for the wallet location line to work you'll need to set its directory to the directory of the wallet folder. I unzipped mine; I'm unsure whether that is needed.
Lastly, you will need to set the TNS_ADMIN system environment variable to wherever your system tnsnames.ora and sqlnet.ora files are (not the ones that come in the wallet download folder), likely Oracle\network\admin.
Below is the code that worked for me. I hope this helps someone else so that they don't have to jump through the same hoops that I did to figure it out.
import cx_Oracle
import os
import pandas as pd
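# Show where TNS_ADMIN points; it must be the directory holding tnsnames.ora and sqlnet.ora
print(os.environ.get('TNS_ADMIN'))
connection = cx_Oracle.connect('<Oracle ADW Username>', '<Oracle ADW Password>', '<TNS_NAME entry (high/med/low)>')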
cursor = connection.cursor()
rs = cursor.execute("SELECT * FROM TEST123")
df = pd.DataFrame(rs.fetchall())
df
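Before connecting, it can help to confirm that TNS_ADMIN really points at a directory containing both files mentioned above. This is a minimal sketch using only the standard library; the helper names missing_net_files and check_tns_admin are made up for illustration and are not part of cx_Oracle.

```python
import os
from pathlib import Path

# Files the Oracle client looks for in the TNS_ADMIN directory.
REQUIRED_FILES = ("tnsnames.ora", "sqlnet.ora")


def missing_net_files(tns_admin):
    """Return the required files that are absent from the given directory."""
    directory = Path(tns_admin)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


def check_tns_admin():
    """Return 'ok' or a short description of what is wrong with TNS_ADMIN."""
    tns_admin = os.environ.get("TNS_ADMIN")
    if not tns_admin:
        return "TNS_ADMIN is not set"
    missing = missing_net_files(tns_admin)
    if missing:
        return "missing in %s: %s" % (tns_admin, ", ".join(missing))
    return "ok"


print(check_tns_admin())
```

If this does not print ok, the connect call above is unlikely to find the high/medium/low entries.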

At a guess, from the error number and the fact that the message text wasn't found, cx_Oracle is using Oracle Instant Client libraries but you have the ORACLE_HOME environment variable set to some other software. If so, unset ORACLE_HOME. Or perhaps you are only using libraries included in a local Oracle DB install and haven't fully set the Oracle environment variables, e.g. haven't set ORACLE_HOME. Or you might need a more recent version of the Oracle client libraries: get 19c libraries, e.g. Oracle Instant Client. Also check other Stack Overflow questions about ORA-01804. If you update your question with information about what Oracle software you have installed on the computer running Python, a more detailed answer might be possible.
It sounds like you have got the cloud wallet sorted out for connection, but here are references for people coming to this question after reading your heading:
A blog post How to connect to Oracle Autonomous Cloud Databases
cx_Oracle documentation Connecting to Autonomous Databases
Oracle ADW documentation: Connect with Python, Node.js, and other Scripting Languages
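To gather the information the answer asks for, a small script can report which Oracle-related settings the Python process actually sees. This is only a sketch: the function name oracle_environment and the rule of matching "oracle" or "instantclient" in directory names are assumptions for illustration, not anything cx_Oracle provides.

```python
import os


def oracle_environment(env=None):
    """Summarise the Oracle-related settings visible to this process."""
    env = os.environ if env is None else env
    report = {
        "ORACLE_HOME": env.get("ORACLE_HOME"),
        "TNS_ADMIN": env.get("TNS_ADMIN"),
    }
    # Library search paths: PATH on Windows, LD_LIBRARY_PATH on Linux.
    search_path = env.get("PATH", "") + os.pathsep + env.get("LD_LIBRARY_PATH", "")
    # Assumption: client directories have "oracle" or "instantclient" in their name.
    report["client_dirs"] = [
        entry
        for entry in search_path.split(os.pathsep)
        if "oracle" in entry.lower() or "instantclient" in entry.lower()
    ]
    return report


for key, value in oracle_environment().items():
    print("%s: %s" % (key, value))
```

An ORACLE_HOME that points somewhere other than the listed client directories would be consistent with the first guess above.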

Related

How will setting autocommit = True affect queries from python to Hive server when calling pyodbc.connect()

I am trying to connect a Jupyter notebook I'm running in a conda environment to a Hadoop cluster through Apache Hive on Cloudera. I understand from this post that I should install and set up the Cloudera ODBC driver and use pyodbc with a connection as follows:
import pyodbc
import pandas as pd
with pyodbc.connect("DSN=<replace DSN name>", autocommit=True) as conn:
    df = pd.read_sql("<Hive Query>", conn)
My question is about the autocommit parameter. I see in the pyodbc connection documentation that setting autocommit to True will make it so that I don't have to explicitly commit transactions, but it doesn't specify what that actually means.
What exactly is a transaction ?
I want to select data from the hive server using pd.read_sql_query() but I don't want to make any changes to the actual data on the server.
Apologies if this question is formatted incorrectly or if there are (seemingly simple) details I'm overlooking in my question - this is my first time posting on stackoverflow and I'm new to working with cloudera / Hive.
I haven't tried connecting yet or running any queries yet because I don't want to mess up anything on the server.

Hive does not have the concept of starting and committing transactions the way RDBMS systems do.
You should not worry about autocommit.
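As for what a transaction is in a conventional RDBMS: it is a group of changes that either all take effect (commit) or are all discarded (rollback). The sketch below uses Python's built-in sqlite3 module purely as a stand-in, because it needs no server; it is not Hive and says nothing about how Hive or the Cloudera driver behave. The table name t is made up.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()  # make the first row permanent

# A write that is rolled back never becomes visible.
conn.execute("INSERT INTO t VALUES (2)")
conn.rollback()
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# A write that is committed stays.
conn.execute("INSERT INTO t VALUES (3)")
conn.commit()
rows_after_commit = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print(rows_after_rollback, rows_after_commit)
conn.close()
```

A plain SELECT, as in the question, only reads rows, so there is nothing for a commit to make permanent.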

Using jaydebeapi3 to connect to Apache Phoenix

I have a program, in which I have been using the phoenixdb package developed by Lukas Lalinsky but during the past few days it seems to have become very unstable. I think this is due to the size of the database (as it is constantly growing). By unstable I mean that around half my queries are failing with a runtime exception.
So I have moved on and tried to find a more stable way to connect with my Phoenix "server". Therefore I want to try out a JDBC connection. As far as I have understood Phoenix should have great integration with JDBC.
I do however have problems with understanding how to set up the initial connection.
I read the following Usage section of the JayDeBeApi package, but I don't know what the Driver Class is or where it is located? If I have to download it myself? How to set it up? And so forth.
I was hoping someone in here would know and hopefully explain it in detail.
Thanks!
EDIT:
I've managed to figure out that my connect statement should be something along this:
import jaydebeapi as jdbc
conn = jdbc.connect('org.apache.phoenix.jdbc.PhoenixDriver', ['jdbc:phoenix:<ip>:<port>:', '', ''], '<location-of-phoenix-client.jar>')
However I still don't know where to get my hands on that phoenix-client.jar file and how to reference to it.

I managed to find the solution after having set up a Java project and testing out JDBC in that development environment and getting a successful connection.
To get the JDBC connection working in Java I used the JDBC driver found in the Phoenix distribution from Apache here. I used the driver that matched my Phoenix and HBase versions - phoenix-4.9.0-HBase-1.2-client.jar
Once that setup was completed and I could connect to Phoenix using Java I started trying to set it up using Python. I started a connection to Phoenix with the following:
import jaydebeapi as jdbc
import os
cwd = os.getcwd()
jar = cwd + '/phoenix-4.9.0-HBase-1.2-client.jar'
drivername = 'org.apache.phoenix.jdbc.PhoenixDriver'
url = 'jdbc:phoenix:<ip>:<port>/'
conn = jdbc.connect(drivername, url, jar)
Now I had a successful connection through JDBC to Phoenix using Python. Hope someone else out there can use this question in the future.
I then created a cursor and could issue commands as follows:
cursor = conn.cursor()
sql = """SELECT ...."""
cursor.execute(sql)
resp = cursor.fetchone() # could use .fetchall() or .fetchmany() if needed
I hope this helps someone out there!
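A missing or misplaced client jar is an easy thing to get wrong here, so it may be worth checking for it before connecting. This is a rough sketch: phoenix_connect_args is a made-up helper name, and the URL simply follows the form used in the answer above.

```python
import os

DRIVER_CLASS = 'org.apache.phoenix.jdbc.PhoenixDriver'


def phoenix_connect_args(host, port, jar_path):
    """Return (driver class, JDBC URL, absolute jar path), checking the jar exists."""
    if not os.path.isfile(jar_path):
        raise FileNotFoundError('Phoenix client jar not found: %s' % jar_path)
    # Same URL shape as in the answer: jdbc:phoenix:<ip>:<port>/
    url = 'jdbc:phoenix:%s:%s/' % (host, port)
    return DRIVER_CLASS, url, os.path.abspath(jar_path)
```

The three returned values correspond to drivername, url and jar in the connect call above. The argument order of connect may differ between JayDeBeApi releases, so pass them in whatever form works for your installed version.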

How do I get Python and Informix talking on Linux?

I have been at this for a while, trying all kinds of different packages from openSource, IBM, and many others. I have not yet found one that works without some sort of confusing install method that I can not get to work, or some sort of integration with other third-party pieces that I can not seem to get working.
I am simply trying to run SQL statements on an Informix server using Python, no different than with MySQL and other tools. Whether it uses cursors or full result dumps, I really do not care. I want to be able to build a query string statically or dynamically and then tell whatever tool or module to execute that query and return results (if any).
I have tried:
ibm_db 2.0.5.1 (https://pypi.python.org/pypi/ibm_db)
IBM Informix Client SDK
pymssql
unixODBC
Looked at but do not want to use Jython (JPython).
What I have managed:
I have been able to install and get the IBM Informix Client SDK installed and working. I can connect to my Informix DB server and perform queries.
I have mySQL working and connecting and querying.
I have written a Java program to perform queries using a Java driver, compiled it, combined it with a bash script to perform queries and email results.
I am just stumped. I am looking for assistance on what to download (URLs) and how to go about installing it (tips and tricks, environment variables, where to install it, etc.). I want something that does not depend on Java or on writing Java. I am looking for a solution that will let me write Python to query, insert, update, and delete from Informix databases and tables. I want to combine my previously written Java and Bash script into a Python script.
Frustrated and looking for any assistance.
Thank you for listening and please ask questions if you do not understand my plea.

Informix on Linux is a bag of pain. My personal setup to get Informix Connect to work with CPython 3 is stacking the Informix Client SDK with unixODBC and pyodbc. There are some hoops to jump through, none of which are documented. Almost all of the setup is completely useless yet required to prevent some parts of the Informix driver from bailing out. Note that some options are case- and space-sensitive (Description=Informix != description = Informix).
Install the Informix Client SDK. You don't need all the garbage that comes in the package, just Informix Connect. I assume you use the default path /opt/IBM/informix
Add /opt/IBM/informix/lib/cli and /opt/IBM/informix/lib/esql to your dynamic linker lookup paths. On Fedora you can do this by putting them in a new file /etc/ld.so.conf.d/informix.conf
Create a new /etc/odbc.ini and add the following:
[ODBC Data Sources]
Infdrv1=IBM INFORMIX ODBC DRIVER
[Infdrv1]
Driver=/opt/IBM/informix/lib/cli/iclit09b.so
Description=Informix
Database=WHATEVER_YOUR_DB_NAME_IS
Servername=WHATEVER_YOUR_SERVER_NAME_IS
CLIENT_LOCALE=en_us.8859-1 # MAY BE DIFFERENT
DB_LOCALE=en_us.819 # MAY BE DIFFERENT
[ODBC]
UNICODE=UCS-2
Create a new /etc/odbcinst.ini and add the following
[IBM INFORMIX ODBC DRIVER]
Description=Informix Driver
Driver=libifcli.so
You need to set the environment variables INFORMIXDIR and ODBCINI. On Fedora you may add a new file /etc/profile.d/informix.sh and add
export INFORMIXDIR=/opt/IBM/informix
export ODBCINI=/etc/odbc.ini
Edit /opt/IBM/informix/etc/sqlhosts and put your basic connection information there. In the most simple case it has only one line that reads
YOUR_SERVER_NAME\tonsoctcp\tYOUR_DB_NAME\tpdap-np
Note that pdap-np is actually port 1526 which is also the Informix "Turbo"-Driver tcp port. See your /etc/services
Create an empty .odbc.ini in your $HOME e.g. by touch $HOME/.odbc.ini. It needs to be there. It needs to be 0 bytes. I love this part.
Install unixODBC and pyodbc from your favorite repository.
Remember to get your env-changes going, e.g. via reboot. You can now connect like this:
import pyodbc
DRIVER = 'IBM INFORMIX ODBC DRIVER'
SERVER = 'YOUR_SERVER_NAME'
DATABASE = 'YOUR_DB_NAME'
USER = 'YOUR_USER_NAME'  # placeholder credentials
PASS = 'YOUR_PASSWORD'
constr = 'DRIVER={%s};SERVER=%s;DATABASE=%s;UID=%s;PWD=%s' % (DRIVER, SERVER, DATABASE, USER, PASS)
con = pyodbc.connect(constr, autocommit=False)
From there on you can get your cursor, execute queries, fetch results and such. Note that there are numerous bugs and quirks in IBM's ODBC driver; off the top of my head:
Rows that contain NULLs may cause a segfault as the IBM driver puts a 32bit int where a 64bit int is expected to signal the value being null. In case you are affected by this, you need to patch unixODBC for all possible column types to deal with this.
Columns without names cause the driver to segfault (e.g. SELECT COUNT(*) FROM foobar needs to be SELECT COUNT(*) AS c FROM foobar).
Make sure your encoding actually works as expected. UTF8 is something not enterprise-enough for IBM and UCS-2 is the only thing I got to work.
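Because several of the steps above fail silently when skipped, a quick pre-flight check from Python may save some time. This is a sketch only: informix_odbc_problems is a made-up helper, and it covers just the two environment variables and the empty $HOME/.odbc.ini described above, not the driver or sqlhosts setup.

```python
import os
from pathlib import Path


def informix_odbc_problems(env=None, home=None):
    """Return a list of setup problems found; an empty list means none detected."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    problems = []
    # Both variables must be set for the driver to find its files.
    for var in ("INFORMIXDIR", "ODBCINI"):
        if not env.get(var):
            problems.append("%s is not set" % var)
    # The per-user .odbc.ini has to exist and be 0 bytes.
    user_ini = home / ".odbc.ini"
    if not user_ini.is_file():
        problems.append("%s does not exist" % user_ini)
    elif user_ini.stat().st_size != 0:
        problems.append("%s is not empty" % user_ini)
    return problems


for problem in informix_odbc_problems():
    print(problem)
```

Run it in the same shell or service environment that will run the pyodbc script, since the variables only count if that process can see them.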

Python - Connect AS400 Collection using ibm_db

I am able to connect to our database given the following connection string (OLEDB).
"Provider=IBMDA400;Data Source=10.33.xx.x;User Id=user;Password=pass;Default Collection=mm370lib;";
Then tried (Python ibm_db)
import ibm_db, ibm_db_dbi
ibm_db_conn = ibm_db.connect("DRIVER={IBM DB2 CLI DRIVER};DATABASE=mm370lib;HOSTNAME=10.33.xx.x;PORT=446;PROTOCOL=TCPIP;UID=user;PWD=pass;", '', '')
But this error occurred.
Exception: [IBM][CLI Driver] SQL30061N The database alias or database name "MM370LIB " was not found at the remote node. SQLSTATE=08004 SQLCODE=-30061
What did I miss? Are the database name and Default Collection different?

Yes, the DB name is usually the system name; though it doesn't have to be.
Originally, the AS/400 supported only a single DB.
With the introduction of independent storage pools (iASP), today's IBM i machines can have multiple DBs.
From a 5250 session, try:
WRKRDBDIRE
Look for the *LOCAL entry, may be the only one.
You can also see the DB names using IBM i Navigator for Windows or the web-based IBM Navigator. The DB names are shown under "Databases"; in this example there are three DBs on the system: Rchasma1, Iasp320, Ima1db1.

Python Connect to Oracle DB

I currently use PYODBC to connect to MS SQL Server and MYSQL, but now need to access an Oracle database as well.
I have Oracle SQL Developer installed on my work comp (but there doesn't seem to be a separate Net Manager client per other SO posts), which I can use to access the DB.
Ideally, I would run what I need to in python, but am having difficulties. As it stands, I have created a linked server object to the Oracle DB in a MS SQL Server DB as a work around, but this isn't ideal.
What do I need to do to get PYODBC (or substitute) to connect to Oracle? Thanks very kindly.

I ran into the same issue where I could connect to a database via Oracle SQL Developer but not via pyodbc. Someone else did most of the database setup, so I wasn't sure of the proper connection parameters. I'll run you through how I was able to connect on a Windows computer.
In the Start Menu I typed "odbc" and selected "Microsoft ODBC Administrator". Under the "System DSN" tab I found my DSN name (we'll call it myDSN) and corresponding driver (mine was "Oracle in OraClient11g_home2"). I also have to specify a username and password for my database so my connection line now looks like this:
cnxn = pyodbc.connect(driver='{Oracle in OraClient11g_home2}', dsn='myDSN', uid='HODOR', pwd='hodor')
Maybe at this point it will work for you, but I still wasn't able to connect. This computer is a mess of 32 and 64 bit drivers so I figured I was pointing to the wrong one. So once again into the Start Menu, where under All Programs I found a folder called "Oracle in OraClient11g_home2" and right under it, one called "Oracle in OraClient11g_home32Bit". I changed my connection line in Python to the following:
cnxn = pyodbc.connect(driver='{Oracle in OraClient11g_home32Bit}', dsn='myDSN', uid='HODOR', pwd='hodor')
And it connected.
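Since the fix came down to a 32-bit versus 64-bit mismatch, it can help to check which one your Python interpreter is and which ODBC drivers it can see. A small sketch: python_bitness is a made-up helper name, and the driver listing only runs if pyodbc is installed.

```python
import platform
import struct


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


print("Python is %d-bit (%s)" % (python_bitness(), platform.machine()))

try:
    import pyodbc
except ImportError:
    pyodbc = None

if pyodbc is not None:
    # Lists the driver names visible to this interpreter.
    for name in pyodbc.drivers():
        print(name)
```

The driver name passed to pyodbc.connect needs to be one of the names printed here.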
