I am using SQLAlchemy to connect to my Oracle 11g database residing on a Linux machine. I am writing a Python script to connect to the database and retrieve values from a particular table.
The code I wrote is:
from sqlalchemy import create_engine
import ConfigParser

def db_connection():
    config = ConfigParser.RawConfigParser()
    config.read('CMDC_Analyser.cfg')
    USER = config.get('DB_Connector', 'db.user_name')
    PASSWORD = config.get('DB_Connector', 'db.password')
    SID = config.get('DB_Connector', 'db.sid')
    IP = config.get('DB_Connector', 'db.ip')
    PORT = config.get('DB_Connector', 'db.port')
    engine = create_engine('oracle://{user}:{pwd}@{ip}:{port}/{sid}'.format(user=USER, pwd=PASSWORD, ip=IP, port=PORT, sid=SID), echo=False)
    global connection
    connection = engine.connect()
    p = connection.execute("select * from ssr.bouquet")
    for columns in p:
        print columns
    connection.close()
The complete set of values from the table is printed out here. I wanted to select values from one particular column only, so I used the following code:
for columns in p:
    print columns['BOUQUET_ID']
But here I am getting the following error:
sqlalchemy.exc.NoSuchColumnError: "Could not locate column in row for column 'BOUQUET_ID'"
How to fix this?
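A likely fix: SQLAlchemy exposes Oracle's case-insensitive identifiers under lowercase keys, so the row key is 'bouquet_id', not 'BOUQUET_ID'. A minimal sketch of keyed row access, using an in-memory SQLite database as a stand-in for Oracle (the table and data here are made up):

```python
from sqlalchemy import create_engine, text

# In-memory SQLite stands in for the Oracle database; table and data are made up.
engine = create_engine("sqlite://")
with engine.connect() as connection:
    connection.execute(text("CREATE TABLE bouquet (bouquet_id INTEGER, name VARCHAR(20))"))
    connection.execute(text("INSERT INTO bouquet VALUES (1, 'basic')"))
    p = connection.execute(text("select * from bouquet"))
    ids = [row._mapping["bouquet_id"] for row in p]  # keyed access, lowercase key
print(ids)
```

Note the key used for the lookup matches how the dialect reports the column name, which for Oracle's uppercase-by-default identifiers is lowercase.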
I have a column called REQUIREDCOLUMNS in a SQL database that contains the names of the columns I need to select in my Python script below.
Excerpt of Current Code:
db = mongo_client.get_database(asqldb_row.SCHEMA_NAME)
coll = db.get_collection(asqldb_row.TABLE_NAME)
table = list(coll.find())
root = json_normalize(table)
The REQUIREDCOLUMNS column in SQL contains the values reportId, siteId, price, location.
So instead of explicitly typing:
print(root[["reportId","siteId","price","location"]])
Is there a way to do print(root[REQUIREDCOLUMNS])?
Note: I'm already connected to the SQL database in my Python script.
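For the pandas side of this, root[...] accepts any plain Python list of column labels, so once REQUIREDCOLUMNS has been fetched into a list it can be passed directly. A small self-contained sketch (the DataFrame contents are made up):

```python
import pandas as pd

# Stand-in for the normalized DataFrame; columns and values are made up.
root = pd.DataFrame({"reportId": [1], "siteId": [2], "price": [9.5],
                     "location": ["NY"], "extra": [0]})

# Stand-in for the list fetched from the REQUIREDCOLUMNS column.
required_columns = ["reportId", "siteId", "price", "location"]

print(root[required_columns])  # selects exactly those columns, in that order
```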
You will have to use cursors whether you are using mysql.connector or pymysql; the syntax is almost the same for both. Below I will show it for mysql.connector.
import mysql.connector

db = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd=" ",
    database=" "
)
cursor = db.cursor()
sql = "select REQUIREDCOLUMNS from table_name"
cursor.execute(sql)
# fetchall() returns a list of tuples, e.g. [("reportId",), ("siteId",), ...],
# so take the first element of each row
required_cols = [row[0] for row in cursor.fetchall()]
cols_as_string = ','.join(required_cols)
new_sql = 'select ' + cols_as_string + ' from table_name'
cursor.execute(new_sql)
result = cursor.fetchall()
This should work; I intentionally split the logic into several statements for readability. The syntax could be slightly different for pymysql.
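One detail worth noting: cursor.fetchall() returns a list of tuples, not a list of strings, so the rows need to be flattened before joining. A pure-Python sketch of that step (the rows variable stands in for the fetchall() result):

```python
# Stand-in for cursor.fetchall() on the REQUIREDCOLUMNS query.
rows = [("reportId",), ("siteId",), ("price",), ("location",)]

required_cols = [r[0] for r in rows]  # flatten the 1-tuples
new_sql = "select " + ",".join(required_cols) + " from table_name"
print(new_sql)  # select reportId,siteId,price,location from table_name
```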
I am trying to create a PySpark JDBC DataFrame using an LDAP connection to Oracle.
The code below works for DataFrame creation based on a normal JDBC connection string.
creds = {"user": "USER_NAME",
         "password": "PASSWORD",
         "driver": "oracle.jdbc.OracleDriver"}
connection_string = "jdbc:oracle:thin:@//hostname.com:1521/myint.domain.com"
df = spark_session.read.jdbc(url=connection_string, table=query, properties=creds)
Now our database configuration has changed, and we are supposed to use LDAP-based authentication only.
Hence I tried changing connection_string as below.
connection_string = "myint"
But it is throwing the issue below:
py4j.protocol.Py4JJavaError: An error occurred while calling o51.jdbc.
: java.lang.NullPointerException
Without Spark, I tried connecting using the cx_Oracle module (a Python module for connecting to Oracle) for testing, and it worked.
Before:
host = 'myint.domain.com'
ip = 'hostname.com'
port = 1521
conn = cx_Oracle.makedsn(ip, port, service_name=host)
db = cx_Oracle.connect('USER_NAME', 'PASSWORD', conn)  # giving the dsn object
cursor = db.cursor()
cursor.execute("""select * from myschema.mytable fetch first 5 rows only""")
for row in cursor:
    print(row)
After:
db = cx_Oracle.connect('USER_NAME', 'PASSWORD', "myint")  # now giving only the database
cursor = db.cursor()
cursor.execute("""select * from myschema.mytable fetch first 5 rows only""")
for row in cursor:
    print(row)
I need to achieve the same with a PySpark JDBC DataFrame by passing only the database name. Please suggest what needs to be done.
I hope this question is applicable to Scala Spark as well.
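One possible avenue (assuming the Oracle thin driver is on the Spark classpath): the driver also accepts LDAP-style URLs of the form jdbc:oracle:thin:@ldap://<host>:<port>/<service>,cn=OracleContext,dc=..., which can then be passed to spark_session.read.jdbc exactly like the old connection string. A sketch of building such a URL; every value below is a placeholder that must come from your own directory configuration:

```python
# All values below are placeholders; substitute your own LDAP host, port,
# service name, and directory context.
ldap_host = "ldap-host.example.com"
ldap_port = 389
service = "myint"
context = "cn=OracleContext,dc=example,dc=com"

connection_string = (
    f"jdbc:oracle:thin:@ldap://{ldap_host}:{ldap_port}/{service},{context}"
)
print(connection_string)
```

This URL would then replace connection_string in the working spark_session.read.jdbc call above.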
I have a MySQL database named my_database, and in that database there are a lot of tables. I want to connect MySQL with Python and work with a specific table named my_table from that database.
This is the code that I have for now:
import pymysql

connection = pymysql.connect(user="root", password="", host="127.0.0.1", database="my_database")
cursor = connection.cursor()
print(cursor.execute("SELECT * FROM my_database.my_table"))
This code returns the number of rows, but I want to get all columns and rows (all values from that table).
I have also tried SELECT * FROM my_table, but the result is the same.
Did you read the documentation? You need to fetch the results after executing, with fetchone(), fetchall(), or something like this:
import pymysql

connection = pymysql.connect(user="root", password="", host="127.0.0.1", database="my_database")
with connection.cursor(pymysql.cursors.DictCursor) as cursor:
    cursor.execute("SELECT * FROM my_database.my_table")
    rows = cursor.fetchall()
    for row in rows:
        print(row)
You probably also want a DictCursor, as the results are then returned as dicts.
I have a series of identical MySQL tables in a list of databases as follows:
A_1.table
A_2.table
A_3.table
The tables are accessed by means of a series of IP addresses as follows:
IP1, IP2, IP3.
The MySQL table names are stored in a list called tablelist and the IP addresses are stored in a list called hostlist.
These tables are stored in a MySQL database with the following credentials:
username=x
password=pw
port=1234
I have created the following code:
import mysql.connector
import pandas as pd

def proc(db, hostname):
    con = mysql.connector.connect(user='x', password='pw',
                                  host=hostname,
                                  database=db, port=1234)
    db_cursor = con.cursor()
    db_cursor.execute('SELECT * FROM `table`')
    table_rows = db_cursor.fetchall()
    df = pd.DataFrame(table_rows)
    print(df.head(1))
    con.close()
Next, I applied the above function to both lists as follows:
a = map(proc, tablelist, hostlist)
The code seems to work, as there are no errors. How do I test whether the map command worked? In R we get TRUE as output to let us know that map has worked.
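One caveat: in Python 3, map is lazy. It only builds an iterator, and proc does not actually run until the result is consumed, so the absence of errors proves nothing yet; forcing evaluation with list() is the simplest check. A minimal sketch with a stand-in proc that just records its calls:

```python
calls = []

def proc(db, hostname):
    # Stand-in for the real DB function; just records that it ran.
    calls.append((db, hostname))

tablelist = ["A_1", "A_2", "A_3"]
hostlist = ["IP1", "IP2", "IP3"]

a = map(proc, tablelist, hostlist)
print(calls)   # [] -- nothing has run yet

list(a)        # forces evaluation; proc now runs once per pair
print(calls)   # [('A_1', 'IP1'), ('A_2', 'IP2'), ('A_3', 'IP3')]
```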
I am trying to send monthly data to a MySQL database using Python's pandas to_sql command. My program runs one month of data at a time and I want to append the new data onto the existing database. However, Python gives me an error:
_mysql_exceptions.OperationalError: (1050, "Table 'cps_basic_tabulation' already exists")
Here is my code for connecting and exporting:
conn = MySQLdb.connect(host=config.get('db', 'host'),
                       user=config.get('db', 'user'),
                       passwd=config.get('db', 'password'),
                       db='cps_raw')
combined.to_sql(name="cps_raw.cps_basic_tabulation",
                con=conn,
                flavor='mysql',
                if_exists='append')
I have also tried using:
from sqlalchemy import create_engine
Replacing conn = MySQLdb.connect... with:
engine = create_engine("mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>")
conn = engine.connect().connection
Any ideas on why I cannot append to a database?
Thanks!
Starting from pandas 0.14, you have to provide the sqlalchemy engine directly, and not the connection object:
engine = create_engine("mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>")
combined.to_sql("cps_raw.cps_basic_tabulation", engine, if_exists='append')
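To illustrate the engine-based pattern, here is a sketch using an in-memory SQLite engine as a stand-in for MySQL (the DataFrame contents are made up). Note the unqualified table name: this sidesteps the schema-prefix issue, with the database chosen via the URL or the schema= argument instead:

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite stands in for the MySQL database.
engine = create_engine("sqlite://")
combined = pd.DataFrame({"month": ["2014-01"], "value": [1]})

combined.to_sql("cps_basic_tabulation", engine, if_exists="append", index=False)
combined.to_sql("cps_basic_tabulation", engine, if_exists="append", index=False)  # appends, no 1050 error

n = pd.read_sql("SELECT COUNT(*) AS n FROM cps_basic_tabulation", engine)["n"][0]
print(n)  # 2
```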
Since I had the same error message and stumbled across this post, I leave this here for others to find.
I found two ways to solve the duplicated table creation, although I lack the insight as to why they solve it:
Either pass the database name in the URL when creating a connection,
or pass the database name as a schema in pd.to_sql.
Doing both does not hurt. Also, a few years later it is (again?) possible to pass a plain connection to pandas. My guess would be that in the previous answer by joris the first of my solution cases might have implicitly solved the problem.
```
# create connection to MySQL DB via sqlalchemy & pymysql
user = credentials['user']
password = credentials['password']
port = credentials['port']
host = credentials['hostname']
dialect = 'mysql'
driver = 'pymysql'
db_name = 'test_db'

# setup SQLAlchemy
from sqlalchemy import create_engine
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/'
engine = create_engine(cnx)

# create database
with engine.begin() as con:
    con.execute(f"CREATE DATABASE {db_name}")

############################################################
# either pass the db_name vvvv - HERE - vvvv after creating a database
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/{db_name}'
############################################################
engine = create_engine(cnx)

table = 'test_table'
col = 'test_col'
with engine.begin() as con:
    # this would work here instead of creating a new engine with a new link
    # con.execute(f"USE {db_name}")
    con.execute(f"CREATE TABLE {table} ({col} CHAR(1));")

# insert into database
import pandas as pd
df = pd.DataFrame({col: ['a', 'b', 'c']})
with engine.begin() as con:
    # this has no effect here
    # con.execute(f"USE {db_name}")
    df.to_sql(
        name=table,
        if_exists='append',
        # passing con=cnx here would equally work
        con=con,
        ############################################################
        # or pass it as a schema vvvv - HERE - vvvv
        # schema=db_name,
        ############################################################
        index=False
    )
```
Tested with python version 3.8.13, sqlalchemy 1.4.32 and pandas 1.4.2.
The same problem might have appeared here and here.