How to extract select columns from a sqlite database with Python

I am trying to write code that will allow a user to select specific columns from a sqlite database, which will then be transformed into a pandas data frame. I am using a test database titled test_database.db with a table titled test. The table has three columns: id, value_one, and value_two. The function shown below exists within a class that establishes a connection to the database, and the user only needs to pass the table name and a list of columns that they would like to extract. For instance, in command-line sqlite I might type select value_one, value_two from test if I wanted to read in only the columns value_one and value_two from the table test. If I type this command into the command line, it works. However, in this case I use Python to build the text string which is fed into pandas.read_sql_query(), and the method does not work. My code is shown below:
class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        self.cur = self.conn.cursor()

    def query_columns_to_dataframe(table, columns):
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[i] + ', '
        query = query[:-2] + ' from ' + table
        # print(query)
        df = pd.read_sql_query(query, self.conn)
        return

    def close_database()
        self.conn.close
        return

test = ReadSQL(test_database.db)
df = query_columns_to_dataframe('test', ['value_one', 'value_two'])
I am assuming my problem has something to do with the way that query_columns_to_dataframe() pre-processes the information, because if I uncomment the print statement in query_columns_to_dataframe() I get a text string that looks identical to what works when I type it directly into the command line. Any help is appreciated.

I mopped up a few mistakes in your code to produce this, which works. Note that I inadvertently changed the names of the fields in your test db.
import sqlite3
import pandas as pd

class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        self.cur = self.conn.cursor()

    def query_columns_to_dataframe(self, table, columns):
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[i] + ', '
        query = query[:-2] + ' from ' + table
        #~ print(query)
        df = pd.read_sql_query(query, self.conn)
        return df

    def close_database(self):
        self.conn.close()
        return

test = ReadSQL('test_database.db')
df = test.query_columns_to_dataframe('test', ['value_1', 'value_2'])
print(df)
Output:
   value_1  value_2
0        2        3

Your code is full of syntax errors and issues:

- The return in query_columns_to_dataframe should be return df. This is the primary reason why your code does not return anything.
- self.cur is not used.
- The self parameter is missing when declaring query_columns_to_dataframe.
- A colon is missing at the end of the line def close_database().
- The self parameter is missing when declaring close_database.
- Parentheses are missing here: self.conn.close.
- df = query_columns_to_dataframe should be df = test.query_columns_to_dataframe.

Fix these errors and your code should work.
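As an extra hardening step, not part of the answer above: column and table names cannot be passed as SQL parameters, so building the query from user input is open to SQL injection. A minimal sketch, assuming the table name itself comes from trusted code, is to validate the requested columns against the table's actual schema before interpolating them:
import sqlite3
import pandas as pd

def query_columns_to_dataframe_safe(conn, table, columns):
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    known = {row[1] for row in conn.execute('PRAGMA table_info(%s)' % table)}
    unknown = set(columns) - known
    if unknown:
        raise ValueError('Unknown columns: %s' % ', '.join(sorted(unknown)))
    # Only validated identifiers are interpolated into the statement.
    query = 'select ' + ', '.join(columns) + ' from ' + table
    return pd.read_sql_query(query, conn)
With the class from the answer, this could be called as query_columns_to_dataframe_safe(test.conn, 'test', ['value_1', 'value_2']).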

Related

Not able to connect to a table using python cx_oracle

I'm getting the mentioned error while I'm using a WHERE condition in cx_oracle. However, there is no error when I fetch all rows.
Please find my code below. The input cols is a list of all the columns I want to fetch. Similarly, I want to pass the WHERE condition as a variable too, since I pass the values in a loop.
For testing, let's keep it static.
Thanks
cols = [
    'ID',
    'CODE',
    'LOGINDEX_CODE',
    'IS_ACTIVE',
    'IS_PORT_GROUP'
]
table_name = 'PORT'

os.chdir(os.path.dirname(__file__))
main_dir = os.getcwd()

import cx_Oracle
try:
    cx_Oracle.init_oracle_client(lib_dir=main_dir + "\\instantclient_21_3\\")
except:
    pass

dsn_tns = cx_Oracle.makedsn(r'some_db', 1521, service_name=r'some_service')
conn = cx_Oracle.connect(user='abcdef', password=r'ghijkl', dsn=dsn_tns).cursor()

query_cols = ', '.join(cols)
named_params = {
    'varx': query_cols,
    'vary': 'LEG',
    'varz': 242713
}
sql_query = 'SELECT :varx FROM :vary WHERE START_PORT_ID IN :varz'
conn.prepare(sql_query)
conn.execute(sql_query, named_params)
Bind variables cannot be used to replace parts of the SQL statement itself. They can only be used to supply data that is sent to the database. So you would need to do something like this instead:
sql_query = f"select {', '.join(cols)} from LEG where start_port_id = :varz"
with conn.cursor() as cursor:
for row in cursor.execute(sql_query, varz=242713):
print(row)
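If the original IN condition with several values is needed, one placeholder per value must still be generated, since a single bind variable carries a single datum. A sketch, assuming numbered binds against the same LEG table and made-up IDs:
ids = [242713, 242714, 242715]  # hypothetical sample values
# Build one numbered bind placeholder (:1, :2, ...) per value in the list.
placeholders = ', '.join(':%d' % (i + 1) for i in range(len(ids)))
sql_query = f"select {', '.join(cols)} from LEG where start_port_id in ({placeholders})"
with conn.cursor() as cursor:
    for row in cursor.execute(sql_query, ids):
        print(row)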

I can't understand why I'm getting no such table

I'm trying to run an AI-generated SQL statement on my table from an uploaded CSV file, and everything reads the table fine except the last section, when the SQL is executed. Can anyone please suggest where I'm going wrong?
PandaSQLException: (sqlite3.OperationalError) no such table: df [SQL: select * from df] (Background on this error at: https://sqlalche.me/e/14/e3q8)
import streamlit as st
import pandas as pd
from pandasql import sqldf
from QG_Backend import AI_backend

def pysqldf(q):
    '''
    This function allows you to run SQL queries against a pandas dataframe
    :param q: the query string
    :return: A dataframe.
    '''
    return sqldf(q, globals())

def app():
    backend = AI_backend
    df = None
    tables = None
    st.title('Data')
    # This is creating a variable for the user to upload a CSV, XLSX or XLS file.
    uploaded_file = st.sidebar.file_uploader("Upload CSV:", type=['CSV', 'xlsx', 'xls'])
    # This is checking to see if the user has uploaded a file. If they have, then the code will try to
    # read the file as a CSV. If it is not a CSV, then it will try to read it as an Excel spreadsheet. If
    # it is not a CSV or an Excel spreadsheet, then it will display an error message.
    if uploaded_file is not None:
        try:
            df = pd.read_csv(uploaded_file)
            df.columns = df.columns.str.replace(' ', '_')
            df = df.applymap(lambda s: s.casefold() if type(s) == str else s)
        except:
            df = pd.read_excel(uploaded_file)
            df.columns = df.columns.str.replace(' ', '_')
            df = df.applymap(lambda s: s.casefold() if type(s) == str else s)
    else:
        st.sidebar.info("Upload a file to query")
    st.subheader("File Query")
    # Create the columns/layout
    col1, col2 = st.columns(2)
    # This is creating a form for the user to enter their query.
    with col1:
        with st.form(key='query_form'):
            plain_text = st.text_area("Enter your query:")
            submit_text = st.form_submit_button("Execute")
        # This is checking to see if there are any spelling mistakes in the query. If there
        # are, it will correct them.
        fixed = backend.spellCheck(plain_text)
        # Sends the files column headers into a variable
        if df is not None:
            tables = (df.dtypes).to_string()
            # This is creating the prompt for the AI to generate the SQL code.
            prompt = "### Example SQL querys:\nSELECT * FROM df WHERE star LIKE '%Tom Hanks%'\n\n----------\n\nCSV table name: df\n" + tables + "\n### A query to " + fixed + ".\nSELECT"
            st.write(fixed)
            # This is generating the SQL code for the user.
            output = backend.generateSQL(prompt)
            # This is adding the SELECT statement to the output of the AI generated SQL code.
            table_query = "SELECT" + output
            st.write(table_query)
    # This displays the AI generated SQL code as a visual representation to the user
    if df is not None:
        with col2:
            if submit_text:
                st.info("Query Submitted")
                st.subheader("SQL Generated Code:")
                st.write("SELECT" + output)
                # This is creating a collapsible box for the column headers and their types.
                with st.expander("File Columns:"):
                    dashed_tables = tables.replace(' ', '-')
                    st.write(dashed_tables)
            # This is creating a collapsible box for the table.
            with st.expander("Table:"):
                if submit_text:
                    output_table = pysqldf(table_query.casefold())
                    #output_table = sqldf(table_query.casefold(), globals())
                    output_table
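For reference, a minimal standalone sketch of the pandasql behaviour involved, worth checking here: sqldf only resolves table names in the namespace it is handed, so a dataframe that is local to a function (as df is inside app()) will not be found via globals():
import pandas as pd
from pandasql import sqldf

def run_query():
    df = pd.DataFrame({'star': ['Tom Hanks', 'Meg Ryan']})
    # df lives in this function's local scope, so pass locals() (or an
    # explicit {'df': df}) instead of globals() for sqldf to find it.
    return sqldf("select * from df", locals())

print(run_query())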

Why does pymysql not insert records into the table?

I am pretty new to Python development. I have a long Python script that "clones" a database and adds additional stored functions and procedures. Clone means copying only the schema of the DB. These steps work fine.
My question is about the pymysql insert execution:
I have to copy some table contents into the new DB. I don't get any SQL error. If I debug or print the created INSERT INTO command, it is correct (I've tested it in an SQL editor). The insert execution appears correct because the result contains the exact row count... but all rows are missing from the destination table in the destination DB...
(Of course the DB_* variables have been defined!)
import datetime
import pymysql

liveDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, LIVE_DB_NAME)
testDbConn = pymysql.connect(DB_HOST, DB_USER, DB_PWD, TEST_DB_NAME)
tablesForCopy = ['role', 'permission']
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        # Get name of columns
        liveCursor.execute("DESCRIBE `%s`;" % (table))
        columns = ''
        for column in liveCursor.fetchall():
            columns += '`' + column[0] + '`,'
        columns = columns.strip(',')
        # Get and convert values
        values = ''
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if type(item) == type(None):
                    data.append('NULL')
                elif type(item) == type('str'):
                    data.append("'" + item + "'")
                elif type(item) == type(datetime.datetime.now()):
                    data.append("'" + str(item) + "'")
                else:  # for numeric values
                    data.append(str(item))
            v = '(' + ', '.join(data) + ')'
            values += v + ', '
        values = values.strip(', ')
    print("### table: %s" % (table))
    testDbCursor = testDbConn.cursor()
    testDbCursor.execute("INSERT INTO `" + TEST_DB_NAME + "`.`" + table + "` (" + columns + ") VALUES " + values + ";")
    print("Result: {}".format(testDbCursor._result.message))
liveDbConn.close()
testDbConn.close()
Result is:
### table: role
Result: b"'Records: 16 Duplicates: 0 Warnings: 0"
### table: permission
Result: b'(Records: 222 Duplicates: 0 Warnings: 0'
What am I doing wrong? Thanks!
You have 2 main issues here:
You don't use conn.commit() (which here would be either liveDbConn.commit() or testDbConn.commit()). Changes to the database will not be reflected without committing them. Note that all data-changing statements need committing, but a SELECT, for example, does not.
Your query is open to SQL Injection. This is a serious problem.
Table names cannot be parameterized, so there's not much we can do about that, but you'll want to parameterize your values. I've made multiple corrections to the code in relation to type checking as well as parameterization.
for table in tablesForCopy:
    with liveDbConn.cursor() as liveCursor:
        liveCursor.execute("SELECT * FROM `%s`;" % (table))
        name_of_columns = ', '.join(item[0] for item in liveCursor.description)
        insert_list = []
        for result in liveCursor.fetchall():
            data = []
            for item in result:
                if item is None:  # test identity against the None singleton
                    data.append(None)  # the driver converts None to NULL itself
                elif isinstance(item, str):  # use isinstance to check type
                    data.append(item)
                elif isinstance(item, datetime.datetime):
                    data.append(item.strftime('%Y-%m-%d %H:%M:%S'))
                else:  # for numeric values
                    data.append(str(item))
            insert_list.append(data)
    testDbCursor = testDbConn.cursor()
    # One %s placeholder per column; the driver handles quoting and escaping.
    placeholders = ', '.join('%s' for item in insert_list[0])
    testDbCursor.executemany("INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(
        TEST_DB_NAME,
        table,
        name_of_columns,
        placeholders),
        insert_list)
    testDbConn.commit()
From this github thread, I notice that executemany does not work as expected in psycopg2; it instead sends each entry as a single query. You'll need to use execute_batch:
from psycopg2.extras import execute_batch

execute_batch(testDbCursor,
              "INSERT INTO `{}`.`{}` ({}) VALUES ({})".format(TEST_DB_NAME,
                                                              table,
                                                              name_of_columns,
                                                              placeholders),
              insert_list)
testDbConn.commit()
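Back on the pymysql side, a small sketch of an alternative to explicit commits: pymysql connections do not autocommit by default, but autocommit can be switched on at connect time.
import pymysql

# Assumes the same DB_* variables as above; autocommit=True makes every
# statement take effect immediately, without explicit commit() calls.
testDbConn = pymysql.connect(host=DB_HOST, user=DB_USER, password=DB_PWD,
                             db=TEST_DB_NAME, autocommit=True)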
How to insert data into a table using python pymysql
Find my solution below
import pymysql
import datetime

# Create a connection object
dbServerName = "127.0.0.1"
port = 8889
dbUser = "root"
dbPassword = ""
dbName = "blog_flask"
# charSet = "utf8mb4"
conn = pymysql.connect(host=dbServerName, user=dbUser, password=dbPassword, db=dbName, port=port)
try:
    # Create a cursor object
    cursor = conn.cursor()
    # Insert rows into the MySQL Table
    now = datetime.datetime.utcnow()
    my_datetime = now.strftime('%Y-%m-%d %H:%M:%S')
    cursor.execute('INSERT INTO posts (post_id, post_title, post_content, \
        filename, post_time) VALUES (%s,%s,%s,%s,%s)',
        (5, 'title2', 'description2', 'filename2', my_datetime))
    conn.commit()
except Exception as e:
    print("Exception occurred: {}".format(e))
finally:
    conn.close()

How to remove single quotes around variables when doing mysql queries in Python?

I had a question pertaining to MySQL as used in Python. Basically, I have a dropdown menu on a webpage using Flask that provides the parameters to change the MySQL queries. Here is a code snippet of my problem:
select = request.form.get('option')
select_2 = request.form.get('option_2')
conn = mysql.connect()
cursor = conn.cursor()
query = "SELECT * FROM tbl_user WHERE %s = %s;"
cursor.execute(query, (select, select_2))
data = cursor.fetchall()
This returns no data from the query because there are single quotes around the first variable, i.e.
Select * from tbl_user where 'user_name' = 'Adam'
versus
Select * from tbl_user where user_name = 'Adam'.
Could someone explain how to remove these single quotes around the columns for me? When I hard-code the columns I want to use, it gives me back my desired data, but when I try to do it this way, it merely returns []. Any help is appreciated.
I have a working solution for pymysql, which is to override the escape method in the class pymysql.connections.Connection, which by default adds "'" around your string. Maybe you can try a similar approach; check this:
from pymysql.connections import Connection, converters

class MyConnect(Connection):
    def escape(self, obj, mapping=None):
        """Escape whatever value you pass to it.
        Non-standard, for internal use; do not use this in your applications.
        """
        if isinstance(obj, str):
            return self.escape_string(obj)  # by default, it is: return "'" + self.escape_string(obj) + "'"
        if isinstance(obj, (bytes, bytearray)):
            ret = self._quote_bytes(obj)
            if self._binary_prefix:
                ret = "_binary" + ret
            return ret
        return converters.escape_item(obj, self.charset, mapping=mapping)

config = {'host': '', 'user': '', ...}
conn = MyConnect(**config)
cur = conn.cursor()
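A simpler alternative, sketched here rather than taken from the answer above: keep the identifier out of the bind parameters entirely by validating it against a whitelist, and bind only the value.
ALLOWED_COLUMNS = {'user_name', 'user_email'}  # hypothetical column names

def fetch_by_column(cursor, column, value):
    if column not in ALLOWED_COLUMNS:
        raise ValueError('Unexpected column: %r' % column)
    # The identifier is interpolated only after validation; the value is
    # still bound normally, so it keeps its (correct) quoting.
    query = 'SELECT * FROM tbl_user WHERE {} = %s'.format(column)
    cursor.execute(query, (value,))
    return cursor.fetchall()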

getting only updated data from database

I have to get the recently updated data from a database. To solve this, I save the last-read row number in a Python shelve. The following code works for a simple query like select * from rows. My code is:
from pyodbc import connect
from peewee import *
import random
import shelve
import connection

d = shelve.open("data.shelve")
db = SqliteDatabase("data.db")

class Rows(Model):
    valueone = IntegerField()
    valuetwo = IntegerField()

    class Meta:
        database = db

def CreateAndPopulate():
    db.connect()
    db.create_tables([Rows], safe=True)
    with db.atomic():
        for i in range(100):
            row = Rows(valueone=random.randrange(0, 100), valuetwo=random.randrange(0, 100))
            row.save()
    db.close()

def get_last_primay_key():
    return d.get('max_row', 0)

def doWork():
    query = "select * from rows"  # could be anything
    conn = connection.Connection("localhost", "", "SQLite3 ODBC Driver", "data.db", "", "")
    max_key_query = "SELECT MAX(%s) from %s" % ("id", "rows")
    max_primary_key = conn.fetch_one(max_key_query)[0]
    print "max_primary_key " + str(max_primary_key)
    last_primary_key = get_last_primay_key()
    print "last_primary_key " + str(last_primary_key)
    if max_primary_key == last_primary_key:
        print "no new records"
    elif max_primary_key > last_primary_key:
        print "There are some datas"
        optimizedQuery = query + " where id>" + str(last_primary_key)
        print query
        for data in conn.fetch_all(optimizedQuery):
            print data
        d['max_row'] = max_primary_key
        # print d['max_row']

# CreateAndPopulate() # to populate data
doWork()
The code will work for a simple query without a where clause, but the query can be anything from simple to complex, with joins and multiple where clauses. If so, the portion where I'm appending where will fail. How can I get only the last updated data from the database, whatever the query may be?
PS: I cannot modify the database. I just have to fetch from it.
Use an OFFSET clause. For example:
SELECT * FROM [....] WHERE [....] LIMIT -1 OFFSET 1000
In your query, replace 1000 with a parameter bound to your shelve variable. That will skip the top "shelve" number of rows and only grab newer ones. You may want to consider a more robust refactor eventually, but good luck.
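A small sketch of that idea against the data.db used above, with the offset bound as a parameter (the shelve key name here is an assumption):
import shelve
import sqlite3

d = shelve.open("data.shelve")
seen = d.get('rows_seen', 0)  # hypothetical shelve key for rows already read

with sqlite3.connect("data.db") as conn:
    # LIMIT -1 means "no limit" in SQLite; OFFSET skips the rows seen so far.
    rows = conn.execute("select * from rows LIMIT -1 OFFSET ?", (seen,)).fetchall()

d['rows_seen'] = seen + len(rows)
d.close()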
