Poll SQL table using Python? - python

I want to implement something similar to this, where the custom function classify_process() continuously asks a Redis db for new entries.
Can this be done a regular SQL table? The db.pipeline() seems to be Redis-specific, so I'm not sure how to imitate this functionality on a SQL db.
My goal is to check one table for new rows, and if there's a new row, run a prediction using a ML model.
QUESTION: How can I continously check if there are any new records in a MS SQL table (which has an interger ID column that automatically increments) and then trigger a function?
My idea of what has to happen using sqlalchemy:
import time
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
DRIVER = 'driver'
SERVER = 'server'
DATABASE = 'database'
engine = create_engine(f'mssql+pyodbc://{SERVER}/{DATABASE}?{DRIVER}')
SessionLocal = sessionmaker(bind=engine)
def some_process():
while True:
# look for new rows in table, then execute prediction on newest rows
...some code...?
# Sleep for x seconds
time.sleep(5)
if __name__ == "__main__":
some_process()

With an auto-incrementing integer (IDENTITY(1,1)) primary key you would check for new rows by looking up the largest PK value …
SELECT TOP 1 id FROM your_table ORDER BY id DESC
… and compare it with the previous largest value, like this
max_id_query = session.query(Thing.id).order_by(Thing.id.desc()).limit(1)
last_max_id = max_id_query.scalar()
while True:
max_id = max_id_query.scalar()
if max_id > last_max_id:
print(
f"New row(s) found. "
f"Processing ids {last_max_id + 1} through {max_id}"
)
# … do stuff
last_max_id = max_id
time.sleep(5)

Related

Fetching data from Postgresq Database using sqlalchemy.select() in python

I am using python and SQLalchemy to fetch data from a table.
import sqlalchemy as db
import pandas as pd
DATABASE_URI = 'postgres+psycopg2://postgres:postgresql#localhost:5432/postgres'
engine = db.create_engine(DATABASE_URI)
connection = engine.connect()
project_table = db.Table('project', metadata, autoload=True, autoload_with=engine)
here i want to fetch records based on a list of ids which i have.
l=[557997, 558088, 623106, 558020, 623108, 557836, 557733, 622792, 623511, 623185]
query1 = db.select([project_table ]).where(project_table .columns.project_id.in_(l))
#sql query= "select * from project where project_id in l"
Result = connection.execute(query1)
Rset = Result.fetchall()
df = pd.DataFrame(Rset)
print(df.head())
Here when i print df.head() I am getting an empty dataframe. I am not able to pass a list to the above query. Is there a way to send a list to in to above query.
The result should contain the rows in the table which are equal to project_id's given.
i.e.
project_id project_name project_date project_developer
557997 Test1 24-05-2011 Ajay
558088 Test2 24-06-2003 Alex
These rows will be inserted into dataset.
The Query is
"select * from project where project_id in (557997, 558088, 623106, 558020, 623108, 557836, 557733, 622792, 623511, 623185)"
here as i cant give static values I will insert the values to a list and pass this list to query as a parameter.
This is where i am having a problem. I cant pass a list as a parameter to db.select().How can i pass a list to db.select()
After many trails i have found out that because of large data the query is fetching and also less ram in my workstation, the query returned null(no results). so what I did was
Result = connection.execute(query1)
while True:
rows = Result.fetchmany(10000)
if not rows:
break
for row in rows:
table_data.append(row)
pass
df1 = pd.DataFrame(table_data)
df1.columns = columns
After this the program was working fine.

How to update table sqlite with python in loop

I am trying to calculate the mode value of each row and store the value in the judge = judge column, however it updates only the first record and leaves the loop
ps: Analisador is my table and resultado_2 is my db
import sqlite3
import statistics
conn = sqlite3.connect("resultado_2.db")
cursor = conn.cursor()
data = cursor.execute("SELECT Bow, FastText, Glove, Wordvec, Python, juiz, id FROM Analisador")
for x in data:
list = [x[0],x[1],x[2],x[3],x[4],x[5],x[6]]
mode = statistics.mode(list)
try:
cursor.execute(f"UPDATE Analisador SET juiz={mode} where id={row[6]}") #row[6] == id
conn.commit()
except:
print("Error")
conn.close()
You have to fetch your records after SQL is executed:
cursor.execute("SELECT Bow, FastText, Glove, Wordvec, Python, juiz, id FROM Analisador")
data = cursor.fetchall()
That type of SQL query is different from UPDATE (that you're using in your code too) which doesn't need additional step after SQL is executed.

getting only updated data from database

I have to get the recently updated data from database. For the purpose of solving it, I have saved the last read row number into shelve of python. The following code works for a simple query like select * from rows. My code is:
from pyodbc import connect
from peewee import *
import random
import shelve
import connection
d = shelve.open("data.shelve")
db = SqliteDatabase("data.db")
class Rows(Model):
valueone = IntegerField()
valuetwo = IntegerField()
class Meta:
database = db
def CreateAndPopulate():
db.connect()
db.create_tables([Rows],safe=True)
with db.atomic():
for i in range(100):
row = Rows(valueone=random.randrange(0,100),valuetwo=random.randrange(0,100))
row.save()
db.close()
def get_last_primay_key():
return d.get('max_row',0)
def doWork():
query = "select * from rows" #could be anything
conn = connection.Connection("localhost","","SQLite3 ODBC Driver","data.db","","")
max_key_query = "SELECT MAX(%s) from %s" % ("id", "rows")
max_primary_key = conn.fetch_one(max_key_query)[0]
print "max_primary_key " + str(max_primary_key)
last_primary_key = get_last_primay_key()
print "last_primary_key " + str(last_primary_key)
if max_primary_key == last_primary_key:
print "no new records"
elif max_primary_key > last_primary_key:
print "There are some datas"
optimizedQuery = query + " where id>" + str(last_primary_key)
print query
for data in conn.fetch_all(optimizedQuery):
print data
d['max_row'] = max_primary_key
# print d['max_row']
# CreateAndPopulate() # to populate data
doWork()
While the code will work for a simple query without where clause, but the query can be anything from simple to complex, having joins and multiple where clauses. If so, then the portion where I'm adding where will fail. How can I get only last updated data from database whatever be the query?
PS: I cannot modify database. I just have to fetch from it.
Use an OFFSET clause. For example:
SELECT * FROM [....] WHERE [....] LIMIT -1 OFFSET 1000
In your query, replace 1000 with a parameter bound to your shelve variable. That will skip the top "shelve" number of rows and only grab newer ones. You may want to consider a more robust refactor eventually, but good luck.

How can I get int result from web python sql query?

My code is as follows. My question is, how can I get the int result from the query?
import web
db = web.database(...)
rs = db.query('select max(id) from tablename')
The query method returns a list. Then you can access your calculated column by using an alias. If you don't want to use an alias, then I think you can do rs[0]['max(id)']
rs = db.query('select max(id) as max_value from tablename')
print rs[0].max_value
This is based on the example used in the web.py docs: http://webpy.org/cookbook/query
Here is the example from the link:
import web
db = web.database(dbn='postgres', db='mydata', user='dbuser', pw='')
results = db.query("SELECT COUNT(*) AS total_users FROM users")
print results[0].total_users # -> prints number of entries in 'users' table

Getting the id of the last record inserted for Postgresql SERIAL KEY with Python

I am using SQLAlchemy without the ORM, i.e. using hand-crafted SQL statements to directly interact with the backend database. I am using PG as my backend database (psycopg2 as DB driver) in this instance - I don't know if that affects the answer.
I have statements like this,for brevity, assume that conn is a valid connection to the database:
conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123)")
Assume also that the user table consists of the columns (id [SERIAL PRIMARY KEY], name, country_id)
How may I obtain the id of the new user, ideally, without hitting the database again?
You might be able to use the RETURNING clause of the INSERT statement like this:
result = conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123)
RETURNING *")
If you only want the resulting id:
result = conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123)
RETURNING id")
[new_id] = result.fetchone()
User lastrowid
result = conn.execute("INSERT INTO user (name, country_id) VALUES ('Homer', 123)")
result.lastrowid
Current SQLAlchemy documentation suggests
result.inserted_primary_key should work!
Python + SQLAlchemy
after commit, you get the primary_key column id (autoincremeted) updated in your object.
db.session.add(new_usr)
db.session.commit() #will insert the new_usr data into database AND retrieve id
idd = new_usr.usrID # usrID is the autoincremented primary_key column.
return jsonify(idd),201 #usrID = 12, correct id from table User in Database.
this question has been asked many times on stackoverflow and no answer I have seen is comprehensive. Googling 'sqlalchemy insert get id of new row' brings up a lot of them.
There are three levels to SQLAlchemy.
Top: the ORM.
Middle: Database abstraction (DBA) with Table classes etc.
Bottom: SQL using the text function.
To an OO programmer the ORM level looks natural, but to a database programmer it looks ugly and the ORM gets in the way. The DBA layer is an OK compromise. The SQL layer looks natural to database programmers and would look alien to an OO-only programmer.
Each level has it own syntax, similar but different enough to be frustrating. On top of this there is almost too much documentation online, very hard to find the answer.
I will describe how to get the inserted id AT THE SQL LAYER for the RDBMS I use.
Table: User(user_id integer primary autoincrement key, user_name string)
conn: Is a Connection obtained within SQLAlchemy to the DBMS you are using.
SQLite
======
insstmt = text(
'''INSERT INTO user (user_name)
VALUES (:usernm) ''' )
# Execute within a transaction (optional)
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = result.lastrowid
txn.commit()
MS SQL Server
=============
insstmt = text(
'''INSERT INTO user (user_name)
OUTPUT inserted.record_id
VALUES (:usernm) ''' )
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = result.fetchone()[0]
txn.commit()
MariaDB/MySQL
=============
insstmt = text(
'''INSERT INTO user (user_name)
VALUES (:usernm) ''' )
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = conn.execute(text('SELECT LAST_INSERT_ID()')).fetchone()[0]
txn.commit()
Postgres
========
insstmt = text(
'''INSERT INTO user (user_name)
VALUES (:usernm)
RETURNING user_id ''' )
txn = conn.begin()
result = conn.execute(insstmt, usernm='Jane Doe')
# The id!
recid = result.fetchone()[0]
txn.commit()
result.inserted_primary_key
Worked for me. The only thing to note is that this returns a list that contains that last_insert_id.
Make sure you use fetchrow/fetch to receive the returning object
insert_stmt = user.insert().values(name="homer", country_id="123").returning(user.c.id)
row_id = await conn.fetchrow(insert_stmt)
For Postgress inserts from python code is simple to use "RETURNING" keyword with the "col_id" (name of the column which you want to get the last inserted row id) in insert statement at end
syntax -
from sqlalchemy import create_engine
conn_string = "postgresql://USERNAME:PSWD#HOSTNAME/DATABASE_NAME"
db = create_engine(conn_string)
conn = db.connect()
INSERT INTO emp_table (col_id, Name ,Age)
VALUES(3,'xyz',30) RETURNING col_id;
or
(if col_id column is auto increment)
insert_sql = (INSERT INTO emp_table (Name ,Age)
VALUES('xyz',30) RETURNING col_id;)
result = conn.execute(insert_sql)
[last_row_id] = result.fetchone()
print(last_row_id)
#output = 3
ex -

Categories