I have a python code that create sql query in order to allow the user to filter the required data from a database.
Until now I am able to create a query using function.
The problem is that the logical operators or/and are the same for all fields
Query: name.str.contains('') or nickname.str.contains('') or mother_name.str.contains('') or first_nationality == 'None'
what i want is to be able to create different logical operator for each field like this:
Query: name.str.contains('') or nickname.str.contains('') and mother_name.str.contains('') or first_nationality == 'None'
the image below show the user input and the constructed query
code:
import streamlit as st
import pandas as pd
#SQL pckgs
import pyodbc
def build_query(input_values, logical_op, compare_op):
query_frags = []
for k, v in input_values.items():
if v['dtype'] == list:
query_frag_expanded = [f"{v['db_col']} {compare_op} '{val}'" for val in v['value']]
query_frag = f' {logical_op} '.join(query_frag_expanded)
elif v['dtype'] == int or v['dtype'] == float:
query_frag = f"{v['db_col']} {compare_op} {v['dtype'](v['value'])}"
elif v['dtype'] == str:
query_frag = f"{v['db_col']}.str.contains('{v['dtype'](v['value'])}')"
else:
query_frag = f"{v['db_col']} {compare_op} '{v['dtype'](v['value'])}')"
query_frags.append(query_frag)
query = f' {logical_op} '.join(query_frags)
return query
def configure_query():
c1, c2, _ = st.columns([1,1,2])
with c1:
logical_op = st.selectbox('Logical operator', options=['and', 'or'], index=1)
with c2:
compare_op = st.selectbox('Comparator operator', options=['==', '>', '<', '<=', '>='], index=0)
return logical_op, compare_op
logical_op, compare_op = configure_query()
query = build_query(input_values, logical_op, compare_op)
Take a look at SQLAlchemy
I've never used pyodbc to be honest and I'm not sure if you have any constraint that makes you use it. I normally use Sqlalchemy to create a connection to a database (my case is PostgreSQL) and then I use pd.read_sql(QUERY,con) where con is the connection I've created with Sqlalchemy and QUERY is a string that represents the sql query.
If you do that you would be able to easily create a function that receives the conditions you want to add and pass it to the query text.
Related
I am trying to make a query to BigQuery in order to modify all the values of a row (in python). When I use a simple string to query, I have no problems. Nevertheless, when I introduce the string formatting the query does not work. As follows I'm presenting the same query, but diminishing the number of columns that I am modifying.
I already made the connection to BigQuery, by defining the Client, etc (and works properly).
I tried:
"UPDATE `riscos-dev.survey_test.data-test-bdrn` SET informaci_meteorol_gica = {inf}, risc = {ri} WHERE objectid = {obj_id}".format(inf = df.informaci_meteorol_gica[index], ri = df.risc[index], obj_id = df.objectid[index])
To specify the input values in format:
df.informaci_meteorol_gica[index] = 'Neu' , also a string for df.risc[index] and df.objectid[index] = 3
I am obtaining the following error message:
BadRequest: 400 Braced constructors are not supported at [1:77]
Instead of using format method of string, I propose you another approach with the f string formating in Python :
def build_query():
inf = "'test_inf'"
ri = "'test_ri'"
obj_id = "'test_obj_id'"
return f"UPDATE `riscos-dev.survey_test.data-test-bdrn` SET informaci_meteorol_gica = {inf}, risc = {ri} WHERE objectid = {obj_id}"
if __name__ == '__main__':
query = build_query()
print(query)
The result is :
UPDATE `riscos-dev.survey_test.data-test-bdrn` SET informaci_meteorol_gica = 'test_inf', risc = 'test_ri' WHERE objectid = 'test_obj_id'
I mocked the query params in my example with :
inf = "'test_inf'"
ri = "'test_ri'"
obj_id = "'test_obj_id'"
I'm trying to implement dynamic filtering using SQLAlchemy ORM.
I was looking through StackOverflow and found very similar question:SQLALchemy dynamic filter_by
It's useful for me, but not enough.
So, here is some example of code, I'm trying to write:
# engine - MySQL engine
session_maker = sessionmaker(bind=engine)
session = session_maker()
# my custom model
model = User
def get_query(session, filters):
if type(filters) == tuple:
query = session.query(model).filter(*filters)
elif type(filters) == dict:
query = session.query(model).filter(**filters)
return query
then I'm trying to reuse it with something very similar:
filters = (User.name == 'Johny')
get_query(s, filters) # it works just fine
filters = {'name': 'Johny'}
get_query(s, filters)
After the second run, there are some issues:
TypeError: filter() got an unexpected keyword argument 'name'
When I'm trying to change my filters to:
filters = {User.name: 'Johny'}
it returns:
TypeError: filter() keywords must be strings
But it works fine for manual querying:
s.query(User).filter(User.name == 'Johny')
What is wrong with my filters?
BTW, it looks like it works fine for case:
filters = {'name':'Johny'}
s.query(User).filter_by(**filters)
But following the recommendations from mentioned post I'm trying to use just filter.
If it's just one possible to use filter_by instead of filter, is there any differences between these two methods?
Your problem is that filter_by takes keyword arguments, but filter takes expressions. So expanding a dict for filter_by **mydict will work. With filter, you normally pass it one argument, which happens to be an expression. So when you expand your **filters dict to filter, you pass filter a bunch of keyword arguments that it doesn't understand.
If you want to build up a set of filters from a dict of stored filter args, you can use the generative nature of the query to keep applying filters. For example:
# assuming a model class, User, with attributes, name_last, name_first
my_filters = {'name_last':'Duncan', 'name_first':'Iain'}
query = session.query(User)
for attr,value in my_filters.iteritems():
query = query.filter( getattr(User,attr)==value )
# now we can run the query
results = query.all()
The great thing about the above pattern is you can use it across multiple joined columns, you can construct 'ands' and 'ors' with and_ and or_, you can do <= or date comparisons, whatever. It's much more flexible than using filter_by with keywords. The only caveat is that for joins you have to be a bit careful you don't accidentally try to join a table twice, and you might have to specify the join condition for complex filtering. I use this in some very complex filtering over a pretty involved domain model and it works like a charm, I just keep a dict going of entities_joined to keep track of the joins.
I have a similar issue, tried to filter from a dictionary:
filters = {"field": "value"}
Wrong:
...query(MyModel).filter(**filters).all()
Good:
...query(MyModel).filter_by(**filters).all()
FWIW, There's a Python library designed to solve this exact problem: sqlalchemy-filters
It allows to dynamically filter using all operators, not only ==.
from sqlalchemy_filters import apply_filters
# `query` should be a SQLAlchemy query object
filter_spec = [{'field': 'name', 'op': '==', 'value': 'name_1'}]
filtered_query = apply_filters(query, filter_spec)
more_filters = [{'field': 'foo_id', 'op': 'is_not_null'}]
filtered_query = apply_filters(filtered_query, more_filters)
result = filtered_query.all()
For the people using FastAPI and SQLAlchemy, here is a example of dynamic filtering:
api/app/app/crud/order.py
from typing import Optional
from pydantic import UUID4
from sqlalchemy.orm import Session
from app.crud.base import CRUDBase
from app.models.order import Order
from app.schemas.order import OrderCreate, OrderUpdate
class CRUDOrder(CRUDBase[Order, OrderCreate, OrderUpdate]):
def get_orders(
self,
db: Session,
owner_id: UUID4,
status: str,
trading_type: str,
pair: str,
skip: int = 0,
limit: int = 100,
) -> Optional[Order]:
filters = {
arg: value
for arg, value in locals().items()
if arg != "self" and arg != "db" and arg != "skip" and arg != "limit" and value is not None
}
query = db.query(self.model)
for attr, value in filters.items():
query = query.filter(getattr(self.model, attr) == value)
return (
query
.offset(skip)
.limit(limit)
.all()
)
order = CRUDOrder(Order)
class Place(db.Model):
id = db.Column(db.Integer, primary_key=True)
search_id = db.Column(db.Integer, db.ForeignKey('search.id'), nullable=False)
#classmethod
def dynamic_filter(model_class, filter_condition):
'''
Return filtered queryset based on condition.
:param query: takes query
:param filter_condition: Its a list, ie: [(key,operator,value)]
operator list:
eq for ==
lt for <
ge for >=
in for in_
like for like
value could be list or a string
:return: queryset
'''
__query = db.session.query(model_class)
for raw in filter_condition:
try:
key, op, value = raw
except ValueError:
raise Exception('Invalid filter: %s' % raw)
column = getattr(model_class, key, None)
if not column:
raise Exception('Invalid filter column: %s' % key)
if op == 'in':
if isinstance(value, list):
filt = column.in_(value)
else:
filt = column.in_(value.split(','))
else:
try:
attr = list(filter(lambda e: hasattr(column, e % op), ['%s', '%s_', '__%s__']))[0] % op
except IndexError:
raise Exception('Invalid filter operator: %s' % op)
if value == 'null':
value = None
filt = getattr(column, attr)(value)
__query = __query.filter(filt)
return __query
Execute like:
places = Place.dynamic_filter([('search_id', 'eq', 1)]).all()
I have a table with equipment and each of them has dates for level of maintenance. The user can select the maintenance level. So, I should adjust my SQLAlchemy for each combination of maintenance level chosen. For example:
SELECT * WHERE (equipment IN []) AND m_level1 = DATE AND m_level2 = DATE ....)
So it is possible to have combinations for each if condition, depending on checkboxes I used multiple strings to reach my goal, but I want to improve the query using SQLAlchemy.
I assume you are using the ORM.
in that case, the filter function returns a query object. You can conditionaly build the query by doing something like
query = Session.query(schema.Object).filter_by(attribute=value)
if condition:
query = query.filter_by(condition_attr=condition_val)
if another_condition:
query = query.filter_by(another=another_val)
#then finally execute it
results = query.all()
The function filter(*criterion) means you can use tuple as it's argument, #Wolph has detail here:
SQLALchemy dynamic filter_by for detail
if we speak of SQLAlchemy core, there is another way:
from sqlalchemy import and_
filters = [table.c.col1 == filter1, table.c.col2 > filter2]
query = table.select().where(and_(*filters))
If you're trying to filter based on incoming form criteria:
form = request.form.to_dict()
filters = []
for col in form:
sqlalchemybinaryexpression = (getattr(MODEL, col) == form[col])
filters.append(sqlalchemybinaryexpression)
query = table.select().where(and_(*filters))
Where MODEL is your SQLAlchemy Model
Another resolution to this question, this case is raised in a more secure way, since it verifies if the field to be filtered exists in the model.
To add operators to the value to which you want to filter. And not having to add a new parameter to the query, we can add an operator before the value e.g ?foo=>1, '?foo=<1,?foo=>=1, ?foo=<=1 ', ?foo=!1,?foo=1, and finally between which would be like this `?foo=a, b'.
from sqlalchemy.orm import class_mapper
import re
# input parameters
filter_by = {
"column1": "!1", # not equal to
"column2": "1", # equal to
"column3": ">1", # great to. etc...
}
def computed_operator(column, v):
if re.match(r"^!", v):
"""__ne__"""
val = re.sub(r"!", "", v)
return column.__ne__(val)
if re.match(r">(?!=)", v):
"""__gt__"""
val = re.sub(r">(?!=)", "", v)
return column.__gt__(val)
if re.match(r"<(?!=)", v):
"""__lt__"""
val = re.sub(r"<(?!=)", "", v)
return column.__lt__(val)
if re.match(r">=", v):
"""__ge__"""
val = re.sub(r">=", "", v)
return column.__ge__(val)
if re.match(r"<=", v):
"""__le__"""
val = re.sub(r"<=", "", v)
return column.__le__(val)
if re.match(r"(\w*),(\w*)", v):
"""between"""
a, b = re.split(r",", v)
return column.between(a, b)
""" default __eq__ """
return column.__eq__(v)
query = Table.query
filters = []
for k, v in filter_by.items():
mapper = class_mapper(Table)
if not hasattr(mapper.columns, k):
continue
filters.append(computed_operator(mapper.columns[k], "{}".format(v))
query = query.filter(*filters)
query.all()
Here is a solution that works both for AND or OR...
Just replace or_ with and_ in the code if you need that case:
from sqlalchemy import or_, and_
my_filters = set() ## <-- use a set to contain only unique values, avoid duplicates
if condition_1:
my_filters.add(MySQLClass.id == some_id)
if condition_2:
my_filters.add(MySQLClass.name == some_name)
fetched = db_session.execute(select(MySQLClass).where(or_(*my_filters))).scalars().all()
Im struggling at a very specific problem here:
I need to make a LIKE search in SQLAlchemy, but the amount of keywords are varying.
Heres the code for one keyword:
search_query = request.form["searchinput"]
if selectet_wg and not lagernd:
query = db_session.query(
Artikel.Artnr,
Artikel.Benennung,
Artikel.Bestand,
Artikel.Vkpreisbr1
).filter(
and_(
Artikel.Benennung.like("%"+search_query+"%"),
Artikel.Wg == selectet_wg
)
).order_by(Artikel.Vkpreisbr1.asc())
"searchinput" looks like this : "property1,property2,property3", but also can be just 1,2,5 or more propertys.
I want to split the searchinput at "," (yes i know how to do that :) ) and insert another LIKE search for every property.
So for the above example the search should be looking like this:
search_query = request.form["searchinput"]
if selectet_wg and not lagernd:
query = db_session.query(
Artikel.Artnr,
Artikel.Benennung,
Artikel.Bestand,
Artikel.Vkpreisbr1
).filter(
and_(
Artikel.Benennung.like("%"+search_query+"%"), #property1
Artikel.Benennung.like("%"+search_query+"%"), #property2
Artikel.Benennung.like("%"+search_query+"%"), #property3
Artikel.Wg == selectet_wg
)
).order_by(Artikel.Vkpreisbr1.asc())
I dont think its a smart idea just to make an if statement for the amount of propertys and write down the query serveral times...
Im using the newest version of sqlalchemy and python 3.4
It should be possible to create a list of you like filters and pass them all to and_.
First create a list of the like queries:
search_queries = search_query.split(',')
filter = [Artikel.Benennung.like("%"+sq"%") for sq in search_queries]
Then pass them to and_, unpacking the list:
and_(
Artikel.Wg == selectet_wg,
*filter
)
*filter has to be the last argument to and_ otherwise python will give you an error.
You can call filter multiple times:
search_query = request.form["searchinput"]
if selectet_wg and not lagernd:
query = db_session.query(
Artikel.Artnr,
Artikel.Benennung,
Artikel.Bestand,
Artikel.Vkpreisbr1
).filter(Artikel.Wg == selectet_wg)
for prop in search_query.split(','):
query = query.filter(Artikel.Benennung.like("%"+prop+"%"))
query = query.order_by(Artikel.Vkpreisbr1.asc())
I have to insert 8000+ records into a SQLite database using Django's ORM. This operation needs to be run as a cronjob about once per minute.
At the moment I'm using a for loop to iterate through all the items and then insert them one by one.
Example:
for item in items:
entry = Entry(a1=item.a1, a2=item.a2)
entry.save()
What is an efficient way of doing this?
Edit: A little comparison between the two insertion methods.
Without commit_manually decorator (11245 records):
nox#noxdevel marinetraffic]$ time python manage.py insrec
real 1m50.288s
user 0m6.710s
sys 0m23.445s
Using commit_manually decorator (11245 records):
[nox#noxdevel marinetraffic]$ time python manage.py insrec
real 0m18.464s
user 0m5.433s
sys 0m10.163s
Note: The test script also does some other operations besides inserting into the database (downloads a ZIP file, extracts an XML file from the ZIP archive, parses the XML file) so the time needed for execution does not necessarily represent the time needed to insert the records.
You want to check out django.db.transaction.commit_manually.
http://docs.djangoproject.com/en/dev/topics/db/transactions/#django-db-transaction-commit-manually
So it would be something like:
from django.db import transaction
#transaction.commit_manually
def viewfunc(request):
...
for item in items:
entry = Entry(a1=item.a1, a2=item.a2)
entry.save()
transaction.commit()
Which will only commit once, instead at each save().
In django 1.3 context managers were introduced.
So now you can use transaction.commit_on_success() in a similar way:
from django.db import transaction
def viewfunc(request):
...
with transaction.commit_on_success():
for item in items:
entry = Entry(a1=item.a1, a2=item.a2)
entry.save()
In django 1.4, bulk_create was added, allowing you to create lists of your model objects and then commit them all at once.
NOTE the save method will not be called when using bulk create.
>>> Entry.objects.bulk_create([
... Entry(headline="Django 1.0 Released"),
... Entry(headline="Django 1.1 Announced"),
... Entry(headline="Breaking: Django is awesome")
... ])
In django 1.6, transaction.atomic was introduced, intended to replace now legacy functions commit_on_success and commit_manually.
from the django documentation on atomic:
atomic is usable both as a decorator:
from django.db import transaction
#transaction.atomic
def viewfunc(request):
# This code executes inside a transaction.
do_stuff()
and as a context manager:
from django.db import transaction
def viewfunc(request):
# This code executes in autocommit mode (Django's default).
do_stuff()
with transaction.atomic():
# This code executes inside a transaction.
do_more_stuff()
Bulk creation is available in Django 1.4:
https://django.readthedocs.io/en/1.4/ref/models/querysets.html#bulk-create
Have a look at this. It's meant for use out-of-the-box with MySQL only, but there are pointers on what to do for other databases.
You might be better off bulk-loading the items - prepare a file and use a bulk load tool. This will be vastly more efficient than 8000 individual inserts.
To answer the question particularly with regard to SQLite, as asked, while I have just now confirmed that bulk_create does provide a tremendous speedup there is a limitation with SQLite: "The default is to create all objects in one batch, except for SQLite where the default is such that at maximum 999 variables per query is used."
The quoted stuff is from the docs--- A-IV provided a link.
What I have to add is that this djangosnippets entry by alpar also seems to be working for me. It's a little wrapper that breaks the big batch that you want to process into smaller batches, managing the 999 variables limit.
You should check out DSE. I wrote DSE to solve these kinds of problems ( massive insert or updates ). Using the django orm is a dead-end, you got to do it in plain SQL and DSE takes care of much of that for you.
Thomas
def order(request):
if request.method=="GET":
cust_name = request.GET.get('cust_name', '')
cust_cont = request.GET.get('cust_cont', '')
pincode = request.GET.get('pincode', '')
city_name = request.GET.get('city_name', '')
state = request.GET.get('state', '')
contry = request.GET.get('contry', '')
gender = request.GET.get('gender', '')
paid_amt = request.GET.get('paid_amt', '')
due_amt = request.GET.get('due_amt', '')
order_date = request.GET.get('order_date', '')
print(order_date)
prod_name = request.GET.getlist('prod_name[]', '')
prod_qty = request.GET.getlist('prod_qty[]', '')
prod_price = request.GET.getlist('prod_price[]', '')
print(prod_name)
print(prod_qty)
print(prod_price)
# insert customer information into customer table
try:
# Insert Data into customer table
cust_tab = Customer(customer_name=cust_name, customer_contact=cust_cont, gender=gender, city_name=city_name, pincode=pincode, state_name=state, contry_name=contry)
cust_tab.save()
# Retrive Id from customer table
custo_id = Customer.objects.values_list('customer_id').last() #It is return
Tuple as result from Queryset
custo_id = int(custo_id[0]) #It is convert the Tuple in INT
# Insert Data into Order table
order_tab = Orders(order_date=order_date, paid_amt=paid_amt, due_amt=due_amt, customer_id=custo_id)
order_tab.save()
# Insert Data into Products table
# insert multiple data at a one time from djanog using while loop
i=0
while(i<len(prod_name)):
p_n = prod_name[i]
p_q = prod_qty[i]
p_p = prod_price[i]
# this is checking the variable, if variable is null so fill the varable value in database
if p_n != "" and p_q != "" and p_p != "":
prod_tab = Products(product_name=p_n, product_qty=p_q, product_price=p_p, customer_id=custo_id)
prod_tab.save()
i=i+1
I recommend using plain SQL (not ORM) you can insert multiple rows with a single insert:
insert into A select from B;
The select from B portion of your sql could be as complicated as you want it to get as long as the results match the columns in table A and there are no constraint conflicts.
def order(request):
if request.method=="GET":
# get the value from html page
cust_name = request.GET.get('cust_name', '')
cust_cont = request.GET.get('cust_cont', '')
pincode = request.GET.get('pincode', '')
city_name = request.GET.get('city_name', '')
state = request.GET.get('state', '')
contry = request.GET.get('contry', '')
gender = request.GET.get('gender', '')
paid_amt = request.GET.get('paid_amt', '')
due_amt = request.GET.get('due_amt', '')
order_date = request.GET.get('order_date', '')
prod_name = request.GET.getlist('prod_name[]', '')
prod_qty = request.GET.getlist('prod_qty[]', '')
prod_price = request.GET.getlist('prod_price[]', '')
# insert customer information into customer table
try:
# Insert Data into customer table
cust_tab = Customer(customer_name=cust_name, customer_contact=cust_cont, gender=gender, city_name=city_name, pincode=pincode, state_name=state, contry_name=contry)
cust_tab.save()
# Retrive Id from customer table
custo_id = Customer.objects.values_list('customer_id').last() #It is return Tuple as result from Queryset
custo_id = int(custo_id[0]) #It is convert the Tuple in INT
# Insert Data into Order table
order_tab = Orders(order_date=order_date, paid_amt=paid_amt, due_amt=due_amt, customer_id=custo_id)
order_tab.save()
# Insert Data into Products table
# insert multiple data at a one time from djanog using while loop
i=0
while(i<len(prod_name)):
p_n = prod_name[i]
p_q = prod_qty[i]
p_p = prod_price[i]
# this is checking the variable, if variable is null so fill the varable value in database
if p_n != "" and p_q != "" and p_p != "":
prod_tab = Products(product_name=p_n, product_qty=p_q, product_price=p_p, customer_id=custo_id)
prod_tab.save()
i=i+1
return HttpResponse('Your Record Has been Saved')
except Exception as e:
return HttpResponse(e)
return render(request, 'invoice_system/order.html')