database field value that matches to every query - python

I would like to insert records into a sqlite database with fields such that every query that specifies a value for that field does not disqualify the record.
Make  Model   Engine  Parameter
Ford  *       *       1
Ford  Taurus  *       2
Ford  Escape  *       3
So a query = (database.table.Make == 'Ford') & (database.table.Model == 'Taurus') would return the first two records.
EDIT: thanks to woot, I decided to use the following: (database.table.Make.belongs('Ford','')) & (database.table.Model.belongs('Taurus','')) which is the syntax for the IN operator in web2py

Are you looking for something like this? It won't perform well due to the ORs if you have a lot of rows.
SELECT *
FROM Cars
WHERE ( Cars.Make = 'Ford' OR Cars.Make = '*' )
AND ( Cars.Model = 'Taurus' OR Cars.Model = '*' )
Here is a SQL Fiddle example.
If you meant to use NULL instead of the * token, replace the second comparison in each pair with an IS NULL test, e.g. OR Cars.Make IS NULL.
Or to make it maybe a little less verbose:
SELECT *
FROM Cars
WHERE Cars.Make IN ('Ford','*')
AND Cars.Model IN ('Taurus','*')
But you wouldn't be able to use NULL in this case and would have to use the * token.
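The IN form can also be tried from Python with the standard sqlite3 module. This is a minimal sketch, assuming an in-memory Cars table filled with the three sample rows from the question; the helper name find_parameters is hypothetical.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Cars (Make TEXT, Model TEXT, Engine TEXT, Parameter INTEGER)"
)
conn.executemany(
    "INSERT INTO Cars VALUES (?, ?, ?, ?)",
    [("Ford", "*", "*", 1), ("Ford", "Taurus", "*", 2), ("Ford", "Escape", "*", 3)],
)


def find_parameters(make, model):
    """Return Parameter values for rows matching the given value or the '*' token."""
    rows = conn.execute(
        "SELECT Parameter FROM Cars "
        "WHERE Make IN (?, '*') AND Model IN (?, '*') "
        "ORDER BY Parameter",
        (make, model),
    )
    return [row[0] for row in rows]


taurus_params = find_parameters("Ford", "Taurus")
print(taurus_params)
```

The searched values are passed as bound parameters, while the '*' token stays a literal in the SQL text.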

Related

Python Peewee EXISTS Subquery not working as expected

I am using the peewee ORM for a python application and I am trying to write code to fetch batches of records from a SQLite database. I have a subquery that seems to work by itself but when added to an update query the fn.EXISTS(sub_query) seems to have no effect as every record in the database is updated.
Note: I am using the APSW extension for peewee.
def batch_logic(self, id_1, path_1, batch_size=1000, **kwargs):
    sub_query = (self.select(ModelClass.granule_id).distinct().where(
        (ModelClass.status == 'old_status') &
        (ModelClass.collection_id == id_1) &
        (ModelClass.name.contains(path_1))
    ).order_by(ModelClass.discovered_date.asc()).limit(batch_size))
    print(f'len(sub_query): {len(sub_query)}')
    fb_st_2 = time.time()
    updated_records = list(
        (self.update(status='new_status').where(fn.EXISTS(sub_query)).returning(ModelClass))
    )
    print(f'update {len(updated_records)}: {time.time() - fb_st_2}')
    db.close()
    return updated_records
Below is output from testing locally:
id_1: id_1_1676475997_PQXYEQGJWR
len(sub_query): 2
update 20000: 1.0583274364471436
fetch_batch 20000: 1.1167597770690918
count_things 0: 0.02147078514099121
processed_things: 20000
The subquery is correctly returning 2 but the update query where(fn.EXISTS(sub_query)) seems to be ignored. Have I made a mistake in my understanding of how this works?
Edit 1: I believe GROUP BY is needed as rows can have the same granule_id and I need to fetch rows up to batch_size granule_ids
I think your use of UPDATE...WHERE EXISTS is incorrect or inappropriate here. This may work better for you:
# Unsure why you have a GROUP BY with no aggregation, that seems
# incorrect possibly, so I've removed it.
sub_query = (self.select(ModelClass.id)
             .where(
                 (ModelClass.status == 'old_status') &
                 (ModelClass.collection_id == id_1) &
                 (ModelClass.name.contains(path_1)))
             .order_by(ModelClass.discovered_date.asc())
             .limit(batch_size))
update = (self.update(status='new_status')
          .where(self.id.in_(sub_query))
          .returning(ModelClass))
cursor = update.execute()  # It's good to explicitly execute().
updated_records = list(cursor)
The key idea, at any rate, is I'm correlating the update with the subquery.
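The difference can be seen in plain SQL as well. The following is a rough sketch using Python's sqlite3 module rather than peewee; the items table and its columns are hypothetical stand-ins for the model in the question. An EXISTS subquery that does not refer to the row being updated is either true for every row or false for every row, whereas id IN (subquery) ties each row to the batch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, status TEXT, discovered_date INTEGER)"
)
conn.executemany(
    "INSERT INTO items (status, discovered_date) VALUES ('old_status', ?)",
    [(n,) for n in range(10)],
)
batch_size = 2

# Uncorrelated EXISTS: the subquery returns rows, so the condition holds
# for every row in the table and all of them are updated.
cur = conn.execute(
    "UPDATE items SET status = 'touched' "
    "WHERE EXISTS (SELECT id FROM items WHERE status = 'old_status' "
    "ORDER BY discovered_date LIMIT ?)",
    (batch_size,),
)
exists_updated = cur.rowcount

# Reset the table, then restrict the update to the ids the subquery selects.
conn.execute("UPDATE items SET status = 'old_status'")
cur = conn.execute(
    "UPDATE items SET status = 'new_status' "
    "WHERE id IN (SELECT id FROM items WHERE status = 'old_status' "
    "ORDER BY discovered_date LIMIT ?)",
    (batch_size,),
)
in_updated = cur.rowcount

updated_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM items WHERE status = 'new_status' ORDER BY id"
    )
]
print(exists_updated, in_updated, updated_ids)
```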

Dynamically search for null in sqlite select query using python

I'm new to python and I want to do a similar query to this one:
_c.execute('select * from cases where bi = ? and age = ? and '
           'shape = ? and margin = ? and density = ?',
           (obj['bi'], obj['age'], obj['shape'], obj['margin'], obj['density']))
When some of the parameters are None, for example obj['bi'] = None, the query searches for the row when bi = 'None'. But I want it to search for the row when: 'bi is NULL'
A possible solution is to verify the values of the parameters one by one in a sequence of if-elses. For example:
query = 'select * from cases where'
if obj['bi'] is None:
    query += ' bi is null and '
else:
    query += ' bi = ' + str(obj['bi']) + ' and '
...
# do the same if-else for the other parameters
...
_c.execute(query)
But this doesn't seem like the best solution to me.
The question is: what is the best solution to this problem, and how can I avoid SQL injection?
Okay, after firing up a python REPL and playing around with it a bit, it's simpler than I thought. The Python sqlite bindings turn a Python None into a SQL NULL, not into a string 'None' like it sounded like from your question. In SQL, = doesn't match NULL values, but IS will. So...
Given a table foo looking like:
a | b
--------------
NULL | 1
Dog | 2
Doing:
c = conn.cursor()
c.execute('SELECT * FROM foo WHERE a IS ?', (None,))
print(c.fetchone())
will return the (NULL, 1) row, and
c.execute('SELECT * FROM foo WHERE a IS ?', ('Dog',))
print(c.fetchone())
will return the ('Dog', 2) row.
In other words, use IS, not =, in your query.
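Applied to a query with several parameters, this could look like the sketch below. It assumes a simplified cases table with only three of the columns from the question and a hypothetical helper find_cases. Note that using IS with ordinary (non-NULL) values on both sides is SQLite behavior; other databases may need a different construct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (bi INTEGER, age INTEGER, shape TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [(None, 40, "round"), (4, 40, "round"), (None, 55, None)],
)


def find_cases(obj):
    """Match every column with IS, so None parameters match NULL columns."""
    columns = ("bi", "age", "shape")  # fixed list of names, never user input
    where = " AND ".join(f"{col} IS ?" for col in columns)
    params = tuple(obj[col] for col in columns)
    return conn.execute(f"SELECT * FROM cases WHERE {where}", params).fetchall()


null_bi = find_cases({"bi": None, "age": 40, "shape": "round"})
with_bi = find_cases({"bi": 4, "age": 40, "shape": "round"})
print(null_bi)
print(with_bi)
```

Only the column names are interpolated into the SQL text, and they come from a fixed tuple in the code; all values travel as bound parameters, which is what keeps the query safe from injection.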

Knowing if the result of a SQL request must be a part of another SQL request result

Let's suppose I have the following table:
Id (int, Primary Key) | Value (varchar)
----------------------+----------------
1 | toto
2 | foo
3 | bar
Given two requests, I would like to know whether the result of the first must be contained in the result of the second, without executing them.
Some examples:
# Obvious example
query_1 = "SELECT * FROM example;"
query_2 = "SELECT * FROM example WHERE id = 1;"
is_sub_part_of(query_2, query_1) # True
# An example we can't know before executing the two requests
query_1 = "SELECT * FROM example WHERE id < 2;"
query_2 = "SELECT * FROM example WHERE value = 'toto' or value = 'foo';"
is_sub_part_of(query_2, query_1) # False
# An example we can know before executing the two requests
query_1 = "SELECT * FROM example WHERE id < 2 OR value = 'bar';"
query_2 = "SELECT * FROM example WHERE id < 2 AND value = 'bar';"
is_sub_part_of(query_2, query_1) # True
# An example about columns
query_1 = "SELECT * FROM example;"
query_2 = "SELECT id FROM example;"
is_sub_part_of(query_2, query_1) # True
Do you know if there's a Python module that can do that, or if it's even possible?
Interesting problem. I don't know of any library that will do this for you. My thoughts:
Parse the SQL, see this for example.
Define which filtering operations can be added to a query so that the result can only be the same or a narrower result set. Adding "AND x" always preserves the subset property, I think; adding "OR x" does not. What else can change in the query? For example, "SELECT *" vs. "SELECT x" vs. "SELECT x, y".
Except for that, I can only say it's an interesting idea. You might get some more input on DBA. Is this an idea you're researching or is it related to a real-world problem you are solving, like optimizing a DB query? Maybe your question could be updated with information about this, since this is not a common way to optimize queries (unless you're working on the DB engine itself, I guess).
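The AND/OR observation can at least be sanity-checked on the example table. The sketch below does execute the queries, so it is not the static check the question asks for; it only illustrates the rule on sample data, and the helper name rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?)", [(1, "toto"), (2, "foo"), (3, "bar")]
)


def rows(where):
    """Return the result of a filtered SELECT as a set of row tuples."""
    return set(conn.execute(f"SELECT * FROM example WHERE {where}"))


base = rows("id < 3")
narrowed = rows("id < 3 AND value = 'foo'")  # extra AND condition
widened = rows("id < 3 OR value = 'bar'")    # extra OR condition

and_is_subset = narrowed <= base
or_is_subset = widened <= base
print(and_is_subset, or_is_subset)
```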

Converting LEFT OUTER JOIN query to Django orm queryset/query

Given PostgreSQL 9.2.10, Django 1.8, python 2.7.5 and the following models:
class restProdAPI(models.Model):
    rest_id = models.PositiveIntegerField(primary_key=True)
    rest_host = models.CharField(max_length=20)
    rest_ip = models.GenericIPAddressField(default='0.0.0.0')
    rest_mode = models.CharField(max_length=20)
    rest_state = models.CharField(max_length=20)

class soapProdAPI(models.Model):
    soap_id = models.PositiveIntegerField(primary_key=True)
    soap_host = models.CharField(max_length=20)
    soap_ip = models.GenericIPAddressField(default='0.0.0.0')
    soap_asset = models.CharField(max_length=20)
    soap_state = models.CharField(max_length=20)
And the following raw query which returns exactly what I am looking for:
SELECT
app_restProdAPI.rest_id, app_soapProdAPI.soap_id, app_restProdAPI.rest_host, app_restProdAPI.rest_ip, app_soapProdAPI.soap_asset, app_restProdAPI.rest_mode, app_restProdAPI.rest_state
FROM
app_soapProdAPI
LEFT OUTER JOIN
app_restProdAPI
ON
((app_restProdAPI.rest_host = app_soapProdAPI.soap_host)
OR
(app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip))
WHERE
app_restProdAPI.rest_mode = 'Excluded';
Which returns like this:
rest_id | soap_id | rest_host | rest_ip | soap_asset | rest_mode | rest_state
---------+---------+---------------+----------------+------------+-----------+-----------
1234 | 12345 | 1G24019123ABC | 123.123.123.12 | A1234567 | Excluded | Up
What would be the best method for making this work using Django's model and orm structure?
I have been looking around for possible methods for joining the two tables entirely without a relationship but there does not seem to be a clean or efficient way to do this. I have also tried looking for methods to do left outer joins in django, but again documentation is sparse or difficult to decipher.
I know I will probably have to use Q objects to do the or clause I have in there. Additionally I have looked at relationships and it looks like a foreignkey() may work but I am unsure if this is the best method of doing it. Any and all help would be greatly appreciated. Thank you in advance.
** EDIT 1 **
So far Todor has offered a solution that uses a INNER JOIN that works. I may have found a solution HERE if anyone can decipher that mess of inline raw html.
** EDIT 2 **
Is there a way to filter on a field (where something = 'something'), as in my query above, given Todor's answer? I tried the following, but it still includes all records, even though my equivalent PostgreSQL query works as expected. It seems I cannot put everything in the where clause the way I do, because when I remove one of the OR conditions and use just an AND, the Excluded filter is applied.
soapProdAPI.objects.extra(
    select = {
        'rest_id'    : 'app_restprodapi.rest_id',
        'rest_host'  : 'app_restprodapi.rest_host',
        'rest_ip'    : 'app_restprodapi.rest_ip',
        'rest_mode'  : 'app_restprodapi.rest_mode',
        'rest_state' : 'app_restprodapi.rest_state'
    },
    tables = ['app_restprodapi'],
    where = ['app_restprodapi.rest_mode=%s \
              AND app_restprodapi.rest_host=app_soapprodapi.soap_host \
              OR app_restprodapi.rest_ip=app_soapprodapi.soap_ip'],
    params = ['Excluded']
)
** EDIT 3 / CURRENT SOLUTION IN PLACE **
To date Todor has provided the most complete answer, using an INNER JOIN, but the hope is that this question will generate thought about how this might still be accomplished. As it does not seem to be inherently possible, any and all suggestions are welcome, since they may lead to better solutions. That being said, using Todor's answer, I was able to accomplish the exact query I needed:
restProdAPI.objects.extra(
    select = {
        'soap_id'    : 'app_soapprodapi.soap_id',
        'soap_asset' : 'app_soapprodapi.soap_asset'
    },
    tables = ['app_soapprodapi'],
    where = ['app_restprodapi.rest_mode = %s',
             'app_soapprodapi.soap_host = app_restprodapi.rest_host OR \
              app_soapprodapi.soap_ip = app_restprodapi.rest_ip'
            ],
    params = ['Excluded']
)
** TLDR **
I would like to convert this PostgreSQL query to the ORM provided by Django WITHOUT using .raw() or any raw query code at all. I am completely open to changing the model to use a ForeignKey if that facilitates this and is, from a performance standpoint, the best method. I am going to use the returned objects in conjunction with django-datatables-view, if that helps in terms of design.
Solving it with INNER JOIN
If you only need the soapProdAPI rows that have a corresponding restProdAPI row (in terms of your join condition, linked by host or IP), you can try the following:
soapProdAPI.objects.extra(
    select = {
        'rest_id'   : "app_restProdAPI.rest_id",
        'rest_host' : "app_restProdAPI.rest_host",
        'rest_ip'   : "app_restProdAPI.rest_ip",
        'rest_mode' : "app_restProdAPI.rest_mode",
        'rest_state': "app_restProdAPI.rest_state"
    },
    tables = ["app_restProdAPI"],
    where = ["app_restProdAPI.rest_host = app_soapProdAPI.soap_host \
              OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip"]
)
How to filter more?
Since we are using .extra, I would advise reading the docs carefully. In general, we can't use .filter with some of the fields inside the select dict, because they are not part of soapProdAPI and Django can't resolve them. We have to stick with the where kwarg in .extra, and since it's a list, we can simply add another element.
where = ["app_restProdAPI.rest_host = app_soapProdAPI.soap_host \
OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip",
"app_restProdAPI.rest_mode=%s"
],
params = ['Excluded']
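Why separate list elements matter can be illustrated in plain SQL. The sketch below uses Python's sqlite3 module with made-up sample rows, and it assumes (as far as I know this is how .extra behaves) that each where entry is wrapped in its own parentheses and the entries are joined with AND. A single unparenthesized string is parsed as (mode AND host) OR ip, because AND binds tighter than OR. The last query shows that the original LEFT OUTER JOIN with a WHERE filter on the rest table returns the same rows as the implicit join on this data, since unmatched rows have a NULL rest_mode and are filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_soapprodapi (soap_id INTEGER PRIMARY KEY, soap_host TEXT, soap_ip TEXT);
CREATE TABLE app_restprodapi (rest_id INTEGER PRIMARY KEY, rest_host TEXT,
                              rest_ip TEXT, rest_mode TEXT);
INSERT INTO app_soapprodapi VALUES (1, 'hostA', '10.0.0.1');
INSERT INTO app_soapprodapi VALUES (2, 'hostB', '10.0.0.2');
INSERT INTO app_soapprodapi VALUES (3, 'hostC', '10.0.0.3');
INSERT INTO app_restprodapi VALUES (10, 'hostA', '10.9.9.9', 'Excluded');
INSERT INTO app_restprodapi VALUES (20, 'other', '10.0.0.2', 'Included');
""")

SELECT = ("SELECT s.soap_id, r.rest_id "
          "FROM app_soapprodapi s, app_restprodapi r WHERE ")
ORDER = " ORDER BY s.soap_id, r.rest_id"

# Each condition in its own parentheses, joined with AND.
separate = conn.execute(
    SELECT + "(r.rest_mode = ?) AND "
    "(r.rest_host = s.soap_host OR r.rest_ip = s.soap_ip)" + ORDER,
    ("Excluded",),
).fetchall()

# One unparenthesized string: parsed as (mode AND host) OR ip.
combined = conn.execute(
    SELECT + "r.rest_mode = ? AND r.rest_host = s.soap_host "
    "OR r.rest_ip = s.soap_ip" + ORDER,
    ("Excluded",),
).fetchall()

# The LEFT OUTER JOIN form from the question, with the WHERE filter.
left_join = conn.execute(
    "SELECT s.soap_id, r.rest_id FROM app_soapprodapi s "
    "LEFT OUTER JOIN app_restprodapi r "
    "ON (r.rest_host = s.soap_host OR r.rest_ip = s.soap_ip) "
    "WHERE r.rest_mode = ?" + ORDER,
    ("Excluded",),
).fetchall()

print(separate, combined, left_join)
```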
Repeated subquery
If you really need all soapProdAPI rows, whether or not they have a corresponding restProdAPI row, I can only think of one ugly approach, in which a subquery is repeated for each field you need.
soapProdAPI.objects.extra(
select = {
'rest_id' : "(select rest_id from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_host' : "(select rest_host from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_ip' : "(select rest_ip from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_mode' : "(select rest_mode from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)",
'rest_state': "(select rest_state from app_restProdAPI where app_restProdAPI.rest_host = app_soapProdAPI.soap_host OR app_restProdAPI.rest_ip = app_soapProdAPI.soap_ip)"
},
)
I think this could be useful for you. You can use Q objects to construct your query.
I tried it in the Django shell: I created some data and ran something like this:
restProdAPI.objects.filter(Q(rest_host=s1.soap_host)|Q(rest_ip=s1.soap_ip))
where s1 is a soapProdAPI instance.
This is all the code I wrote; you can try it and see if it helps:
from django.db.models import Q
from core.models import restProdAPI, soapProdAPI
s1 = soapProdAPI.objects.get(soap_id=1)
restProdAPI.objects.filter(Q(rest_host=s1.soap_host)|Q(rest_ip=s1.soap_ip))

Trouble Querying Against int Field using MYSQL

Hey,
I'm trying to run the following query:
self.cursor.execute('SELECT courses.courseid, days, starttime, bldg, roomnum, '
                    'area, title, descrip, prereqs, endtime FROM '
                    'classes, courses, crosslistings, coursesprofs, profs WHERE '
                    'classes.courseid = courses.courseid AND '
                    'courses.courseid = crosslistings.courseid AND '
                    'courses.courseid = coursesprofs.courseid AND '
                    'coursesprofs.profid = profs.profid AND '
                    'classes.classid LIKE %s'
                    ';',
                    (self.classid,))  # parameters passed as a one-element tuple
classid is an int(11) field in the db. When I set self.classid = '%', it returns all the results, but as soon as I set it to, say, '3454' or some other value, it returns nothing, even when there is a class with that classid. Am I querying incorrectly against int fields?
Even a simpler query like
select * from classes where classes.classid = '3454';
does not work.
Try:
select * from classes where classes.classid = 3454;
I resolved this on my own. Based on my db structure, I was querying the wrong fields. I was looking for values that weren't there so that's why I was always returning an empty result set. Thanks for the help on the = operator though, that was utilized.
