django/python: raw sql with multiple tables - python

I need to perform a raw sql on multiple tables. I then render the result set. For one table I would do:
sql = "select * from my_table"
results = my_table.objects.raw(sql)
For multiple tables I am doing:
sql = "select * from my_table, my_other_table where ...."
results = big_model.objects.raw(sql)
But, do I really need to create a table/model/class big_model, which contains all fields that I may need? I will never actually store any data in this "table".
ADDED:
I have a table my_users. I have a table my_listings. These are defined in models.py. The table my_listings has a foreign key to my_users, indicating who created the listing.
The SQL is
"select user_name, listing_text from my_listings, my_users where my_users.id = my_listings.my_user_id".
I want this SQL to generate a result set that I can use to render my page in django.
The question is: Do I have to create a model that contains the fields user_name and listing_text? Or is there some better way that still uses raw SQL (select, from, where)? Of course, my actual queries are more complicated than this example. (The models that I define in models.py become actual tables in the database hence the use of the model/table term. Not sure how else to refer to them, sorry.) I use raw sql because I found that python table references only work with simple data models.

This works. Don't know why it didn't before :( From Dennis Baker's comment:
You do NOT need to have a model with all the fields in it; you just need the first model and fields from that. You do need the fields to have unique names, and as far as I know you should use "tablename.field as fieldname" to make sure all fields are unique. I've done some fairly complex queries with 5+ tables this way and always tie them back to a single model.
2. Another solution is to use a cursor. However, the cursor's result has to be converted from a list of tuples to a list of dictionaries. I'm sure there are cleaner ways using iterators, but this function works. It takes a string, which is the raw SQL query, and returns a list which can be rendered and used in a template.
from django.db import connection

def sql_select(sql):
    """Run a raw SQL query and return the rows as a list of dictionaries."""
    cursor = connection.cursor()
    cursor.execute(sql)
    # cursor.description holds one entry per column; its first item is the column name.
    columns = [col[0] for col in cursor.description]
    # Every value is converted to a string so it can be rendered directly in a template.
    return [dict(zip(columns, map(str, row))) for row in cursor.fetchall()]
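The tuple-to-dictionary conversion can also be tried outside Django. The following is a minimal sketch, assuming the standard-library sqlite3 module as a stand-in for django.db.connection (both follow the DB-API and expose cursor.description). The rows_as_dicts name and the sample data are made up for illustration:

```python
import sqlite3

def rows_as_dicts(cursor):
    """Convert the rows of an executed DB-API cursor into a list of dicts."""
    # cursor.description holds one sequence per column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

# Hypothetical in-memory data mirroring the my_users / my_listings example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_users (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE my_listings (id INTEGER PRIMARY KEY, my_user_id INTEGER, listing_text TEXT);
    INSERT INTO my_users VALUES (1, 'alice');
    INSERT INTO my_listings VALUES (10, 1, 'bike for sale');
""")
cur = conn.execute(
    "SELECT user_name, listing_text FROM my_listings, my_users "
    "WHERE my_users.id = my_listings.my_user_id"
)
listings = rows_as_dicts(cur)
```

Each dictionary is keyed by the selected column names, so a template could refer to user_name and listing_text without any model being defined for the joined result.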

You do not need a model that includes the fields that you want to return from your raw SQL. If you happen to have a model that actually has those fields, you can map your raw SQL output to that model; otherwise you can use cursors to bypass models altogether.

Related

Too many server roundtrips w/ psycopg2

I am writing a script that should create a schema for each customer. I’m fetching all the metadata from a database that defines what each customer’s schema should look like, and then creating it. Everything is well defined: the types, names of tables, etc. A customer has many tables (e.g., address, customers, contact, item, etc.), and each table has the same metadata.
My procedure now:
get everything I need from the metadataDatabase.
In a for loop, create a table, and then Alter Table and add each metadata (This is done for each table).
Right now my script runs in about a minute for each customer, which I think is too slow. It has something to do with me having a loop, and in that loop, I’m altering each table.
I think that instead of altering each table (which might not be such a clever approach), I should do something like the following:
Note that this is just a stupid but valid example:
for table in tables:
    con.execute("CREATE TABLE IF NOT EXISTS tester.%s (%s, %s);", (table, "last_seen date", "valid_from timestamp"))
But it gives me this error (it seems like it reads the table name as a string in a string..):
psycopg2.errors.SyntaxError: syntax error at or near "'billing'"
LINE 1: CREATE TABLE IF NOT EXISTS tester.'billing' ('last_seen da...
Consider creating each table with a serial (i.e., autonumber) ID field and then using ALTER TABLE for all other fields. Combine sql.Identifier for identifiers (schema names, table names, column names, function names, etc.) with regular string formatting for data types, which are not literals in an SQL statement.
from psycopg2 import sql

# CREATE TABLE
query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (ID serial)"""
cur.execute(sql.SQL(query).format(shm=sql.Identifier("tester"),
                                  tbl=sql.Identifier("table")))

# ALTER TABLE
items = [("last_seen", "date"), ("valid_from", "timestamp")]
query = """ALTER TABLE {shm}.{tbl} ADD COLUMN {col} {typ}"""
for item in items:
    # KEEP IDENTIFIER PLACEHOLDERS; only the data type is filled in here
    final_query = query.format(shm="{shm}", tbl="{tbl}", col="{col}", typ=item[1])
    cur.execute(sql.SQL(final_query).format(shm=sql.Identifier("tester"),
                                            tbl=sql.Identifier("table"),
                                            col=sql.Identifier(item[0])))
Alternatively, use str.join with list comprehension for one CREATE TABLE:
query = """CREATE TABLE IF NOT EXISTS {shm}.{tbl} (
"id" serial,
{vals}
)"""
items = [("last_seen", "date"), ("valid_from", "timestamp")]
val = ",\n ".join(["{{}} {typ}".format(typ=i[1]) for i in items])
# KEEP IDENTIFIER PLACEHOLDERS
pre_query = query.format(shm="{shm}", tbl="{tbl}", vals=val)
final_query = sql.SQL(pre_query).format(*[sql.Identifier(i[0]) for i in items],
                                        shm=sql.Identifier("tester"),
                                        tbl=sql.Identifier("table"))
cur.execute(final_query)
SQL (sent to database)
CREATE TABLE IF NOT EXISTS "tester"."table" (
"id" serial,
"last_seen" date,
"valid_from" timestamp
)
However, this becomes heavy as there are too many server roundtrips.
How many tables with how many columns are you creating that this is slow? Could you ssh to a machine closer to your server and run the python there?
I don't get that error. Rather, I get an SQL syntax error. A values list is for conveying data. But ALTER TABLE is not about data, it is about metadata. You can't use a values list there. You need the names of the columns and types in double quotes (or no quotes) rather than single quotes. And you can't have a comma between name and type. And you can't have parentheses around each pair. And each pair needs to be introduced with "ADD"; you can't have it just once. You are using the wrong tool for the job. execute_batch is almost the right tool, except it will use single quotes rather than double quotes around the identifiers. Perhaps you could add a flag to it to tell it to use quote_ident.
Not only is execute_values the wrong tool for the job, but I think python in general might be as well. Why not just load from a .sql file?
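One way to cut the roundtrips discussed above is to assemble the DDL for every table on the client and send it in a single call. The sketch below is only an illustration: quote_ident and create_table_sql are hypothetical helpers written here (with psycopg2 available, sql.Identifier would do the quoting), the metadata dictionary is invented, and the column types are inserted unquoted, so they are assumed to come from a trusted source.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'

def create_table_sql(schema, table, columns):
    """Build one CREATE TABLE statement from (column_name, column_type) pairs."""
    cols = ['"id" serial'] + [
        "{} {}".format(quote_ident(name), col_type) for name, col_type in columns
    ]
    return "CREATE TABLE IF NOT EXISTS {}.{} (\n    {}\n)".format(
        quote_ident(schema), quote_ident(table), ",\n    ".join(cols)
    )

# Hypothetical metadata: table name -> list of (column, type).
metadata = {
    "billing": [("last_seen", "date"), ("valid_from", "timestamp")],
    "address": [("street", "text")],
}

# One script containing every statement, separated by semicolons.
script = ";\n".join(create_table_sql("tester", t, c) for t, c in metadata.items())
```

Passing script to a single cur.execute call should then replace one CREATE plus several ALTER roundtrips per table with one roundtrip in total, assuming the driver accepts several semicolon-separated statements in one call, as psycopg2 generally does.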

raw query with primary key

My model
class Despacho(models.Model):
    bus = models.ForeignKey(Bus)
    contador = models.IntegerField()
    cerrado = models.BooleanField(editable=False)
class Bus(models.Model):
    numero_bus = models.CharField(max_length=255, unique=True)
    en_ruta = models.BooleanField(editable=False)
I need a query for the following: I enter a bus number and need to know whether there is a dispatch that matches that bus. I tried the following query (my database is PostgreSQL):
d = Despacho.objects.raw('''SELECT * FROM operaciones_despacho WHERE operaciones_despacho.bus = '%s' AND operaciones_despacho.cerrado = '%s'
;'''%(bus.numero_bus,False))
error : column operaciones_despacho.bus does not exist
First up in Django, we use raw sql only in the rare instances when it's particularly hard to write an ORM query. In this instance, writing an ORM query is much easier and shorter than a raw query.
Despacho.objects.filter(bus=bus).filter(cerrado=False)
On those rare instances when you need to do a raw query, take care you use the params argument to raw instead of string formatting. The correct way to write your raw query is
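Despacho.objects.raw(
    '''SELECT * FROM operaciones_despacho
       WHERE operaciones_despacho.bus_id = %s
         AND operaciones_despacho.cerrado = %s''',
    [bus.id, False])
Note that the placeholders are not quoted, and that the column behind the bus foreign key is named bus_id, which is why the original query failed with "column operaciones_despacho.bus does not exist".
But I emphasize once again that you should not be using a raw query here because it's a simple ORM query.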
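To show what passing parameters separately looks like in practice, here is a minimal sketch that uses the standard-library sqlite3 module instead of Django and PostgreSQL (an assumption made so the example is self-contained). sqlite3 uses ? as its placeholder where Django's raw() uses %s, and the table layout is a guess at what Django would generate for the models above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operaciones_bus (
        id INTEGER PRIMARY KEY, numero_bus TEXT UNIQUE, en_ruta INTEGER
    );
    CREATE TABLE operaciones_despacho (
        id INTEGER PRIMARY KEY,
        bus_id INTEGER REFERENCES operaciones_bus(id),  -- ForeignKey 'bus' is stored as 'bus_id'
        contador INTEGER,
        cerrado INTEGER
    );
    INSERT INTO operaciones_bus VALUES (1, 'A-12', 0);
    INSERT INTO operaciones_despacho VALUES (100, 1, 5, 0);
    INSERT INTO operaciones_despacho VALUES (101, 1, 7, 1);
""")

# Placeholders are left unquoted; the driver binds the values itself.
open_dispatches = conn.execute(
    "SELECT d.id FROM operaciones_despacho AS d "
    "JOIN operaciones_bus AS b ON b.id = d.bus_id "
    "WHERE b.numero_bus = ? AND d.cerrado = ?",
    ("A-12", False),
).fetchall()
```

Only the open dispatch for bus 'A-12' comes back, and the bus number never has to be spliced into the SQL string by hand.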

Web2py SQLFORM.grid with executesql

I am making a web2py application and I have my two mysql tables defined in my models db.py file:
db.define_table('table1',
Field('id','integer'),
Field('name','string'),
migrate=False)
db.define_table('table2',
Field('id','integer'),
Field('name','string'),
migrate=False)
I want my application to return a union of these tables:
data=db.executesql('SELECT * FROM table1 union select * from table2;')
in a SQLFORM.grid but apparently
form=SQLFORM.grid(data, create=False, deletable=False, editable=False, maxtextlength=100, paginate=10)
is not the way to go.
Can somebody help me please? It must be really simple but I'm having trouble finding the solution.
Thank you
The grid is designed to take a table or query, so you cannot pass a Rows object or arbitrary SQL. The best approach would be to create a view in the database and create a new DAL model definition associated with that view (be sure to set migrate=False, as you don't want the DAL to attempt to create a table with the name of the view). Then you can pass the view model to the grid:
db.define_table('t1_t2_union_view',
Field('id','integer'),
Field('name','string'),
migrate=False)
grid = SQLFORM.grid(db.t1_t2_union_view, ...)
The above works because web2py will treat the model of the database view like any other database table, issuing a query to select all of its records. There is no need for executesql in this case because the union of the tables is handled in the database by the view.
Actually, you can simplify the table definition to:
db.define_table('t1_t2_union_view', db.table1, migrate=False)
When you pass an existing table to .define_table(), you get a new table with the same field definitions as the original, which is what we want here.
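For reference, the database view itself might be created along these lines. This is a sketch that uses the standard-library sqlite3 module in place of MySQL so that it is self-contained; the view name matches the model above, while the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'a');
    INSERT INTO table2 VALUES (2, 'b');
    -- The union lives in the database; clients simply select from the view.
    CREATE VIEW t1_t2_union_view AS
        SELECT id, name FROM table1
        UNION
        SELECT id, name FROM table2;
""")
rows = conn.execute(
    "SELECT id, name FROM t1_t2_union_view ORDER BY id"
).fetchall()
```

Once such a view exists, any client, including the DAL, can query it as if it were an ordinary table.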
If creating separate views for each possible union is not feasible, a possible alternative would be to retrieve the data via executesql and then iterate through the records, inserting each one into an in-memory SQLite database table, which could then be passed to the grid:
union_tables = ('table1', 'table2')
temp_db = DAL('sqlite:memory')
union_table = temp_db.define_table('union_table', db[union_tables[0]])
records = db.executesql(sql, as_dict=True)
for record in records:
    union_table.insert(**union_table._filter_fields(record))
grid = SQLFORM.grid(union_table, create=False, editable=False, deletable=False)
Setting as_dict=True results in a list of dictionaries being returned, which makes it easier to do the inserts, as the keys of the dictionaries are the field names needed for the inserts.
Note, this method is somewhat inefficient, so you would have to test it to see how it performs with your workload.

Creating Insert Statement for MySQL in Python

I am trying to construct an insert statement that is built from the results of a query. I run a query that retrieves results from one database and then creates an insert statement from the results and inserts that into a different database.
The server that is initially queried only returns those fields in the reply which are populated and this can differ from record to record. The destination database table has all of the possible fields available. This is why I need to construct the insert statement on the fly for each record that is retrieved and why I cannot use a default list of fields as I have no control over which ones will be populated in the response.
Here is a sample of the code, I send off a request for the T&C for an isin and the response is a name and value.
fields = []
data = []
getTCQ = ("MDH:T&C|"+isin+"|NAME|VALUE")
mdh.execute(getTCQ)
TC = mdh.fetchall()
for values in TC:
    fields.append(values[0])
    data.append(values[1])
insertQ = ("INSERT INTO sp_fields ("+fields+") VALUES ('"+data+"')")
The problem is with the fields part, mysql is expecting the following:
INSERT INTO sp_fields (ACCRUAL_COUNT,AMOUNT_OUTSTANDING_CALC_DATE) VALUES ('030/360','2014-11-10')
But I am getting the following for insertQ:
INSERT INTO sp_fields ('ACCRUAL_COUNT','AMOUNT_OUTSTANDING_CALC_DATE') VALUES ('030/360','2014-11-10')
and mysql does not like the ' ' around the fields names.
How do I get rid of these? so that it looks like the 1st insertQ statement that works.
many thanks in advance.
You could use ','.join(fields) to create the desired string (without quotes around each field).
Then use parametrized sql and pass the values as the second argument to cursor.execute:
insertQ = "INSERT INTO sp_fields ({}) VALUES ({})".format(
    ','.join(fields), ','.join(['%s'] * len(data)))
cursor.execute(insertQ, data)
Note that the correct placeholder to use, e.g. %s, depends on the DB adapter you are using. MySQLdb uses %s, but oursql uses ?, for instance.
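Because column names cannot be passed as query parameters, it may be worth checking them against a known list before joining them into the statement. The following is a minimal sketch of that idea, using the standard-library sqlite3 module (placeholder ?) in place of MySQL; build_insert and ALLOWED_FIELDS are hypothetical names introduced for illustration:

```python
import sqlite3

# Hypothetical whitelist of columns that exist in the destination table.
ALLOWED_FIELDS = {"ACCRUAL_COUNT", "AMOUNT_OUTSTANDING_CALC_DATE"}

def build_insert(table, fields, placeholder="?"):
    """Return an INSERT statement with one placeholder per field."""
    unknown = [f for f in fields if f not in ALLOWED_FIELDS]
    if unknown:
        raise ValueError("unexpected field names: {}".format(unknown))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        table, ",".join(fields), ",".join([placeholder] * len(fields))
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sp_fields (ACCRUAL_COUNT TEXT, AMOUNT_OUTSTANDING_CALC_DATE TEXT)"
)

fields = ["ACCRUAL_COUNT", "AMOUNT_OUTSTANDING_CALC_DATE"]
data = ["030/360", "2014-11-10"]
insert_q = build_insert("sp_fields", fields)
conn.execute(insert_q, data)  # values are bound by the driver, not concatenated
stored = conn.execute("SELECT * FROM sp_fields").fetchall()
```

With MySQLdb the same helper would be called with placeholder="%s".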

Django model search concatenated string

I am trying to use a Django model to search for a record and then return a concatenated field built from several tables joined by foreign keys.
I can do it in SQL like this:
SELECT
location.location_geoname_id as id,
CONCAT_WS(', ', location.location_name, region.region_name, country.country_name) AS 'text'
FROM
geonames_location as location
JOIN
geonames_region as region
ON
location.region_geoname_id = region.region_geoname_id
JOIN
geonames_country as country
ON
region.country_geoname_id = country.country_geoname_id
WHERE
location.location_name like 'location'
ORDER BY
location.location_name, region.region_name, country.country_name
LIMIT 10;
Is there a cleaner way to do this using Django models? Or do I need to just use SQL for this one?
Thank you
Do you really need the SQL to return the concatenated field? Why not query the models in the usual way (with select_related()) and then concatenate in Python? Or if you're worried about querying more columns than you need, use values_list:
locations = Location.objects.values_list(
    'location_name', 'region__region_name', 'country__country_name')
location_texts = [', '.join(names) for names in locations]
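If some of the related names can be missing, a plain join would fail on None, whereas SQL's CONCAT_WS skips NULL values. A small sketch of an equivalent in Python follows; concat_ws is a hypothetical helper, and the tuples stand in for what values_list() would return:

```python
def concat_ws(separator, *parts):
    """Join the parts with separator, skipping None values like SQL CONCAT_WS does."""
    return separator.join(str(p) for p in parts if p is not None)

# Hypothetical rows shaped like the tuples values_list() would return.
locations = [
    ("Springfield", "Illinois", "United States"),
    ("Springfield", None, "United States"),
]
location_texts = [concat_ws(", ", *row) for row in locations]
```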
You can also write a raw query for this in your code and do the concatenation afterward.
Example:
org = Organization.objects.raw('SELECT organization_id, name FROM organization where is_active=1 ORDER BY name')
Keep one thing in mind: a raw query must always fetch the primary key of the table; it is mandatory. Here organization_id is the primary key of the contact_organization table.
Which approach is simpler and more useful (raw query or model query) is up to you.
