Modeling an outer join in Django - Python

I have a many-to-many relationship with some data stored in the join table.
A basic version of my model looks like:
class FooLine(models.Model):
    name = models.CharField(max_length=255)

class FooCol(models.Model):
    name = models.CharField(max_length=255)

class FooVal(models.Model):
    value = models.CharField(max_length=255)
    line = models.ForeignKey(FooLine)
    col = models.ForeignKey(FooCol)
I'm trying to fetch every value for a certain line, with a null when the value is not present (basically, I'm trying to display the FooVal table with nulls for values that haven't been filled in).
A typical SQL query would be:
SELECT value FROM FooCol LEFT OUTER JOIN
    (FooVal JOIN FooLine
     ON FooVal.line_id = FooLine.id AND FooLine.name = "FIXME")
ON FooCol.id = col_id;
Is there any way to model the above query using the Django ORM?
Thanks.

Outer joins can be viewed as a hack because SQL lacks "navigation".
What you have is a simple if-statement situation.
for line in someRangeOfLines:
    for col in someRangeOfCols:
        try:
            cell = FooVal.objects.get(col=col, line=line)
        except FooVal.DoesNotExist:
            cell = None
That's what an outer join really is -- an attempted lookup with a NULL replacement.
The only optimization is something like the following.
matrix = {}
for f in FooVal.objects.all():
    matrix[(f.line, f.col)] = f

for line in someRangeOfLines:
    for col in someRangeOfCols:
        cell = matrix.get((line, col), None)
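The lookup-with-default idea can be exercised end to end without the ORM. The sketch below uses plain dictionaries standing in for FooVal rows (the line/column names and values are made up) to show how missing cells come back as None, just like the LEFT OUTER JOIN would produce:

```python
# (line, col) -> value pairs standing in for FooVal rows; data is hypothetical.
values = {
    ('line1', 'colA'): 'x',
    ('line1', 'colB'): 'y',
    ('line2', 'colA'): 'z',
}
lines = ['line1', 'line2']
cols = ['colA', 'colB']

def value_grid(values, lines, cols):
    # Emulates the LEFT OUTER JOIN: cells with no FooVal row become None.
    return {(line, col): values.get((line, col))
            for line in lines for col in cols}

grid = value_grid(values, lines, cols)
```

Here `grid[('line2', 'colB')]` is None because no value was ever filled in for that cell, while every present pair keeps its stored value.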


How to do efficient reverse foreign key check on Django multi-table inheritance model?

I've got file objects of different types, which inherit from a BaseFile, and add custom attributes, methods and maybe fields. The BaseFile also stores the File Type ID, so that the corresponding subclass model can be retrieved from any BaseFile object:
class BaseFile(models.Model):
    name = models.CharField(max_length=80, db_index=True)
    size = models.PositiveIntegerField()
    time_created = models.DateTimeField(default=datetime.now)
    file_type = models.ForeignKey(ContentType, on_delete=models.PROTECT)
class FileType1(BaseFile):
    storage_path = '/path/for/filetype1/'

    def custom_method(self):
        <some custom behaviour>

class FileType2(BaseFile):
    storage_path = '/path/for/filetype2/'
    extra_field = models.CharField(max_length=12)
I also have different types of events which are associated with files:
class FileEvent(models.Model):
    file = models.ForeignKey(BaseFile, on_delete=models.PROTECT)
    time = models.DateTimeField(default=datetime.now)
I want to be able to efficiently get all files of a particular type which have not been involved in a particular event, such as:
unprocessed_files_type1 = FileType1.objects.filter(fileevent__isnull=True)
However, looking at the SQL executed for this query:
SELECT "app_basefile"."id", "app_basefile"."name", "app_basefile"."size", "app_basefile"."time_created", "app_basefile"."file_type_id", "app_filetype1"."basefile_ptr_id"
FROM "app_filetype1"
INNER JOIN "app_basefile"
ON("app_filetype1"."basefile_ptr_id" = "app_basefile"."id")
LEFT OUTER JOIN "app_fileevent" ON ("app_basefile"."id" = "app_fileevent"."file_id")
WHERE "app_fileevent"."id" IS NULL
It looks like this might not be very efficient because it joins on BaseFile.id instead of FileType1.basefile_ptr_id, so it will check ALL BaseFile ids to see whether they are present in FileEvent.file_id, when I only need to check the BaseFile ids corresponding to FileType1, or FileType1.basefile_ptr_ids.
This could result in a significant performance difference if there are a very large number of BaseFiles, but FileType1 is only a small subset of that, because it will be doing a large amount of unnecessary lookups.
Is there a way to force Django to join on "app_filetype1"."basefile_ptr_id" or otherwise achieve this functionality more efficiently?
Thanks for the help
UPDATE:
Using annotations and Exists subquery seems to do what I'm after, however the resulting SQL still appears strange:
unprocessed_files_type1 = FileType1.objects.annotate(
    file_event=Exists(FileEvent.objects.filter(file=OuterRef('pk')))
).filter(file_event=False)
SELECT "app_basefile"."id", "app_basefile"."name", "app_basefile"."size", "app_basefile"."time_created", "app_basefile"."file_type_id", "app_filetype1"."basefile_ptr_id",
EXISTS(
SELECT U0."id", U0."file_id", U0."time"
FROM "app_fileevent" U0
WHERE U0."file_id" = ("app_filetype1"."basefile_ptr_id"))
AS "file_event"
FROM "app_filetype1"
INNER JOIN "app_basefile" ON ("app_filetype1"."basefile_ptr_id" = "app_basefile"."id")
WHERE EXISTS(
SELECT U0."id", U0."file_id", U0."time"
FROM "app_fileevent" U0
WHERE U0."file_id" = ("app_filetype1"."basefile_ptr_id")) = 0
It appears to be doing the WHERE EXISTS subquery twice instead of just using the annotated 'file_event' label... Maybe this is just a Django/SQLite driver bug?
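If your Django version supports it (3.0 and later allow boolean expressions such as Exists directly in filter()), you can skip the annotation entirely, which should make the subquery appear only once in the generated SQL. A sketch, untested against your schema:

```python
from django.db.models import Exists, OuterRef

# Files of type 1 with no matching FileEvent row; the NOT EXISTS subquery
# is correlated on basefile_ptr_id rather than re-evaluated twice.
unprocessed_files_type1 = FileType1.objects.filter(
    ~Exists(FileEvent.objects.filter(file=OuterRef('pk')))
)
```

On older versions, the annotate-then-filter form you found is the standard workaround, and the duplicated EXISTS is how Django spells the annotation reference into the WHERE clause rather than a driver bug.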

issue in one2many write

In the code below I get two rows as a result. I then create objects from those results and assign the created records to the one2many field (inventory_line), but only one row is displayed. I want the one2many to list all of the created values. How can I fix this issue?
@api.multi
def _inventory(self):
    result = {}
    if not self:
        return result
    inventory_obj = self.env['tpt.product.inventory']
    for rec in self:
        result.setdefault(rec, [])
        sql = 'delete from tpt_product_inventory where product_id=%s' % (rec.id)
        self._cr.execute(sql)
        sql = '''
            select foo.loc, foo.prodlot_id, foo.id as uom, sum(foo.product_qty) as ton_sl, foo.product_id
            from (
                select l2.id as loc, st.prodlot_id, pu.id, st.product_qty, st.product_id
                from stock_move st
                inner join stock_location l2 on st.location_dest_id = l2.id
                inner join product_uom pu on st.product_uom = pu.id
                where st.state = 'done' and st.product_id = %s and l2.usage = 'internal'
                union all
                select l1.id as loc, st.prodlot_id, pu.id, st.product_qty * -1, st.product_id
                from stock_move st
                inner join stock_location l1 on st.location_id = l1.id
                inner join product_uom pu on st.product_uom = pu.id
                where st.state = 'done' and st.product_id = %s and l1.usage = 'internal'
            ) foo
            group by foo.loc, foo.prodlot_id, foo.id, foo.product_id
        ''' % (rec.id, rec.id)
        self._cr.execute(sql)
        for inventory in self._cr.dictfetchall():
            new_id = inventory_obj.create({
                'warehouse_id': inventory['loc'],
                'product_id': inventory['product_id'],
                'prodlot_id': inventory['prodlot_id'],
                'hand_quantity': inventory['ton_sl'],
                'uom_id': inventory['uom'],
            })
            self.inventory_line = new_id
I think you would be better off creating an SQL view for this scenario and associating it with the model tpt.product.inventory; that way you could remove all of the code you are using to match the records (the delete and re-create).
You can find a very similar example here:
https://github.com/odoo/odoo/blob/695050dd10e786d7b316f6e7e40418441cf0c8dd/addons/stock/report/report_stock_forecast.py
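If you want to keep the compute approach instead, note that `self.inventory_line = new_id` inside the loop overwrites the one2many on every iteration, so only the last created record survives. The usual fix is to collect all the records and assign them once with the `(6, 0, ids)` command. A sketch, untested against this model:

```python
created = inventory_obj.browse()  # empty recordset to accumulate into
for inventory in self._cr.dictfetchall():
    created |= inventory_obj.create({
        'warehouse_id': inventory['loc'],
        'product_id': inventory['product_id'],
        'prodlot_id': inventory['prodlot_id'],
        'hand_quantity': inventory['ton_sl'],
        'uom_id': inventory['uom'],
    })
# Replace the whole one2many in a single assignment instead of overwriting
# it once per row:
self.inventory_line = [(6, 0, created.ids)]
```

The `(6, 0, ids)` triple tells the ORM to replace the field's current contents with exactly the listed ids, which is what "list out all created values" needs here.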

Django; how to find all bugs that have certain flags set

I am writing a Django app that queries a Bugzilla database for reporting. I am trying to build a query that can get all of the bugs that have specific flags set.
The model representing the flags table.
class Bugzilla_flags(models.Model):
    class Meta:
        db_table = 'flags'

    type_id = models.IntegerField()
    status = models.CharField(max_length=50)
    bug_id = models.IntegerField()
    creation_date = models.DateTimeField()
    modification_date = models.DateTimeField()
    setter_id = models.IntegerField()
    requestee_id = models.IntegerField()

    def __unicode__(self):
        return str(self.bug_id)
I have a dictionary that represents the flags I want to look for (type_id : status).
flags = {'36':'?','12':'+'}
I tried using the reduce function, but I don't think it will work because it requires all of the flags to be present in the same row. If I run the query with a dictionary containing just a single k,v pair it works fine, but not with more than one.
query = reduce(operator.and_, (Q(type_id=flag,status=val) for (flag,val) in flags.items()))
I will then take the results of that query, and use it as the search for the actual bugs database.
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
bugs = Bugzilla_bugs.objects.using('bugzilla').filter(bug_id__in=inner)
For some history, I am currently using a series of steps to generate some sql which I send as a raw query, but I am trying to see if I can do it in Django. The resulting sql is like this:
select b.bug_id, b.priority, b.bug_severity, b.bug_status, b.resolution, b.cf_verified_in, b.assigned_to, b.qa_contact, b.short_desc, b.cf_customercase,
MAX(CASE WHEN f.type_id = 31 THEN f.status ELSE NULL END) as Unlocksbranch1,
MAX(CASE WHEN f.type_id = 31 THEN f.status ELSE NULL END) as Unlocksbranch2,
MAX(CASE WHEN f.type_id = 33 THEN f.status ELSE NULL END) as Unlocksbranch3,
MAX(CASE WHEN f.type_id = 34 THEN f.status ELSE NULL END) as Unlocksbranch4,
MAX(CASE WHEN f.type_id = 36 THEN f.status ELSE NULL END) as Unlocksbranch5,
MAX(CASE WHEN f.type_id = 41 THEN f.status ELSE NULL END) as Unlocksbranch6,
MAX(CASE WHEN f.type_id = 12 THEN f.status ELSE NULL END) as CodeReviewed
from bugs b
inner join flags f on f.bug_id = b.bug_id
where ( b.bug_status = 'RESOLVED' or b.bug_status = 'VERIFIED' or b.bug_status = 'CLOSED' )
and b.resolution = 'FIXED'
group by b.bug_id
having CodeReviewed = '+' and Unlocksbranch1 = '?' and Unlocksbranch2 = '+'
The result of this gives me a single queryset that has all of the flags I care about as columns, which I can then do my analysis on. The last "having" section is what I am actually querying on, and is what I am trying to get with the above Django queries.
EDIT
Basically what I need to do is like this:
flags1 = {'36':'?'}
flags2 = {'12':'+'}
query1 = reduce(operator.and_, (Q(type_id=flag,status=val) for (flag,val) in flags1.items()))
query2 = reduce(operator.and_, (Q(type_id=flag,status=val) for (flag,val) in flags2.items()))
inner1 = Bugzilla_flags.objects.using('bugzilla').filter(query1)
inner2 = Bugzilla_flags.objects.using('bugzilla').filter(query2)
inner1_bugs = [row.bug_id for row in inner1] # list of just the bug_ids
inner2_bugs = [row.bug_id for row in inner2] # list of just the bug_ids
intersect = set(inner1_bugs) & set(inner2_bugs)
The intersect is a set that has all of the bug_ids that I can then use in the Bugzilla_bugs query to get the actual bug data.
How can I do the 3 operations (query, inner, inner_bugs) and then the intersect using a variable length dictionary input such as:
flags = {'36': '?', '12': '+', '15': '?', ...}
Your inner query looks right to me. To find bugs that have all those flags, not just any one of them, you can either use reduce again to AND together a bunch of Q objects, or iterate and build up multiple filter clauses.
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
flag_filter = reduce(operator.and_, (Q(flag=flag) for flag in inner))
bugs = Bugzilla_bugs.objects.using('bugzilla').filter(flag_filter)
Or:
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
bugs = Bugzilla_bugs.objects.using('bugzilla').all()
for flag in inner:
    bugs = bugs.filter(flag=flag)
Or, for that matter, take advantage of the fact that multiple Q objects are anded together:
inner = Bugzilla_flags.objects.using('bugzilla').filter(query)
flag_filters = [Q(flag=flag) for flag in inner]
bugs = Bugzilla_bugs.objects.using('bugzilla').filter(*flag_filters)
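For the variable-length case in the EDIT, the query / inner / inner_bugs steps generalize to a loop over the dictionary followed by one set intersection. The set logic is sketched below with plain tuples standing in for Bugzilla_flags rows (the data is made up); in Django each per-flag set would instead come from `Bugzilla_flags.objects.using('bugzilla').filter(type_id=t, status=s).values_list('bug_id', flat=True)`:

```python
def bugs_with_all_flags(rows, flags):
    """rows: iterable of (type_id, status, bug_id); flags: {type_id: status}.

    Returns the bug_ids that carry every requested flag, mirroring the
    per-flag query + set intersection from the EDIT."""
    bug_id_sets = []
    for type_id, status in flags.items():
        bug_id_sets.append(
            {bug_id for t, s, bug_id in rows if t == type_id and s == status}
        )
    return set.intersection(*bug_id_sets) if bug_id_sets else set()

# Hypothetical flag rows: (type_id, status, bug_id)
rows = [
    ('36', '?', 1), ('12', '+', 1),
    ('36', '?', 2),
    ('12', '+', 3), ('36', '+', 3),
]
matching = bugs_with_all_flags(rows, {'36': '?', '12': '+'})  # only bug 1 has both
```

The resulting set can then be fed to `Bugzilla_bugs.objects.using('bugzilla').filter(bug_id__in=matching)` exactly as in your two-flag version.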

Merging tables from django-tables2 and dynamic models

I would like the possibility of merging two or more tables (generated by django-tables2) which are generated from dynamic models.
Let me first describe my problem:
I have the following fields for a dynamically generated model:
...
fields = {
    'colA': models.IntegerField(),
    'colB': models.IntegerField(),
    'colC': models.IntegerField(),
    'colD': models.IntegerField(),
}
...
The dynamic handling gives me the possibility to store my tables in a structured way, without defining redundant model classes.
Example for models derived from stored tables in the database:
myDynamicModelA = DataModels().create_model('myDynamicModelA_tablename')
myDynamicModelB = DataModels().create_model('myDynamicModelB_tablename')
myDynamicModelC = DataModels().create_model('myDynamicModelC_tablename')
myDynamicModelD = DataModels().create_model('myDynamicModelD_tablename')
....
'_tablename' is shared by all tables; what differs is the prefix 'myDynamicModel A, B, C, ...'.
This is the model-part.
According to that let me describe the table structure by using django-tables2:
Each model/table shares some columns/fields, so that I can define a django-tables2 class like this:
class Table_A(tables.Table):
    colA = tables.Column()
    colB = tables.Column()
The fields which are different can be handled simply by using inheritance:
class Table_B(Table_A):
    colC = tables.Column()
    colD = tables.Column()

    def __init__(self, *args, **kwargs):
        self.colname = kwargs.pop('colname')
        super(Table_B, self).__init__(*args, **kwargs)
        for col in self.base_columns:
            if col not in ['colA', 'colB']:
                self.base_columns[col].verbose_name = '%s_%s' % (self.colname, col)
As you can see, the constructor gives a different column-name-prefix for fields that differ across the models.
Now it is possible to generate tables from the different models, eg:
Table for 'myDynamicModelA_tablename':
columns: colA, colB, myDynamicModelA_tablename_colC, myDynamicModelA_tablename_colD
Table for 'myDynamicModelB_tablename':
columns: colA, colB, myDynamicModelB_tablename_colC, myDynamicModelB_tablename_colD
...
Now my question:
is it possible to merge both tables so that I receive something like that:
colA, colB, myDynamicModelA_tablename_colC, myDynamicModelB_tablename_colC, myDynamicModelA_tablename_colD, myDynamicModelB_tablename_colD
The values which will be shown should result from an intersection between the tables (this is possible, because of the common values from colA, which could be interpreted as a primary key)
It is necessary that the result is a django-tables2 object, because I want to provide pagination as well as sorting options.
I hope my description is understandable; sorry if not.
Many thanks for your time and help.
Perhaps this previous question and answer of mine will help?
I access data from combined, dynamically created tables via raw SQL and send this data to be rendered in a django-tables2 table.
If you want to specify the order of (and indeed which) columns are rendered, define a Meta class as shown below (where 'sequence' is the order of the columns; nb '...' just means 'and all other columns' - check out the documentation for tables2, searching for 'Swapping the position of columns'):
def getTable(table_name):
    cursor = runSQL(table_name, """
        SELECT * FROM subscription_exptinfo,%s
        WHERE %s.id = subscription_exptinfo.id
        ;""" % (table_name, table_name))
    exptData = dictfetchall(cursor)

    class Meta:
        attrs = {'class': 'paleblue'}
        sequence = ('id', '...')

    attrs = {}
    attrs['Meta'] = Meta
    cols = exptData[0]
    for item in cols:
        if item == 'timeStart':
            attrs[str(item)] = tables.DateTimeColumn(format='d-m-Y g:i:s', short=False)
        else:
            attrs[str(item)] = tables.Column()
    myTable = type('myTable', (TableReport,), attrs)
    return myTable(exptData)
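As for the merge itself, one approach is to intersect the two row sets on colA in plain Python and hand the merged list of dicts to a dynamically built table class like the one above. A sketch with made-up data (the prefixes stand in for the per-model verbose-name prefixes from the question):

```python
def merge_on(rows_a, rows_b, key='colA', shared=('colA', 'colB'),
             prefix_a='A', prefix_b='B'):
    """Merge two lists of row-dicts on a shared key column.

    Only keys present in both inputs survive (an intersection, as the
    question asks). Non-shared columns get a per-table prefix so the
    merged header reads: colA, colB, A_colC, B_colC, A_colD, B_colD.
    """
    by_key = {row[key]: row for row in rows_a}
    merged = []
    for row_b in rows_b:
        row_a = by_key.get(row_b[key])
        if row_a is None:
            continue  # not in both tables -> drop
        out = {k: row_a[k] for k in shared}
        for k, v in row_a.items():
            if k not in shared:
                out['%s_%s' % (prefix_a, k)] = v
        for k, v in row_b.items():
            if k not in shared:
                out['%s_%s' % (prefix_b, k)] = v
        merged.append(out)
    return merged

rows_a = [{'colA': 1, 'colB': 2, 'colC': 3, 'colD': 4},
          {'colA': 9, 'colB': 8, 'colC': 7, 'colD': 6}]
rows_b = [{'colA': 1, 'colB': 2, 'colC': 30, 'colD': 40}]
merged = merge_on(rows_a, rows_b)
```

Because django-tables2 tables happily consume a list of dicts, `merged` can be passed straight into a `type(...)`-built table as in getTable above, keeping pagination and sorting.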

SQLAlchemy session query with INSERT IGNORE

I'm trying to do a bulk insert/update with SQLAlchemy. Here's a snippet:
for od in clist:
    where = and_(Offer.network_id == od['network_id'],
                 Offer.external_id == od['external_id'])
    o = session.query(Offer).filter(where).first()
    if not o:
        o = Offer()
    o.network_id = od['network_id']
    o.external_id = od['external_id']
    o.title = od['title']
    o.updated = datetime.datetime.now()
    payout = od['payout']
    countrylist = od['countries']
    session.add(o)
    session.flush()
    for country in countrylist:
        c = session.query(Country).filter(Country.name == country).first()
        where = and_(OfferPayout.offer_id == o.id,
                     OfferPayout.country_name == country)
        opayout = session.query(OfferPayout).filter(where).first()
        if not opayout:
            opayout = OfferPayout()
            opayout.offer_id = o.id
            opayout.payout = od['payout']
            if c:
                opayout.country_id = c.id
                opayout.country_name = country
            else:
                opayout.country_id = 0
                opayout.country_name = country
            session.add(opayout)
            session.flush()
It looks like my issue was touched on here, http://www.mail-archive.com/sqlalchemy#googlegroups.com/msg05983.html, but I don't know how to use "textual clauses" with session query objects and couldn't find much (though admittedly I haven't had as much time as I'd like to search).
I'm new to SQLAlchemy, and I'd imagine there are some issues in the code besides the fact that it throws an exception on a duplicate key - for example, doing a flush after every iteration of clist (but I don't know how else to get the o.id value that is used in the subsequent OfferPayout inserts).
Guidance on any of these issues is much appreciated.
The way you should be doing this is with session.merge().
You should also be using your objects' relationship properties. So the o above would have o.offerpayout, a list (of objects), and each offerpayout would have an offerpayout.country property, which is the related Country object.
So the above would look something like:
for od in clist:
    o = Offer()
    o.network_id = od['network_id']
    o.external_id = od['external_id']
    o.title = od['title']
    o.updated = datetime.datetime.now()
    payout = od['payout']
    countrylist = od['countries']
    for country in countrylist:
        opayout = OfferPayout()
        opayout.payout = od['payout']
        country_obj = Country()
        country_obj.name = country
        opayout.country = country_obj
        o.offerpayout.append(opayout)
    session.merge(o)
    session.flush()
This should work as long as all the primary keys are correct (i.e. the country table has a primary key of name). Merge essentially checks the primary keys and, if they are there, merges your object with the one in the database (it will also cascade down the joins).
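If you specifically want MySQL's INSERT IGNORE (as in the title), SQLAlchemy can emit it from Core via prefix_with() on an insert statement. A sketch, assuming a MySQL backend and the Offer model from the question:

```python
# Emits: INSERT IGNORE INTO <offer table> (network_id, external_id, title) VALUES (...)
# so duplicate-key rows are silently skipped instead of raising.
stmt = (
    Offer.__table__.insert()
    .prefix_with('IGNORE')
    .values(network_id=od['network_id'],
            external_id=od['external_id'],
            title=od['title'])
)
session.execute(stmt)
```

This bypasses the ORM's identity map (no objects come back), so it suits pure bulk loads; for the read-modify-write flow above, merge() remains the better fit.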
