Django join two tables while keeping ORM - python

Using Django, I am trying to fetch this specific result view from the database using Django:
select * from CO2_Low_Adj a JOIN CO2_Low_Metrics b on a.gene_id_B = b.gene_id where a.gene_id_A='Traes_1AL_00A8A2030'
I know I can do it using connections, cursor, fetchall and get back a list of dictionaries. However, I am wondering if there is a way to do this in Django while keeping the ORM.
The tables look like this:
class Co2LowMetrics(models.Model):
gene_id = models.CharField(primary_key=True, max_length=24)
modular_k = models.FloatField()
modular_k_rank = models.IntegerField()
modular_mean_exp_rank = models.IntegerField()
module = models.IntegerField()
k = models.FloatField()
k_rank = models.IntegerField()
mean_exp = models.FloatField()
mean_exp_rank = models.IntegerField()
gene_gene = models.ForeignKey(Co2LowGene, db_column='Gene_gene_id') # Field name made lowercase.
class Meta:
managed = False
db_table = 'CO2_Low_Metrics'
class Co2LowGene(models.Model):
gene_id = models.CharField(primary_key=True, max_length=24)
entry = models.IntegerField(unique=True)
gene_gene_id = models.CharField(db_column='Gene_gene_id', max_length=24) # Field name made lowercase.
class Meta:
managed = False
db_table = 'CO2_Low_Gene'
class Co2LowAdj(models.Model):
gene_id_a = models.CharField(db_column='gene_id_A', max_length=24) # Field name made lowercase.
edge_number = models.IntegerField(unique=True)
gene_id_b = models.CharField(db_column='gene_id_B', max_length=24) # Field name made lowercase.
value = models.FloatField()
gene_gene_id_a = models.ForeignKey('Co2LowGene', db_column='Gene_gene_id_A') # Field name made lowercase.
gene_gene_id_b = models.ForeignKey('Co2LowGene', db_column='Gene_gene_id_B') # Field name made lowercase.
class Meta:
managed = False
db_table = 'CO2_Low_Adj'
The database table descriptions are:
mysql> describe CO2_Low_Metrics;
+-----------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+-------------+------+-----+---------+-------+
| gene_id | varchar(24) | NO | PRI | NULL | |
| modular_k | double | NO | | NULL | |
| modular_k_rank | int(8) | NO | | NULL | |
| modular_mean_exp_rank | int(8) | NO | | NULL | |
| module | int(8) | NO | | NULL | |
| k | double | NO | | NULL | |
| k_rank | int(8) | NO | | NULL | |
| mean_exp | double | NO | | NULL | |
| mean_exp_rank | int(8) | NO | | NULL | |
| Gene_gene_id | varchar(24) | NO | MUL | NULL | |
+-----------------------+-------------+------+-----+---------+-------+
mysql> describe CO2_Low_Gene;
+--------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+----------------+
| gene_id | varchar(24) | NO | PRI | NULL | |
| entry | int(8) | NO | UNI | NULL | auto_increment |
| Gene_gene_id | varchar(24) | NO | | NULL | |
+--------------+-------------+------+-----+---------+----------------+
mysql> describe CO2_Low_Adj;
+----------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+----------------+
| gene_id_A | varchar(24) | NO | MUL | NULL | |
| edge_number | int(9) | NO | PRI | NULL | auto_increment |
| gene_id_B | varchar(24) | NO | MUL | NULL | |
| value | double | NO | | NULL | |
| Gene_gene_id_A | varchar(24) | NO | MUL | NULL | |
| Gene_gene_id_B | varchar(24) | NO | MUL | NULL | |
+----------------+-------------+------+-----+---------+----------------+
Assume that I do not have the ability to change the underlying database schema. That may change and if a suggestion can help in making it easier to use Django's ORM then I can attempt to get it changed.
However, I have been trying to use prefetch_related and select_related but I'm doing something wrong and am not getting everything back right.
With my SQL query I get essentially with the described tables in order CO2_Low_Adj then CO2_Low_Metrics where gene_id_A is the same as gene_gene_id_A ('Traes_1AL_00A8A2030') and gene_id_B is the same as gene_gene_id_B. CO2_Low_Gene does not seem to be used at all with the SQL query.
Thanks.

Django does not have a way to perform JOIN queries without foreign keys. This is why prefetch_related and select_related will not work - they work on foreign keys.
I am not sure what are you trying to achieve. Since your gene_id is unique, there will be only one CO2_Low_Metrics instance and a list of adj instances:
adj = CO2_Low_Adj.objects.filter(gene_id_A='Traes_1AL_00A8A2030')
metrics = CO2_Low_Metrics.objects.get(pk='Traes_1AL_00A8A2030')
and then work on a separate list.

Related

Django: Field 'X' doesn't have a default value

The models.py of my table is :
class TestCaseGlobalMetricData(models.Model):
testRunData = models.ForeignKey(TestRunData)
testCase = models.CharField(max_length=200)
cvmTotalFree = models.IntegerField(default=0)
systemFree = models.IntegerField(default=0)
sharedMemory = models.IntegerField(default=0)
And the part of the code which uses this table is
TestCaseGlobalMetricData(testRunData=testRunDataObj,
testCase=tokens['tc_name'],
timestamp=tokens['timestamp'],
cvmTotalFree=totalFree).save()
When this line is executed the following error is seen,
File "/web/memmon/eink-memmon/projects/process.py", line 382, in process_cvm_usage
cvmTotalFree=totalFree).save()
.....
Warning: Field 'sharedMemory' doesn't have a default value
I tried inserting into the table using the python manage.py shell and it works fine.
The schema of the table as seen in mysql is
+-----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| testRunData_id | int(11) | NO | MUL | NULL | |
| testCase | varchar(200) | NO | | NULL | |
| cvmTotalFree | int(11) | NO | | NULL | |
| systemFree | int(11) | NO | | NULL | |
| sharedMemory | int(11) | NO | | NULL | |
+-----------------+--------------+------+-----+---------+----------------+
Note:
The field sharedMemory was added in the models.py today. I did proper migrations using makemigrations and migrate commands.

sqlalchemy not setting timestmap default

SQLAlchemy isn't respecting default=datetime.datetime.utcnow or default=func.now() (I've tried both) for DateTime columns.
Python:
class DataStuff(BASE):
"""some data"""
__tablename__ = 'datastuff'
_id = Column(Integer, primary_key=True)
_name = Column(String(64))
json = Column(sqlalchemy.UnicodeText())
_timestamp = Column(DateTime, default=datetime.datetime.utcnow)
MySQL:
mysql> describe datastuff;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| _id | int(11) | NO | PRI | NULL | auto_increment |
| _name | varchar(64) | YES | | NULL | |
| json | text | YES | | NULL | |
| _timestamp | datetime | YES | | NULL | |
According to the documentation using server_default is the way to go when the desired action is to provide a default value on a column within the CREATE TABLE statement.
http://docs.sqlalchemy.org/en/rel_1_0/core/defaults.html
I ended up using server_default with the func.now() function instead of default=datetime.datetime.utcnow.
New code:
from sqlalchemy import func
_timestamp = Column(DateTime, server_default=func.now())
Result:
+---------------+--------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+-------------------+----------------+
| _timestamp | datetime | YES | | CURRENT_TIMESTAMP | |
violá

Join in Django views

I have 2 Models in my django app.
The corresponding tables for the two models are:
Table 1:
+-------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| post_id | int(11) | NO | MUL | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
+-------------+---------+------+-----+---------+----------------+
Table2:
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| p_text | varchar(200) | NO | | NULL | |
| p_slug | varchar(50) | YES | MUL | NULL | |
| user_id | int(11) | NO | MUL | NULL | |
+---------------+--------------+------+-----+---------+----------------+
Now what i want is to write the equivalent of the below query in my django view in the best way possible? The query i want to write is a simple join as:
select B.p_slug from Table1 A, Table2 B where A.post_id = B.id;
I tried but could not get it working. Any help please? How to implement the above query in Django views
The models are: Model1:
class Model1(models.Model):
user = models.ForeignKey(settings.AUTH_USER_MODEL)
post = models.ForeignKey(Model1)
class Model2(models.Model):
p_text = models.CharField(max_length=200)
user = models.ForeignKey(settings.AUTH_USER_MODEL)
p_slug = models.SlugField(null=True,blank=True)
Try this: Model2.objects.filter(pk__in=Model1.objcts.values_list('post_id', flat=True)).values('p_slug'). I hope it helps.

Why django order_by is so slow in a manytomany query?

I have a ManyToMany field. Like this:
class Tag(models.Model):
books = models.ManyToManyField ('book.Book', related_name='vtags', through=TagBook)
class Book (models.Model):
nump = models.IntegerField (default=0, db_index=True)
I have around 450,000 books, and for some tags, it related around 60,000 books. When I did a query like:
tag.books.order_by('nump')[1:11]
It gets extremely slow, like 3-4 minutes.
But if I remove order_by, it run queries as normal.
The raw sql for the order_by version looks like this:
'SELECT `book_book`.`id`, ... `book_book`.`price`, `book_book`.`nump`,
FROM `book_book` INNER JOIN `book_tagbook` ON (`book_book`.`id` =
`book_tagbook`.`book_id`) WHERE `book_tagbook`.`tag_id` = 1 ORDER BY
`book_book`.`nump` ASC LIMIT 11 OFFSET 1'
Do you have any idea on this? How could I fix it? Thanks.
---EDIT---
Checked the previous raw query in mysql as #bouke suggested:
SELECT `book_book`.`id`, `book_book`.`title`, ... `book_book`.`nump`,
`book_book`.`raw_data` FROM `book_book` INNER JOIN `book_tagbook` ON
(`book_book`.`id` = `book_tagbook`.`book_id`) WHERE `book_tagbook`.`tag_id` = 1
ORDER BY `book_book`.`nump` ASC LIMIT 11 OFFSET 1;
11 rows in set (4 min 2.79 sec)
Then use explain to find out why:
+----+-------------+--------------+--------+---------------------------------------------+-----------------------+---------+-----------------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+---------------------------------------------+-----------------------+---------+-----------------------------+--------+---------------------------------+
| 1 | SIMPLE | book_tagbook | ref | book_tagbook_3747b463,book_tagbook_752eb95b | book_tagbook_3747b463 | 4 | const | 116394 | Using temporary; Using filesort |
| 1 | SIMPLE | book_book | eq_ref | PRIMARY | PRIMARY | 4 | legend.book_tagbook.book_id | 1 | |
+----+-------------+--------------+--------+---------------------------------------------+-----------------------+---------+-----------------------------+--------+---------------------------------+
2 rows in set (0.10 sec)
And for the table book_book:
mysql> explain book_book;
+----------------+----------------+------+-----+-----------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+----------------+------+-----+-----------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| title | varchar(200) | YES | | NULL | |
| href | varchar(200) | NO | UNI | NULL | |
..... skip some part.............
| nump | int(11) | NO | MUL | 0 | |
| raw_data | varchar(10000) | YES | | NULL | |
+----------------+----------------+------+-----+-----------+----------------+
24 rows in set (0.00 sec)

Creating a slug field in database tied to name field

I have a django project for which I'm trying to update the database tables without using any migration tools. I'm adding a 'slug' field to a model which simply references it's name, and I was intending on doing so by replicating what was in another table that already has a slug for.
So, I have an existing table 'gym' in the database as follows
+----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| gym_name | varchar(50) | NO | UNI | NULL | |
| gym_slug | varchar(50) | NO | MUL | NULL | |
| created | datetime | NO | | NULL | |
| modified | datetime | NO | | NULL | |
+----------+--------------+------+-----+---------+----------------+
and I have another table 'wall' in the database as follows:
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| wall_name | varchar(50) | NO | UNI | NULL | |
| gym_id | int(11) | NO | MUL | NULL | |
| created | datetime | NO | | NULL | |
| modified | datetime | NO | | NULL | |
+-----------+-------------+------+-----+---------+----------------+
This would get me part of the way there:
ALTER TABLE wall ADD COLUMN wall_slug varchar(50);
But I'm not certain how to figure out where the foreign key in the first table is referencing and thus where I should point the new one.
End goal: the wall_slug field tied to a unique wall_name field. Hope that makes sense.
from django.template.defaultfilters import slugify
class Wall(Model):
name = CharField(max_length=60)
slug = CharField(max_length=60, unique=True)
...
def save(self, *args, **kwargs):
if self.slug == '':
self.slug = slugify(self.name)
super(Wall, self).save(*args, **kwargs)
This will generate for you a slug based on the name when saving. It also protects you from changing the slug next time you change the name, so URLs remain stable and unchanged.

Categories