I'm using Python/Flask/SQLAlchemy. I have a class Contest that I want to sort by a rating which (the tricky part) is a sum over properties of its child Side objects. The formula for the rating is
leftside.score + len(leftside.votes) + rightside.score + len(rightside.votes)
Models:
class Contest(db.Model):
    leftside_id = db.Column(db.Text, db.ForeignKey('sides.id'))
    rightside_id = db.Column(db.Text, db.ForeignKey('sides.id'))
    leftside = db.relationship("Side", foreign_keys=[leftside_id])
    rightside = db.relationship("Side", foreign_keys=[rightside_id])
    # rating = ??? leftside.score + len(leftside.votes) + rightside.score + len(rightside.votes)

class Side(db.Model):
    score = db.Column(db.Integer, default=0)
    votes = db.relationship('SideVote')

class SideVote(db.Model):
    side_id = db.Column(db.Text, db.ForeignKey('sides.id'))
    side = db.relationship('Side')
I can write raw SQL, but it returns a plain list of rows, and I need a SQLAlchemy query:
SELECT *, (
score
+ (SELECT COUNT(*) FROM sidevotes WHERE side_id = contests.leftside_id or side_id = contests.rightside_id)
) as Field
FROM contests, sides
WHERE contests.leftside_id = sides.id or contests.rightside_id = sides.id
ORDER BY Field DESC
So once again, I need to sort Contests by the formula written above. I see two possible solutions:
Either create some hybrid_property/column_property
Or execute the SQL and map it to a SQLAlchemy query so I can use the results
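For reference, the raw SQL above returns one row per (contest, side) pair rather than one row per contest. Below is a minimal sketch of a one-row-per-contest rating, assuming the rating is meant to count the votes on both sides. It uses the standard-library sqlite3 module instead of SQLAlchemy; the table names come from the query above, while the aliases and sample rows are invented for illustration. A hybrid_property or column_property expression would presumably have to render SQL of roughly this shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sides (id TEXT PRIMARY KEY, score INTEGER DEFAULT 0);
    CREATE TABLE sidevotes (id INTEGER PRIMARY KEY, side_id TEXT REFERENCES sides(id));
    CREATE TABLE contests (
        id INTEGER PRIMARY KEY,
        leftside_id TEXT REFERENCES sides(id),
        rightside_id TEXT REFERENCES sides(id)
    );
    -- Invented sample data: two contests, four sides, five votes.
    INSERT INTO sides (id, score) VALUES ('a', 1), ('b', 2), ('c', 5), ('d', 0);
    INSERT INTO contests (id, leftside_id, rightside_id) VALUES (1, 'a', 'b'), (2, 'c', 'd');
    INSERT INTO sidevotes (side_id) VALUES ('a'), ('a'), ('b'), ('b'), ('c');
""")

# Join each contest to its two sides under separate aliases, then add a
# correlated COUNT(*) of the votes cast for either side.
RATING_SQL = """
    SELECT c.id,
           l.score + r.score
           + (SELECT COUNT(*) FROM sidevotes AS v
              WHERE v.side_id IN (c.leftside_id, c.rightside_id)) AS rating
    FROM contests AS c
    JOIN sides AS l ON l.id = c.leftside_id
    JOIN sides AS r ON r.id = c.rightside_id
    ORDER BY rating DESC
"""
ranked = conn.execute(RATING_SQL).fetchall()
print(ranked)
```

Each contest appears exactly once, so the result can be used directly for ordering.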
I have a simple polling script that polls entries based on new IDs in an MSSQL table. I'm using SQLAlchemy's ORM to create a table class and then query that table. I want to be able to add more tables "dynamically" without coding each one directly into the method.
My polling function:
def poll_db():
    query = db.query(
        Transactions.ID).order_by(Transactions.ID.desc()).limit(1)

    # Continually poll for new images to classify
    max_id_query = query
    last_max_id = max_id_query.scalar()
    while True:
        max_id = max_id_query.scalar()
        if max_id > last_max_id:
            print(
                f"New row(s) found. "
                f"Processing ids {last_max_id + 1} through {max_id}"
            )
            # Insert ML model
            id_query = db.query(Transactions).filter(
                Transactions.ID > last_max_id)
            df_from_query = pd.read_sql_query(
                id_query.statement, db.bind, index_col='ID')
            print(f"New query was made")
            last_max_id = max_id
        time.sleep(5)
My table model:
import sqlalchemy as db
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String, Text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import defer, relationship, query
from database import SessionLocal, engine
insp = db.inspect(engine)
db_list = insp.get_schema_names()
Base = declarative_base(cls=BaseModel)
class Transactions(Base):
    __tablename__ = 'simulation_data'
    sender_account = db.Column('sender_account', db.BigInteger)
    recipient_account = db.Column('recipient_account', db.String)
    sender_name = db.Column('sender_name', db.String)
    recipient_name = db.Column('recipient_name', db.String)
    date = db.Column('date', db.DateTime)
    text = db.Column('text', db.String)
    amount = db.Column('amount', db.Float)
    currency = db.Column('currency', db.String)
    transaction_type = db.Column('transaction_type', db.String)
    fraud = db.Column('fraud', db.BigInteger)
    swift_bic = db.Column('swift_bic', db.String)
    recipient_country = db.Column('recipient_country', db.String)
    internal_external = db.Column('internal_external', db.String)
    ID = Column('ID', db.BigInteger, primary_key=True)
QUESTION
How can I pass the table class name "dynamically" in the likes of poll_db(tablename), where tablename='Transactions', and instead of writing similar queries for multiple tables, such as:
query = db.query(Transactions.ID).order_by(Transactions.ID.desc()).limit(1)
query2 = db.query(Transactions2.ID).order_by(Transactions2.ID.desc()).limit(1)
query3 = db.query(Transactions3.ID).order_by(Transactions3.ID.desc()).limit(1)
The tables will have identical structure, but different data.
I can't give you a full example right now (will edit later) but here's one hacky way to do it (the documentation will probably be a better place to check):
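def dynamic_table(tablename):
    # On SQLAlchemy 1.4+ the registry is, as far as I know, found at
    # Base.registry._class_registry instead.
    for class_name, cls in Base._decl_class_registry.items():
        # The registry also holds non-model helper entries, so don't
        # assume every value has a __tablename__.
        if getattr(cls, '__tablename__', None) == tablename:
            return cls
    return None

Transactions2 = dynamic_table("simulation_data")
assert Transactions2 is Transactions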
The returned class is the model you want. Keep in mind that Base can only access the tables that have been subclassed already so if you have them in other modules you need to import them first so they are registered as Base's subclasses.
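The registration caveat can be illustrated without SQLAlchemy. The sketch below uses a toy Base (a stand-in written for this example, not SQLAlchemy's implementation) that records each subclass by its __tablename__ when the class body is executed. A model whose module has never been imported is simply absent from the lookup table.

```python
class Base:
    """Toy stand-in for a declarative base; NOT SQLAlchemy's implementation."""

    registry = {}  # maps table name -> model class

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        tablename = getattr(cls, "__tablename__", None)
        if tablename is not None:
            # Registration happens only when the class statement runs,
            # i.e. when the module defining the model has been imported.
            Base.registry[tablename] = cls


def dynamic_table(tablename):
    """Return the model registered for tablename, or raise LookupError."""
    try:
        return Base.registry[tablename]
    except KeyError:
        raise LookupError(f"no model registered for table {tablename!r}") from None


class Transactions(Base):
    __tablename__ = "simulation_data"


found = dynamic_table("simulation_data")
print(found.__name__)
```

Raising on an unknown name, rather than returning None, makes a missing import show up at the lookup instead of later in the query.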
For selecting columns, something like this should work:
def dynamic_table_with_columns(tablename, *columns):
    cls = dynamic_table(tablename)
    subset = []
    for col_name in columns:
        # Skip names the model doesn't define instead of raising AttributeError
        column = getattr(cls, col_name, None)
        if column is not None:
            subset.append(column)
    # in case no columns were given
    if not subset:
        return db.query(cls)
    return db.query(*subset)
I am trying to do a join on a table that has two foreign keys to the same table. Namely, SourceToOutputRelation points twice to Entries, as shown in the code. Also, Entries have tags. I want a join that returns every SourceToOutputRelation that has all the given tags (via Entries). For now I am just trying to understand the join (the filtering works, I think). Here is the code I have for the join and filter:
'''
tags is a list of strings that are supposed to match the Tags.tag strings
'''
from sqlalchemy.orm import aliased
q = SourceToOutputRelation.query.\
    join(Entries.source_entries, Entries.output_entries).\
    join(original_tag_registration).\
    join(Tags).\
    filter(Tags.tag == tags[0])
print(q.all())
Here are my model definitions :
class SourceToOutputRelation(alchemyDB.Model):
    __tablename__ = 'sourceToOutputRel'
    id = alchemyDB.Column(alchemyDB.Integer, primary_key = True)
    source_article = alchemyDB.Column(alchemyDB.Integer, alchemyDB.ForeignKey('entries.id'))
    output_article = alchemyDB.Column(alchemyDB.Integer, alchemyDB.ForeignKey('entries.id'))

class Entries(alchemyDB.Model):
    __tablename__ = 'entries'
    id = alchemyDB.Column(alchemyDB.Integer, primary_key = True)
    tags = alchemyDB.relationship('Tags',
                                  secondary = original_tag_registration,
                                  backref = alchemyDB.backref('relevant_entries', lazy = 'dynamic'),
                                  lazy = 'dynamic')
    source_entries = alchemyDB.relationship('SourceToOutputRelation',
                                            primaryjoin="SourceToOutputRelation.output_article==Entries.id",
                                            foreign_keys = [SourceToOutputRelation.output_article],
                                            backref = alchemyDB.backref('output', lazy = 'joined'),
                                            lazy = 'dynamic',
                                            cascade = 'all, delete-orphan')
    output_entries = alchemyDB.relationship('SourceToOutputRelation',
                                            primaryjoin="SourceToOutputRelation.source_article==Entries.id",
                                            foreign_keys = [SourceToOutputRelation.source_article],
                                            backref = alchemyDB.backref('source', lazy = 'joined'),
                                            lazy = 'dynamic',
                                            cascade = 'all, delete-orphan')

original_tag_registration = alchemyDB.Table('original_tag_registration',
    alchemyDB.Column('tag_id', alchemyDB.Integer, alchemyDB.ForeignKey('tagTable.id')),
    alchemyDB.Column('entry_id', alchemyDB.Integer, alchemyDB.ForeignKey('entries.id'))
)

class Tags(alchemyDB.Model):
    '''
    a table to hold unique tags
    '''
    __tablename__ = 'tagTable'
    id = alchemyDB.Column(alchemyDB.Integer, primary_key = True)
    tag = alchemyDB.Column(alchemyDB.String(64), unique=True)
    entries_with_this_tag = alchemyDB.relationship('Entries',
                                                   secondary = original_tag_registration,
                                                   backref = alchemyDB.backref('tag', lazy = 'dynamic'),
                                                   lazy = 'dynamic')
I get this error :
OperationalError: (OperationalError) ambiguous column name:
sourceToOutputRel.id u'SELECT "sourceToOutputRel".id AS
"sourceToOutputRel_id", "sourceToOutputRel".source_article AS
"sourceToOutputRel_source_article", "sourceToOutputRel".output_article
AS "sourceToOutputRel_output_article",
"sourceToOutputRel".needs_processing AS
"sourceToOutputRel_needs_processing",
"sourceToOutputRel".number_of_votes AS
"sourceToOutputRel_number_of_votes", "sourceToOutputRel".date_related
AS "sourceToOutputRel_date_related",
"sourceToOutputRel".confirmed_relationship_type AS
"sourceToOutputRel_confirmed_relationship_type", entries_1.id AS
entries_1_id, entries_1.title AS entries_1_title, entries_1.text AS
entries_1_text, entries_1.body_html AS entries_1_body_html,
entries_1.user_id AS entries_1_user_id, entries_1.date_posted AS
entries_1_date_posted, entries_2.id AS entries_2_id, entries_2.title
AS entries_2_title, entries_2.text AS entries_2_text,
entries_2.body_html AS entries_2_body_html, entries_2.user_id AS
entries_2_user_id, entries_2.date_posted AS entries_2_date_posted
\nFROM entries JOIN "sourceToOutputRel" ON
"sourceToOutputRel".output_article = entries.id JOIN
"sourceToOutputRel" ON "sourceToOutputRel".source_article = entries.id
JOIN original_tag_registration ON entries.id =
original_tag_registration.entry_id JOIN "tagTable" ON "tagTable".id =
original_tag_registration.tag_id LEFT OUTER JOIN entries AS entries_1
ON "sourceToOutputRel".output_article = entries_1.id LEFT OUTER JOIN
entries AS entries_2 ON "sourceToOutputRel".source_article =
entries_2.id \nWHERE "tagTable".tag = ?' (u'brods',)
Look at the docs, specifically the paragraph "Joins to a Target with an ON Clause":
a_alias = aliased(Address)
q = session.query(User).\
    join(User.addresses).\
    join(a_alias, User.addresses).\
    filter(Address.email_address=='ed@foo.com').\
    filter(a_alias.email_address=='ed@bar.com')
Your query joins the same table more than once, so one of the joins needs an alias.
You already import the aliased function.
Try this code:
'''
tags is a list of strings that are supposed to match the Tags.tag strings
'''
from sqlalchemy.orm import aliased
entry_alias = aliased(Entries)
q = SourceToOutputRelation.query.\
    join(Entries.source_entries).\
    join(entry_alias, Entries.output_entries).\
    join(original_tag_registration).\
    join(Tags).\
    filter(Tags.tag == tags[0])
print(q.all())
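At the SQL level, the point is that a table appearing twice in one FROM clause needs a distinct name for each occurrence. The sketch below reproduces that with the standard-library sqlite3 module rather than SQLAlchemy. The table name rel, the aliases src and dst, and the sample rows are invented for illustration; only the two foreign-key columns mirror the models above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE rel (
        id INTEGER PRIMARY KEY,
        source_article INTEGER REFERENCES entries(id),
        output_article INTEGER REFERENCES entries(id)
    );
    INSERT INTO entries (id, title) VALUES (1, 'source'), (2, 'output');
    INSERT INTO rel (id, source_article, output_article) VALUES (10, 1, 2);
""")

# Same shape as the failing statement: "rel" is joined twice under one name,
# so a reference such as rel.id could mean either occurrence.
UNALIASED = """
    SELECT rel.id
    FROM entries
    JOIN rel ON rel.output_article = entries.id
    JOIN rel ON rel.source_article = entries.id
"""
try:
    conn.execute(UNALIASED)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Giving each occurrence of the repeated table its own alias removes the ambiguity.
ALIASED = """
    SELECT rel.id, src.title, dst.title
    FROM rel
    JOIN entries AS src ON src.id = rel.source_article
    JOIN entries AS dst ON dst.id = rel.output_article
"""
rows = conn.execute(ALIASED).fetchall()
print(error)
print(rows)
```

aliased() plays the role of the AS clauses here: it hands the ORM a second, separately named reference to the same mapped table.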
I am trying to do a complex hybrid_property using SQLAlchemy: my model is
class Consultation(Table):
    patient_id = Column(Integer)
    patient = relationship('Patient', backref=backref('consultations', lazy='dynamic'))

class Exam(Table):
    consultation_id = Column(Integer)
    consultation = relationship('Consultation', backref=backref('exams', lazy='dynamic'))

class VitalSign(Table):
    exam_id = Column(Integer)
    exam = relationship('Exam', backref=backref('vital', lazy='dynamic'))
    vital_type = Column(String)
    value = Column(String)

class Patient(Table):
    patient_data = Column(String)

    @hybrid_property
    def last_consultation_validity(self):
        last_consultation = self.consultations.order_by(Consultation.created_at.desc()).first()
        if last_consultation:
            last_consultation_conclusions = last_consultation.exams.filter_by(exam_type='conclusions').first()
            if last_consultation_conclusions:
                last_consultation_validity = last_consultation_conclusions.vital_signs.filter_by(sign_type='validity_date').first()
                if last_consultation_validity:
                    return last_consultation_validity
        return None

    @last_consultation_validity.expression
    def last_consultation_validity(cls):
        subquery = select([Consultation.id.label('last_consultation_id')]).\
            where(Consultation.patient_id == cls.id).\
            order_by(Consultation.created_at.desc()).limit(1)
        j = join(VitalSign, Exam).join(Consultation)
        return select([VitalSign.value]).select_from(j).select_from(subquery).\
            where(and_(Consultation.id == subquery.c.last_consultation_id, VitalSign.sign_type == 'validity_date'))
As you can see my model is quite complicated.
Patients get Consultations. Exams and VitalSigns are cascading data for the Consultations. Not every consultation gets a validity, and a new consultation makes the validity of earlier consultations irrelevant: I only want the validity from the last consultation, even if the patient has one in previous consultations.
What I would like to do is to be able to order by the hybrid_property last_consultation_validity.
The output SQL looks ok to me:
SELECT vital_sign.value
FROM (SELECT consultation.id AS last_consultation_id
FROM consultation, patient
WHERE consultation.patient_id = patient.id ORDER BY consultation.created_at DESC
LIMIT ? OFFSET ?), vital_sign JOIN exam ON exam.id = vital_sign.exam_id JOIN consultation ON consultation.id = exam.consultation_id
WHERE consultation.id = last_consultation_id AND vital_sign.sign_type = ?
But when I order the patients by last_consultation_validity, the rows do not get ordered ...
When I execute the same select outside of the hybrid_property to retrieve the date for each patient (just setting the patient.id), I get the right values. Surprisingly, the SQL is slightly different: patient is removed from the FROM in the inner SELECT.
So I'm actually wondering if this is a bug in SQLAlchemy or if I'm doing something wrong ... Any help would be greatly appreciated.
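One possible explanation, offered as a guess rather than a diagnosis: in the SQL shown above, the inner SELECT lists patient in its own FROM clause, so it is not correlated with the outer patient row and picks the most recent consultation overall instead of the most recent one per patient. The sketch below shows the correlated, per-patient form in raw SQL through the standard-library sqlite3 module. The schema is reduced to the columns the query needs, and the aliases and sample rows are invented for illustration. In SQLAlchemy, this shape is usually built from a scalar subquery together with correlate(), but that mapping is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER PRIMARY KEY);
    CREATE TABLE consultation (id INTEGER PRIMARY KEY, patient_id INTEGER, created_at TEXT);
    CREATE TABLE exam (id INTEGER PRIMARY KEY, consultation_id INTEGER);
    CREATE TABLE vital_sign (id INTEGER PRIMARY KEY, exam_id INTEGER, sign_type TEXT, value TEXT);

    INSERT INTO patient (id) VALUES (1), (2), (3);
    -- Patient 1 has an older consultation whose validity must be ignored.
    INSERT INTO consultation VALUES
        (1, 1, '2020-01-01'), (2, 1, '2020-06-01'), (3, 2, '2020-03-01'), (4, 3, '2020-04-01');
    INSERT INTO exam VALUES (1, 1), (2, 2), (3, 3), (4, 4);
    INSERT INTO vital_sign (exam_id, sign_type, value) VALUES
        (1, 'validity_date', '2021-01-01'),
        (2, 'validity_date', '2020-12-31'),
        (3, 'validity_date', '2021-06-30'),
        (4, 'pulse', '70');
""")

# The innermost SELECT refers to p.id from the outer query, so it is evaluated
# once per patient; "patient" does not appear in any inner FROM clause.
ORDERED = """
    SELECT p.id,
           (SELECT vs.value
            FROM vital_sign AS vs
            JOIN exam AS e ON e.id = vs.exam_id
            WHERE vs.sign_type = 'validity_date'
              AND e.consultation_id = (
                  SELECT c.id FROM consultation AS c
                  WHERE c.patient_id = p.id
                  ORDER BY c.created_at DESC
                  LIMIT 1)
            LIMIT 1) AS validity
    FROM patient AS p
    ORDER BY validity DESC
"""
ordered = conn.execute(ORDERED).fetchall()
print(ordered)
```

Patient 3 has no validity on the latest consultation, so the subquery yields NULL, which SQLite sorts last in descending order.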
I am trying to join the table STUDENT to STUDY_PROGRAM. STUDENT to STUDY_PROGRAM is a one to many relationship. The query on a simple natural join didn't give the expected result. Debugging shows that the query result didn't have 'program' columns.
(Pdb) print mystudents[0].program
*** AttributeError: 'Student' object has no attribute 'program'
def students():
    mystudentinfo = mydb.session.query(Student).join(StudyProgram)
    return render_template('administration/students.html', studentinfo = mystudentinfo)

class Student(mydb.Model):
    __tablename__ = 'STUDENT'
    study_no = mydb.Column(mydb.String(20), primary_key = True)
    std_first_name = mydb.Column(mydb.String(64))
    std_last_name = mydb.Column(mydb.String(64))
    std_birthdate = mydb.Column(mydb.Date())
    std_email = mydb.Column(mydb.String(62))
    std_password = mydb.Column(mydb.String())
    study_programs = mydb.relationship('StudyProgram', backref='student')
    project_apps = mydb.relationship('ProjectApp', backref='student')

class StudyProgram(mydb.Model):
    __tablename__ = 'STUDY_PROGRAM'
    study_no = mydb.Column(mydb.String(20), mydb.ForeignKey('STUDENT.study_no'), primary_key = True)
    program = mydb.Column(mydb.String(100), primary_key = True)
    degree_type = mydb.Column(mydb.String(8), primary_key = True)
    reg_date = mydb.Column(mydb.Date())
    status = mydb.Column(mydb.String(20))
    earned_ECTs = mydb.Column(mydb.Numeric(4, 1))
    reg_ECTs = mydb.Column(mydb.Numeric(3, 1))
    tot_ECTs = mydb.Column(mydb.Numeric(4, 1))
    graduation_date = mydb.Column(mydb.Date())
The query didn't select any StudyProgram columns because SQLAlchemy treats joins separately from selects: joining a table does not add it to what the query returns.
The loading strategy for the relationship can be changed using the options() call on the query. Since you are not doing any filtering on StudyProgram, you can omit the join and set the joinedload option instead.
students = db.session.query(Student).options(db.joinedload('study_programs'))
Now the study_programs relationship will be loaded during the main query, rather than as a separate query. If you do need to join for filtering, you can use the contains_eager option instead.
To access the programs for each student, use the relationship. For example:
for s in students:
    print(s.std_first_name)
    for p in s.study_programs:
        print(p.program)
    print()
If you used joinedload this will not issue any queries except the first one to get the students. If you did not, the default behavior of a relationship is to issue a SELECT when it is accessed, so you will incur one query per student.
The reason for your specific error is that you named the relationship attribute study_programs, not program.
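The difference in query counts described above can be made visible with a small experiment. The sketch below does not use SQLAlchemy: it mimics the two access patterns with the standard-library sqlite3 module and counts SELECT statements through a trace callback. The lower-case table names and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (study_no TEXT PRIMARY KEY, std_first_name TEXT);
    CREATE TABLE study_program (
        study_no TEXT REFERENCES student(study_no),
        program TEXT,
        PRIMARY KEY (study_no, program)
    );
    INSERT INTO student VALUES ('s1', 'Ann'), ('s2', 'Bob'), ('s3', 'Cy');
    INSERT INTO study_program VALUES ('s1', 'Math'), ('s1', 'Physics'), ('s2', 'Biology');
""")

# Record every SELECT the connection runs from here on.
selects = []
conn.set_trace_callback(
    lambda stmt: selects.append(stmt) if stmt.lstrip().upper().startswith("SELECT") else None
)

# Lazy-loading pattern: one query for the students, then one more per student.
lazy = {}
for (study_no,) in conn.execute("SELECT study_no FROM student ORDER BY study_no").fetchall():
    rows = conn.execute(
        "SELECT program FROM study_program WHERE study_no = ? ORDER BY program",
        (study_no,),
    ).fetchall()
    lazy[study_no] = [program for (program,) in rows]
lazy_query_count = len(selects)

# Joined-loading pattern: a single LEFT OUTER JOIN fetches everything at once.
selects.clear()
joined = {}
for study_no, program in conn.execute("""
    SELECT s.study_no, p.program
    FROM student AS s
    LEFT OUTER JOIN study_program AS p ON p.study_no = s.study_no
    ORDER BY s.study_no, p.program
"""):
    joined.setdefault(study_no, [])
    if program is not None:  # students without programs yield a NULL row
        joined[study_no].append(program)
joined_query_count = len(selects)

print(lazy_query_count, joined_query_count)
```

Both patterns produce the same data; only the number of round trips differs, and the gap grows with the number of students.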
I'm trying to set up SQLAlchemy and am running into problems with setting up relationships between tables. Most likely it's a misunderstanding on my part.
A table is set up like so. The important line is the one with two asterisks on either side, setting up the relationship to table "jobs."
class Clocktime(Base):
    """Table for clockin/clockout values
    ForeignKeys exist for Job and Employee
    many to one -> employee
    many to one -> job
    """
    __tablename__ = "clocktimes"
    id = Column(Integer, primary_key=True)
    time_in = Column(DateTime)
    time_out = Column(DateTime)
    employee_id = Column(Integer, ForeignKey('employees.id'))
    **job_id = Column(Integer, ForeignKey('jobs.id'))**
    # employee = many to one relationship with Employee
    # job = many to one relationship with Job

    @property
    def timeworked(self):
        return self.time_out - self.time_in

    def __str__(self):
        formatter="Employee: {employee.name}, "\
                  "Job: {job.abbr}, "\
                  "Start: {self.time_in}, "\
                  "End: {self.time_out}, "\
                  "Hours Worked: {self.timeworked}, "\
                  "ID# {self.id}"
        return formatter.format(employee=self.employee, job=self.job, self=self)
Now, the jobs table follows. Check the asterisked line:
class Job(Base):
    """Table for jobs
    one to many -> clocktimes
    note that rate is cents/hr"""
    __tablename__ = "jobs"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    abbr = Column(String(16))
    rate = Column(Integer) # cents/hr
    **clocktimes = relationship('Clocktime', backref='job', order_by=id)**

    def __str__(self):
        formatter = "Name: {name:<50} {abbr:>23}\n" \
                    "Rate: ${rate:<7.2f}/hr {id:>62}"
        return formatter.format(name=self.name,
                                abbr="Abbr: " + str(self.abbr),
                                rate=self.rate/100.0,
                                id="ID# " + str(self.id))
When a user starts a new task, the following code is executed in order to write the relevant data to tables jobs and clocktimes:
new_task_job = [Job(abbr=abbrev, name=project_name, rate=p_rate), Clocktime(time_in=datetime.datetime.now())]
for i in new_task_job:
    session.add(i)
session.commit()
start_time = datetime.datetime.now()
status = 1
Then, when the user takes a break...
new_break = Clocktime(time_out=datetime.datetime.now())
session.add(new_break)
session.commit()
If you look in the screenshot, the job_id field isn't being populated. Shouldn't it be populated with the primary key (id) from the jobs table, per
job_id = Column(Integer, ForeignKey('jobs.id'))
or am I missing something? I assume I need to write code to do that, but I don't want to break anything that SQLAlchemy is doing in the backend. This should be one job to many clocktimes, since a person can spend several days per task.
Checking out the docs, it looks like you've set up a collection of Clocktime objects on Job called clocktimes, and a .job attribute on Clocktime that refers to the parent Job object.
The expected behaviour is:
>>> c1 = Clocktime()
>>> j1 = Job()
>>> j1.clocktimes
[]
>>> print(c1.job)
None
When you populate j1.clocktimes with an object, you should also see c1.job get a non-None value.
>>> j1.clocktimes.append(c1)
>>> j1.clocktimes
[an instance of `Clocktime`]
>>> c1.job
[an instance of `Job`]
Do you find that behaviour? I don't see in your code where you populate clocktimes so the population of job is never triggered.
I think you are expecting the addition of ForeignKey to the column definition to do something it doesn't do. The ForeignKey constraint you put on job_id simply means that it is constrained to be among the values that exist in the id column of the Jobs table. Check here for more details
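The constraint-only role of a foreign key can be checked at the database level. The sketch below uses the standard-library sqlite3 module rather than SQLAlchemy, with the two tables reduced to a few columns and invented sample values. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite skips FK checks unless asked
conn.executescript("""
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clocktimes (
        id INTEGER PRIMARY KEY,
        time_in TEXT,
        job_id INTEGER REFERENCES jobs(id)
    );
""")

job_id = conn.execute("INSERT INTO jobs (name) VALUES ('Project X')").lastrowid

# Comparable to Clocktime(time_in=...) with no job attached:
# nothing fills job_id in, so it is stored as NULL.
conn.execute("INSERT INTO clocktimes (time_in) VALUES ('09:00')")

# The link exists only when the value is supplied explicitly.
conn.execute("INSERT INTO clocktimes (time_in, job_id) VALUES ('10:00', ?)", (job_id,))

# The constraint itself only rejects values that match no row in jobs.
try:
    conn.execute("INSERT INTO clocktimes (time_in, job_id) VALUES ('11:00', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT time_in, job_id FROM clocktimes ORDER BY id").fetchall()
print(rejected, stored)
```

With the ORM, supplying the value typically means appending the Clocktime to job.clocktimes or assigning its job attribute before committing, as in the session shown above.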