Generating Boolean Expressions from Subqueries in SQLAlchemy

I have the following SQLAlchemy models:
PENDING_STATE = 'pending'
COMPLETE_STATE = 'success'
ERROR_STATE = 'error'
class Assessment(db.Model):
    __tablename__ = 'assessments'
    id = db.Column(db.Integer, primary_key=True)
    state = db.Column(
        db.Enum(PENDING_STATE, COMPLETE_STATE, ERROR_STATE,
                name='assessment_state'),
        default=PENDING_STATE,
        nullable=False,
        index=True)
    test_results = db.relationship("TestResult")
class TestResult(db.Model):
    __tablename__ = 'test_results'
    name = db.Column(db.String, primary_key=True)
    state = db.Column(
        db.Enum(PENDING_STATE, COMPLETE_STATE, ERROR_STATE,
                name='test_result_state_state'),
        default=PENDING_STATE,
        nullable=False,
        index=True)
    assessment_id = db.Column(
        db.Integer,
        db.ForeignKey(
            'assessments.id', onupdate='CASCADE', ondelete='CASCADE'),
        primary_key=True)
And I am trying to implement logic to update an assessment to the error state if any of its test results are in the error state and update the assessment to the success state if all of its test results are in the success state.
I can write raw SQL like this:
SELECT 'error'
FROM assessments
WHERE assessments.state = 'error' OR 'error' IN (
SELECT test_results.state
FROM test_results
WHERE test_results.assessment_id = 1);
But I don't know how to translate that into SQLAlchemy. I'd think that subquery would be something like:
(select([test_results.state]).where(test_results.assessment_id == 1)).in_('error')
but I can't find any way to compare query results against literals like I'm doing in the raw SQL. I swear I must be missing something, but I'm just not seeing a way to write queries which return boolean expressions, which I think is fundamentally what I'm butting up against. Just something as simple as:
SELECT 'a' = 'b'
Seems to be absent from the documentation.
Any ideas on how to express this state change in SQLAlchemy? I'd also be perfectly open to rethinking my schemas if it looks like I'm going about this in a silly way.
Thanks!

The query below should do it for the error check. Keep in mind that no rows are returned when the assessment is not in error. Here sid is the id of the assessment being checked.
from sqlalchemy import literal_column, or_
q = (db.session.query(literal_column("'error'"))
     .select_from(Assessment)
     .filter(Assessment.id == sid)
     .filter(or_(
         Assessment.state == ERROR_STATE,
         Assessment.test_results.any(TestResult.state == ERROR_STATE),
     )))
If you wish to do a similar check for success, you could check whether any TestResult is not in the success state and negate the boolean result.
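Both checks can be sketched together. This is a minimal sketch, assuming plain SQLAlchemy 1.4 or later with an in-memory SQLite database instead of the Flask-SQLAlchemy db object. The models are cut down to plain string columns, and the names is_error and is_success are made up for illustration. It also shows that a comparison of two literals, the equivalent of SELECT 'a' = 'b', can be selected directly.

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, and_,
                        create_engine, literal, or_, select)
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Assessment(Base):
    __tablename__ = "assessments"
    id = Column(Integer, primary_key=True)
    state = Column(String, nullable=False, default="pending")
    test_results = relationship("TestResult")

class TestResult(Base):
    __tablename__ = "test_results"
    name = Column(String, primary_key=True)
    state = Column(String, nullable=False, default="pending")
    assessment_id = Column(ForeignKey("assessments.id"), primary_key=True)

# Boolean expressions that can be selected or filtered on like any column.
is_error = or_(
    Assessment.state == "error",
    Assessment.test_results.any(TestResult.state == "error"),
)
# "All succeeded" is expressed as: at least one result exists,
# and no result exists whose state is something other than 'success'.
is_success = and_(
    Assessment.test_results.any(),
    ~Assessment.test_results.any(TestResult.state != "success"),
)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Assessment(id=1, test_results=[TestResult(name="a", state="success"),
                                       TestResult(name="b", state="error")]),
        Assessment(id=2, test_results=[TestResult(name="a", state="success")]),
    ])
    session.commit()
    rows = session.execute(
        select(Assessment.id, is_error, is_success).order_by(Assessment.id)
    ).all()
    flags = [(row[0], bool(row[1]), bool(row[2])) for row in rows]
    # The equivalent of SELECT 'a' = 'b'
    a_equals_b = bool(
        session.execute(select(literal("a") == literal("b"))).scalar()
    )
```

The same expressions could presumably be used in the WHERE clause of an UPDATE on assessments, though that is not tried here.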

I actually ended up doing this with postgres triggers, which is probably the better way to handle state updates. So for the error case, I've got:
sqlalchemy.event.listen(TestResult.__table__, 'after_create', sqlalchemy.DDL("""
    CREATE OR REPLACE FUNCTION set_assessment_failure() RETURNS trigger AS $$
    BEGIN
        UPDATE assessments
        SET state='error'
        WHERE id=NEW.assessment_id;
        RETURN NEW;
    END;
    $$ LANGUAGE 'plpgsql';
    CREATE TRIGGER assessment_failure
    AFTER INSERT OR UPDATE OF state ON test_results
    FOR EACH ROW
    WHEN (NEW.state = 'error')
    EXECUTE PROCEDURE set_assessment_failure();"""))
And something similar for the 'success' case where I count the number of test results vs the number of successful test results.
Credit to van for answering my question as I asked it, though! Thanks, I hadn't bumped into relationship.any before.
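The count comparison mentioned for the 'success' case can also be written as a query. The trigger code for that case is not shown in the post, so the following is only a sketch of the counting idea, assuming SQLAlchemy 1.4 or later and an in-memory SQLite table standing in for test_results. The helper name all_succeeded is hypothetical.

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table, case,
                        create_engine, func, select)

metadata = MetaData()
test_results = Table(
    "test_results", metadata,
    Column("name", String, primary_key=True),
    Column("assessment_id", Integer, primary_key=True),
    Column("state", String, nullable=False),
)

def all_succeeded(conn, assessment_id):
    """Return True when the assessment has results and all of them succeeded."""
    total = func.count()
    # Adds 1 for every row whose state is 'success', 0 otherwise.
    succeeded = func.sum(case((test_results.c.state == "success", 1), else_=0))
    row = conn.execute(
        select(total, succeeded)
        .where(test_results.c.assessment_id == assessment_id)
    ).one()
    return row[0] > 0 and row[0] == row[1]

engine = create_engine("sqlite://")
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(test_results.insert(), [
        {"name": "a", "assessment_id": 1, "state": "success"},
        {"name": "b", "assessment_id": 1, "state": "success"},
        {"name": "a", "assessment_id": 2, "state": "success"},
        {"name": "b", "assessment_id": 2, "state": "pending"},
    ])
    first_done = all_succeeded(conn, 1)
    second_done = all_succeeded(conn, 2)
```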

Related

SQLAlchemy: How to output a calculated field that takes in parameters from the user as input

I have a model something like this:
class User:
    __tablename__ = "user"
    first_name = Column(String(), nullable=False)
    last_name = Column(String(), nullable=False)
I then have a query like this:
name = ["...", ...]
user_query = (
    db.session.query(User)
    .add_column(case([
        (and_(User.first_name.ilike(f'%{name}%'), User.last_name.ilike(f'%{name}%')), True),
        (or_(User.first_name.ilike(f'%{name}%'), User.last_name.ilike(f'%{name}%')), False)
    ]).label('perfect_match'))
    .order_by(literal_column("'perfect_match'"))
)
This is obviously an over-simplification, but essentially I'm trying to do a search across fields and sort the perfect matches separately from the imperfect matches.
This is giving me an error saying that the column "perfect_match" doesn't exist, so I think I'm using add_column incorrectly here.
I also tried using hybrid_method like so:
@hybrid_method
def perfect_match(self, terms):
    perfect_match = True
    matched = False
    for term in terms:
        if term.lower() in self.first_name.lower() or term.lower() in self.last_name.lower():
            matched = True
            continue
        perfect_match = False
    return perfect_match if matched else None
@perfect_match.expression
def perfect_match(self, terms):
    perfect_match = and_(...)
    imperfect_match = or_(...)
    return case(...)
And then my query looks like this:
name = [...]
user_query = (
    db.session.query(User)
    .filter(User.perfect_match(name) != None)
    .order_by(User.perfect_match(name))
)
I want perfect_match to be in the output. Here's my desired SQL:
SELECT
first_name,
last_name,
case (
...
) as perfect_match
FROM user
WHERE perfect_match != NULL
ORDER BY perfect_match
The first approach (using add_column) gives me that SQL, but SQLAlchemy raises an error saying that the column perfect_match does not exist. The second (using hybrid_method) puts the case in the WHERE and ORDER BY clauses (probably inefficient?) and doesn't include perfect_match in the output, but it does seem to work properly. I need that output column to determine where the perfect matches end.
I've tried:
adding perfect_match as a Column(Boolean)
adding perfect_match as a column_property()
adding perfect_match as .query(User, literal_column("'perfect_match'"))
Any thoughts? Most of the examples I've seen use hybrid_property, but I need to take in an argument here. I'm pretty new to SQLAlchemy, so I probably missed something.

I had a similar problem: I needed a virtual property based on a parameter that is only available at runtime, in my case just a boolean value.
I tried the approach from "SQLAlchemy hybrid method as an attribute in the query result?" and it didn't work. I wanted the conversion from model to schema to be more direct, so I did this.
Define a column named perfect_match as query_expression() in your model:
class User(BaseModel):
    ...
    perfect_match = query_expression()
    ...
Then when building the query or statement, define an expression, in your case:
exp = case([
    (and_(User.first_name.ilike(f'%{name}%'), User.last_name.ilike(f'%{name}%')), True),
    (or_(User.first_name.ilike(f'%{name}%'), User.last_name.ilike(f'%{name}%')), False)])
query = (
    db.session.query(User)
    # populate User.perfect_match from exp for the rows loaded by this query
    .options(with_expression(User.perfect_match, exp))
    # filter and sort on the expression itself; the mapped attribute is not callable
    .filter(exp != None)
    .order_by(exp)
)
return query.all()
It works using statements also, same syntax. Pydantic should recognize that field as a column.
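The statement form can be sketched end to end as follows. This is a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The id column, the search helper and the sample names are assumptions added so the example runs; they are not part of the original model.

```python
from sqlalchemy import (Column, Integer, String, and_, case, create_engine,
                        or_, select)
from sqlalchemy.orm import (Session, declarative_base, query_expression,
                            with_expression)

Base = declarative_base()

class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)  # assumed surrogate key
    first_name = Column(String, nullable=False)
    last_name = Column(String, nullable=False)
    # Placeholder that is only filled when a query supplies an expression.
    perfect_match = query_expression()

def search(session, term):
    pattern = f"%{term}%"
    first = User.first_name.ilike(pattern)
    last = User.last_name.ilike(pattern)
    # True when both names match, False when only one does, NULL otherwise.
    match_expr = case((and_(first, last), True), (or_(first, last), False))
    stmt = (
        select(User)
        .options(with_expression(User.perfect_match, match_expr))
        .where(match_expr.is_not(None))
        .order_by(match_expr.desc(), User.id)  # perfect matches first
    )
    return session.execute(stmt).scalars().all()

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        User(id=1, first_name="Ann", last_name="Smith"),
        User(id=2, first_name="Anna", last_name="Hannan"),
        User(id=3, first_name="Bob", last_name="Jones"),
    ])
    session.commit()

# A fresh session, so every row is loaded anew by the search query.
with Session(engine) as session:
    found = [(u.first_name, bool(u.perfect_match)) for u in search(session, "ann")]
```

The second session is deliberate: as far as I know, with_expression only fills the attribute on objects loaded by that query, so objects already present in the session may keep their previous value unless populate_existing is used.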

SQLalchemy custom String primary_key sequence

For the life of me, I cannot think of a simple way to accomplish this without querying the database whenever a new record is created, but this is what I'm trying to do with SQLAlchemy and PostgreSQL:
I would like to have a primary key of a given table follow this format:
YYWW0001, YYWW0002 etc. such that I see values like 20010001, 20010002 such that the last four digits are only incremented within the given week of the year, then resetting when a new week or year is entered.
I'm at the limit of my knowledge here so any help is greatly appreciated!
In the meantime, I am looking into sqlalchemy.schema.Sequence.
Another thing I can think of trying is to create a table of, say, 10,000 records that each have a plain Integer primary key and the actual ID I want, and then find some sort of 'next' method to pull from that table when my Core object is constructed. This seems less than ideal to me, since I would still need to ensure that the date portion of the ID in the table is correct and current. I think a dynamic approach would suit my needs best.
So far my naive implementation looks like this:
BASE = declarative_base()
_name = os.environ.get('HW_QC_USER_ID', None)
_pass = os.environ.get('HW_QC_USER_PASS', None)
_ip = os.environ.get('HW_QC_SERVER_IP', None)
_db_name = 'mock'
try:
    print('Creating engine')
    engine = create_engine(
        f'postgresql://{_name}:{_pass}@{_ip}/{_db_name}',
        echo=False
    )
except OperationalError as _e:
    print('An Error has occurred when connecting to the database')
    print(f'postgresql://{_name}:{_pass}@{_ip}/{_db_name}')
    print(_e)
class Core(BASE):
    """
    This class describes a master table.
    """
    __tablename__ = 'cores'
    udi = Column(String(11), primary_key=True, unique=True)  # <-- how do I get this to be the format described?
    _date_code = Column(
        String(4),
        # a callable, so the week code is computed per insert rather than once at import time
        default=lambda: datetime.datetime.now().strftime("%y%U")
    )
BASE.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
date_code_now = datetime.datetime.now().strftime("%y%U")
cores_from_this_week = session.query(Core).filter(
    Core._date_code == date_code_now
).all()
num_cores_existing = len(cores_from_this_week)
new_core = Core(
    udi=f'FRA{date_code_now}{num_cores_existing+1:04}'
)
session.add(new_core)
session.commit()
session.close()
engine.dispose()
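The formatting and weekly reset can be separated from the database lookup. The sketch below uses only the standard library and assumes the caller passes in the most recent existing ID (for example the highest udi currently stored); the function name next_udi is hypothetical. It does not address two sessions creating a record at the same moment, which would still need a lock or a retry on a primary key conflict.

```python
import datetime
from typing import Optional

PREFIX = "FRA"  # prefix taken from the question's example code

def next_udi(last_udi: Optional[str], today: datetime.date) -> str:
    """Build the next ID of the form PREFIX + YYWW + 4-digit counter."""
    date_code = today.strftime("%y%U")  # two-digit year + week of the year
    counter = 1
    # Continue counting only if the latest ID belongs to the current week;
    # otherwise the counter restarts at 0001.
    if last_udi is not None and last_udi[len(PREFIX):len(PREFIX) + 4] == date_code:
        counter = int(last_udi[-4:]) + 1
    if counter > 9999:
        raise ValueError("more than 9999 IDs requested for one week")
    return f"{PREFIX}{date_code}{counter:04}"

monday = datetime.date(2020, 1, 6)
first = next_udi(None, monday)
second = next_udi(first, monday)
after_reset = next_udi(second, monday + datetime.timedelta(days=7))
```

With these inputs, first is FRA20010001, second is FRA20010002, and the ID for the following week starts again at FRA20020001.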

AttributeError: 'NoneType' object has no attribute 'time_recorded' in Flask, SQLAlchemy

I have an api endpoint that passes a variable which is used to make a call in the database. For some reason it cannot run the query yet the syntax is correct. My code is below.
@app.route('/api/update/<lastqnid>')
def check_new_entries(lastqnid):
    result = Trades.query.filter_by(id=lastqnid).first()
    new_entries = Trades.query.filter(Trades.time_recorded > result.time_recorded).all()
The id field is:
id = db.Column(db.String,default=lambda: str(uuid4().hex), primary_key=True)
I have tried filter instead of filter_by and it does not work. When I remove the filter_by(id=lastqnid) it works. What could be the reason it is not running the query?
The Trades table I am querying from is:
class Trades(db.Model):
    id = db.Column(db.String,default=lambda: str(uuid4().hex), primary_key=True)
    amount = db.Column(db.Integer, unique=False)
    time_recorded = db.Column(db.DateTime, unique=False)

The issue you seem to be having is that you don't check whether the query found anything before using the result:
@app.route('/api/update/<lastqnid>')
def check_new_entries(lastqnid):
    result = Trades.query.filter_by(id=lastqnid).first()
    # Here result may very well be None, so we can make an escape here
    if result is None:
        # You may not want to do exactly this, but this is an example
        print("No Trades found with id=%s" % lastqnid)
        return redirect(request.referrer)
    new_entries = Trades.query.filter(Trades.time_recorded > result.time_recorded).all()

SQLAlchemy: hybrid_property expression and subquery

I am trying to do a complex hybrid_property using SQLAlchemy: my model is
class Consultation(Table):
    patient_id = Column(Integer)
    patient = relationship('Patient', backref=backref('consultations', lazy='dynamic'))
class Exam(Table):
    consultation_id = Column(Integer)
    consultation = relationship('Consultation', backref=backref('exams', lazy='dynamic'))
class VitalSign(Table):
    exam_id = Column(Integer)
    exam = relationship('Exam', backref=backref('vital', lazy='dynamic'))
    vital_type = Column(String)
    value = Column(String)
class Patient(Table):
    patient_data = Column(String)
    @hybrid_property
    def last_consultation_validity(self):
        last_consultation = self.consultations.order_by(Consultation.created_at.desc()).first()
        if last_consultation:
            last_consultation_conclusions = last_consultation.exams.filter_by(exam_type='conclusions').first()
            if last_consultation_conclusions:
                last_consultation_validity = last_consultation_conclusions.vital_signs.filter_by(sign_type='validity_date').first()
                if last_consultation_validity:
                    return last_consultation_validity
        return None
    @last_consultation_validity.expression
    def last_consultation_validity(cls):
        subquery = select([Consultation.id.label('last_consultation_id')]).\
            where(Consultation.patient_id == cls.id).\
            order_by(Consultation.created_at.desc()).limit(1)
        j = join(VitalSign, Exam).join(Consultation)
        return select([VitalSign.value]).select_from(j).select_from(subquery).\
            where(and_(Consultation.id == subquery.c.last_consultation_id, VitalSign.sign_type == 'validity_date'))
As you can see my model is quite complicated.
Patients get Consultations. Exams and VitalSigns are cascading data for the Consultations. The idea is that all consultations do not get a validity but that new consultations make the previous consultations validity not interesting: I only want the validity from the last consultation; if a patient has a validity in previous consultations, I'm not interested.
What I would like to do is to be able to order by the hybrid_property last_consultation_validity.
The output SQL looks ok to me:
SELECT vital_sign.value
FROM (SELECT consultation.id AS last_consultation_id
FROM consultation, patient
WHERE consultation.patient_id = patient.id ORDER BY consultation.created_at DESC
LIMIT ? OFFSET ?), vital_sign JOIN exam ON exam.id = vital_sign.exam_id JOIN consultation ON consultation.id = exam.consultation_id
WHERE consultation.id = last_consultation_id AND vital_sign.sign_type = ?
But when I order the patients by last_consultation_validity, the rows do not get ordered ...
When I execute the same select outside of the hybrid_property to retrieve the date for each patient (just setting the patient.id), I get the correct values. Surprisingly, the SQL is slightly different in that case: patient no longer appears in the FROM clause of the SELECT.
So I'm actually wondering if this is a bug in SQLAlchemy or if I'm doing something wrong ... Any help would be greatly appreciated.
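The extra patient in the FROM clause suggests that the subquery is not being correlated to the outer patient row. One way to get per-patient values is a correlated scalar subquery. The following is only a sketch under strong simplifications, not a fix verified against the model above: it assumes SQLAlchemy 1.4 or later, an in-memory SQLite database, and a validity column stored directly on the consultation instead of going through Exam and VitalSign. The attribute name last_validity is made up.

```python
import datetime
from sqlalchemy import (Column, Date, DateTime, ForeignKey, Integer, String,
                        create_engine, select)
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Consultation(Base):
    __tablename__ = "consultation"
    id = Column(Integer, primary_key=True)
    patient_id = Column(ForeignKey("patient.id"))
    created_at = Column(DateTime)
    validity = Column(Date)  # simplification: stored on the consultation itself

class Patient(Base):
    __tablename__ = "patient"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    consultations = relationship(Consultation, order_by=Consultation.created_at)

    @hybrid_property
    def last_validity(self):
        return self.consultations[-1].validity if self.consultations else None

    @last_validity.expression
    def last_validity(cls):
        # Correlated scalar subquery: patient stays out of the subquery's
        # FROM list, so it is evaluated per patient row of the outer query.
        return (
            select(Consultation.validity)
            .where(Consultation.patient_id == cls.id)
            .order_by(Consultation.created_at.desc())
            .limit(1)
            .correlate(cls)
            .scalar_subquery()
        )

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
d, t = datetime.date, datetime.datetime
with Session(engine) as session:
    session.add_all([
        Patient(name="A", consultations=[
            Consultation(created_at=t(2020, 1, 1), validity=d(2020, 6, 1)),
            Consultation(created_at=t(2021, 1, 1), validity=d(2023, 1, 1))]),
        Patient(name="B", consultations=[
            Consultation(created_at=t(2020, 5, 1), validity=d(2022, 1, 1))]),
    ])
    session.commit()
    ordered = session.execute(
        select(Patient.name).order_by(Patient.last_validity)
    ).scalars().all()
```

Patient B sorts before patient A here because only the latest consultation of each patient is considered. Whether the same shape carries over cleanly to the three-level Consultation, Exam and VitalSign model has not been checked.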

Sqlalchemy ID field isn't populated when relationship with another table is set up

I'm trying to set up Sqlalchemy and am running into problems with setting up relationships between tables. Most likely it's misunderstanding on my part.
A table is set up like so. The important line is the one with two asterisks on either side, setting up the relationship to table "jobs."
class Clocktime(Base):
    """Table for clockin/clockout values
    ForeignKeys exist for Job and Employee
    many to one -> employee
    many to one -> job
    """
    __tablename__ = "clocktimes"
    id = Column(Integer, primary_key=True)
    time_in = Column(DateTime)
    time_out = Column(DateTime)
    employee_id = Column(Integer, ForeignKey('employees.id'))
    **job_id = Column(Integer, ForeignKey('jobs.id'))**
    # employee = many to one relationship with Employee
    # job = many to one relationship with Job
    @property
    def timeworked(self):
        return self.time_out - self.time_in
    @property
    def __str__(self):
        formatter="Employee: {employee.name}, "\
                  "Job: {job.abbr}, "\
                  "Start: {self.time_in}, "\
                  "End: {self.time_out}, "\
                  "Hours Worked: {self.timeworked}, "\
                  "ID# {self.id}"
        return formatter.format(employee=self.employee, job=self.job, self=self)
Now, the jobs table follows. Check the asterisked line:
class Job(Base):
    """Table for jobs
    one to many -> clocktimes
    note that rate is cents/hr"""
    __tablename__ = "jobs"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    abbr = Column(String(16))
    rate = Column(Integer) # cents/hr
    **clocktimes = relationship('Clocktime', backref='job', order_by=id)**
    def __str__(self):
        formatter = "Name: {name:<50} {abbr:>23}\n" \
                    "Rate: ${rate:<7.2f}/hr {id:>62}"
        return formatter.format(name=self.name,
                                abbr="Abbr: " + str(self.abbr),
                                rate=self.rate/100.0,
                                id="ID# " + str(self.id))
When a user starts a new task, the following code is executed in order to write the relevant data to tables jobs and clocktimes:
new_task_job = [Job(abbr=abbrev, name=project_name, rate=p_rate), Clocktime(time_in=datetime.datetime.now())]
for i in new_task_job:
    session.add(i)
session.commit()
start_time = datetime.datetime.now()
status = 1
Then, when the user takes a break...
new_break = Clocktime(time_out=datetime.datetime.now())
session.add(new_break)
session.commit()
If you look in the screenshot, the job_id field isn't being populated. Shouldn't it be populated with the primary key (id) from the jobs table, per
job_id = Column(Integer, ForeignKey('jobs.id'))
or am I missing something? I'm assuming that I'm to write code to do that, but I don't want to break anything that Sqlalchemy is trying to do in the backend. This should be a one job to many clocktimes, since a person can spend several days per task.

Checking the docs, it looks like you've set up a collection of Clocktime objects on Job called clocktimes, and a .job attribute on Clocktime that refers to the parent Job object.
The expected behaviour is:
>>> c1 = Clocktime()
>>> j1 = Job()
>>> j1.clocktimes
[]
>>> print(c1.job)
None
When you populate j1.clocktimes with an object, you should also see c1.job get a non-None value:
>>> j1.clocktimes.append(c1)
>>> j1.clocktimes
[an instance of `Clocktime`]
>>> c1.job
an instance of `Job`
Do you find that behaviour? I don't see where your code populates clocktimes, so the population of job is never triggered.
I think you are expecting the addition of ForeignKey to the column definition to do something it doesn't do. The ForeignKey constraint on job_id simply means that its values are constrained to those that exist in the id column of the jobs table. It does not pick a job for you: job_id is only filled in when you associate the Clocktime with a Job, either through the relationship or by assigning job_id directly.
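The behaviour described above can be checked with a short script. This is a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database, with the two models reduced to the columns that matter here.

```python
import datetime
from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String,
                        create_engine)
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Job(Base):
    __tablename__ = "jobs"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    clocktimes = relationship("Clocktime", backref="job")

class Clocktime(Base):
    __tablename__ = "clocktimes"
    id = Column(Integer, primary_key=True)
    time_in = Column(DateTime)
    job_id = Column(Integer, ForeignKey("jobs.id"))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    job = Job(name="demo")
    linked = Clocktime(time_in=datetime.datetime.now())
    unlinked = Clocktime(time_in=datetime.datetime.now())
    job.clocktimes.append(linked)  # associates the two objects in Python
    session.add_all([job, unlinked])
    session.commit()  # the flush copies job.id into linked.job_id
    linked_job_id = linked.job_id
    job_id = job.id
    unlinked_job_id = unlinked.job_id
```

Only the Clocktime that was appended to job.clocktimes ends up with a job_id; the one that was merely added to the session keeps job_id as None, which matches what the question describes.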
