Trouble getting pylint to find inherited methods in Pylons/SA models

I have a Pylons app that uses SqlAlchemy declarative models. To make the code a bit cleaner I add a .query property onto the SA Base and inherit all my models from it.
So in my app.model.meta I have:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Query, scoped_session, sessionmaker

Base = declarative_base()
metadata = Base.metadata
Session = scoped_session(sessionmaker())
Base.query = Session.query_property(Query)
I then inherit this into app.model.mymodel and declare it as a child of meta.Base. This lets me write my queries as:
mymodel.query.filter(mymodel.id == 3).all()
The trouble is that pylint is not seeing .query as a valid attribute of my models.
E:102:JobCounter.reset_count: Class 'JobCounter' has no 'query' member
Obviously this error is all over the place, since it occurs on any model doing any query. I don't want to just skip the error, because it might point out a real problem down the road on non-ORM classes, but I must be missing something for pylint to accept this.
Any hints?

Best I could find for this is to pass pylint a list of classes to ignore this check on. It'll still do other checks for these classes; you'll just have to maintain the list somewhere:
pylint --ignored-classes=MyModel1,MyModel2 myfile.py
I know it's not ideal, but there's something about the way that sqlalchemy sets up the models that confuses pylint. At least with this you still get the check for non-orm classes.
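If you'd rather not repeat the list on every invocation, the same setting can live in a pylintrc file. A minimal sketch (in recent pylint versions ignored-classes sits in the TYPECHECK section, so check the docs for your version):
[TYPECHECK]
# member checks (no-member / E1101) are skipped for these classes;
# all other checks still run against them
ignored-classes=MyModel1,MyModel2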

Related

How to find AssociationProxy declared_attr name in SQLAlchemy from inspection with all_orm_descriptors

I have a SQLAlchemy table class that has an association proxy defined (through a mixin) as below:
@declared_attr
def nationality_name(self):
    return association_proxy("nationality", "name")
I would like to get the name of this attribute using the SQLAlchemy inspection methods. I've tried the following:
for column in mapper.all_orm_descriptors:
    print(column.key)
but this gives the name of the association proxy as _AssociationProxy_nationality_4587917712, not the name that I actually gave it, which was nationality_name. I can't seem to find any other attributes that I could use to get the name it's actually been given.
Is there a way to get this name through SQLAlchemy inspection?
It would probably be possible to fall back on just using dir on the table class, but that seems rather fragile (and I'd have to filter out all of the attributes of the class that aren't column names or association proxies).
In searching StackOverflow, I've found this question where someone says "I worked around it by getting the name of the proxy, and then using getattr" - but I don't know how they got the name. Possibly they weren't using a declared_attr, which might be making this more complicated.
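One approach that may work here, sketched under the assumption that all_orm_descriptors is a dict-like collection keyed by the attribute name (MyTable stands in for the real mapped class): iterate over its items instead of reading each descriptor's key, and recognize association proxies by their extension_type:
from sqlalchemy import inspect
from sqlalchemy.ext.associationproxy import ASSOCIATION_PROXY

mapper = inspect(MyTable)  # MyTable is a stand-in for the actual mapped class
for name, descriptor in mapper.all_orm_descriptors.items():
    if descriptor.extension_type is ASSOCIATION_PROXY:
        print(name)  # should print "nationality_name" rather than the mangled name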

What's the difference between Model.query and session.query(Model) in SQLAlchemy?

I'm a beginner in SQLAlchemy and have found that a query can be written in two ways:
Approach 1:
DBSession = scoped_session(sessionmaker())
class _Base(object):
    query = DBSession.query_property()
Base = declarative_base(cls=_Base)

class SomeModel(Base):
    key = Column(Unicode, primary_key=True)
    value = Column(Unicode)

# When querying
result = SomeModel.query.filter(...)
Approach 2:
DBSession = scoped_session(sessionmaker())
Base = declarative_base()

class SomeModel(Base):
    key = Column(Unicode, primary_key=True)
    value = Column(Unicode)

# When querying
session = DBSession()
result = session.query(SomeModel).filter(...)
Is there any difference between them?
In the code above, there is no difference. This is because, in line 3 of the first example:
the query property is explicitly bound to DBSession
there is no custom Query object passed to query_property
As @petr-viktorin points out in the answer here, there must be a session available before you define your model in the first example, which might be problematic depending on the structure of your application.
If, however, you need a custom query that adds additional query parameters automatically to all queries, then only the first example will allow that. A custom query class that inherits from sqlalchemy.orm.query.Query can be passed as an argument to query_property. This question shows an example of that pattern.
Even if a model object has a custom query property defined on it, that property is not used when querying with session.query, as in the last line of the second example. This means something like the first example is the only option if you need a custom query class.
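For illustration, a minimal sketch of that pattern; the FilteredQuery name and its undeleted() helper are invented for this example, and a fully automatic filter would need to override more of Query than this:
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.orm.query import Query

class FilteredQuery(Query):
    # hypothetical custom query class; every Model.query becomes an instance of it
    def undeleted(self):
        return self.filter_by(deleted=False)

DBSession = scoped_session(sessionmaker())

class _Base(object):
    # pass the custom class so Model.query returns a FilteredQuery
    query = DBSession.query_property(query_cls=FilteredQuery)
With that in place, SomeModel.query.undeleted().all() works anywhere the query property does, while session.query(SomeModel) still hands back a plain Query.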
I see these downsides to query_property:
You cannot use it on a different Session than the one you've configured (though you could always use session.query then).
You need a session object available before you define your schema.
These could bite you when you want to write tests, for example.
Also, session.query fits better with how SQLAlchemy works; query_property looks like it's just added on top for convenience (or similarity with other systems?).
I'd recommend you stick to session.query.
An answer (here) to a different SQLAlchemy question might help. That answer starts with:
You can use Model.query, because the Model (or usually its base class, especially in cases where declarative extension is used) is assigned Session.query_property. In this case the Model.query is equivalent to Session.query(Model).

SqlAlchemy optimizations for read-only object models

I have a complex network of objects being spawned from a sqlite database using sqlalchemy ORM mappings. I have quite a few deeply nested loops:
for parent in owner.collection:
    for child in parent.collection:
        for foo in child.collection:
            # do lots of calcs with foo.property
My profiling is showing me that the sqlalchemy instrumentation is taking a lot of time in this use case.
The thing is: I don't ever change the object model (mapped properties) at runtime, so once they are loaded I don't NEED the instrumentation, or indeed any sqlalchemy overhead at all. After much research, I'm thinking I might have to clone a 'pure python' set of objects from my already loaded 'instrumented objects', but that would be a pain.
Performance is really crucial here (it's a simulator), so maybe writing those layers as C extensions using sqlite api directly would be best. Any thoughts?
If you reference a single attribute of a single instance lots of times, a simple trick is to store it in a local variable.
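For instance (a toy illustration; samples and calc are made up):
value = foo.property  # pay the instrumented attribute lookup once
results = [calc(value, x) for x in samples]  # the loop only touches the plain local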
If you want a way to create cheap pure python clones, share the dict object with the original object:
class CheapClone(object):
    def __init__(self, original):
        self.__dict__ = original.__dict__
Creating a copy like this costs about half as much as a single instrumented attribute access, and afterwards attribute lookups are as fast as on a normal object.
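A hypothetical use against the loops from the question (this only helps once the attributes have actually been loaded):
# clone each foo once, then do the heavy math on the plain objects
plain_foos = [CheapClone(foo) for foo in child.collection]
for foo in plain_foos:
    ...  # foo.property is now an ordinary attribute lookup, no SA instrumentation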
There might also be a way to have the mapper create instances of an uninstrumented class instead of the instrumented one. If I have some time, I might take a look how deeply ingrained is the assumption that populated instances are of the same type as the instrumented class.
Found a quick and dirty way that seems to at least somewhat work on 0.5.8 and 0.6. I didn't test it with inheritance or other features that might interact badly. Also, this touches some non-public APIs, so beware of breakage when changing versions.
from sqlalchemy.orm.attributes import ClassManager, instrumentation_registry

class ReadonlyClassManager(ClassManager):
    """Enables configuring a mapper to return instances of uninstrumented
    classes instead. To use, add a readonly_type attribute referencing the
    desired class to use instead of the instrumented one."""

    def __init__(self, class_):
        ClassManager.__init__(self, class_)
        self.readonly_version = getattr(class_, 'readonly_type', None)
        if self.readonly_version:
            # default instantiation logic doesn't know to install finders
            # for our alternate class
            instrumentation_registry._dict_finders[self.readonly_version] = self.dict_getter()
            instrumentation_registry._state_finders[self.readonly_version] = self.state_getter()

    def new_instance(self, state=None):
        if self.readonly_version:
            instance = self.readonly_version.__new__(self.readonly_version)
            self.setup_instance(instance, state)
            return instance
        return ClassManager.new_instance(self, state)

Base = declarative_base()
Base.__sa_instrumentation_manager__ = ReadonlyClassManager
Usage example:
class ReadonlyFoo(object):
    pass

class Foo(Base, ReadonlyFoo):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    name = Column(String(32))
    readonly_type = ReadonlyFoo

assert type(session.query(Foo).first()) is ReadonlyFoo
You should be able to disable lazy loading on the relationships in question and sqlalchemy will fetch them all in a single query.
Try using a single query with JOINs instead of the python loops.
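Both suggestions point the same direction; a sketch using joinedload, where Owner, Parent and Child stand in for the real models and relationship names from the question:
from sqlalchemy.orm import joinedload

# load the whole three-level object graph in a single round trip
owners = (
    session.query(Owner)
    .options(
        joinedload(Owner.collection)
        .joinedload(Parent.collection)
        .joinedload(Child.collection)
    )
    .all()
)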

Should I create mapper objects or use the declarative syntax in SQLAlchemy?

There are two (three, but I'm not counting Elixir, as it's not "official") ways to define a persisting object with SQLAlchemy:
Explicit syntax for mapper objects
from sqlalchemy import Table, Column, Integer, String, MetaData, ForeignKey
from sqlalchemy.orm import mapper

metadata = MetaData()
users_table = Table('users', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String),
)

class User(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<User('%s')>" % (self.name)

mapper(User, users_table)  # <Mapper at 0x...; User>
Declarative syntax
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<User('%s')>" % (self.name)
I can see that when using mapper objects I completely separate the ORM definition from the business logic, while when using the declarative syntax the two live together, so whenever I modify the business logic class I can edit the database definition right there (which ideally should need little editing).
What I'm not completely sure about is which approach is more maintainable for a business application.
I haven't been able to find a comparison of the two mapping methods that would let me decide which one is a better fit for my project.
I'm leaning towards using the "normal" way (i.e. not the declarative extension) as it allows me to "hide" all the ORM logic and keep it out of the business view, but I'd like to hear compelling arguments for both approaches.
"What I'm not completely sure, is which approach is more maintainable for a business application?"
Can't be answered in general.
However, consider this.
The Django ORM is strictly declarative -- and people like that.
SQLAlchemy does several things, not all of which are relevant to all problems.
SQLAlchemy creates DB-specific SQL from general purpose Python. If you want to mess with the SQL, or map Python classes to existing tables, then you have to use explicit mappings, because your focus is on the SQL, not the business objects and the ORM.
SQLAlchemy can use declarative style (like Django) to create everything for you. If you want this, then you are giving up explicitly writing table definitions and explicitly messing with the SQL.
Elixir is an alternative to save you having to look at SQL.
The fundamental question is "Do you want to see and touch the SQL?"
If you think that touching the SQL makes things more "maintainable", then you have to use explicit mappings.
If you think that concealing the SQL makes things more "maintainable", then you have to use declarative style.
If you think Elixir might diverge from SQLAlchemy, or fail to live up to its promise in some way, then don't use it.
If you think Elixir will help you, then use it.
In our team we settled on declarative syntax.
Rationale:
metadata is trivial to get to, if needed: User.metadata.
Your User class, by virtue of subclassing Base, has a nice ctor that takes kwargs for all fields. Useful for testing and otherwise. E.g.: user=User(name='doe', password='42'). So no need to write a ctor!
If you add an attribute/column, you only need to do it once. "Don't Repeat Yourself" is a nice principle.
Regarding "keeping out ORM from business view": in reality your User class, defined in a "normal" way, gets seriously monkey-patched by SA when mapper function has its way with it. IMHO, declarative way is more honest because it screams: "this class is used in ORM scenarios, and may not be treated just as you would treat your simple non-ORM objects".
I've found that using mapper objects is much simpler than declarative syntax if you use sqlalchemy-migrate to version your database schema (and this is a must-have for a business application from my point of view). If you are using mapper objects you can simply copy/paste your table declarations to migration versions, and use a simple API to modify tables in the database. Declarative syntax makes this harder because you have to filter away all the helper functions from your class definitions after copying them to the migration version.
Also, it seems to me that complex relations between tables are expressed more clearly with mapper objects syntax, but this may be subjective.
As of 2019, many years later, SQLAlchemy v1.3 allows a hybrid approach with the best of both worlds:
https://docs.sqlalchemy.org/en/13/orm/extensions/declarative/table_config.html#using-a-hybrid-approach-with-table
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.ext.declarative import declarative_base

metadata = MetaData()
users_table = Table('users', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String),
)

# possibly in a different file/place, the ORM declaration
Base = declarative_base(metadata=metadata)

class User(Base):
    __table__ = Base.metadata.tables['users']

    def __str__(self):
        return "<User('%s')>" % (self.name)

How can I query a model by whether there is a subclass instance?

I have these two simple models, A and B:
from django.db import models

class A(models.Model):
    name = models.CharField(max_length=10)

class B(A):
    age = models.IntegerField()
Now, how can I query for all instances of A which do not have an instance of B?
The only way I found requires an explicitly unique field on each subclass, which is NOT NULL, so that I can do A.objects.filter(b__this_is_a_b=None), for example, to get instances that are not also B instances. I'm looking for a way to do this without adding an explicit silly flag like that.
I also don't want to query for all of the objects and then filter them in Python. I want to get the DB to do it for me, which is basically something like SELECT * FROM A WHERE A.id NOT IN (SELECT id FROM B)
Since some version of Django this works as well:
A.objects.all().filter(b__isnull=True)
because if a is an A object, a.b gives the B subclass instance of a when it exists.
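To make that relationship concrete, a small sketch (the pk value is arbitrary): when no B row shares the A row's primary key, accessing a.b raises B.DoesNotExist, and b__isnull=True expresses that same "no B row" condition in SQL:
a = A.objects.get(pk=1)
try:
    child = a.b  # the B instance extending this A, if one exists
except B.DoesNotExist:
    child = None  # this A has no B subclass row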
I know this is an old question, but my answer might help new searchers on this subject.
see also:
multi table inheritance
And one of my own questions about this: downcasting a super class to a sub class
I'm not sure it's possible to do this purely in the DB with Django's ORM, in a single query. Here's the best I've been able to do:
A.objects.exclude(id__in=[r[0] for r in B.objects.values_list("a_ptr_id")])
This is 2 DB queries, and works best with a simplistic inheritance graph - each subclass of A would require a new database query.
Okay, it took a lot of trial and error, but I have a solution. It's ugly as all hell, and the SQL is probably worse than just going with two queries, but you can do something like so:
A.objects.exclude(b__age__isnull=True).exclude(b__age__isnull=False)
There's no way to get Django to do the join without referencing a field on b. But with these successive .exclude()s, you make any A with a B subclass match one or the other of the excludes. All you're left with are A's without a B subclass.
Anyway, this is an interesting use case, you should bring it up on django-dev...
I don't work with Django, but it looks like you want the built-in Python isinstance(obj, type) function.
Edit:
Would A.objects.exclude(id__exact=B__id) work?
