I have been using SQLAlchemy for a few years now as a replacement for Django models. I have found it very convenient to have custom methods attached to these models, e.g.:
class Widget(Base):
    __tablename__ = 'widgets'

    id = Column(Integer, primary_key=True)
    name = Column(Unicode(100))

    def get_slug(self, max_length=50):
        return slugify(self.name)[:max_length]
Is there a performance hit when doing things like session.query(Widget) if the model has a few dozen complex methods (50-75 lines each)? Are these loaded into memory for every row returned, and would it be more efficient to move some of the less-used methods into helper functions and import them as necessary?
def some_helper_function(widget):
    """:param widget: an instance of Widget()"""
    # do something
Thanks!
You will not see any performance hit when loading objects from the database with SQLAlchemy using just a session.query(...).
And you definitely should not move any methods out into helper functions for the sake of performance; doing so would basically break the object-oriented design of your model.
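The reason is that methods live on the class object, not on each instance, so a query returning thousands of rows does not duplicate them. A minimal sketch to convince yourself (assuming the Widget model from the question and an already configured session):

# Instances returned by the query hold only their column data; the methods
# are looked up on the Widget class itself.
w1, w2 = session.query(Widget).limit(2).all()

assert 'get_slug' not in w1.__dict__                  # no per-row copy of the method
assert w1.get_slug.__func__ is w2.get_slug.__func__   # one function object, stored on Widget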
I'm currently designing a large desktop application with Python 3.4.
I chose the Ports and Adapters architecture, also known as Hexagonal architecture.
The main purpose is to work with reusable components.
To organize the code and impose some rules, we will use the Zope Component Architecture (ZCA).
So, as a proof of concept, I'm creating the database component.
But working with an ORM blocks me. I designed a database component that looks like this:
- IDatabase
- IDatabaseConfig
- IEntity
- IKey
- IReader
- IWriter
- ...
and the implementation.
SQLAlchemy manages a lot of things, and I don't know how to make my component reusable.
I found this code:
# Reflect each database table we need to use, using metadata
class Customer(Base):
    __table__ = Table('Customers', metadata, autoload=True)
    orders = relationship("Order", backref="customer")

class Shipper(Base):
    __table__ = Table('Shippers', metadata, autoload=True)
    orders = relationship("Order", backref="shipper")

class Product(Base):
    __table__ = Table('Products', metadata, autoload=True)
    supplier = relationship('Supplier', backref='products')
    category = relationship('Category', backref='products')
But this code is really tightly coupled to SQLAlchemy, I guess.
So what approach should I use with my architecture?
Since the entities must be the center of the application (the domain layer), won't that solution be a problem if I need to change my database component and stop using SQLAlchemy?
I'm open to all suggestions.
I use the ORM classes as entity objects and put adapters over them.
Somewhere in an interfaces.py the API is defined:
from zope.interface import Interface

class IEntity(Interface):

    def pi_ing():
        '''makes the entity go "pi-ing" '''
Somewhere the database model is defined:
class EntityA(Base):
    # .. table and relationship definitions go here
    pass
Somewhere else the API is implemented:
from zope.interface import implementer
from zope.component import adapter, getGlobalSiteManager

class EntityAPI(object):

    def __init__(self, instance):
        self.instance = instance

    def pi_ing(self):
        # make "pi_ing"
        pass

@implementer(IEntity)
@adapter(int)
def get_entity_a_by_id(id):
    return EntityAPI(session().query(EntityA).get(id))

getGlobalSiteManager().registerAdapter(get_entity_a_by_id)
Now everything is in place. Somewhere in the business logic of your code, when you get the id of an entity_a, you can just do:
from interfaces import IEntity

def stuff_which_goes_splat():
    # ...
    entity_id = got_from_somewhere()
    IEntity(entity_id).pi_ing()
Now you have a completely separate implementation of interface, database entity and API logic. The nice thing is, you don't have to adapt only primitive stuff like an int ID of an object; you can adapt anything, as long as you can implement a direct transformation from your adaptee to the database entity.
Caveat: of course, by the time the adaptation is executed, getGlobalSiteManager().registerAdapter(get_entity_a_by_id) must already have been called.
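For example, a sketch of adapting something other than an int, say a plain string name (get_entity_a_by_name and the name column are hypothetical, not part of the code above):

@implementer(IEntity)
@adapter(str)
def get_entity_a_by_name(name):
    # Same idea as above: the adaptee is a string that we can map to a
    # database entity via a query on a (hypothetical) name column.
    return EntityAPI(session().query(EntityA).filter_by(name=name).one())

getGlobalSiteManager().registerAdapter(get_entity_a_by_name)

# Business code can now adapt either an id or a name transparently:
#     IEntity(42).pi_ing()
#     IEntity("some-entity").pi_ing()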
I'm using a declarative SQLAlchemy class to perform computations. Part of the computations require me to perform them for all configurations provided by a different table, and there are no foreign key relationships between the two tables.
This analogy is nothing like my real application, but hopefully it will help convey what I want to happen.
I have a set of cars and a list of paint colors.
The car object has a factory which provides the car in all possible colors:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

def PaintACar(car, color):
    pass

Base = declarative_base()

class Colors(Base):
    __tablename__ = 'colors'
    id = Column('id', Integer, primary_key=True)
    color = Column('color', Unicode)

class Car(Base):
    __tablename__ = 'car'
    id = Column('id', Integer, primary_key=True)
    model = Column('model', Unicode)

    # is this somehow possible?
    all_color_objects = collection(...)

    # I know this is possible, but would like to know if there's another way
    @property
    def all_colors(self):
        s = Session.object_session(self)
        return s.query(Colors).all()

    def CarColorFactory(self):
        for color in self.all_color_objects:
            yield PaintACar(self, color)
My question: is it possible to produce all_color_objects somehow, without having to resort to finding the session and manually issuing a query as in the all_colors property?
It's been a while, so I'm providing the best answer I saw (as a comment by zzzeek). Basically, I was looking for one-off syntactic sugar. My original 'ugly' implementation works just fine.
what better way would there be here besides getting a Session and producing the query you want? Are you looking for being able to add to the collection and that automatically flushes things? (just add the objects to the Session?) Do you not like using object_session(self) (you can build some mixin class or something that hides that for you?) It's not really clear what the problem is. The objects here have no relationship to the parent class so there's no particular intelligence SQLAlchemy would be able to add.
– zzzeek Jun 17 at 5:03
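A minimal sketch of the mixin zzzeek hints at, hiding the object_session lookup (the SessionQueryMixin name is mine, not from the comment):

from sqlalchemy.orm import Session

class SessionQueryMixin(object):
    """Hide the Session.object_session(self) boilerplate behind a helper."""

    def _query(self, model):
        return Session.object_session(self).query(model)

# Car would then inherit from (Base, SessionQueryMixin) and the property becomes:
#
#     @property
#     def all_colors(self):
#         return self._query(Colors).all()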
I'm working on a website based on Flask and Flask-SQLAlchemy with MySQL. I have a handful of feeds; each feed has some data, but it also needs an update() function.
At first, I used MySQL-python (with raw SQL) to store data, and the feeds were implemented as a plugin system, so each feed overrode an update() function to import its data in its own way.
Now I have switched to Flask-SQLAlchemy and added a Feed model to the database, since it helps with the SQLAlchemy ORM, but I'm stuck on how to handle the update() function. The options I see are:
1. Keep the plugin system in parallel with the database model, but I think that's impractical/ineffective.
2. Extend the model class. I'm not sure if that's possible; e.g. FeedOne(Feed) would represent the item with name="one" only.
3. Make the update() function handle all feeds, using if self.name == "" statements.
Added some code bits.
Feed model:
class Feed(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(255))
    datapieces = db.relationship('Datapiece', backref='feed', lazy='dynamic')
The update() function:
def update(self):
    data = parsedata(self.data_source)
    for item in data.items:
        new_datapiece = Datapiece(feed=self.id, name=item.name, value=item.value)
        db.session.add(new_datapiece)
    db.session.commit()
What I hope to achieve in option 2 is:
for feed in Feed.query.all():
    feed.update()
And every feed will use its own class's update() method.
Extending the class and adding an .update() method is just how it is supposed to work for option 2.
I don't see any problem with it (and I'm using that style of coding with Flask/SQLAlchemy all the time).
And if you (can) omit the lazy='dynamic' attribute, you could also do something like:
self.datapieces.append(new_datapiece)
in your Feed's update function.
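For option 2, a hedged sketch of how the subclassing could look using single table inheritance (the polymorphic mapper settings and the FeedOne name are assumptions, not from the question):

class Feed(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(255))

    # name doubles as the discriminator: each subclass handles one feed name.
    # Depending on your Flask-SQLAlchemy version you may need to make sure
    # FeedOne does not get its own auto-generated __tablename__, so that both
    # classes share the same table (single table inheritance).
    __mapper_args__ = {'polymorphic_on': name}

    def update(self):
        raise NotImplementedError

class FeedOne(Feed):
    __mapper_args__ = {'polymorphic_identity': 'one'}

    def update(self):
        # import logic specific to the feed named "one"
        pass

With that in place, Feed.query.all() hands back FeedOne instances for rows whose name is "one", so the loop from the question dispatches to the right update() automatically.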
I'm a beginner with SQLAlchemy and have found that querying can be done in two ways:
Approach 1:
DBSession = scoped_session(sessionmaker())

class _Base(object):
    query = DBSession.query_property()

Base = declarative_base(cls=_Base)

class SomeModel(Base):
    key = Column(Unicode, primary_key=True)
    value = Column(Unicode)

# When querying
result = SomeModel.query.filter(...)
Approach 2:
DBSession = scoped_session(sessionmaker())
Base = declarative_base()

class SomeModel(Base):
    key = Column(Unicode, primary_key=True)
    value = Column(Unicode)

# When querying
session = DBSession()
result = session.query(SomeModel).filter(...)
Is there any difference between them?
In the code above there is no difference, because in the query = DBSession.query_property() line of the first example:
- the query property is explicitly bound to DBSession, and
- there is no custom Query object passed to query_property.
As @petr-viktorin points out in the answer here, there must be a session available before you define your model in the first example, which might be problematic depending on the structure of your application.
If, however, you need a custom query that adds additional query parameters automatically to all queries, then only the first example will allow that. A custom query class that inherits from sqlalchemy.orm.query.Query can be passed as an argument to query_property. This question shows an example of that pattern.
Even if a model object has a custom query property defined on it, that property is not used when querying with session.query, as in the last line of the second example. This means something like the first example is the only option if you need a custom query class.
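A minimal sketch of that pattern, assuming you only want to attach extra behaviour to every query (the ExtendedQuery name and its helper method are made up for illustration):

from sqlalchemy.orm import Query, scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base

class ExtendedQuery(Query):
    """Custom query class attached to every Model.query."""

    def first_or_none(self):
        # Trivial convenience helper; real code might add implicit filters
        # (soft deletes, tenant scoping, ...) here instead.
        return self.first()

DBSession = scoped_session(sessionmaker())

class _Base(object):
    # Passing query_cls makes Model.query an ExtendedQuery instance.
    query = DBSession.query_property(query_cls=ExtendedQuery)

Base = declarative_base(cls=_Base)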
I see these downsides to query_property:
- You cannot use it on a different Session than the one you've configured (though you could always use session.query then).
- You need a session object available before you define your schema.
These could bite you when you want to write tests, for example.
Also, session.query fits better with how SQLAlchemy works; query_property looks like it's just added on top for convenience (or similarity with other systems?).
I'd recommend you stick to session.query.
An answer (here) to a different SQLAlchemy question might help. That answer starts with:
You can use Model.query, because the Model (or usually its base class, especially in cases where declarative extension is used) is assigned Session.query_property. In this case the Model.query is equivalent to Session.query(Model).
I have a complex network of objects being spawned from a SQLite database using SQLAlchemy ORM mappings. I have quite a few deeply nested loops like:
for parent in owner.collection:
    for child in parent.collection:
        for foo in child.collection:
            # do lots of calcs with foo.property
My profiling is showing me that the sqlalchemy instrumentation is taking a lot of time in this use case.
The thing is: I don't ever change the object model (mapped properties) at runtime, so once they are loaded I don't NEED the instrumentation, or indeed any sqlalchemy overhead at all. After much research, I'm thinking I might have to clone a 'pure python' set of objects from my already loaded 'instrumented objects', but that would be a pain.
Performance is really crucial here (it's a simulator), so maybe writing those layers as C extensions using sqlite api directly would be best. Any thoughts?
If you reference a single attribute of a single instance lots of times, a simple trick is to store it in a local variable.
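For example, hoisting the attribute out of the innermost loop (a sketch based on the loop from the question):

for foo in child.collection:
    prop = foo.property   # one instrumented lookup instead of many
    # ... do lots of calcs with prop ...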
If you want a way to create cheap pure-Python clones, share the __dict__ object with the original object:
class CheapClone(object):
    def __init__(self, original):
        self.__dict__ = original.__dict__
Creating a copy like this costs about half of the instrumented attribute access and attribute lookups are as fast as normal.
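Usage would then look something like this (a sketch reusing the foo objects from the question's loops; only attributes that are already loaded are visible through the clone):

for foo in child.collection:
    plain = CheapClone(foo)   # shares foo's __dict__, no instrumentation
    # plain.property now reads the loaded column value directly from the
    # shared __dict__, bypassing SQLAlchemy's attribute machinery.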
There might also be a way to have the mapper create instances of an uninstrumented class instead of the instrumented one. If I have some time, I might take a look how deeply ingrained is the assumption that populated instances are of the same type as the instrumented class.
I found a quick and dirty way that seems to at least somewhat work on 0.5.8 and 0.6. I didn't test it with inheritance or other features that might interact badly. Also, this touches some non-public APIs, so beware of breakage when changing versions.
from sqlalchemy.orm.attributes import ClassManager, instrumentation_registry
from sqlalchemy.ext.declarative import declarative_base

class ReadonlyClassManager(ClassManager):
    """Enables configuring a mapper to return instances of uninstrumented
    classes instead. To use, add a readonly_type attribute referencing the
    desired class to use instead of the instrumented one."""

    def __init__(self, class_):
        ClassManager.__init__(self, class_)
        self.readonly_version = getattr(class_, 'readonly_type', None)
        if self.readonly_version:
            # default instantiation logic doesn't know to install finders
            # for our alternate class
            instrumentation_registry._dict_finders[self.readonly_version] = self.dict_getter()
            instrumentation_registry._state_finders[self.readonly_version] = self.state_getter()

    def new_instance(self, state=None):
        if self.readonly_version:
            instance = self.readonly_version.__new__(self.readonly_version)
            self.setup_instance(instance, state)
            return instance
        return ClassManager.new_instance(self, state)

Base = declarative_base()
Base.__sa_instrumentation_manager__ = ReadonlyClassManager
Usage example:
class ReadonlyFoo(object):
    pass

class Foo(Base, ReadonlyFoo):
    __tablename__ = 'foo'
    id = Column(Integer, primary_key=True)
    name = Column(String(32))

    readonly_type = ReadonlyFoo

assert type(session.query(Foo).first()) is ReadonlyFoo
You should be able to disable lazy loading on the relationships in question, and SQLAlchemy will then fetch them all in a single query.
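For example, a sketch using joinedload on the nested collections (the Owner/Parent/Child classes and the collection relationship names are assumptions based on the question):

from sqlalchemy.orm import joinedload

owners = (
    session.query(Owner)
    .options(
        joinedload(Owner.collection)
        .joinedload(Parent.collection)
        .joinedload(Child.collection)
    )
    .all()
)

# Equivalent mapping-level option: declare the relationships with
# lazy='joined' so they are always fetched together with their parent.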
Try using a single query with JOINs instead of the Python loops.
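As a sketch of that idea, pushing the work into one query instead of nested Python loops (the entity and column names are assumptions):

rows = (
    session.query(Item.value)                      # Item.value stands in for foo.property
    .join(Child, Item.child_id == Child.id)
    .join(Parent, Child.parent_id == Parent.id)
    .filter(Parent.owner_id == owner.id)
    .all()
)

for (value,) in rows:
    # do the calculations on plain Python values, with no ORM instrumentation involved
    ...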