I'm working on a Flask project and I am using Flask-SQLAlchemy.
I need to work with multiple already existing databases.
I created the "app" object and the SQLAlchemy one:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
db = SQLAlchemy(app)
In the configuration I set the default connection and the additional binds:
SQLALCHEMY_DATABASE_URI = 'postgresql://pg_user:pg_pwd@pg_server/pg_db'
SQLALCHEMY_BINDS = {
    'oracle_bind': 'oracle://oracle_user:oracle_pwd@oracle_server/oracle_schema',
    'mssql_bind': 'mssql+pyodbc://mssql_user:mssql_pwd@mssql_server/mssql_schema?driver=FreeTDS'
}
Then I created the table models using the declarative system and, where needed, I set the
__bind_key__ parameter to indicate in which database the table is located.
For example:
class MyTable(db.Model):
    __bind_key__ = 'mssql_bind'
    __tablename__ = 'my_table'

    id = db.Column(db.Integer, nullable=False, primary_key=True)
    val = db.Column(db.String(50), nullable=False)
This way everything works correctly: when I run a query, it is executed against the right database.
Reading the SQLAlchemy documentation and the Flask-SQLAlchemy documentation, I understand these things
(I write them down to check that I understand correctly):
You can handle the transactions through the session.
In SQLAlchemy you can bind a session with a specific engine.
Flask-SQLAlchemy automatically creates the session (a scoped_session) at the start of the request and destroys it at the end of the request,
so I can do:
record = MyTable(id=1, val='some text')
db.session.add(record)
db.session.commit()
What I cannot understand is what happens to the session when we use multiple databases in Flask-SQLAlchemy.
I verified that the system is able to bind each table to the right database through the __bind_key__ parameter;
I can, therefore, insert data into different databases through db.session and, at commit, everything is saved.
I can't, however, understand whether Flask-SQLAlchemy creates multiple sessions (one for each engine) or manages this in a different way.
In either case, how is it possible to refer to the session/transaction of a specific database?
If I use db.session.commit(), the system commits on all the involved databases, but what can I do if I want to commit only on a single database?
I would do something like:
db.session('mssql_bind').commit()
but I cannot figure out how to do this.
I also saw a Flask-SQLAlchemy implementation which should ease the management of these situations:
Issue: https://github.com/mitsuhiko/flask-sqlalchemy/issues/107
Implementation: https://github.com/mitsuhiko/flask-sqlalchemy/pull/249
but I cannot figure out how to use it.
In Flask-SQLAlchemy how can I manage sessions specifically for each single engine?
Flask-SQLAlchemy uses a customized session that handles bind routing according to the __bind_key__ attribute given on the mapped class. Under the hood it actually adds that key to the created table's info. In other words, Flask-SQLAlchemy does not create multiple sessions, one for each bind, but a single session that routes to the correct connectable (engine/connection) according to the bind key. Note that vanilla SQLAlchemy has similar functionality out of the box.
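For illustration, here is a rough sketch of that vanilla SQLAlchemy equivalent; the engine URLs are placeholders and MyTable stands in for any mapped class you want routed to a non-default engine:

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# Sketch: one session, several engines; the `binds` map routes specific mapped
# classes (or Table objects) to their engine, everything else uses the default bind.
default_engine = create_engine('postgresql://pg_user:pg_pwd@pg_server/pg_db')
mssql_engine = create_engine('mssql+pyodbc://mssql_user:mssql_pwd@mssql_server/mssql_schema?driver=FreeTDS')

Session = sessionmaker(bind=default_engine, binds={MyTable: mssql_engine})
session = Session()  # a single session that talks to both databases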
In either case, how is it possible to refer to the session/transaction of a specific database?
If I use db.session.commit(), the system commits on all the involved databases, but what can I do if I want to commit only on a single database?
It might not be a good idea to subvert the session and issue commits to specific databases mid-session using the connections owned by the session. The session is a whole and keeps track of state for object instances, flushing changes to the databases when needed, etc. That means that the transaction handled by the session is not just the database transactions, but the session's own transaction as well. All of that should commit and roll back as one.
You could on the other hand create new SQLAlchemy (or Flask-SQLAlchemy) sessions that possibly join the ongoing transaction in one of the binds:
session = db.create_scoped_session(
    options=dict(bind=db.get_engine(app, 'oracle_bind'),
                 binds={}))
This is what the pull request is about. It allows using an existing transactional connection as the bind for a new Flask-SQLAlchemy session. This is very useful for example in testing, as can be seen in the rationale for that pull request. That way you can have a "master" transaction that can for example rollback everything done in testing.
Note that the SignallingSession always consults db.get_engine() when a bind_key is present. This means that the example session cannot be used to query tables that have no bind key (they would be routed to the session's own bind, the Oracle engine, where they don't exist), but it would still work for tables with your mssql_bind key.
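For example, a sketch of committing only on one database with such a dedicated session, reusing the mssql_bind and MyTable from the question:

# Sketch: a session bound only to the mssql engine, so its commit() touches
# just that database; the main db.session is left untouched.
mssql_session = db.create_scoped_session(
    options=dict(bind=db.get_engine(app, 'mssql_bind'), binds={}))

mssql_session.add(MyTable(id=1, val='some text'))
mssql_session.commit()   # commits only against the mssql_bind engine
mssql_session.remove()   # discard the scoped session when done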
The issue you linked to on the other hand does list ways to issue SQL to specific binds:
rows = db.session.execute(query, params,
                          bind=db.get_engine(app, 'oracle_bind'))
There were other less explicit methods listed as well, but explicit is better than implicit.
Related
I created a REST API with OpenAPI Generator that contains all the requests necessary for selecting, inserting, and updating my SQL database.
I use SQLAlchemy for database generation and manipulation, and I'm not sure how to use the session to interact with the database in this context.
My project looks like this:
DB
| openapi_server (generated)
| __init__.py
| request.py
| database.py
In database.py I keep my database structure.
In request.py I have all the functions that need to be processed on every request (to interact with the database).
My way of handling this situation is: I create a session variable at the beginning of each function and close it after the operation is complete.
Are there other methods that are more scalable and easier to maintain, and what are the best practices?
My understanding is that the sqlalchemy session is different from the client session in that the client session stores information about authorization & permissions, whereas the sqlalchemy session is thread-bound transaction state that associates your code/machine with an external database.
Assuming you're not using multithreading or parallel processing, a single sqlalchemy session shared across your application would be appropriate. In the case where your users have different levels of database permissions, I would establish those rules in your application's authorization layer rather than in the database user-permission schema. (That should be reserved for system users.)
Bear in mind, multiple sqlalchemy sessions are appropriate in many scenarios, and there are advantages to creating and closing sessions on the fly. But there are also potential downsides, such as write collisions (two processes trying to write the same record) and so on. In these more fine-grained cases, I'd suggest a queuing process as a central orchestrator.
For implementation: I usually create a file create_session.py which has a function that creates a new db session with the appropriate DB URI. I then call that function in the main __init__.py like so: session = create_session(). Importing that session throughout the application is then done by importing session from the main module, e.g. from database import session.
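A minimal sketch of what such a create_session.py might contain; the URI and names are assumptions, not the actual code:

# create_session.py -- illustrative sketch; the URI is a placeholder.
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# One engine (and therefore one connection pool) for the whole process.
engine = create_engine('postgresql://db_user:db_pwd@localhost:5432/mydb')
Session = sessionmaker(bind=engine)

def create_session():
    """Return a new session drawn from the shared engine/pool."""
    return Session()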
In cases where you need to create new / multiple sessions, do so with:
# Getting the path right here isn't always straightforward tbh;
# basically, import the function from the module directly
from create_session import create_session

def do_something():
    # Always create your session in a method,
    # otherwise your db will open many unnecessary connections
    my_session = create_session()
    print('Done')
    # Close the session when you're done
    my_session.close()
I have a caching problem when I use sqlalchemy.
I use sqlalchemy to insert data into a MySQL database. Then I have another application that processes this data and updates it directly.
But sqlalchemy always returns the old data rather than the updated data. I think sqlalchemy is caching my request... so... how should I disable it?
The usual cause for people thinking there's a "cache" at play, besides the usual SQLAlchemy identity map which is local to a transaction, is that they are observing the effects of transaction isolation. SQLAlchemy's session works by default in a transactional mode, meaning it waits until session.commit() is called in order to persist data to the database. During this time, other transactions in progress elsewhere will not see this data.
However, due to the isolated nature of transactions, there's an extra twist. Those other transactions in progress will not only not see your transaction's data until it is committed, they also can't see it in some cases until they are committed or rolled back also (which is the same effect your close() is having here). A transaction with an average degree of isolation will hold onto the state that it has loaded thus far, and keep giving you that same state local to the transaction even though the real data has changed - this is called repeatable reads in transaction isolation parlance.
http://en.wikipedia.org/wiki/Isolation_%28database_systems%29
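In practice that usually means ending the current transaction before re-querying; a minimal sketch, where MyModel and some_id are placeholders:

# Sketch: commit() or rollback() ends the session's current transaction, so the
# next query begins a new one and sees rows committed by other processes.
session.commit()          # or session.rollback() if there is nothing to save
obj = session.query(MyModel).get(some_id)  # re-loads fresh state from the DB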
This issue has been really frustrating for me, but I have finally figured it out.
I have a Flask/SQLAlchemy Application running alongside an older PHP site. The PHP site would write to the database and SQLAlchemy would not be aware of any changes.
I tried the sessionmaker setting autoflush=True unsuccessfully
I tried db_session.flush(), db_session.expire_all(), and db_session.commit() before querying and NONE worked. Still showed stale data.
Finally I came across this section of the SQLAlchemy docs: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#transaction-isolation-level
Setting the isolation_level worked great. Now my Flask app is "talking" to the PHP app. Here's the code:
engine = create_engine(
    "postgresql+pg8000://scott:tiger@localhost/test",
    isolation_level="READ UNCOMMITTED"
)
When the SQLAlchemy engine is started with the "READ UNCOMMITTED" isolation_level it will perform "dirty reads", which means it will read uncommitted changes directly from the database.
Hope this helps
Here is a possible solution courtesy of AaronD in the comments
from flask_sqlalchemy import SQLAlchemy

class UnlockedAlchemy(SQLAlchemy):
    def apply_driver_hacks(self, app, info, options):
        if "isolation_level" not in options:
            options["isolation_level"] = "READ COMMITTED"
        return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)
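Usage is then a drop-in replacement for the stock extension; a quick sketch:

from flask import Flask

app = Flask(__name__)
db = UnlockedAlchemy(app)  # engines created by the extension now default to READ COMMITTED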
In addition to zzzeek's excellent answer,
I had a similar issue. I solved the problem by using short-lived sessions.
from contextlib import closing

# new_session() is a session factory, e.g. a configured sessionmaker
with closing(new_session()) as sess:
    ...  # do your stuff
I used a fresh session per task, task group, or request (in the case of a web app). That solved the "caching" problem for me.
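For reference, a minimal sketch of that pattern; the URI and MyModel are placeholders and new_session is assumed to be a plain sessionmaker:

from contextlib import closing

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('mysql+pymysql://db_user:db_pwd@localhost/mydb')
new_session = sessionmaker(bind=engine)

def process_task(task_id):
    # A fresh session per task: its transaction (and its snapshot) ends with it.
    with closing(new_session()) as sess:
        return sess.query(MyModel).get(task_id)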
This material was very useful for me:
When do I construct a Session, when do I commit it, and when do I close it
This was happening in my Flask application, and my solution was to expire all objects in the session after every request.
from flask.signals import request_finished
def expire_session(sender, response, **extra):
    app.db.session.expire_all()
request_finished.connect(expire_session, flask_app)
Worked like a charm.
I tried session.commit() and session.flush(); neither worked for me.
After going through the sqlalchemy source code, I found the solution to disable caching:
setting query_cache_size=0 in create_engine worked.
create_engine(connection_string, convert_unicode=True, echo=True, query_cache_size=0)
First, there is no cache for SQLAlchemy.
Depending on how you fetch data from the DB, you should run some tests after the database has been updated by others, to see whether you can get the new data.
(1) use connection:
connection = engine.connect()
result = connection.execute("select username from users")
for row in result:
    print("username:", row['username'])
connection.close()
(2) use Engine ...
(3) use MetaData ...
please follow the steps in: http://docs.sqlalchemy.org/en/latest/core/connections.html
Another possible reason is that your MySQL DB was not updated permanently. Restart the MySQL service and check.
As far as I know, SQLAlchemy does not store caches, so you should look at the logging output.
I am currently working on a new web application that needs to execute an SQL statement before giving a session to the application itself.
In detail: I am running a PostgreSQL database server with multiple schemas and I need to execute a SET search_path statement before the application uses the session. I am also using the ZopeTransactionExtension to have transactions automatically handled at the request level.
To ensure the execution of the SQL statement, there seem to be two possible ways:
Executing the statement at the Engine/Connection level via SQLAlchemy events (from Multi-tenancy with SQLAlchemy)
Executing the statement at the session level (from SQLAlchemy support of Postgres Schemas)
Since I am using a scoped session and want to keep my transactions intact, I wonder which of these ways will possibly disturb transaction management.
For example, does the Engine hand out a new connection from the Pool on every query? Or is it attached to the session for its lifetime, i.e. until the request has been processed and the session & transaction are closed/committed?
On the other hand, since I am using a scoped session, can I do it the way zzzeek suggested in the second link? That is, is the context preserved and automatically reset once the transaction is over?
Is there possibly a third way that I am missing?
For example, does the Engine hand out a new connection from the Pool on every query?
only if you have autocommit=True, which should not be the case.
Or is it attached to the session for its lifetime, i.e. until the request has been processed and the session & transaction are closed/committed?
it's attached per transaction. But the "search_path" in PostgreSQL is per PostgreSQL session (not to be confused with the SQLAlchemy session) - it's basically the lifespan of the connection itself.
The session (and the engine, and the pool) these days has a ton of event hooks you can grab onto in order to set up state like this. If you want to stick with the Session you can try after_begin.
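For illustration, a minimal sketch of the after_begin approach, assuming a plain sessionmaker and a placeholder schema name:

from sqlalchemy import event, text
from sqlalchemy.orm import scoped_session, sessionmaker

session_factory = sessionmaker()
Session = scoped_session(session_factory)

# Re-apply the search_path whenever a session begins a new transaction,
# so it is set on whatever connection that transaction is using.
@event.listens_for(session_factory, "after_begin")
def set_search_path(session, transaction, connection):
    connection.execute(text("SET search_path TO my_tenant_schema"))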
I have an sqlalchemy application that currently uses a local database. The code for the application is given below.
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

log = core.getLogger()  # 'core' is provided by the surrounding application

engine = create_engine('sqlite:///nwtopology.db', echo=False)
Base = declarative_base()
Session = sessionmaker(bind=engine)
session = Session()

class SourcetoPort(Base):
    """"""
    __tablename__ = 'source_to_port'

    id = Column(Integer, primary_key=True)
    port_no = Column(Integer)
    src_address = Column(String, index=True)

    #-----------------------------------------
    def __init__(self, src_address, port_no):
        """"""
        self.src_address = src_address
        self.port_no = port_no
I want to create the database itself on a remote machine. I came across this document:
http://www.sqlalchemy.org/doc_pdfs/sqlalchemy_0_6_3.pdf
In the explanation they mention the line given below:
engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')
My first question is
1) does sqlite support remote database creation?
2) How do I keep the connection to the remote machine open always? I don't want to initiate an ssh connection every time I have to insert an entry or make a query.
These questions may sound stupid, but I am very new to Python and SQLAlchemy. Any help is appreciated.
Answering your questions:
SQLite doesn't support remote database connections - you'll have to implement this yourself, for example by putting the sqlite database file on a network-shared filesystem, but that would make your solution less reliable.
My suggestion - do not try to use a remote sqlite database, but switch to a traditional RDBMS. Please see below for more details.
It sounds like your application has outgrown SQLite, and it is a good time to switch to a traditional RDBMS like MySQL or PostgreSQL, where network connections are supported out of the box.
SQLite is a local database. SQLite has a page explaining when to use it. It says:
If you have many client programs accessing a common database over a
network, you should consider using a client/server database engine
instead of SQLite.
The good thing is that your application might be database-agnostic, as you are using SQLAlchemy to generate your queries.
So I would do the following:
1) Install the database system on a machine (it doesn't matter whether local or remote - you can always move your database to a remote machine later) and configure permissions for your user (create database, alter, select, update and insert).
2) Create the database schema and populate the data, cloning your existing sqlite database. There are some tools available for doing so - i.e. Copying Databases across Platforms with SQLAlchemy.
3) Update the db connection string in your application from using sqlite to using the remote database of your choice (see the sketch below).
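For step 3, the change is typically just the URL passed to create_engine; a sketch with placeholder credentials and host:

from sqlalchemy import create_engine

# Before: local SQLite file
# engine = create_engine('sqlite:///nwtopology.db', echo=False)

# After: remote PostgreSQL server (user, password, host and db name are placeholders)
engine = create_engine(
    'postgresql+psycopg2://nw_user:nw_pwd@db.example.com:5432/nwtopology',
    echo=False)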
Times have changed.
If one wishes to make a SQLite database available over the web, one option would be to use CubeSQL as a server, and SQLiteManager for SQLite as its client. For details, see e.g. https://www.sqlabs.com/
Another option might be to use Valentina Server similarly: see https://www.valentina-db.com/en/valentina-server-overview
(These options will probably only be suitable if there is at most one client with write-access at a time.)
Are there any others?
I have a Pylons-based web application which connects via Sqlalchemy (v0.5) to a Postgres database. For security, rather than follow the typical pattern of simple web apps (as seen in just about all tutorials), I'm not using a generic Postgres user (e.g. "webapp") but am requiring that users enter their own Postgres userid and password, and am using that to establish the connection. That means we get the full benefit of Postgres security.
Complicating things still further, there are two separate databases to connect to. Although they're currently in the same Postgres cluster, they need to be able to move to separate hosts at a later date.
We're using sqlalchemy's declarative package, though I can't see that this has any bearing on the matter.
Most examples of sqlalchemy show trivial approaches such as setting up the Metadata once, at application startup, with a generic database userid and password, which is used through the web application. This is usually done with Metadata.bind = create_engine(), sometimes even at module-level in the database model files.
My question is, how can we defer establishing the connections until the user has logged in, and then (of course) re-use those connections, or re-establish them using the same credentials, for each subsequent request.
We have this working -- we think -- but I'm not only not certain of the safety of it, I also think it looks incredibly heavy-weight for the situation.
Inside the __call__ method of the BaseController we retrieve the userid and password from the web session, call sqlalchemy create_engine() once for each database, then call a routine which calls Session.bind_mapper() repeatedly, once for each table that may be referenced on each of those connections, even though any given request usually references only one or two tables. It looks something like this:
# in lib/base.py on the BaseController class
def __call__(self, environ, start_response):
    # note: web session contains {'username': XXX, 'password': YYY}
    url1 = 'postgres://%(username)s:%(password)s@server1/finance' % session
    url2 = 'postgres://%(username)s:%(password)s@server2/staff' % session
    finance = create_engine(url1)
    staff = create_engine(url2)
    db_configure(staff, finance)  # see below
    ... etc
# in another file
Session = scoped_session(sessionmaker())

def db_configure(staff, finance):
    s = Session()

    from db.finance import Employee, Customer, Invoice
    for c in [
        Employee,
        Customer,
        Invoice,
    ]:
        s.bind_mapper(c, finance)

    from db.staff import Project, Hour
    for c in [
        Project,
        Hour,
    ]:
        s.bind_mapper(c, staff)

    s.close()  # prevents leaking connections between sessions?
So the create_engine() calls occur on every request... I can see that being needed, and the Connection Pool probably caches them and does things sensibly.
But calling Session.bind_mapper() once for each table, on every request? Seems like there has to be a better way.
Obviously, since a desire for strong security underlies all this, we don't want any chance that a connection established for a high-security user will inadvertently be used in a later request by a low-security user.
Binding global objects (mappers, metadata) to a user-specific connection is not a good way to do this, nor is using a scoped session. I suggest creating a new session for each request and configuring it to use user-specific connections. The following sample assumes that you use separate metadata objects for each database:
binds = {}
finance_engine = create_engine(url1)
binds.update(dict.fromkeys(finance_metadata.sorted_tables, finance_engine))
# The following line is required when mappings to joined tables are used (e.g.
# in joined-table inheritance) due to a bug (or misfeature) in SQLAlchemy 0.5.4.
# This issue might be fixed in newer versions.
binds.update(dict.fromkeys([Employee, Customer, Invoice], finance_engine))
staff_engine = create_engine(url2)
binds.update(dict.fromkeys(staff_metadata.sorted_tables, staff_engine))
# See comment above.
binds.update(dict.fromkeys([Project, Hour], staff_engine))
session = sessionmaker(binds=binds)()
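A per-request usage sketch under those assumptions; make_user_session is a hypothetical helper wrapping the binds/sessionmaker code above:

def handle_request(web_session):
    # web_session contains the user's credentials, as in the question
    db_session = make_user_session(web_session['username'], web_session['password'])  # hypothetical helper
    try:
        ...  # handle the request using db_session
        db_session.commit()
    finally:
        db_session.close()  # returns the connections to their pools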
I would look at the connection pooling and see if you can't find a way to have one pool per user.
You can dispose() the pool when the user's session has expired