sqlalchemy, specify database name with DSN - python

I am trying to connect to a SQL Server from Linux using sqlalchemy. This page shows DSN-based connection as below.
engine = create_engine("mssql+pyodbc://scott:tiger#some_dsn")
Is there a way to specify a database name using DSN? I am aware that we can specify a database name either in odbc.ini or a SQL query but I would like to know if we can also do something like this.
engine = create_engine("mssql+pyodbc://scott:tiger#some_dsn/databasename")

You can pass arguments directly to the pyodbc.connect method through the connect_args parameter in create_engine:
def my_create_engine(mydsn, mydatabase, **kwargs):
connection_string = 'mssql+pyodbc://#%s' % mydsn
cargs = {'database': mydatabase}
cargs.update(**kwargs)
e = sqla.create_engine(connection_string, connect_args=cargs)
return e
This will also enable the database to be persisted through several transactions / sessions.

I just tried something like this and it seemed to work fine
engine = create_engine("mssql+pyodbc://scott:tiger#some_dsn")
with engine.begin() as conn:
conn.execute("USE databasename")
As a general rule we should be careful about changing the current catalog (a.k.a. "database") after establishing a connection because some technologies (e.g., JDBC Connection objects) keep track of the current catalog and can get confused if we directly call USE ... in T-SQL to change the current catalog. However, I'm not aware that pyodbc's Connection object does any such caching so this approach is probably okay.

Related

sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation XXXXXXXX does not exist ------- Error using SQLAlchemy ORM [duplicate]

We host a multitenant app with SQLAlchemy and postgres. I am looking at moving from having separate databases for each tenant to a single database with multiple schemas. Does SQLAlchemy support this natively? I basically just want every query that comes out to be prefixed with a predetermined schema... e.g
select * from client1.users
instead of just
select * from users
Note that I want to switch the schema for all tables in a particular request/set of requests, not just a single table here and there.
I imagine that this could be accomplished with a custom query class as well but I can't imagine that something hasn't been done in this vein already.
well there's a few ways to go at this and it depends on how your app is structured. Here is the most basic way:
meta = MetaData(schema="client1")
If the way your app runs is one "client" at a time within the whole application, you're done.
But what may be wrong with that here is, every Table from that MetaData is on that schema. If you want one application to support multiple clients simultaneously (usually what "multitenant" means), this would be unwieldy since you'd need to create a copy of the MetaData and dupe out all the mappings for each client. This approach can be done, if you really want to, the way it works is you'd access each client with a particular mapped class like:
client1_foo = Client1Foo()
and in that case you'd be working with the "entity name" recipe at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName in conjunction with sometable.tometadata() (see http://docs.sqlalchemy.org/en/latest/core/metadata.html#sqlalchemy.schema.Table.tometadata).
So let's say the way it really works is multiple clients within the app, but only one at a time per thread. Well actually, the easiest way to do that in Postgresql would be to set the search path when you start working with a connection:
# start request
# new session
sess = Session()
# set the search path
sess.execute("SET search_path TO client1")
# do stuff with session
# close it. if you're using connection pooling, the
# search path is still set up there, so you might want to
# revert it first
sess.close()
The final approach would be to override the compiler using the #compiles extension to stick the "schema" name in within statements. This is doable, but would be tricky as there's not a consistent hook for everywhere "Table" is generated. Your best bet is probably setting the search path on each request.
If you want to do this at the connection string level then use the following:
dbschema='schema1,schema2,public' # Searches left-to-right
engine = create_engine(
'postgresql+psycopg2://dbuser#dbhost:5432/dbname',
connect_args={'options': '-csearch_path={}'.format(dbschema)})
But, a better solution for a multi-client (multi-tenant) application is to configure a different db user for each client, and configure the relevant search_path for each user:
alter role user1 set search_path = "$user", public
It can now be done using schema translation map in Sqlalchemy 1.1.
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
__table_args__ = {'schema': 'per_user'}
On each request, the Session can be set up to refer to a different schema each time:
session = Session()
session.connection(execution_options={
"schema_translate_map": {"per_user": "account_one"}})
# will query from the ``account_one.user`` table
session.query(User).get(5)
Referred it from the SO answer here.
Link to the Sqlalchemy docs.
You may be able to manage this using the sqlalchemy event interface. So before you create the first connection, set up a listener along the lines of
from sqlalchemy import event
from sqlalchemy.pool import Pool
def set_search_path( db_conn, conn_proxy ):
print "Setting search path..."
db_conn.cursor().execute('set search_path=client9, public')
event.listen(Pool,'connect', set_search_path )
Obviously this needs to be executed before the first connection is created (eg in the application initiallization)
The problem I see with the session.execute(...) solution is that this executes on a specific connection used by the session. However I cannot see anything in sqlalchemy that guarantees that the session will continue to use the same connection indefinitely. If it picks up a new connection from the connection pool, then it will lose the search path setting.
I am needing an approach like this in order to set the application search_path, which is different to the database or user search path. I'd like to be able to set this in the engine configuration, but cannot see a way to do this. Using the connect event does work. I'd be interested in a simpler solution if anyone has one.
On the other hand, if you are wanting to handle multiple clients within an application, then this won't work - and I guess the session.execute(...) approach may be the best approach.
from sqlalchemy 1.1,
this can be done easily using using schema_translation_map.
https://docs.sqlalchemy.org/en/11/changelog/migration_11.html#multi-tenancy-schema-translation-for-table-objects
connection = engine.connect().execution_options(
schema_translate_map={None: "user_schema_one"})
result = connection.execute(user_table.select())
Here is a detailed reviews of all options available:
https://github.com/sqlalchemy/sqlalchemy/issues/4081
It's possible to solve this on DB level. I suppose you have a dedicated user for your application who is granted some privileges on the schema. Just set search_path for him to this schema:
ALTER ROLE your_user IN DATABASE your_db SET search_path TO your_schema;
There is a schema property in Table definitions
I'm not sure if it works but you can try:
Table(CP.get('users', metadata, schema='client1',....)
I tried:
con.execute('SET search_path TO {schema}'.format(schema='myschema'))
and that didn't work for me. I then used the schema= parameter in the init function:
# We then bind the connection to MetaData()
meta = sqlalchemy.MetaData(bind=con, reflect=True, schema='myschema')
Then I qualified the table with the schema name
house_table = meta.tables['myschema.houses']
and everything worked.
You can just change your search_path. Issue
set search_path=client9;
at the start of your session and then just keep your tables unqualified.
You can also set a default search_path at a per-database or per-user level. I'd be tempted to set it to an empty schema by default so you can easily catch any failure to set it.
http://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH
I found none of the above answers worked with SqlAlchmeny 1.2.4. This is the solution that worked for me.
from sqlalchemy import MetaData, Table
from sqlalchemy import create_engine
def table_schemato_psql(schema_name, table_name):
conn_str = 'postgresql://{username}:{password}#localhost:5432/{database}'.format(
username='<username>',
password='<password>',
database='<database name>'
)
engine = create_engine(conn_str)
with engine.connect() as conn:
conn.execute('SET search_path TO {schema}'.format(schema=schema_name))
meta = MetaData()
table_data = Table(table_name, meta,
autoload=True,
autoload_with=conn,
postgresql_ignore_search_path=True)
for column in table_data.columns:
print column.name
I use the following pattern.
engine = sqlalchemy.create_engine("postgresql://postgres:mypass#172.17.0.2/mydb")
for schema in ['schema1', 'schema2']:
engine.execute(CreateSchema(schema))
tmp_engine = engine.execution_options(schema_translate_map = { None: schema } )
Base.metadata.create_all(tmp_engine)
For anyone who is coming here, for a more general solution that can support MYSQL or Oracle, please refer to this guide.
So basically it set the schemas for the engine when the first connection to the database is made.
engine = create_engine("engine_url")
#event.listens_for(engine, "connect", insert=True)
def set_current_schema(dbapi_connection, connection_record):
cursor_obj = dbapi_connection.cursor()
cursor_obj.execute(f"USE {self.schemas_name}")
cursor_obj.close()
the query to execute depends is specific to the database you are using, so for PSQL you will have a different query, for ORACLE, you will have a different, etc.

Showing current/actives engines of SQLAlchemy

I have a function to run queries in a MySQL database and stores the results in a DataFrame, like that:
def getData(connect_string, query, echo=False):
sql_engine = sql.create_engine(connect_string, echo=echo)
df = pd.read_sql_query(query, sql_engine)
return df
This function is called multiple times in multiple files, sometimes passing the same connect_string.
At this moment, I'm concern about memory usage and/or have too many connections on database. My first thought was kill Engine after use, but the official docs says:
The Engine is not synonymous to the DBAPI connect function, which represents just one connection resource - the Engine is most efficient when created just once at the module level of an application, not per-object or per-function call.
So now I'm looking for a way to check if I have an active connection with same connect_string and use they instead of create a new one, but I cannot find any reference/documentation explaining how to do this.
Right now, I'm considering create a dict to store created connections, like that:
engines = {'mysql+pymysql://scott:tiger#localhost/test1': <Engine Object1>,
'mysql+pymysql://scott:tiger#localhost/test2': <Engine Object2>}
And then modify my function to this:
def getData(connect_string, query, echo=False):
if connect_string not in engines.keys():
sql_engine = sql.create_engine(connect_string, echo=echo)
engines[connect_string] = sql_engine
else:
sql_engine = engines[connect_string]
df = pd.read_sql_query(query, sql_engine)
return df
But I don't think that is the best approach to do this.
Any thoughts?

How to target PostgreSQL schema in SQLAlchemy DB URI? [duplicate]

We host a multitenant app with SQLAlchemy and postgres. I am looking at moving from having separate databases for each tenant to a single database with multiple schemas. Does SQLAlchemy support this natively? I basically just want every query that comes out to be prefixed with a predetermined schema... e.g
select * from client1.users
instead of just
select * from users
Note that I want to switch the schema for all tables in a particular request/set of requests, not just a single table here and there.
I imagine that this could be accomplished with a custom query class as well but I can't imagine that something hasn't been done in this vein already.
well there's a few ways to go at this and it depends on how your app is structured. Here is the most basic way:
meta = MetaData(schema="client1")
If the way your app runs is one "client" at a time within the whole application, you're done.
But what may be wrong with that here is, every Table from that MetaData is on that schema. If you want one application to support multiple clients simultaneously (usually what "multitenant" means), this would be unwieldy since you'd need to create a copy of the MetaData and dupe out all the mappings for each client. This approach can be done, if you really want to, the way it works is you'd access each client with a particular mapped class like:
client1_foo = Client1Foo()
and in that case you'd be working with the "entity name" recipe at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName in conjunction with sometable.tometadata() (see http://docs.sqlalchemy.org/en/latest/core/metadata.html#sqlalchemy.schema.Table.tometadata).
So let's say the way it really works is multiple clients within the app, but only one at a time per thread. Well actually, the easiest way to do that in Postgresql would be to set the search path when you start working with a connection:
# start request
# new session
sess = Session()
# set the search path
sess.execute("SET search_path TO client1")
# do stuff with session
# close it. if you're using connection pooling, the
# search path is still set up there, so you might want to
# revert it first
sess.close()
The final approach would be to override the compiler using the #compiles extension to stick the "schema" name in within statements. This is doable, but would be tricky as there's not a consistent hook for everywhere "Table" is generated. Your best bet is probably setting the search path on each request.
If you want to do this at the connection string level then use the following:
dbschema='schema1,schema2,public' # Searches left-to-right
engine = create_engine(
'postgresql+psycopg2://dbuser#dbhost:5432/dbname',
connect_args={'options': '-csearch_path={}'.format(dbschema)})
But, a better solution for a multi-client (multi-tenant) application is to configure a different db user for each client, and configure the relevant search_path for each user:
alter role user1 set search_path = "$user", public
It can now be done using schema translation map in Sqlalchemy 1.1.
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
__table_args__ = {'schema': 'per_user'}
On each request, the Session can be set up to refer to a different schema each time:
session = Session()
session.connection(execution_options={
"schema_translate_map": {"per_user": "account_one"}})
# will query from the ``account_one.user`` table
session.query(User).get(5)
Referred it from the SO answer here.
Link to the Sqlalchemy docs.
You may be able to manage this using the sqlalchemy event interface. So before you create the first connection, set up a listener along the lines of
from sqlalchemy import event
from sqlalchemy.pool import Pool
def set_search_path( db_conn, conn_proxy ):
print "Setting search path..."
db_conn.cursor().execute('set search_path=client9, public')
event.listen(Pool,'connect', set_search_path )
Obviously this needs to be executed before the first connection is created (eg in the application initiallization)
The problem I see with the session.execute(...) solution is that this executes on a specific connection used by the session. However I cannot see anything in sqlalchemy that guarantees that the session will continue to use the same connection indefinitely. If it picks up a new connection from the connection pool, then it will lose the search path setting.
I am needing an approach like this in order to set the application search_path, which is different to the database or user search path. I'd like to be able to set this in the engine configuration, but cannot see a way to do this. Using the connect event does work. I'd be interested in a simpler solution if anyone has one.
On the other hand, if you are wanting to handle multiple clients within an application, then this won't work - and I guess the session.execute(...) approach may be the best approach.
from sqlalchemy 1.1,
this can be done easily using using schema_translation_map.
https://docs.sqlalchemy.org/en/11/changelog/migration_11.html#multi-tenancy-schema-translation-for-table-objects
connection = engine.connect().execution_options(
schema_translate_map={None: "user_schema_one"})
result = connection.execute(user_table.select())
Here is a detailed reviews of all options available:
https://github.com/sqlalchemy/sqlalchemy/issues/4081
It's possible to solve this on DB level. I suppose you have a dedicated user for your application who is granted some privileges on the schema. Just set search_path for him to this schema:
ALTER ROLE your_user IN DATABASE your_db SET search_path TO your_schema;
There is a schema property in Table definitions
I'm not sure if it works but you can try:
Table(CP.get('users', metadata, schema='client1',....)
I tried:
con.execute('SET search_path TO {schema}'.format(schema='myschema'))
and that didn't work for me. I then used the schema= parameter in the init function:
# We then bind the connection to MetaData()
meta = sqlalchemy.MetaData(bind=con, reflect=True, schema='myschema')
Then I qualified the table with the schema name
house_table = meta.tables['myschema.houses']
and everything worked.
You can just change your search_path. Issue
set search_path=client9;
at the start of your session and then just keep your tables unqualified.
You can also set a default search_path at a per-database or per-user level. I'd be tempted to set it to an empty schema by default so you can easily catch any failure to set it.
http://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH
I found none of the above answers worked with SqlAlchmeny 1.2.4. This is the solution that worked for me.
from sqlalchemy import MetaData, Table
from sqlalchemy import create_engine
def table_schemato_psql(schema_name, table_name):
conn_str = 'postgresql://{username}:{password}#localhost:5432/{database}'.format(
username='<username>',
password='<password>',
database='<database name>'
)
engine = create_engine(conn_str)
with engine.connect() as conn:
conn.execute('SET search_path TO {schema}'.format(schema=schema_name))
meta = MetaData()
table_data = Table(table_name, meta,
autoload=True,
autoload_with=conn,
postgresql_ignore_search_path=True)
for column in table_data.columns:
print column.name
I use the following pattern.
engine = sqlalchemy.create_engine("postgresql://postgres:mypass#172.17.0.2/mydb")
for schema in ['schema1', 'schema2']:
engine.execute(CreateSchema(schema))
tmp_engine = engine.execution_options(schema_translate_map = { None: schema } )
Base.metadata.create_all(tmp_engine)
For anyone who is coming here, for a more general solution that can support MYSQL or Oracle, please refer to this guide.
So basically it set the schemas for the engine when the first connection to the database is made.
engine = create_engine("engine_url")
#event.listens_for(engine, "connect", insert=True)
def set_current_schema(dbapi_connection, connection_record):
cursor_obj = dbapi_connection.cursor()
cursor_obj.execute(f"USE {self.schemas_name}")
cursor_obj.close()
the query to execute depends is specific to the database you are using, so for PSQL you will have a different query, for ORACLE, you will have a different, etc.

SqlAlchemy Python multiple databases

I'm using SqlAlchemy to access multiple databases (on the same server). My current connection string is the following
connect_string = "mssql+pyodbc://{0}:{1}#{2}/{3}".format(USERNAME_R, PASSWORD_R, SERVER_R, DATABASE_R)
engine = create_engine(connect_string)
session = sessionmaker(bind=engine)()
Record = declarative_base(engine)
How do I modify this declaration to be able to connect to multiple databases on the same server (e.g. DATABASE1 & DATABASE2). It seems like this is pointing in the right direction, but it's not very clear and I'm not sure if it is specific to Flask:
https://pythonhosted.org/Flask-SQLAlchemy/binds.html
Hi you can achieve this using follwing .
engines = {
'drivers':create_engine('postgres://postgres:admin#localhost:5432/Drivers'),
'dispatch':create_engine('postgres://postgres:admin#localhost:5432/dispatch')
}
i have two databases in sever and two tables.After that you can using Routing class to route for specific database connection while making a query :
class RoutingSession(Session):
def get_bind(self, mapper=None, clause=None):
if mapper and issubclass(mapper.class_, drivers):
return engines['drivers']
elif self._flushing:
return engines['dispatch']
now you can fire queries accordingly for eg first you need to make a session using like this :
Session = sessionmaker(class_=RoutingSession)
session = Session()
driverssql = session.query(drivers).all()
this was you can use multiple database in sqlalchemy
Adding a bit more of insight, the official SQLAlchemy documentation now has a couple more strategies than the ones presented on the rest of the answers.
Just take a look at the following links:
Partitioning Strategies (e.g. multiple database backends per Session)
Session API - Session and sessionmaker() - 'binds' parameter
(A hand-wavy answer because of lack of time, sorry.)
Do not bind an engine to declarative_base. Inherit form just declarative_base().
Instead, pass a specific engine to sqlalchemy.orm.sessionmaker when creating a session. Create different transaction factories (that sessionmaker returns) for different engines. In your queries, use .with_session to bind to a specific session, and thus a specific engine.

How to directly query a relational/SQL database from a ZOPE product?

How do I connect to and query a relational database that was configured with a DA e.g. ZPsycopgDA in a Zope product?
I want to send my own SQL queries using bound parameters and receive the results preferrable as a set of Result objects.
I don't want to use ZSQLMethods as I can't create one for every query beforehand, and ZSQLMethods don't support bound parameters.
The Zope Database Adapters (DA) code base is some of the oldest code still in use in Zope, and is thus somewhat archaic. When you call a Zope DA, you get a database connection object (a DB instance) that in turn has a query method:
connection = context.idOfZPsycoPGDA()
connection.query('SELECT * FROM your_table')
The query method is not exactly a Python database API standard method. It accepts multiple SQL statements separated by \0 null characters, but won't support SELECTs if there is more than one statement.
The ZPsychoDA query method also accepts query parameters:
connection.query('SELECT * FROM your_table WHERE id=?', ('yourid',))
but more importantly, the ZPsychoDA adapter also gives you access to a regular database cursor:
c = connection.cursor()
c.query('SELECT * FROM your_table WHERE id=?', ('yourid',))
My advise is to just use that and do regular Python DB API calls via the database cursor you get there.
The database connector instances should provide a query() method accepting a SQL command. And you get hold of the database adapter instance through acquisition (or traversal via restrictedTraverse('/path/to/da'):
result = context.my_zpyscopgda.query("select * from foo")
result = context.restrictedTraverse('/path/to/my_zpyscopgda').query("select * from foo")

Categories