Is it possible to access an in-memory SQLite database from different threads?
In the following sample code I create a SQLite database in memory and create a table. When I now go to a different execution context, which I think have to do when I go to a different thread, the created table isn't there anymore. If I would open a file based SQLite database, the table would be there.
Can I achieve the same behavior for an in-memory database?
from peewee import *
db = SqliteDatabase(':memory:')
class BaseModel(Model):
class Meta:
database = db
class Names(BaseModel):
name = CharField(unique=True)
print(Names.table_exists()) # this returns False
Names.create_table()
print(Names.table_exists()) # this returns True
print id(db.get_conn()) # Our main thread's connection.
with db.execution_context():
print(Names.table_exists()) # going to another context, this returns False if we are in :memory: and True if we work on a file *.db
print id(db.get_conn()) # A separate connection.
print id(db.get_conn()) # Back to the original connection.
Working!!
cacheDB = SqliteDatabase('file:cachedb?mode=memory&cache=shared')
Link
http://charlesleifer.com/blog/managing-database-connections-with-peewee/
https://groups.google.com/forum/#!topic/peewee-orm/78invrt3xyo
Related
We have data in a Snowflake cloud database that we would like to move into an Oracle database. As we would like to work toward refreshing the Oracle database regularly, I am trying to use SQLAlchemy to automate this.
I would like to do this using Core because my team is all experienced with SQL, but I am the only one with Python experience. I think it would be easier to tweak the data pulls if we just pass SQL strings. Plus the Snowflake db has some columns with JSON that seems easier to parse using direct SQL since I do not see JSON in the SnowflakeDialect.
I have established connections to both databases and am able to do select queries from both. I have also manually created the tables in our Oracle db so that the keys and datatypes match what I am pulling from Snowflake. When I try to insert, though, my Jupyter notebook just continuously says "Executing Cell" and hangs. Any thoughts on how to proceed or how to get the notebook to tell me where the hangup is?
from sqlalchemy import create_engine,pool,MetaData,text
from snowflake.sqlalchemy import URL
import pandas as pd
eng_sf = create_engine(URL( #engine for snowflake
account = 'account'
user = 'user'
password = 'password'
database = 'database'
schema = 'schema'
warehouse = 'warehouse'
role = 'role'
timezone = 'timezone'
))
eng_o = create_engine("oracle+cx_oracle://{}[{}]:{}#{}".format('user','proxy','password','database'),poolclass=pool.NullPool) #engine for oracle
meta_o = MetaData()
meta_o.reflect(bind=eng_o)
person_o = meta_o['bb_lms_person'] # other oracle tables follow this example
meta_sf = MetaData()
meta_sf.reflect(bind=eng_sf,only=['person']) # other snowflake tables as well, but for simplicity, let's look at one
person_sf = meta_sf.tables['person']
person_query = """
SELECT ID
,EMAIL
,STAGE:student_id::STRING as STUDENT_ID
,ROW_INSERTED_TIME
,ROW_UPDATED_TIME
,ROW_DELETED_TIME
FROM cdm_lms.PERSON
"""
with eng_sf.begin() as connection:
result = connection.execute(text(person_query)).fetchall() # this snippet runs and returns result as expected
with eng_o.begin() as connection:
connection.execute(person_o.insert(),result) # this is a coinflip, sometimes it runs, sometimes it just hangs 5ever
eng_sf.dispose()
eng_o.dispose()
I've checked the typical offenders. The keys for both person_o and the result are all lowercase and match. Any guidance would be appreciated.
use the metadata for the table. the fTable_Stage update or inserted as fluent functions and assign values to lambda variables. This is very safe because only metadata field variables can be used in the lambda. I am updating three fields:LateProbabilityDNN, Sentiment_Polarity, Sentiment_Subjectivity
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
connection=engine.connect()
metadata=MetaData()
Session = sessionmaker(bind = engine)
session = Session()
fTable_Stage=Table('fTable_Stage', metadata,autoload=True,autoload_with=engine)
stmt=fTable_Stage.update().where(fTable_Stage.c.KeyID==keyID).values(\
LateProbabilityDNN=round(float(late_proba),2),\
Sentiment_Polarity=round(my_valance.sentiment.polarity,2),\
Sentiment_Subjectivity= round(my_valance.sentiment.subjectivity,2)\
)
connection.execute(stmt)
I need to create a new database connection(session) to avoid an unexpected commit from a MySql procedure in my django transaction. How to set up it in django?
I have tried to duplicate the database configuration in setting file. It worked for me but it seems not a good solution. See my code for more detail.
#classmethod
def get_sequence_no(cls, name='', enable_transaction=False):
"""
return the sequence no for the given key name
"""
if enable_transaction:
valobj = cls.objects.using('sequence_no').raw("call sp_get_next_id('%s')" % name)
return valobj[0].current_val
else:
valobj = cls.objects.raw("call sp_get_next_id('%s')" % name)
return valobj[0].current_val
Does anyone know how to use a custom database connection to call the procedure?
If you have a look at the django.db module, you can see that django.db.connection is a proxy for django.db.connections[DEFAULT_DB_ALIAS] and django.db.connections is an instance of django.db.utils.ConnectionHandler.
Putting this together, you should be able to get a new connection like this:
from django.db import connections
from django.db.utils import DEFAULT_DB_ALIAS, load_backend
def create_connection(alias=DEFAULT_DB_ALIAS):
connections.ensure_defaults(alias)
connections.prepare_test_settings(alias)
db = connections.databases[alias]
backend = load_backend(db['ENGINE'])
return backend.DatabaseWrapper(db, alias)
Note that this function will open a new connection every time it is called and you are responsible for closing it. Also, the APIs it uses are probably considered internal and might change without notice.
To close the connection, it should be enough to call .close() on the object return by the create_connection function:
conn = create_connection()
# do some stuff
conn.close()
With modern Django you can just do:
from django.db import connections
new_connection = connections.create_connection('default')
I have a class that create the a MongoClient inside:
db = MongoDB ('mydb' , 'config')
I am successfully able to connect to 'mydb' database and 'config' collection - but after querying the collection I do need this connection to database again. I proceed to create connection with another database and collection
db = MongoDB ('mapping' , 'box_details')
In such a case how can I close the connection to DB previously - is it that it would automatically get closed when app exits?
I'd recommend you to open connection using pymongo.MongoClient which will return mongo_client object. mongo_cient has instance method close allowing you to close connection manually.
Please see documentation about mongo_client
There are many questions about is django db connection thread safe, but they all seem to be asking the default request threads.
What if I am writing custom script that uses database connection in threads:
from django.db import connections
import threading
class Transform(object):
def transform_data(self, listing):
cursor = self.connection.cursor()
cursor.execute('SELECT ... WHERE id = %s', listing.id)
data = cursor.fetchall()
...
def run(self):
connection = self.connections['legacy']
for listing in listings:
threading.Thread(target=self.transform_data, args=[listing])
How safe is data inside transform_data thread in terms of the result from cursor is not mixed up with other threads?
Ideally each thread should be using its own connection. If you do that when you execute the select query inside transform_data you are essentially getting a snapshot of the data at that point in time. You can retrieve the rows without having to worry about their being updated or deleted by other threads provided that the other threads have their own connection.
If all threads share the same connection what exactly happens is very dependent on what database you are using and transaction isolation level
Each item in the connections object returns a thread-local connection to that database. By default, these connections cannot be shared between threads; attempting to do so will result in a DatabaseError.
Always use connections[alias] within the thread that executes your queries. Never access connections[alias] in the parent thread and pass the object to the child thread. This will ensure that every connection object you use is local to the current thread, avoiding any threading issues.
To fix your code and make it thread-safe, you would change it like this:
from django.db import connections
import threading
class Transform(object):
def transform_data(self, listing):
# Access the database connection on the global `connections` object
# from within the child thread.
cursor = connections['legacy'].cursor()
cursor.execute('SELECT ... WHERE id = %s', listing.id)
data = cursor.fetchall()
...
def run(self):
for listing in listings:
threading.Thread(target=self.transform_data, args=[listing])
DBSession = sessionmaker(bind=self.engine)
def add_person(name):
s = DBSession()
s.add(Person(name=name))
s.commit()
Everytime I run add_person() another connection is created with my postgreSQL DB.
Looking at:
SELECT count(*) FROM pg_stat_activity;
I see the count going up, until I get a Remaining connection slots are reserved for non-replication superuser connections error.
How do I kill those connections? Am I wrong in opening a new session everytime I want to add a Person record?
In general, you should keep your Session object (here DBSession) separate from any functions that make changes to the database. So in your case you might try something like this instead:
DBSession = sessionmaker(bind=self.engine)
session = DBSession() # create your session outside of functions that will modify database
def add_person(name):
session.add(Person(name=name))
session.commit()
Now you will not get new connections every time you add a person to the database.