How can I connect to two different databases using MongoClient() in Pymongo? - python

I need to access two different collections, each in its own database on the same server. For example, I need the collection "dummy" in the database "dummy" and the collection "foo" in the database "bar". To connect to a single database I have been using this code:
client = MongoClient()
db = client.dummy
collection = db['dummy']
But if I also add
db1 = client.bar
collection = db1['foo']
This is not working.
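For reference, one MongoClient can hand out handles to as many databases as needed; a minimal sketch using the names from the question ("dummy", "bar", "foo"), with distinct variables so the second collection does not overwrite the first:
from pymongo import MongoClient

client = MongoClient()  # one client, one connection pool

# Each database gets its own handle from the same client.
dummy_db = client["dummy"]
bar_db = client["bar"]

# Use distinct variable names so the second collection
# does not overwrite the first.
dummy_coll = dummy_db["dummy"]
foo_coll = bar_db["foo"]

print(dummy_coll.count_documents({}))
print(foo_coll.count_documents({}))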

Related

Fetch from one database, Insert/Update into another using SQLAlchemy

We have data in a Snowflake cloud database that we would like to move into an Oracle database. As we would like to work toward refreshing the Oracle database regularly, I am trying to use SQLAlchemy to automate this.
I would like to do this using Core because my team is all experienced with SQL, but I am the only one with Python experience. I think it would be easier to tweak the data pulls if we just pass SQL strings. Plus the Snowflake db has some columns with JSON that seems easier to parse using direct SQL since I do not see JSON in the SnowflakeDialect.
I have established connections to both databases and am able to do select queries from both. I have also manually created the tables in our Oracle db so that the keys and datatypes match what I am pulling from Snowflake. When I try to insert, though, my Jupyter notebook just continuously says "Executing Cell" and hangs. Any thoughts on how to proceed or how to get the notebook to tell me where the hangup is?
from sqlalchemy import create_engine, pool, MetaData, text
from snowflake.sqlalchemy import URL
import pandas as pd

eng_sf = create_engine(URL(  # engine for Snowflake
    account='account',
    user='user',
    password='password',
    database='database',
    schema='schema',
    warehouse='warehouse',
    role='role',
    timezone='timezone',
))
eng_o = create_engine(
    "oracle+cx_oracle://{}[{}]:{}@{}".format('user', 'proxy', 'password', 'database'),
    poolclass=pool.NullPool)  # engine for Oracle

meta_o = MetaData()
meta_o.reflect(bind=eng_o)
person_o = meta_o.tables['bb_lms_person']  # other Oracle tables follow this example

meta_sf = MetaData()
meta_sf.reflect(bind=eng_sf, only=['person'])  # other Snowflake tables as well, but for simplicity, let's look at one
person_sf = meta_sf.tables['person']

person_query = """
SELECT ID
    , EMAIL
    , STAGE:student_id::STRING AS STUDENT_ID
    , ROW_INSERTED_TIME
    , ROW_UPDATED_TIME
    , ROW_DELETED_TIME
FROM cdm_lms.PERSON
"""

with eng_sf.begin() as connection:
    result = connection.execute(text(person_query)).fetchall()  # this snippet runs and returns result as expected

with eng_o.begin() as connection:
    connection.execute(person_o.insert(), result)  # this is a coin flip: sometimes it runs, sometimes it just hangs forever

eng_sf.dispose()
eng_o.dispose()
I've checked the typical offenders. The keys for both person_o and the result are all lowercase and match. Any guidance would be appreciated.
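One way to see where the insert stalls (a hedged sketch, not a confirmed fix): convert the fetched rows to plain dicts and insert them in smaller batches with progress output, so the notebook shows how far it gets before hanging. The chunk size is arbitrary, and the _mapping accessor assumes SQLAlchemy 1.4+ (use dict(row) on 1.3):
rows = [dict(row._mapping) for row in result]  # plain dicts keyed by lowercase column names

chunk_size = 1000  # assumed batch size; tune as needed
with eng_o.begin() as connection:
    for start in range(0, len(rows), chunk_size):
        batch = rows[start:start + chunk_size]
        connection.execute(person_o.insert(), batch)
        print("inserted rows", start, "to", start + len(batch))  # progress marker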
Use the table's metadata: reflect fTable_Stage and build the statement with the fluent update() (or insert()) and values() functions, assigning values to the reflected columns. This is safe because only columns known to the metadata can be referenced. Here I am updating three fields: LateProbabilityDNN, Sentiment_Polarity, and Sentiment_Subjectivity.
from sqlalchemy import create_engine, MetaData, Table
from sqlalchemy.orm import sessionmaker

engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)  # params: ODBC connection string built elsewhere
connection = engine.connect()
metadata = MetaData()
Session = sessionmaker(bind=engine)
session = Session()

# Reflect the existing table so its columns are available via .c
fTable_Stage = Table('fTable_Stage', metadata, autoload=True, autoload_with=engine)

stmt = fTable_Stage.update().where(fTable_Stage.c.KeyID == keyID).values(
    LateProbabilityDNN=round(float(late_proba), 2),
    Sentiment_Polarity=round(my_valance.sentiment.polarity, 2),
    Sentiment_Subjectivity=round(my_valance.sentiment.subjectivity, 2),
)
connection.execute(stmt)

How to instantiate DB object from configuration?

I instantiate Mongo Client as below. It works fine. However I am trying to read the DB name (primer here) from the configuration. How do I do that?
from pymongo import MongoClient
client = MongoClient()
db = client.primer # want to read "primer" string from a variable
coll = db.dataset
You could do:
db_name = 'primer'
db = getattr(client, db_name)
If you are connecting to only one database, you can specify the database name while creating the db object itself:
dbname = "primer"
db = MongoClient()[dbname]
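If the name really lives in a configuration file, here is a minimal sketch with configparser (the config.ini file and its [mongo] section are assumptions for illustration):
import configparser
from pymongo import MongoClient

config = configparser.ConfigParser()
config.read('config.ini')              # hypothetical file containing: [mongo] / db_name = primer
db_name = config['mongo']['db_name']

client = MongoClient()
db = client[db_name]                   # dict-style access accepts any string
coll = db.dataset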

Multiple SQLite connections to a database in :memory:

Is it possible to access an in-memory SQLite database from different threads?
In the following sample code I create a SQLite database in memory and create a table. When I then go to a different execution context, which I think I have to do when I go to a different thread, the created table isn't there anymore. If I open a file-based SQLite database instead, the table is still there.
Can I achieve the same behavior for an in-memory database?
from peewee import *

db = SqliteDatabase(':memory:')

class BaseModel(Model):
    class Meta:
        database = db

class Names(BaseModel):
    name = CharField(unique=True)

print(Names.table_exists())  # this returns False
Names.create_table()
print(Names.table_exists())  # this returns True

print(id(db.get_conn()))  # our main thread's connection

with db.execution_context():
    print(Names.table_exists())  # in another context this returns False for :memory: and True for a file-based *.db
    print(id(db.get_conn()))  # a separate connection

print(id(db.get_conn()))  # back to the original connection
This works:
cacheDB = SqliteDatabase('file:cachedb?mode=memory&cache=shared')
Links:
http://charlesleifer.com/blog/managing-database-connections-with-peewee/
https://groups.google.com/forum/#!topic/peewee-orm/78invrt3xyo
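To illustrate the shared-cache idea without peewee, here is a minimal sketch using the standard-library sqlite3 module (the table name and thread usage are just for demonstration): with cache=shared and uri=True, a connection opened in another thread sees the same in-memory database, as long as at least one connection stays open.
import sqlite3
import threading

URI = 'file:cachedb?mode=memory&cache=shared'

# Keep one connection open so the shared in-memory database is not discarded.
main_conn = sqlite3.connect(URI, uri=True)
main_conn.execute('CREATE TABLE names (name TEXT UNIQUE)')
main_conn.commit()

def worker():
    # A second connection, created in another thread, sees the same database.
    conn = sqlite3.connect(URI, uri=True)
    tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
    print(tables)  # [('names',)]
    conn.close()

t = threading.Thread(target=worker)
t.start()
t.join()
main_conn.close()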

Where to put client and db connection in python file

So I am building a Mongo database class that will provide access for inserting documents via an insertion service and for viewing documents via a querying service. Right now I have the following for my database.py class:
import pymongo

client = pymongo.MongoClient('mongodb://localhost:27017/')
db_connection = client['my_database']

class DB_Object(object):
    """ A class providing structure and access to the Database """

    def add_document(self, json_obj):
        coll = db_connection["some collection"]
        document = {
            "name": "imma name",
            "raw value": 777,
            "converted value": 333
        }
        coll.insert(document)

    def query_response(self, query):
        """query logic here"""
If I want concurrent queries and inserts, with this class being called by multiple services, is this the correct location for these lines:
client = pymongo.MongoClient('mongodb://localhost:27017/')
db_connection = client['my_database']
And is this a standard way to provide access?
Your code is correct. You should continue to use the same MongoClient instance for all operations in your application; this ensures that all operations share the same connection pool and use as few connections as possible, which maximizes efficiency. MongoClient is thread-safe, so this will work even if you have concurrent operations on multiple threads.
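A brief sketch of the pattern described above: one module-level client shared by every service object and used from several threads (the collection and document names are placeholders):
import threading
import pymongo

client = pymongo.MongoClient('mongodb://localhost:27017/')  # created once per process
db = client['my_database']

class InsertService(object):
    def add_document(self, doc):
        db['some_collection'].insert_one(doc)

class QueryService(object):
    def count(self):
        return db['some_collection'].count_documents({})

def work(i):
    InsertService().add_document({"name": "doc %d" % i})

threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(QueryService().count())  # all threads shared the same connection pool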

How do I use pymongo to connect to an existing document collection/db?

On the command line, this works:
$ mongo
> show dbs
mydatabase 1.0GB
However, this does not:
$ python
>>> import pymongo
>>> connection = pymongo.MongoClient()
>>> connection.mydatabase.find()
I read through docs here:
http://api.mongodb.org/python/current/tutorial.html
But I do not understand how to either...
connect to an existing database (using pymongo)
query what databases exist in the mongodb connection.
Why can't I access my database?
Connect to an existing database
import pymongo
from pymongo import MongoClient
connection = MongoClient()
db = connection.mydatabase
List existing databases
import pymongo
from pymongo import MongoClient
connection = MongoClient()
# connection.database_names()  # deprecated
connection.list_database_names()
The question implies the user has a local MongoDB. However, I found this question while trying to connect to a remote MongoDB, so I think the tutorial is worth mentioning (no other answer here shows how to specify the host and the port).
The above code will connect on the default host and port. We can also specify the host and port explicitly, as follows:
client = MongoClient('localhost', 27017)
Or use the MongoDB URI format:
client = MongoClient('mongodb://localhost:27017/')
show dbs and find() are totally different commands, so you cannot compare the two.
connection.mydatabase.find()
will actually do nothing, because you cannot find() documents at the database level. You are probably looking for:
cursor = connection.mydatabase.mycol.find()
I am no Python programmer, but it is something like that; you then iterate ("foreach") over the cursor variable to get your data, as shown in the sketch below.
As an added note you will want to replace mycol with the collection name that contains your documents.
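In Python, iterating the cursor looks roughly like this (mycol is assumed to be the collection that holds your documents):
from pymongo import MongoClient

connection = MongoClient()
cursor = connection.mydatabase.mycol.find()

for document in cursor:  # iterate the cursor to get your data
    print(document)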
As for querying for a list of databases you can do something like:
databases = connection.admin.command({'listDatabases': 1})  # listDatabases must be run against the admin database
As shown here: http://docs.mongodb.org/manual/reference/command/listDatabases/#listDatabases
Again, I am no Python programmer, but this should get you started.
On the python command line:
import pymongo
from pymongo import MongoClient
connection = MongoClient() ## connects by default to db at localhost:27017
connection.database_names() ## python binding equivalent to show dbs.
Although there doesn't seem to be a wealth of examples, it appears that the bindings are pretty complete within the Python Driver API Documentation.
database_names() is deprecated. One can use list_database_names() instead.
mongo_db_url will be something like "mongodb://localhost:27017/". 27017 is the default port number; replace it as needed.
from pymongo import MongoClient
client = MongoClient(<mongo_db_url>)
#or client = MongoClient('localhost', 27017)
client.list_database_names()
