Checking database connection with flask_mongoengine - python

I am practicing python with flask, flask_mongoengine and mongoDB.
I have followed various tutorials and for the moment am using application factory to set up my App (webservice for querying a database).
Code looks something like this:
db = MongoEngine()
def initialise_db(app):
    db.init_app(app)
def create_app():
    app = Flask(__name__)
    api = Api(app)
    initialise_db(app)
    initialise_routes(api)
    return app
I'd like to know if there is a way to check that the database connection is OK before creating the app. Here, initialise_db(app) runs without error even if no MongoDB instance is running (on port 27017, or another port if specified in app.config).
I only get an error at the point where I try to do a db.Document.save()...
EDIT:
I have implemented a simple check_connection() function where I try to save a generic document in my DB, and raise an exception if this fails:
from db_models import checkElement
def check_connection():
    try:
        checkElement(checkID=1).save()
        saved = True
    except Exception as e:
        saved = False
        raise e
    finally:
        if saved:
            checkElement.objects.get(checkID=1).delete()
I'm pretty sure this is not a very clean way to check for a DB connection. Is there a better way to do the same with mongoengine/pymongo?
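One lighter-weight option, sketched here under assumptions rather than taken from the thread, is to send the server a "ping" command instead of writing and deleting a document. The helper names make_client and check_connection below are made up for illustration. With mongoengine, the underlying pymongo client should be obtainable via mongoengine.connection.get_connection() once init_app has run, though that may depend on the installed versions.

```python
def make_client(uri, timeout_ms=2000):
    """Create a MongoClient that gives up on server selection quickly."""
    # Imported lazily so check_connection below has no hard pymongo dependency.
    from pymongo import MongoClient

    # MongoClient connects lazily; the timeout only matters on first real use.
    return MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)


def check_connection(client):
    """Raise RuntimeError if the server does not answer a ping."""
    try:
        # "ping" is a cheap server command; it needs no collection and no write.
        client.admin.command("ping")
    except Exception as exc:
        # In real code this could be narrowed to pymongo.errors.ConnectionFailure.
        raise RuntimeError("MongoDB is not reachable") from exc
```

Calling check_connection at the start of create_app() would make the factory fail fast instead of failing on the first save().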

Related

Flask session object does not persist between requests despite hardcoded secret key

I am currently running into an issue deploying a Flask app on Amazon's EB2 service. The Flask app works locally. When it is deployed, however, it only works for the first person who clicks the link. After that it throws the following error:
Internal Server Error The server encountered an internal error and was
unable to complete your request. Either the server is overloaded or
there is an error in the application.
The error concerns the Flask session: it becomes empty after routing from one page to another. I also noticed that the before_first_request function detailed below is run only once, for the first user, and never again, which is even more bewildering.
Here's the minimal example:
import logging
from flask import Flask, render_template, request, session, url_for
application = Flask(__name__)
application.secret_key = "mysecretkey"
@application.before_first_request
def before_first_request():
    """ these commands are run before the first request"""
    # setup logging
    application.logger.setLevel(logging.INFO)
    application.logger.info('starting up Flask')
    # clear session
    session.clear()
    # load in PID
    session['pid'] = 123
    # add parameters to the session
    params = dict()
    params['parameter'] = 0
    session['params'] = params
    application.logger.info(session) # it is printing the session as expected
    return 'OK'
@application.route('/')
def main():
    """ landing page """
    application.logger.info(session) # empty
    application.logger.info(application.secret_key) # as expected
    params, results = session.pop('params'), session.pop('results') # throws out the error
    return render_template('empty_template.jinja', args = session)
I am wondering if anyone might know what is going on and how to resolve the issue?
I managed to solve it.
The problem was that the @before_first_request decorator runs the function only once, before the first request ever made to the app. Hence, the session was only created and populated once, for the first user.
I fixed it by calling the before_first_request function at the top of the main function.
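An alternative to calling the setup function by hand in every view is to register it with before_request, which runs on each request. Note that before_first_request has been removed in recent Flask releases. The sketch below is only an illustration: the helper names are invented, the default values mirror the question, and the "results" default of None is an assumption, since the question never shows where it is set.

```python
def ensure_session_defaults(sess):
    """Fill in per-client defaults without overwriting existing values."""
    sess.setdefault("pid", 123)
    sess.setdefault("params", {"parameter": 0})
    sess.setdefault("results", None)  # assumed default, not from the question
    return sess


def register_session_defaults(app):
    """Run ensure_session_defaults before every request handled by app."""
    # Imported lazily so the helper above can be exercised with a plain dict.
    from flask import session

    @app.before_request
    def _populate_session():
        # The session is per client, so each visitor gets their own defaults.
        ensure_session_defaults(session)
```

Because setdefault leaves existing keys alone, values a user has already accumulated survive later requests.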

How to actually use pymongo ChangeStreams with Flask in a non-blocking way?

I am learning flask and PyMongo right now and came across ChangeStreams. I do understand how ChangeStreams work but I have only worked with them in Node and Express. I have implemented ChangeStreams in my Flask app as following:
with ms.db.collection.watch() as stream:
    for change in stream:
        print(change)
The official docs say that it's a blocking method. How would I go about making it non-blocking? Currently my ChangeStream logic is in a different file that I import into server.py, so execution never gets past that import and the Flask app doesn't start at all. Below is my server.py
from flask import Flask, render_template, request
import mongo_starter as ms
import changestream as cs
app = Flask(__name__)
@app.route('/')
def home():
    return render_template('index.html')
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
Below is my ChangeStream.py
import mongo_starter as ms
with ms.db.collection.watch() as stream:
    for change in stream:
        print(change)
Below is my MongoStarter.py that actually initiates the connection to Mongo
import pymongo
import mongo_config as mc
print(mc.data_header)
try:
    print('Connecting to Database...')
    mongo_client = pymongo.MongoClient(mc.mongo_url)
    db = mongo_client['PyMongo']
    collection = db['Test Data']
    print("Connection to Database Successful!")
except pymongo.errors.InvalidURI:
    print('Error Connecting to Database')
When I run the app using nodemon it prints the following to the output.
[nodemon] restarting due to changes...
[nodemon] starting `python server.py`
----------------- MONGO CONNECTION LOG --------------------
Connecting to Database...
Connection to Database Successful!
So it never actually gets past the change stream loop. How can I make it work asynchronously? I have looked at asyncio, but wanted to see if there is a way to implement it without asyncio.
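One way to keep the blocking loop off the main thread without asyncio is a daemon thread. This is a minimal sketch, not a tested answer from the thread: watch_in_background is a made-up helper name, and it assumes the collection object behaves like a pymongo collection whose watch() returns an iterable context manager.

```python
import threading


def watch_in_background(collection, on_change):
    """Consume collection.watch() on a daemon thread and return the thread."""

    def _consume():
        # watch() blocks while iterating, so it runs off the main thread.
        with collection.watch() as stream:
            for change in stream:
                on_change(change)

    # daemon=True lets the process exit even while the stream is still open.
    thread = threading.Thread(target=_consume, name="change-stream", daemon=True)
    thread.start()
    return thread
```

In server.py this could be called as watch_in_background(ms.collection, print) just before app.run(), in place of importing a module that loops at import time. Keep in mind that change streams need a replica set or sharded cluster, and that a debug reloader may start the watcher twice.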

Query hive from flask

I am new to Flask and I am using the following Flask cookiecutter to start a quick prototype. The main idea of the project is to collect data from a Hive cluster and push it to the end user using Flask.
Although I was able to connect Flask to the Hive server using the pyhive connector, I am getting a weird issue related to the select limit when I try to query more than 50 items.
In my case I built a Hive class around pyhive, modelled on the Flask extension development demo:
from pyhive import hive
from flask import current_app
# Find the stack on which we want to store the database connection.
# Starting with Flask 0.9, the _app_ctx_stack is the correct one,
# before that we need to use the _request_ctx_stack.
try:
    from flask import _app_ctx_stack as stack
except ImportError:
    from flask import _request_ctx_stack as stack
class Hive(object):
    def __init__(self, app=None):
        self.app = app
        if app is not None:
            self.init_app(app)
    def init_app(self, app):
        # Use the newstyle teardown_appcontext if it's available,
        # otherwise fall back to the request context
        if hasattr(app, 'teardown_appcontext'):
            app.teardown_appcontext(self.teardown)
        else:
            app.teardown_request(self.teardown)
    def connect(self):
        return hive.connect(current_app.config['HIVE_DATABASE_URI'], database="orc")
    def teardown(self, exception):
        ctx = stack.top
        if hasattr(ctx, 'hive_db'):
            ctx.hive_db.close()
        return None
    @property
    def connection(self):
        ctx = stack.top
        if ctx is not None:
            if not hasattr(ctx, 'hive_db'):
                ctx.hive_db = self.connect()
            return ctx.hive_db
and created an endpoint to load data from hive:
@blueprint.route('/hive/<limit>')
def connect_to_hive(limit):
    cur = hive.connection.cursor()
    query = "select * from part_raw where year=2018 LIMIT {0}".format(limit)
    cur.execute(query)
    res = cur.fetchall()
    return jsonify(data=res)
On the first run everything works fine if I limit the query to 50 items, but as soon as I increase the limit, the request hangs and nothing loads. However, when I load the data using Jupyter notebooks it works fine, which is why I suspect that I missed something in my Flask code.
The issue was caused by library versions. I solved it by adding the following to my requirements:
# Hive with needed dependencies
sasl==0.2.1
thrift==0.11.0
thrift-sasl==0.3.0
PyHive==0.6.1
The old version was as follows:
sasl>=0.2.1
thrift>=0.10.0
#thrift_sasl>=0.1.0
git+https://github.com/cloudera/thrift_sasl # Using master branch in order to get Python 3 SASL patches
PyHive==0.6.1
This is what the development requirements files within the pyhive project state.
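Since the fix came down to pinned versions, it can help to print what is actually installed in the environment that Flask runs in and compare it with the notebook environment. The sketch below uses only the standard library (Python 3.8+). The function name installed_versions is invented for illustration, and the distribution names are taken from the requirements listed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    pins = ["sasl", "thrift", "thrift-sasl", "PyHive"]
    for name, ver in installed_versions(pins).items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in both environments shows quickly whether they resolve to different thrift or thrift-sasl releases.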

how to create pymongo connection per request in Flask

In my Flask application, I hope to use pymongo directly. But I am not sure what's the best way to create pymongo connection for each request and how to reclaim the connection resource.
I know Connection in pymongo is thread-safe and has built-in pooling. I guess I need to create a global Connection instance, and use before_request to put it in flask g.
In the app.py:
from flask import Flask, g
from pymongo import Connection
from admin.views import admin
app = Flask(__name__)
connection = Connection()
db = connection['test']
@app.before_request
def before_request():
    g.db = db
@app.teardown_request
def teardown_request(exception):
    if hasattr(g, 'db'):
        # FIX
        pass
In admin/views.py:
from flask import g
@admin.route('/')
def index():
    # do something with g.db
    pass
It actually works. So my questions are:
Is this the best way to use Connection in flask?
Do I need to explicitly reclaim resources in teardown_request and how to do it?
I still think this is an interesting question, but there has been no response so far, so here is my update.
For the first question, I think using current_app is clearer in Flask.
In app.py
app = Flask(__name__)
connection = Connection()
db = connection['test']
app.db = db
In view.py
from flask import current_app
db = current_app.db
# do anything with db
By using current_app, you can also use an application factory to create more than one app, as described at http://flask.pocoo.org/docs/patterns/appfactories/
And for the second question, I'm still figuring it out.
Here's an example using the flask-pymongo extension.
Put your MongoDB URI (up to and including the database name) in app.config, like below:
app.config['MONGO_URI'] = 'mongodb://192.168.1.1:27017/your_db_name'
mongo = PyMongo(app, config_prefix='MONGO')
Then, in the API method where you need the database, do the following:
db = mongo.db
Now you can work on this db connection and get your data:
users_count = db.users.count()
I think what you present is ok. Flask is almost too flexible in how you can organize things, not always presenting one obvious and right way. You might make use of the flask-pymongo extension which adds a couple of small conveniences. To my knowledge, you don't have to do anything with the connection on request teardown.
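For readers on current PyMongo: Connection was removed in PyMongo 3.0 and MongoClient is its replacement. It pools connections internally, so one shared client per app is the usual arrangement. The sketch below is one possible way to combine that with an application factory. The names init_mongo, get_db and the "mongo_client" key are hypothetical, and the client_factory parameter exists only so the helper can be tried without a running server.

```python
def init_mongo(app, client_factory=None):
    """Create one client for the app and store it in app.extensions."""
    if client_factory is None:
        # Imported lazily so the helper can be exercised without pymongo.
        from pymongo import MongoClient

        client_factory = MongoClient
    # MongoClient connects lazily and keeps its own connection pool.
    client = client_factory(app.config["MONGO_URI"])
    app.extensions["mongo_client"] = client  # hypothetical key name
    return client


def get_db(app, name):
    """Return a database handle from the app's shared client."""
    return app.extensions["mongo_client"][name]
```

Inside a view, get_db(current_app, 'test') would then return the database without any per-request setup or teardown.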

How do I make one instance in Python that I can access from different modules?

I'm writing a web application that connects to a database. I'm currently using a variable in a module that I import from other modules, but this feels nasty.
# server.py
from hexapoda.application import application
if __name__ == '__main__':
    from paste import httpserver
    httpserver.serve(application, host='127.0.0.1', port='1337')
# hexapoda/application.py
from mongoalchemy.session import Session
db = Session.connect('hexapoda')
import hexapoda.tickets.controllers
# hexapoda/tickets/controllers.py
from hexapoda.application import db
def index(request, params):
    tickets = db.query(Ticket)
The problem is that I get multiple connections to the database (I guess that because I import application.py in two different modules, the Session.connect() function gets executed twice).
How can I access db from multiple modules without creating multiple connections (i.e. only call Session.connect() once in the entire application)?
Try the Twisted framework with something like:
from twisted.enterprise import adbapi
class db(object):
    def __init__(self):
        self.dbpool = adbapi.ConnectionPool('MySQLdb',
                                            db='database',
                                            user='username',
                                            passwd='password')
    def query(self, sql):
        # runInteraction runs _query in a pool thread and returns a Deferred
        return self.dbpool.runInteraction(self._query, sql)
    def _query(self, tx, sql):
        tx.execute(sql)
        print(tx.fetchone())
That's probably not what you want to do - a single connection per app means that your app can't scale.
The usual solution is to connect to the database when a request comes in and store that connection in a variable with "request" scope (i.e. it lives as long as the request).
A simple way to achieve that is to put it in the request:
request.db = ...connect...
Your web framework probably offers a way to annotate methods or something like a filter which sees all requests. Put the code to open/close the connection there.
If opening connections is expensive, use connection pooling.
