How can I update a flask session inside a python thread? The below code is throwing this error:
*** RuntimeError: working outside of request context
import threading
from flask import session

def test(ses):
    ses['test'] = "test"

@app.route('/test', methods=['POST', 'GET'])
def mytest():
    t = threading.Thread(target=test, args=(session, ))
    t.start()
When you execute t.start(), you are creating an independent thread of execution which is not synchronized with the execution of the main thread in any way.
The Flask session object is only defined in the context of a particular HTTP request.
What does the variable session mean in the second thread (t)?
When t executes, there is no guarantee that the user request from the main thread still exists or is in a modifiable state. Perhaps the HTTP request has already been fully handled in the main thread.
Flask detects that you are trying to manipulate an object that is dependent on a particular context, and that your code is not running in that context. So it raises an exception.
There are a variety of approaches to synchronizing output from multiple threads into a single request context but... what are you actually trying to do here?
None of the documentation I've seen really elaborates why this isn't possible in this framework - it's as if they have never heard of the use case.
In a nutshell, the built-in session uses the user's browser (the cookie) as storage for the session. This is not what I understand sessions to be, and oh boy, the security issues: don't store any secrets in there. The session is basically JSON-encoded, compressed, then set as a cookie. At least it's signed, I guess.
Flask-Session mitigates the security issues by behaving more like sessions do in other frameworks: the cookie is just an opaque identifier meaningful only in the back end. But the value changes every time the session changes, requiring the cookie to be sent to the browser again. A background thread won't have access to the request once it has completed, so all you have is a one-way transfer of data: out of the session and into your background task.
Might I suggest the baggage claim pattern? Your initial request handling function designates some key in some shared storage - a file on disk, a row in a database identified by some key, an object key in an in memory cache - whatever - and puts that in the session, then passes the session to your background process which can inspect the session for the location to place the results. Your subsequent request handling functions can then check this location for the results.
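The baggage claim pattern described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not a drop-in Flask implementation: the plain dict named session stands in for flask.session, and RESULTS, start_request, poll_request and background_task are hypothetical names invented for this sketch. It hands the background thread only the claim key rather than the session object itself, which is one way to keep the thread from touching request-bound state.

```python
import threading
import uuid

# Hypothetical shared storage; a real app would use a database, cache, or file.
RESULTS = {}
RESULTS_LOCK = threading.Lock()


def background_task(claim_key, payload):
    """Do the slow work, then leave the result at the agreed location."""
    result = payload.upper()  # stand-in for real work
    with RESULTS_LOCK:
        RESULTS[claim_key] = result


def start_request(session, payload):
    """First request: pick a claim key, store it in the session, start the work."""
    claim_key = uuid.uuid4().hex
    session["claim_key"] = claim_key  # written while the request is still active
    worker = threading.Thread(target=background_task, args=(claim_key, payload))
    worker.start()
    return worker


def poll_request(session):
    """Later request: check shared storage for the result, if it is there yet."""
    with RESULTS_LOCK:
        return RESULTS.get(session.get("claim_key"))


session = {}  # plain dict standing in for flask.session (assumption)
worker = start_request(session, "test")
worker.join()  # only so this demo is deterministic
print(poll_request(session))
```

In a real application the later request would simply get None back until the background work has finished, and could tell the client to try again.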
Related
I am writing an application which connects to a database. I want to create that db connection once, and then reuse that connection throughout the life of the application.
I also want to authenticate users. A user's auth will live for only the life of a request.
How can I differentiate between objects stored for the life of a flask app, versus specific to the request? Where would I store them so that all modules (and subsequent blueprints) have access to them?
Here is my sample app:
from flask import Flask, g

app = Flask(__name__)

@app.before_first_request
def setup_database(*args, **kwargs):
    print('before first request', g.__dict__)
    g.database = 'DATABASE'
    print('after first request', g.__dict__)

@app.route('/')
def index():
    print('request start', g.__dict__)
    g.current_user = 'USER'
    print('request end', g.__dict__)
    return 'hello'

if __name__ == '__main__':
    app.run(debug=True, port=6001)
When I run this (Flask 0.10.1) and navigate to http://localhost:6001/, here is what shows up in the console:
$ python app.py
* Running on http://127.0.0.1:6001/
* Restarting with reloader
before first request {}
after first request {'database': 'DATABASE'}
request start {'database': 'DATABASE'}
request end {'current_user': 'USER', 'database': 'DATABASE'}
127.0.0.1 - - [30/Sep/2013 11:36:40] "GET / HTTP/1.1" 200 -
request start {}
request end {'current_user': 'USER'}
127.0.0.1 - - [30/Sep/2013 11:36:41] "GET / HTTP/1.1" 200 -
That is, the first request is working as expected: flask.g is holding my database, and when the request starts, it also has my user's information.
However, upon my second request, flask.g is wiped clean! My database is nowhere to be found.
Now, I know that flask.g used to apply to the request only. But now that it is bound to the application (as of 0.10), I want to know how to bind variables to the entire application, rather than just a single request.
What am I missing?
edit: I'm specifically interested in MongoDB - and in my case, maintaining connections to multiple Mongo databases. Is my best bet to just create those connections in __init__.py and reuse those objects?
flask.g will only store things for the duration of a request. The documentation mentioned that the values are stored on the application context rather than the request, but that is more of an implementation issue: it doesn't change the fact that objects in flask.g are only available in the same thread, and during the lifetime of a single request.
For example, in the official tutorial section on database connections, the connection is made once at the beginning of the request, then terminated at the end of the request.
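A small sketch of that lifetime, assuming a reasonably recent Flask is installed. Each with block pushes and pops an application context by hand, which is used here as a stand-in for what Flask does around each request; the names seen_in_first and seen_in_second are just for illustration.

```python
from flask import Flask, g

app = Flask(__name__)

# Each "with" block stands in for one request: a fresh application
# context is pushed on entry and discarded on exit.
with app.app_context():
    g.database = "DATABASE"
    seen_in_first = "database" in g

with app.app_context():
    # A new context means a new, empty g.
    seen_in_second = "database" in g

print(seen_in_first, seen_in_second)
```

Anything that must outlive a request therefore has to live somewhere other than g, for example in a module-level object or an external store.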
Of course, if you really wanted to, you could create the database connection once, store it in __init__.py, and reference it (as a global variable) as needed. However, you shouldn't do this: the connection could close or time out, and you could not use the connection in multiple threads.
Since you didn't specify HOW you will be using Mongo in Python, I assume you will be using PyMongo, since that handles all of the connection pooling for you.
In this case, you would do something like this...
from flask import Flask
from pymongo import MongoClient

# This line of code does NOT create a connection
client = MongoClient()

app = Flask(__name__)

# This can be in __init__.py, or some other file that has imported the "client" attribute
@app.route('/')
def index():
    posts = client.database.posts.find()
    return '\n'.join(str(post) for post in posts)
You could, if you wish, do something like this...
from flask import Flask, g
from pymongo import MongoClient

# This line of code does NOT create a connection
client = MongoClient()

app = Flask(__name__)

@app.before_request
def before_request():
    g.db = client.database

@app.route('/')
def index():
    posts = g.db.posts.find()
    return '\n'.join(str(post) for post in posts)
This really isn't all that different, however it can be helpful for logic that you want to perform on every request (such as setting g.db to a specific database depending on the user that is logged in).
Finally, you can realize that most of the work of setting up PyMongo with Flask is probably done for you in Flask-PyMongo.
Your other question deals with how you keep track of stuff specific to the user that is logged in. Well, in this case, you DO need to store some data that sticks around with the connection. flask.g is cleared at the end of the request, so that's no good.
What you want to use is sessions. A session is a place to store values which (with the default implementation) are kept in a cookie on the user's browser. Since the cookie will be passed along with every request the user's browser makes to your web site, the data you put in the session will be available on each request.
Keep in mind, though, that the session is NOT stored on the server. It is turned into a string that is passed back and forth to the user. Therefore, you can't store things like DB connections onto it. You would instead store identifiers (like user IDs).
User authentication is VERY hard to get right. The security concerns you need to address are amazingly complex. I would strongly recommend using something like Flask-Login to handle this for you. You can still use the session for storing other items as needed, or you can let Flask-Login handle determining the user ID, store the values you need in the database, and retrieve them from the database in every request.
So, in summary, there are a few different ways to do what you want to do. Each has its uses.
Globals are good for items that are thread-safe (such as PyMongo's MongoClient).
flask.g can be used for storing data in the lifetime of a request. With SQLAlchemy-based flask apps, a common thing to do is to ensure that all changes happen at once, at the end of a request using an after_request method. Using flask.g for something like this is very helpful.
The Flask session can be used to store simple data (strings and numbers, not connection objects) that can be used on subsequent requests that come from the same user. This is entirely dependent on using cookies, so at any point the user could delete the cookie and everything in the "session" will be lost. Therefore, you probably want to store much of your data in databases, with the session used to identify the data that relates to the user in the session.
"bound to the application" does not mean what you think it means. It means that g is bound to the currently running request. Quoth the docs:
Flask provides you with a special object that ensures it is only valid for the active request and that will return different values for each request.
It should be noted that Flask's tutorials specifically do not persist database objects, but that this is not normative for any application of substantial size. If you're really interested in diving down the rabbit hole, I suggest a database connection pooling tool. (such as this one, mentioned in the SO answer ref'd above)
I suggest you use session to manage user information. Sessions help you keep information between multiple requests, and Flask already provides you with a session framework.
from flask import session
session['username'] = 'xyz'
Look at the extension Flask-Login. It is well designed to handle user authentications.
For database, I suggest looking at Flask-SQLAlchemy extension. This takes care of initialization, pooling, teardowns etc. for you out of the box. All you need to do is define the database URI in a config and bind it to the application.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/test.db'
db = SQLAlchemy(app)
Scenario:
Major web app w. Python+Flask
Flask login and Flask.session for basic session variables (user-id and session-id)
Flask.session and limitations? (Cookies)
Cookie based and basically persist only at the client side.
For some session variables that will be regularly read (ie, user permissions, custom application config) it feels awkward to carry all that info around in a cookie, at every single page request and response.
Database is too much?
Since the session can be identified at the server side by introducing unique session id at login, some server-side session variable management can be used. Reading this data at the server side from a database also feels like unnecessary overhead.
Question
What is the most efficient way to handle the session variables at the server side?
Perhaps that could be a memory-based solution, but I am worried that different Flask app requests could be executed at different threads that would not share the memory-stored session data, or cause conflicts in case of simultaneous reading-writing.
I am looking for advice and best practice for planning the basic level architecture.
Flask-Caching
What you need is a server-side caching package: Flask-Caching.
A simple setup:
from flask import Flask
from flask_caching import Cache
app = Flask(__name__)
app.config['CACHE_TYPE'] = 'SimpleCache'
cache = Cache(app)
Then an explicit use of a cached variable:
@app.route('/')
def load():
    foo = 'some value'  # placeholder for whatever you want to cache
    cache.set("foo", foo)
    bar = cache.get("foo")
    return bar
There is much more in Flask-Caching and that's the recommended approach by Flask.
For a multithreaded server running under gunicorn, you are better off using app.config['CACHE_TYPE'] = 'FileSystemCache'
Your instinct is correct, it's probably not the way to do it.
Session data should only be ephemeral information that is not too troublesome to lose and recreate. For example, the user will just have to login again to restore it.
Configuration data or anything else that's necessary on the server and that must survive a logout is not part of the session and should be stored in a DB.
Now, if you really need to easily keep this information client-side and it's not too much of a problem if it's lost, then use a session cookie for logged in/out state and a permanent cookie with a long lifespan for the rest of the configuration information.
If the information is too much size-wise, then the only option I can think of is to store the data, other than the logged in/out state, in a DB.
I have created a class in a Bottle application which handles and stores URL information and is instantiated each time an HTTP request is made:
@route('/<fullurl:path>')
def page_req(fullurl=''):
    urlData = urlReq(request.urlparts[1], fullurl)
urlData is the instance name and urlReq is the class name.
Obviously the urlData instance will contain information generated from one request. I'm just wondering what happens if another request comes in before the cycle of the first request has finished and sent its output. Will the second request change the data in urlData or will there be two separate processes each with their own version of urlData?
I've been reading the WSGI processes/threads information and the Bottle docs all afternoon and it's still not immediately clear. I have tried writing a small automated script to fire multiple requests at the development server, but it seems to hold excess requests off until one has finished. I hope I've been clear enough.
bottle.request is a thread-safe instance of LocalRequest(). If accessed from within a request callback, this instance always refers to the current request (even on a multithreaded server).
see http://bottlepy.org/docs/dev/api.html#bottle.request
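To illustrate the idea behind a thread-local object, here is a rough sketch using the standard library's threading.local. It is not Bottle's actual implementation; current, handle and seen are hypothetical names. It also shows that a variable created inside the handler function (like urlData in the question) is a separate object for every call, so two overlapping requests do not overwrite each other's data.

```python
import threading

# Stand-in for a thread-local object such as bottle.request (assumption).
current = threading.local()
seen = {}

# Forces the two simulated requests to overlap in time.
barrier = threading.Barrier(2)


def handle(url):
    current.url = url          # visible only to the thread that set it
    url_data = {"path": url}   # local variable: a new object on every call
    barrier.wait()             # both threads have now written their values
    seen[url] = (current.url, url_data["path"])


threads = [threading.Thread(target=handle, args=(u,)) for u in ("/a", "/b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
```

Each thread reads back its own URL even though both wrote to the same name before either one read it.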
When setting up a Pyramid app and adding settings to the Configurator, I'm having issues understanding how to access information from request, like request.session and such. I'm completely new at using Pyramid and I've searched all over the place for information on this but found nothing.
What I want to do is access information in the request object when sending out exception emails on production. I can't access the request object, since it's not global in the __init__.py file when creating the app. This is what I've got now:
import logging
import logging.handlers
from logging import Formatter
config.include('pyramid_exclog')
logger = logging.getLogger()
gm = logging.handlers.SMTPHandler(('localhost', 25), 'email@email.com', ['email@email.com'], 'Error')
gm.setLevel(logging.ERROR)
logger.addHandler(gm)
This works fine, but I want to include information about the logged in user when sending out the exception emails, stored in session. How can I access that information from __init__.py?
Attempting to make request a global variable, or somehow store a pointer to "current" request globally (if that's what you're going to try with subscribing to NewRequest event) is not a terribly good idea - a Pyramid application can have more than one thread of execution, so more than one request can be active within a single process at the same time. So the approach may appear to work during development, when the application runs in a single thread mode and just one user accesses it, but produce really funny results when deployed to a production server.
Pyramid has pyramid.threadlocal.get_current_request() function which returns thread-local request variable, however, the docs state that:
This function should be used extremely sparingly, usually only in unit
testing code. it’s almost always usually a mistake to use
get_current_request outside a testing context because its usage makes
it possible to write code that can be neither easily tested nor
scripted.
which suggests that the whole approach is not "pyramidic" (same as pythonic, but for Pyramid :)
Possible other solutions include:
look at the exclog.extra_info parameter, which should include the environ and params attributes of the request in the log message
registering exception views would allow completely custom processing of exceptions
Using WSGI middleware, such as WebError#error_catcher or Paste#error_catcher to send emails when an exception occurs
if you want to log not only exceptions but possibly other non-fatal information, maybe just writing a wrapper function would be enough:
if int(request.POST['donation_amount']) >= 1000000:
    send_email("Wake up, we're rich!", authenticated_userid(request))
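The wrapper-function idea can be sketched with the standard logging module alone: the request is passed in explicitly, and the user is attached to the log record through the extra argument. Everything Pyramid-specific here is an assumption made for the sketch: FakeRequest stands in for a real request, the "user" session key is hypothetical, and ListHandler collects messages in memory where the real setup would use the SMTPHandler from the question.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted records in memory so the sketch needs no mail server."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


class FakeRequest:
    """Hypothetical stand-in for a Pyramid request that carries a session."""

    def __init__(self, session):
        self.session = session


def log_with_user(logger, request, message):
    """Take the request as an argument instead of reaching for a global."""
    user = request.session.get("user", "anonymous")
    logger.error(message, extra={"user": user})


handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s user=%(user)s %(message)s"))
logger = logging.getLogger("exception_mail_demo")
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

log_with_user(logger, FakeRequest({"user": "alice"}), "payment failed")
print(handler.messages)
```

Because the request is an ordinary argument, the function behaves the same way no matter how many threads are serving requests at once.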
Using google app engine:
# more code ahead not shown
application = webapp.WSGIApplication([('/', Home)],
                                     debug=True)

def main():
    run_wsgi_app(application)

if __name__ == "__main__":
    main()
If two different users request the webpage from two different machines, will two individual instances of the server be invoked?
Or just one instance of the server is running all the time which handle all the requests?
How about if one user open the webpage twice in the same browser?
Edit:
According to the answers below, one instance may handle requests from different users in turn. Then consider the following fragment of code, taken from the example Google gave:
class User(db.Model):
    email = db.EmailProperty()
    nickname = db.StringProperty()
1, email and nickname here are defined as class variables?
2, All the requests handled by the same instance of server share the same variables and thus by mistake interfere with each other? (Say, one's email appears in another's page)
P.S. I know that I should read the manual and docs more, and I am doing so; however, answers from experienced programmers will really help me understand faster and more thoroughly. Thanks.
An instance can handle many requests over its lifetime. In the python runtime's threading model, each instance can only handle a single request at any given time. If 2 requests arrive at the same time they might be handled one after the other by a single instance, or a second instance might be spawned to handle the request.
EDIT:
In general, variables used by each request will be scoped to a RequestHandler instance's .get() or .post() method, and thus can't "leak" into other requests. You should be careful about using global variables in your scripts, as these will be cached in the instance and would be shared between requests. Don't use globals without knowing exactly why you want to (which is good advice for any application, for that matter), and you'll be fine.
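The difference between handler-local variables and module-level globals can be sketched in plain Python. This does not use the App Engine SDK: Home here is a hypothetical stand-in for a RequestHandler subclass, and the two calls at the bottom play the role of two requests served one after the other by the same instance.

```python
# Module-level state lives as long as the instance and is shared by every request.
request_count = 0


class Home:
    """Hypothetical stand-in for a webapp.RequestHandler subclass."""

    def get(self, email):
        global request_count
        request_count += 1            # shared: survives across requests
        greeting = "Hello, " + email  # local: exists only for this call
        return greeting


# Two "requests" handled in turn by the same instance of the application.
first = Home().get("a@example.com")
second = Home().get("b@example.com")
print(first, second, request_count)
```

The greeting built for one user never appears in the other user's response, while the global counter keeps accumulating across both calls.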
App Engine dynamically builds up and tears down instances based on request volume.
From the docs:
App Engine applications are powered by
any number of instances at any given
time, depending on the volume of
requests received by your application.
As requests for your application
increase, so do the number of
instances powering it.
Each instance has its own queue for
incoming requests. App Engine monitors
the number of requests waiting in each
instance's queue. If App Engine
detects that queues for an application
are getting too long due to increased
load, it automatically creates a new
instance of the application to handle
that load.
App Engine scales instances in reverse
when request volumes decrease. In this
way, App Engine ensures that all of
your application's current instances
are being used to optimal efficiency.
This automatic scaling makes running
App Engine so cost effective.
When an application is not being used
at all, App Engine turns off its
associated instances, but readily
reloads them as soon as they are
needed.