I'm building a data analysis flask application and I want to allow the user to save the session object as a file on their local file system (in case their session expires after 31 days). What's the best way to do this?
I've looked into pickling the session object but it doesn't look like the pickle can be sent to the user's computer (pickle.dump simply saves the pickle to the computer hosting the application).
The session is a dictionary. Add an endpoint that dumps the session to JSON, and serves it as a file download.
@app.route('/download_session')
def download_session():
    r = jsonify(dict(session))
    r.headers.set('Content-Disposition', 'attachment', filename='session.json')
    return r
This doesn't seem like a very good idea though. If you're putting enough data in the session that the user would care to look at it, you're putting too much in the session. Also, the session can contain data that's relevant to the web app but not useful to the user. Instead, you probably want to write an endpoint that serves just the data the user needs.
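As a rough sketch of "serve just the data the user needs": filter the session through a whitelist before serializing it. The key names in `EXPORTABLE_KEYS` and the helper `export_user_data` are made up for illustration; a plain dict stands in for the Flask session so the snippet runs on its own.

```python
import json

# Hypothetical whitelist: the keys the user actually cares about.
EXPORTABLE_KEYS = ("inputs", "results")


def export_user_data(session_data, allowed_keys=EXPORTABLE_KEYS):
    """Return a JSON string containing only the whitelisted session keys."""
    subset = {k: session_data[k] for k in allowed_keys if k in session_data}
    return json.dumps(subset, indent=2, sort_keys=True)


# A plain dict stands in for flask.session here.
example = {"inputs": [1, 2, 3], "results": {"mean": 2.0}, "_csrf_token": "abc"}
exported = export_user_data(example)
```

In a real view you would pass `dict(session)` to the helper and return the string as the body of the download response.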
I'm building a data analysis Flask application which takes a ton of user inputs, performs some calculations then projects the results to various web pages. I'm using a pandas Dataframe to store the inputs and perform the calculations. Then I convert the DF to a dictionary and store that in the session object.
I'm having problems due to the fact that the session object can only hold ~4k bytes. Several pages read the data so I need a means to pass this large amount of data (~5k-50k) from one request to another (which the session object does perfectly but for smaller memory sizes).
Can I set the storage limit higher for the session object (I imagine I can't as 4k is the limit for a cookie and the session object is a cookie)? Or is there something else I should do here (store the dict in a database, etc.)?
EDIT:
I'm thinking a viable alternative would be to grab the data from the database (mongodb in my case), store it in a local variable and pass that variable to the template directly. Are there negatives to this? And is there a limit on how much data I can pass directly to a template? See example below:
@app.route('/results')
def results():
    # get data I need from database (~5k-50k bytes)
    data = mongo.db[collection_name].find_one({'key': 'query'})
    # pass directly to template (instead of storing in session object)
    return render_template('results_page.html', data=data)
Yeah this definitely sounds like a case for server-side sessions.
There are code snippets on the official site for the most popular databases.
These shouldn't be hard to migrate to since they use the same SessionMixin interface as the cookie session system.
An even easier approach could be to use Flask-KVSession, which claims that
Integration with Flask is seamless, once the extension is loaded for a Flask application, it transparently replaces Flask’s own Session management. Any application working with sessions should work the same with Flask-KVSession
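To get a feel for why the ~4 KB cookie limit from the question rules out the default session for 5k-50k of data, here is a rough size estimate. It is only an approximation: it mimics the general "serialize, maybe compress, base64-encode" idea and ignores the signature and timestamp that a real signed cookie also carries. The 4093-byte figure and the function name are assumptions; the exact limit varies by browser.

```python
import base64
import json
import zlib

COOKIE_LIMIT_BYTES = 4093  # assumed limit; browsers differ slightly


def estimated_cookie_size(data):
    """Roughly estimate the payload size of a cookie holding `data`."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    compressed = zlib.compress(raw)
    # Keep whichever form is smaller, then base64-encode it for the cookie.
    payload = compressed if len(compressed) < len(raw) else raw
    return len(base64.urlsafe_b64encode(payload))


small = {"user_id": 42}
large = {"rows": [{"id": i, "value": i * 0.5} for i in range(5000)]}

fits_small = estimated_cookie_size(small) < COOKIE_LIMIT_BYTES
fits_large = estimated_cookie_size(large) < COOKIE_LIMIT_BYTES
```

A small identifier fits easily, while a DataFrame-sized dictionary does not, which is the usual sign that the data belongs on the server.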
Flask-Session adds support for server-side sessions.
Here is how you can start Flask-Session (extracted from the official documentation).
from flask import Flask, session
from flask_session import Session

app = Flask(__name__)
# Check Configuration section for more details
SESSION_TYPE = 'redis'
app.config.from_object(__name__)
Session(app)

@app.route('/set/')
def set():
    session['key'] = 'value'
    return 'ok'

@app.route('/get/')
def get():
    return session.get('key', 'not set')
Regarding the size limit on rendering templates, the official Flask documentation has no mention of this.
I don't really believe there is any size limit actually.
If there is any limitation, that might be on Jinja's side.
Basically, what I want to do is regenerate every session with a new set of keys without requiring users to log in again. How can I do this?
edited for clarity
So let's assume we are using Redis as the session backend and keeping a cookie on the client side. The cookie consists only of the session ID, which corresponds to a session stored in Redis. After initializing the extension with Session(APP) in our application, we can fetch the current user's session in any request context with
from flask import session
After an admin changes some general settings in the application, I want to regenerate the session of every current user. However, I can only reach the current user's session, again through
from flask import session
This is as far as I know.
For example, let's say there is a value on the user's session determined as
session['arbitrary_key'] = not_important_database_function()
After the admin changes something in the application, I need to reload that key in the current user's session with
session['arbitrary_key'] = not_important_database_function()
After the admin's changes the function yields a different value, so I reassign the key and then set session.modified to True. What I want to learn is how to change arbitrary_key in the sessions of EVERY USER, because I don't know how to fetch every session and change them from Flask.
If I delete the sessions from Redis, users are required to reauthenticate. I don't want them to reauthenticate. I just want back-end sessions to be changed because I use some information inside of the user's session which needs to be fetched from Redis so I do not have to call not_important_database_function for every request.
I hope this is enough information for you to at least NOT answer but also NOT downvote so I can continue to seek a solution for my problem.
I am not sharing code snippets because no code snippet is helpful for the case in my opinion.
The question is rather old but it looks like many developers are interested in the answer. There are several approaches which come to mind:
1. Lazy calculations
You need a way to differentiate old and new session values, for example by storing a version number in the session. Then you can force clients to update their sessions the next time they hit one of your routes:
CURRENT_VERSION = '1.2.3'

@app.route('/some_route')
def some_handler():
    # Note: string comparison is lexicographic, so '1.10.0' sorts before '1.9.0'
    if session.get('version', '0.0.0') < CURRENT_VERSION:
        session['arbitrary_key'] = not_important_database_function()
        session['version'] = CURRENT_VERSION
Pros: The method is easy to implement and it is agnostic to the way of storing session data (database, user-agent, etc.)
Cons: Session data is not updated instantly after deploy. You have to accept that some users' session data may never be updated at all.
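One caveat with the lazy approach: comparing version strings directly is a lexicographic comparison, which goes wrong once a component reaches two digits. A minimal sketch that compares numerically instead is below. The helper names `parse_version` and `refresh_if_stale` are made up, and a plain dict stands in for the Flask session.

```python
def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def refresh_if_stale(session, current_version, recompute):
    """Recompute the cached value when the stored version is older."""
    stored = session.get("version", "0.0.0")
    if parse_version(stored) < parse_version(current_version):
        session["arbitrary_key"] = recompute()
        session["version"] = current_version
        return True
    return False


# A plain dict stands in for the Flask session here.
fake_session = {"version": "1.9.0", "arbitrary_key": "old"}
changed = refresh_if_stale(fake_session, "1.10.0", lambda: "new")
changed_again = refresh_if_stale(fake_session, "1.10.0", lambda: "newer")
```

With plain strings, "1.9.0" would compare as greater than "1.10.0" and the session would never be refreshed.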
2. Session storage update
You need to perform some kind of database migration on the session storage. It is backend-dependent and will look different for different storages. For Redis it may look like this:
import json
import redis

r = redis.Redis()  # Your connection settings here
for key in r.scan_iter():
    raw_value = r.get(key)
    session = json.loads(raw_value)  # assumes sessions are stored as JSON
    session['arbitrary_key'] = not_important_database_function()
    r.set(key, json.dumps(session))
Pros: Session data for all users will be updated right after your service deployment.
Cons: The method implementations differ for different storages. It is not applicable if all session data is stored in user-agents.
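The same migration idea can be sketched independently of the backend. Below, a plain dict stands in for the session store, and the function skips anything that is not a JSON-encoded dictionary (a real Redis instance may hold unrelated keys). The name `migrate_sessions` and the key layout are assumptions for illustration. Also keep in mind that a plain SET in Redis clears any expiry on the key, so a real migration would need to preserve TTLs.

```python
import json


def migrate_sessions(store, recompute, key_name="arbitrary_key"):
    """Rewrite one key in every JSON-encoded session held in `store`.

    `store` is any mutable mapping of session id -> serialized session.
    Returns the number of sessions that were updated.
    """
    updated = 0
    for session_id in list(store):
        try:
            data = json.loads(store[session_id])
        except (TypeError, ValueError):
            continue  # not a JSON session (e.g. an unrelated key); skip it
        if not isinstance(data, dict):
            continue
        data[key_name] = recompute()
        store[session_id] = json.dumps(data)
        updated += 1
    return updated


fake_store = {
    "session:a": json.dumps({"user_id": 1, "arbitrary_key": "old"}),
    "session:b": json.dumps({"user_id": 2}),
    "unrelated": "not json",
}
count = migrate_sessions(fake_store, lambda: "new")
```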
3. Dropping session data
It is still an option though it is clearly stated in the question that you need to keep the users logged in. It may be a part of the policy of deploying new application versions: session key is regenerated on application deployment and all user sessions are invalidated.
Pros: You don't need to implement any logic to set new user session values.
Cons: There are no new user session values, the data is just wiped out.
I'm using:
from flask import session

@app.route('/')
def main_page():
    if session.get('key'):
        print("session exist" + session.get('key'))
    else:
        print("could not find session")
        session['key'] = '34544646###########'
    return render_template('index.html')
I don't have the Flask-Session extension installed but this still works fine. I'm trying to understand why and when that extension is important to me. As far as I can see, the default session works well for me.
The difference is in where the session data is stored.
Flask's sessions are client-side sessions. Any data that you write to the session is written to a cookie and sent to the client to store. The client will send the cookie back to the server with every request, that is how the data that you write in the session remains available in subsequent requests. The data stored in the cookie is cryptographically signed to prevent any tampering. The SECRET_KEY setting from your configuration is used to generate the signature, so the data in your client-side sessions is secure as long as your secret key is kept private. Note that secure in this context means that the data in the session cannot be modified by a potential attacker. The data is still visible to anybody who knows how to look, so you should never write sensitive information in a client-side session.
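To make "signed but still readable" concrete, here is a toy signed cookie built with the standard library. This is not Flask's actual cookie format (Flask relies on the itsdangerous package); the key, the `payload.signature` layout and the function names are all assumptions made for the demo.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"change-me"  # hypothetical key for the demo


def sign(data, key=SECRET_KEY):
    """Encode `data` as b'payload.signature'; the payload stays readable."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode("utf-8"))
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload + b"." + sig


def verify(cookie, key=SECRET_KEY):
    """Return the data if the signature matches, otherwise None."""
    payload, _, sig = cookie.rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(sig, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = sign({"user_id": 42})

# Anyone can read the payload without knowing the key:
readable = json.loads(base64.urlsafe_b64decode(cookie.split(b".")[0]))

# But swapping in a different payload invalidates the signature:
forged = base64.urlsafe_b64encode(b'{"user_id": 1}') + b"." + cookie.split(b".")[1]
```

The signature stops tampering, but nothing here hides the contents, which is why sensitive values do not belong in a client-side session.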
Flask-Session and Flask-KVSession are two extensions for Flask that implement server-side sessions. These sessions work exactly in the same way as the Flask native sessions from the point of view of your application, but they store the data in the server. The data is never sent to the client, so there is a bit of increased security. The client still receives a signed cookie, but the only data in the cookie is a session ID that references the file or database index in the server where the data is stored.
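The server-side idea boils down to "the cookie holds an ID, the server holds the data". A minimal in-memory sketch follows; the class `ServerSideSessions` is hypothetical and is not how either extension is implemented internally.

```python
import secrets


class ServerSideSessions:
    """Toy in-memory store: the client only ever sees the session id."""

    def __init__(self):
        self._data = {}

    def create(self):
        session_id = secrets.token_urlsafe(32)
        self._data[session_id] = {}
        return session_id

    def load(self, session_id):
        # Unknown or revoked ids simply yield no session.
        return self._data.get(session_id)

    def revoke(self, session_id):
        self._data.pop(session_id, None)


store = ServerSideSessions()
sid = store.create()  # this short value is all that would go in the cookie
store.load(sid)["big_result"] = list(range(10000))  # never sent to the client
size_on_server = len(store.load(sid)["big_result"])
store.revoke(sid)  # the server can end the session at any time
```

Because the data never leaves the server, its size is not bound by the cookie limit and the session can be revoked immediately.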
from flask import session
All session data is stored client-side in a cookie.
Pros:
Validating and creating sessions is fast (no data storage)
Easy to scale (no need to replicate session data across web servers)
Cons:
Sensitive data cannot be stored in session data, as it's stored on the web browser
Session data is limited by the size of the cookie (usually 4 KB)
Sessions cannot be immediately revoked by the Flask app
from flask_session import Session
Session data is stored server side.
Pros:
Sensitive data is stored on the server, not in the web browser
You can store as much session data as you want without worrying about the cookie size
Sessions can easily be terminated by the Flask app
Cons:
Difficult to set up and scale
Increased complexity since session state must be managed
*this information is from Patrick Kennedy on this excellent tutorial: https://testdriven.io/blog/flask-server-side-sessions/
Session
A session makes it possible to remember information from one request to another. Flask does this with a signed cookie: the cookie's contents cannot be modified by anyone who does not know the SECRET_KEY. The data is stored on the client side whether or not the session is permanent. By default the cookie lasts until the browser is closed; if session.permanent is set to True, it instead lasts for PERMANENT_SESSION_LIFETIME, which defaults to 31 days.
Flask-Session:
Flask-Session is an extension for Flask that adds support for server-side sessions to your application. Its main goal is to store the session on the server side.
The available server-side backends are:
- redis: RedisSessionInterface
- memcached: MemcachedSessionInterface
- filesystem: FileSystemSessionInterface
- mongodb: MongoDBSessionInterface
- sqlalchemy: SqlAlchemySessionInterface
Flask-Session builds on Flask's session mechanism.
Depending on the configured SESSION_TYPE, it overrides the default way sessions are saved.
flask.sessions.SessionInterface: SessionInterface is the basic interface you have to implement in order to replace the default session interface which uses flask(werkzeug’s) secure cookie implementation.
The only methods you have to implement are open_session() and save_session(), the others have useful defaults which you don’t need to change.
Flask-Session implements these methods for each backend, so the session is read from and written to the selected storage.
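To illustrate the open/save split described above, here is a standalone sketch with a file-per-session backend. It deliberately does not subclass Flask's SessionInterface and does not use the real signatures (Flask's methods receive the app plus the request or response); the class `FileSessionBackend` and its simplified arguments are assumptions.

```python
import json
import os
import secrets
import tempfile


class FileSessionBackend:
    """Hypothetical backend mirroring the open/save split, one file per session."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, session_id):
        return os.path.join(self.directory, session_id + ".json")

    def open_session(self, session_id):
        """Load the stored session, or start a new one for an unknown id."""
        # Only accept plain alphanumeric ids so a client cannot inject a path.
        if session_id and session_id.isalnum() and os.path.exists(self._path(session_id)):
            with open(self._path(session_id), encoding="utf-8") as fh:
                return session_id, json.load(fh)
        return secrets.token_hex(16), {}

    def save_session(self, session_id, data):
        """Persist the session at the end of the request."""
        with open(self._path(session_id), "w", encoding="utf-8") as fh:
            json.dump(data, fh)


backend = FileSessionBackend(tempfile.mkdtemp())
sid, data = backend.open_session(None)   # first request: new session
data["key"] = "value"
backend.save_session(sid, data)
_, restored = backend.open_session(sid)  # later request: same id
```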
Scenario:
Major web app w. Python+Flask
Flask login and Flask.session for basic session variables (user-id and session-id)
Flask.session and limitations? (Cookies)
Cookie based and basically persist only at the client side.
For some session variables that will be regularly read (ie, user permissions, custom application config) it feels awkward to carry all that info around in a cookie, at every single page request and response.
Database is too much?
Since the session can be identified at the server side by introducing unique session id at login, some server-side session variable management can be used. Reading this data at the server side from a database also feels like unnecessary overhead.
Question
What is the most efficient way to handle the session variables at the server side?
Perhaps that could be a memory-based solution, but I am worried that different Flask app requests could be executed at different threads that would not share the memory-stored session data, or cause conflicts in case of simultaneous reading-writing.
I am looking for advice and best practice for planning the basic level architecture.
Flask-Caching
What you need is a server-side caching package: Flask-Caching.
A simple setup:
from flask import Flask
from flask_caching import Cache
app = Flask(__name__)
app.config['CACHE_TYPE'] = 'SimpleCache'
cache = Cache(app)
Then, explicit use of a cached variable:
@app.route('/')
def load():
    cache.set("foo", foo)  # foo is whatever value you want to cache
    bar = cache.get("foo")
    return str(bar)
There is much more in Flask-Caching and that's the recommended approach by Flask.
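On a multi-threaded or multi-worker server such as gunicorn, you are better off using app.config['CACHE_TYPE'] = 'FileSystemCache' (which also needs CACHE_DIR set), because SimpleCache keeps its data in the memory of a single process.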
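On the question's worry about simultaneous reads and writes: within a single process, a lock around the read-modify-write step is enough to avoid lost updates. The sketch below uses only the standard library; `ThreadSafeStore` is a made-up class, not part of Flask-Caching. It does nothing for multiple worker processes, which each get their own copy of the memory.

```python
import threading


class ThreadSafeStore:
    """In-memory key/value store guarded by a lock (one process only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def get(self, key, default=None):
        with self._lock:
            return self._values.get(key, default)

    def increment(self, key, amount=1):
        # Read-modify-write happens under the lock, so updates are not lost.
        with self._lock:
            self._values[key] = self._values.get(key, 0) + amount
            return self._values[key]


store = ThreadSafeStore()


def worker():
    for _ in range(1000):
        store.increment("hits")


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total_hits = store.get("hits")
```

For anything shared across processes or machines, an external store (file system, Redis, a database) is still needed.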
Your instinct is correct, it's probably not the way to do it.
Session data should only be ephemeral information that is not too troublesome to lose and recreate. For example, the user will just have to login again to restore it.
Configuration data or anything else that's necessary on the server and that must survive a logout is not part of the session and should be stored in a DB.
Now, if you really need to easily keep this information client-side and it's not too much of a problem if it's lost, then use a session cookie for logged in/out state and a permanent cookie with a long lifespan for the rest of the configuration information.
If the information is too much size-wise, then the only option I can think of is to store the data, other than the logged in/out state, in a DB.
I was writing debugging methods for my CherryPy application. The code in question was (very) basically equivalent to this:
import cherrypy

class Page:
    def index(self):
        try:
            self.body += 'okay'
        except AttributeError:
            self.body = 'okay'
        return self.body
    index.exposed = True

cherrypy.quickstart(Page(), config='root.conf')
I was surprised to notice that from request to request, the output of self.body grew. When I visited the page from one client, and then from another concurrently-open client, and then refreshed the browsers for both, the output was an ever-increasing string of "okay"s. In my debugging method, I was also recording user-specific information (i.e. session data) and that, too, showed up in both users' output.
I'm assuming that's because the python module is loaded into working memory instead of being re-run for every request.
My question is this: How does that work? How is it that self.body is preserved from request to request, but cherrypy.session and cherrypy.response aren't?
And is there any way to set an object attribute that will only be used for the current request? I know I can overwrite self.body per every request, but it seems a little ad-hoc. Is there a standard or built-in way of doing it in CherryPy?
(second question moved to How does CherryPy caching work?)
synthesizerpatel's analysis is correct, but if you really want to store some data per request, then store it as an attribute on cherrypy.request, not in the session. The cherrypy.request and .response objects are new for each request, so there's no fear that any of their attributes will persist across requests. That is the canonical way to do it. Just make sure you're not overwriting any of cherrypy's internal attributes! cherrypy.request.body, for example, is already reserved for handing you, say, a POSTed JSON request body.
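The pattern can be sketched without CherryPy: state attached to an object that is created fresh for every request cannot leak into the next one. Here `Request` is a hypothetical stand-in for cherrypy.request and `debug_log` is a made-up attribute name.

```python
class Request:
    """Hypothetical stand-in for a per-request object; created fresh each time."""


class Page:
    def index(self, request):
        # State lives on the request, so it cannot leak into the next one.
        request.debug_log = getattr(request, "debug_log", "") + "okay"
        return request.debug_log


page = Page()
first = page.index(Request())   # simulated request 1
second = page.index(Request())  # simulated request 2
```

Both calls return the same short string, and nothing accumulates on the long-lived `page` object.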
For all the details of exactly how the scoping works, the best source is the source code.
You hit the nail on the head with the observation that you're getting the same data from self.body because it's the same in memory of the Python process running CherryPy.
self.body maintains 'state' for this reason: it's an attribute of an object that lives as long as the running server.
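The effect is easy to reproduce in plain Python, without any web server: the handler object is created once and every simulated request calls a method on that same instance.

```python
class Page:
    def index(self):
        try:
            self.body += "okay"
        except AttributeError:
            self.body = "okay"
        return self.body


page = Page()  # created once, like the object handed to quickstart
outputs = [page.index() for _ in range(3)]  # three simulated requests
```

Each call sees the attribute left behind by the previous one, so the string keeps growing regardless of which client made the request.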
To set data for the current session, use cherrypy.session['fieldname'] = 'fieldvalue', to get data use cherrypy.session.get('fieldname').
You (the programmer) do not need to know the session ID, cherrypy.session handles that for you -- the session ID is automatically generated on the fly by cherrypy and is persisted by exchanging a cookie between the browser and server on subsequent query/response interactions.
If you don't specify a storage_type for cherrypy.session in your config, the session will be stored in memory (accessible to the server and you), but you can also store the session files on disk if you wish, which might be a handy way to debug without having to write a bunch of code to dig session IDs or key/value pairs out of the running server.
For more info check out http://www.cherrypy.org/wiki/CherryPySessions